Science.gov

Sample records for kegg pathways database

  1. KEGG: new perspectives on genomes, pathways, diseases and drugs

    PubMed Central

    Kanehisa, Minoru; Furumichi, Miho; Tanabe, Mao; Sato, Yoko; Morishima, Kanae

    2017-01-01

    KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an encyclopedia of genes and genomes. Assigning functional meanings to genes and genomes both at the molecular and higher levels is the primary objective of the KEGG database project. Molecular-level functions are stored in the KO (KEGG Orthology) database, where each KO is defined as a functional ortholog of genes and proteins. Higher-level functions are represented by networks of molecular interactions, reactions and relations in the forms of KEGG pathway maps, BRITE hierarchies and KEGG modules. In the past the KO database was developed for the purpose of defining nodes of molecular networks, but now the content has been expanded and the quality improved irrespective of whether or not the KOs appear in the three molecular network databases. The newly introduced addendum category of the GENES database is a collection of individual proteins whose functions are experimentally characterized and from which an increasing number of KOs are defined. Furthermore, the DISEASE and DRUG databases have been improved by systematic analysis of drug labels for better integration of diseases and drugs with the KEGG molecular networks. KEGG is moving towards becoming a comprehensive knowledge base for both functional interpretation and practical application of genomic information. PMID:27899662

  2. Putting The Plant Metabolic Network pathway databases to work: going offline to gain new capabilities.

    PubMed

    Dreher, Kate

    2014-01-01

    Metabolic databases such as The Plant Metabolic Network/MetaCyc and KEGG PATHWAY are publicly accessible resources providing organism-specific information on reactions and metabolites. KEGG PATHWAY depicts metabolic networks as wired, electronic circuit-like maps, whereas the MetaCyc family of databases uses a canonical textbook-like representation. The first MetaCyc-based database for a plant species was AraCyc, which describes metabolism in the model plant Arabidopsis. This database was created over 10 years ago and has since then undergone extensive manual curation to reflect updated information on enzymes and pathways in Arabidopsis. This chapter describes accessing and using AraCyc and its underlying Pathway Tools software. Specifically, methods for (1) navigating Pathway Tools, (2) visualizing omics data and superimposing the data on a metabolic pathway map, and (3) creating pathways and pathway components are discussed.

  3. Drug-Path: a database for drug-induced pathways.

    PubMed

    Zeng, Hui; Qiu, Chengxiang; Cui, Qinghua

    2015-01-01

    Some databases for drug-associated pathways have been built and are publicly available. However, the pathways curated in most of these databases are drug-action or drug-metabolism pathways. In recent years, high-throughput technologies such as microarray and RNA-sequencing have produced many drug-induced gene expression profiles. Interestingly, drug-induced gene expression profiles frequently show distinct patterns, indicating that drugs normally induce the activation or repression of distinct pathways. Therefore, these pathways can contribute to the study of drug mechanisms and drug repurposing. Here, we present Drug-Path, a database of drug-induced pathways, which was generated by KEGG pathway enrichment analysis for drug-induced upregulated genes and downregulated genes based on drug-induced gene expression datasets in Connectivity Map. Drug-Path provides user-friendly interfaces to retrieve, visualize and download the drug-induced pathway data in the database. In addition, the genes deregulated by a given drug are highlighted in the pathways. All data were organized using SQLite. The web site was implemented using Django, a Python web framework. Finally, we believe that this database will be useful for related research.
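
    The abstract above does not say which statistic Drug-Path uses for its KEGG pathway enrichment analysis. A common choice for this kind of over-representation test is the one-sided hypergeometric test, sketched below as an assumption rather than as the authors' method. The function name and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, pathway_genes, selected_genes, overlap):
    """One-sided hypergeometric p-value: P(X >= overlap).

    total_genes    -- size of the background gene universe
    pathway_genes  -- number of background genes annotated to the pathway
    selected_genes -- number of drug-induced (e.g. upregulated) genes
    overlap        -- number of selected genes that fall in the pathway
    """
    max_overlap = min(pathway_genes, selected_genes)
    denominator = comb(total_genes, selected_genes)
    # math.comb returns 0 when k > n, so impossible terms drop out.
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, selected_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


# Hypothetical toy numbers, chosen only to keep the arithmetic small.
p_value = hypergeom_enrichment_p(
    total_genes=20, pathway_genes=5, selected_genes=4, overlap=2
)
print(f"enrichment p-value: {p_value:.4f}")
```

    In practice the test would be run once per pathway and once each for the upregulated and downregulated gene lists, followed by a multiple-testing correction.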

  4. Putative drug and vaccine target protein identification using comparative genomic analysis of KEGG annotated metabolic pathways of Mycoplasma hyopneumoniae.

    PubMed

    Damte, Dereje; Suh, Joo-Won; Lee, Seung-Jin; Yohannes, Sileshi Belew; Hossain, Md Akil; Park, Seung-Chun

    2013-07-01

    In the present study, a computational comparative and subtractive genomic/proteomic analysis aimed at the identification of putative therapeutic target and vaccine candidate proteins from Kyoto Encyclopedia of Genes and Genomes (KEGG) annotated metabolic pathways of Mycoplasma hyopneumoniae was performed for drug design and vaccine production pipelines against M. hyopneumoniae. The comparative genomic and metabolic pathway analysis, which followed a predefined systematic computational workflow, extracted a total of 41 annotated metabolic pathways from KEGG, among which five were unique to M. hyopneumoniae. A total of 234 proteins were identified as being involved in these metabolic pathways. Of these, 125 non-homologous and predicted essential proteins could serve as potential drug targets and vaccine candidates; additional prioritizing parameters characterized 21 proteins as vaccine candidates, while druggability evaluation of each identified protein against the DrugBank database prioritized 42 proteins as suitable drug targets.

  5. MPW : the metabolic pathways database.

    SciTech Connect

    Selkov, E., Jr.; Grechkin, Y.; Mikhailova, N.; Selkov, E.; Mathematics and Computer Science; Russian Academy of Sciences

    1998-01-01

    The Metabolic Pathways Database (MPW) (www.biobase.com/emphome.html/homepage.html.pags/pathways.html), a derivative of EMP (www.biobase.com/EMP), plays a fundamental role in the technology of metabolic reconstructions from sequenced genomes under the PUMA (www.mcs.anl.gov/home/compbio/PUMA/Production/ReconstructedMetabolism/reconstruction.html), WIT (www.mcs.anl.gov/home/compbio/WIT/wit.html) and WIT2 (beauty.isdn.msc.anl.gov/WIT2.pub/CGI/user.cgi) systems. In October 1997, it included some 2800 pathway diagrams covering primary and secondary metabolism, membrane transport, signal transduction pathways, intracellular traffic, translation and transcription. In the current public release of MPW (beauty.isdn.mcs.anl.gov/MPW), the encoding is based on the logical structure of the pathways and is represented by the objects commonly used in electronic circuit design. This facilitates drawing and editing the diagrams and makes possible automation of the basic simulation operations such as deriving stoichiometric matrices, rate laws, and, ultimately, dynamic models of metabolic pathways. Individual pathway diagrams, automatically derived from the original ASCII records, are stored as SGML instances supplemented by relational indices. An auxiliary database of compound names and structures, encoded in the SMILES format, is maintained to unambiguously connect the pathways to the chemical structures of their intermediates.
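
    The abstract above mentions deriving stoichiometric matrices from encoded pathways. A minimal sketch of that step follows. The reaction encoding (a dictionary of compound to signed coefficient) and the toy reactions are assumptions made for illustration, not MPW's actual record format.

```python
def stoichiometric_matrix(reactions):
    """Build a stoichiometric matrix from a list of reactions.

    Each reaction is a dict mapping compound name to a signed coefficient:
    negative for substrates, positive for products. Rows of the returned
    matrix are compounds (sorted by name); columns are reactions.
    """
    compounds = sorted({compound for reaction in reactions for compound in reaction})
    matrix = [
        [reaction.get(compound, 0) for reaction in reactions]
        for compound in compounds
    ]
    return compounds, matrix


# Hypothetical two-step pathway.
toy_reactions = [
    {"A": -1, "B": 1},  # R1: A -> B
    {"B": -1, "C": 2},  # R2: B -> 2 C
]

compounds, S = stoichiometric_matrix(toy_reactions)
for name, row in zip(compounds, S):
    print(name, row)
```

    With the matrix in hand, a steady-state flux vector v is any vector satisfying S v = 0 for the internal compounds, which is the usual starting point for the dynamic models the abstract refers to.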

  6. Microarray and synchronization of neuronal differentiation with pathway changes in the Kyoto Encyclopedia of Genes and Genomes (KEGG) databank in nerve growth factor-treated PC12 cells.

    PubMed

    Lin, Chih-Ming; Feng, Wayne

    2012-08-01

    The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database creates networks from interrelations between molecular biology and underlying chemical elements. This allows for analysis of biologic networks, genomic information, and higher-order functional information at a systems level. We performed microarray experiments and used the KEGG database, systems biology analysis, and annotation of pathway function to study nerve growth factor (NGF)-induced differentiation of PC12 cells. Cells were cultured to 70%-80% confluence, treated with NGF for 1 or 3 hours (h), and RNA was extracted. Stage 1 data analysis involved analysis of variance (ANOVA), and stage 2 involved cluster analysis and heat map generation. We identified 2020 NGF-induced PC12 genes (1038 at 1 h and 1554 at 3 h). Results showed changes in gene expression over time. We compared these genes with 6035 genes from the KEGG database. Cross-matching resulted in 830 genes. Among these, we identified 395 altered genes (155 at 1 h and 301 at 3 h; 2-fold increase from 1 h to 3 h). We identified 191 biologic pathways in the KEGG database; the top 15 showed correlations with neuronal differentiation (mitogen-activated protein kinase [MAPK] pathway: 35 genes at 1 h, 54 genes at 3 h; genes associated with axonal guidance: 12 at 1 h, 26 at 3 h; Wnt pathway: 16 at 1 h, 25 at 3 h; neurotrophin pathway: 4 at 1 h, 14 at 3 h). Thus, we identified changes in neuronal differentiation pathways with the KEGG database, which were synchronized with NGF-induced differentiation.
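
    The per-time-point counts in the abstract above sum to more than the reported totals, which suggests that some genes were detected at both 1 h and 3 h. Assuming each reported total is the union of the two time-point sets (the abstract does not state this explicitly), the shared counts follow from inclusion-exclusion:

```python
def shared_count(count_a, count_b, union_count):
    """Size of the intersection of two sets given their sizes and the union size."""
    return count_a + count_b - union_count


# Figures quoted in the abstract; the union interpretation is an assumption.
induced_both = shared_count(1038, 1554, 2020)  # NGF-induced genes at 1 h and 3 h
altered_both = shared_count(155, 301, 395)     # KEGG-matched altered genes

print("NGF-induced genes seen at both time points:", induced_both)
print("Altered KEGG-matched genes seen at both time points:", altered_both)
```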

  7. Construction of gene/protein interaction networks for primary myelofibrosis and KEGG pathway-enrichment analysis of molecular compounds.

    PubMed

    Sun, C G; Cao, X J; Zhou, C; Liu, L J; Feng, F B; Liu, R J; Zhuang, J; Li, Y J

    2015-12-08

    The objective of this study was the development of a gene/protein interaction network for primary myelofibrosis based on gene expression, and the enrichment analysis of KEGG pathways underlying the molecular complexes in this network. To achieve this, genes involved in primary myelofibrosis were selected from the OMIM database. A gene/protein interaction network for primary myelofibrosis was obtained through Cytoscape with the literature mining performed using the Agilent Literature Search plugin. The molecular complexes in the network were detected by ClusterViz plugin and KEGG pathway enrichment of molecular complexes was performed using DAVID online. We found 75 genes associated with primary myelofibrosis in the OMIM database. The gene/protein interaction network of primary myelofibrosis contained 608 nodes, 2086 edges, and 4 molecular complexes with a correlation integral value greater than 4. Molecular complexes involved in KEGG pathways are related to cytokine regulation, immune function regulation, ECM-receptor interaction, focal adhesion, actin cytoskeleton regulation, cell adhesion molecules, and other biological behavior of tumors, which can provide a reliable direction for the treatment of primary myelofibrosis and the bioinformatic foundation for further understanding the molecular mechanisms of this disease.

  8. KEGG: Kyoto Encyclopedia of Genes and Genomes.

    PubMed

    Ogata, H; Goto, S; Sato, K; Fujibuchi, W; Bono, H; Kanehisa, M

    1999-01-01

    Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules. The major component of KEGG is the PATHWAY database that consists of graphical diagrams of biochemical pathways including most of the known metabolic pathways and some of the known regulatory pathways. The pathway information is also represented by the ortholog group tables summarizing orthologous and paralogous gene groups among different organisms. KEGG maintains the GENES database for the gene catalogs of all organisms with complete genomes and selected organisms with partial genomes, which are continuously re-annotated, as well as the LIGAND database for chemical compounds and enzymes. Each gene catalog is associated with the graphical genome map for chromosomal locations that is represented by a Java applet. In addition to the data collection efforts, KEGG develops and provides various computational tools, such as for reconstructing biochemical pathways from the complete genome sequence and for predicting gene regulatory networks from the gene expression profiles. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).

  9. KEGG as a reference resource for gene and protein annotation

    PubMed Central

    Kanehisa, Minoru; Sato, Yoko; Kawashima, Masayuki; Furumichi, Miho; Tanabe, Mao

    2016-01-01

    KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an integrated database resource for biological interpretation of genome sequences and other high-throughput data. Molecular functions of genes and proteins are associated with ortholog groups and stored in the KEGG Orthology (KO) database. The KEGG pathway maps, BRITE hierarchies and KEGG modules are developed as networks of KO nodes, representing high-level functions of the cell and the organism. Currently, more than 4000 complete genomes are annotated with KOs in the KEGG GENES database, which can be used as a reference data set for KO assignment and subsequent reconstruction of KEGG pathways and other molecular networks. As an annotation resource, the following improvements have been made. First, each KO record is re-examined and associated with protein sequence data used in experiments of functional characterization. Second, the GENES database now includes viruses, plasmids, and the addendum category for functionally characterized proteins that are not represented in complete genomes. Third, new automatic annotation servers, BlastKOALA and GhostKOALA, are made available utilizing the non-redundant pangenome data set generated from the GENES database. As a resource for translational bioinformatics, various data sets are created for antimicrobial resistance and drug interaction networks. PMID:26476454
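
    The abstract above describes KO assignment followed by reconstruction of KEGG pathways. The idea can be sketched as a two-step lookup: gene to KO, then KO to pathway. The gene names and the specific mappings below are illustrative placeholders written in KEGG-style notation; they are not taken from the KEGG database, and the function is not part of any KEGG tool.

```python
from collections import defaultdict


def reconstruct_pathways(gene_to_ko, ko_to_pathways):
    """Group genes by pathway via their KO assignments.

    gene_to_ko     -- dict: gene identifier -> KO identifier
    ko_to_pathways -- dict: KO identifier -> list of pathway identifiers
    Genes whose KO has no pathway entry are left out of the result.
    """
    pathway_genes = defaultdict(set)
    for gene, ko in gene_to_ko.items():
        for pathway in ko_to_pathways.get(ko, ()):
            pathway_genes[pathway].add(gene)
    return dict(pathway_genes)


# Placeholder data for illustration only.
gene_to_ko = {"geneA": "K00001", "geneB": "K00002", "geneC": "K99999"}
ko_to_pathways = {
    "K00001": ["map00010", "map00071"],
    "K00002": ["map00010"],
}

for pathway, genes in sorted(reconstruct_pathways(gene_to_ko, ko_to_pathways).items()):
    print(pathway, sorted(genes))
```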

  10. KEGG-PATH: Kyoto encyclopedia of genes and genomes-based pathway analysis using a path analysis model.

    PubMed

    Du, Junli; Yuan, Zhifa; Ma, Ziwei; Song, Jiuzhou; Xie, Xiaoli; Chen, Yulin

    2014-07-29

    The dynamic impact approach (DIA) represents an alternative to overrepresentation analysis (ORA) for functional analysis of time-course experiments or those involving multiple treatments. The DIA can be used to estimate the biological impact of the differentially expressed genes (DEGs) associated with particular biological functions, for example, as represented by the Kyoto encyclopedia of genes and genomes (KEGG) annotations. However, the DIA does not take into account the correlated dependence structure of the KEGG pathway hierarchy. We have developed herein a path analysis model (KEGG-PATH) to subdivide the total effect of each KEGG pathway into the direct effect and indirect effect by taking into account not only each KEGG pathway itself, but also the correlation with its related pathways. In addition, this work also attempts to preliminarily estimate the impact direction of each KEGG pathway by a gradient analysis method from principal component analysis (PCA). As a result, the advantage of the KEGG-PATH model is demonstrated through the functional analysis of the bovine mammary transcriptome during lactation.

  11. KEGG: kyoto encyclopedia of genes and genomes.

    PubMed

    Kanehisa, M; Goto, S

    2000-01-01

    KEGG (Kyoto Encyclopedia of Genes and Genomes) is a knowledge base for systematic analysis of gene functions, linking genomic information with higher order functional information. The genomic information is stored in the GENES database, which is a collection of gene catalogs for all the completely sequenced genomes and some partial genomes with up-to-date annotation of gene functions. The higher order functional information is stored in the PATHWAY database, which contains graphical representations of cellular processes, such as metabolism, membrane transport, signal transduction and cell cycle. The PATHWAY database is supplemented by a set of ortholog group tables for the information about conserved subpathways (pathway motifs), which are often encoded by positionally coupled genes on the chromosome and which are especially useful in predicting gene functions. A third database in KEGG is LIGAND for the information about chemical compounds, enzyme molecules and enzymatic reactions. KEGG provides Java graphics tools for browsing genome maps, comparing two genome maps and manipulating expression maps, as well as computational tools for sequence comparison, graph comparison and path computation. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).

  12. EXPath: a database of comparative expression analysis inferring metabolic pathways for plants

    PubMed Central

    2015-01-01

    Background: In general, gene expression changes conditionally to catalyze specific metabolic pathways. Microarray-based datasets have been produced on a massive scale to monitor gene expression levels in parallel under numerous experimental treatments. Although several studies have facilitated the linkage of gene expression data and metabolic pathways, none of them has been compiled for plants. Moreover, advanced analyses such as pathway enrichment or how genes are expressed under different conditions are not provided. Description: Therefore, EXPath was developed not only to comprehensively gather public microarray expression data from over 1000 samples in biotic stress, abiotic stress, and hormone secretion, but also to allow the use of this abundant resource for coexpression analysis and identification of differentially expressed genes (DEGs), finally inferring the enriched KEGG pathways and gene ontology (GO) terms of three model plants: Arabidopsis thaliana, Oryza sativa, and Zea mays. Users can access the gene expression patterns of interest under various conditions via five main functions (Gene Search, Pathway Search, DEGs Search, Pathways/GO Enrichment, and Coexpression analysis) in EXPath, which are presented through a user-friendly interface and are valuable for further research. Conclusions: EXPath, freely available at http://expath.itps.ncku.edu.tw, is a database resource that collects and utilizes gene expression profiles derived from microarray platforms under various conditions to infer metabolic pathways for plants. PMID:25708775

  13. The Use of Gene Ontology Term and KEGG Pathway Enrichment for Analysis of Drug Half-Life

    PubMed Central

    Chen, Lei; Lu, Jing; Kong, XiangYin; Huang, Tao; Li, HaiPeng

    2016-01-01

    A drug’s biological half-life is defined as the time required for the human body to metabolize or eliminate 50% of the initial drug dosage. Correctly measuring the half-life of a given drug is helpful for the safe and accurate usage of the drug. In this study, we investigated which gene ontology (GO) terms and biological pathways were highly related to the determination of drug half-life. The investigated drugs, with known half-lives, were analyzed based on their enrichment scores for associated GO terms and KEGG pathways. These scores indicate which GO terms or KEGG pathways the drug targets. The feature selection method, minimum redundancy maximum relevance, was used to analyze these GO terms and KEGG pathways and to identify important GO terms and pathways, such as sodium-independent organic anion transmembrane transporter activity (GO:0015347), monoamine transmembrane transporter activity (GO:0008504), negative regulation of synaptic transmission (GO:0050805), neuroactive ligand-receptor interaction (hsa04080), serotonergic synapse (hsa04726), and linoleic acid metabolism (hsa00591), among others. This analysis confirmed our results and may show evidence for a new method in studying drug half-lives and building effective computational methods for the prediction of drug half-lives. PMID:27780226

  14. The KEGG resource for deciphering the genome.

    PubMed

    Kanehisa, Minoru; Goto, Susumu; Kawashima, Shuichi; Okuno, Yasushi; Hattori, Masahiro

    2004-01-01

    A grand challenge in the post-genomic era is a complete computer representation of the cell and the organism, which will enable computational prediction of higher-level complexity of cellular processes and organism behavior from genomic information. Toward this end we have been developing a knowledge-based approach for network prediction, which is to predict, given a complete set of genes in the genome, the protein interaction networks that are responsible for various cellular processes. KEGG at http://www.genome.ad.jp/kegg/ is the reference knowledge base that integrates current knowledge on molecular interaction networks such as pathways and complexes (PATHWAY database), information about genes and proteins generated by genome projects (GENES/SSDB/KO databases) and information about biochemical compounds and reactions (COMPOUND/GLYCAN/REACTION databases). These three types of database actually represent three graph objects, called the protein network, the gene universe and the chemical universe. New efforts are being made to abstract knowledge, both computationally and manually, about ortholog clusters in the KO (KEGG Orthology) database, and to collect and analyze carbohydrate structures in the GLYCAN database.

  15. Co-LncRNA: investigating the lncRNA combinatorial effects in GO annotations and KEGG pathways based on human RNA-Seq data.

    PubMed

    Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia

    2015-01-01

    Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, which is a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by co-expressed protein-coding genes of single or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in given GO annotations or KEGG pathways are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/.

  16. Co-LncRNA: investigating the lncRNA combinatorial effects in GO annotations and KEGG pathways based on human RNA-Seq data

    PubMed Central

    Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia

    2015-01-01

    Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, which is a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by co-expressed protein-coding genes of single or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in given GO annotations or KEGG pathways are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/ PMID:26363020

  17. Redundancy control in pathway databases (ReCiPa): an application for improving gene-set enrichment analysis in Omics studies and "Big data" biology.

    PubMed

    Vivar, Juan C; Pemu, Priscilla; McPherson, Ruth; Ghosh, Sujoy

    2013-08-01

    Unparalleled technological advances have fueled an explosive growth in the scope and scale of biological data and have propelled life sciences into the realm of "Big Data" that cannot be managed or analyzed by conventional approaches. Big Data in the life sciences are driven primarily via a diverse collection of 'omics'-based technologies, including genomics, proteomics, metabolomics, transcriptomics, metagenomics, and lipidomics. Gene-set enrichment analysis is a powerful approach for interrogating large 'omics' datasets, leading to the identification of biological mechanisms associated with observed outcomes. While several factors influence the results from such analysis, the impact from the contents of pathway databases is often under-appreciated. Pathway databases often contain variously named pathways that overlap with one another to varying degrees. Ignoring such redundancies during pathway analysis can lead to the designation of several pathways as being significant due to high content-similarity, rather than truly independent biological mechanisms. Statistically, such dependencies also result in correlated p values and overdispersion, leading to biased results. We investigated the level of redundancies in multiple pathway databases and observed large discrepancies in the nature and extent of pathway overlap. This prompted us to develop the application, ReCiPa (Redundancy Control in Pathway Databases), to control redundancies in pathway databases based on user-defined thresholds. Analysis of genomic and genetic datasets, using ReCiPa-generated overlap-controlled versions of KEGG and Reactome pathways, led to a reduction in redundancy among the top-scoring gene-sets and allowed for the inclusion of additional gene-sets representing possibly novel biological mechanisms. Using obesity as an example, bioinformatic analysis further demonstrated that gene-sets identified from overlap-controlled pathway databases show stronger evidence of prior association
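
    The abstract above describes controlling pathway redundancy with user-defined thresholds but does not give the procedure. One way such a control could work is to merge gene sets whose overlap exceeds a threshold. The sketch below is a generic illustration and not ReCiPa's algorithm: the Jaccard index as the overlap measure, the greedy merge order, and all pathway and gene names are assumptions.

```python
def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_redundant(pathways, threshold):
    """Greedily merge gene sets whose Jaccard overlap is >= threshold.

    pathways  -- dict: pathway name -> iterable of gene identifiers
    threshold -- user-defined overlap cutoff in [0, 1]
    Merged sets are named by joining the original names with '+'.
    """
    merged = {name: set(genes) for name, genes in pathways.items()}
    changed = True
    while changed:
        changed = False
        names = sorted(merged)
        for i, first in enumerate(names):
            for second in names[i + 1:]:
                if jaccard(merged[first], merged[second]) >= threshold:
                    combined = merged.pop(first) | merged.pop(second)
                    merged[first + "+" + second] = combined
                    changed = True
                    break
            if changed:
                break  # restart the scan, since the name list has changed
    return merged


# Hypothetical pathways: P1 and P2 share three of four genes.
toy_pathways = {
    "P1": {"a", "b", "c"},
    "P2": {"a", "b", "c", "d"},
    "P3": {"x", "y"},
}
controlled = merge_redundant(toy_pathways, threshold=0.7)
for name in sorted(controlled):
    print(name, sorted(controlled[name]))
```

    Running enrichment on the merged sets rather than the originals means that two near-identical pathways count as one hit instead of two correlated ones.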

  18. XTalkDB: a database of signaling pathway crosstalk

    PubMed Central

    Sam, Sarah A.; Teel, Joelle; Tegge, Allison N.; Bharadwaj, Aditya; Murali, T.M.

    2017-01-01

    Analysis of signaling pathways and their crosstalk is a cornerstone of systems biology. Thousands of papers have been published on these topics. Surprisingly, there is no database that carefully and explicitly documents crosstalk between specific pairs of signaling pathways. We have developed XTalkDB (http://www.xtalkdb.org) to fill this very important gap. XTalkDB contains curated information for 650 pairs of pathways from over 1600 publications. In addition, the database reports the molecular components (e.g. proteins, hormones, microRNAs) that mediate crosstalk between a pair of pathways and the species and tissue in which the crosstalk was observed. The XTalkDB website provides an easy-to-use interface for scientists to browse crosstalk information by querying one or more pathways or molecules of interest. PMID:27899583

  19. Carcinogenic effects of oil dispersants: A KEGG pathway-based RNA-seq study of human airway epithelial cells.

    PubMed

    Liu, Yao-Zhong; Zhang, Lei; Roy-Engel, Astrid M; Saito, Shigeki; Lasky, Joseph A; Wang, Guangdi; Wang, He

    2017-02-20

    The health impacts of the BP oil spill are yet to be further revealed as the toxicological effects of oil products and dispersants on human respiratory system may be latent and complex, and hence difficult to study and follow up. Here we performed RNA-seq analyses of a system of human airway epithelial cells treated with the BP crude oil and/or dispersants Corexit 9500 and Corexit 9527 that were used to help break up the oil spill. Based on the RNA-seq data, we then systemically analyzed the transcriptomic perturbations of the cells at the KEGG pathway level using two pathway-based analysis tools, GAGE (generally applicable gene set enrichment) and GSNCA (Gene Sets Net Correlations Analysis). Our results suggested a pattern of change towards carcinogenesis for the treated cells marked by upregulation of ribosomal biosynthesis (hsa03008) (p=1.97E-13), protein processing (hsa04141) (p=4.09E-7), Wnt signaling (hsa04310) (p=6.76E-3), neurotrophin signaling (hsa04722) (p=7.73E-3) and insulin signaling (hsa04910) (p=1.16E-2) pathways under the dispersant Corexit 9527 treatment, as identified by GAGE analysis. Furthermore, through GSNCA analysis, we identified gene co-expression changes for several KEGG cancer pathways, including small cell lung cancer pathway (hsa05222, p=9.99E-5), under various treatments of oil/dispersant, especially the mixture of oil and Corexit 9527. Overall, our results suggested carcinogenic effects of dispersants (in particular Corexit 9527) and their mixtures with the BP crude oil, and provided further support for more stringent safety precautions and regulations for operations involving long-term respiratory exposure to oil and dispersants.

  20. DEOP: a database on osmoprotectants and associated pathways

    PubMed Central

    Bougouffa, Salim; Radovanovic, Aleksandar; Essack, Magbubah; Bajic, Vladimir B.

    2014-01-01

    Microorganisms are known to counteract salt stress through salt influx or by the accumulation of osmoprotectants (also called compatible solutes). Understanding the pathways that synthesize and/or break down these osmoprotectants is of interest to studies of crop halotolerance and to biotechnology applications that use microbes as cell factories for production of biomass or commercial chemicals. To facilitate the exploration of osmoprotectants, we have developed the first online resource, ‘Dragon Explorer of Osmoprotection associated Pathways’ (DEOP), that gathers and presents curated information about osmoprotectants, complemented by information about reactions and pathways that use or affect them. A combined total of 141 compounds were confirmed osmoprotectants, which were matched to 1883 reactions and 834 pathways. DEOP can also be used to map genes or microbial genomes to potential osmoprotection-associated pathways, and thus link genes and genomes to other associated osmoprotection information. Moreover, DEOP provides a text-mining utility to search deeper into the scientific literature for supporting evidence or for new associations of osmoprotectants to pathways, reactions, enzymes, genes or organisms. Two case studies are provided to demonstrate the usefulness of DEOP. The system can be accessed at the database URL: http://www.cbrc.kaust.edu.sa/deop/ PMID:25326239

  1. Chemogenomic analysis of neuronal differentiation with pathway changes in PC12 cells.

    PubMed

    Lin, Jack Yu-Shih; Wu, Chien Liang; Liao, Chia Nan; Higuchi, Akon; Ling, Qing-Dong

    2016-01-01

    The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database creates networks from interrelations between molecular biology and underlying chemical elements. This allows for analysis of biologic networks, genomic information, and higher-order functional information at a systems level. Through high-throughput experiments and systems biology analysis, we investigated the genes and pathways associated with nerve growth factor (NGF)-induced neuronal differentiation. We performed microarray experiments and used the KEGG database, systems biology analysis, and annotation of pathway functions to study NGF-induced differentiation in PC12 cells. We identified 2020 NGF-induced genes with altered expression over time. Cross-matching with the KEGG database revealed 830 genes, among which 395 altered genes were found to have a 2-fold increase in gene expression over a two-hour period. We then identified 191 associated biologic pathways in the KEGG database; the top 15 pathways showed correlation with neural differentiation. These included the neurotrophin pathways, mitogen-activated protein kinase (MAPK) pathways, genes associated with axonal guidance and the Wnt pathways. The activation of these pathways was synchronized with NGF-induced differentiation in PC12 cells. In summary, we have established a model system that allows one to systematically characterize the functional pathway changes in a neuronal population after an external stimulus.

  2. Genic and Intergenic SSR Database Generation, SNPs Determination and Pathway Annotations, in Date Palm (Phoenix dactylifera L.)

    PubMed Central

    2016-01-01

    The present investigation used bioinformatics tools to identify and characterize simple sequence repeats (SSRs) within the third version of the date palm genome and to develop a new SSR primer database. In addition, single nucleotide polymorphisms (SNPs) located within the SSR flanking regions were identified. Moreover, the pathways for the sequences assigned by SSR primers, the biological functions and gene interactions were determined. A total of 172,075 SSR motifs were identified in the date palm genome sequence, with a frequency of 450.97 SSRs per Mb. Of these, 130,014 SSRs (75.6%) were located within the intergenic regions with a frequency of 499 SSRs per Mb, while only 42,061 SSRs (24.4%) were located within the genic regions with a frequency of 347.5 SSRs per Mb. A total of 111,403 SSR primer pairs were designed, representing 291.9 SSR primers per Mb. Of the 111,403, only 31,380 SSR primers were in the genic regions, while 80,023 primers were in the intergenic regions. A total of 250,507 SNPs were identified in 84,172 SSR flanking regions, which represents 75.55% of the total SSR flanking regions. Out of 12,274 genes, only 463 genes comprising 896 SSR primers were mapped onto 111 pathways using the KEGG database. The most abundant enzymes were identified in the pathway related to the biosynthesis of antibiotics. We tested 1031 SSR primers using both publicly available date palm genome sequences as templates in in silico PCR reactions. For in vitro validation, 31 SSR primers among those used in the in silico PCR were synthesized and tested for their ability to detect polymorphism among six Egyptian date palm cultivars. All tested primers successfully amplified products, but only 18 primers detected polymorphic amplicons among the studied date palm cultivars. PMID:27434138
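
    The counts in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the quoted figures, not part of the authors' pipeline; the variable names are chosen here for illustration. It confirms that the genic and intergenic counts sum to the totals, recomputes the percentages, and derives the sequence length implied by each count and per-Mb density.

```python
# Figures quoted in the abstract; variable names are illustrative only.
total_ssrs = 172_075
intergenic_ssrs = 130_014
genic_ssrs = 42_061
ssr_per_mb = 450.97

total_primers = 111_403
genic_primers = 31_380
intergenic_primers = 80_023
primers_per_mb = 291.9

# The genic and intergenic parts should add up to the reported totals.
assert intergenic_ssrs + genic_ssrs == total_ssrs
assert genic_primers + intergenic_primers == total_primers


def percent(part, whole):
    """Return the share of `whole` taken by `part`, in percent."""
    return 100.0 * part / whole


intergenic_pct = percent(intergenic_ssrs, total_ssrs)
genic_pct = percent(genic_ssrs, total_ssrs)

# Sequence length (in Mb) implied by each count and its per-Mb density.
implied_mb_from_ssrs = total_ssrs / ssr_per_mb
implied_mb_from_primers = total_primers / primers_per_mb

print(f"intergenic: {intergenic_pct:.1f}%  genic: {genic_pct:.1f}%")
print(f"implied length: {implied_mb_from_ssrs:.1f} Mb (from SSRs), "
      f"{implied_mb_from_primers:.1f} Mb (from primers)")
```

    Both density figures imply nearly the same analysed sequence length, which suggests the per-Mb values were computed against a single genome size.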

  3. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases

    PubMed Central

    Caspi, Ron; Altman, Tomer; Dreher, Kate; Fulcher, Carol A.; Subhraveti, Pallavi; Keseler, Ingrid M.; Kothari, Anamika; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A.; Ong, Quang; Paley, Suzanne; Pujar, Anuradha; Shearer, Alexander G.; Travers, Michael; Weerasinghe, Deepika; Zhang, Peifen; Karp, Peter D.

    2012-01-01

    The MetaCyc database (http://metacyc.org/) provides a comprehensive and freely accessible resource for metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are experimentally determined, small-molecule metabolic pathways and are curated from the primary scientific literature. MetaCyc contains more than 1800 pathways derived from more than 30 000 publications, and is the largest curated collection of metabolic pathways currently available. Most reactions in MetaCyc pathways are linked to one or more well-characterized enzymes, and both pathways and enzymes are annotated with reviews, evidence codes and literature citations. BioCyc (http://biocyc.org/) is a collection of more than 1700 organism-specific Pathway/Genome Databases (PGDBs). Each BioCyc PGDB contains the full genome and predicted metabolic network of one organism. The network, which is predicted by the Pathway Tools software using MetaCyc as a reference database, consists of metabolites, enzymes, reactions and metabolic pathways. BioCyc PGDBs contain additional features, including predicted operons, transport systems and pathway-hole fillers. The BioCyc website and Pathway Tools software offer many tools for querying and analysis of PGDBs, including Omics Viewers and comparative analysis. New developments include a zoomable web interface for diagrams; flux-balance analysis model generation from PGDBs; web services; and a new tool called Web Groups. PMID:22102576

  4. Automated workflow-based exploitation of pathway databases provides new insights into genetic associations of metabolite profiles

    PubMed Central

    2013-01-01

    Background Genome-wide association studies (GWAS) have identified many common single nucleotide polymorphisms (SNPs) that associate with clinical phenotypes, but these SNPs usually explain just a small part of the heritability and have relatively modest effect sizes. In contrast, SNPs that associate with metabolite levels generally explain a higher percentage of the genetic variation and demonstrate larger effect sizes. Still, the discovery of SNPs associated with metabolite levels is challenging since testing all metabolites measured in typical metabolomics studies with all SNPs comes with a severe multiple testing penalty. We have developed an automated workflow approach that utilizes prior knowledge of biochemical pathways present in databases like KEGG and BioCyc to generate a smaller SNP set relevant to the metabolite. This paper explores the opportunities and challenges in the analysis of GWAS of metabolomic phenotypes and provides novel insights into the genetic basis of metabolic variation through the re-analysis of published GWAS datasets. Results Re-analysis of the published GWAS dataset from Illig et al. (Nature Genetics, 2010) using a pathway-based workflow (http://www.myexperiment.org/packs/319.html), confirmed previously identified hits and identified a new locus of human metabolic individuality, associating Aldehyde dehydrogenase family1 L1 (ALDH1L1) with serine/glycine ratios in blood. Replication in an independent GWAS dataset of phospholipids (Demirkan et al., PLoS Genetics, 2012) identified two novel loci supported by additional literature evidence: GPAM (Glycerol-3 phosphate acyltransferase) and CBS (Cystathionine beta-synthase). In addition, the workflow approach provided novel insight into the affected pathways and relevance of some of these gene-metabolite pairs in disease development and progression. 
    Conclusions We demonstrate the utility of automated exploitation of background knowledge present in pathway databases for the analysis of GWAS

  5. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases

    PubMed Central

    Caspi, Ron; Billington, Richard; Ferrer, Luciana; Foerster, Hartmut; Fulcher, Carol A.; Keseler, Ingrid M.; Kothari, Anamika; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A.; Ong, Quang; Paley, Suzanne; Subhraveti, Pallavi; Weaver, Daniel S.; Karp, Peter D.

    2016-01-01

    The MetaCyc database (MetaCyc.org) is a freely accessible comprehensive database describing metabolic pathways and enzymes from all domains of life. The majority of MetaCyc pathways are small-molecule metabolic pathways that have been experimentally determined. MetaCyc contains more than 2400 pathways derived from >46 000 publications, and is the largest curated collection of metabolic pathways. BioCyc (BioCyc.org) is a collection of 5700 organism-specific Pathway/Genome Databases (PGDBs), each containing the full genome and predicted metabolic network of one organism, including metabolites, enzymes, reactions, metabolic pathways, predicted operons, transport systems, and pathway-hole fillers. The BioCyc website offers a variety of tools for querying and analyzing PGDBs, including Omics Viewers and tools for comparative analysis. This article provides an update of new developments in MetaCyc and BioCyc during the last two years, including addition of Gibbs free energy values for compounds and reactions; redesign of the primary gene/protein page; addition of a tool for creating diagrams containing multiple linked pathways; several new search capabilities, including searching for genes based on sequence patterns, searching for databases based on an organism's phenotypes, and a cross-organism search; and a metabolite identifier translation service. PMID:26527732

  6. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases.

    PubMed

    Caspi, Ron; Billington, Richard; Ferrer, Luciana; Foerster, Hartmut; Fulcher, Carol A; Keseler, Ingrid M; Kothari, Anamika; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A; Ong, Quang; Paley, Suzanne; Subhraveti, Pallavi; Weaver, Daniel S; Karp, Peter D

    2016-01-04

    The MetaCyc database (MetaCyc.org) is a freely accessible comprehensive database describing metabolic pathways and enzymes from all domains of life. The majority of MetaCyc pathways are small-molecule metabolic pathways that have been experimentally determined. MetaCyc contains more than 2400 pathways derived from >46,000 publications, and is the largest curated collection of metabolic pathways. BioCyc (BioCyc.org) is a collection of 5700 organism-specific Pathway/Genome Databases (PGDBs), each containing the full genome and predicted metabolic network of one organism, including metabolites, enzymes, reactions, metabolic pathways, predicted operons, transport systems, and pathway-hole fillers. The BioCyc website offers a variety of tools for querying and analyzing PGDBs, including Omics Viewers and tools for comparative analysis. This article provides an update of new developments in MetaCyc and BioCyc during the last two years, including addition of Gibbs free energy values for compounds and reactions; redesign of the primary gene/protein page; addition of a tool for creating diagrams containing multiple linked pathways; several new search capabilities, including searching for genes based on sequence patterns, searching for databases based on an organism's phenotypes, and a cross-organism search; and a metabolite identifier translation service.

  7. KEGG orthology-based annotation of the predicted proteome of Acropora digitifera: ZoophyteBase - an open access and searchable database of a coral genome

    PubMed Central

    2013-01-01

    Background Contemporary coral reef research has firmly established that a genomic approach is urgently needed to better understand the effects of anthropogenic environmental stress and global climate change on coral holobiont interactions. Here we present KEGG orthology-based annotation of the complete genome sequence of the scleractinian coral Acropora digitifera and provide the first comprehensive view of the genome of a reef-building coral by applying advanced bioinformatics. Description Sequences from the KEGG database of protein function were used to construct hidden Markov models. These models were used to search the predicted proteome of A. digitifera to establish complete genomic annotation. The annotated dataset is published in ZoophyteBase, an open access format with different options for searching the data. A particularly useful feature is the ability to use a Google-like search engine that links query words to protein attributes. We present features of the annotation that underpin the molecular structure of key processes of coral physiology that include (1) regulatory proteins of symbiosis, (2) planula and early developmental proteins, (3) neural messengers, receptors and sensory proteins, (4) calcification and Ca2+-signalling proteins, (5) plant-derived proteins, (6) proteins of nitrogen metabolism, (7) DNA repair proteins, (8) stress response proteins, (9) antioxidant and redox-protective proteins, (10) proteins of cellular apoptosis, (11) microbial symbioses and pathogenicity proteins, (12) proteins of viral pathogenicity, (13) toxins and venom, (14) proteins of the chemical defensome and (15) coral epigenetics. Conclusions We advocate that providing annotation in an open-access searchable database available to the public domain will give an unprecedented foundation to interrogate the fundamental molecular structure and interactions of coral symbiosis and allow critical questions to be addressed at the genomic level based on combined aspects of

  8. Database Constraints Applied to Metabolic Pathway Reconstruction Tools

    PubMed Central

    Vilaplana, Jordi; Solsona, Francesc; Teixido, Ivan; Usié, Anabel; Karathia, Hiren; Alves, Rui; Mateo, Jordi

    2014-01-01

    Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes. PMID:25202745

  9. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases

    PubMed Central

    Caspi, Ron; Altman, Tomer; Billington, Richard; Dreher, Kate; Foerster, Hartmut; Fulcher, Carol A.; Holland, Timothy A.; Keseler, Ingrid M.; Kothari, Anamika; Kubo, Aya; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A.; Ong, Quang; Paley, Suzanne; Subhraveti, Pallavi; Weaver, Daniel S.; Weerasinghe, Deepika; Zhang, Peifen; Karp, Peter D.

    2014-01-01

    The MetaCyc database (MetaCyc.org) is a comprehensive and freely accessible database describing metabolic pathways and enzymes from all domains of life. MetaCyc pathways are experimentally determined, mostly small-molecule metabolic pathways and are curated from the primary scientific literature. MetaCyc contains >2100 pathways derived from >37 000 publications, and is the largest curated collection of metabolic pathways currently available. BioCyc (BioCyc.org) is a collection of >3000 organism-specific Pathway/Genome Databases (PGDBs), each containing the full genome and predicted metabolic network of one organism, including metabolites, enzymes, reactions, metabolic pathways, predicted operons, transport systems and pathway-hole fillers. Additions to BioCyc over the past 2 years include YeastCyc, a PGDB for Saccharomyces cerevisiae, and 891 new genomes from the Human Microbiome Project. The BioCyc Web site offers a variety of tools for querying and analysis of PGDBs, including Omics Viewers and tools for comparative analysis. New developments include atom mappings in reactions, a new representation of glycan degradation pathways, improved compound structure display, better coverage of enzyme kinetic data, enhancements of the Web Groups functionality, improvements to the Omics viewers, a new representation of the Enzyme Commission system and, for the desktop version of the software, the ability to save display states. PMID:24225315

  10. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases.

    PubMed

    Caspi, Ron; Altman, Tomer; Billington, Richard; Dreher, Kate; Foerster, Hartmut; Fulcher, Carol A; Holland, Timothy A; Keseler, Ingrid M; Kothari, Anamika; Kubo, Aya; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A; Ong, Quang; Paley, Suzanne; Subhraveti, Pallavi; Weaver, Daniel S; Weerasinghe, Deepika; Zhang, Peifen; Karp, Peter D

    2014-01-01

    The MetaCyc database (MetaCyc.org) is a comprehensive and freely accessible database describing metabolic pathways and enzymes from all domains of life. MetaCyc pathways are experimentally determined, mostly small-molecule metabolic pathways and are curated from the primary scientific literature. MetaCyc contains >2100 pathways derived from >37,000 publications, and is the largest curated collection of metabolic pathways currently available. BioCyc (BioCyc.org) is a collection of >3000 organism-specific Pathway/Genome Databases (PGDBs), each containing the full genome and predicted metabolic network of one organism, including metabolites, enzymes, reactions, metabolic pathways, predicted operons, transport systems and pathway-hole fillers. Additions to BioCyc over the past 2 years include YeastCyc, a PGDB for Saccharomyces cerevisiae, and 891 new genomes from the Human Microbiome Project. The BioCyc Web site offers a variety of tools for querying and analysis of PGDBs, including Omics Viewers and tools for comparative analysis. New developments include atom mappings in reactions, a new representation of glycan degradation pathways, improved compound structure display, better coverage of enzyme kinetic data, enhancements of the Web Groups functionality, improvements to the Omics viewers, a new representation of the Enzyme Commission system and, for the desktop version of the software, the ability to save display states.

  11. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases

    PubMed Central

    Caspi, Ron; Altman, Tomer; Dale, Joseph M.; Dreher, Kate; Fulcher, Carol A.; Gilham, Fred; Kaipa, Pallavi; Karthikeyan, Athikkattuvalasu S.; Kothari, Anamika; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A.; Paley, Suzanne; Popescu, Liviu; Pujar, Anuradha; Shearer, Alexander G.; Zhang, Peifen; Karp, Peter D.

    2010-01-01

    The MetaCyc database (MetaCyc.org) is a comprehensive and freely accessible resource for metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are experimentally determined, small-molecule metabolic pathways and are curated from the primary scientific literature. With more than 1400 pathways, MetaCyc is the largest collection of metabolic pathways currently available. Pathway reactions are linked to one or more well-characterized enzymes, and both pathways and enzymes are annotated with reviews, evidence codes, and literature citations. BioCyc (BioCyc.org) is a collection of more than 500 organism-specific Pathway/Genome Databases (PGDBs). Each BioCyc PGDB contains the full genome and predicted metabolic network of one organism. The network, which is predicted by the Pathway Tools software using MetaCyc as a reference, consists of metabolites, enzymes, reactions and metabolic pathways. BioCyc PGDBs also contain additional features, such as predicted operons, transport systems, and pathway hole-fillers. The BioCyc Web site offers several tools for the analysis of the PGDBs, including Omics Viewers that enable visualization of omics datasets on two different genome-scale diagrams and tools for comparative analysis. The BioCyc PGDBs generated by SRI are offered for adoption by any party interested in curation of metabolic, regulatory, and genome-related information about an organism. PMID:19850718

  12. SMPDB 2.0: big improvements to the Small Molecule Pathway Database.

    PubMed

    Jewison, Timothy; Su, Yilu; Disfany, Fatemeh Miri; Liang, Yongjie; Knox, Craig; Maciejewski, Adam; Poelzer, Jenna; Huynh, Jessica; Zhou, You; Arndt, David; Djoumbou, Yannick; Liu, Yifeng; Deng, Lu; Guo, An Chi; Han, Beomsoo; Pon, Allison; Wilson, Michael; Rafatnia, Shahrzad; Liu, Philip; Wishart, David S

    2014-01-01

    The Small Molecule Pathway Database (SMPDB, http://www.smpdb.ca) is a comprehensive, colorful, fully searchable and highly interactive database for visualizing human metabolic, drug action, drug metabolism, physiological activity and metabolic disease pathways. SMPDB contains >600 pathways with nearly 75% of its pathways not found in any other database. All SMPDB pathway diagrams are extensively hyperlinked and include detailed information on the relevant tissues, organs, organelles, subcellular compartments, protein cofactors, protein locations, metabolite locations, chemical structures and protein quaternary structures. Since its last release in 2010, SMPDB has undergone substantial upgrades and significant expansion. In particular, the total number of pathways in SMPDB has grown by >70%. Additionally, every previously entered pathway has been completely redrawn, standardized, corrected, updated and enhanced with additional molecular or cellular information. Many SMPDB pathways now include transporter proteins as well as much more physiological, tissue, target organ and reaction compartment data. Thanks to the development of a standardized pathway drawing tool (called PathWhiz) all SMPDB pathways are now much more easily drawn and far more rapidly updated. PathWhiz has also allowed all SMPDB pathways to be saved in a BioPAX format. Significant improvements to SMPDB's visualization interface now make the browsing, selection, recoloring and zooming of pathways far easier and far more intuitive. Because of its utility and breadth of coverage, SMPDB is now integrated into several other databases including HMDB and DrugBank.

  13. annot8r: GO, EC and KEGG annotation of EST datasets

    PubMed Central

    Schmid, Ralf; Blaxter, Mark L

    2008-01-01

    Background The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets poses a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways. Results annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and also in a relational postgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools. Conclusion annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non-model species EST
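
    The last two steps described above (parsing BLAST results, then retrieving annotations from a reference database) can be sketched as follows. This is not annot8r's own code: the reference table, the sequence identifiers, the E-value cutoff and the function name are all hypothetical, and the hits are assumed to be in the standard 12-column BLAST tabular layout (query, subject, ..., E-value, bit score).

```python
# Hypothetical reference table: reference sequence ID -> stored annotations.
REFERENCE = {
    "REF_A": {"GO": ["GO:0005524"], "EC": ["2.7.1.1"], "KEGG": ["ko00010"]},
    "REF_B": {"GO": ["GO:0003677"], "EC": [], "KEGG": []},
}

# Hypothetical hits in 12-column BLAST tabular layout (tab separated):
# query, subject, %identity, length, mismatches, gap opens,
# q.start, q.end, s.start, s.end, E-value, bit score
BLAST_TABULAR = (
    "est_001\tREF_A\t85.0\t200\t30\t0\t1\t200\t5\t204\t1e-50\t190\n"
    "est_001\tREF_B\t40.0\t120\t72\t0\t1\t120\t9\t128\t1e-3\t45\n"
    "est_002\tREF_B\t78.0\t150\t33\t0\t1\t150\t1\t150\t1e-30\t130\n"
)


def annotate(blast_text, reference, max_evalue=1e-10):
    """Map each query to the annotations of its best-scoring accepted hit.

    The E-value cutoff is an illustrative assumption, not a documented
    annot8r default.
    """
    best = {}  # query -> (subject, bit score)
    for line in blast_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit too weak to transfer an annotation
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: reference.get(subject, {})
            for query, (subject, _) in best.items()}


annotations = annotate(BLAST_TABULAR, REFERENCE)
for query, terms in sorted(annotations.items()):
    print(query, terms)
```

    Keeping only the best hit per query is the simplest possible policy; a real tool might instead keep several hits per query and record the supporting score with each transferred term.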

  14. VisANT 3.0: new modules for pathway visualization, editing, prediction and construction.

    PubMed

    Hu, Zhenjun; Ng, David M; Yamada, Takuji; Chen, Chunnuan; Kawashima, Shuichi; Mellor, Joe; Linghu, Bolan; Kanehisa, Minoru; Stuart, Joshua M; DeLisi, Charles

    2007-07-01

    With the integration of the KEGG and Predictome databases as well as two search engines for coexpressed genes/proteins using data sets obtained from the Stanford Microarray Database (SMD) and Gene Expression Omnibus (GEO) database, VisANT 3.0 supports exploratory pathway analysis, which includes multi-scale visualization of multiple pathways, editing and annotating pathways using a KEGG compatible visual notation and visualization of expression data in the context of pathways. Expression levels are represented either by color intensity or by nodes with an embedded expression profile. Multiple experiments can be navigated or animated. Known KEGG pathways can be enriched by querying either coexpressed components of known pathway members or proteins with known physical interactions. Predicted pathways for genes/proteins with unknown functions can be inferred from coexpression or physical interaction data. Pathways produced in VisANT can be saved as computer-readable XML format (VisML), graphic images or high-resolution Scalable Vector Graphics (SVG). Pathways in the format of VisML can be securely shared within an interested group or published online using a simple Web link. VisANT is freely available at http://visant.bu.edu.

  15. KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases.

    PubMed

    Xie, Chen; Mao, Xizeng; Huang, Jiaju; Ding, Yang; Wu, Jianmin; Dong, Shan; Kong, Lei; Gao, Ge; Li, Chuan-Yun; Wei, Liping

    2011-07-01

    High-throughput experimental technologies often identify dozens to hundreds of genes related to, or changed in, a biological or pathological process. From these genes one wants to identify biological pathways that may be involved and diseases that may be implicated. Here, we report a web server, KOBAS 2.0, which annotates an input set of genes with putative pathways and disease relationships based on mapping to genes with known annotations. It allows for both ID mapping and cross-species sequence similarity mapping. It then performs statistical tests to identify statistically significantly enriched pathways and diseases. KOBAS 2.0 incorporates knowledge across 1327 species from 5 pathway databases (KEGG PATHWAY, PID, BioCyc, Reactome and Panther) and 5 human disease databases (OMIM, KEGG DISEASE, FunDO, GAD and NHGRI GWAS Catalog). KOBAS 2.0 can be accessed at http://kobas.cbi.pku.edu.cn.
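
    The abstract does not say which statistical test is applied. As an illustration only, the sketch below uses a hypergeometric over-representation test, a common choice for this kind of enrichment analysis; the function name and the example numbers are assumptions made here, not KOBAS internals.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_input, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    X is the number of pathway genes seen when n_input genes are drawn
    without replacement from n_background genes, of which n_pathway
    belong to the pathway.
    """
    upper = min(n_pathway, n_input)
    total = comb(n_background, n_input)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_input - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, a 100-gene pathway,
# 200 input genes, 8 of which fall in the pathway.
p_value = hypergeom_enrichment_p(20_000, 100, 200, 8)
print(f"enrichment p-value: {p_value:.3g}")
```

    When many pathways are tested at once, the resulting p-values would normally also be corrected for multiple testing before any pathway is called enriched.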

  16. Kinase Pathway Database: An Integrated Protein-Kinase and NLP-Based Protein-Interaction Resource

    PubMed Central

    Koike, Asako; Kobayashi, Yoshiyuki; Takagi, Toshihisa

    2003-01-01

    Protein kinases play a crucial role in the regulation of cellular functions. Various kinds of information about these molecules are important for understanding signaling pathways and organism characteristics. We have developed the Kinase Pathway Database, an integrated database involving major completely sequenced eukaryotes. It contains the classification of protein kinases and their functional conservation, ortholog tables among species, protein–protein, protein–gene, and protein–compound interaction data, domain information, and structural information. It also provides an automatic pathway graphic image interface. The protein, gene, and compound interactions are automatically extracted from abstracts for all genes and proteins by natural-language processing (NLP). The method of automatic extraction uses phrase patterns and the GENA protein, gene, and compound name dictionary, which was developed by our group. With this database, pathways are easily compared among species using data with more than 47,000 protein interactions and protein kinase ortholog tables. The database is available for querying and browsing at http://kinasedb.ontology.ims.u-tokyo.ac.jp/. PMID:12799355

  17. Creation of a genome-wide metabolic pathway database for Populus trichocarpa using a new approach for reconstruction and curation of metabolic pathways for plants.

    PubMed

    Zhang, Peifen; Dreher, Kate; Karthikeyan, A; Chi, Anjo; Pujar, Anuradha; Caspi, Ron; Karp, Peter; Kirkup, Vanessa; Latendresse, Mario; Lee, Cynthia; Mueller, Lukas A; Muller, Robert; Rhee, Seung Yon

    2010-08-01

    Metabolic networks reconstructed from sequenced genomes or transcriptomes can help visualize and analyze large-scale experimental data, predict metabolic phenotypes, discover enzymes, engineer metabolic pathways, and study metabolic pathway evolution. We developed a general approach for reconstructing metabolic pathway complements of plant genomes. Two new reference databases were created and added to the core of the infrastructure: a comprehensive, all-plant reference pathway database, PlantCyc, and a reference enzyme sequence database, RESD, for annotating metabolic functions of protein sequences. PlantCyc (version 3.0) includes 714 metabolic pathways and 2,619 reactions from over 300 species. RESD (version 1.0) contains 14,187 literature-supported enzyme sequences from across all kingdoms. We used RESD, PlantCyc, and MetaCyc (an all-species reference metabolic pathway database), in conjunction with the pathway prediction software Pathway Tools, to reconstruct a metabolic pathway database, PoplarCyc, from the recently sequenced genome of Populus trichocarpa. PoplarCyc (version 1.0) contains 321 pathways with 1,807 assigned enzymes. Comparing PoplarCyc (version 1.0) with AraCyc (version 6.0, Arabidopsis [Arabidopsis thaliana]) showed comparable numbers of pathways distributed across all domains of metabolism in both databases, except for a higher number of AraCyc pathways in secondary metabolism and a 1.5-fold increase in carbohydrate metabolic enzymes in PoplarCyc. Here, we introduce these new resources and demonstrate the feasibility of using them to identify candidate enzymes for specific pathways and to analyze metabolite profiling data through concrete examples. These resources can be searched by text or BLAST, browsed, and downloaded from our project Web site (http://plantcyc.org).

  18. Data, information, knowledge and principle: back to metabolism in KEGG.

    PubMed

    Kanehisa, Minoru; Goto, Susumu; Sato, Yoko; Kawashima, Masayuki; Furumichi, Miho; Tanabe, Mao

    2014-01-01

    In the hierarchy of data, information and knowledge, computational methods play a major role in the initial processing of data to extract information, but they alone become less effective to compile knowledge from information. The Kyoto Encyclopedia of Genes and Genomes (KEGG) resource (http://www.kegg.jp/ or http://www.genome.jp/kegg/) has been developed as a reference knowledge base to assist this latter process. In particular, the KEGG pathway maps are widely used for biological interpretation of genome sequences and other high-throughput data. The link from genomes to pathways is made through the KEGG Orthology system, a collection of manually defined ortholog groups identified by K numbers. To better automate this interpretation process the KEGG modules defined by Boolean expressions of K numbers have been expanded and improved. Once genes in a genome are annotated with K numbers, the KEGG modules can be computationally evaluated revealing metabolic capacities and other phenotypic features. The reaction modules, which represent chemical units of reactions, have been used to analyze design principles of metabolic networks and also to improve the definition of K numbers and associated annotations. For translational bioinformatics, the KEGG MEDICUS resource has been developed by integrating drug labels (package inserts) used in society.
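
    The idea of evaluating a module defined as a Boolean expression of K numbers against an annotated genome can be sketched as below. The nested-tuple representation, the function name and the K numbers are placeholders invented for this example; the sketch does not parse KEGG's actual module definition syntax.

```python
def module_complete(expr, annotated):
    """Evaluate a nested Boolean expression over K numbers.

    `expr` is either a K number string or a tuple whose first element is
    "and" or "or", followed by sub-expressions. `annotated` is the set of
    K numbers assigned to the genes of one genome.
    """
    if isinstance(expr, str):
        return expr in annotated
    op, *args = expr
    if op == "and":
        return all(module_complete(arg, annotated) for arg in args)
    if op == "or":
        return any(module_complete(arg, annotated) for arg in args)
    raise ValueError(f"unknown operator: {op!r}")


# Placeholder module: step 1 AND (either of two alternatives) AND step 3.
TOY_MODULE = ("and", "K00001", ("or", "K00002", "K00003"), "K00004")

genome_a = {"K00001", "K00003", "K00004"}  # has every required step
genome_b = {"K00001", "K00002"}            # lacks the final step

print("genome_a complete:", module_complete(TOY_MODULE, genome_a))
print("genome_b complete:", module_complete(TOY_MODULE, genome_b))
```

    Representing alternatives with "or" lets a single definition cover organisms that carry out the same step with different ortholog groups.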

  19. FragariaCyc: A Metabolic Pathway Database for Woodland Strawberry Fragaria vesca

    PubMed Central

    Naithani, Sushma; Partipilo, Christina M.; Raja, Rajani; Elser, Justin L.; Jaiswal, Pankaj

    2016-01-01

    FragariaCyc is a strawberry-specific cellular metabolic network based on the annotated genome sequence of Fragaria vesca L. ssp. vesca, accession Hawaii 4. It was built on the Pathway-Tools platform using MetaCyc as the reference. Experimental evidence from the published literature was used to support or edit existing entities and to add new pathways, enzymes, reactions, compounds, and small molecules to the database. To date, FragariaCyc comprises 66 super-pathways, 488 unique pathways, 2348 metabolic reactions, 3507 enzymes, and 2134 compounds. In addition to searching and browsing FragariaCyc, researchers can compare pathways across various plant metabolic networks and analyze their data using the Omics Viewer tool. We view FragariaCyc as a resource for the community of researchers working with strawberry and related fruit crops. It can help in understanding the regulation of overall metabolism of the strawberry plant during development and in response to diseases and abiotic stresses. FragariaCyc is available online at http://pathways.cgrb.oregonstate.edu. PMID:26973684

  20. FragariaCyc: A Metabolic Pathway Database for Woodland Strawberry Fragaria vesca.

    PubMed

    Naithani, Sushma; Partipilo, Christina M; Raja, Rajani; Elser, Justin L; Jaiswal, Pankaj

    2016-01-01

    FragariaCyc is a strawberry-specific cellular metabolic network based on the annotated genome sequence of Fragaria vesca L. ssp. vesca, accession Hawaii 4. It was built on the Pathway-Tools platform using MetaCyc as the reference. The experimental evidences from published literature were used for supporting/editing existing entities and for the addition of new pathways, enzymes, reactions, compounds, and small molecules in the database. To date, FragariaCyc comprises 66 super-pathways, 488 unique pathways, 2348 metabolic reactions, 3507 enzymes, and 2134 compounds. In addition to searching and browsing FragariaCyc, researchers can compare pathways across various plant metabolic networks and analyze their data using Omics Viewer tool. We view FragariaCyc as a resource for the community of researchers working with strawberry and related fruit crops. It can help understanding the regulation of overall metabolism of strawberry plant during development and in response to diseases and abiotic stresses. FragariaCyc is available online at http://pathways.cgrb.oregonstate.edu.

  1. A New Methodology to Associate SNPs with Human Diseases According to Their Pathway Related Context

    PubMed Central

    Bakir-Gungor, Burcu; Sezerman, Osman Ugur

    2011-01-01

    Genome-wide association studies (GWAS) with hundreds of thousands of single nucleotide polymorphisms (SNPs) are popular strategies to reveal the genetic basis of human complex diseases. Despite many successes of GWAS, it is well recognized that new analytical approaches have to be integrated to achieve their full potential. Starting with a list of SNPs found to be associated with disease in GWAS, here we propose a novel methodology to devise functionally important KEGG pathways through the identification of genes within these pathways, where these genes are obtained from SNP analysis. Our methodology is based on functionalization of important SNPs to identify affected genes and disease-related pathways. We have tested our methodology on the WTCCC Rheumatoid Arthritis (RA) dataset and identified: i) previously known RA-related KEGG pathways (e.g., Toll-like receptor signaling, Jak-STAT signaling, Antigen processing, Leukocyte transendothelial migration and MAPK signaling pathways); ii) additional KEGG pathways (e.g., Pathways in cancer, Neurotrophin signaling, Chemokine signaling pathways) as associated with RA. Furthermore, these newly found pathways included genes which are targets of RA-specific drugs. Even though GWAS analysis identifies 14 out of 83 of those drug target genes, the newly found functionally important KEGG pathways led to the discovery of 25 out of 83 genes known to be used as drug targets for the treatment of RA. Among the previously known pathways, we identified additional genes associated with RA (e.g. Antigen processing and presentation, Tight junction). Importantly, within these pathways, the associations between some of these additionally found genes, such as HLA-C, HLA-G, PRKCQ, PRKCZ, TAP1, TAP2, and RA were verified by either the OMIM database or by literature retrieved from the NCBI PubMed module. With the whole-genome sequencing on the horizon, we show that the full potential of GWAS can be achieved by integrating pathway and network

  2. Exploring consumer exposure pathways and patterns of use for chemicals in the environment through the Chemical/Product Categories Database

    EPA Pesticide Factsheets

    Exploring consumer exposure pathways and patterns of use for chemicals in the environment through the Chemical/Product Categories Database (CPCat). Presented by Kathie Dionisio, Sc.D., NERL, US EPA, Research Triangle Park, NC (1/23/2014).

  3. A Novel Method for Pathway Identification Based on Attractor and Crosstalk in Polyarticular Juvenile Idiopathic Arthritis

    PubMed Central

    Wang, Yuanji; Lin, Shunhua; Li, Changhui; Li, Yizhao; Chen, Lei; Wang, Yingzhen

    2016-01-01

    Background: Juvenile idiopathic arthritis (JIA) is one of the most common inflammatory disorders of unknown etiology. We introduced a novel method to identify dysregulated pathways associated with polyarticular JIA (pJIA). Material/Methods: Gene expression profiles of 61 children with pJIA and 59 healthy controls were collected from E-GEOD-13849; 300 pathways were obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and 787,896 protein-protein interaction sets were gathered from the Retrieval of Interacting Genes. Attractor and crosstalk were designed to complement each other to increase the integrity of pathway assessment. Then, impact factor was used to assess the interactions between pathways, and RP-value was used to evaluate the comprehensive influential ability of attractors. Results: There were seven attractors with p<0.01 and 14 pathways with RP<0.01. Finally, two significantly dysfunctional pathways were found, which were related to pJIA progression: p53 signaling pathway (KEGG ID: 04115) and non-alcoholic fatty liver disease (NAFLD) (KEGG ID: 04932). Conclusions: A novel approach that identified the dysregulated pathways in pJIA was constructed based on attractor and crosstalk. The new process is expected to be efficient in the upcoming era of medicine. PMID:27804927

  4. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    PubMed

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-05

    Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined, glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or less monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database.

  5. miRPathDB: a new dictionary on microRNAs and target pathways.

    PubMed

    Backes, Christina; Kehl, Tim; Stöckel, Daniel; Fehlmann, Tobias; Schneider, Lara; Meese, Eckart; Lenhof, Hans-Peter; Keller, Andreas

    2017-01-04

    In the last decade, miRNAs and their regulatory mechanisms have been intensively studied and many tools for the analysis of miRNAs and their targets have been developed. We previously presented a dictionary on single miRNAs and their putative target pathways. Since then, the number of miRNAs has tripled and the knowledge on miRNAs and targets has grown substantially. This, along with changes in pathway resources such as KEGG, leads to an improved understanding of miRNAs, their target genes and related pathways. Here, we introduce the miRNA Pathway Dictionary Database (miRPathDB), freely accessible at https://mpd.bioinf.uni-sb.de/. With the database we aim to complement available target pathway web-servers by providing researchers easy access to the information which pathways are regulated by a miRNA, which miRNAs target a pathway and how specific these regulations are. The database contains a large number of miRNAs (2595 human miRNAs), different miRNA target sets (14 773 experimentally validated target genes as well as 19 281 predicted target genes) and a broad selection of functional biochemical categories (KEGG-, WikiPathways-, BioCarta-, SMPDB-, PID-, Reactome pathways, functional categories from gene ontology (GO), protein families from Pfam and chromosomal locations totaling 12 875 categories). In addition to Homo sapiens, also Mus musculus data are stored and can be compared to human target pathways.

  6. miRPathDB: a new dictionary on microRNAs and target pathways

    PubMed Central

    Backes, Christina; Kehl, Tim; Stöckel, Daniel; Fehlmann, Tobias; Schneider, Lara; Meese, Eckart; Lenhof, Hans-Peter; Keller, Andreas

    2017-01-01

    In the last decade, miRNAs and their regulatory mechanisms have been intensively studied and many tools for the analysis of miRNAs and their targets have been developed. We previously presented a dictionary on single miRNAs and their putative target pathways. Since then, the number of miRNAs has tripled and the knowledge on miRNAs and targets has grown substantially. This, along with changes in pathway resources such as KEGG, leads to an improved understanding of miRNAs, their target genes and related pathways. Here, we introduce the miRNA Pathway Dictionary Database (miRPathDB), freely accessible at https://mpd.bioinf.uni-sb.de/. With the database we aim to complement available target pathway web-servers by providing researchers easy access to the information which pathways are regulated by a miRNA, which miRNAs target a pathway and how specific these regulations are. The database contains a large number of miRNAs (2595 human miRNAs), different miRNA target sets (14 773 experimentally validated target genes as well as 19 281 predicted target genes) and a broad selection of functional biochemical categories (KEGG-, WikiPathways-, BioCarta-, SMPDB-, PID-, Reactome pathways, functional categories from gene ontology (GO), protein families from Pfam and chromosomal locations totaling 12 875 categories). In addition to Homo sapiens, also Mus musculus data are stored and can be compared to human target pathways. PMID:27742822

  7. DemaDb: an integrated dematiaceous fungal genomes database.

    PubMed

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in humans, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database, was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways that were identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser based on the KEGG pathway database allows users to look into molecular interaction and reaction networks for all KEGG annotated genes. The availability of downloadable files containing assembly, nucleic acid, as well as protein data allows the direct retrieval for further downstream works. DemaDb is a useful resource for the fungal research community, especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my.

  8. DemaDb: an integrated dematiaceous fungal genomes database

    PubMed Central

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in humans, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database, was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways that were identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser based on the KEGG pathway database allows users to look into molecular interaction and reaction networks for all KEGG annotated genes. The availability of downloadable files containing assembly, nucleic acid, as well as protein data allows the direct retrieval for further downstream works. DemaDb is a useful resource for the fungal research community, especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my. PMID:26980516

  9. Hedgehog Signaling Pathway Database: a repository of current annotation efforts and resources for the Hh research community.

    PubMed

    Hervold, Kieran; Martin, Andrew; Kirkpatrick, Roger A; Mc Kenna, Paul F; Ramirez-Weber, F A

    2007-01-01

    The Hedgehog Signaling Pathway Database is a curated repository of information pertaining to the Hedgehog developmental pathway. It was designed to provide centralized access to a wide range of relevant information in an organism-agnostic manner. Data are provided for all genes and gene targets known to be involved in the Hh pathway across various organisms. The data provided include DNA and protein sequences as well as domain structure motifs. All known human diseases associated with the Hh pathway are indexed including experimental data on therapeutic agents and their molecular targets. Hh researchers will find useful information on relevant protocols, tissue cell lines and reagents used in current Hh research projects. Curated content is also provided for publications, grants and patents relating to the Hh pathway. The database can be accessed at http://www.hedgehog.sfsu.edu.

  10. Enhancing a Pathway-Genome Database (PGDB) to Capture Subcellular Localization of Metabolites and Enzymes: The Nucleotide-Sugar Biosynthetic Pathways of Populus trichocarpa

    SciTech Connect

    Nag, A.; Karpinets, T. V.; Chang, C. H.; Bar-Peled, M.

    2012-01-01

    Understanding how cellular metabolism works and is regulated requires that the underlying biochemical pathways be adequately represented and integrated with large metabolomic data sets to establish a robust network model. Genetically engineering energy crops to be less recalcitrant to saccharification requires detailed knowledge of plant polysaccharide structures and a thorough understanding of the metabolic pathways involved in forming and regulating cell-wall synthesis. Nucleotide-sugars are building blocks for synthesis of cell wall polysaccharides. The biosynthesis of nucleotide-sugars is catalyzed by a multitude of enzymes that reside in different subcellular organelles, and precise representation of these pathways requires accurate capture of this biological compartmentalization. The lack of simple localization cues in genomic sequence data and annotations however leads to missing compartmentalization information for eukaryotes in automatically generated databases, such as the Pathway-Genome Databases (PGDBs) of the SRI Pathway Tools software that drives much biochemical knowledge representation on the internet. In this report, we provide an informal mechanism using the existing Pathway Tools framework to integrate protein and metabolite sub-cellular localization data with the existing representation of the nucleotide-sugar metabolic pathways in a prototype PGDB for Populus trichocarpa. The enhanced pathway representations have been successfully used to map SNP abundance data to individual nucleotide-sugar biosynthetic genes in the PGDB. The manually curated pathway representations are more conducive to the construction of a computational platform that will allow the simulation of natural and engineered nucleotide-sugar precursor fluxes into specific recalcitrant polysaccharide(s).

  11. aglgenes, A curated and searchable database of archaeal N-glycosylation pathway components.

    PubMed

    Godin, Noa; Eichler, Jerry

    2014-01-01

    Whereas N-glycosylation is a posttranslational modification performed across evolution, the archaeal version of this protein-processing event presents a degree of diversity not seen in either bacteria or eukarya. Accordingly, archaeal N-glycosylation relies on a large number of enzymes that are often species-specific or restricted to a select group of species. As such, there is a need for an organized platform upon which amassing information about archaeal glycosylation (agl) genes can rest. Accordingly, the aglgenes database provides detailed descriptions of experimentally characterized archaeal N-glycosylation pathway components. For each agl gene, genomic information, supporting literature and relevant external links are provided at a functional intuitive web-interface designed for data browsing. Routine updates ensure that novel experimental information on genes and proteins contributing to archaeal N-glycosylation is incorporated into aglgenes in a timely manner. As such, aglgenes represents a specialized resource for sharing validated experimental information online, providing support for workers in the field of archaeal protein glycosylation. Database URL: www.bgu.ac.il/aglgenes.

  12. Multiomics in Grape Berry Skin Revealed Specific Induction of the Stilbene Synthetic Pathway by Ultraviolet-C Irradiation

    PubMed Central

    Suzuki, Mami; Nakabayashi, Ryo; Ogata, Yoshiyuki; Sakurai, Nozomu; Tokimatsu, Toshiaki; Goto, Susumu; Suzuki, Makoto; Jasinski, Michal; Martinoia, Enrico; Otagaki, Shungo; Matsumoto, Shogo; Saito, Kazuki; Shiratake, Katsuhiro

    2015-01-01

    Grape (Vitis vinifera) accumulates various polyphenolic compounds, which protect against environmental stresses, including ultraviolet-C (UV-C) light and pathogens. In this study, we looked at the transcriptome and metabolome in grape berry skin after UV-C irradiation, which demonstrated the effectiveness of omics approaches to clarify important traits of grape. We performed transcriptome analysis using a genome-wide microarray, which revealed 238 genes up-regulated more than 5-fold by UV-C light. Enrichment analysis of Gene Ontology terms showed that genes encoding stilbene synthase, a key enzyme for resveratrol synthesis, were enriched in the up-regulated genes. We performed metabolome analysis using liquid chromatography-quadrupole time-of-flight mass spectrometry, and 2,012 metabolite peaks, including unidentified peaks, were detected. Principal component analysis using the peaks showed that only one metabolite peak, identified as resveratrol, was highly induced by UV-C light. We updated the metabolic pathway map of grape in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and in the KaPPA-View 4 KEGG system, then projected the transcriptome and metabolome data on a metabolic pathway map. The map showed specific induction of the resveratrol synthetic pathway by UV-C light. Our results showed that multiomics is a powerful tool to elucidate the accumulation mechanisms of secondary metabolites, and updated systems, such as KEGG and KaPPA-View 4 KEGG for grape, can support such studies. PMID:25761715

  13. BioWarehouse: a bioinformatics database warehouse toolkit

    PubMed Central

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David WJ; Tenenbaum, Jessica D; Karp, Peter D

    2006-01-01

    Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and Java languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the database integration problem for

  14. Xtalk: a path-based approach for identifying crosstalk between signaling pathways

    PubMed Central

    Tegge, Allison N.; Sharp, Nicholas; Murali, T. M.

    2016-01-01

    Motivation: Cells communicate with their environment via signal transduction pathways. On occasion, the activation of one pathway can produce an effect downstream of another pathway, a phenomenon known as crosstalk. Existing computational methods to discover such pathway pairs rely on simple overlap statistics. Results: We present Xtalk, a path-based approach for identifying pairs of pathways that may crosstalk. Xtalk computes the statistical significance of the average length of multiple short paths that connect receptors in one pathway to the transcription factors in another. By design, Xtalk reports the precise interactions and mechanisms that support the identified crosstalk. We applied Xtalk to signaling pathways in the KEGG and NCI-PID databases. We manually curated a gold standard set of 132 crosstalking pathway pairs and a set of 140 pairs that did not crosstalk, for which Xtalk achieved an area under the receiver operator characteristic curve of 0.65, a 12% improvement over the closest competing approach. The area under the receiver operator characteristic curve varied with the pathway, suggesting that crosstalk should be evaluated on a pathway-by-pathway level. We also analyzed an extended set of 658 pathway pairs in KEGG and a set of more than 7000 pathway pairs in NCI-PID. For the top-ranking pairs, we found substantial support in the literature (81% for KEGG and 78% for NCI-PID). We provide examples of networks computed by Xtalk that accurately recovered known mechanisms of crosstalk. Availability and implementation: The XTALK software is available at http://bioinformatics.cs.vt.edu/~murali/software. Crosstalk networks are available at http://graphspace.org/graphs?tags=2015-bioinformatics-xtalk. Contact: ategge@vt.edu, murali@cs.vt.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26400040
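    The path-length statistic described above can be sketched in a few lines. This is not the Xtalk implementation: it assumes an unweighted directed interaction network, uses breadth-first search to get the shortest receptor-to-transcription-factor distance for each pair, and averages the k smallest distances as a stand-in for "multiple short paths". The significance test is omitted, and the toy network and all names are hypothetical.

```python
from collections import deque

def bfs_distances(graph, source):
    """Shortest hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def average_short_path_length(graph, receptors, tfs, k=3):
    """Mean of the k shortest receptor-to-TF distances, or None if no TF is reachable."""
    lengths = []
    for receptor in receptors:
        dist = bfs_distances(graph, receptor)
        lengths.extend(dist[tf] for tf in tfs if tf in dist)
    if not lengths:
        return None
    shortest = sorted(lengths)[:k]
    return sum(shortest) / len(shortest)

# Hypothetical toy network: receptors of pathway A, TFs of pathway B.
toy_network = {
    "R1": ["A"],
    "A": ["TF1", "B"],
    "B": ["TF2"],
    "R2": ["B"],
}
print(average_short_path_length(toy_network, ["R1", "R2"], ["TF1", "TF2"], k=2))  # 2.0
```

    To turn the statistic into a significance score, one option would be to recompute it on randomized networks and compare. How the null model is built is a design decision the abstract does not specify.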

  15. iPathCons and iPathDB: an improved insect pathway construction tool and the database

    PubMed Central

    Zhang, Zan; Yin, Chuanlin; Liu, Ying; Jie, Wencai; Lei, Wenjie; Li, Fei

    2014-01-01

    Insects are one of the most successful animal groups on earth. Some insects, such as the silkworm and honeybee, are beneficial to humans, whereas others are notorious pests of crops. At present, the genomes of 38 insects have been sequenced and made publicly available. In addition, the transcriptomes of dozens of insects have been sequenced. As gene data rapidly accumulate, constructing the pathway of molecular interactions becomes increasingly important for entomological research. Here, we developed an improved tool, iPathCons, for knowledge-based construction of pathways from the transcriptomes or the official gene sets of genomes. Considering the high evolutionary diversity in insects, iPathCons uses a voting system for Kyoto Encyclopedia of Genes and Genomes Orthology assignment. Both stand-alone software and a web server of iPathCons are provided. Using iPathCons, we constructed the pathways of molecular interactions of 52 insects, including 37 genome-sequenced and 15 transcriptome-sequenced ones. These pathways are available in the iPathDB, which provides searches, web server, data downloads, etc. This database will be highly useful for the insect research community. Database URL: http://ento.njau.edu.cn/ipath/ PMID:25388589
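    The abstract does not describe the voting scheme itself, so the following is only one plausible reading, given as a minimal sketch: each homology hit from a reference species contributes one vote for its KO, and the query gene receives the KO with the most votes if it wins clearly. The function name, the min_fraction threshold and the tie rule are all assumptions of this sketch.

```python
from collections import Counter

def assign_ko_by_vote(hit_kos, min_fraction=0.5):
    """Pick a KO by majority vote over homology hits.

    `hit_kos` lists the KO carried by each hit (None for hits without a KO).
    Returns the winning KO, or None when there is no vote, a tie for first
    place, or the winner holds less than `min_fraction` of the votes.
    """
    votes = Counter(ko for ko in hit_kos if ko is not None)
    if not votes:
        return None
    ranked = votes.most_common(2)
    top_ko, top_count = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None  # ambiguous: two KOs share first place
    if top_count / sum(votes.values()) < min_fraction:
        return None
    return top_ko

# Hypothetical hits for one query gene against four reference species.
print(assign_ko_by_vote(["K00844", "K00844", "K12407", None]))  # K00844
```

    Returning None on ties keeps ambiguous genes unannotated instead of guessing, which seems a cautious default for taxa that are evolutionarily diverse.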

  16. The use of functional chemical-protein associations to identify multi-pathway renoprotectants.

    PubMed

    Xu, Jia; Meng, Kexin; Zhang, Rui; Yang, He; Liao, Chang; Zhu, Wenliang; Jiao, Jundong

    2014-01-01

    Typically, most nephropathies can be categorized as complex human diseases in which the cumulative effect of multiple minor genes, combined with environmental and lifestyle factors, determines the disease phenotype. Thus, multi-target drugs would be more likely to facilitate comprehensive renoprotection than single-target agents. In this study, functional chemical-protein association analysis was performed to retrieve multi-target drugs of high pathway wideness from the STITCH 3.1 database. Pathway wideness of a drug evaluated the efficiency of regulation of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways in quantity. We identified nine experimentally validated renoprotectants that exerted remarkable impact on KEGG pathways by targeting a limited number of proteins. We selected curcumin as an illustrative compound to display the advantage of multi-pathway drugs on renoprotection. We compared curcumin with hemin, an agonist of heme oxygenase-1 (HO-1), which significantly affects only one KEGG pathway, porphyrin and chlorophyll metabolism (adjusted p = 1.5×10^-5). At the same concentration (10 µM), both curcumin and hemin equivalently mitigated oxidative stress in H2O2-treated glomerular mesangial cells. The benefit of using hemin was derived from its agonistic effect on HO-1, providing relief from oxidative stress. Selective inhibition of HO-1 completely blocked the action of hemin but not that of curcumin, suggesting simultaneous multi-pathway intervention by curcumin. Curcumin also increased cellular autophagy levels, enhancing its protective effect; however, hemin had no effects. Based on the fact that the dysregulation of multiple pathways is implicated in the etiology of complex diseases, we proposed a feasible method for identifying multi-pathway drugs from compounds with validated targets. Our efforts will help identify multi-pathway agents capable of providing comprehensive protection against renal injuries.

  17. NLDB: a database for 3D protein-ligand interactions in enzymatic reactions.

    PubMed

    Murakami, Yoichi; Omori, Satoshi; Kinoshita, Kengo

    2016-12-01

    NLDB (Natural Ligand DataBase; URL: http://nldb.hgc.jp) is a database of automatically collected and predicted 3D protein-ligand interactions for the enzymatic reactions of metabolic pathways registered in KEGG. Structural information about these reactions is important for studying the molecular functions of enzymes; however, a large number of the 3D interactions are still unknown. Therefore, in order to complement such missing information, we predicted protein-ligand complex structures, and constructed a database of the 3D interactions in reactions. NLDB provides three different types of data resources: the natural complexes are experimentally determined protein-ligand complex structures in PDB, the analog complexes are predicted based on known protein structures in a complex with a similar ligand, and the ab initio complexes are predicted by docking simulations. In addition, NLDB shows the known polymorphisms found in the human genome on protein structures. The database has a flexible search function based on various types of keywords, and an enrichment analysis function based on a set of KEGG compound IDs. NLDB will be a valuable resource for experimental biologists studying protein-ligand interactions in specific reactions, and for theoretical researchers wishing to undertake more precise simulations of interactions.
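    The abstract does not say which statistic the enrichment function uses. A common choice for set-based enrichment is the one-sided hypergeometric (over-representation) test, sketched below under that assumption; the function name and the example counts are hypothetical and not taken from NLDB.

```python
from math import comb

def hypergeom_enrichment_p(total, annotated, selected, overlap):
    """P(X >= overlap) when drawing `selected` items without replacement
    from `total` items, of which `annotated` belong to the category."""
    upper = min(annotated, selected)
    favourable = sum(
        comb(annotated, i) * comb(total - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(total, selected)

# Hypothetical counts: 10 compounds in the background, 3 of them in a pathway,
# and a query set of 3 compounds that all fall in that pathway.
p = hypergeom_enrichment_p(total=10, annotated=3, selected=3, overlap=3)
print(p)  # 1/120, about 0.0083
```

    When many pathways are tested against the same compound set, the resulting p-values would normally be corrected for multiple testing.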

  18. Transcriptome and Metabolite analysis reveal candidate genes of the cardiac glycoside biosynthetic pathway from Calotropis procera

    PubMed Central

    Pandey, Akansha; Swarnkar, Vishakha; Pandey, Tushar; Srivastava, Piush; Kanojiya, Sanjeev; Mishra, Dipak Kumar; Tripathi, Vineeta

    2016-01-01

    Calotropis procera is a medicinal plant of immense importance due to its pharmaceutically active components, especially cardiac glycosides (CG). As genomic resources for this plant are limited, the genes involved in the CG biosynthetic pathway remain largely unknown to date. Our study on stage- and tissue-specific metabolite accumulation showed that CGs were maximally accumulated in stems of 3-month-old seedlings. De novo transcriptome sequencing of the same tissue was done using the high-throughput Illumina HiSeq platform, generating 44074 unigenes with a mean length of 1785 base pairs. Around 66.6% of unigenes were annotated by using various public databases and 5324 unigenes showed a significant match in the KEGG database, involved in 133 different pathways of plant metabolism. Further KEGG analysis resulted in identification of 336 unigenes involved in cardenolide biosynthesis. Tissue-specific expression analysis of 30 putative transcripts involved in terpenoid, steroid and cardenolide pathways showed a positive correlation between metabolite and transcript accumulation. Wound stress elevated CG levels as well as the levels of the putative transcripts involved in its biosynthetic pathways. This result further validated the involvement of the identified transcripts in CG biosynthesis. The identified transcripts will lay a substantial foundation for further research on metabolic engineering and regulation of cardiac glycoside biosynthesis pathway genes. PMID:27703261

  19. RNApathwaysDB—a database of RNA maturation and decay pathways

    PubMed Central

    Milanowska, Kaja; Mikolajczak, Katarzyna; Lukasik, Anna; Skorupski, Marcin; Balcer, Zuzanna; Machnicka, Magdalena A.; Nowacka, Martyna; Rother, Kristian M.; Bujnicki, Janusz M.

    2013-01-01

    Many RNA molecules undergo complex maturation, involving e.g. excision from primary transcripts, removal of introns, post-transcriptional modification and polyadenylation. The level of mature, functional RNAs in the cell is controlled not only by the synthesis and maturation but also by degradation, which proceeds via many different routes. The systematization of data about RNA metabolic pathways and enzymes taking part in RNA maturation and degradation is essential for the full understanding of these processes. RNApathwaysDB, available online at http://iimcb.genesilico.pl/rnapathwaysdb, is an online resource about maturation and decay pathways involving RNA as the substrate. The current release presents information about reactions and enzymes that take part in the maturation and degradation of tRNA, rRNA and mRNA, and describes pathways in three model organisms: Escherichia coli, Saccharomyces cerevisiae and Homo sapiens. RNApathwaysDB can be queried with keywords, and sequences of protein enzymes involved in RNA processing can be searched with BLAST. Options for data presentation include pathway graphs and tables with enzymes and literature data. Structures of macromolecular complexes involving RNA and proteins that act on it are presented as ‘potato models’ using DrawBioPath—a new javascript tool. PMID:23155061

  20. Annotation error in public databases: misannotation of molecular function in enzyme superfamilies.

    PubMed

    Schnoes, Alexandra M; Brown, Shoshana D; Dodevski, Igor; Babbitt, Patricia C

    2009-12-01

    Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%-63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with "overprediction" of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.

  1. TrypanoCyc: a community-led biochemical pathways database for Trypanosoma brucei

    PubMed Central

    Shameer, Sanu; Logan-Klumpler, Flora J.; Vinson, Florence; Cottret, Ludovic; Merlet, Benjamin; Achcar, Fiona; Boshart, Michael; Berriman, Matthew; Breitling, Rainer; Bringaud, Frédéric; Bütikofer, Peter; Cattanach, Amy M.; Bannerman-Chukualim, Bridget; Creek, Darren J.; Crouch, Kathryn; de Koning, Harry P.; Denise, Hubert; Ebikeme, Charles; Fairlamb, Alan H.; Ferguson, Michael A. J.; Ginger, Michael L.; Hertz-Fowler, Christiane; Kerkhoven, Eduard J.; Mäser, Pascal; Michels, Paul A. M.; Nayak, Archana; Nes, David W.; Nolan, Derek P.; Olsen, Christian; Silva-Franco, Fatima; Smith, Terry K.; Taylor, Martin C.; Tielens, Aloysius G. M.; Urbaniak, Michael D.; van Hellemond, Jaap J.; Vincent, Isabel M.; Wilkinson, Shane R.; Wyllie, Susan; Opperdoes, Fred R.; Barrett, Michael P.; Jourdan, Fabien

    2015-01-01

    The metabolic network of a cell represents the catabolic and anabolic reactions that interconvert small molecules (metabolites) through the activity of enzymes, transporters and non-catalyzed chemical reactions. Our understanding of individual metabolic networks is increasing as we learn more about the enzymes that are active in particular cells under particular conditions and as technologies advance to allow detailed measurements of the cellular metabolome. Metabolic network databases are of increasing importance in allowing us to contextualise data sets emerging from transcriptomic, proteomic and metabolomic experiments. Here we present a dynamic database, TrypanoCyc (http://www.metexplore.fr/trypanocyc/), which describes the generic and condition-specific metabolic network of Trypanosoma brucei, a parasitic protozoan responsible for human and animal African trypanosomiasis. In addition to enabling navigation through the BioCyc-based TrypanoCyc interface, we have also implemented a network-based representation of the information through MetExplore, yielding a novel environment in which to visualise the metabolism of this important parasite. PMID:25300491

  2. FMM: a web server for metabolic pathway reconstruction and comparative analysis.

    PubMed

    Chou, Chih-Hung; Chang, Wen-Chi; Chiu, Chih-Min; Huang, Chih-Chang; Huang, Hsien-Da

    2009-07-01

    Synthetic Biology, a multidisciplinary field, is growing rapidly. Improving the understanding of biological systems through mimicry and producing bio-orthogonal systems with new functions are two complementary pursuits in this field. A web server called FMM (From Metabolite to Metabolite) was developed for this purpose. FMM can reconstruct metabolic pathways from one metabolite to another metabolite among different species, based mainly on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and other integrated biological databases. A novel presentation for connecting different KEGG maps is provided. Both local and global graphical views of the metabolic pathways are designed. FMM has many applications in Synthetic Biology and Metabolic Engineering. For example, the reconstruction of metabolic pathways to produce valuable metabolites or secondary metabolites in bacteria or yeast is a promising strategy for drug production. FMM provides a highly effective way to elucidate the genes from which species should be cloned into those microorganisms based on FMM pathway comparative analysis. Consequently, FMM is an effective tool for applications in synthetic biology to produce both drugs and biofuels. This novel and innovative resource is now freely available at http://FMM.mbc.nctu.edu.tw/.
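
    The idea of reconstructing a route from one metabolite to another can be illustrated with a breadth-first search over a reaction graph. The sketch below is not FMM's implementation, which the abstract does not describe; the function name, the (substrate, product, reaction) tuple layout and the single-letter metabolite labels are all hypothetical stand-ins for real KEGG compound and reaction identifiers.

```python
from collections import deque


def shortest_route(reactions, source, target):
    """Return the shortest chain of (substrate, reaction_id, product) steps
    leading from source to target, or None if no route exists.

    reactions is a list of directed (substrate, product, reaction_id) tuples.
    This layout is an assumption made for illustration only.
    """
    # Build an adjacency list: substrate -> [(product, reaction_id), ...]
    graph = {}
    for substrate, product, reaction_id in reactions:
        graph.setdefault(substrate, []).append((product, reaction_id))

    # Standard breadth-first search, remembering how each node was reached.
    came_from = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for product, reaction_id in graph.get(node, []):
            if product not in came_from:
                came_from[product] = (node, reaction_id)
                queue.append(product)

    if target not in came_from:
        return None

    # Walk back from the target to the source to recover the route.
    route = []
    node = target
    while came_from[node] is not None:
        previous, reaction_id = came_from[node]
        route.append((previous, reaction_id, node))
        node = previous
    route.reverse()
    return route


# Toy network with made-up labels: two routes from A to C of different length.
toy_reactions = [
    ("A", "B", "r1"),
    ("B", "C", "r2"),
    ("A", "D", "r3"),
    ("D", "E", "r4"),
    ("E", "C", "r5"),
]
print(shortest_route(toy_reactions, "A", "C"))
```

    Breadth-first search returns a route with the fewest reaction steps. A real tool would also need to handle reversible reactions, cofactors and species-specific enzyme availability, none of which are modelled here.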

  3. Pathway Network Analyses for Autism Reveal Multisystem Involvement, Major Overlaps with Other Diseases and Convergence upon MAPK and Calcium Signaling

    PubMed Central

    Wen, Ya; Alshikho, Mohamad J.; Herbert, Martha R.

    2016-01-01

    We used established databases in standard ways to systematically characterize gene ontologies, pathways and functional linkages in the large set of genes now associated with autism spectrum disorders (ASDs). These conditions are particularly challenging: they lack clear pathognomonic biological markers and involve great heterogeneity across multiple levels (genes, systemic biological and brain characteristics, and nuances of behavioral manifestations), and yet everyone with this diagnosis meets the same defining behavioral criteria. Using the human gene list from the Simons Foundation Autism Research Initiative (SFARI) we performed gene set enrichment analysis with the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Database, and then derived a pathway network from pathway-pathway functional interactions again in reference to KEGG. Through identifying the GO (Gene Ontology) groups in which SFARI genes were enriched, mapping the coherence between pathways and GO groups, and ranking the relative strengths of representation of pathway network components, we 1) identified 10 disease-associated and 30 function-associated pathways, 2) revealed calcium signaling pathway and neuroactive ligand-receptor interaction as the most enriched, statistically significant pathways from the enrichment analysis, 3) showed calcium signaling pathways and MAPK signaling pathway to be interactive hubs with other pathways and also to be involved with pervasively present biological processes, 4) found convergent indications that the process “calcium-PKC (protein kinase C)-Ras-Raf-MAPK/ERK” is likely a major contributor to ASD pathophysiology, and 5) noted that perturbations associated with KEGG’s category of environmental information processing were common. These findings support the idea that ASD-associated genes may contribute not only to core features of ASD themselves but also to vulnerability to other chronic and systemic problems potentially including cancer, metabolic

  4. WholePathwayScope: a comprehensive pathway-based analysis tool for high-throughput data

    PubMed Central

    Yi, Ming; Horton, Jay D; Cohen, Jonathan C; Hobbs, Helen H; Stephens, Robert M

    2006-01-01

    Background Analysis of High Throughput (HTP) Data such as microarray and proteomics data has provided a powerful methodology to study patterns of gene regulation at genome scale. A major unresolved problem in the post-genomic era is to assemble the large amounts of data generated into a meaningful biological context. We have developed a comprehensive software tool, WholePathwayScope (WPS), for deriving biological insights from analysis of HTP data. Result WPS extracts gene lists with shared biological themes through color cue templates. WPS statistically evaluates global functional category enrichment of gene lists and pathway-level pattern enrichment of data. WPS incorporates well-known biological pathways from KEGG (Kyoto Encyclopedia of Genes and Genomes) and Biocarta, GO (Gene Ontology) terms as well as user-defined pathways or relevant gene clusters or groups, and explores gene-term relationships within the derived gene-term association networks (GTANs). WPS simultaneously compares multiple datasets within biological contexts either as pathways or as association networks. WPS also integrates Genetic Association Database and Partial MedGene Database for disease-association information. We have used this program to analyze and compare microarray and proteomics datasets derived from a variety of biological systems. Application examples demonstrated the capacity of WPS to significantly facilitate the analysis of HTP data for integrative discovery. Conclusion This tool represents a pathway-based platform for discovery integration to maximize analysis power. The tool is freely available at . PMID:16423281

  5. The Candidate Cancer Gene Database: a database of cancer driver genes from forward genetic screens in mice.

    PubMed

    Abbott, Kenneth L; Nyre, Erik T; Abrahante, Juan; Ho, Yen-Yi; Isaksson Vogel, Rachel; Starr, Timothy K

    2015-01-01

    Identification of cancer driver gene mutations is crucial for advancing cancer therapeutics. Due to the overwhelming number of passenger mutations in the human tumor genome, it is difficult to pinpoint causative driver genes. Using transposon mutagenesis in mice many laboratories have conducted forward genetic screens and identified thousands of candidate driver genes that are highly relevant to human cancer. Unfortunately, this information is difficult to access and utilize because it is scattered across multiple publications using different mouse genome builds and strength metrics. To improve access to these findings and facilitate meta-analyses, we developed the Candidate Cancer Gene Database (CCGD, http://ccgd-starrlab.oit.umn.edu/). The CCGD is a manually curated database containing a unified description of all identified candidate driver genes and the genomic location of transposon common insertion sites (CISs) from all currently published transposon-based screens. To demonstrate relevance to human cancer, we performed a modified gene set enrichment analysis using KEGG pathways and show that human cancer pathways are highly enriched in the database. We also used hierarchical clustering to identify pathways enriched in blood cancers compared to solid cancers. The CCGD is a novel resource available to scientists interested in the identification of genetic drivers of cancer.

  6. Pathway — Using a State-of-the-Art Digital Video Database for Research and Development in Teacher Education

    NASA Astrophysics Data System (ADS)

    Adrian, Brian; Zollman, Dean; Stevens, Scott

    2006-02-01

    To demonstrate how state-of-the-art video databases can address issues related to the lack of preparation of many physics teachers, we have created the prototype Physics Teaching Web Advisory (Pathway). Pathway's Synthetic Interviews and related video materials are beginning to provide pre-service and out-of-field in-service teachers with much-needed professional development and well-prepared teachers with new perspectives on teaching physics. The prototype was limited to a demonstration of the systems. Now, with an additional grant we will extend the system and conduct research and evaluation on its effectiveness. This project will provide virtual expert help on issues of pedagogy and content. In particular, the system will convey, by example and explanation, contemporary ideas about the teaching of physics and applications of physics education research. The research effort will focus on the value of contemporary technology to address the continuing education of teachers who are teaching in a field in which they have not been trained.

  7. Endocrine Disruptors: Data-based survey of in vivo tests, predictive models and the Adverse Outcome Pathway.

    PubMed

    Benigni, Romualdo; Battistelli, Chiara Laura; Bossa, Cecilia; Giuliani, Alessandro; Tcheremenskaia, Olga

    2017-02-20

    The protection from endocrine disruptors is a high regulatory priority. Key issues are the characterization of in vivo assays, and the identification of reference chemicals to validate alternative methods. In this exploration, publicly available databases for in vivo assays for endocrine disruption were collected and compared: Rodent Uterotrophic, Rodent Repeated Dose 28-day Oral Toxicity, 21-Day Fish, and Daphnia magna reproduction assays. Only the Uterotrophic and 21-Day Fish assays results correlated with each other. The in vivo assays data were viewed in relation to the Adverse Outcome Pathway, using as a probe 18 ToxCast in vitro assays for the ER pathway. These are the same data at the basis of the EPA agonist ToxERscore model, whose good predictivity was confirmed. The multivariate comparison of the in vitro/in vivo assays suggests that the interaction with receptors is a major determinant of in vivo results, and is the critical basis for building predictive computational models. In agreement with the above, this work also shows that it is possible to build predictive models for the Uterotrophic and 21-Day Fish assays using a limited selection of Toxcast assays.

  8. SuperToxic: a comprehensive database of toxic compounds

    PubMed Central

    Schmidt, Ulrike; Struck, Swantje; Gruening, Bjoern; Hossbach, Julia; Jaeger, Ines S.; Parol, Roza; Lindequist, Ulrike; Teuscher, Eberhard; Preissner, Robert

    2009-01-01

    Within our everyday life, we are confronted with a variety of toxic substances of natural or artificial origin. Toxins are already used, e.g. in medicine, but there is still an increasing number of toxic compounds, representing a tremendous potential to extract new substances. As predictive toxicology gains in importance, the careful and extensive investigation of known toxins is the basis to assess the properties of unknown substances. In order to achieve this aim, we have collected toxic compounds from literature and web sources in the database SuperToxic. The current version of this database compiles about 60 000 compounds and their structures. These molecules are classified according to their toxicity, based on more than 2 million measurements. The SuperToxic database provides a variety of search options like name, CASRN, molecular weight and measured values of toxicity. With the aid of implemented similarity searches, information about possible biological interactions can be gained. Furthermore, connections to the Protein Data Bank, UniProt and the KEGG database are available, to allow the identification of targets and of the pathways in which the searched compounds are involved. This database is available online at: http://bioinformatics.charite.de/supertoxic. PMID:19004875

  9. PathwAX: a web server for network crosstalk based pathway annotation

    PubMed Central

    Ogris, Christoph; Helleday, Thomas; Sonnhammer, Erik L.L.

    2016-01-01

    Pathway annotation of gene lists is often used to functionally analyse biomolecular data such as gene expression in order to establish which processes are activated in a given experiment. Databases such as KEGG or GO represent collections of how genes are known to be organized in pathways, and the challenge is to compare a given gene list with the known pathways such that all true relations are identified. Most tools apply statistical measures to the gene overlap between the gene list and pathway. It is however problematic to avoid false negatives and false positives when only using the gene overlap. The pathwAX web server (http://pathwAX.sbc.su.se/) applies a different approach which is based on network crosstalk. It uses the comprehensive network FunCoup to analyse network crosstalk between a query gene list and KEGG pathways. PathwAX runs the BinoX algorithm, which employs Monte-Carlo sampling of randomized networks and estimates a binomial distribution, for estimating the statistical significance of the crosstalk. This results in substantially higher accuracy than gene overlap methods. The system was optimized for speed and allows interactive web usage. We illustrate the usage and output of pathwAX. PMID:27151197
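
    For contrast with the crosstalk approach, the gene-overlap statistic that the abstract attributes to most tools can be sketched as a one-sided hypergeometric test. This is the baseline the abstract compares against, not the BinoX algorithm used by PathwAX; the function names and the toy gene labels below are hypothetical.

```python
from math import comb


def overlap_p_value(universe_size, pathway_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from a universe in which pathway_size genes belong to the pathway."""
    largest_possible = min(pathway_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, list_size - k)
        for k in range(overlap, largest_possible + 1)
    )
    return tail / total


def overlap_enrichment(query_genes, pathway_genes, universe):
    """Return (overlap_count, p_value) for a query list against one pathway.
    Genes outside the universe are ignored (an illustrative choice)."""
    universe = set(universe)
    query = set(query_genes) & universe
    pathway = set(pathway_genes) & universe
    overlap = len(query & pathway)
    p_value = overlap_p_value(len(universe), len(pathway), len(query), overlap)
    return overlap, p_value


# Toy example with made-up gene labels g0..g9.
universe = [f"g{i}" for i in range(10)]
pathway = ["g0", "g1", "g2", "g3"]
query = ["g0", "g1", "g7"]
print(overlap_enrichment(query, pathway, universe))
```

    Because this test only counts shared genes, a query list whose genes are tightly linked to a pathway in the network but do not belong to it contributes nothing, which is the limitation the abstract raises.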

  10. dEMBF: A Comprehensive Database of Enzymes of Microalgal Biofuel Feedstock

    PubMed Central

    Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta

    2016-01-01

    Microalgae have attracted wide attention as one of the most versatile renewable feedstocks for production of biofuel. To develop genetically engineered high lipid yielding algal strains, a thorough understanding of the lipid biosynthetic pathway and the underpinning enzymes is essential. In this work, we have systematically mined the genomes of fifteen diverse algal species belonging to Chlorophyta, Heterokontophyta, Rhodophyta, and Haptophyta, to identify and annotate the putative enzymes of lipid metabolic pathway. Consequently, we have also developed a database, dEMBF (Database of Enzymes of Microalgal Biofuel Feedstock), which catalogues the complete list of identified enzymes along with their computed annotation details including length, hydrophobicity, amino acid composition, subcellular location, gene ontology, KEGG pathway, orthologous group, Pfam domain, intron-exon organization, transmembrane topology, and secondary/tertiary structural data. Furthermore, to facilitate functional and evolutionary study of these enzymes, a collection of built-in applications for BLAST search, motif identification, sequence and phylogenetic analysis have been seamlessly integrated into the database. dEMBF is the first database that brings together all enzymes responsible for lipid synthesis from available algal genomes, and provides an integrative platform for enzyme inquiry and analysis. This database will be extremely useful for algal biofuel research. It can be accessed at http://bbprof.immt.res.in/embf. PMID:26727469

  11. dEMBF: A Comprehensive Database of Enzymes of Microalgal Biofuel Feedstock.

    PubMed

    Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta

    2016-01-01

    Microalgae have attracted wide attention as one of the most versatile renewable feedstocks for production of biofuel. To develop genetically engineered high lipid yielding algal strains, a thorough understanding of the lipid biosynthetic pathway and the underpinning enzymes is essential. In this work, we have systematically mined the genomes of fifteen diverse algal species belonging to Chlorophyta, Heterokontophyta, Rhodophyta, and Haptophyta, to identify and annotate the putative enzymes of lipid metabolic pathway. Consequently, we have also developed a database, dEMBF (Database of Enzymes of Microalgal Biofuel Feedstock), which catalogues the complete list of identified enzymes along with their computed annotation details including length, hydrophobicity, amino acid composition, subcellular location, gene ontology, KEGG pathway, orthologous group, Pfam domain, intron-exon organization, transmembrane topology, and secondary/tertiary structural data. Furthermore, to facilitate functional and evolutionary study of these enzymes, a collection of built-in applications for BLAST search, motif identification, sequence and phylogenetic analysis have been seamlessly integrated into the database. dEMBF is the first database that brings together all enzymes responsible for lipid synthesis from available algal genomes, and provides an integrative platform for enzyme inquiry and analysis. This database will be extremely useful for algal biofuel research. It can be accessed at http://bbprof.immt.res.in/embf.

  12. ProOpDB: Prokaryotic Operon DataBase.

    PubMed

    Taboada, Blanca; Ciria, Ricardo; Martinez-Guerrero, Cristian E; Merino, Enrique

    2012-01-01

    The Prokaryotic Operon DataBase (ProOpDB, http://operons.ibt.unam.mx/OperonPredictor) constitutes one of the most precise and complete repositories of operon predictions now available. Using our novel and highly accurate operon identification algorithm, we have predicted the operon structures of more than 1200 prokaryotic genomes. ProOpDB offers diverse alternatives by which a set of operon predictions can be retrieved including: (i) organism name, (ii) metabolic pathways, as defined by the KEGG database, (iii) gene orthology, as defined by the COG database, (iv) conserved protein domains, as defined by the Pfam database, (v) reference gene and (vi) reference operon, among others. In order to limit the operon output to non-redundant organisms, ProOpDB offers an efficient method to select the most representative organisms based on a precompiled phylogenetic distances matrix. In addition, the ProOpDB operon predictions are used directly as the input data of our Gene Context Tool to visualize their genomic context and retrieve the sequence of their corresponding 5' regulatory regions, as well as the nucleotide or amino acid sequences of their genes.

  13. Pathway modeling of microarray data: A case study of pathway activity changes in the testis following in utero exposure to dibutyl phthalate (DBP)

    SciTech Connect

    Ovacik, Meric A.; Sen, Banalata; Euling, Susan Y.; Gaido, Kevin W.; Ierapetritou, Marianthi G.; Androulakis, Ioannis P.

    2013-09-15

    Pathway activity level analysis, the approach pursued in this study, focuses on all genes that are known to be members of metabolic and signaling pathways as defined by the KEGG database. The pathway activity level analysis entails singular value decomposition (SVD) of the expression data of the genes constituting a given pathway. We explore an extension of the pathway activity methodology for application to time-course microarray data. We show that pathway analysis enhances our ability to detect biologically relevant changes in pathway activity using synthetic data. As a case study, we apply the pathway activity level formulation coupled with significance analysis to microarray data from two different rat testes exposed in utero to Dibutyl Phthalate (DBP). In utero DBP exposure in the rat results in developmental toxicity of a number of male reproductive organs, including the testes. One well-characterized mode of action for DBP and the male reproductive developmental effects is the repression of expression of genes involved in cholesterol transport, steroid biosynthesis and testosterone synthesis, which leads to decreased fetal testicular testosterone. Previous analyses of DBP testes microarray data focused on either individual gene expression changes or changes in the expression of specific genes that are hypothesized, or known, to be important in testicular development and testosterone synthesis. However, a pathway analysis may reveal whether there are additional affected pathways that could inform additional modes of action linked to DBP developmental toxicity. We show that pathway activity analysis may be considered for a more comprehensive analysis of microarray data.
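
    The SVD step mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes a genes-by-samples expression matrix for a single pathway, standardizes each gene, and takes the first right singular vector as the activity level per sample. Power iteration is swapped in for a library SVD only to keep the example dependency-free; the function names and toy numbers are hypothetical.

```python
import math


def standardize_rows(matrix):
    """Center each gene (row) to mean 0 and scale it to unit variance."""
    result = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered = [x - mean for x in row]
        sd = math.sqrt(sum(c * c for c in centered) / len(row))
        result.append([c / sd if sd > 0 else 0.0 for c in centered])
    return result


def pathway_activity(expression, iterations=200):
    """Return (activity, explained) for one pathway.

    activity is the first right singular vector of the standardized
    genes-by-samples matrix (one value per sample; its overall sign is
    arbitrary). explained is the fraction of total variance carried by
    that leading component.
    """
    z = standardize_rows(expression)
    n = len(z[0])
    # Gram matrix Z^T Z (samples x samples); its top eigenvector is the
    # first right singular vector of Z.
    gram = [[sum(row[i] * row[j] for row in z) for j in range(n)] for i in range(n)]

    # Arbitrary non-constant start vector. A constant vector would fail
    # because every centered row sums to zero.
    v = [math.sin(i + 1.0) for i in range(n)]
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            return [0.0] * n, 0.0
        v = [x / norm for x in w]

    # Rayleigh quotient gives the squared top singular value; the trace of
    # the Gram matrix is the sum of all squared singular values.
    top = sum(v[i] * sum(gram[i][j] * v[j] for j in range(n)) for i in range(n))
    total = sum(gram[i][i] for i in range(n))
    return v, (top / total if total > 0 else 0.0)


# Toy pathway: 3 genes x 4 samples (2 control, 2 treated), made-up values.
toy = [
    [1.0, 1.0, 3.0, 3.0],
    [2.0, 2.0, 6.0, 6.0],
    [5.0, 5.0, 1.0, 1.0],
]
activity, explained = pathway_activity(toy)
print(activity, explained)
```

    In the toy data all three genes follow the same two-group pattern (one of them inverted), so a single component captures the whole signal and the activity vector separates the first two samples from the last two.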

  14. A Web Tool for Generating High Quality Machine-readable Biological Pathways.

    PubMed

    Ramirez-Gaona, Miguel; Marcu, Ana; Pon, Allison; Grant, Jason; Wu, Anthony; Wishart, David S

    2017-02-08

    PathWhiz is a web server built to facilitate the creation of colorful, interactive, visually pleasing pathway diagrams that are rich in biological information. The pathways generated by this online application are machine-readable and fully compatible with essentially all web-browsers and computer operating systems. It uses a specially developed, web-enabled pathway drawing interface that permits the selection and placement of different combinations of pre-drawn biological or biochemical entities to depict reactions, interactions, transport processes and binding events. This palette of entities consists of chemical compounds, proteins, nucleic acids, cellular membranes, subcellular structures, tissues, and organs. All of the visual elements in it can be interactively adjusted and customized. Furthermore, because this tool is a web server, all pathways and pathway elements are publicly accessible. This kind of pathway "crowd sourcing" means that PathWhiz already contains a large and rapidly growing collection of previously drawn pathways and pathway elements. Here we describe a protocol for the quick and easy creation of new pathways and the alteration of existing pathways. To further facilitate pathway editing and creation, the tool contains replication and propagation functions. The replication function allows existing pathways to be used as templates to create or edit new pathways. The propagation function allows one to take an existing pathway and automatically propagate it across different species. Pathways created with this tool can be "re-styled" into different formats (KEGG-like or text-book like), colored with different backgrounds, exported to BioPAX, SBGN-ML, SBML, or PWML data exchange formats, and downloaded as PNG or SVG images. The pathways can easily be incorporated into online databases, integrated into presentations, posters or publications, or used exclusively for online visualization and exploration. This protocol has been successfully applied to

  15. ESEA: Discovering the Dysregulated Pathways based on Edge Set Enrichment Analysis

    PubMed Central

    Han, Junwei; Shi, Xinrui; Zhang, Yunpeng; Xu, Yanjun; Jiang, Ying; Zhang, Chunlong; Feng, Li; Yang, Haixiu; Shang, Desi; Sun, Zeguo; Su, Fei; Li, Chunquan; Li, Xia

    2015-01-01

    Pathway analyses are playing an increasingly important role in understanding biological mechanism, cellular function and disease states. Current pathway-identification methods generally focus on only the changes of gene expression levels; however, the biological relationships among genes are also the fundamental components of pathways, and the dysregulated relationships may also alter the pathway activities. We propose a powerful computational method, Edge Set Enrichment Analysis (ESEA), for the identification of dysregulated pathways. This provides a novel way of pathway analysis by investigating the changes of biological relationships of pathways in the context of gene expression data. Simulation studies illustrate the power and performance of ESEA under various simulated conditions. Using real datasets from p53 mutation, Type 2 diabetes and lung cancer, we validate effectiveness of ESEA in identifying dysregulated pathways. We further compare our results with five other pathway enrichment analysis methods. With these analyses, we show that ESEA is able to help uncover dysregulated biological pathways underlying complex traits and human diseases via specific use of the dysregulated biological relationships. We develop a freely available R-based tool of ESEA. Currently, ESEA can support pathway analysis of the seven public databases (KEGG; Reactome; Biocarta; NCI; SPIKE; HumanCyc; Panther). PMID:26267116

  16. Identification of ecdysteroid receptor-mediated signaling pathways in the hepatopancreas of the red swamp crayfish, Procambarus clarkii.

    PubMed

    Zhu, Baojian; Tang, Lin; Yu, Yingying; Yu, Huimin; Wang, Lei; Qian, Cen; Wei, Guoqing; Liu, Chaoliang

    2017-01-06

    The hepatopancreas of crustaceans plays an important role in lipid and carbohydrate metabolism, digestion of food, and biogenesis. In this study, the hepatopancreas transcriptome from the red crayfish Procambarus clarkii was characterized for the first time using high-throughput sequencing, producing approximately 41.4 million reads. After de novo assembly, 57,363 unigenes with an average length of 725 bp were identified. Gene Ontology analysis categorized 22,580 as being involved in biological processes, among which the metabolic process and cellular process groups were the most highly enriched. A total of 8034 unigenes were assigned to 223 metabolic pathways following mapping against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Ecdysteroid receptor (EcR)-mediated signaling pathways were investigated using digital gene expression (DGE) analysis following RNA interference targeting the EcR. A total of 529 differentially expressed genes (DEGs) were identified, including 322 downregulated and 207 upregulated unigenes. Of these, 445 (84.12%) were annotated successfully by alignment with known sequences, many of which were related to catalytic activity and binding functional categories. Using KEGG enrichment analysis, 183 DEGs were clustered into 78 pathways, and six significantly enriched pathways were predicted. The expression patterns of candidate genes identified by real-time PCR were consistent with the DGE results.

  17. Pathway Analysis for Genome-Wide Association Study of Lung Cancer in Han Chinese Population

    PubMed Central

    Wu, Chen; Jin, Guangfu; Dai, Juncheng; Wang, Cheng; Hu, Lingmin; Gou, Jianwei; Qian, Chen; Bai, Jianling; Wu, Tangchun; Hu, Zhibin; Lin, Dongxin; Shen, Hongbing; Chen, Feng

    2013-01-01

    Genome-wide association studies (GWAS) have identified a number of genetic variants associated with lung cancer risk. However, these loci explain only a small fraction of lung cancer heritability, and other variants with weak effect may be lost in the GWAS approach due to the stringent significance level after multiple comparison correction. In this study, in order to identify important pathways involved in lung carcinogenesis, we performed a two-stage pathway analysis in GWAS of lung cancer in Han Chinese using the gene set enrichment analysis (GSEA) method. Predefined pathways from the BioCarta and KEGG databases were systematically evaluated in the Nanjing study (Discovery stage: 1,473 cases and 1,962 controls) and the suggestive pathways were further validated in the Beijing study (Replication stage: 858 cases and 1,115 controls). We found that four pathways (achPathway, metPathway, At1rPathway and rac1Pathway) were consistently significant in both studies and the P values for the combined dataset were 0.012, 0.010, 0.022 and 0.005 respectively. These results were stable after sensitivity analysis based on gene definition and gene overlaps between pathways. These findings may provide new insights into the etiology of lung cancer. PMID:23469231

  18. Gene-set meta-analysis of lung cancer identifies pathway related to systemic lupus erythematosus

    PubMed Central

    Sohns, Melanie; Friedrichs, Stefanie; Hung, Rayjean J.; Fehringer, Gord; McLaughlin, John; Amos, Christopher I.; Brennan, Paul; Risch, Angela; Brüske, Irene; Caporaso, Neil E.; Landi, Maria Teresa; Christiani, David C.; Wei, Yongyue; Bickeböller, Heike

    2017-01-01

    Introduction Gene-set analysis (GSA) is an approach using the results of single-marker genome-wide association studies when investigating pathways as a whole with respect to the genetic basis of a disease. Methods We performed a meta-analysis of seven GSAs for lung cancer, applying the method META-GSA. Overall, the information taken from 11,365 cases and 22,505 controls from within the TRICL/ILCCO consortia was used to investigate a total of 234 pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Results META-GSA reveals the systemic lupus erythematosus KEGG pathway hsa05322, driven by the gene region 6p21-22, as also implicated in lung cancer (p = 0.0306). This gene region is known to be associated with squamous cell lung carcinoma. The most important genes driving the significance of this pathway belong to the genomic areas HIST1-H4L, -1BN, -2BN, -H2AK, -H4K and C2/C4A/C4B. Within these areas, the markers most significantly associated with LC are rs13194781 (located within HIST12BN) and rs1270942 (located between C2 and C4A). Conclusions We have discovered a pathway currently marked as specific to systemic lupus erythematosus as being significantly implicated in lung cancer. The gene region 6p21-22 in this pathway appears to be more extensively associated with lung cancer than previously assumed. Given wide-stretched linkage disequilibrium to the area APOM/BAG6/MSH5, there is currently simply not enough information or evidence to conclude whether the potential pleiotropy of lung cancer and systemic lupus erythematosus is spurious, biological, or mediated. Further research into this pathway and gene region will be necessary. PMID:28273134

  19. Analysis of Important Gene Ontology Terms and Biological Pathways Related to Pancreatic Cancer.

    PubMed

    Yin, Hang; Wang, ShaoPeng; Zhang, Yu-Hang; Cai, Yu-Dong; Liu, Hailin

    2016-01-01

    Pancreatic cancer is a serious disease that results in more than thirty thousand deaths around the world per year. To design effective treatments, many investigators have devoted themselves to the study of biological processes and mechanisms underlying this disease. However, this understanding is far from complete. In this study, we tried to extract important gene ontology (GO) terms and KEGG pathways for pancreatic cancer by adopting some existing computational methods. Genes that have been validated as related to pancreatic cancer, and genes that have not, were represented by features derived from GO terms and KEGG pathways using the enrichment theory. A popular feature selection method, minimum redundancy maximum relevance, was employed to analyze these features and extract important GO terms and KEGG pathways. An extensive analysis of the obtained GO terms and KEGG pathways was provided to confirm the correlations between them and pancreatic cancer.

  20. Analysis of Important Gene Ontology Terms and Biological Pathways Related to Pancreatic Cancer

    PubMed Central

    Yin, Hang; Wang, ShaoPeng; Zhang, Yu-Hang

    2016-01-01

    Pancreatic cancer is a serious disease that results in more than thirty thousand deaths around the world per year. To design effective treatments, many investigators have devoted themselves to the study of biological processes and mechanisms underlying this disease. However, this understanding is far from complete. In this study, we tried to extract important gene ontology (GO) terms and KEGG pathways for pancreatic cancer by adopting some existing computational methods. Genes that have been validated as related to pancreatic cancer, and genes that have not, were represented by features derived from GO terms and KEGG pathways using the enrichment theory. A popular feature selection method, minimum redundancy maximum relevance, was employed to analyze these features and extract important GO terms and KEGG pathways. An extensive analysis of the obtained GO terms and KEGG pathways was provided to confirm the correlations between them and pancreatic cancer. PMID:27957501

  1. KENeV: A web-application for the automated reconstruction and visualization of the enriched metabolic and signaling super-pathways deriving from genomic experiments

    PubMed Central

    Pilalis, Eleftherios; Koutsandreas, Theodoros; Valavanis, Ioannis; Athanasiadis, Emmanouil; Spyrou, George; Chatziioannou, Aristotelis

    2015-01-01

    Gene expression analysis, using high throughput genomic technologies, has become an indispensable step for the meaningful interpretation of the underlying molecular complexity, which shapes the phenotypic manifestation of the investigated biological mechanism. The modularity of the cellular response to different experimental conditions can be comprehended through the exploitation of molecular pathway databases, which offer a controlled, curated background for statistical enrichment analysis. Existing tools enable pathway analysis, visualization, or pathway merging, but none integrates a fully automated workflow that combines all the above-mentioned modules and is intended for non-programmer users. We introduce an online web application, named KEGG Enriched Network Visualizer (KENeV), which enables a fully automated workflow starting from a list of differentially expressed genes and deriving the enriched KEGG metabolic and signaling pathways, merged into two respective, non-redundant super-networks. The final networks can be downloaded as SBML files, for further analysis, or instantly visualized through an interactive visualization module. In conclusion, KENeV (available online at http://www.grissom.gr/kenev) provides an integrative tool, suitable for users with no programming experience, for the functional interpretation, at both the metabolic and signaling level, of differentially expressed gene subsets deriving from genomic experiments. PMID:26925206

  2. Identification of key pathways and genes in colorectal cancer using bioinformatics analysis.

    PubMed

    Liang, Bin; Li, Chunning; Zhao, Jianying

    2016-10-01

    Colorectal cancer (CRC) is the most common malignant tumor of the digestive system. The aim of this study was to identify gene signatures during CRC and uncover their potential mechanisms. The gene expression profiles of GSE21815 were downloaded from the GEO database. The GSE21815 dataset contained 141 samples, including 132 CRC and 9 normal colon epitheliums. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed, and a protein-protein interaction (PPI) network of the differentially expressed genes (DEGs) was constructed with Cytoscape software. In total, 3500 DEGs were identified in CRC, including 1370 up-regulated genes and 2130 down-regulated genes. GO analysis results showed that up-regulated DEGs were significantly enriched in biological processes (BP), including cell cycle, cell division, and cell proliferation; the down-regulated DEGs were significantly enriched in biological processes, including immune response, intracellular signaling cascade and defense response. KEGG pathway analysis showed the up-regulated DEGs were enriched in cell cycle and DNA replication, while the down-regulated DEGs were enriched in drug metabolism, metabolism of xenobiotics by cytochrome P450, and retinol metabolism pathways. The top 10 hub genes, GNG2, AGT, SAA1, ADCY5, LPAR1, NMU, IL8, CXCL12, GNAI1, and CCR2, were identified from the PPI network, and sub-networks revealed these genes were involved in significant pathways, including G protein-coupled receptors signaling pathway, gastrin-CREB signaling pathway via PKC and MAPK, and extracellular matrix organization. In conclusion, the present study indicated that the identified DEGs and hub genes promote our understanding of the molecular mechanisms underlying the development of CRC, and might be used as molecular targets and diagnostic biomarkers for the treatment of CRC.
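
    The abstract above does not say which statistical test was used for the KEGG enrichment step. A common choice for this kind of analysis is a one-sided hypergeometric test, and the following minimal sketch assumes that choice. The function name and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, pathway_genes, deg_count, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    Models drawing deg_count genes without replacement from a background of
    total_genes, of which pathway_genes are annotated to the pathway.
    """
    upper = min(pathway_genes, deg_count)
    denom = comb(total_genes, deg_count)
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, deg_count - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denom


# Hypothetical toy numbers: 10 background genes, 4 in the pathway,
# 3 DEGs, 2 of which fall in the pathway.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
print(p_value)
```

    In practice the resulting P values would still need multiple-testing correction across all pathways tested, which this sketch leaves out.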

  3. Differentially expressed genes and interacting pathways in bladder cancer revealed by bioinformatic analysis.

    PubMed

    Shen, Yinzhou; Wang, Xuelei; Jin, Yongchao; Lu, Jiasun; Qiu, Guangming; Wen, Xiaofei

    2014-10-01

    The goal of this study was to identify cancer-associated differentially expressed genes (DEGs), analyze their biological functions and investigate the mechanism(s) of cancer occurrence and development, which may provide a theoretical foundation for bladder cancer (BCa) therapy. We downloaded the mRNA expression profiling dataset GSE13507 from the Gene Expression Omnibus database; the dataset includes 165 BCa and 68 control samples. T‑tests were used to identify DEGs. To further study the biological functions of the identified DEGs, we performed a Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. Next, we built a network of potentially interacting pathways to study the synergistic relationships among DEGs. A total of 12,105 genes were identified as DEGs, of which 5,239 were upregulated and 6,866 were downregulated in BCa. The DEGs encoding activator protein 1 (AP-1), nuclear factor of activated T-cells (NFAT) proteins, nuclear factor κ-light-chain-enhancer of activated B cells (NF-κB) and interleukin (IL)-10 were revealed to participate in the significantly enriched immune pathways that were downregulated in BCa. KEGG enrichment analysis revealed 7 significantly upregulated and 47 significantly downregulated pathways enriched among the DEGs. We found a crosstalk interaction among a total of 44 pathways in the network of BCa-affected pathways. In conclusion, our results show that BCa involves dysfunctions in multiple systems. Our study is expected to pave ways for immune and inflammatory research and provide molecular insights for cancer therapy.

  4. HPIminer: A text mining system for building and visualizing human protein interaction networks and pathways.

    PubMed

    Subramani, Suresh; Kalpana, Raja; Monickaraj, Pankaj Moses; Natarajan, Jeyakumar

    2015-04-01

    Knowledge of protein-protein interactions (PPI) and of their related pathways is equally important for understanding the biological functions of the living cell. Such information on human proteins is highly desirable to understand the mechanism of several diseases such as cancer, diabetes, and Alzheimer's disease. Because much of that information is buried in biomedical literature, an automated text mining system for visualizing human PPI and pathways is highly desirable. In this paper, we present HPIminer, a text mining system for visualizing human protein interactions and pathways from biomedical literature. HPIminer extracts human PPI information and PPI pairs from biomedical literature, and visualizes their associated interactions, networks and pathways using two curated databases, HPRD and KEGG. To our knowledge, HPIminer is the first system to build interaction networks from literature as well as curated databases. Further, the new interactions mined only from literature and not reported earlier in databases are highlighted as new. A comparative study with other similar tools shows that the resultant network is more informative and provides additional information on interacting proteins and their associated networks.

  5. ReprOlive: a database with linked data for the olive tree (Olea europaea L.) reproductive transcriptome.

    PubMed

    Carmona, Rosario; Zafra, Adoración; Seoane, Pedro; Castro, Antonio J; Guerrero-Fernández, Darío; Castillo-Castillo, Trinidad; Medina-García, Ana; Cánovas, Francisco M; Aldana-Montes, José F; Navas-Delgado, Ismael; Alché, Juan de Dios; Claros, M Gonzalo

    2015-01-01

    Plant reproductive transcriptomes have been analyzed in different species due to the agronomical and biotechnological importance of plant reproduction. Here we presented an olive tree reproductive transcriptome database with samples from pollen and pistil at different developmental stages, and leaf and root as control vegetative tissues (http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads to 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped, and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts (TTs) were also annotated with the corresponding orthologs in Arabidopsis thaliana from TAIR and RefSeq databases to enable Linked Data integration. It results in a reproductive transcriptome comprising 72,846 contigs with average length of 686 bp, of which 63,965 (87.8%) included at least one functional annotation, and 55,356 (75.9%) had an ortholog. A minimum of 23,568 different TTs was identified and 5,835 of them contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 TTs for further gene expression studies. Partial transcriptomes from pollen, pistil, and vegetative tissues as control were also constructed. ReprOlive provides free access and download capability to these results. Retrieval mechanisms for sequences and transcript annotations are provided. Graphical localization of annotated enzymes into KEGG pathways is also possible. Finally, ReprOlive has included a semantic conceptualisation by means of a Resource Description Framework (RDF) allowing a Linked Data search for extracting the most updated information related to enzymes, interactions, allergens, structures, and reactive oxygen species.

  6. ReprOlive: a database with linked data for the olive tree (Olea europaea L.) reproductive transcriptome

    PubMed Central

    Carmona, Rosario; Zafra, Adoración; Seoane, Pedro; Castro, Antonio J.; Guerrero-Fernández, Darío; Castillo-Castillo, Trinidad; Medina-García, Ana; Cánovas, Francisco M.; Aldana-Montes, José F.; Navas-Delgado, Ismael; Alché, Juan de Dios; Claros, M. Gonzalo

    2015-01-01

    Plant reproductive transcriptomes have been analyzed in different species due to the agronomical and biotechnological importance of plant reproduction. Here we presented an olive tree reproductive transcriptome database with samples from pollen and pistil at different developmental stages, and leaf and root as control vegetative tissues (http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads to 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped, and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts (TTs) were also annotated with the corresponding orthologs in Arabidopsis thaliana from TAIR and RefSeq databases to enable Linked Data integration. It results in a reproductive transcriptome comprising 72,846 contigs with average length of 686 bp, of which 63,965 (87.8%) included at least one functional annotation, and 55,356 (75.9%) had an ortholog. A minimum of 23,568 different TTs was identified and 5,835 of them contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 TTs for further gene expression studies. Partial transcriptomes from pollen, pistil, and vegetative tissues as control were also constructed. ReprOlive provides free access and download capability to these results. Retrieval mechanisms for sequences and transcript annotations are provided. Graphical localization of annotated enzymes into KEGG pathways is also possible. Finally, ReprOlive has included a semantic conceptualisation by means of a Resource Description Framework (RDF) allowing a Linked Data search for extracting the most updated information related to enzymes, interactions, allergens, structures, and reactive oxygen species. PMID:26322066

  7. [A novel biological pathway expansion method based on the knowledge of protein-protein interactions].

    PubMed

    Zhao, Xiaolei; Zuo, Xiaoyu; Qin, Jiheng; Liang, Yan; Zhang, Naizun; Luan, Yizhao; Rao, Shaoqi

    2014-04-01

    Biological pathways have been widely used in gene function studies; however, the current knowledge of biological pathways is per se incomplete and has to be further expanded. Bioinformatics prediction provides a cheap but effective way to expand pathways. Here, we proposed a novel method for biological pathway prediction, integrating prior knowledge of protein-protein interactions and the Gene Ontology (GO) database. First, the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways to which the interacting neighbors of a target gene (at the level of protein-protein interaction) belong were chosen as the candidate pathways. Then, the pathways to which the target gene belongs were determined by testing whether the genes in the candidate pathways were enriched in the GO terms to which the target gene was annotated. The protein-protein interaction data obtained from the Human Protein Reference Database (HPRD) and the Biological General Repository for Interaction Datasets (BioGRID) were each used to predict the pathway attribution(s) of the target gene. The results demonstrated that both the average accuracy (the ratio of the correctly predicted pathways to the total pathways to which all the target genes were annotated) and the relative accuracy (of the genes with at least one annotated pathway successfully predicted, the percentage of genes with all the annotated pathways correctly predicted) for pathway predictions increased with the number of interacting neighbours. When the number of interacting neighbours reached 22, the average accuracy was 96.2% (HPRD) and 96.3% (BioGRID), respectively, and the relative accuracy was 93.3% (HPRD) and 84.1% (BioGRID), respectively. Further validation analysis of 89 genes whose pathway knowledge was updated in a new database release indicated that 50 genes were correctly predicted for at least one updated pathway, and 43 genes were accurately predicted for all the updated pathways, giving an

  8. A systematic analysis of a mi-RNA inter-pathway regulatory motif

    PubMed Central

    2013-01-01

    Background The continuing discovery of new types and functions of small non-coding RNAs suggests the presence of regulatory mechanisms far more complex than the ones currently used to study and design Gene Regulatory Networks. Focusing just on micro RNAs (miRNAs), they have been found to be part of several intra-pathway regulatory motifs. However, inter-pathway regulatory mechanisms have often been neglected and require further investigation. Results In this paper we present the result of a systems biology study aimed at analyzing a high-level inter-pathway regulatory motif called Pathway Protection Loop, not previously described, in which miRNAs seem to play a crucial role in the successful behavior and activation of a pathway. Through the automatic analysis of a large set of publicly available databases, we found statistical evidence that this inter-pathway regulatory motif is very common in several classes of KEGG Homo sapiens pathways and concurs in creating a complex regulatory network involving several pathways connected by this specific motif. The role of this motif also seems to be confirmed by a deeper review of other research activities on selected representative pathways. Conclusions Although previous studies suggested transcriptional regulation mechanisms at the pathway level such as the Pathway Protection Loop, a high-level analysis like the one proposed in this paper is still missing. The understanding of higher-level regulatory motifs could, for instance, lead to new approaches in the identification of therapeutic targets because it could unveil new and “indirect” paths to activate or silence a target pathway. However, a lot of work still needs to be done to better uncover this high-level inter-pathway regulation, including extending the analysis to other small non-coding RNA molecules. PMID:24152805

  9. Analysis of Polygala tenuifolia Transcriptome and Description of Secondary Metabolite Biosynthetic Pathways by Illumina Sequencing

    PubMed Central

    Tian, Hongling; Xu, Xiaoshuang; Zhang, Fusheng; Wang, Yaoqin; Guo, Shuhong; Qin, Xuemei; Du, Guanhua

    2015-01-01

    Radix polygalae, the dried roots of Polygala tenuifolia and P. sibirica, is one of the most well-known traditional Chinese medicinal plants. Radix polygalae contains various saponins, xanthones, and oligosaccharide esters and these compounds are responsible for several pharmacological properties. To provide basic breeding information, enhance molecular biological analysis, and determine secondary metabolite biosynthetic pathways of P. tenuifolia, we applied Illumina sequencing technology and de novo assembly. We also applied this technique to gain an overview of P. tenuifolia transcriptome from samples with different years. Using Illumina sequencing, approximately 67.2% of unique sequences were annotated by basic local alignment search tool similarity searches against public sequence databases. We classified the annotated unigenes by using Nr, Nt, GO, COG, and KEGG databases compared with NCBI. We also obtained many candidate CYP450s and UGTs by the analysis of genes in the secondary metabolite biosynthetic pathways, including the putative terpenoid backbone and phenylpropanoid biosynthesis pathways. With this transcriptome sequencing, future genetic and genomics studies related to the molecular mechanisms associated with the chemical composition of P. tenuifolia may be improved. Genes involved in the enrichment of secondary metabolite biosynthesis-related pathways could enhance the potential applications of P. tenuifolia in pharmaceutical industries. PMID:26543847

  10. Pathway-Based Genome-Wide Association Studies for Two Meat Production Traits in Simmental Cattle

    PubMed Central

    Fan, Huizhong; Wu, Yang; Zhou, Xiaojing; Xia, Jiangwei; Zhang, Wengang; Song, Yuxin; Liu, Fei; Chen, Yan; Zhang, Lupei; Gao, Xue; Gao, Huijiang; Li, Junya

    2015-01-01

    Most single nucleotide polymorphisms (SNPs) detected by genome-wide association studies (GWAS), explain only a small fraction of phenotypic variation. Pathway-based GWAS were proposed to improve the proportion of genes for some human complex traits that could be explained by enriching a mass of SNPs within genetic groups. However, few attempts have been made to describe the quantitative traits in domestic animals. In this study, we used a dataset with approximately 7,700,000 SNPs from 807 Simmental cattle and analyzed live weight and longissimus muscle area using a modified pathway-based GWAS method to orthogonalise the highly linked SNPs within each gene using principal component analysis (PCA). As a result, of the 262 biological pathways of cattle collected from the KEGG database, the gamma aminobutyric acid (GABA)ergic synapse pathway and the non-alcoholic fatty liver disease (NAFLD) pathway were significantly associated with the two traits analyzed. The GABAergic synapse pathway was biologically applicable to the traits analyzed because of its roles in feed intake and weight gain. The proposed method had high statistical power and a low false discovery rate, compared to those of the smallest P-value and SNP set enrichment analysis methods. PMID:26672757

  11. Pathways enrichment analysis for differentially expressed genes in squamous lung cancer.

    PubMed

    Qian, Liqiang; Luo, Qingquan; Zhao, Xiaojing; Huang, Jia

    2014-01-01

    Squamous lung cancer (SQLC) is a common type of lung cancer, but its oncogenesis mechanism is not clear. The aim of this study was to screen the potential pathways changed in SQLC and elucidate its mechanism. Published microarray data of the GSE3268 series was downloaded from Gene Expression Omnibus (GEO). Significance analysis of microarrays was performed using R software, and differentially expressed genes (DEGs) were harvested. The functions and pathways of DEGs were mapped in the Gene Ontology and KEGG pathway databases, respectively. A total of 2961 genes were filtered as DEGs between normal and SQLC cells. Cell cycle and metabolism were the mainly changed functions of SQLC cells. Meanwhile, genes such as MCM, RFC, FEN1, and POLD may induce SQLC through the DNA replication pathway, and genes such as PTTG1, CCNB1, CDC6, and PCNA may be involved in SQLC through the cell cycle pathway. It is demonstrated that pathway analysis is useful in the identification of target genes in SQLC.

  12. De Novo Transcriptomic Analysis of Peripheral Blood Lymphocytes from the Chinese Goose: Gene Discovery and Immune System Pathway Description

    PubMed Central

    Tariq, Mansoor; Chen, Rong; Yuan, Hongyu; Liu, Yanjie; Wu, Yanan; Wang, Junya; Xia, Chun

    2015-01-01

    Background The Chinese goose is one of the most economically important poultry birds and is a natural reservoir for many avian viruses. However, the nature and regulation of the innate and adaptive immune systems of this waterfowl species are not completely understood due to limited information on the goose genome. Recently, transcriptome sequencing technology was applied in the genomic studies focused on novel gene discovery. Thus, this study described the transcriptome of the goose peripheral blood lymphocytes to identify immunity relevant genes. Principal Findings De novo transcriptome assembly of the goose peripheral blood lymphocytes was sequenced by Illumina-Solexa technology. In total, 211,198 unigenes were assembled from the 69.36 million cleaned reads. The average length, N50 size and the maximum length of the assembled unigenes were 687 bp, 1,298 bp and 18,992 bp, respectively. A total of 36,854 unigenes showed similarity by BLAST search against the NCBI non-redundant (Nr) protein database. For functional classification, 163,161 unigenes were comprised of three Gene Ontology (GO) categories and 67 subcategories. A total of 15,334 unigenes were annotated into 25 eukaryotic orthologous groups (KOGs) categories. Kyoto Encyclopedia of Genes and Genomes (KEGG) database annotated 39,585 unigenes into six biological functional groups and 308 pathways. Among the 2,757 unigenes that participated in the 15 immune system KEGG pathways, 125 of the most important immune relevant genes were summarized and analyzed by STRING analysis to identify gene interactions and relationships. Moreover, 10 genes were confirmed by PCR and analyzed. Of these 125 unigenes, 109 unigenes, approximately 87%, were not previously identified in the goose. Conclusion This de novo transcriptome analysis could provide important Chinese goose sequence information and highlights the value of new gene discovery, pathways investigation and immune system gene identification, and comparison with

  13. HypoxiaDB: a database of hypoxia-regulated proteins

    PubMed Central

    Khurana, Pankaj; Sugadev, Ragumani; Jain, Jaspreet; Singh, Shashi Bala

    2013-01-01

    There has been intense interest in the cellular response to hypoxia, and a large number of differentially expressed proteins have been identified through various high-throughput experiments. These valuable data are scattered, and there have been no systematic attempts to document the various proteins regulated by hypoxia. Compilation, curation and annotation of these data are important in deciphering their role in hypoxia and hypoxia-related disorders. Therefore, we have compiled HypoxiaDB, a database of hypoxia-regulated proteins. It is a comprehensive, manually-curated, non-redundant catalog of proteins whose expressions are shown experimentally to be altered at different levels and durations of hypoxia. The database currently contains 72 000 manually curated entries taken on 3500 proteins extracted from 73 peer-reviewed publications selected from PubMed. HypoxiaDB is distinctive from other generalized databases: (i) it compiles tissue-specific protein expression changes under different levels and duration of hypoxia. Also, it provides manually curated literature references to support the inclusion of the protein in the database and establish its association with hypoxia. (ii) For each protein, HypoxiaDB integrates data on gene ontology, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway, protein–protein interactions, protein family (Pfam), OMIM (Online Mendelian Inheritance in Man), PDB (Protein Data Bank) structures and homology to other sequenced genomes. (iii) It also provides pre-compiled information on hypoxia-proteins, which otherwise requires tedious computational analysis. This includes information like chromosomal location, identifiers like Entrez, HGNC, Unigene, Uniprot, Ensembl, Vega, GI numbers and Genbank accession numbers associated with the protein. These are further cross-linked to respective public databases augmenting HypoxiaDB to the external repositories. (iv) In addition, HypoxiaDB provides an online sequence-similarity search tool for

  14. Transcriptome Analysis Reveals the Genetic Basis of the Resveratrol Biosynthesis Pathway in an Endophytic Fungus (Alternaria sp. MG1) Isolated from Vitis vinifera

    PubMed Central

    Che, Jinxin; Shi, Junling; Gao, Zhenhong; Zhang, Yan

    2016-01-01

    Alternaria sp. MG1, an endophytic fungus previously isolated from Merlot grape, produces resveratrol from glucose, showing similar metabolic flux to the phenylpropanoid biosynthesis pathway, currently found solely in plants. In order to identify the resveratrol biosynthesis pathway in this strain at the gene level, de novo transcriptome sequencing was conducted using Illumina paired-end sequencing. A total of 22,954,434 high-quality reads were assembled into contigs and 18,570 unigenes were identified. Among these unigenes, 14,153 were annotated in the NCBI non-redundant protein database and 5341 were annotated in the Swiss-Prot database. After KEGG mapping, 2701 unigenes were mapped onto 115 pathways. Eighty-four unigenes were annotated in major pathways from glucose to resveratrol, coding 20 enzymes for glycolysis, 10 for phenylalanine biosynthesis, 4 for phenylpropanoid biosynthesis, and 4 for stilbenoid biosynthesis. Chalcone synthase was identified for resveratrol biosynthesis in this strain, due to the absence of stilbene synthase. All the identified enzymes indicated a reasonable biosynthesis pathway from glucose to resveratrol via glycolysis, phenylalanine biosynthesis, phenylpropanoid biosynthesis, and stilbenoid pathways. These results provide essential evidence for the occurrence of resveratrol biosynthesis in Alternaria sp. MG1 at the gene level, facilitating further elucidation of the molecular mechanisms involved in this strain's secondary metabolism. PMID:27588016

  15. Aligning Metabolic Pathways Exploiting Binary Relation of Reactions

    PubMed Central

    Zhong, Cheng; Lin, Hai Xiang; Huang, Jing

    2016-01-01

    Metabolic pathway alignment has been widely used to find one-to-one and/or one-to-many reaction mappings to identify the alternative pathways that have similar functions through different sets of reactions, which has important applications in reconstructing phylogeny and understanding metabolic functions. The existing alignment methods exhaustively search reaction sets, which may become infeasible for large pathways. To address this problem, we present an effective alignment method for accurately extracting reaction mappings between two metabolic pathways. We show that connected relation between reactions can be formalized as binary relation of reactions in metabolic pathways, and the multiplications of zero-one matrices for binary relations of reactions can be accomplished in finite steps. By utilizing the multiplications of zero-one matrices for binary relation of reactions, we efficiently obtain reaction sets in a small number of steps without exhaustive search, and accurately uncover biologically relevant reaction mappings. Furthermore, we introduce a measure of topological similarity of nodes (reactions) by comparing the structural similarity of the k-neighborhood subgraphs of the nodes in aligning metabolic pathways. We employ this similarity metric to improve the accuracy of the alignments. The experimental results on the KEGG database show that when compared with other state-of-the-art methods, in most cases, our method obtains better performance in the node correctness and edge correctness, and the number of the edges of the largest common connected subgraph for one-to-one reaction mappings, and the number of correct one-to-many reaction mappings. Our method is scalable in finding more reaction mappings with better biological relevance in large metabolic pathways. PMID:27936108
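
    To illustrate the zero-one matrix idea mentioned above, here is a minimal sketch of Boolean matrix multiplication applied to a reaction connectivity matrix. It is not the authors' alignment algorithm. The function names and the four-reaction chain are hypothetical, and the sketch only shows how repeated Boolean products reveal which reactions are connected within a given number of steps.

```python
def bool_matmul(a, b):
    """Boolean product of two zero-one matrices given as lists of lists."""
    inner = len(b)
    cols = len(b[0])
    return [
        [int(any(row[k] and b[k][j] for k in range(inner))) for j in range(cols)]
        for row in a
    ]


def reachable_within(adj, steps):
    """Union of adj^1 .. adj^steps under Boolean multiplication.

    Entry [i][j] is 1 if reaction j can be reached from reaction i
    by following between 1 and `steps` connections.
    """
    power = [row[:] for row in adj]
    reach = [row[:] for row in adj]
    for _ in range(steps - 1):
        power = bool_matmul(power, adj)
        reach = [[x | y for x, y in zip(r1, r2)] for r1, r2 in zip(reach, power)]
    return reach


# Hypothetical linear pathway R0 -> R1 -> R2 -> R3.
chain = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
two_step = reachable_within(chain, 2)
for row in two_step:
    print(row)
```

    Because the entries only ever take the values 0 and 1, the sequence of Boolean powers of an n-by-n matrix stabilises or repeats after finitely many products, which is consistent with the "finite steps" remark in the abstract.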

  16. Characterization of Differentially Expressed Genes Involved in Pathways Associated with Gastric Cancer

    PubMed Central

    Li, Hao; Yu, Beiqin; Li, Jianfang; Su, Liping; Yan, Min; Zhang, Jun; Li, Chen; Zhu, Zhenggang; Liu, Bingya

    2015-01-01

    To explore the patterns of gene expression in gastric cancer, a total of 26 paired gastric cancer and noncancerous tissues from patients were enrolled for gene expression microarray analyses. Limma methods were applied to analyze the data, and genes were considered significantly differentially expressed if the False Discovery Rate (FDR) value was <0.01, the P-value was <0.01 and the fold change (FC) was >2. Subsequently, Gene Ontology (GO) categories were used to analyze the main functions of the differentially expressed genes. Using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, we found pathways significantly associated with the differential genes. A Gene-Act network and a co-expression network were each built based on the relationships among the genes, proteins and compounds in the database. In total, 2371 mRNAs and 350 lncRNAs considered significantly differentially expressed were selected for further analysis. The GO categories, pathway analyses and the Gene-Act network showed a consistent result: up-regulated genes were responsible for tumorigenesis, migration, angiogenesis and microenvironment formation, while down-regulated genes were involved in metabolism. The results of this study provide some novel findings on coding RNAs, lncRNAs, pathways and the co-expression network in gastric cancer, which will be useful to guide further investigation and targeted therapy for this disease. PMID:25928635
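
    The selection rule stated in this abstract (FDR < 0.01, P < 0.01, FC > 2) can be written as a short filter. The sketch below is only an illustration: the record type, the gene names and the numbers are invented, and treating the fold-change cut-off symmetrically (more than 2-fold up or down) is an assumption the abstract does not confirm.

```python
import math
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str           # hypothetical identifier
    fold_change: float  # tumor / normal expression ratio, must be > 0
    p_value: float
    fdr: float


def is_significant(r, fdr_cut=0.01, p_cut=0.01, fc_cut=2.0):
    """Apply all three thresholds; |log2 FC| makes the fold-change test symmetric (assumption)."""
    return (
        r.fdr < fdr_cut
        and r.p_value < p_cut
        and abs(math.log2(r.fold_change)) > math.log2(fc_cut)
    )


# Invented example records, not data from the study.
results = [
    GeneResult("geneA", 4.0, 0.001, 0.005),  # up-regulated, passes
    GeneResult("geneB", 0.2, 0.001, 0.005),  # down-regulated, passes
    GeneResult("geneC", 1.5, 0.001, 0.005),  # fold change too small
    GeneResult("geneD", 4.0, 0.001, 0.050),  # FDR too high
]
selected = [r.gene for r in results if is_significant(r)]
```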

  17. Identification of compound-protein interactions through the analysis of gene ontology, KEGG enrichment for proteins and molecular fragments of compounds.

    PubMed

    Chen, Lei; Zhang, Yu-Hang; Zheng, Mingyue; Huang, Tao; Cai, Yu-Dong

    2016-12-01

    Compound-protein interactions play important roles in every cell via the recognition and regulation of specific functional proteins. The correct identification of compound-protein interactions can lead to a good comprehension of this complicated system and provide useful input for the investigation of various attributes of compounds and proteins. In this study, we attempted to understand this system by extracting properties from both proteins and compounds, in which proteins were represented by gene ontology and KEGG pathway enrichment scores and compounds were represented by molecular fragments. Advanced feature selection methods, including minimum redundancy maximum relevance, incremental feature selection, and the basic machine learning algorithm random forest, were used to analyze these properties and extract core factors for the determination of actual compound-protein interactions. Compound-protein interactions reported in The Binding Databases were used as positive samples. To improve the reliability of the results, the analytic procedure was executed five times using different negative samples. Simultaneously, five optimal prediction methods based on a random forest and yielding maximum MCCs of approximately 77.55 % were constructed and may be useful tools for the prediction of compound-protein interactions. This work provides new clues to understanding the system of compound-protein interactions by analyzing extracted core features. Our results indicate that compound-protein interactions are related to biological processes involving immune, developmental and hormone-associated pathways.
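
    The MCC reported above is the Matthews correlation coefficient, a standard measure for binary classifiers computed from the confusion matrix. A minimal sketch follows; the counts in the example are invented and are not taken from the study.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Invented counts: 45 true positives, 43 true negatives, 7 false positives, 5 false negatives.
example = mcc(tp=45, tn=43, fp=7, fn=5)
```

    Unlike plain accuracy, MCC stays informative when positive and negative samples are unbalanced, which matters when negative samples are drawn repeatedly as described in the abstract.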

  18. EXPath tool-a system for comprehensively analyzing regulatory pathways and coexpression networks from high-throughput transcriptome data.

    PubMed

    Zheng, Han-Qin; Wu, Nai-Yun; Chow, Chi-Nga; Tseng, Kuan-Chieh; Chien, Chia-Hung; Hung, Yu-Cheng; Li, Guan-Zhen; Chang, Wen-Chi

    2017-03-13

    Next generation sequencing (NGS) has become the mainstream approach for monitoring gene expression levels in parallel with various experimental treatments. Unfortunately, there is no systematic web server for comprehensive further analysis of the huge amount of preliminary data obtained once gene annotation is finished. Therefore, a user-friendly and effective system is required to mine important genes and regulatory pathways under specific conditions from high-throughput transcriptome data. EXPath Tool (available at: http://expathtool.itps.ncku.edu.tw/) was developed for the pathway annotation and comparative analysis of user-customized gene expression profiles derived from microarray or NGS platforms under various conditions, to infer metabolic pathways for all organisms in the KEGG database. EXPath Tool provides several functions: accessing gene expression patterns and candidate co-expressed genes; dissecting differentially expressed genes (DEGs) between two conditions (DEGs search); functional grouping by pathway and GO (Pathway/GO enrichment analysis); building correlation networks (co-expression analysis); and viewing the expression patterns of genes involved in specific pathways to infer the effects of a treatment. Additionally, the effectiveness of EXPath Tool was demonstrated by a case study on IAA-responsive genes. The results demonstrated that critical hub genes under IAA treatment could be efficiently identified.

  19. Weighted gene co-expression network analysis in identification of metastasis-related genes of lung squamous cell carcinoma based on the Cancer Genome Atlas database

    PubMed Central

    Tian, Feng; Zhao, Jinlong; Kang, Zhenxing

    2017-01-01

    Background Lung squamous cell carcinoma (lung SCC) is a common type of malignancy. The pathogenic mechanism of its development is unclear. The aim of this study was to identify key genes as diagnostic biomarkers in lung SCC metastasis. Methods We searched and downloaded mRNA expression data and clinical data from The Cancer Genome Atlas (TCGA) database to identify differences in mRNA expression of primary tumor tissues from lung SCC with and without metastasis. Gene co-expression network analysis, protein-protein interaction (PPI) network analysis, Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and quantitative real-time polymerase chain reactions (qRT-PCR) were used to explore the biological functions of the identified dysregulated genes. Results Four hundred and eighty-two differentially expressed genes (DEGs) were identified between lung SCC with and without metastasis. Nineteen modules were identified in lung SCC through weighted gene co-expression network analysis (WGCNA). Twenty-three DEGs and 26 DEGs were significantly enriched in the pink and black modules, respectively. KEGG pathway analysis showed that the 26 DEGs in the black module were significantly enriched in the bile secretion pathway. The forty-nine DEGs in the two gene co-expression modules were used to construct a PPI network. CFTR in the black module was the hub protein, with connections to 182 genes. The results of qRT-PCR showed that FIGF, SFTPD and DYNLRB2 were significantly down-regulated in the tumor samples of lung SCC with metastasis, and that CFTR, SCGB3A2, SSTR1, SCTR and ROPN1L showed a tendency towards down-regulation in lung SCC with metastasis compared to lung SCC without metastasis. Conclusions The dysregulated genes including CFTR, SCTR and FIGF might be involved in the pathology of lung SCC metastasis and could be used as potential diagnostic biomarkers or therapeutic targets for lung SCC. PMID:28203405

  20. MTD: a mammalian transcriptomic database to explore gene expression and regulation

    PubMed Central

    Sun, Qianqian; Li, Xue; Xian, Feng; Sun, Manman; Fang, Wan; Chen, Meili; Yu, Jun; Xiao, Jingfa

    2017-01-01

    A systematic transcriptome survey is essential for the characterization and comprehension of the molecular basis underlying phenotypic variations. Recently developed RNA-seq methodology has facilitated efficient data acquisition and information mining of transcriptomes in multiple tissues/cell lines. Current mammalian transcriptomic databases are either tissue-specific or species-specific, and they lack in-depth comparative features across tissues and species. Here, we present a mammalian transcriptomic database (MTD) that is focused on mammalian transcriptomes, and the current version contains data from humans, mice, rats and pigs. Regarding the core features, the MTD browses genes based on their neighboring genomic coordinates or joint KEGG pathway and provides expression information on exons, transcripts and genes by integrating them into a genome browser. We developed a novel nomenclature for each transcript that considers its genomic position and transcriptional features. The MTD allows a flexible search of genes or isoforms with user-defined transcriptional characteristics and provides both table-based descriptions and associated visualizations. To elucidate the dynamics of gene expression regulation, the MTD also enables comparative transcriptomic analysis in both intraspecies and interspecies manner. The MTD thus constitutes a valuable resource for transcriptomic and evolutionary studies. The MTD is freely accessible at http://mtd.cbi.ac.cn. PMID:26822098

  1. ‘RetinoGenetics’: a comprehensive mutation database for genes related to inherited retinal degeneration

    PubMed Central

    Ran, Xia; Cai, Wei-Jun; Huang, Xiu-Feng; Liu, Qi; Lu, Fan; Qu, Jia; Wu, Jinyu; Jin, Zi-Bing

    2014-01-01

    Inherited retinal degeneration (IRD), a leading cause of human blindness worldwide, is exceptionally heterogeneous with clinical heterogeneity and genetic variety. During the past decades, tremendous efforts have been made to explore the complex heterogeneity, and massive mutations have been identified in different genes underlying IRD with the significant advancement of sequencing technology. In this study, we developed a comprehensive database, ‘RetinoGenetics’, which contains informative knowledge about all known IRD-related genes and mutations for IRD. ‘RetinoGenetics’ currently contains 4270 mutations in 186 genes, with detailed information associated with 164 phenotypes from 934 publications and various types of functional annotations. Then extensive annotations were performed to each gene using various resources, including Gene Ontology, KEGG pathways, protein–protein interaction, mutational annotations and gene–disease network. Furthermore, by using the search functions, convenient browsing ways and intuitive graphical displays, ‘RetinoGenetics’ could serve as a valuable resource for unveiling the genetic basis of IRD. Taken together, ‘RetinoGenetics’ is an integrative, informative and updatable resource for IRD-related genetic predispositions. Database URL: http://www.retinogenetics.org/. PMID:24939193

  2. 'RetinoGenetics': a comprehensive mutation database for genes related to inherited retinal degeneration.

    PubMed

    Ran, Xia; Cai, Wei-Jun; Huang, Xiu-Feng; Liu, Qi; Lu, Fan; Qu, Jia; Wu, Jinyu; Jin, Zi-Bing

    2014-01-01

    Inherited retinal degeneration (IRD), a leading cause of human blindness worldwide, is exceptionally heterogeneous with clinical heterogeneity and genetic variety. During the past decades, tremendous efforts have been made to explore the complex heterogeneity, and massive mutations have been identified in different genes underlying IRD with the significant advancement of sequencing technology. In this study, we developed a comprehensive database, 'RetinoGenetics', which contains informative knowledge about all known IRD-related genes and mutations for IRD. 'RetinoGenetics' currently contains 4270 mutations in 186 genes, with detailed information associated with 164 phenotypes from 934 publications and various types of functional annotations. Then extensive annotations were performed to each gene using various resources, including Gene Ontology, KEGG pathways, protein-protein interaction, mutational annotations and gene-disease network. Furthermore, by using the search functions, convenient browsing ways and intuitive graphical displays, 'RetinoGenetics' could serve as a valuable resource for unveiling the genetic basis of IRD. Taken together, 'RetinoGenetics' is an integrative, informative and updatable resource for IRD-related genetic predispositions. Database URL: http://www.retinogenetics.org/.

  3. Integrated miRNA–risk gene–pathway pair network analysis provides prognostic biomarkers for gastric cancer

    PubMed Central

    Cai, Hui; Xu, Jiping; Han, Yifang; Lu, Zhengmao; Han, Ting; Ding, Yibo; Ma, Liye

    2016-01-01

    Purpose This study aimed to identify molecular prognostic biomarkers for gastric cancer. Methods mRNA and miRNA expression profiles of eligible gastric cancer and control samples were downloaded from Gene Expression Omnibus to screen the differentially expressed genes (DEGs) and differentially expressed miRNAs (DEmiRs), using the MetaDE and limma packages, respectively. Target genes of the DEmiRs were also collected from both predictive and experimentally validated miRNA target databases. The overlapping genes between selected targets and DEGs were identified as risk genes, followed by functional enrichment analysis. Human pathways and their corresponding genes were downloaded from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database for the expression analysis of each pathway in gastric cancer samples. Next, co-pathway pairs were selected according to the Pearson correlation coefficients. Finally, the co-pathway pairs, miRNA–target pairs, and risk gene–pathway pairs were merged into a complex interaction network, the most important nodes (miRNAs/target genes/co-pathway pairs) of which were selected by calculating their degrees. Results In total, 1,260 DEGs and 144 DEmiRs were identified. There were 336 risk genes found in the 9,572 miRNA–target pairs. Judging from the pathway expression files, 45 co-pathway pairs were screened out. There were 1,389 interactive pairs and 480 nodes in the integrated network. Among all nodes in the network, focal adhesion/extracellular matrix–receptor interaction pathways, CALM2, miR-19b, and miR-181b were the hub nodes with higher degrees. Conclusion CALM2, hsa-miR-19b, and hsa-miR-181b might be used as potential prognostic targets for gastric cancer. PMID:27284247
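
    Selecting co-pathway pairs by Pearson correlation, as described in the Methods above, might look roughly like the sketch below. The pathway names, the per-sample scores and the 0.8 cut-off are all hypothetical; the abstract does not state the threshold or how pathway expression was summarized.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical pathway-level expression scores across four samples.
pathway_scores = {
    "pathway_A": [1.0, 2.0, 3.0, 4.0],
    "pathway_B": [2.0, 4.0, 6.0, 8.0],
    "pathway_C": [4.0, 1.0, 3.0, 2.0],
}

THRESHOLD = 0.8  # assumed cut-off, not taken from the study
co_pairs = [
    (p, q)
    for p, q in combinations(sorted(pathway_scores), 2)
    if abs(pearson(pathway_scores[p], pathway_scores[q])) >= THRESHOLD
]
```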

  4. Databases for Microbiologists

    DOE PAGES

    Zhulin, Igor B.

    2015-05-26

    Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists.

  5. Databases for Microbiologists

    PubMed Central

    2015-01-01

    Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists. PMID:26013493

  6. MRMPath and MRMutation, Facilitating Discovery of Mass Transitions for Proteotypic Peptides in Biological Pathways Using a Bioinformatics Approach

    PubMed Central

    Crasto, Chiquito; Narne, Chandrahas; Kawai, Mikako; Wilson, Landon; Barnes, Stephen

    2013-01-01

    Quantitative proteomics applications in mass spectrometry depend on the knowledge of the mass-to-charge ratio (m/z) values of proteotypic peptides for the proteins under study and their product ions. MRMPath and MRMutation, web-based bioinformatics software that are platform independent, facilitate the recovery of this information by biologists. MRMPath utilizes publicly available information related to biological pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. All the proteins involved in pathways of interest are recovered and processed in silico to extract information relevant to quantitative mass spectrometry analysis. Peptides may also be subjected to automated BLAST analysis to determine whether they are proteotypic. MRMutation catalogs and makes available, following processing, known (mutant) variants of proteins from the current UniProtKB database. All these results, available via the web from well-maintained, public databases, are written to an Excel spreadsheet, which the user can download and save. MRMPath and MRMutation can be freely accessed. As a system that seeks to allow two or more resources to interoperate, MRMPath represents an advance in bioinformatics tool development. As a practical matter, the MRMPath automated approach represents significant time savings to researchers. PMID:23424586
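
    As a rough illustration of the kind of in silico value such a tool needs, the sketch below computes the monoisotopic m/z of an unmodified peptide from standard residue masses. It is not MRMPath's code: the function name is hypothetical, and modifications, product ions and the digestion step are deliberately left out.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water is added for the free N- and C-termini
PROTON = 1.007276  # mass of the charge carrier


def peptide_mz(sequence, charge=1):
    """m/z of an unmodified peptide carrying `charge` protons."""
    neutral_mass = sum(MONO[residue] for residue in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


# Example with the commonly used test sequence "PEPTIDE".
mz_1plus = peptide_mz("PEPTIDE", charge=1)
mz_2plus = peptide_mz("PEPTIDE", charge=2)
```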

  7. Pathway deviation-based biomarker and multi-effect target identification in asbestos-related squamous cell carcinoma of the lung

    PubMed Central

    Du, Jiang; Zhang, Lin

    2017-01-01

    Asbestos-related lung carcinoma is one of the most devastating occupational cancers, and effective techniques for early diagnosis are still lacking. In the present study, a systematic approach was applied to detect a potential biomarker for asbestos-related lung cancer (ARLC); in particular asbestos-related squamous cell carcinoma (ARLC-SCC). Microarray data (GSE23822) were retrieved from the Gene Expression Omnibus database, including 26 ARLC-SCCs and 30 non-asbestos-related squamous cell lung carcinomas (NARLC-SCCs). Differentially expressed genes (DEGs) were identified by the limma package, and then a protein-protein interaction (PPI) network was constructed according to the BioGRID and HPRD databases. A novel scoring approach integrating an expression deviation score and network degree of the gene was then proposed to weight the DEGs. Subsequently, the important genes were uploaded to DAVID for pathway enrichment analysis. Pathway correlation analysis was carried out using Spearman's rank correlation coefficient of the pathscore. In total, 1,333 DEGs, 391 upregulated and 942 downregulated, were obtained between the ARLC-SCCs and NARLC-SCCs. A total of 524 important genes for ARLC-SCC were significantly enriched in 22 KEGG pathways. Correlation analysis of these pathways showed that the pathway of SNARE interactions in vesicular transport was significantly correlated with 12 other pathways. Additionally, obvious correlations were found between multiple pathways by sharing cross-talk genes (EGFR, PRKX, PDGFB, PIK3R3, SLK, IGF1, CDC42 and PRKCA). On the whole, our data demonstrate that 8 cross-talk genes were found to bridge multiple ARLC-SCC-specific pathways, which may be used as candidate biomarkers and potential multi-effect targets. As these genes are involved in multiple pathways, it is possible that drugs targeting these genes may thus be able to influence multiple pathways simultaneously. PMID:28204826
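
    Spearman's rank correlation, used above to correlate pathway scores, is the Pearson correlation of the ranks. A minimal sketch with average ranks for ties follows; the input vectors are invented, and how the study computed its pathscore is not shown here.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Invented scores: a monotonic but non-linear relationship still gives rho = 1.
rho = spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
```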

  8. MetRxn: a knowledgebase of metabolites and reactions spanning metabolic models and databases

    PubMed Central

    2012-01-01

    Background Increasingly, metabolite and reaction information is organized in the form of genome-scale metabolic reconstructions that describe the reaction stoichiometry, directionality, and gene to protein to reaction associations. A key bottleneck in the pace of reconstruction of new, high-quality metabolic models is the inability to directly make use of metabolite/reaction information from biological databases or other models due to incompatibilities in content representation (i.e., metabolites with multiple names across databases and models), stoichiometric errors such as elemental or charge imbalances, and incomplete atomistic detail (e.g., use of generic R-group or non-explicit specification of stereo-specificity). Description MetRxn is a knowledgebase that includes standardized metabolite and reaction descriptions by integrating information from BRENDA, KEGG, MetaCyc, Reactome.org and 44 metabolic models into a single unified data set. All metabolite entries have matched synonyms, resolved protonation states, and are linked to unique structures. All reaction entries are elementally and charge balanced. This is accomplished through the use of a workflow of lexicographic, phonetic, and structural comparison algorithms. MetRxn allows for the download of standardized versions of existing genome-scale metabolic models and the use of metabolic information for the rapid reconstruction of new ones. Conclusions The standardization in description allows for the direct comparison of the metabolite and reaction content between metabolic models and databases and the exhaustive prospecting of pathways for biotechnological production. This ever-growing dataset currently consists of over 76,000 metabolites participating in more than 72,000 reactions (including unresolved entries). MetRxn is hosted on a web-based platform that uses relational database models (MySQL). PMID:22233419

  9. The 2014 Nucleic Acids Research Database Issue and an updated NAR online Molecular Biology Database Collection.

    PubMed

    Fernández-Suárez, Xosé M; Rigden, Daniel J; Galperin, Michael Y

    2014-01-01

    The 2014 Nucleic Acids Research Database Issue includes descriptions of 58 new molecular biology databases and recent updates to 123 databases previously featured in NAR or other journals. For convenience, the issue is now divided into eight sections that reflect major subject categories. Among the highlights of this issue are six databases of the transcription factor binding sites in various organisms and updates on such popular databases as CAZy, Database of Genomic Variants (DGV), dbGaP, DrugBank, KEGG, miRBase, Pfam, Reactome, SEED, TCDB and UniProt. There is a strong block of structural databases, which includes, among others, the new RNA Bricks database, updates on PDBe, PDBsum, ArchDB, Gene3D, ModBase, Nucleic Acid Database and the recently revived iPfam database. An update on the NCBI's MMDB describes VAST+, an improved tool for protein structure comparison. Two articles highlight the development of the Structural Classification of Proteins (SCOP) database: one describes SCOPe, which automates assignment of new structures to the existing SCOP hierarchy; the other one describes the first version of SCOP2, with its more flexible approach to classifying protein structures. This issue also includes a collection of articles on bacterial taxonomy and metagenomics, which includes updates on the List of Prokaryotic Names with Standing in Nomenclature (LPSN), Ribosomal Database Project (RDP), the Silva/LTP project and several new metagenomics resources. The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been expanded to 1552 databases. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).

  10. Session on computation in biological pathways

    SciTech Connect

    Karp, P.D.; Riley, M.

    1996-12-31

    The papers in this session focus on the development of pathway databases and computational tools for pathway analysis. The discussion involves existing databases of sequenced genomes, as well as techniques for studying regulatory pathways.

  11. Gene expression analysis reveals the dysregulation of immune and metabolic pathways in Alzheimer's disease

    PubMed Central

    Li, Zhiyan; Xu, Panpan; Yao, Lifen

    2016-01-01

    In recent years, several pathway analyses of genome-wide association studies reported the involvement of metabolic and immune pathways in Alzheimer's disease (AD). Until now, the exact mechanisms of these pathways in AD have remained unclear. Here, we conducted a pathway analysis of a whole genome AD case-control expression dataset (n=41, 25 AD cases and 16 controls) from human temporal cortex tissue. Using the differentially expressed AD genes, we identified significant KEGG pathways related to metabolism and immune processes. Using the up- and down-regulated AD gene lists, we further found that up-regulated AD genes were significantly enriched in immune and metabolic pathways. We further compared the immune and metabolic KEGG pathways from the expression dataset with those from previous GWAS datasets, and found that most of these pathways are shared by both the GWAS and expression datasets. PMID:27732949

  12. Gene expression profiling of epithelial ovarian cancer reveals key genes and pathways associated with chemotherapy resistance.

    PubMed

    Zhang, M; Luo, S C

    2016-01-22

    The aim of this study is to analyze gene expression data to identify key genes and pathways associated with resistance to platinum-based chemotherapy in epithelial ovarian cancer (EOC) and to improve clinical treatment strategies. The gene expression data set was downloaded from Gene Expression Omnibus and included 12 chemotherapy-resistant EOC samples and 16 chemotherapy-sensitive EOC samples. A differential analysis was performed to screen out differentially expressed genes (DEGs). A functional enrichment analysis was conducted for the DEGs using the Database for Annotation, Visualization and Integrated Discovery (DAVID). A protein-protein interaction (PPI) network was constructed with information from the Human Protein Reference Database. Pathway-pathway interactions were determined with a test based on the hypergeometric distribution. A total of 1564 DEGs were identified in chemotherapy-sensitive EOC, including 654 upregulated genes and 910 downregulated genes. The top three upregulated genes were HIST1H3G, AKT3, and RTN3, while the top three downregulated genes were NBLA00301, TRIM62, and EPHA5. A Gene Ontology enrichment analysis showed that cell adhesion, biological adhesion, and intracellular signaling cascades were significantly enriched in the DEGs. A KEGG pathway enrichment analysis revealed that the calcium, mitogen-activated protein kinase, and B cell receptor signaling pathways were significantly over-represented in the DEGs. A PPI network containing 101 interactions was acquired. The top three hub genes were RAC1, CAV1, and BCL2. Five modules were identified from the PPI network. Taken together, these findings could advance the understanding of the molecular mechanisms underlying intrinsic chemotherapy resistance in EOC.
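
    A test based on the hypergeometric distribution, as mentioned above, asks how likely an overlap of at least k genes is by chance. The sketch below computes that upper-tail probability exactly; the parameter names and the example sizes are assumptions, and the abstract does not say how the background or the significance cut-off were chosen.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when `draws` genes are sampled without replacement from a
    background of `population` genes, `successes` of which belong to the set of interest."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


# Invented sizes: 20,000 background genes, a 150-gene pathway,
# a 300-gene list, and 12 genes shared between the two.
p_value = hypergeom_upper_tail(k=12, population=20000, successes=150, draws=300)
```

    With these invented sizes the expected overlap is about 2 genes, so an overlap of 12 yields a very small probability; in practice such p-values would normally be corrected for the number of pathways tested.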

  13. Path2Models: large-scale generation of computational models from biochemical pathway maps

    PubMed Central

    2013-01-01

    Background Systems biology projects and omics technologies have led to a growing number of biochemical pathway models and reconstructions. However, the majority of these models are still created de novo, based on literature mining and the manual processing of pathway data. Results To increase the efficiency of model creation, the Path2Models project has automatically generated mathematical models from pathway representations using a suite of freely available software. Data sources include KEGG, BioCarta, MetaCyc and SABIO-RK. Depending on the source data, three types of models are provided: kinetic, logical and constraint-based. Models from over 2 600 organisms are encoded consistently in SBML, and are made freely available through BioModels Database at http://www.ebi.ac.uk/biomodels-main/path2models. Each model contains the list of participants, their interactions, the relevant mathematical constructs, and initial parameter values. Most models are also available as easy-to-understand graphical SBGN maps. Conclusions To date, the project has resulted in more than 140 000 freely available models. Such a resource can tremendously accelerate the development of mathematical models by providing initial starting models for simulation and analysis, which can be subsequently curated and further parameterized. PMID:24180668

  14. EuDBase: An online resource for automated EST analysis pipeline (ESTFrontier) and database for red seaweed Eucheuma denticulatum.

    PubMed

    Hussein, Zeti Azura Mohamed; Loke, Kok Keong; Abidin, Rabiatul Adawiah Zainal; Othman, Roohaida

    2011-01-01

    Functional genomics has proven to be an efficient tool in identifying genes involved in various biological functions. However, the availability of functional resources for the commercially important seaweed Eucheuma denticulatum is still limited. EuDBase is the first seaweed online repository that provides integrated access to ESTs of Eucheuma denticulatum generated from samples collected from Kudat and Semporna in Sabah, Malaysia. The database stores 10,031 ESTs that are clustered and assembled into 2,275 unique transcripts (UT) and 955 singletons. Raw data were automatically processed using ESTFrontier, an in-house automated EST analysis pipeline. Data were collected in a MySQL database. The web interface is implemented using PHP and allows browsing and querying EuDBase through a search engine. Data are searchable via BLAST hit, domain search, Gene Ontology or KEGG Pathway. A user-friendly interface allows the identification of sequences using either a simple text query or a similarity search. The development of EuDBase was initiated to store, manage and analyze the E. denticulatum ESTs and to provide accumulative digital resources for use by the global scientific community. EuDBase is freely available from http://www.inbiosis.ukm.my/eudbase/.

  15. EuDBase: An online resource for automated EST analysis pipeline (ESTFrontier) and database for red seaweed Eucheuma denticulatum

    PubMed Central

    Hussein, Zeti Azura Mohamed; Loke, Kok Keong; Abidin, Rabiatul Adawiah Zainal; Othman, Roohaida

    2011-01-01

    Functional genomics has proven to be an efficient tool in identifying genes involved in various biological functions. However, the availability of functional resources for the commercially important seaweed Eucheuma denticulatum is still limited. EuDBase is the first seaweed online repository that provides integrated access to ESTs of Eucheuma denticulatum generated from samples collected from Kudat and Semporna in Sabah, Malaysia. The database stores 10,031 ESTs that are clustered and assembled into 2,275 unique transcripts (UT) and 955 singletons. Raw data were automatically processed using ESTFrontier, an in-house automated EST analysis pipeline. Data were collected in a MySQL database. The web interface is implemented using PHP and allows browsing and querying EuDBase through a search engine. Data are searchable via BLAST hit, domain search, Gene Ontology or KEGG Pathway. A user-friendly interface allows the identification of sequences using either a simple text query or a similarity search. The development of EuDBase was initiated to store, manage and analyze the E. denticulatum ESTs and to provide accumulative digital resources for use by the global scientific community. EuDBase is freely available from http://www.inbiosis.ukm.my/eudbase/. PMID:22102771

  16. Identification of hub genes and pathways associated with retinoblastoma based on co-expression network analysis.

    PubMed

    Wang, Q L; Chen, X; Zhang, M H; Shen, Q H; Qin, Z M

    2015-12-08

    The objective of this paper was to identify hub genes and pathways associated with retinoblastoma using centrality analysis of the co-expression network and pathway-enrichment analysis. The co-expression network of retinoblastoma was constructed by weighted gene co-expression network analysis (WGCNA) based on differentially expressed (DE) genes, and clusters were obtained through the molecular complex detection (MCODE) algorithm. Degree centrality analysis of the co-expression network was performed to explore hub genes present in retinoblastoma. Pathway-enrichment analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Validation of hub gene expression in retinoblastoma was performed by reverse transcription-polymerase chain reaction (RT-PCR) analysis. The co-expression network based on 221 DE genes between retinoblastoma and normal controls consisted of 210 nodes and 3965 edges, and 5 clusters of the network were evaluated. By assessing the centrality analysis of the co-expression network, 21 hub genes were identified, such as SNORD115-41, RASSF2, and SNORD115-44. According to RT-PCR analysis, 16 of the 21 hub genes were differently expressed, including RASSF2 and CDCA7, and 5 were not differently expressed in retinoblastoma compared to normal controls. Pathway analysis showed that genes in 2 clusters were enriched in 3 pathways: purine metabolism, p53 signaling pathway, and melanogenesis. In this study, we successfully identified 16 hub genes and 3 pathways associated with retinoblastoma, which may be potential biomarkers for early detection and therapy for retinoblastoma.
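
    Degree centrality, used above to pick hub genes, is the number of neighbors of a node divided by the maximum possible number. The sketch below shows the calculation on a toy edge list; the gene labels and edges are invented, and the WGCNA and MCODE steps that would produce the real network are not reproduced.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Map each node to degree / (n - 1) for an undirected network.

    Self-loops are ignored and duplicate edges count once.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {}
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbors.items()}


# Invented co-expression edges between four hypothetical genes.
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")]
centrality = degree_centrality(edges)

# Rank genes by centrality (ties broken by name) and take the top one as the hub.
ranked = sorted(centrality, key=lambda g: (-centrality[g], g))
hub = ranked[0]
```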

  17. A toolbox model of evolution of metabolic pathways on networks of arbitrary topology.

    PubMed

    Pang, Tin Yau; Maslov, Sergei

    2011-05-01

    In prokaryotic genomes the number of transcriptional regulators is known to be proportional to the square of the total number of protein-coding genes. A toolbox model of evolution was recently proposed to explain this empirical scaling for metabolic enzymes and their regulators. According to its rules, the metabolic network of an organism evolves by horizontal transfer of pathways from other species. These pathways are part of a larger "universal" network formed by the union of all species-specific networks. It remained to be understood, however, how the topological properties of this universal network influence the scaling law of functional content of genomes in the toolbox model. Here we answer this question by first analyzing the scaling properties of the toolbox model on arbitrary tree-like universal networks. We prove that critical branching topology, in which the average number of upstream neighbors of a node is equal to one, is both necessary and sufficient for quadratic scaling. We further generalize the rules of the model to incorporate reactions with multiple substrates/products as well as branched and cyclic metabolic pathways. To achieve its metabolic tasks, the new model employs evolutionary optimized pathways with minimal number of reactions. Numerical simulations of this realistic model on the universal network of all reactions in the KEGG database produced approximately quadratic scaling between the number of regulated pathways and the size of the metabolic network. To quantify the geometrical structure of individual pathways, we investigated the relationship between their number of reactions, byproducts, intermediate, and feedback metabolites. Our results validate and explain the ubiquitous appearance of the quadratic scaling for a broad spectrum of topologies of underlying universal metabolic networks. They also demonstrate why, in spite of "small-world" topology, real-life metabolic networks are characterized by a broad distribution of pathway

  18. Tissue Non-Specific Genes and Pathways Associated with Diabetes: An Expression Meta-Analysis.

    PubMed

    Mei, Hao; Li, Lianna; Liu, Shijian; Jiang, Fan; Griswold, Michael; Mosley, Thomas

    2017-01-21

    We performed expression studies to identify tissue non-specific genes and pathways of diabetes by meta-analysis. We searched curated datasets of the Gene Expression Omnibus (GEO) database and identified 13 and five expression studies of diabetes and insulin responses at various tissues, respectively. We tested differential gene expression by empirical Bayes-based linear method and investigated gene set expression association by knowledge-based enrichment analysis. Meta-analysis by different methods was applied to identify tissue non-specific genes and gene sets. We also proposed pathway mapping analysis to infer functions of the identified gene sets, and correlation and independent analysis to evaluate expression association profile of genes and gene sets between studies and tissues. Our analysis showed that PGRMC1 and HADH genes were significant over diabetes studies, while IRS1 and MPST genes were significant over insulin response studies, and joint analysis showed that HADH and MPST genes were significant over all combined data sets. The pathway analysis identified six significant gene sets over all studies. The KEGG pathway mapping indicated that the significant gene sets are related to diabetes pathogenesis. The results also presented that 12.8% and 59.0% pairwise studies had significantly correlated expression association for genes and gene sets, respectively; moreover, 12.8% pairwise studies had independent expression association for genes, but no studies were observed significantly different for expression association of gene sets. Our analysis indicated that there are both tissue specific and non-specific genes and pathways associated with diabetes pathogenesis. Compared to the gene expression, pathway association tends to be tissue non-specific, and a common pathway influencing diabetes development is activated through different genes at different tissues.

  19. Tissue Non-Specific Genes and Pathways Associated with Diabetes: An Expression Meta-Analysis

    PubMed Central

    Mei, Hao; Li, Lianna; Liu, Shijian; Jiang, Fan; Griswold, Michael; Mosley, Thomas

    2017-01-01

    We performed expression studies to identify tissue non-specific genes and pathways of diabetes by meta-analysis. We searched curated datasets of the Gene Expression Omnibus (GEO) database and identified 13 and five expression studies of diabetes and insulin responses at various tissues, respectively. We tested differential gene expression by empirical Bayes-based linear method and investigated gene set expression association by knowledge-based enrichment analysis. Meta-analysis by different methods was applied to identify tissue non-specific genes and gene sets. We also proposed pathway mapping analysis to infer functions of the identified gene sets, and correlation and independent analysis to evaluate expression association profile of genes and gene sets between studies and tissues. Our analysis showed that PGRMC1 and HADH genes were significant over diabetes studies, while IRS1 and MPST genes were significant over insulin response studies, and joint analysis showed that HADH and MPST genes were significant over all combined data sets. The pathway analysis identified six significant gene sets over all studies. The KEGG pathway mapping indicated that the significant gene sets are related to diabetes pathogenesis. The results also presented that 12.8% and 59.0% pairwise studies had significantly correlated expression association for genes and gene sets, respectively; moreover, 12.8% pairwise studies had independent expression association for genes, but no studies were observed significantly different for expression association of gene sets. Our analysis indicated that there are both tissue specific and non-specific genes and pathways associated with diabetes pathogenesis. Compared to the gene expression, pathway association tends to be tissue non-specific, and a common pathway influencing diabetes development is activated through different genes at different tissues. PMID:28117714

  20. Altered molecular expression of the TLR4/NF-κB signaling pathway in mammary tissue of Chinese Holstein cattle with mastitis.

    PubMed

    Wu, Jie; Li, Lian; Sun, Yu; Huang, Shuai; Tang, Juan; Yu, Pan; Wang, Genlin

    2015-01-01

    Toll-like receptor 4 (TLR4) mediated activation of the nuclear transcription factor κB (NF-κB) signaling pathway by mastitis initiates expression of genes associated with inflammation and the innate immune response. In this study, the profile of mastitis-induced differential gene expression in the mammary tissue of Chinese Holstein cattle was investigated by Gene-Chip microarray and bioinformatics. The microarray results revealed that 79 genes associated with the TLR4/NF-κB signaling pathway were differentially expressed. Of these genes, 19 were up-regulated and 29 were down-regulated in mastitis tissue compared to normal, healthy tissue. Statistical analysis of transcript and protein level expression changes indicated that 10 genes, namely TLR4, MyD88, IL-6, and IL-10, were up-regulated, while CD14, TNF-α, MD-2, IL-β, NF-κB, and IL-12 were significantly down-regulated in mastitis tissue in comparison with normal tissue. Analyses using bioinformatics database resources, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and the Gene Ontology Consortium (GO) for term enrichment analysis, suggested that these differentially expressed genes implicate different regulatory pathways for immune function in the mammary gland. In conclusion, our study provides new evidence for better understanding the differential expression and mechanisms of the TLR4/NF-κB signaling pathway in Chinese Holstein cattle with mastitis.
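
    Pathway and term enrichment analyses of the kind mentioned above are commonly based on an over-representation test. The sketch below assumes a one-sided hypergeometric test, which the abstract does not specify; the function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, selected, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    population   -- number of annotated background genes (N)
    pathway_size -- background genes belonging to the pathway (K)
    selected     -- differentially expressed genes tested (n)
    overlap      -- selected genes that fall in the pathway (k)
    """
    upper = min(pathway_size, selected)
    total = comb(population, selected)
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers for illustration only: 10 background genes, 4 in the pathway,
# 3 genes selected, 2 of which are pathway members.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
```

    A small p-value suggests the pathway holds more selected genes than expected by chance; with many pathways tested, the p-values would normally be corrected for multiple testing.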

  1. Biofuel Database

    National Institute of Standards and Technology Data Gateway

    Biofuel Database (Web, free access)   This database brings together structural, biological, and thermodynamic data for enzymes that are either in current use or are being considered for use in the production of biofuels.

  2. Database Administrator

    ERIC Educational Resources Information Center

    Moore, Pam

    2010-01-01

    The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…

  3. The Importance of Biological Databases in Biological Discovery.

    PubMed

    Baxevanis, Andreas D; Bateman, Alex

    2015-06-19

    Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, The Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource, are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG), are also discussed.

  4. Transcriptome Analysis and Discovery of Genes Involved in Immune Pathways from Hepatopancreas of Microbial Challenged Mitten Crab Eriocheir sinensis

    PubMed Central

    Li, Xihong; Cui, Zhaoxia; Liu, Yuan; Song, Chengwen; Shi, Guohui

    2013-01-01

    Background: The Chinese mitten crab Eriocheir sinensis is an economically important crustacean that has been seriously affected by various diseases, which calls for more information on immune-relevant genes at the genome level. High-throughput RNA sequencing (RNA-seq) technology now provides a powerful and efficient method for transcript analysis and immune gene discovery. Methods/Principal Findings: A cDNA library from the hepatopancreas of E. sinensis challenged by a mixture of three pathogen strains (Gram-positive bacterium Micrococcus luteus, Gram-negative bacterium Vibrio alginolyticus and fungus Pichia pastoris; 10⁸ cfu·mL⁻¹) was constructed and randomly sequenced using the Illumina technique. In total, 39.76 million clean reads were assembled into 70,300 unigenes. After ruling out short-length and low-quality sequences, 52,074 non-redundant unigenes were compared to public databases for homology searching and 17,617 of them showed high similarity to sequences in the NCBI non-redundant protein (Nr) database. For function classification and pathway assignment, 18,734 (36.00%) unigenes were categorized into three Gene Ontology (GO) categories, 12,243 (23.51%) were classified into 25 Clusters of Orthologous Groups (COG), and 8,983 (17.25%) were assigned to six Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Potentially, 24, 14, 47 and 132 unigenes were characterized as being involved in the Toll, IMD, JAK-STAT and MAPK pathways, respectively. Conclusions/Significance: This is the first systematic transcriptome analysis of components relating to innate immune pathways in E. sinensis. The functional genes and putative pathways identified here will contribute to a better understanding of the immune system and to the prevention of various diseases in crab. PMID:23874555

  5. Role and mechanism of the AMPK pathway in waterborne Zn exposure influencing the hepatic energy metabolism of Synechogobius hasta

    NASA Astrophysics Data System (ADS)

    Wu, Kun; Huang, Chao; Shi, Xi; Chen, Feng; Xu, Yi-Huan; Pan, Ya-Xiong; Luo, Zhi; Liu, Xu

    2016-12-01

    Previous studies have investigated the physiological responses in the liver of Synechogobius hasta exposed to waterborne zinc (Zn). However, at present, very little is known about the underlying molecular mechanisms of these responses. In this study, RNA sequencing (RNA-seq) was performed to analyse the differences in the hepatic transcriptomes between control and Zn-exposed S. hasta. A total of 36,339 unigenes and 1,615 bp of unigene N50 were detected. These genes were further annotated to the Nonredundant protein (NR), Nonredundant nucleotide (Nt), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), Clusters of Orthologous Groups (COG) and Gene Ontology (GO) databases. After 60 days of Zn exposure, 708 and 237 genes were significantly up- and down-regulated, respectively. Many differentially expressed genes (DEGs) involved in energy metabolic pathways were identified, and their expression profiles suggested increased catabolic processes and reduced biosynthetic processes. These changes indicated that waterborne Zn exposure increased the energy production and requirement, which was related to the activation of the AMPK signalling pathway. Furthermore, using the primary hepatocytes of S. hasta, we identified the role of the AMPK signalling pathway in Zn-influenced energy metabolism.

  6. Role and mechanism of the AMPK pathway in waterborne Zn exposure influencing the hepatic energy metabolism of Synechogobius hasta

    PubMed Central

    Wu, Kun; Huang, Chao; Shi, Xi; Chen, Feng; Xu, Yi-Huan; Pan, Ya-Xiong; Luo, Zhi; Liu, Xu

    2016-01-01

    Previous studies have investigated the physiological responses in the liver of Synechogobius hasta exposed to waterborne zinc (Zn). However, at present, very little is known about the underlying molecular mechanisms of these responses. In this study, RNA sequencing (RNA-seq) was performed to analyse the differences in the hepatic transcriptomes between control and Zn-exposed S. hasta. A total of 36,339 unigenes and 1,615 bp of unigene N50 were detected. These genes were further annotated to the Nonredundant protein (NR), Nonredundant nucleotide (Nt), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), Clusters of Orthologous Groups (COG) and Gene Ontology (GO) databases. After 60 days of Zn exposure, 708 and 237 genes were significantly up- and down-regulated, respectively. Many differentially expressed genes (DEGs) involved in energy metabolic pathways were identified, and their expression profiles suggested increased catabolic processes and reduced biosynthetic processes. These changes indicated that waterborne Zn exposure increased the energy production and requirement, which was related to the activation of the AMPK signalling pathway. Furthermore, using the primary hepatocytes of S. hasta, we identified the role of the AMPK signalling pathway in Zn-influenced energy metabolism. PMID:27934965

  7. LigandBox: A database for 3D structures of chemical compounds.

    PubMed

    Kawabata, Takeshi; Sugihara, Yusuke; Fukunishi, Yoshifumi; Nakamura, Haruki

    2013-01-01

    A database for the 3D structures of available compounds is essential for the virtual screening by molecular docking. We have developed the LigandBox database (http://ligandbox.protein.osaka-u.ac.jp/ligandbox/) containing four million available compounds, collected from the catalogues of 37 commercial suppliers, and approved drugs and biochemical compounds taken from KEGG_DRUG, KEGG_COMPOUND and PDB databases. Each chemical compound in the database has several 3D conformers with hydrogen atoms and atomic charges, which are ready to be docked into receptors using docking programs. The 3D conformations were generated using our molecular simulation program package, myPresto. Various physical properties, such as aqueous solubility (LogS) and carcinogenicity have also been calculated to characterize the ADME-Tox properties of the compounds. The Web database provides two services for compound searches: a property/chemical ID search and a chemical structure search. The chemical structure search is performed by a descriptor search and a maximum common substructure (MCS) search combination, using our program kcombu. By specifying a query chemical structure, users can find similar compounds among the millions of compounds in the database within a few minutes. Our database is expected to assist a wide range of researchers, in the fields of medical science, chemical biology, and biochemistry, who are seeking to discover active chemical compounds by the virtual screening.

  8. An Approach for Identification of Novel Drug Targets in Streptococcus pyogenes SF370 Through Pathway Analysis.

    PubMed

    Singh, Satendra; Singh, Dev Bukhsh; Singh, Anamika; Gautam, Budhayash; Ram, Gurudayal; Dwivedi, Seema; Ramteke, Pramod W

    2016-12-01

    Streptococcus pyogenes is one of the most important pathogens, as it is involved in various infections affecting the upper respiratory tract and skin. Due to the emergence of multidrug resistance and cross-resistance, S. pyogenes is becoming more pathogenic and dangerous. In the present study, an in silico comparative analysis of a total of 65 metabolic pathways of the host (Homo sapiens) and the pathogen was performed. Initially, 486 paralogous enzymes were identified so that they could be removed from the list of possible drug targets. The 105 enzymes of the biochemical pathways of S. pyogenes from the KEGG metabolic pathway database were compared with the proteins from Homo sapiens by performing a BLASTP search against the non-redundant database restricted to the Homo sapiens subset. Out of these, 83 enzymes were identified as non-homologous to human proteins, and 30 enzymes of inadequate amino acid length were removed from further processing. Essential enzymes were then mined from the remaining 53 enzymes. Finally, 28 essential enzymes were identified in S. pyogenes SF370 (serotype M1). In the subcellular localization study, 18 enzymes were predicted to have cytoplasmic localization and ten enzymes membrane localization. These ten enzymes with putative membrane localization should be of particular interest. Acyl-carrier-protein S-malonyltransferase, DNA polymerase III subunit beta and dihydropteroate synthase are novel drug targets and thus can be used to design potential inhibitors against S. pyogenes infection. The 3D structure of dihydropteroate synthase was modeled and validated, and can be used for virtual screening and interaction studies of potential inhibitors with the target enzyme.
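
    The filtering funnel in the abstract above (discard enzymes with human homologs, discard short sequences, keep essential enzymes) can be sketched as a chain of filters. The E-value cutoff, the minimum length and all enzyme records below are assumptions for illustration; the abstract does not state the thresholds the authors used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Enzyme:
    name: str
    length: int                         # sequence length in amino acids
    best_human_evalue: Optional[float]  # None means no BLASTP hit against human
    essential: bool                     # flagged essential by some external resource


def shortlist_targets(enzymes: List[Enzyme],
                      evalue_cutoff: float = 1e-4,  # assumed threshold
                      min_length: int = 100) -> List[str]:  # assumed threshold
    """Apply the three filters in order and return the surviving enzyme names."""
    non_homologous = [
        e for e in enzymes
        if e.best_human_evalue is None or e.best_human_evalue > evalue_cutoff
    ]
    long_enough = [e for e in non_homologous if e.length >= min_length]
    return [e.name for e in long_enough if e.essential]


# Hypothetical records, not data from the study.
toy = [
    Enzyme("enzA", 320, None, True),    # kept
    Enzyme("enzB", 280, 1e-50, True),   # strong human hit: removed
    Enzyme("enzC", 60, None, True),     # too short: removed
    Enzyme("enzD", 410, 0.5, False),    # not essential: removed
    Enzyme("enzE", 150, 2.0, True),     # kept
]
candidates = shortlist_targets(toy)
```

    Keeping each filter as a separate step makes it easy to report the count remaining after every stage, as the abstract does (83, then 53, then 28).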

  9. Image Databases.

    ERIC Educational Resources Information Center

    Pettersson, Rune

    Different kinds of pictorial databases are described with respect to aims, user groups, search possibilities, storage, and distribution. Some specific examples are given for databases used for the following purposes: (1) labor markets for artists; (2) document management; (3) telling a story; (4) preservation (archives and museums); (5) research;…

  10. Maize databases

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This chapter is a succinct overview of maize data held in the species-specific database MaizeGDB (the Maize Genomics and Genetics Database), and selected multi-species data repositories, such as Gramene/Ensembl Plants, Phytozome, UniProt and the National Center for Biotechnology Information (NCBI), ...

  11. Integrative Pathway Analysis of Metabolic Signature in Bladder Cancer: A Linkage to The Cancer Genome Atlas Project and Prediction of Survival

    PubMed Central

    von Rundstedt, Friedrich-Carl; Rajapakshe, Kimal; Ma, Jing; Arnold, James M.; Gohlke, Jie; Putluri, Vasanta; Krishnapuram, Rashmi; Piyarathna, D. Badrajee; Lotan, Yair; Gödde, Daniel; Roth, Stephan; Störkel, Stephan; Levitt, Jonathan M.; Michailidis, George; Sreekumar, Arun; Lerner, Seth P.; Coarfa, Cristian; Putluri, Nagireddy

    2016-01-01

    Purpose: We used targeted mass spectrometry to study the metabolic fingerprint of urothelial cancer and determine whether the biochemical pathway analysis gene signature would have a predictive value in independent cohorts of patients with bladder cancer. Materials and Methods: Pathologically evaluated, bladder derived tissues, including benign adjacent tissue from 14 patients and bladder cancer from 46, were analyzed by liquid chromatography based targeted mass spectrometry. Differential metabolites associated with tumor samples in comparison to benign tissue were identified by adjusting the p values for multiple testing at a false discovery rate threshold of 15%. Enrichment of pathways and processes associated with the metabolic signature were determined using the GO (Gene Ontology) Database and MSigDB (Molecular Signature Database). Integration of metabolite alterations with transcriptome data from TCGA (The Cancer Genome Atlas) was done to identify the molecular signature of 30 metabolic genes. Available outcome data from TCGA portal were used to determine the association with survival. Results: We identified 145 metabolites, of which analysis revealed 31 differential metabolites when comparing benign and tumor tissue samples. Using the KEGG (Kyoto Encyclopedia of Genes and Genomes) Database we identified a total of 174 genes that correlated with the altered metabolic pathways involved. By integrating these genes with the transcriptomic data from the corresponding TCGA data set we identified a metabolic signature consisting of 30 genes. The signature was significant in its prediction of survival in 95 patients with a low signature score vs 282 with a high signature score (p = 0.0458). Conclusions: Targeted mass spectrometry of bladder cancer is highly sensitive for detecting metabolic alterations. Applying transcriptome data allows for integration into larger data sets and identification of relevant metabolic pathways in bladder cancer progression. PMID:26802582
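
    The abstract above adjusts p values for multiple testing at a false discovery rate threshold of 15% without naming the procedure. The sketch below assumes the Benjamini-Hochberg step-up method, a common choice; the function name and the toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.15):
    """Return (adjusted p-values, significance flags) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    significant = [q <= fdr for q in adjusted]
    return adjusted, significant


# Toy p-values for four hypothetical metabolites.
adjusted, flags = benjamini_hochberg([0.01, 0.04, 0.03, 0.20], fdr=0.15)
```

    In this toy case the first three metabolites pass the 15% threshold and the fourth does not.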

  12. CSGene: a literature-based database for cell senescence genes and its application to identify critical cell aging pathways and associated diseases

    PubMed Central

    Zhao, M; Chen, L; Qu, H

    2016-01-01

    Cell senescence is a cellular process in which normal diploid cells cease to replicate and is a major driving force for human cancers and aging-associated diseases. Recent studies on cell senescence have identified many new genetic components and pathways that control cell aging. However, there is no comprehensive resource for cell senescence that integrates various genetic studies and relationships with cell senescence, and the risk associated with complex diseases such as cancer is still unexplored. We have developed the first literature-based gene resource for exploring cell senescence genes, CSGene. We compiled 504 experimentally verified genes from public data resources and published literature. Pathway analyses highlighted the prominent roles of cell senescence genes in the control of rRNA gene transcription and unusual rDNA repeats that constitute a center for the stability of the whole genome. We also found a strong association of cell senescence with HIV-1 infection and viral carcinogenesis that are mainly related to promoter/enhancer binding and chromatin modification processes. Moreover, pan-cancer mutation and network analysis also identified common cell aging mechanisms in cancers and uncovered a highly modular network structure. These results highlight the utility of CSGene for elucidating the complex cellular events of cell senescence. PMID:26775705

  13. Genome databases

    SciTech Connect

    Courteau, J.

    1991-10-11

    Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development - including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.

  14. Neuroplasticity and second messenger pathways in antidepressant efficacy: pharmacogenetic results from a prospective trial investigating treatment resistance.

    PubMed

    Fabbri, Chiara; Crisafulli, Concetta; Calati, Raffaella; Albani, Diego; Forloni, Gianluigi; Calabrò, Marco; Martines, Rosalba; Kasper, Siegfried; Zohar, Joseph; Juven-Wetzler, Alzbeta; Souery, Daniel; Montgomery, Stuart; Mendlewicz, Julien; Serretti, Alessandro

    2017-03-04

    Genes belonging to neuroplasticity, monoamine, circadian rhythm, and transcription factor pathways were investigated as modulators of antidepressant efficacy. The present study aimed (1) to replicate previous findings in an independent sample with treatment-resistant depression (TRD), and (2) to perform a pathway analysis to investigate the possible molecular mechanisms involved. 220 patients with major depressive disorder who were non-responders to a previous antidepressant were treated with venlafaxine for 4-6 weeks and in case of non-response with escitalopram for 4-6 weeks. Symptoms were assessed using the Montgomery Asberg Depression Rating Scale. The phenotypes were response and remission to venlafaxine, non-response (TRDA) and non-remission (TRDB) to neither venlafaxine nor escitalopram. 50 tag SNPs in 14 genes belonging to the pathways of interest were tested for association with phenotypes. Molecular pathways (KEGG database) that included one or more of the genes associated with the phenotypes were investigated also in the STAR*D sample. The associations between ZNF804A rs7603001 and response, CREB1 rs2254137 and remission were replicated, as well as CHL1 rs2133402 and lower risk of TRD. Other CHL1 SNPs were potential predictors of TRD (rs1516340, rs2272522, rs1516338, rs2133402). The MAPK1 rs6928 SNP was consistently associated with all the phenotypes. The protein processing in endoplasmic reticulum pathway (hsa04141) was the best pathway that may explain the mechanisms of MAPK1 involvement in antidepressant response. Signals in genes previously associated with antidepressant efficacy were confirmed for CREB1, ZNF804A and CHL1. These genes play pivotal roles in synaptic plasticity, neural activity and connectivity.

  15. The NCBI BioSystems database.

    PubMed

    Geer, Lewis Y; Marchler-Bauer, Aron; Geer, Renata C; Han, Lianyi; He, Jane; He, Siqian; Liu, Chunlei; Shi, Wenyao; Bryant, Stephen H

    2010-01-01

    The NCBI BioSystems database, found at http://www.ncbi.nlm.nih.gov/biosystems/, centralizes and cross-links existing biological systems databases, increasing their utility and target audience by integrating their pathways and systems into NCBI resources. This integration allows users of NCBI's Entrez databases to quickly categorize proteins, genes and small molecules by metabolic pathway, disease state or other BioSystem type, without requiring time-consuming inference of biological relationships from the literature or multiple experimental datasets.

  16. hp-DPI: Helicobacter pylori database of protein interactomes--embracing experimental and inferred interactions.

    PubMed

    Lin, Chung-Yen; Chen, Chia-Ling; Cho, Chi-Shiang; Wang, Li-Ming; Chang, Chia-Ming; Chen, Pao-Yang; Lo, Chen-Zen; Hsiung, Chao A

    2005-04-01

    We implemented a statistical model in our protein interaction database for validation of two-hybrid assays of Helicobacter pylori, and for prediction of putative protein interactions not yet discovered experimentally. To present the enormous number of experimental and inferred protein interaction network maps, the H. pylori Database of Protein Interactomes (hp-DPI) was developed with a succinct yet comprehensive visualization tool integrated with annotation from GenBank, GO, and KEGG. hp-DPI was first built with, but is not limited to, H. pylori protein interactions and is expected to include other organisms' protein interaction relationships in the future.

  17. Experiment Databases

    NASA Astrophysics Data System (ADS)

    Vanschoren, Joaquin; Blockeel, Hendrik

    Next to running machine learning algorithms based on inductive queries, much can be learned by immediately querying the combined results of many prior studies. Indeed, all around the globe, thousands of machine learning experiments are being executed on a daily basis, generating a constant stream of empirical information on machine learning techniques. While the information contained in these experiments might have many uses beyond their original intent, results are typically described very concisely in papers and discarded afterwards. If we properly store and organize these results in central databases, they can be immediately reused for further analysis, thus boosting future research. In this chapter, we propose the use of experiment databases: databases designed to collect all the necessary details of these experiments, and to intelligently organize them in online repositories to enable fast and thorough analysis of a myriad of collected results. They constitute an additional, queriable source of empirical meta-data based on principled descriptions of algorithm executions, without reimplementing the algorithms in an inductive database. As such, they engender a very dynamic, collaborative approach to experimentation, in which experiments can be freely shared, linked together, and immediately reused by researchers all over the world. They can be set up for personal use, to share results within a lab or to create open, community-wide repositories. Here, we provide a high-level overview of their design, and use an existing experiment database to answer various interesting research questions about machine learning algorithms and to verify a number of recent studies.

  18. Solubility Database

    National Institute of Standards and Technology Data Gateway

    SRD 106 IUPAC-NIST Solubility Database (Web, free access)   These solubilities are compiled from 18 volumes of the International Union of Pure and Applied Chemistry (IUPAC)-NIST Solubility Data Series. The database includes liquid-liquid, solid-liquid, and gas-liquid systems. Typical solvents and solutes include water, seawater, heavy water, inorganic compounds, and a variety of organic compounds such as hydrocarbons, halogenated hydrocarbons, alcohols, acids, esters and nitrogen compounds. There are over 67,500 solubility measurements and over 1800 references.

  19. Quantitative Proteogenomics and the Reconstruction of the Metabolic Pathway in Lactobacillus mucosae LM1

    PubMed Central

    Lee, Ji-Yoon

    2015-01-01

    Lactobacillus mucosae is a natural resident of the gastrointestinal tract of humans and animals and a potential probiotic bacterium. To understand the global protein expression profile and metabolic features of L. mucosae LM1 in the early stationary phase, the Q Exactive™ Hybrid Quadrupole-Orbitrap Mass Spectrometer was used. Characterization of the intracellular proteome identified 842 proteins, accounting for approximately 35% of the 2,404 protein-coding sequences in the complete genome of L. mucosae LM1. Proteome quantification using Q Exactive™ Orbitrap MS detected 19 highly abundant proteins (> 1.0% of the intracellular proteome), including CysK (cysteine synthase, 5.41%) and EF-Tu (elongation factor Tu, 4.91%), which are involved in cell survival against environmental stresses. Metabolic pathway annotation of the LM1 proteome using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database showed that half of the proteins expressed are important for basic metabolic and biosynthetic processes, and the other half might be structurally important or involved in basic cellular processes. In addition, glycogen biosynthesis was activated in the early stationary phase, which is important for energy storage and maintenance. The proteogenomic data presented in this study provide a suitable reference to understand the protein expression pattern of lactobacilli in standard conditions. PMID:26761899

  20. Next Generation Sequencing and Transcriptome Analysis Predicts Biosynthetic Pathway of Sennosides from Senna (Cassia angustifolia Vahl.), a Non-Model Plant with Potent Laxative Properties.

    PubMed

    Rama Reddy, Nagaraja Reddy; Mehta, Rucha Harishbhai; Soni, Palak Harendrabhai; Makasana, Jayanti; Gajbhiye, Narendra Athamaram; Ponnuchamy, Manivel; Kumar, Jitendra

    2015-01-01

    Senna (Cassia angustifolia Vahl.) is a world's natural laxative medicinal plant. Laxative properties are due to sennosides (anthraquinone glycosides) natural products. However, little genetic information is available for this species, especially concerning the biosynthetic pathways of sennosides. We present here the transcriptome sequencing of young and mature leaf tissue of Cassia angustifolia using Illumina MiSeq platform that resulted in a total of 6.34 Gb of raw nucleotide sequence. The sequence assembly resulted in 42230 and 37174 transcripts with an average length of 1119 bp and 1467 bp for young and mature leaf, respectively. The transcripts were annotated using NCBI BLAST with 'green plant database (txid 33090)', Swiss Prot, Kyoto Encyclopedia of Genes & Genomes (KEGG), Cluster of Orthologous Gene (COG) and Gene Ontology (GO). Out of the total transcripts, 40138 (95.0%) and 36349 (97.7%) from young and mature leaf, respectively, were annotated by BLASTX against green plant database of NCBI. We used InterProScan to see protein similarity at domain level; a total of 34031 (young leaf) and 32077 (mature leaf) transcripts were annotated against the Pfam domains. All transcripts from young and mature leaf were assigned to 191 KEGG pathways. There were 166 and 159 CDS, respectively, from young and mature leaf involved in metabolism of terpenoids and polyketides. Many CDS encoding enzymes leading to biosynthesis of sennosides were identified. A total of 10,763 CDS were differentially expressed in both young and mature leaf libraries, of which 2,343 (21.7%) CDS were up-regulated in young compared to mature leaf. Several differentially expressed genes were found to be functionally associated with sennoside biosynthesis. CDS encoding for many CYPs and TF families were identified having probable roles in metabolism of primary as well as secondary metabolites. We developed SSR markers for molecular breeding of senna. We have identified a set of putative genes involved in various

  1. Next Generation Sequencing and Transcriptome Analysis Predicts Biosynthetic Pathway of Sennosides from Senna (Cassia angustifolia Vahl.), a Non-Model Plant with Potent Laxative Properties

    PubMed Central

    Rama Reddy, Nagaraja Reddy; Mehta, Rucha Harishbhai; Soni, Palak Harendrabhai; Makasana, Jayanti; Gajbhiye, Narendra Athamaram; Ponnuchamy, Manivel; Kumar, Jitendra

    2015-01-01

    Senna (Cassia angustifolia Vahl.) is a world’s natural laxative medicinal plant. Laxative properties are due to sennosides (anthraquinone glycosides) natural products. However, little genetic information is available for this species, especially concerning the biosynthetic pathways of sennosides. We present here the transcriptome sequencing of young and mature leaf tissue of Cassia angustifolia using Illumina MiSeq platform that resulted in a total of 6.34 Gb of raw nucleotide sequence. The sequence assembly resulted in 42230 and 37174 transcripts with an average length of 1119 bp and 1467 bp for young and mature leaf, respectively. The transcripts were annotated using NCBI BLAST with ‘green plant database (txid 33090)’, Swiss Prot, Kyoto Encyclopedia of Genes & Genomes (KEGG), Cluster of Orthologous Gene (COG) and Gene Ontology (GO). Out of the total transcripts, 40138 (95.0%) and 36349 (97.7%) from young and mature leaf, respectively, were annotated by BLASTX against green plant database of NCBI. We used InterProScan to see protein similarity at domain level; a total of 34031 (young leaf) and 32077 (mature leaf) transcripts were annotated against the Pfam domains. All transcripts from young and mature leaf were assigned to 191 KEGG pathways. There were 166 and 159 CDS, respectively, from young and mature leaf involved in metabolism of terpenoids and polyketides. Many CDS encoding enzymes leading to biosynthesis of sennosides were identified. A total of 10,763 CDS were differentially expressed in both young and mature leaf libraries, of which 2,343 (21.7%) CDS were up-regulated in young compared to mature leaf. Several differentially expressed genes were found to be functionally associated with sennoside biosynthesis. CDS encoding for many CYPs and TF families were identified having probable roles in metabolism of primary as well as secondary metabolites. We developed SSR markers for molecular breeding of senna. We have identified a set of putative genes involved in various

  2. De Novo Transcriptome Analysis of Wing Development-Related Signaling Pathways in Locusta migratoria Manilensis and Ostrinia furnacalis (Guenée)

    PubMed Central

    Chu, Yuan; Zhang, Long; Shen, Jie; An, Chunju

    2014-01-01

    Background: Orthopteran migratory locust, Locusta migratoria, and lepidopteran Asian corn borer, Ostrinia furnacalis, are two types of insects undergoing incomplete and complete metamorphosis, respectively. Identification of candidate genes regulating wing development in these two insects would provide insights into the further study about the molecular mechanisms controlling metamorphosis development. We have sequenced the transcriptome of O. furnacalis larvae previously. Here we sequenced and characterized the transcriptome of L. migratoria wing discs with special emphasis on wing development-related signaling pathways. Methodology/Principal Findings: Illumina Hiseq2000 was used to sequence 8.38 Gb of the transcriptome from dissected nymphal wing discs. De novo assembly generated 91,907 unigenes with mean length of 610 nt. All unigenes were searched against five databases including Nt, Nr, Swiss-Prot, COG, and KEGG for annotations using blastn or blastx algorithm with a cut-off E-value of 10^-5. A total of 23,359 (25.4%) unigenes have homologs within at least one database. Based on sequence similarity to homologs known to regulate Drosophila melanogaster wing development, we identified 50 and 46 potential wing development-related unigenes from L. migratoria and O. furnacalis transcriptome, respectively. The identified unigenes encode putative orthologs for nearly all components of the Hedgehog (Hh), Decapentaplegic (Dpp), Notch (N), and Wingless (Wg) signaling pathways, which are essential for growth and pattern formation during wing development. We investigated the expression profiles of the component genes involved in these signaling pathways in forewings and hind wings of L. migratoria and O. furnacalis. The results revealed the tested genes had different expression patterns in the two insects. Conclusions/Significance: This study provides the comprehensive sequence resource of the wing development-related signaling pathways of L. migratoria. The obtained data
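
    Illustrative note: the E-value cut-off step described in this abstract can be sketched as a simple filter over search hits. This is a minimal sketch, not the authors' pipeline. It assumes BLAST tabular output with 12 tab-separated columns and the E-value in column 11 (the abstract does not state the output format), and it treats the cut-off as inclusive. The function name and the example hits are hypothetical.

```python
def annotated_queries(tabular_lines, max_evalue=1e-5):
    """Return the query IDs that have at least one hit with E-value <= max_evalue.

    Assumes 12-column BLAST tabular lines: query ID in column 1,
    E-value in column 11. Malformed or comment lines are skipped.
    """
    kept = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue
        if float(fields[10]) <= max_evalue:
            kept.add(fields[0])
    return kept


# Made-up hits, for illustration only (not data from the study).
example_hits = [
    "unigene1\tsp|P00001\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300",
    "unigene2\tsp|P00002\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t30",
    "unigene3\tsp|P00003\t60.0\t120\t48\t0\t1\t120\t1\t120\t1e-5\t90",
]

if __name__ == "__main__":
    print(sorted(annotated_queries(example_hits)))
```

    Counting the surviving query IDs and dividing by the number of assembled unigenes gives an annotation rate of the kind reported above (23,359 of 91,907 unigenes).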

  3. Changes in the Proteome of Langat-Infected Ixodes scapularis ISE6 Cells: Metabolic Pathways Associated with Flavivirus Infection

    PubMed Central

    Grabowski, Jeffrey M.; Perera, Rushika; Roumani, Ali M.; Hedrick, Victoria E.; Inerowicz, Halina D.; Hill, Catherine A.; Kuhn, Richard J.

    2016-01-01

    Background: Ticks (Family Ixodidae) transmit a variety of disease-causing agents to humans and animals. The tick-borne flaviviruses (TBFs; family Flaviviridae) are a complex of viruses, many of which cause encephalitis and hemorrhagic fever, and represent global threats to human health and biosecurity. Pathogenesis has been well studied in human and animal disease models. Equivalent analyses of tick-flavivirus interactions are limited and represent an area of study that could reveal novel approaches for TBF control. Methodology/Principal Findings: High resolution LC-MS/MS was used to analyze the proteome of Ixodes scapularis (Lyme disease tick) embryonic ISE6 cells following infection with Langat virus (LGTV) and identify proteins associated with viral infection and replication. Maximal LGTV infection of cells and peak release of infectious virus were observed at 36 hours post infection (hpi). Proteins were extracted from ISE6 cells treated with LGTV and non-infectious (UV inactivated) LGTV at 36 hpi and analyzed by mass spectrometry. The Omics Discovery Pipeline (ODP) identified thousands of MS peaks. Protein homology searches against the I. scapularis IscaW1 genome assembly identified a total of 486 proteins that were subsequently assigned to putative functional pathways using searches against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. 266 proteins were differentially expressed following LGTV infection relative to non-infected (mock) cells. Of these, 68 proteins exhibited increased expression and 198 proteins had decreased expression. The majority of the former were classified in the KEGG pathways: “translation”, “amino acid metabolism”, and “protein folding/sorting/degradation”. Finally, Trichostatin A and Oligomycin A increased and decreased LGTV replication in vitro in ISE6 cells, respectively. Conclusions/Significance: Proteomic analyses revealed ISE6 proteins that were differentially expressed at the peak of LGTV

  4. Reconstruction of biological pathways and metabolic networks from in silico labeled metabolites.

    PubMed

    Hadadi, Noushin; Hafner, Jasmin; Soh, Keng Cher; Hatzimanikatis, Vassily

    2017-01-01

    Reaction atom mappings track the positional changes of all of the atoms between the substrates and the products as they undergo the biochemical transformation. However, information on atom transitions in the context of metabolic pathways is not widely available in the literature. The understanding of metabolic pathways at the atomic level is of great importance as it can deconvolute the overlapping catabolic/anabolic pathways resulting in the observed metabolic phenotype. The automated identification of atom transitions within a metabolic network is a very challenging task since the degree of complexity of metabolic networks dramatically increases when we transit from metabolite-level studies to atom-level studies. Despite being studied extensively in various approaches, the field of atom mapping of metabolic networks is lacking an automated approach, which (i) accounts for the information of reaction mechanism for atom mapping and (ii) is extendable from individual atom-mapped reactions to atom-mapped reaction networks. Hereby, we introduce a computational framework, iAM.NICE (in silico Atom Mapped Network Integrated Computational Explorer), for the systematic atom-level reconstruction of metabolic networks from in silico labelled substrates. iAM.NICE is to our knowledge the first automated atom-mapping algorithm that is based on the underlying enzymatic biotransformation mechanisms, and its application goes beyond individual reactions and it can be used for the reconstruction of atom-mapped metabolic networks. We illustrate the applicability of our method through the reconstruction of atom-mapped reactions of the KEGG database and we provide an example of an atom-level representation of the core metabolic network of E. coli.

  5. Pathway deviation-based biomarker and multi-effect target identification in asbestos-related squamous cell carcinoma of the lung.

    PubMed

    Du, Jiang; Zhang, Lin

    2017-03-01

    Asbestos-related lung carcinoma is one of the most devastating occupational cancers, and effective techniques for early diagnosis are still lacking. In the present study, a systematic approach was applied to detect a potential biomarker for asbestos-related lung cancer (ARLC); in particular asbestos-related squamous cell carcinoma (ARLC-SCC). Microarray data (GSE23822) were retrieved from the Gene Expression Omnibus database, including 26 ARLC-SCCs and 30 non-asbestos-related squamous cell lung carcinomas (NARLC-SCCs). Differentially expressed genes (DEGs) were identified by the limma package, and then a protein-protein interaction (PPI) network was constructed according to the BioGRID and HPRD databases. A novel scoring approach integrating an expression deviation score and network degree of the gene was then proposed to weight the DEGs. Subsequently, the important genes were uploaded to DAVID for pathway enrichment analysis. Pathway correlation analysis was carried out using Spearman's rank correlation coefficient of the pathscore. In total, 1,333 DEGs, 391 upregulated and 942 downregulated, were obtained between the ARLC-SCCs and NARLC-SCCs. A total of 524 important genes for ARLC-SCC were significantly enriched in 22 KEGG pathways. Correlation analysis of these pathways showed that the pathway of SNARE interactions in vesicular transport was significantly correlated with 12 other pathways. Additionally, obvious correlations were found between multiple pathways by sharing cross-talk genes (EGFR, PRKX, PDGFB, PIK3R3, SLK, IGF1, CDC42 and PRKCA). On the whole, our data demonstrate that 8 cross-talk genes were found to bridge multiple ARLC-SCC-specific pathways, which may be used as candidate biomarkers and potential multi-effect targets. As these genes are involved in multiple pathways, it is possible that drugs targeting these genes may thus be able to influence multiple pathways simultaneously.
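
    Illustrative note: the pathway correlation step in this abstract relies on Spearman's rank correlation coefficient. The sketch below computes that coefficient with the Python standard library only, as the Pearson correlation of average ranks. It is not the authors' code; the abstract does not define how the pathscore is calculated, so the inputs here are simply two hypothetical lists of per-sample scores.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical scores for two pathways across four samples.
    print(spearman([1.0, 2.0, 3.0, 4.0], [1.0, 4.0, 9.0, 16.0]))
```

    Because the coefficient depends only on ranks, any monotonic relationship between two score vectors yields a value of 1 or -1, which is why it suits scores whose scale is arbitrary.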

  6. The Comparative Toxicogenomics Database facilitates identification and understanding of chemical-gene-disease associations: arsenic as a case study

    PubMed Central

    Davis, Allan P; Murphy, Cynthia G; Rosenstein, Michael C; Wiegers, Thomas C; Mattingly, Carolyn J

    2008-01-01

    Background: The etiology of many chronic diseases involves interactions between environmental factors and genes that modulate physiological processes. Understanding interactions between environmental chemicals and genes/proteins may provide insights into the mechanisms of chemical actions, disease susceptibility, toxicity, and therapeutic drug interactions. The Comparative Toxicogenomics Database (CTD; ) provides these insights by curating and integrating data describing relationships between chemicals, genes/proteins, and human diseases. To illustrate the scope and application of CTD, we present an analysis of curated data for the chemical arsenic. Arsenic represents a major global environmental health threat and is associated with many diseases. The mechanisms by which arsenic modulates these diseases are not well understood. Methods: Curated interactions between arsenic compounds and genes were downloaded using export and batch query tools at CTD. The list of genes was analyzed for molecular interactions, Gene Ontology (GO) terms, KEGG pathway annotations, and inferred disease relationships. Results: CTD contains curated data from the published literature describing 2,738 molecular interactions between 21 different arsenic compounds and 1,456 genes and proteins. Analysis of these genes and proteins provides insight into the biological functions and molecular networks that are affected by exposure to arsenic, including stress response, apoptosis, cell cycle, and specific protein signaling pathways. Integrating arsenic-gene data with gene-disease data yields a list of diseases that may be associated with arsenic exposure and genes that may explain this association. Conclusion: CTD data integration and curation strategies yield insight into the actions of environmental chemicals and provide a basis for developing hypotheses about the molecular mechanisms underlying the etiology of environmental diseases. While many reports describe the molecular response to arsenic, CTD

  7. MINEs: Open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics

    SciTech Connect

    Jeffryes, James G.; Colastani, Ricardo L.; Elbadawi-Sidhu, Mona; Kind, Tobias; Niehaus, Thomas D.; Broadbelt, Linda J.; Hanson, Andrew D.; Fiehn, Oliver; Tyo, Keith E. J.; Henry, Christopher S.

    2015-08-28

    Metabolomics have proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography–mass spectrometry (LC–MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC–MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. MINEs improve metabolomics peak identification as compared to general chemical databases whose results

  8. MINEs: Open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics

    DOE PAGES

    Jeffryes, James G.; Colastani, Ricardo L.; Elbadawi-Sidhu, Mona; ...

    2015-08-28

    Metabolomics have proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography–mass spectrometry (LC–MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC–MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. MINEs improve metabolomics peak identification as compared to general chemical databases whose

  9. Transcriptome Analysis of Secondary Metabolism Pathway, Transcription Factors, and Transporters in Response to Methyl Jasmonate in Lycoris aurea

    PubMed Central

    Wang, Rong; Xu, Sheng; Wang, Ning; Xia, Bing; Jiang, Yumei; Wang, Ren

    2017-01-01

    Lycoris aurea, a medicinal species of the Amaryllidaceae family, is used in the practice of traditional Chinese medicine (TCM) because of its broad pharmacological activities of Amaryllidaceae alkaloids. Despite the officinal and economic importance of Lycoris species, the secondary mechanism for this species is relatively deficient. In this study, we attempted to characterize the transcriptome profiling of L. aurea seedlings with the methyl jasmonate (MeJA) treatment to uncover the molecular mechanisms regulating plant secondary metabolite pathways. By using short reads sequencing technology (Illumina), two sequencing cDNA libraries prepared from control (Con) and 100 μM MeJA-treated (MJ100) samples were sequenced. A total of 26,809,842 and 25,874,478 clean reads in the Con and MJ100 libraries, respectively, were obtained and assembled into 59,643 unigenes. Among them, 41,585 (69.72%) unigenes were annotated by basic local alignment search tool similarity searches against public sequence databases. These included 55 Gene Ontology (GO) terms, 128 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and 25 Clusters of Orthologous Groups (COG) families. Additionally, 4,175 differentially expressed genes (DEGs; false discovery rate ≤ 0.001 and |log2 Ratio| ≥ 1), with 2,291 up-regulated and 1,884 down-regulated, were found to be affected significantly under MeJA treatment. Subsequently, the DEGs encoding key enzymes involved in the secondary metabolite biosynthetic pathways, transcription factors, and transporter proteins were also analyzed and summarized. Meanwhile, we confirmed the altered expression levels of the unigenes that encode transporters and transcription factors using quantitative real-time PCR (qRT-PCR). With this transcriptome sequencing, future genetic and genomics studies related to the molecular mechanisms associated with the chemical composition of L. aurea may be improved. Additionally, the genes involved in the enrichment of secondary
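
    Illustrative note: the DEG criterion quoted in this abstract (false discovery rate ≤ 0.001 and |log2 Ratio| ≥ 1) can be written as a small classification rule. The sketch below is a minimal illustration under those two stated thresholds, not the authors' code. The function name, the label strings, and the example values are hypothetical.

```python
def classify_deg(log2_ratio, fdr, max_fdr=0.001, min_abs_log2=1.0):
    """Label a gene as 'up', 'down', or 'not_significant'.

    A gene counts as differentially expressed only if BOTH conditions hold:
    fdr <= max_fdr and |log2_ratio| >= min_abs_log2.
    """
    if fdr > max_fdr or abs(log2_ratio) < min_abs_log2:
        return "not_significant"
    return "up" if log2_ratio > 0 else "down"


if __name__ == "__main__":
    # Hypothetical (log2 ratio, FDR) pairs, for illustration only.
    examples = [(2.4, 1e-6), (-1.0, 0.001), (0.8, 1e-9), (3.1, 0.02)]
    for log2_ratio, fdr in examples:
        print(log2_ratio, fdr, classify_deg(log2_ratio, fdr))
```

    A log2 ratio of 1 corresponds to a two-fold change, so the rule keeps genes whose expression at least doubles or halves between the two libraries.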

  10. Transcriptomic analysis of the head kidney of Topmouth culter (Culter alburnus) infected with Flavobacterium columnare with an emphasis on phagosome pathway.

    PubMed

    Zhao, Lijuan; Tu, Jiagang; Zhang, Yulei; Wang, Jinfu; Yang, Ling; Wang, Weimin; Wu, Zaohe; Meng, Qinglei; Lin, Li

    2016-10-01

    Flavobacterium columnare (FC) causes columnaris disease in fish worldwide, with high mortality and great economic losses in cultured fish, including Topmouth culter (Culter alburnus). However, little is known about the host factors involved in FC infection. In this study, the transcriptomic profiles of the head kidney from Topmouth culter with or without FC infection were obtained using HiSeq™ 2500 (Illumina). In total, 79,641 high-quality unigenes were obtained. Among them, 4037 differentially expressed genes, including 1217 up-regulated and 2820 down-regulated genes, were identified and enriched using the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. The differentially expressed genes were mainly associated with pathways such as immune response, carbohydrate metabolism, amino acid metabolism, and lipid metabolism. Since phagocytosis is a central mechanism of the innate immune response by which host cells defend against infectious agents, genes related to the phagosome pathway were scrutinized and 9 differentially expressed phagosome-related genes were identified, including 3 up-regulated and 6 down-regulated genes. Five of them were further validated by quantitative real-time polymerase chain reaction (qRT-PCR). This transcriptomic analysis of host genes in response to FC infection provides data towards understanding the infection mechanisms and will shed new light on the prevention of columnaris.

  11. Orchidstra: an integrated orchid functional genomics database.

    PubMed

    Su, Chun-lin; Chao, Ya-Ting; Yen, Shao-Hua; Chen, Chun-Yi; Chen, Wan-Chieh; Chang, Yao-Chien Alex; Shih, Ming-Che

    2013-02-01

    A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species.

  12. Atomic Databases

    NASA Astrophysics Data System (ADS)

    Mendoza, Claudio

    2000-10-01

    Atomic and molecular data are required in a variety of fields ranging from the traditional astronomy, atmospherics and fusion research to fast growing technologies such as lasers, lighting, low-temperature plasmas, plasma assisted etching and radiotherapy. In this context, there are some research groups, both theoretical and experimental, scattered round the world that attend to most of this data demand, but the implementation of atomic databases has grown independently out of sheer necessity. In some cases the latter has been associated with the data production process or with data centers involved in data collection and evaluation; but sometimes it has been the result of individual initiatives that have been quite successful. In any case, the development and maintenance of atomic databases call for a number of skills and an entrepreneurial spirit that are not usually associated with most physics researchers. In the present report we present some of the highlights in this area in the past five years and discuss what we think are some of the main issues that have to be addressed.

  13. De novo Transcriptome Analysis of Sinapis alba in Revealing the Glucosinolate and Phytochelatin Pathways

    PubMed Central

    Zhang, Xiaohui; Liu, Tongjin; Duan, Mengmeng; Song, Jiangping; Li, Xixiang

    2016-01-01

    Sinapis alba is an important condiment crop and can also be used as a phytoremediation plant. Though it has important economic and agronomic values, sequence data and genetic tools are still scarce for this plant. In the present study, a de novo transcriptome based on the transcriptions of leaves, stems, and roots was assembled for S. alba for the first time. The transcriptome contains 47,972 unigenes with a mean length of 1185 nt and an N50 of 1672 nt. Among these unigenes, 46,535 (97%) unigenes were annotated by at least one of the following databases: NCBI non-redundant (Nr), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, Gene Ontology (GO), and Clusters of Orthologous Groups of proteins (COGs). The tissue expression pattern profiles revealed that 3489, 1361, and 8482 unigenes were predominantly expressed in the leaves, stems, and roots of S. alba, respectively. Genes predominantly expressed in the leaf were enriched in photosynthesis- and carbon fixation-related pathways. Genes predominantly expressed in the stem were enriched in not only pathways related to sugar, ether lipid, and amino acid metabolisms but also plant hormone signal transduction and circadian rhythm pathways, while the root-dominant genes were enriched in pathways related to lignin and cellulose syntheses, involved in plant-pathogen interactions, and potentially responsible for heavy metal chelation and detoxification. Based on this transcriptome, 14,727 simple sequence repeats (SSRs) were identified, and 12,830 pairs of primers were developed for 2522 SSR-containing unigenes. Additionally, the glucosinolate (GSL) and phytochelatin metabolic pathways, which give the characteristic flavor and the heavy metal tolerance of this plant, were intensively analyzed. The genes of aliphatic GSLs pathway were predominantly expressed in roots. The absence of aliphatic GSLs in leaf tissues was due to the shutdown of BCAT4, MAM1, and CYP79F1 expressions. Glutathione was extensively
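
    Illustrative note: this abstract reports an N50 for the assembled unigenes. The sketch below shows the conventional N50 definition: the length L such that sequences of length L or longer together make up at least half of the total assembled length. The abstract does not say which tool or exact definition the authors used, so this is an assumption, and the function name and example lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe form of running >= total / 2
            return length


if __name__ == "__main__":
    # Hypothetical unigene lengths in nt, for illustration only.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

    N50 weights long sequences more heavily than the mean does, which is why an assembly can show an N50 (1672 nt here) well above its mean length (1185 nt).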

  14. Copy number variations and genome-wide associations reveal putative genes and metabolic pathways involved with the feed conversion ratio in beef cattle.

    PubMed

    de Almeida Santana, Miguel Henrique; Junior, Gerson Antônio Oliveira; Cesar, Aline Silva Mello; Freua, Mateus Castelani; da Costa Gomes, Rodrigo; da Luz E Silva, Saulo; Leme, Paulo Roberto; Fukumasu, Heidge; Carvalho, Minos Esperândio; Ventura, Ricardo Vieira; Coutinho, Luiz Lehmann; Kadarmideen, Haja N; Ferraz, José Bento Sterman

    2016-11-01

    The use of genome-wide association results combined with other genomic approaches may uncover genes and metabolic pathways related to complex traits. In this study, the phenotypic and genotypic data of 1475 Nellore (Bos indicus) cattle and 941,033 single nucleotide polymorphisms (SNPs) were used for genome-wide association study (GWAS) and copy number variations (CNVs) analysis in order to identify candidate genes and putative pathways involved with the feed conversion ratio (FCR). The GWAS was based on the Bayes B approach analyzing genomic windows with multiple regression models to estimate the proportion of genetic variance explained by each window. The CNVs were detected with PennCNV software using the log R ratio and B allele frequency data. CNV regions (CNVRs) were identified with CNVRuler and a linear regression was used to associate CNVRs and the FCR. Functional annotation of associated genomic regions was performed with the Database for Annotation, Visualization and Integrated Discovery (DAVID) and the metabolic pathways were obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG). We showed five genomic windows distributed over chromosomes 4, 6, 7, 8, and 24 that explain 12% of the total genetic variance for FCR, and detected 12 CNVRs (chromosomes 1, 5, 7, 10, and 12) significantly associated [false discovery rate (FDR) < 0.05] with the FCR. Significant genomic regions (GWAS and CNV) harbor candidate genes involved in pathways related to energetic, lipid, and protein metabolism. The metabolic pathways found in this study are related to processes directly connected to feed efficiency in beef cattle. It was observed that, even though different genomic regions and genes were found between the two approaches (GWAS and CNV), the metabolic processes covered were related to each other. Therefore, the two approaches complement each other and lead to a better understanding of the FCR.
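
    Illustrative note: this abstract applies a false discovery rate threshold (FDR < 0.05) to the CNVR associations but does not name the correction procedure. The sketch below shows one widely used option, the Benjamini-Hochberg adjustment. Treating it as the method used in the study would be an assumption. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    For the p-value with rank r (1 = smallest) among m tests, the raw
    adjustment is p * m / r; a running minimum taken from the largest
    p-value downwards keeps the adjusted values monotonic.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values for four tests, for illustration only.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(p, q, "significant" if q < 0.05 else "not significant")
```

    Under this procedure, a test is reported as significant at FDR < 0.05 when its adjusted value falls below 0.05.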

  15. De Novo transcriptome sequencing reveals important molecular networks and metabolic pathways of the plant, Chlorophytum borivilianum.

    PubMed

    Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

    2013-01-01

    Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high-throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single-end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of the transcriptome. A total of 101,141 assembled transcripts were obtained, with a coverage size of 22.42 Mb and an average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large-scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum.
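
The RPKM normalization named in this abstract follows a standard formula: reads mapped to a transcript, scaled by transcript length in kilobases and by total mapped reads in millions. The sketch below is illustrative only; the function name and the example counts are hypothetical and are not taken from the study.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example values (not from the abstract):
# 500 reads on a 2,000 bp transcript in a library of 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
print(example_value)
```

Because the value is divided by both length and library size, transcripts of different lengths and samples of different sequencing depth become roughly comparable.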

  16. De Novo Transcriptome Sequencing Reveals Important Molecular Networks and Metabolic Pathways of the Plant, Chlorophytum borivilianum

    PubMed Central

    Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

    2013-01-01

    Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high-throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single-end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of the transcriptome. A total of 101,141 assembled transcripts were obtained, with a coverage size of 22.42 Mb and an average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large-scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum. PMID:24376689

  17. OxDBase: a database of oxygenases involved in biodegradation

    PubMed Central

    Arora, Pankaj K; Kumar, Manish; Chauhan, Archana; Raghava, Gajendra PS; Jain, Rakesh K

    2009-01-01

    Background Oxygenases belong to the oxidoreductive group of enzymes (E.C. Class 1), which oxidize the substrates by transferring oxygen from molecular oxygen (O2) and utilize FAD/NADH/NADPH as the co-substrate. Oxygenases can further be grouped into two categories, i.e. monooxygenases and dioxygenases, on the basis of the number of oxygen atoms used for oxidation. They play a key role in the metabolism of organic compounds by increasing their reactivity or water solubility or bringing about cleavage of the aromatic ring. Findings We compiled a database of biodegradative oxygenases (OxDBase), which presents oxygenase data sourced from primary literature in the form of a web-accessible database. There are two separate search engines for searching the database, one for monooxygenases and one for dioxygenases. Each enzyme entry contains its common name and synonym, the reaction in which the enzyme is involved, family and subfamily, structure and gene links, and literature citations. The entries are also linked to several external databases including BRENDA, KEGG, ENZYME and UM-BBD, providing wide background information. At present the database contains information on over 235 oxygenases including both dioxygenases and monooxygenases. This database is freely available online at . Conclusion OxDBase is the first database that is dedicated only to oxygenases and provides comprehensive information about them. Due to the importance of the oxygenases in chemical synthesis of drug intermediates and oxidation of xenobiotic compounds, OxDBase would be a very useful tool in the fields of synthetic chemistry and bioremediation. PMID:19405962

  18. Genomic Contributors to Rhythm Outcome of Atrial Fibrillation Catheter Ablation – Pathway Enrichment Analysis of GWAS Data

    PubMed Central

    Ueberham, Laura; Dinov, Borislav; Sommer, Philipp; Arya, Arash; Hindricks, Gerhard; Bollmann, Andreas

    2016-01-01

    Background Left atrial enlargement and persistent atrial fibrillation (AF) are well-known predictors for arrhythmia recurrence after AF catheter ablation (LRAF). In this study, by using pathway enrichment analysis of GWAS data, we tested the hypothesis that genetic pathways associated with these phenotypes are also associated with LRAF. Methods Samples from 660 patients with paroxysmal (n = 370) or persistent AF (n = 290) undergoing de-novo AF catheter ablation were genotyped for ~1,000,000 SNPs. SNPs found to be significantly associated with left atrial diameter (LAD) or AF type were used for gene-based association tests in a systematic biological Knowledge-based mining system for Genome-wide Genetic studies (KGG). Associated genes were tested for pathway enrichment using WEB-based Gene SeT AnaLysis Toolkit (WebGestalt), the Gene Annotation Tool to Help Explain Relationships (GATHER) and the databases provided by Kyoto Encyclopedia of Genes and Genomes (KEGG). In a second step, the association of consistently enriched pathways and LRAF was tested. Results By using sequential 7-day Holter ECGs, LRAF between 3 and 12 months was observed in 48% and was associated with LAD (B = 1.801, 95% CI 0.760–2.841, p = 1.0E-3) and persistent AF (OR = 2.1; 95% CI 1.567–2.931, p = 2.0E-6). WebGestalt (adj. p = 2.7E-22) and GATHER (adj. p = 5.2E-3) identified the calcium signaling pathway (hsa04020) as the only consistently enriched pathway for LAD, while the extracellular matrix (ECM) -receptor interaction pathway (hsa04512) was the only consistently enriched pathway for AF type (adj. p = 2.1E-15 in WebGestalt; adj. p = 9.3E-4 in GATHER). Both calcium signaling (adj. p = 2.2E-17 in WebGestalt; adj. p = 2.9E-2 in GATHER) and ECM-receptor interaction (adj. p = 1.2E-10 in WebGestalt; adj. p = 2.9E-2 in GATHER) were significantly associated with LRAF. Conclusions Calcium signaling and ECM-receptor interaction pathways are associated with LAD and AF type and, in turn, with LRAF
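
Pathway enrichment of a gene list against KEGG gene sets is commonly framed as an over-representation test. The sketch below uses a one-sided hypergeometric test as an illustration of that general idea. It is an assumption for illustration, not the statistic actually implemented by WebGestalt or GATHER, which the abstract does not describe, and all counts and names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background: int, pathway_size: int,
                           selected: int, overlap: int) -> float:
    """One-sided over-representation p-value P(X >= overlap).

    X is the number of pathway genes in a random draw of `selected` genes
    (without replacement) from `background` genes, of which `pathway_size`
    belong to the pathway.
    """
    if not (0 <= pathway_size <= background and 0 <= selected <= background):
        raise ValueError("pathway_size and selected must lie in [0, background]")
    max_overlap = min(pathway_size, selected)
    total = comb(background, selected)
    # Sum the hypergeometric probabilities for k = overlap .. max_overlap.
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(max(overlap, 0), max_overlap + 1)
    )
    return tail / total


# Hypothetical toy numbers: 10 background genes, 5 in the pathway,
# 5 associated genes, all 5 of which fall in the pathway.
p_value = hypergeom_enrichment_p(background=10, pathway_size=5, selected=5, overlap=5)
print(p_value)
```

In practice such p-values are computed for every pathway and then adjusted for multiple testing, which corresponds to the "adj. p" values reported in the abstract.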

  19. Stackfile Database

    NASA Technical Reports Server (NTRS)

    deVarvalho, Robert; Desai, Shailen D.; Haines, Bruce J.; Kruizinga, Gerhard L.; Gilmer, Christopher

    2013-01-01

    This software provides storage, retrieval, and analysis functionality for managing satellite altimetry data. It improves the efficiency and analysis capabilities of existing database software with improved flexibility and documentation. It offers flexibility in the type of data that can be stored. There is efficient retrieval either across the spatial domain or the time domain. Built-in analysis tools are provided for frequently performed altimetry tasks. This software package is used for storing and manipulating satellite measurement data. It was developed with a focus on handling the requirements of repeat-track altimetry missions such as Topex and Jason. It was, however, designed to work with a wide variety of satellite measurement data (e.g., Gravity Recovery and Climate Experiment, GRACE). The software consists of several command-line tools for importing, retrieving, and analyzing satellite measurement data.

  20. Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms

    PubMed Central

    Ortegon, Patricia; Poot-Hernández, Augusto C.; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

    2015-01-01

    In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case. PMID:25973143
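
The step of turning a pathway into an enzymatic step sequence with breadth-first search can be sketched as follows. This is a minimal illustration under stated assumptions: the adjacency-list encoding, the function name, and the toy graph (a simplified fragment of upper glycolysis written as EC numbers) are hypothetical and are not the authors' data structures or KEGG data.

```python
from collections import deque

# Hypothetical toy pathway: each EC number points to the EC numbers of the
# enzymes that act on its product (simplified upper glycolysis).
TOY_PATHWAY = {
    "2.7.1.2": ["5.3.1.9"],               # glucokinase
    "5.3.1.9": ["2.7.1.11"],              # glucose-6-phosphate isomerase
    "2.7.1.11": ["4.1.2.13"],             # 6-phosphofructokinase
    "4.1.2.13": ["5.3.1.1", "1.2.1.12"],  # fructose-bisphosphate aldolase
    "5.3.1.1": ["1.2.1.12"],              # triose-phosphate isomerase
}


def bfs_step_sequence(graph: dict, start: str) -> list:
    """Return enzymes in breadth-first visiting order starting from `start`."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        enzyme = queue.popleft()
        order.append(enzyme)
        for successor in graph.get(enzyme, []):
            if successor not in visited:
                visited.add(successor)
                queue.append(successor)
    return order


sequence = bfs_step_sequence(TOY_PATHWAY, "2.7.1.2")
print(sequence)
```

Linearizing each pathway this way yields strings of EC numbers that can then be compared pairwise, which is the role the genetic algorithm plays in the study.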

  1. Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms.

    PubMed

    Ortegon, Patricia; Poot-Hernández, Augusto C; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

    2015-01-01

    In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case.

  2. Predicting the diagnosis of autism spectrum disorder using gene pathway analysis

    PubMed Central

    Skafidas, E; Testa, R; Zantomio, D; Chana, G; Everall, I P; Pantelis, C

    2014-01-01

    Autism spectrum disorder (ASD) depends on a clinical interview with no biomarkers to aid diagnosis. The current investigation interrogated single-nucleotide polymorphisms (SNPs) of individuals with ASD from the Autism Genetic Resource Exchange (AGRE) database. SNPs were mapped to Kyoto Encyclopedia of Genes and Genomes (KEGG)-derived pathways to identify affected cellular processes and develop a diagnostic test. This test was then applied to two independent samples from the Simons Foundation Autism Research Initiative (SFARI) and Wellcome Trust 1958 normal birth cohort (WTBC) for validation. Using AGRE SNP data from a Central European (CEU) cohort, we created a genetic diagnostic classifier consisting of 237 SNPs in 146 genes that correctly predicted ASD diagnosis in 85.6% of CEU cases. This classifier also predicted 84.3% of cases in an ethnically related Tuscan cohort; however, prediction was less accurate (56.4%) in a genetically dissimilar Han Chinese cohort (HAN). Eight SNPs in three genes (KCNMB4, GNAO1, GRM5) had the largest effect in the classifier with some acting as vulnerability SNPs, whereas others were protective. Prediction accuracy diminished as the number of SNPs analyzed in the model was decreased. Our diagnostic classifier correctly predicted ASD diagnosis with an accuracy of 71.7% in CEU individuals from the SFARI (ASD) and WTBC (controls) validation data sets. In conclusion, we have developed an accurate diagnostic test for a genetically homogeneous group to aid in early detection of ASD. While SNPs differ across ethnic groups, our pathway approach identified cellular processes common to ASD across ethnicities. Our results have wide implications for detection, intervention and prevention of ASD. PMID:22965006

  3. A database of macromolecular motions.

    PubMed Central

    Gerstein, M; Krebs, W

    1998-01-01

    We describe a database of macromolecular motions meant to be of general use to the structural community. The database, which is accessible on the World Wide Web with an entry point at http://bioinfo.mbb.yale.edu/MolMovDB , attempts to systematize all instances of protein and nucleic acid movement for which there is at least some structural information. At present it contains >120 motions, most of which are of proteins. Protein motions are further classified hierarchically into a limited number of categories, first on the basis of size (distinguishing between fragment, domain and subunit motions) and then on the basis of packing. Our packing classification divides motions into various categories (shear, hinge, other) depending on whether or not they involve sliding over a continuously maintained and tightly packed interface. In addition, the database provides some indication about the evidence behind each motion (i.e. the type of experimental information or whether the motion is inferred based on structural similarity) and attempts to describe many aspects of a motion in terms of a standardized nomenclature (e.g. the maximum rotation, the residue selection of a fixed core, etc.). Currently, we use a standard relational design to implement the database. However, the complexity and heterogeneity of the information kept in the database makes it an ideal application for an object-relational approach, and we are moving it in this direction. Specifically, in terms of storing complex information, the database contains plausible representations for motion pathways, derived from restrained 3D interpolation between known endpoint conformations. These pathways can be viewed in a variety of movie formats, and the database is associated with a server that can automatically generate these movies from submitted coordinates. PMID:9722650

  4. Robust gene signatures from microarray data using genetic algorithms enriched with biological pathway keywords.

    PubMed

    Luque-Baena, R M; Urda, D; Gonzalo Claros, M; Franco, L; Jerez, J M

    2014-06-01

    Genetic algorithms are widely used in the estimation of expression profiles from microarray data. However, these techniques are unable to produce stable and robust solutions suitable for use in clinical and biomedical studies. This paper presents a novel two-stage evolutionary strategy for gene feature selection combining the genetic algorithm with biological information extracted from the KEGG database. A comparative study is carried out over public data from three different types of cancer (leukemia, lung cancer and prostate cancer). Even though the analyses only use features having KEGG information, the results demonstrate that this two-stage evolutionary strategy increased the consistency, robustness and accuracy of a blind discrimination among relapsed and healthy individuals. Therefore, this approach could facilitate the definition of gene signatures for the clinical prognosis and diagnosis of cancer in the near future. Additionally, it could also be used for biological knowledge discovery about the studied disease.

  5. Gene microarray assessment of multiple genes and signal pathways involved in androgen-dependent prostate cancer becoming androgen independent.

    PubMed

    Liu, Jun-Bao; Dai, Chun-Mei; Su, Xiao-Yun; Cao, Lu; Qin, Rui; Kong, Qing-Bo

    2014-01-01

    To study the gene expression changes and possible signal pathways during the progression of androgen-dependent prostate cancer (ADPC) to androgen-independent prostate cancer (AIPC), an LNCaP cell model of AIPC was established using flutamide in combination with androgen-free environment inducement, and differentially expressed genes were screened by microarray. The biological processes, molecular functions and KEGG pathways of the differentially expressed genes were then analyzed with the Molecule Annotation System (MAS). By comparison of 12,207 expressed genes, 347 differentially expressed genes were identified, of which 156 were up-regulated and 191 down-regulated. Analysis of biological process and molecular function showed that these genes play crucial roles in cell proliferation, differentiation, cell cycle control, protein metabolism and modification and other biological processes, and serve as signal molecules, enzymes, peptide hormones, cytokines, cytoskeletal proteins and adhesion molecules. The KEGG analysis shows that the genes relevant to AIPC transformation participate in glutathione metabolism, cell cycle, P53 signal pathway, cytochrome P450 metabolism, Hedgehog signal pathway, MAPK signal pathway, adipocytokines signal pathway, PPAR signal pathway, TGF-β signal pathway and JAK-STAT signal pathway. In conclusion, during the process of ADPC becoming AIPC, it is not only one specific gene or pathway, but multiple genes and pathways that change. The findings above lay the foundation for study of the AIPC mechanism and development of AIPC-targeting drugs.

  6. The MetaCyc Database.

    PubMed

    Karp, Peter D; Riley, Monica; Paley, Suzanne M; Pellegrini-Toole, Alida

    2002-01-01

    MetaCyc is a metabolic-pathway database that describes 445 pathways and 1115 enzymes occurring in 158 organisms. MetaCyc is a review-level database in that a given entry in MetaCyc often integrates information from multiple literature sources. The pathways in MetaCyc were determined experimentally, and are labeled with the species in which they are known to occur based on literature references examined to date. MetaCyc contains extensive commentary and literature citations. Applications of MetaCyc include pathway analysis of genomes, metabolic engineering and biochemistry education. MetaCyc is queried using the Pathway Tools graphical user interface, which provides a wide variety of query operations and visualization tools. MetaCyc is available via the World Wide Web at http://ecocyc.org/ecocyc/metacyc.html, and is available for local installation as a binary program for the PC and the Sun workstation, and as a set of flatfiles. Contact metacyc-info@ai.sri.com for information on obtaining a local copy of MetaCyc.

  7. Core Proteomic Analysis of Unique Metabolic Pathways of Salmonella enterica for the Identification of Potential Drug Targets

    PubMed Central

    2016-01-01

    Background Infections caused by Salmonella enterica, a Gram-negative facultative anaerobic bacterium belonging to the family Enterobacteriaceae, are major threats to the health of humans and animals. The recent availability of complete genome data of pathogenic strains of S. enterica gives new avenues for the identification of drug targets and drug candidates. We have used the genomic and metabolic pathway data to identify pathways and proteins essential to the pathogen and absent from the host. Methods We took the whole proteome sequence data of 42 strains of S. enterica and Homo sapiens along with KEGG-annotated metabolic pathway data, clustered protein sequences using CD-HIT, identified essential genes using the DEG database, discarded S. enterica homologs of human proteins in unique metabolic pathways (UMPs) and characterized hypothetical proteins with SVM-prot and InterProScan. Through this core proteomic analysis we have identified enzymes essential to the pathogen. Results The identification of 73 enzymes common to 42 strains of S. enterica is the real strength of the current study. We proposed all 73 unexplored enzymes as potential drug targets against infections caused by S. enterica. The study is comprehensive for S. enterica and simultaneously considered every possible pathogenic strain of S. enterica. This comprehensiveness makes the current study significant since, to the best of our knowledge, it is the first subtractive core proteomic analysis of the unique metabolic pathways applied to any pathogen for the identification of drug targets. We applied extensive computational methods to shortlist a few potential drug targets considering druggability criteria, e.g. non-homologous to the human host, essential to the pathogen and playing a significant role in essential metabolic pathways of the pathogen (i.e. S. enterica). In the current study, subtractive proteomics was applied through a novel approach, i.e. by considering only proteins

  8. JICST Factual Database JICST DNA Database

    NASA Astrophysics Data System (ADS)

    Shirokizawa, Yoshiko; Abe, Atsushi

    The Japan Information Center of Science and Technology (JICST) started an online DNA database service in October 1988. This database is composed of the EMBL Nucleotide Sequence Library and the Genetic Sequence Data Bank. The authors outline the database system, data items and search commands. Examples of retrieval sessions are presented.

  9. De novo assembly and transcriptome analysis of the rubber tree (Hevea brasiliensis) and SNP markers development for rubber biosynthesis pathways.

    PubMed

    Mantello, Camila Campos; Cardoso-Silva, Claudio Benicio; da Silva, Carla Cristina; de Souza, Livia Moura; Scaloppi Junior, Erivaldo José; de Souza Gonçalves, Paulo; Vicentini, Renato; de Souza, Anete Pereira

    2014-01-01

    Hevea brasiliensis (Willd. Ex Adr. Juss.) Muell.-Arg. is the primary source of natural rubber that is native to the Amazon rainforest. The singular properties of natural rubber make it superior to and competitive with synthetic rubber for use in several applications. Here, we performed RNA sequencing (RNA-seq) of H. brasiliensis bark on the Illumina GAIIx platform, which generated 179,326,804 raw reads. A total of 50,384 contigs that were over 400 bp in size were obtained and subjected to further analyses. A similarity search against the non-redundant (nr) protein database returned 32,018 (63%) positive BLASTx hits. The transcriptome analysis was annotated using the clusters of orthologous groups (COG), gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Pfam databases. A search for putative molecular markers was performed to identify simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). In total, 17,927 SSRs and 404,114 SNPs were detected. Finally, we selected sequences that were identified as belonging to the mevalonate (MVA) and 2-C-methyl-D-erythritol 4-phosphate (MEP) pathways, which are involved in rubber biosynthesis, to validate the SNP markers. A total of 78 SNPs were validated in 36 genotypes of H. brasiliensis. This new dataset represents a powerful information source for rubber tree bark genes and will be an important tool for the development of microsatellites and SNP markers for use in future genetic analyses such as genetic linkage mapping, quantitative trait loci identification, investigations of linkage disequilibrium and marker-assisted selection.

  10. De Novo Assembly and Transcriptome Analysis of the Rubber Tree (Hevea brasiliensis) and SNP Markers Development for Rubber Biosynthesis Pathways

    PubMed Central

    Mantello, Camila Campos; Cardoso-Silva, Claudio Benicio; da Silva, Carla Cristina; de Souza, Livia Moura; Scaloppi Junior, Erivaldo José; de Souza Gonçalves, Paulo; Vicentini, Renato; de Souza, Anete Pereira

    2014-01-01

    Hevea brasiliensis (Willd. Ex Adr. Juss.) Muell.-Arg. is the primary source of natural rubber that is native to the Amazon rainforest. The singular properties of natural rubber make it superior to and competitive with synthetic rubber for use in several applications. Here, we performed RNA sequencing (RNA-seq) of H. brasiliensis bark on the Illumina GAIIx platform, which generated 179,326,804 raw reads. A total of 50,384 contigs that were over 400 bp in size were obtained and subjected to further analyses. A similarity search against the non-redundant (nr) protein database returned 32,018 (63%) positive BLASTx hits. The transcriptome analysis was annotated using the clusters of orthologous groups (COG), gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Pfam databases. A search for putative molecular markers was performed to identify simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). In total, 17,927 SSRs and 404,114 SNPs were detected. Finally, we selected sequences that were identified as belonging to the mevalonate (MVA) and 2-C-methyl-D-erythritol 4-phosphate (MEP) pathways, which are involved in rubber biosynthesis, to validate the SNP markers. A total of 78 SNPs were validated in 36 genotypes of H. brasiliensis. This new dataset represents a powerful information source for rubber tree bark genes and will be an important tool for the development of microsatellites and SNP markers for use in future genetic analyses such as genetic linkage mapping, quantitative trait loci identification, investigations of linkage disequilibrium and marker-assisted selection. PMID:25048025

  11. Detection of driver pathways using mutated gene network in cancer.

    PubMed

    Li, Feng; Gao, Lin; Ma, Xiaoke; Yang, Xiaofei

    2016-06-21

    Distinguishing driver pathways has been extensively studied because they are critical for understanding the development and molecular mechanisms of cancers. Most existing methods for driver pathways are based on high coverage as well as high mutual exclusivity, with the underlying assumption that mutations are exclusive. However, in many cases, mutated driver genes in the same pathways are not strictly mutually exclusive. Based on this observation, we propose an index for quantifying mutual exclusivity between gene pairs. Then, we construct a mutated gene network for detecting driver pathways by integrating the proposed index and coverage. The detection of driver pathways on the mutated gene network consists of two steps: raw pathways are obtained using a CPM method, and the final driver pathways are selected using a strict testing strategy. We apply this method to glioblastoma and breast cancers and find that our method is more accurate than state-of-the-art methods in terms of enrichment of KEGG pathways. Furthermore, the detected driver pathways intersect with well-known pathways with moderate exclusivity, which cannot be discovered using the existing algorithms. In conclusion, the proposed method provides an effective way to investigate driver pathways in cancers.

  12. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    SciTech Connect

    Shi, CY; Yang, H; Wei, CL; Yu, O; Zhang, ZZ; Sun, J; Wan, XC

    2011-01-01

    Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real

  13. Inference of miRNA targets using evolutionary conservation and pathway analysis

    PubMed Central

    Gaidatzis, Dimos; van Nimwegen, Erik; Hausser, Jean; Zavolan, Mihaela

    2007-01-01

    Background: MicroRNAs have emerged as important regulatory genes in a variety of cellular processes and, in recent years, hundreds of such genes have been discovered in animals. In contrast, functional annotations are available only for a very small fraction of these miRNAs, and even in these cases only partially. Results: We developed a general Bayesian method for the inference of miRNA target sites, in which, for each miRNA, we explicitly model the evolution of orthologous target sites in a set of related species. Using this method we predict target sites for all known miRNAs in flies, worms, fish, and mammals. By comparing our predictions in fly with a reference set of experimentally tested miRNA-mRNA interactions we show that our general method performs at least as well as the most accurate methods available to date, including ones specifically tailored for target prediction in fly. An important novel feature of our model is that it explicitly infers the phylogenetic distribution of functional target sites, independently for each miRNA. This allows us to infer species-specific and clade-specific miRNA targeting. We also show that, in long human 3' UTRs, miRNA target sites occur preferentially near the start and near the end of the 3' UTR. To characterize miRNA function beyond the predicted lists of targets we further present a method to infer significant associations between the sets of targets predicted for individual miRNAs and specific biochemical pathways, in particular those of the KEGG pathway database. We show that this approach retrieves several known functional miRNA-mRNA associations, and predicts novel functions for known miRNAs in cell growth and in development. Conclusion: We have presented a Bayesian target prediction algorithm without any tunable parameters that can be applied to sequences from any clade of species. The algorithm automatically infers the phylogenetic distribution of functional sites for each miRNA, and assigns a posterior

  14. Reflective Database Access Control

    ERIC Educational Resources Information Center

    Olson, Lars E.

    2009-01-01

    "Reflective Database Access Control" (RDBAC) is a model in which a database privilege is expressed as a database query itself, rather than as a static privilege contained in an access control list. RDBAC aids the management of database access controls by improving the expressiveness of policies. However, such policies introduce new interactions…

  15. Databases: Beyond the Basics.

    ERIC Educational Resources Information Center

    Whittaker, Robert

    This presented paper offers an elementary description of database characteristics and then provides a survey of databases that may be useful to the teacher and researcher in Slavic and East European languages and literatures. The survey focuses on commercial databases that are available, usable, and needed. Individual databases discussed include:…

  16. Integrative data mining of high-throughput in vitro screens, in vivo data, and disease information to identify Adverse Outcome Pathway (AOP) signatures: ToxCast high-throughput screening data and Comparative Toxicogenomics Database (CTD) as a case study.

    EPA Science Inventory

    The Adverse Outcome Pathway (AOP) framework provides a systematic way to describe linkages between molecular and cellular processes and organism or population level effects. The current AOP assembly methods, however, are inefficient. Our goal is to generate computationally-pr...

  17. Human Mitochondrial Protein Database

    National Institute of Standards and Technology Data Gateway

    SRD 131 Human Mitochondrial Protein Database (Web, free access)   The Human Mitochondrial Protein Database (HMPDb) provides comprehensive data on mitochondrial and human nuclear encoded proteins involved in mitochondrial biogenesis and function. This database consolidates information from SwissProt, LocusLink, Protein Data Bank (PDB), GenBank, Genome Database (GDB), Online Mendelian Inheritance in Man (OMIM), Human Mitochondrial Genome Database (mtDB), MITOMAP, Neuromuscular Disease Center and Human 2-D PAGE Databases. This database is intended as a tool not only to aid in studying the mitochondrion but in studying the associated diseases.

  18. YMDB: the Yeast Metabolome Database

    PubMed Central

    Jewison, Timothy; Knox, Craig; Neveu, Vanessa; Djoumbou, Yannick; Guo, An Chi; Lee, Jacqueline; Liu, Philip; Mandal, Rupasri; Krishnamurthy, Ram; Sinelnikov, Igor; Wilson, Michael; Wishart, David S.

    2012-01-01

    The Yeast Metabolome Database (YMDB, http://www.ymdb.ca) is a richly annotated ‘metabolomic’ database containing detailed information about the metabolome of Saccharomyces cerevisiae. Modeled closely after the Human Metabolome Database, the YMDB contains >2000 metabolites with links to 995 different genes/proteins, including enzymes and transporters. The information in YMDB has been gathered from hundreds of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the YMDB also contains an extensive collection of experimental intracellular and extracellular metabolite concentration data compiled from detailed Mass Spectrometry (MS) and Nuclear Magnetic Resonance (NMR) metabolomic analyses performed in our lab. This is further supplemented with thousands of NMR and MS spectra collected on pure, reference yeast metabolites. Each metabolite entry in the YMDB contains an average of 80 separate data fields including comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, intracellular/extracellular concentrations, growth conditions and substrates, pathway information, enzyme data, gene/protein sequence data, as well as numerous hyperlinks to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided that support text, chemical structure, spectral, molecular weight and gene/protein sequence queries. Because of S. cerevisiae's importance as a model organism for biologists and as a biofactory for industry, we believe this kind of database could have considerable appeal not only to metabolomics researchers, but also to yeast biologists, systems biologists, the industrial fermentation industry, as well as the beer, wine and spirit industry. PMID:22064855

  19. Pathway-based factor analysis of gene expression data produces highly heritable phenotypes that associate with age.

    PubMed

    Anand Brown, Andrew; Ding, Zhihao; Viñuela, Ana; Glass, Dan; Parts, Leopold; Spector, Tim; Winn, John; Durbin, Richard

    2015-03-09

    Statistical factor analysis methods have previously been used to remove noise components from high-dimensional data prior to genetic association mapping and, in a guided fashion, to summarize biologically relevant sources of variation. Here, we show how the derived factors summarizing pathway expression can be used to analyze the relationships between expression, heritability, and aging. We used skin gene expression data from 647 twins from the MuTHER Consortium and applied factor analysis to concisely summarize patterns of gene expression to remove broad confounding influences and to produce concise pathway-level phenotypes. We derived 930 "pathway phenotypes" that summarized patterns of variation across 186 KEGG pathways (five phenotypes per pathway). We identified 69 significant associations of age with phenotype from 57 distinct KEGG pathways at a stringent Bonferroni threshold ($P < 5.38 \times 10^{-5}$). These phenotypes are more heritable ($h^2 = 0.32$) than gene expression levels. On average, expression levels of 16% of genes within these pathways are associated with age. Several significant pathways relate to metabolizing sugars and fatty acids; others relate to insulin signaling. We have demonstrated that factor analysis methods combined with biological knowledge can produce more reliable phenotypes with less stochastic noise than the individual gene expression levels, which increases our power to discover biologically relevant associations. These phenotypes could also be applied to discover associations with other environmental factors.

  20. Pathway-Based Factor Analysis of Gene Expression Data Produces Highly Heritable Phenotypes That Associate with Age

    PubMed Central

    Anand Brown, Andrew; Ding, Zhihao; Viñuela, Ana; Glass, Dan; Parts, Leopold; Spector, Tim; Winn, John; Durbin, Richard

    2015-01-01

    Statistical factor analysis methods have previously been used to remove noise components from high-dimensional data prior to genetic association mapping and, in a guided fashion, to summarize biologically relevant sources of variation. Here, we show how the derived factors summarizing pathway expression can be used to analyze the relationships between expression, heritability, and aging. We used skin gene expression data from 647 twins from the MuTHER Consortium and applied factor analysis to concisely summarize patterns of gene expression to remove broad confounding influences and to produce concise pathway-level phenotypes. We derived 930 “pathway phenotypes” that summarized patterns of variation across 186 KEGG pathways (five phenotypes per pathway). We identified 69 significant associations of age with phenotype from 57 distinct KEGG pathways at a stringent Bonferroni threshold ($P < 5.38 \times 10^{-5}$). These phenotypes are more heritable ($h^2 = 0.32$) than gene expression levels. On average, expression levels of 16% of genes within these pathways are associated with age. Several significant pathways relate to metabolizing sugars and fatty acids; others relate to insulin signaling. We have demonstrated that factor analysis methods combined with biological knowledge can produce more reliable phenotypes with less stochastic noise than the individual gene expression levels, which increases our power to discover biologically relevant associations. These phenotypes could also be applied to discover associations with other environmental factors. PMID:25758824
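
    The Bonferroni threshold quoted in this record can be reproduced with simple arithmetic. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state; the pathway and phenotype counts are taken from the abstract.

```python
# Worked arithmetic for the multiple-testing threshold quoted in the abstract.
n_pathways = 186            # KEGG pathways analysed (from the abstract)
factors_per_pathway = 5     # phenotypes derived per pathway (from the abstract)
alpha = 0.05                # ASSUMED family-wise error rate; not stated in the abstract

n_tests = n_pathways * factors_per_pathway      # one test per pathway phenotype
bonferroni_threshold = alpha / n_tests          # per-test significance cutoff

print(n_tests)                          # 930
print(f"{bonferroni_threshold:.2e}")    # 5.38e-05
```

    Under that assumption the result agrees with both the 930 pathway phenotypes and the reported cutoff.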

  1. Functional diversity and structural disorder in the human ubiquitination pathway.

    PubMed

    Bhowmick, Pallab; Pancsa, Rita; Guharoy, Mainak; Tompa, Peter

    2013-01-01

    The ubiquitin-proteasome system plays a central role in cellular regulation and protein quality control (PQC). The system is built as a pyramid of increasing complexity, with two E1 (ubiquitin activating), a few dozen E2 (ubiquitin conjugating) and several hundred E3 (ubiquitin ligase) enzymes. By collecting and analyzing E3 sequences from the KEGG BRITE database and literature, we assembled a coherent dataset of 563 human E3s and analyzed their various physical features. We found an increase in structural disorder of the system with multiple disorder predictors (IUPred - E1: 5.97%, E2: 17.74%, E3: 20.03%). E3s that can bind E2 and substrate simultaneously (single subunit E3, ssE3) have significantly higher disorder (22.98%) than E3s in which E2 binding (multi RING-finger, mRF, 0.62%), scaffolding (6.01%) and substrate binding (adaptor/substrate recognition subunits, 17.33%) functions are separated. In ssE3s, the disorder was localized in the substrate/adaptor binding domains, whereas the E2-binding RING/HECT-domains were structured. To demonstrate the involvement of disorder in E3 function, we applied normal modes and molecular dynamics analyses to show how a disordered and highly flexible linker in human CBL (an E3 that acts as a regulator of several tyrosine kinase-mediated signalling pathways) facilitates long-range conformational changes bringing substrate and E2-binding domains towards each other and thus assisting in ubiquitin transfer. E3s with multiple interaction partners (as evidenced by data in STRING) also possess elevated levels of disorder (hubs, 22.90% vs. non-hubs, 18.36%). Furthermore, a search in PDB uncovered 21 distinct human E3 interactions, in 7 of which the disordered region of E3s undergoes induced folding (or mutual induced folding) in the presence of the partner. In conclusion, our data highlight the primary role of structural disorder in the functions of E3 ligases that manifests itself in the substrate/adaptor binding functions as well

  2. Transcriptome Analysis of Pig In Vivo, In Vitro–Fertilized, and Nuclear Transfer Blastocyst-Stage Embryos Treated with Histone Deacetylase Inhibitors Postfusion and Activation Reveals Changes in the Lysosomal Pathway

    PubMed Central

    Whitworth, Kristin M.; Mao, Jiude; Lee, Kiho; Spollen, William G.; Samuel, Melissa S.; Walters, Eric M.; Spate, Lee D.

    2015-01-01

    Genetically modified pigs are commonly created via somatic cell nuclear transfer (SCNT). Treatment of reconstructed embryos with histone deacetylase inhibitors (HDACi) immediately after activation improves cloning efficiency. The objective of this experiment was to evaluate the transcriptome of SCNT embryos treated with suberoylanilide hydroxamic acid (SAHA), 4-iodo-SAHA (ISAHA), or Scriptaid as compared to untreated SCNT, in vitro–fertilized (IVF), and in vivo (IVV) blastocyst-stage embryos. SAHA (10 μM) had the highest level of blastocyst development at 43.9%, and all treatments except 10 μM ISAHA had the same percentage of blastocyst development as Scriptaid (p<0.05). Two treatments, 1.0 μM ISAHA and 1.0 μM SAHA, had higher mean cell number than No HDACi treatment (p<0.021). Embryo transfers performed with 10 μM SAHA- and 1 μM ISAHA-treated embryos resulted in the birth of healthy piglets. GenBank accession numbers from up- and downregulated transcripts were loaded into the Database for Annotation, Visualization and Integrated Discovery to identify enriched biological themes. HDACi treatment yielded the highest enrichment for transcripts within the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway, lysosome. The mean intensity of LysoTracker was lower in IVV embryos compared to IVF and SCNT embryos (p<0.0001). SAHA and ISAHA can successfully be used to create healthy piglets from SCNT. PMID:26731590

  3. Transcriptome Analysis of Pig In Vivo, In Vitro-Fertilized, and Nuclear Transfer Blastocyst-Stage Embryos Treated with Histone Deacetylase Inhibitors Postfusion and Activation Reveals Changes in the Lysosomal Pathway.

    PubMed

    Whitworth, Kristin M; Mao, Jiude; Lee, Kiho; Spollen, William G; Samuel, Melissa S; Walters, Eric M; Spate, Lee D; Prather, Randall S

    2015-08-01

    Genetically modified pigs are commonly created via somatic cell nuclear transfer (SCNT). Treatment of reconstructed embryos with histone deacetylase inhibitors (HDACi) immediately after activation improves cloning efficiency. The objective of this experiment was to evaluate the transcriptome of SCNT embryos treated with suberoylanilide hydroxamic acid (SAHA), 4-iodo-SAHA (ISAHA), or Scriptaid as compared to untreated SCNT, in vitro-fertilized (IVF), and in vivo (IVV) blastocyst-stage embryos. SAHA (10 μM) had the highest level of blastocyst development at 43.9%, and all treatments except 10 μM ISAHA had the same percentage of blastocyst development as Scriptaid (p<0.05). Two treatments, 1.0 μM ISAHA and 1.0 μM SAHA, had higher mean cell number than No HDACi treatment (p<0.021). Embryo transfers performed with 10 μM SAHA- and 1 μM ISAHA-treated embryos resulted in the birth of healthy piglets. GenBank accession numbers from up- and downregulated transcripts were loaded into the Database for Annotation, Visualization and Integrated Discovery to identify enriched biological themes. HDACi treatment yielded the highest enrichment for transcripts within the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway, lysosome. The mean intensity of LysoTracker was lower in IVV embryos compared to IVF and SCNT embryos (p<0.0001). SAHA and ISAHA can successfully be used to create healthy piglets from SCNT.
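
    The pathway enrichment step described in this record can be illustrated with a plain hypergeometric over-representation test. This is a minimal sketch of the general idea, not the statistic the annotation tool itself reports; the function name and all gene counts below are hypothetical and are not taken from the study.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: total number of background genes
    annotated:  background genes that belong to the pathway
    selected:   differentially expressed genes drawn from the background
    overlap:    selected genes that belong to the pathway
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts, for illustration only (NOT from the study).
p = hypergeom_enrichment_p(population=10000, annotated=120, selected=300, overlap=12)
print(f"{p:.3g}")
```

    A small probability suggests that the pathway is over-represented among the selected transcripts; in practice the value would also need correction for the number of pathways tested.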

  4. Physiological Information Database (PID)

    EPA Science Inventory

    EPA has developed a physiological information database (created using Microsoft ACCESS) intended to be used in PBPK modeling. The database contains physiological parameter values for humans from early childhood through senescence as well as similar data for laboratory animal spec...

  5. THE ECOTOX DATABASE

    EPA Science Inventory

    The database provides chemical-specific toxicity information for aquatic life, terrestrial plants, and terrestrial wildlife. ECOTOX is a comprehensive ecotoxicology database and is therefore essential for providing and supporting high quality models needed to estimate population...

  6. Aviation Safety Issues Database

    NASA Technical Reports Server (NTRS)

    Morello, Samuel A.; Ricks, Wendell R.

    2009-01-01

    The aviation safety issues database was instrumental in the refinement and substantiation of the National Aviation Safety Strategic Plan (NASSP). The issues database is a comprehensive set of issues from an extremely broad base of aviation functions, personnel, and vehicle categories, both nationally and internationally. Several aviation safety stakeholders such as the Commercial Aviation Safety Team (CAST) have already used the database. This broader interest was the genesis for making the database publicly accessible and writing this report.

  7. Scopus database: a review.

    PubMed

    Burnham, Judy F

    2006-03-08

    The Scopus database provides access to STM journal articles and the references included in those articles, allowing the searcher to search both forward and backward in time. The database can be used for collection development as well as for research. This review provides information on the key points of the database and compares it to Web of Science. Neither database is all-inclusive, but the two complement each other. If a library can only afford one, the choice must be based on institutional needs.

  8. Development of SRS.php, a Simple Object Access Protocol-based library for data acquisition from integrated biological databases.

    PubMed

    Barbosa-Silva, A; Pafilis, E; Ortega, J M; Schneider, R

    2007-12-11

    Data integration has become an important task for biological database providers. The current model for data exchange among different sources simplifies the manner in which distinct information is accessed by users. The evolution of data representation from HTML to XML has enabled programs, instead of humans, to interact with biological databases. We present here SRS.php, a PHP library that can interact with the data integration Sequence Retrieval System (SRS). The library has been written using SOAP definitions, and permits programmatic communication with the SRS through web services. The interactions are possible by invoking the methods described in WSDL and exchanging XML messages. The current functions available in the library have been built to access specific data stored in any of the 90 different databases (such as UNIPROT, KEGG and GO) using the same query syntax format. Including the described functions in the source of scripts written in PHP enables those scripts to act as web service clients to the SRS server. The functions permit one to query the whole content of any SRS database, to list specific records in these databases, to get specific fields from the records, and to link any record among any pair of linked databases. The case study presented exemplifies the library usage to retrieve information regarding registries of a Plant Defense Mechanisms database. The Plant Defense Mechanisms database is currently being developed, and the purpose of using the SRS.php library is to enable data acquisition for the further warehousing tasks related to its setup and maintenance.
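
    The exchange of XML messages mentioned in this record can be pictured with a generic SOAP 1.1 request envelope. The sketch below is written in Python rather than PHP and does not use SRS.php; the service namespace, method name, and parameter names are hypothetical placeholders and are not taken from the SRS WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:srs"  # HYPOTHETICAL service namespace, not from SRS.php

def build_soap_request(method, params):
    """Return a SOAP 1.1 request envelope as an XML string."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The method call is a namespaced element inside the Body.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    # Each parameter becomes a child element of the method call.
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

# HYPOTHETICAL method and parameter names, for illustration only.
request_xml = build_soap_request(
    "queryDatabase", {"database": "UNIPROT", "query": "kinase"}
)
print(request_xml)
```

    A real client would read the method names and message types from the service's WSDL and send the envelope over HTTP.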

  9. JICST Factual Database

    NASA Astrophysics Data System (ADS)

    Hayase, Shuichi; Okano, Keiko

    The Japan Information Center of Science and Technology (JICST) started the on-line service of the JICST Crystal Structure Database (JICST CR) in January 1990. This database provides information on atomic positions in a crystal and related information about the crystal. The database system and the crystal data in JICST CR are outlined in this manuscript.

  10. The NCBI Taxonomy database.

    PubMed

    Federhen, Scott

    2012-01-01

    The NCBI Taxonomy database (http://www.ncbi.nlm.nih.gov/taxonomy) is the standard nomenclature and classification repository for the International Nucleotide Sequence Database Collaboration (INSDC), comprising the GenBank, ENA (EMBL) and DDBJ databases. It includes organism names and taxonomic lineages for each of the sequences represented in the INSDC's nucleotide and protein sequence databases. The taxonomy database is manually curated by a small group of scientists at the NCBI who use the current taxonomic literature to maintain a phylogenetic taxonomy for the source organisms represented in the sequence databases. The taxonomy database is a central organizing hub for many of the resources at the NCBI, and provides a means for clustering elements within other domains of the NCBI web site, for internal linking between domains of the Entrez system and for linking out to taxon-specific external resources on the web. Our primary purpose is to index the domain of sequences as conveniently as possible for our user community.
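
    A taxonomic lineage of the kind this record describes can be represented as a tree with parent pointers. The sketch below uses a toy table with made-up node ids (not real NCBI taxids) and assumes a root node that is its own parent; it is an illustration of the data structure, not of the database's actual schema.

```python
# Toy taxonomy: node id -> (name, parent id). Ids are MADE UP for illustration
# and are not real NCBI taxids; the root is assumed to point to itself.
TOY_TAXONOMY = {
    1: ("root", 1),
    2: ("Eukaryota", 1),
    3: ("Fungi", 2),
    4: ("Saccharomyces", 3),
    5: ("Saccharomyces cerevisiae", 4),
}

def lineage(node_id, taxonomy):
    """Walk parent pointers from a node up to the root; return names root-first."""
    names = []
    while True:
        name, parent = taxonomy[node_id]
        names.append(name)
        if parent == node_id:   # reached the self-parented root
            break
        node_id = parent
    return names[::-1]

print(" > ".join(lineage(5, TOY_TAXONOMY)))
```

    Storing only the parent of each node keeps the table compact, and any lineage can be rebuilt by following the pointers upward.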

  11. Exploring metabolic pathway disruption in the subchronic phencyclidine model of schizophrenia with the Generalized Singular Value Decomposition

    PubMed Central

    2011-01-01

    (KEGG) metabolite pathway database) were altered in the PFC of PCP-treated rats. Several significant changes were discovered, notably: 1) neuroactive ligands active at glutamate and GABA receptors are disrupted in the PFC of PCP-treated animals; 2) glutamate dysfunction in these animals was not limited to compromised glutamatergic neurotransmission but also involves the disruption of metabolic pathways linked to glutamate; and 3) a specific series of purine reactions, Xanthine ← Hypoxanthine ↔ Inosine ← IMP → adenylosuccinate, is also disrupted in the PFC of PCP-treated animals. Conclusions: Network reordering via the GSVD provides a means to discover statistically validated differences in clustering between a pair of networks. In practice this analytical approach, when applied to metabolomic data, allows us to quantify the alterations in metabolic pathways between two experimental groups. With this new computational technique we identified metabolic pathway alterations that are consistent with known results. Furthermore, we discovered disruption in a novel series of purine reactions that may contribute to the PFC dysfunction and cognitive deficits seen in schizophrenia. PMID:21575198

  12. The 2016 database issue of Nucleic Acids Research and an updated molecular biology database collection

    PubMed Central

    Rigden, Daniel J.; Fernández-Suárez, Xosé M.; Galperin, Michael Y.

    2016-01-01

    The 2016 Database Issue of Nucleic Acids Research starts with overviews of the resources provided by three major bioinformatics centers, the U.S. National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EMBL-EBI) and Swiss Institute for Bioinformatics (SIB). Also included are descriptions of 62 new databases and updates on 95 databases that have been previously featured in NAR plus 17 previously described elsewhere. A number of papers in this issue deal with resources on nucleic acids, including various kinds of non-coding RNAs and their interactions, molecular dynamics simulations of nucleic acid structure, and two databases of super-enhancers. The protein database section features important updates on the EBI's Pfam, PDBe and PRIDE databases, as well as a variety of resources on pathways, metabolomics and metabolic modeling. This issue also includes updates on popular metagenomics resources, such as MG-RAST, EBI Metagenomics, and probeBASE, as well as a newly compiled Human Pan-Microbe Communities database. A significant fraction of the new and updated databases are dedicated to the genetic basis of disease, primarily cancer, and various aspects of drug research, including resources for patented drugs, their side effects, withdrawn drugs, and potential drug targets. A further six papers present updated databases of various antimicrobial and anticancer peptides. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/). The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been updated with the addition of 88 new resources and removal of 23 obsolete websites, which brought the current listing to 1685 databases. PMID:26740669

  13. The 2016 database issue of Nucleic Acids Research and an updated molecular biology database collection.

    PubMed

    Rigden, Daniel J; Fernández-Suárez, Xosé M; Galperin, Michael Y

    2016-01-04

    The 2016 Database Issue of Nucleic Acids Research starts with overviews of the resources provided by three major bioinformatics centers, the U.S. National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EMBL-EBI) and Swiss Institute for Bioinformatics (SIB). Also included are descriptions of 62 new databases and updates on 95 databases that have been previously featured in NAR plus 17 previously described elsewhere. A number of papers in this issue deal with resources on nucleic acids, including various kinds of non-coding RNAs and their interactions, molecular dynamics simulations of nucleic acid structure, and two databases of super-enhancers. The protein database section features important updates on the EBI's Pfam, PDBe and PRIDE databases, as well as a variety of resources on pathways, metabolomics and metabolic modeling. This issue also includes updates on popular metagenomics resources, such as MG-RAST, EBI Metagenomics, and probeBASE, as well as a newly compiled Human Pan-Microbe Communities database. A significant fraction of the new and updated databases are dedicated to the genetic basis of disease, primarily cancer, and various aspects of drug research, including resources for patented drugs, their side effects, withdrawn drugs, and potential drug targets. A further six papers present updated databases of various antimicrobial and anticancer peptides. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/). The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been updated with the addition of 88 new resources and removal of 23 obsolete websites, which brought the current listing to 1685 databases.

  14. IDPredictor: predict database links in biomedical database.

    PubMed

    Mehlhorn, Hendrik; Lange, Matthias; Scholz, Uwe; Schreiber, Falk

    2012-06-26

    Knowledge found in biomedical databases, in particular in Web information systems, is a major bioinformatics resource. In general, this biological knowledge is represented worldwide in a network of databases. These data are spread among thousands of databases, which overlap in content but differ substantially with respect to content detail, interface, formats and data structure. To support functional annotation of lab data, such as protein sequences, metabolites or DNA sequences, as well as semi-automated data exploration in information retrieval environments, an integrated view of databases is essential. Search engines have the potential to assist in data retrieval from these structured sources, but fall short of providing comprehensive knowledge drawn from the interlinked databases. A prerequisite for supporting the concept of an integrated data view is to acquire insights into cross-references among database entities. This is hampered by the fact that only a fraction of all possible cross-references are explicitly tagged in the particular biomedical information systems. In this work, we investigate to what extent an automated construction of an integrated data network is possible. We propose a method that predicts and extracts cross-references from multiple life science databases and possible referenced data targets. We study the retrieval quality of our method and report on first, promising results. The method is implemented as the tool IDPredictor, which is published under the DOI 10.5447/IPK/2012/4 and is freely available using the URL: http://dx.doi.org/10.5447/IPK/2012/4.

  15. SENTRA, a database of signal transduction proteins.

    SciTech Connect

    D'Souza, M.; Romine, M. F.; Maltsev, N.; Mathematics and Computer Science; PNNL

    2000-01-01

    SENTRA, available via URL http://wit.mcs.anl.gov/WIT2/Sentra/, is a database of proteins associated with microbial signal transduction. The database currently includes the classical two-component signal transduction pathway proteins and methyl-accepting chemotaxis proteins, but will be expanded to also include other classes of signal transduction systems that are modulated by phosphorylation or methylation reactions. Although the majority of database entries are from prokaryotic systems, eukaryotic proteins with bacterial-like signal transduction domains are also included. Currently SENTRA contains signal transduction proteins in 34 complete and almost completely sequenced prokaryotic genomes, as well as sequences from 243 organisms available in public databases (SWISS-PROT and EMBL). The analysis was carried out within the framework of the WIT2 system, which is designed and implemented to support genetic sequence analysis and comparative analysis of sequenced genomes.

  16. An Introduction to Database Structure and Database Machines.

    ERIC Educational Resources Information Center

    Detweiler, Karen

    1984-01-01

    Enumerates principal management objectives of database management systems (data independence, quality, security, multiuser access, central control) and criteria for comparison (response time, size, flexibility, other features). Conventional database management systems, relational databases, and database machines used for backend processing are…

  17. MetaMapp: mapping and visualizing metabolomic data by integrating information from biochemical pathways and chemical and mass spectral similarity

    PubMed Central

    2012-01-01

    Background: Exposure to environmental tobacco smoke (ETS) leads to higher rates of pulmonary diseases and infections in children. To study the biochemical changes that may precede lung diseases, metabolomic effects on fetal and maternal lungs and plasma from rats exposed to ETS were compared to filtered air control animals. Genome-reconstructed metabolic pathways may be used to map and interpret dysregulation in metabolic networks. However, mass spectrometry-based non-targeted metabolomics datasets often comprise many metabolites for which links to enzymatic reactions have not yet been reported. Hence, network visualizations that rely on current biochemical databases are incomplete and also fail to visualize novel, structurally unidentified metabolites. Results: We present a novel approach to integrate biochemical pathway and chemical relationships to map all detected metabolites in network graphs (MetaMapp) using the KEGG reactant pair database, Tanimoto chemical and NIST mass spectral similarity scores. In fetal and maternal lungs, and in maternal blood plasma from pregnant rats exposed to environmental tobacco smoke (ETS), 459 unique metabolites comprising 179 structurally identified compounds were detected by gas chromatography time of flight mass spectrometry (GC-TOF MS) and BinBase data processing. MetaMapp graphs in Cytoscape showed much clearer metabolic modularity and complete content visualization compared to conventional biochemical mapping approaches. Cytoscape visualization of differential statistics results using these graphs showed that overall, fetal lung metabolism was more impaired than lungs and blood metabolism in dams. Fetuses from ETS-exposed dams expressed lower lipid and nucleotide levels and higher amounts of energy metabolism intermediates than control animals, indicating lower biosynthetic rates of metabolites for cell division, structural proteins and lipids that are critical for lung development. Conclusions: MetaMapp graphs efficiently

  18. 2010 Worldwide Gasification Database

    DOE Data Explorer

    The 2010 Worldwide Gasification Database describes the current world gasification industry and identifies near-term planned capacity additions. The database lists gasification projects and includes information (e.g., plant location, number and type of gasifiers, syngas capacity, feedstock, and products). The database reveals that the worldwide gasification capacity has continued to grow for the past several decades and is now at 70,817 megawatts thermal (MWth) of syngas output at 144 operating plants with a total of 412 gasifiers.

  19. ITS-90 Thermocouple Database

    National Institute of Standards and Technology Data Gateway

    SRD 60 NIST ITS-90 Thermocouple Database (Web, free access)   Web version of Standard Reference Database 60 and NIST Monograph 175. The database gives temperature -- electromotive force (emf) reference functions and tables for the letter-designated thermocouple types B, E, J, K, N, R, S and T. These reference functions have been adopted as standards by the American Society for Testing and Materials (ASTM) and the International Electrotechnical Commission (IEC).

  20. Veterans Administration Databases

    Cancer.gov

    The Veterans Administration Information Resource Center provides database and informatics experts, customer service, expert advice, information products, and web technology to VA researchers and others.

  1. Mugshot Identification Database (MID)

    National Institute of Standards and Technology Data Gateway

    NIST Mugshot Identification Database (MID) (PC database for purchase)   NIST Special Database 18 is being distributed for use in development and testing of automated mugshot identification systems. The database consists of three CD-ROMs, containing a total of 3248 images of variable size using lossless compression. A newer version of the compression/decompression software on the CDROM can be found at the website http://www.nist.gov/itl/iad/ig/nigos.cfm as part of the NBIS package.

  2. HIV Sequence Databases

    PubMed Central

    Kuiken, Carla; Korber, Bette; Shafer, Robert W.

    2008-01-01

    Two important databases are often used in HIV genetic research, the HIV Sequence Database in Los Alamos, which collects all sequences and focuses on annotation and data analysis, and the HIV RT/Protease Sequence Database in Stanford, which collects sequences associated with the development of viral resistance against anti-retroviral drugs and focuses on analysis of those sequences. The types of data and services these two databases offer, the tools they provide, and the way they are set up and operated are described in detail. PMID:12875108

  3. FOAM (Functional Ontology Assignments for Metagenomes): A Hidden Markov Model (HMM) database with environmental focus

    SciTech Connect

    Prestat, Emmanuel; David, Maude M.; Hultman, Jenni; Taş, Neslihan; Lamendella, Regina; Dvornik, Jill; Mackelprang, Rachel; Myrold, David D.; Jumpponen, Ari; Tringe, Susannah G.; Holman, Elizabeth; Mavromatis, Konstantinos; Jansson, Janet K.

    2014-09-26

    A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classify gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. ‘profiles’) were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.

  4. The BioPAX community standard for pathway data sharing.

    PubMed

    Demir, Emek; Cary, Michael P; Paley, Suzanne; Fukuda, Ken; Lemer, Christian; Vastrik, Imre; Wu, Guanming; D'Eustachio, Peter; Schaefer, Carl; Luciano, Joanne; Schacherer, Frank; Martinez-Flores, Irma; Hu, Zhenjun; Jimenez-Jacinto, Veronica; Joshi-Tope, Geeta; Kandasamy, Kumaran; Lopez-Fuentes, Alejandra C; Mi, Huaiyu; Pichler, Elgar; Rodchenkov, Igor; Splendiani, Andrea; Tkachev, Sasha; Zucker, Jeremy; Gopinath, Gopal; Rajasimha, Harsha; Ramakrishnan, Ranjani; Shah, Imran; Syed, Mustafa; Anwar, Nadia; Babur, Ozgün; Blinov, Michael; Brauner, Erik; Corwin, Dan; Donaldson, Sylva; Gibbons, Frank; Goldberg, Robert; Hornbeck, Peter; Luna, Augustin; Murray-Rust, Peter; Neumann, Eric; Ruebenacker, Oliver; Reubenacker, Oliver; Samwald, Matthias; van Iersel, Martijn; Wimalaratne, Sarala; Allen, Keith; Braun, Burk; Whirl-Carrillo, Michelle; Cheung, Kei-Hoi; Dahlquist, Kam; Finney, Andrew; Gillespie, Marc; Glass, Elizabeth; Gong, Li; Haw, Robin; Honig, Michael; Hubaut, Olivier; Kane, David; Krupa, Shiva; Kutmon, Martina; Leonard, Julie; Marks, Debbie; Merberg, David; Petri, Victoria; Pico, Alex; Ravenscroft, Dean; Ren, Liya; Shah, Nigam; Sunshine, Margot; Tang, Rebecca; Whaley, Ryan; Letovksy, Stan; Buetow, Kenneth H; Rzhetsky, Andrey; Schachter, Vincent; Sobral, Bruno S; Dogrusoz, Ugur; McWeeney, Shannon; Aladjem, Mirit; Birney, Ewan; Collado-Vides, Julio; Goto, Susumu; Hucka, Michael; Le Novère, Nicolas; Maltsev, Natalia; Pandey, Akhilesh; Thomas, Paul; Wingender, Edgar; Karp, Peter D; Sander, Chris; Bader, Gary D

    2010-09-01

    Biological Pathway Exchange (BioPAX) is a standard language to represent biological pathways at the molecular and cellular level and to facilitate the exchange of pathway data. The rapid growth of the volume of pathway data has spurred the development of databases and computational tools to aid interpretation; however, use of these data is hampered by the current fragmentation of pathway information across many databases with incompatible formats. BioPAX, which was created through a community process, solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. Using BioPAX, millions of interactions, organized into thousands of pathways, from many organisms are available from a growing number of databases. This large amount of pathway data in a computable form will support visualization, analysis and biological discovery.

  5. The BioPAX community standard for pathway

    SciTech Connect

    Syed, Mustafa H

    2010-01-01

    Biological Pathway Exchange (BioPAX) is a standard language to represent biological pathways at the molecular and cellular level and to facilitate the exchange of pathway data. The rapid growth of the volume of pathway data has spurred the development of databases and computational tools to aid interpretation; however, use of these data is hampered by the current fragmentation of pathway information across many databases with incompatible formats. BioPAX, which was created through a community process, solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. Using BioPAX, millions of interactions, organized into thousands of pathways, from many organisms are available from a growing number of databases. This large amount of pathway data in a computable form will support visualization, analysis and biological discovery.

  6. Consumer Product Category Database

    EPA Pesticide Factsheets

    The Chemical and Product Categories database (CPCat) catalogs the use of over 40,000 chemicals and their presence in different consumer products. The chemical use information is compiled from multiple sources while product information is gathered from publicly available Material Safety Data Sheets (MSDS). EPA researchers are evaluating the possibility of expanding the database with additional product and use information.

  7. BioImaging Database

    SciTech Connect

    David Nix, Lisa Simirenko

    2006-10-25

    The BioImaging Database (BID) is a relational database developed to store the data and meta-data for the 3D gene expression in early Drosophila embryo development on a cellular level. The schema was written to be used with the MySQL DBMS but with minor modifications can be used on any SQL-compliant relational DBMS.

  8. Biological Macromolecule Crystallization Database

    National Institute of Standards and Technology Data Gateway

    SRD 21 Biological Macromolecule Crystallization Database (Web, free access)   The Biological Macromolecule Crystallization Database and NASA Archive for Protein Crystal Growth Data (BMCD) contains the conditions reported for the crystallization of proteins and nucleic acids used in X-ray structure determinations and archives the results of microgravity macromolecule crystallization studies.

  9. Online Database Searching Workbook.

    ERIC Educational Resources Information Center

    Littlejohn, Alice C.; Parker, Joan M.

    Designed primarily for use by first-time searchers, this workbook provides an overview of online searching. Following a brief introduction which defines online searching, databases, and database producers, five steps in carrying out a successful search are described: (1) identifying the main concepts of the search statement; (2) selecting a…

  10. HIV Structural Database

    National Institute of Standards and Technology Data Gateway

    SRD 102 HIV Structural Database (Web, free access)   The HIV Protease Structural Database is an archive of experimentally determined 3-D structures of Human Immunodeficiency Virus 1 (HIV-1), Human Immunodeficiency Virus 2 (HIV-2) and Simian Immunodeficiency Virus (SIV) Proteases and their complexes with inhibitors or products of substrate cleavage.

  11. Atomic Spectra Database (ASD)

    National Institute of Standards and Technology Data Gateway

    SRD 78 NIST Atomic Spectra Database (ASD) (Web, free access)   This database provides access and search capability for NIST critically evaluated data on atomic energy levels, wavelengths, and transition probabilities that are reasonably up-to-date. The NIST Atomic Spectroscopy Data Center has carried out these critical compilations.

  12. Structural Ceramics Database

    National Institute of Standards and Technology Data Gateway

    SRD 30 NIST Structural Ceramics Database (Web, free access)   The NIST Structural Ceramics Database (WebSCD) provides evaluated materials property data for a wide range of advanced ceramics known variously as structural ceramics, engineering ceramics, and fine ceramics.

  13. Morchella MLST database

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Welcome to the Morchella MLST database. This dedicated database was set up at the CBS-KNAW Biodiversity Center by Vincent Robert in February 2012, using BioloMICS software (Robert et al., 2011), to facilitate DNA sequence-based identifications of Morchella species via the Internet. The current datab...

  14. A Quality System Database

    NASA Technical Reports Server (NTRS)

    Snell, William H.; Turner, Anne M.; Gifford, Luther; Stites, William

    2010-01-01

    A quality system database (QSD), and software to administer the database, were developed to support recording of administrative nonconformance activities that involve requirements for documentation of corrective and/or preventive actions, which can include ISO 9000 internal quality audits and customer complaints.

  15. Knowledge Discovery in Databases.

    ERIC Educational Resources Information Center

    Norton, M. Jay

    1999-01-01

    Knowledge discovery in databases (KDD) revolves around the investigation and creation of knowledge, processes, algorithms, and mechanisms for retrieving knowledge from data collections. The article is an introductory overview of KDD. The rationale and environment of its development and applications are discussed. Issues related to database design…

  16. Ionic Liquids Database- (ILThermo)

    National Institute of Standards and Technology Data Gateway

    SRD 147 Ionic Liquids Database- (ILThermo) (Web, free access)   IUPAC Ionic Liquids Database, ILThermo, is a free web research tool that allows users worldwide to access an up-to-date data collection from the publications on experimental investigations of thermodynamic, and transport properties of ionic liquids as well as binary and ternary mixtures containing ionic liquids.

  17. Database Reviews: Legal Information.

    ERIC Educational Resources Information Center

    Seiser, Virginia

    Detailed reviews of two legal information databases--"Laborlaw I" and "Legal Resource Index"--are presented in this paper. Each database review begins with a bibliographic entry listing the title; producer; vendor; cost per hour contact time; offline print cost per citation; time period covered; frequency of updates; and size…

  18. An extended bioreaction database that significantly improves reconstruction and analysis of genome-scale metabolic networks.

    PubMed

    Stelzer, Michael; Sun, Jibin; Kamphans, Tom; Fekete, Sándor P; Zeng, An-Ping

    2011-11-01

    The bioreaction database established by Ma and Zeng (Bioinformatics, 2003, 19, 270-277) for in silico reconstruction of genome-scale metabolic networks has been widely used. Based on more recent information in the reference databases KEGG LIGAND and Brenda, we upgrade the bioreaction database in this work by almost doubling the number of reactions from 3565 to 6851. Over 70% of the reactions have been manually updated/revised in terms of reversibility, reactant pairs, currency metabolites and error correction. For the first time, 41 spontaneous sugar mutarotation reactions are introduced into the biochemical database. The upgrade significantly improves the reconstruction of genome-scale metabolic networks. Many gaps or missing biochemical links can be recovered, as exemplified with three model organisms Homo sapiens, Aspergillus niger, and Escherichia coli. The topological parameters of the constructed networks were also largely affected; however, the overall network structure remains scale-free. Furthermore, we consider the problem of computing biologically feasible shortest paths in reconstructed metabolic networks. We show that these paths are hard to compute and present solutions to find such paths in networks of small and medium size.
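
    As a rough illustration of why currency metabolites matter when searching for paths in a reconstructed network, the sketch below runs a plain breadth-first search over reactant pairs and drops every pair that touches a currency metabolite. It is not the authors' method: the abstract notes that biologically feasible shortest paths are hard to compute, and this toy only shows the filtering idea. The function names, the `(substrate, product, reversible)` tuple layout, and the example data are all assumptions made for illustration.

```python
from collections import deque


def build_graph(reactant_pairs, currency):
    """Build a directed adjacency map from (substrate, product, reversible) tuples.

    Pairs that involve a currency metabolite are skipped so that hubs such as
    ATP cannot create biologically meaningless shortcuts.
    """
    graph = {}
    for substrate, product, reversible in reactant_pairs:
        if substrate in currency or product in currency:
            continue
        graph.setdefault(substrate, set()).add(product)
        graph.setdefault(product, set())
        if reversible:
            graph[product].add(substrate)
    return graph


def shortest_path(graph, source, target):
    """Return one shortest path as a list of metabolites, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        # Sorted iteration keeps the result deterministic between runs.
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical toy data, not taken from the database described above.
PAIRS = [
    ("glucose", "G6P", False),
    ("G6P", "F6P", True),
    ("F6P", "FBP", False),
    ("glucose", "ATP", True),  # artificial shortcut through a currency metabolite
    ("ATP", "FBP", True),
]
CURRENCY = {"ATP", "ADP", "NADH", "NAD+"}
```

    With the currency set applied, the search from `glucose` to `FBP` follows the three-step route through `G6P` and `F6P`; with an empty currency set it takes the two-step shortcut through `ATP`.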

  19. Deriving pathway maps from automated text analysis using a grammar-based approach.

    PubMed

    Olsson, Björn; Gawronska, Barbara; Erlendsson, Björn

    2006-04-01

    We demonstrate how automated text analysis can be used to support the large-scale analysis of metabolic and regulatory pathways by deriving pathway maps from textual descriptions found in the scientific literature. The main assumption is that correct syntactic analysis combined with domain-specific heuristics provides a good basis for relation extraction. Our method uses an algorithm that searches through the syntactic trees produced by a parser based on a Referent Grammar formalism, identifies relations mentioned in the sentence, and classifies them with respect to their semantic class and epistemic status (facts, counterfactuals, hypotheses). The semantic categories used in the classification are based on the relation set used in KEGG (Kyoto Encyclopedia of Genes and Genomes), so that pathway maps using KEGG notation can be automatically generated. We present the current version of the relation extraction algorithm and an evaluation based on a corpus of abstracts obtained from PubMed. The results indicate that the method is able to combine a reasonable coverage with high accuracy. We found that 61% of all sentences were parsed, and 97% of the parse trees were judged to be correct. The extraction algorithm was tested on a sample of 300 parse trees and was found to produce correct extractions in 90.5% of the cases.

  20. DDTRP: Database of Drug Targets for Resistant Pathogens

    PubMed Central

    Sundaramurthi, Jagadish Chandrabose; Ramanandan, Prabhakaran; Brindha, Sridharan; Subhasree, Chelladurai Ramarathnam; Prasad, Abhimanyu; Kumaraswami, Vasanthapuram; Hanna, Luke Elizabeth

    2011-01-01

    Emergence of drug resistance is a major threat to public health. Many pathogens have developed resistance to most of the existing antibiotics, and multidrug-resistant and extensively drug resistant strains are extremely difficult to treat. This has resulted in an urgent need for novel drugs. We describe a database called ‘Database of Drug Targets for Resistant Pathogens’ (DDTRP). The database contains information on drugs with reported resistance, their respective targets, metabolic pathways involving these targets, and a list of potential alternate targets for seven pathogens. The database can be accessed freely at http://bmi.icmr.org.in/DDTRP. PMID:21938213

  1. DIP: The Database of Interacting Proteins

    DOE Data Explorer

    The DIP Database catalogs experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent set of protein-protein interactions. By interaction, the DIP Database creators mean that two amino acid chains were experimentally identified to bind to each other. The database lists such pairs to aid those studying a particular protein-protein interaction but also those investigating entire regulatory and signaling pathways as well as those studying the organisation and complexity of the protein interaction network at the cellular level. The data stored within the DIP database were curated, both, manually by expert curators and also automatically using computational approaches that utilize the knowledge about the protein-protein interaction networks extracted from the most reliable, core subset of the DIP data. It is a relational database that can be searched by protein, sequence, motif, article information, and pathBLAST. The website also serves as an access point to a number of projects related to DIP, such as LiveDIP, The Database of Ligand-Receptor Partners (DLRP) and JDIP. Users have free and open access to DIP after login. [Taken from the DIP Guide and the DIP website] (Specialized Interface) (Registration Required)

  2. National Database of Geriatrics

    PubMed Central

    Kannegaard, Pia Nimann; Vinding, Kirsten L; Hare-Bruun, Helle

    2016-01-01

    Aim of database The aim of the National Database of Geriatrics is to monitor the quality of interdisciplinary diagnostics and treatment of patients admitted to a geriatric hospital unit. Study population The database population consists of patients who were admitted to a geriatric hospital unit. Geriatric patients cannot be defined by specific diagnoses. A geriatric patient is typically a frail multimorbid elderly patient with decreasing functional ability and social challenges. The database includes 14–15,000 admissions per year, and the database completeness has been stable at 90% during the past 5 years. Main variables An important part of the geriatric approach is the interdisciplinary collaboration. Indicators, therefore, reflect the combined efforts directed toward the geriatric patient. The indicators include Barthel index, body mass index, de Morton Mobility Index, Chair Stand, percentage of discharges with a rehabilitation plan, and the part of cases where an interdisciplinary conference has taken place. Data are recorded by doctors, nurses, and therapists in a database and linked to the Danish National Patient Register. Descriptive data Descriptive patient-related data include information about home, mobility aid, need of fall and/or cognitive diagnosing, and categorization of cause (general geriatric, orthogeriatric, or neurogeriatric). Conclusion The National Database of Geriatrics covers ∼90% of geriatric admissions in Danish hospitals and provides valuable information about a large and increasing patient population in the health care system. PMID:27822120

  3. Hazard Analysis Database Report

    SciTech Connect

    GRAMS, W.H.

    2000-12-28

    The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for U.S. Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for HNF-SD-WM-SAR-067, Tank Farms Final Safety Analysis Report (FSAR). The FSAR is part of the approved Authorization Basis (AB) for the River Protection Project (RPP). This document describes, identifies, and defines the contents and structure of the Tank Farms FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The Hazard Analysis Database supports the preparation of Chapters 3, 4, and 5 of the Tank Farms FSAR and the Unreviewed Safety Question (USQ) process and consists of two major, interrelated data sets: (1) Hazard Analysis Database: Data from the results of the hazard evaluations, and (2) Hazard Topography Database: Data from the system familiarization and hazard identification.

  4. Glycoproteomic and glycomic databases.

    PubMed

    Baycin Hizal, Deniz; Wolozny, Daniel; Colao, Joseph; Jacobson, Elena; Tian, Yuan; Krag, Sharon S; Betenbaugh, Michael J; Zhang, Hui

    2014-01-01

    Protein glycosylation serves critical roles in the cellular and biological processes of many organisms. Aberrant glycosylation has been associated with many illnesses such as hereditary and chronic diseases like cancer, cardiovascular diseases, neurological disorders, and immunological disorders. Emerging mass spectrometry (MS) technologies that enable the high-throughput identification of glycoproteins and glycans have accelerated the analysis and made possible the creation of dynamic and expanding databases. Although glycosylation-related databases have been established by many laboratories and institutions, they are not yet widely known in the community. Our study reviews 15 different publicly available databases and identifies their key elements so that users can identify the most applicable platform for their analytical needs. These databases include biological information on the experimentally identified glycans and glycopeptides from various cells and organisms such as human, rat, mouse, fly and zebrafish. The features of these databases - 7 for glycoproteomic data, 6 for glycomic data, and 2 for glycan binding proteins are summarized including the enrichment techniques that are used for glycoproteome and glycan identification. Furthermore databases such as Unipep, GlycoFly, GlycoFish recently established by our group are introduced. The unique features of each database, such as the analytical methods used and bioinformatical tools available are summarized. This information will be a valuable resource for the glycobiology community as it presents the analytical methods and glycosylation related databases together in one compendium. It will also represent a step towards the desired long term goal of integrating the different databases of glycosylation in order to characterize and categorize glycoproteins and glycans better for biomedical research.

  5. Phase Equilibria Diagrams Database

    National Institute of Standards and Technology Data Gateway

    SRD 31 NIST/ACerS Phase Equilibria Diagrams Database (PC database for purchase)   The Phase Equilibria Diagrams Database contains commentaries and more than 21,000 diagrams for non-organic systems, including those published in all 21 hard-copy volumes produced as part of the ACerS-NIST Phase Equilibria Diagrams Program (formerly titled Phase Diagrams for Ceramists): Volumes I through XIV (blue books); Annuals 91, 92, 93; High Tc Superconductors I & II; Zirconium & Zirconia Systems; and Electronic Ceramics I. Materials covered include oxides as well as non-oxide systems such as chalcogenides and pnictides, phosphates, salt systems, and mixed systems of these classes.

  6. JICST Factual Database

    NASA Astrophysics Data System (ADS)

    Suzuki, Kazuaki; Shimura, Kazuki; Monma, Yoshio; Sakamoto, Masao; Morishita, Hiroshi; Kanazawa, Kenji

    The Japan Information Center of Science and Technology (JICST) started the on-line service of the JICST/NRIM Materials Strength Database for Engineering Steels and Alloys (JICST ME) in March 1990. This database has been developed through joint research between JICST and the National Research Institute for Metals (NRIM). It provides material strength data (creep, fatigue, etc.) for engineering steels and alloys. Users can search and display data on-line, analyze the retrieved data statistically, and plot the results on a graphic display. The database system and the data in JICST ME are described.

  7. Plant Genome Duplication Database.

    PubMed

    Lee, Tae-Ho; Kim, Junah; Robertson, Jon S; Paterson, Andrew H

    2017-01-01

    Genome duplication, widespread in flowering plants, is a driving force in evolution. Genome alignments between/within genomes facilitate identification of homologous regions and individual genes to investigate evolutionary consequences of genome duplication. PGDD (the Plant Genome Duplication Database), a public web service database, provides intra- or interplant genome alignment information. At present, PGDD contains information for 47 plants whose genome sequences have been released. Here, we describe methods for identification and estimation of dates of genome duplication and speciation by functions of PGDD. The database is freely available at http://chibba.agtec.uga.edu/duplication/.

  8. A computational platform to maintain and migrate manual functional annotations for BioCyc databases

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Model organism databases are an important resource for information on biological pathways and genomic data. Such databases represent the accumulation of biological data, some of which has been manually curated from literature. An essential feature of these databases is the continuing data integratio...

  9. Numeric Databases in the Sciences.

    ERIC Educational Resources Information Center

    Meschel, S. V.

    1984-01-01

    Provides exploration into types of numeric databases available (also known as source databases, nonbibliographic databases, data-files, data-banks, fact banks); examines differences and similarities between bibliographic and numeric databases; identifies disciplines that utilize numeric databases; and surveys representative examples in the…

  10. Dynameomics: A comprehensive database of protein dynamics

    PubMed Central

    van der Kamp, Marc W.; Schaeffer, Richard D.; Jonsson, Amanda L.; Scouras, Alexander D.; Simms, Andrew; Toofanny, Rudesh D.; Benson, Noah C.; Anderson, Peter C.; Merkley, Eric D.; Rysavy, Steve; Bromley, Denny; Beck, David A. C.; Daggett, Valerie

    2010-01-01

    Summary The dynamic behavior of proteins is important for an understanding of their function and folding. We have performed molecular dynamics simulations of the native state and unfolding pathways of over 1000 proteins, representing the majority of folds in globular proteins. These data are stored and organized using an innovative database approach, which can be mined to obtain both general and specific information about the dynamics and folding/unfolding of proteins, relevant subsets thereof, and individual proteins. Here we describe the project in general terms and the type of information contained in the database. Then we provide examples of mining the database for information relevant to protein folding, structure building, the effect of single-nucleotide polymorphisms, and drug design. The native state simulation data and corresponding analyses for the 100 most populated metafolds, together with related resources, are publicly accessible through www.dynameomics.org. PMID:20399180

  11. The ChEMBL database in 2017

    PubMed Central

    Gaulton, Anna; Hersey, Anne; Nowotka, Michał; Bento, A. Patrícia; Chambers, Jon; Mendez, David; Mutowo, Prudence; Atkinson, Francis; Bellis, Louisa J.; Cibrián-Uhalte, Elena; Davies, Mark; Dedman, Nathan; Karlsson, Anneli; Magariños, María Paula; Overington, John P.; Papadatos, George; Smit, Ines; Leach, Andrew R.

    2017-01-01

    ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 and 2014 Nucleic Acids Research Database Issues. Since then, alongside the continued extraction of data from the medicinal chemistry literature, new sources of bioactivity data have also been added to the database. These include: deposited data sets from neglected disease screening; crop protection data; drug metabolism and disposition data and bioactivity data from patents. A number of improvements and new features have also been incorporated. These include the annotation of assays and targets using ontologies, the inclusion of targets and indications for clinical candidates, addition of metabolic pathways for drugs and calculation of structural alerts. The ChEMBL data can be accessed via a web-interface, RDF distribution, data downloads and RESTful web-services. PMID:27899562

  12. The ChEMBL database in 2017.

    PubMed

    Gaulton, Anna; Hersey, Anne; Nowotka, Michał; Bento, A Patrícia; Chambers, Jon; Mendez, David; Mutowo, Prudence; Atkinson, Francis; Bellis, Louisa J; Cibrián-Uhalte, Elena; Davies, Mark; Dedman, Nathan; Karlsson, Anneli; Magariños, María Paula; Overington, John P; Papadatos, George; Smit, Ines; Leach, Andrew R

    2017-01-04

    ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 and 2014 Nucleic Acids Research Database Issues. Since then, alongside the continued extraction of data from the medicinal chemistry literature, new sources of bioactivity data have also been added to the database. These include: deposited data sets from neglected disease screening; crop protection data; drug metabolism and disposition data and bioactivity data from patents. A number of improvements and new features have also been incorporated. These include the annotation of assays and targets using ontologies, the inclusion of targets and indications for clinical candidates, addition of metabolic pathways for drugs and calculation of structural alerts. The ChEMBL data can be accessed via a web-interface, RDF distribution, data downloads and RESTful web-services.

  13. THE CTEPP DATABASE

    EPA Science Inventory

    The CTEPP (Children's Total Exposure to Persistent Pesticides and Other Persistent Organic Pollutants) database contains a wealth of data on children's aggregate exposures to pollutants in their everyday surroundings. Chemical analysis data for the environmental media and ques...

  14. Chemical Kinetics Database

    National Institute of Standards and Technology Data Gateway

    SRD 17 NIST Chemical Kinetics Database (Web, free access)   The NIST Chemical Kinetics Database includes essentially all reported kinetics results for thermal gas-phase chemical reactions. The database is designed to be searched for kinetics data based on the specific reactants involved, for reactions resulting in specified products, for all the reactions of a particular species, or for various combinations of these. In addition, the bibliography can be searched by author name or combination of names. The database contains in excess of 38,000 separate reaction records for over 11,700 distinct reactant pairs. These data have been abstracted from over 12,000 papers with literature coverage through early 2000.

  15. Hawaii bibliographic database

    NASA Astrophysics Data System (ADS)

    Wright, Thomas L.; Takahashi, Taeko Jane

    The Hawaii bibliographic database has been created to contain all of the literature, from 1779 to the present, pertinent to the volcanological history of the Hawaiian-Emperor volcanic chain. References are entered in a PC- and Macintosh-compatible EndNote Plus bibliographic database with keywords and abstracts or (if no abstract) with annotations as to content. Keywords emphasize location, discipline, process, identification of new chemical data or age determinations, and type of publication. The database is updated approximately three times a year and is available to upload from an ftp site. The bibliography contained 8460 references at the time this paper was submitted for publication. Use of the database greatly enhances the power and completeness of library searches for anyone interested in Hawaiian volcanism.

  16. Enhancing medical database security.

    PubMed

    Pangalos, G; Khair, M; Bozios, L

    1994-08-01

    A methodology for the enhancement of database security in a hospital environment is presented in this paper which is based on both the discretionary and the mandatory database security policies. In this way the advantages of both approaches are combined to enhance medical database security. An appropriate classification of the different types of users according to their different needs and roles and a User Role Definition Hierarchy has been used. The experience obtained from the experimental implementation of the proposed methodology in a major general hospital is briefly discussed. The implementation has shown that the combined discretionary and mandatory security enforcement effectively limits the unauthorized access to the medical database, without severely restricting the capabilities of the system.

  17. Uranium Location Database Compilation

    EPA Pesticide Factsheets

    EPA has compiled mine location information from federal, state, and Tribal agencies into a single database as part of its investigation into the potential environmental hazards of wastes from abandoned uranium mines in the western United States.

  18. Livestock Anaerobic Digester Database

    EPA Pesticide Factsheets

    The Anaerobic Digester Database provides basic information about anaerobic digesters on livestock farms in the United States, organized in Excel spreadsheets. It includes projects that are under construction, operating, or shut down.

  19. Hawaii bibliographic database

    USGS Publications Warehouse

    Wright, T.L.; Takahashi, T.J.

    1998-01-01

    The Hawaii bibliographic database has been created to contain all of the literature, from 1779 to the present, pertinent to the volcanological history of the Hawaiian-Emperor volcanic chain. References are entered in a PC- and Macintosh-compatible EndNote Plus bibliographic database with keywords and abstracts or (if no abstract) with annotations as to content. Keywords emphasize location, discipline, process, identification of new chemical data or age determinations, and type of publication. The database is updated approximately three times a year and is available to upload from an ftp site. The bibliography contained 8460 references at the time this paper was submitted for publication. Use of the database greatly enhances the power and completeness of library searches for anyone interested in Hawaiian volcanism.

  20. Nuclear Science References Database

    SciTech Connect

    Pritychenko, B.; Běták, E.; Singh, B.; Totans, J.

    2014-06-15

    The Nuclear Science References (NSR) database, together with its associated Web interface, is the world's only comprehensive source of easily accessible low- and intermediate-energy nuclear physics bibliographic information for more than 210,000 articles since the beginning of nuclear science. The weekly-updated NSR database provides essential support for nuclear data evaluation, compilation and research activities. The principles of the database and Web application development and maintenance are described. Examples of nuclear structure, reaction and decay applications are specifically included. The complete NSR database is freely available at the websites of the National Nuclear Data Center (http://www.nndc.bnl.gov/nsr) and the International Atomic Energy Agency (http://www-nds.iaea.org/nsr).

  1. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1994-05-27

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.

  2. Cancer Metabolomics and the Human Metabolome Database

    PubMed Central

    Wishart, David S.; Mandal, Rupasri; Stanislaus, Avalyn; Ramirez-Gaona, Miguel

    2016-01-01

    The application of metabolomics towards cancer research has led to a renewed appreciation of metabolism in cancer development and progression. It has also led to the discovery of metabolite cancer biomarkers and the identification of a number of novel cancer-causing metabolites. The rapid growth of metabolomics in cancer research is also leading to challenges. In particular, with so many cancer-associated metabolites being identified, it is often difficult to keep track of which compounds are associated with which cancers. It is also challenging to track down information on the specific pathways that particular metabolites, drugs or drug metabolites may be affecting. Even more frustrating are the difficulties associated with identifying metabolites from NMR or MS spectra. Fortunately, a number of metabolomics databases are emerging that are designed to address these challenges. One such database is the Human Metabolome Database (HMDB). The HMDB is currently the world’s largest and most comprehensive, organism-specific metabolomics database. It contains more than 40,000 metabolite entries, thousands of metabolite concentrations, >700 metabolic and disease-associated pathways, as well as information on dozens of cancer biomarkers. This review is intended to provide a brief summary of the HMDB and to offer some guidance on how it can be used in metabolomic studies of cancer. PMID:26950159

  3. Cancer Metabolomics and the Human Metabolome Database.

    PubMed

    Wishart, David S; Mandal, Rupasri; Stanislaus, Avalyn; Ramirez-Gaona, Miguel

    2016-03-02

    The application of metabolomics towards cancer research has led to a renewed appreciation of metabolism in cancer development and progression. It has also led to the discovery of metabolite cancer biomarkers and the identification of a number of novel cancer-causing metabolites. The rapid growth of metabolomics in cancer research is also leading to challenges. In particular, with so many cancer-associated metabolites being identified, it is often difficult to keep track of which compounds are associated with which cancers. It is also challenging to track down information on the specific pathways that particular metabolites, drugs or drug metabolites may be affecting. Even more frustrating are the difficulties associated with identifying metabolites from NMR or MS spectra. Fortunately, a number of metabolomics databases are emerging that are designed to address these challenges. One such database is the Human Metabolome Database (HMDB). The HMDB is currently the world's largest and most comprehensive, organism-specific metabolomics database. It contains more than 40,000 metabolite entries, thousands of metabolite concentrations, >700 metabolic and disease-associated pathways, as well as information on dozens of cancer biomarkers. This review is intended to provide a brief summary of the HMDB and to offer some guidance on how it can be used in metabolomic studies of cancer.

  4. Querying genomic databases

    SciTech Connect

    Baehr, A.; Hagstrom, R.; Joerg, D.; Overbeek, R.

    1991-09-01

    A natural-language interface has been developed that retrieves genomic information by using a simple subset of English. The interface spares the biologist from the task of learning database-specific query languages and computer programming. Currently, the interface deals with the E. coli genome. It can, however, be readily extended and shows promise as a means of easy access to other sequenced genomic databases as well.

  5. Database computing in HEP

    SciTech Connect

    Day, C.T.; Loken, S.; MacFarlane, J.F.; May, E.; Lifka, D.; Lusk, E.; Price, L.E.; Baden, A.; Grossman, R.; Qin, X.; Cormell, L.; Leibold, P.; Liu, D

    1992-01-01

    The major SSC experiments are expected to produce up to 1 Petabyte of data per year each. Once the primary reconstruction is completed by farms of inexpensive processors, I/O becomes a major factor in further analysis of the data. We believe that the application of database techniques can significantly reduce the I/O performed in these analyses. We present examples of such I/O reductions in prototypes based on relational and object-oriented databases of CDF data samples.

  6. Human mapping databases.

    PubMed

    Talbot, C; Cuticchia, A J

    2001-05-01

    This unit concentrates on the data contained within two human genome databases, GDB (Genome Database) and OMIM (Online Mendelian Inheritance in Man), and includes discussion of different methods for submitting and accessing data. An understanding of electronic mail, FTP, and the use of a World Wide Web (WWW) navigational tool such as Netscape or Internet Explorer is a prerequisite for utilizing the information in this unit.

  7. Steam Properties Database

    National Institute of Standards and Technology Data Gateway

    SRD 10 NIST/ASME Steam Properties Database (PC database for purchase)   Based upon the International Association for the Properties of Water and Steam (IAPWS) 1995 formulation for the thermodynamic properties of water and the most recent IAPWS formulations for transport and other properties, this updated version provides water properties over a wide range of conditions according to the accepted international standards.

  8. The comprehensive peptaibiotics database.

    PubMed

    Stoppacher, Norbert; Neumann, Nora K N; Burgstaller, Lukas; Zeilinger, Susanne; Degenkolb, Thomas; Brückner, Hans; Schuhmacher, Rainer

    2013-05-01

    Peptaibiotics are nonribosomally biosynthesized peptides, which - according to definition - contain the marker amino acid α-aminoisobutyric acid (Aib) and possess antibiotic properties. Known since 1958, peptaibiotics have been described and investigated in constantly increasing numbers, with a particular emphasis on hypocrealean fungi. Starting from the existing online 'Peptaibol Database', first published in 1997, an exhaustive literature survey of all known peptaibiotics was carried out and resulted in a list of 1043 peptaibiotics. The gathered information was compiled and used to create the new 'The Comprehensive Peptaibiotics Database', which is presented here. The database was devised as a software tool based on Microsoft (MS) Access. It is freely available from the internet at http://peptaibiotics-database.boku.ac.at and can easily be installed and operated on any computer offering a Windows XP/7 environment. It provides useful information on characteristic properties of the peptaibiotics included such as peptide category, group name of the microheterogeneous mixture to which the peptide belongs, amino acid sequence, sequence length, producing fungus, peptide subfamily, molecular formula, and monoisotopic mass. All these characteristics can be used and combined for automated search within the database, which makes The Comprehensive Peptaibiotics Database a versatile tool for the retrieval of valuable information about peptaibiotics. Sequence data have been considered up to December 14, 2012.

  9. Drinking Water Database

    NASA Technical Reports Server (NTRS)

    Murray, ShaTerea R.

    2004-01-01

    This summer I had the opportunity to work in the Environmental Management Office (EMO) under the Chemical Sampling and Analysis Team (CS&AT). This team's mission is to support Glenn Research Center (GRC) and EMO by providing chemical sampling and analysis services and expert consulting. Services include sampling and chemical analysis of water, soil, fuels, oils, paint, insulation materials, etc. One of this team's major projects is the Drinking Water Project, which tests Glenn's water coolers and ten percent of its sinks every two years. For the past two summers an intern had been putting together a database for this team to record the tests they had performed. She had successfully created a database but hadn't worked out all the quirks. So this summer William Wilder (an intern from Cleveland State University) and I worked together to perfect her database. We began by finding out exactly what every member of the team thought about the database and what, if anything, they would change. After collecting this data we both had to take some courses in Microsoft Access in order to fix the problems. Next we began looking at exactly how the database worked from the outside inward. Then we began trying to change the database but we quickly found out that this would be virtually impossible.

  10. The Transporter Classification Database

    PubMed Central

    Saier, Milton H.; Reddy, Vamsee S.; Tamang, Dorjee G.; Västermark, Åke

    2014-01-01

    The Transporter Classification Database (TCDB; http://www.tcdb.org) serves as a common reference point for transport protein research. The database contains more than 10 000 non-redundant proteins that represent all currently recognized families of transmembrane molecular transport systems. Proteins in TCDB are organized in a five level hierarchical system, where the first two levels are the class and subclass, the second two are the family and subfamily, and the last one is the transport system. Superfamilies that contain multiple families are included as hyperlinks to the five tier TC hierarchy. TCDB includes proteins from all types of living organisms and is the only transporter classification system that is both universal and recognized by the International Union of Biochemistry and Molecular Biology. It has been expanded by manual curation, contains extensive text descriptions providing structural, functional, mechanistic and evolutionary information, is supported by unique software and is interconnected to many other relevant databases. TCDB is of increasing usefulness to the international scientific community and can serve as a model for the expansion of database technologies. This manuscript describes an update of the database descriptions previously featured in NAR database issues. PMID:24225317
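
    The five-level hierarchy described above can be illustrated with a small parsing sketch. It assumes that a TC identifier is written as five dot-separated components (for example "2.A.1.1.1"), which the abstract itself does not state; the names TCNumber, parse_tc_number and family_id are hypothetical helpers, not part of any TCDB software.

```python
from typing import NamedTuple


class TCNumber(NamedTuple):
    """One field per level of the five-tier hierarchy (hypothetical type)."""
    tc_class: str
    subclass: str
    family: str
    subfamily: str
    system: str


def parse_tc_number(tc_id: str) -> TCNumber:
    # Assumption: identifiers are five dot-separated components.
    parts = tc_id.strip().split(".")
    if len(parts) != 5 or any(p == "" for p in parts):
        raise ValueError(f"expected five dot-separated levels, got {tc_id!r}")
    return TCNumber(*parts)


def family_id(tc: TCNumber) -> str:
    # The first three levels (class, subclass, family) identify the family.
    return ".".join(tc[:3])


tc = parse_tc_number("2.A.1.1.1")
print(tc.tc_class, tc.subclass, family_id(tc))
```

    Keeping each level as a string rather than an integer is a deliberate choice in this sketch, since the subclass level in the example is a letter.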

  11. Specialist Bibliographic Databases.

    PubMed

    Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Trukhachev, Vladimir I; Kostyukova, Elena I; Gerasimov, Alexey N; Kitas, George D

    2016-05-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls.

  12. Specialist Bibliographic Databases

    PubMed Central

    2016-01-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls. PMID:27134485

  13. Crude Oil Analysis Database

    DOE Data Explorer

    Shay, Johanna Y.

    The composition and physical properties of crude oil vary widely from one reservoir to another within an oil field, as well as from one field or region to another. Although all oils consist of hydrocarbons and their derivatives, the proportions of various types of compounds differ greatly. This makes some oils more suitable than others for specific refining processes and uses. To take advantage of this diversity, one needs access to information in a large database of crude oil analyses. The Crude Oil Analysis Database (COADB) currently satisfies this need by offering 9,056 crude oil analyses. Of these, 8,500 are United States domestic oils. The database contains results of analysis of the general properties and chemical composition, as well as the field, formation, and geographic location of the crude oil sample. [Taken from the Introduction to COAMDATA_DESC.pdf, part of the zipped software and database file at http://www.netl.doe.gov/technologies/oil-gas/Software/database.html] Save the zipped file to your PC. When opened, it will contain PDF documents and a large Excel spreadsheet. It will also contain the database in Microsoft Access 2002.

  14. Databases: Peter's Picks and Pans.

    ERIC Educational Resources Information Center

    Jacso, Peter

    1995-01-01

    Reviews the best and worst in databases on disk, CD-ROM, and online, and offers judgments and observations on database characteristics. Two databases are praised and three are criticized. (Author/JMV)

  15. Genotype Correlation Analysis Reveals Pathway-Based Functional Disequilibrium and Potential Epistasis in the Human Interactome

    PubMed Central

    Bush, William S.; Haines, Jonathan L.

    2016-01-01

    Epistasis is thought to be a pervasive part of complex phenotypes due to the dynamics and complexity of biological systems, and a further understanding of epistasis in the context of biological pathways may provide insight into the etiology of complex disease. In this study, we use genotype data from the International HapMap Project to characterize the functional dependencies between alleles in the human interactome as defined by KEGG pathways. We performed chi-square tests to identify non-independence between functionally-related SNP pairs within parental Caucasian and Yoruba samples. We further refine this list by testing for skewed transmission of pseudo-haplotypes to offspring using a haplotype-based TDT test. From these analyses, we identify pathways enriched for functional disequilibrium, and a set of 863 SNP pairs (representing 453 gene pairs) showing consistent non-independence and transmission distortion. These results represent gene pairs with strong evidence of epistasis within the context of a biological function.
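
    The chi-square test of non-independence mentioned above can be sketched for a single SNP pair as follows. This is a generic Pearson chi-square statistic on a contingency table of genotype counts, not the authors' pipeline; the function name and the example counts are invented for illustration.

```python
from typing import List, Tuple


def chi_square_independence(table: List[List[float]]) -> Tuple[float, int]:
    """Return (Pearson chi-square statistic, degrees of freedom).

    `table` holds observed counts: rows are genotypes at one SNP,
    columns are genotypes at the other SNP.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # skip cells in empty rows or columns
                stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Made-up counts for illustration only.
stat, dof = chi_square_independence([[10, 20], [20, 10]])
print(round(stat, 4), dof)
```

    Converting the statistic to a p-value requires the chi-square distribution with `dof` degrees of freedom, which is left out here to keep the sketch free of third-party dependencies.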

  16. Bioinformatics Analysis Reveals MicroRNAs Regulating Biological Pathways in Exercise-Induced Cardiac Physiological Hypertrophy

    PubMed Central

    Xu, Jiahong; Liu, Yang; Xie, Yuan

    2017-01-01

    Exercise-induced physiological cardiac hypertrophy is generally considered to be a type of adaptive change after exercise training and is beneficial for cardiovascular diseases. This study aims at investigating exercise-regulated microRNAs (miRNAs) and their potential biological pathways. Here, we collected 23 miRNAs from 8 published studies. MirPath v.3 from the DIANA tools website was used to execute the analysis, and TargetScan was used to predict the target genes. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses were performed to identify potential pathways and functional annotations associated with exercise-induced physiological cardiac hypertrophy. Various miRNA targets and molecular pathways, such as Fatty acid elongation, Arrhythmogenic right ventricular cardiomyopathy (ARVC), and ECM-receptor interaction, were identified. This study could prompt the understanding of the regulatory mechanisms underlying exercise-induced physiological cardiac hypertrophy. PMID:28286759

  17. Bioinformatics Analysis Reveals MicroRNAs Regulating Biological Pathways in Exercise-Induced Cardiac Physiological Hypertrophy.

    PubMed

    Xu, Jiahong; Liu, Yang; Xie, Yuan; Zhao, Cuimei; Wang, Hongbao

    2017-01-01

    Exercise-induced physiological cardiac hypertrophy is generally considered to be a type of adaptive change after exercise training and is beneficial for cardiovascular diseases. This study aims at investigating exercise-regulated microRNAs (miRNAs) and their potential biological pathways. Here, we collected 23 miRNAs from 8 published studies. MirPath v.3 from the DIANA tools website was used to execute the analysis, and TargetScan was used to predict the target genes. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses were performed to identify potential pathways and functional annotations associated with exercise-induced physiological cardiac hypertrophy. Various miRNA targets and molecular pathways, such as Fatty acid elongation, Arrhythmogenic right ventricular cardiomyopathy (ARVC), and ECM-receptor interaction, were identified. This study could prompt the understanding of the regulatory mechanisms underlying exercise-induced physiological cardiac hypertrophy.

  18. Pathway analysis of body mass index genome-wide association study highlights risk pathways in cardiovascular disease

    PubMed Central

    Zhao, Xin; Gu, Jinxia; Li, Ming; Xi, Jie; Sun, Wenyu; Song, Guangmin; Liu, Guiyou

    2015-01-01

    Cardiovascular disease (CVD) is a class of diseases that involve the heart or blood vessels. It is reported that body mass index (BMI) is a risk factor for CVD. Genome-wide association studies (GWAS) have recently provided rapid insights into the genetics of CVD and its risk factors. However, the specific mechanisms by which BMI influences CVD risk are largely unknown. We think that BMI may influence CVD risk through shared genetic pathways. In order to confirm this view, we conducted a pathway analysis of BMI GWAS, which examined approximately 329,091 single nucleotide polymorphisms from 4763 samples. We identified 31 significant KEGG pathways. There is literature evidence supporting the involvement of GnRH signaling, vascular smooth muscle contraction, dilated cardiomyopathy, Gap junction, Wnt signaling, Calcium signaling and Chemokine signaling in CVD. Collectively, our study supports the potential role of the CVD risk pathways in BMI. BMI may influence CVD risk through the shared genetic pathways. We believe that our results may advance our understanding of BMI mechanisms in CVD. PMID:26264282

  19. WikiPathways App for Cytoscape: Making biological pathways amenable to network analysis and visualization.

    PubMed

    Kutmon, Martina; Lotia, Samad; Evelo, Chris T; Pico, Alexander R

    2014-01-01

    In this paper we present the open-source WikiPathways app for Cytoscape ( http://apps.cytoscape.org/apps/wikipathways) that can be used to import biological pathways for data visualization and network analysis. WikiPathways is an open, collaborative biological pathway database that provides fully annotated pathway diagrams for manual download or through web services. The WikiPathways app allows users to load pathways in two different views: as an annotated pathway ideal for data visualization and as a simple network to perform computational analysis. An example pathway and dataset are used to demonstrate the functionality of the WikiPathways app and how they can be combined and used together with other apps. More than 3000 downloads in the first 12 months following its release in August 2013 highlight the importance and adoption of the app in the network biology field.

  20. Differential pathway network analysis used to identify key pathways associated with pediatric pneumonia.

    PubMed

    Yang, Jun-Bo; Luo, Rong; Yan, Yan; Chen, Yan

    2016-12-01

    We aimed to identify key pathways to further explore the molecular mechanism underlying pediatric pneumonia using a differential pathway network which integrated protein-protein interaction (PPI) data and pathway information. PPI data and pathway information were obtained from the STRING and Reactome databases, respectively. Next, pathway interactions were identified on the basis of constructing gene-gene interactions randomly, and a weight value computed using the Spearman correlation coefficient was assigned to each pathway-pathway interaction, thereby to further detect differential pathway interactions. Subsequently, construction of the differential pathway network was implemented using Cytoscape, followed by network clustering analysis using ClusterONE. Finally, topological analysis of the differential pathway network was performed to identify hub pathways, defined as those in the top 5% of the degree distribution. Significantly, 901 pathways were identified to construct pathway interactions. After discarding the pathway interactions with weight value < 1.2, a differential pathway network was constructed, which contained 499 interactions and 347 pathways. Topological analysis showed 17 hub pathways (FGFR1 fusion mutants, molecules associated with elastic fibres, FGFR1 mutant receptor activation, and so on) were identified. Significantly, signaling by FGFR1 fusion mutants and FGFR1 mutant receptor activation simultaneously appeared in two clusters. Molecules associated with elastic fibres existed in one cluster. Accordingly, the differential pathway network method might serve as a predictive tool to help us to further understand the development of pediatric pneumonia. FGFR1 fusion mutants, FGFR1 mutant receptor activation, and molecules associated with elastic fibres might play important roles in the progression of pediatric pneumonia.
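
    The thresholding and hub-selection steps described above can be sketched as below. The 1.2 weight cutoff and the top 5% degree criterion come from the abstract; the function name, the rounding rule (round up, at least one hub), the tie-breaking by name and the toy edge list are assumptions made for illustration and are not the authors' implementation.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Edge = Tuple[str, str, float]  # (pathway_a, pathway_b, weight)


def hub_pathways(edges: Iterable[Edge],
                 min_weight: float = 1.2,
                 top_percent: int = 5) -> List[str]:
    # Step 1: discard pathway interactions whose weight is below the cutoff.
    kept = [(a, b) for a, b, w in edges if w >= min_weight]
    # Step 2: the degree of a pathway is the number of kept interactions it has.
    degree = Counter()
    for a, b in kept:
        degree[a] += 1
        degree[b] += 1
    if not degree:
        return []
    # Step 3: keep the top `top_percent` of pathways by degree.
    # Assumption: round up, and always report at least one hub.
    k = max(1, (len(degree) * top_percent + 99) // 100)
    ranked = sorted(degree, key=lambda p: (-degree[p], p))
    return ranked[:k]


# Toy network: P0 interacts with 19 other pathways; one weak edge is dropped.
toy_edges = [("P0", f"P{i}", 1.5) for i in range(1, 20)] + [("P1", "P2", 1.0)]
print(hub_pathways(toy_edges))
```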

  1. Great Basin paleontological database

    USGS Publications Warehouse

    Zhang, N.; Blodgett, R.B.; Hofstra, A.H.

    2008-01-01

    The U.S. Geological Survey has constructed a paleontological database for the Great Basin physiographic province that can be served over the World Wide Web for data entry, queries, displays, and retrievals. It is similar to the web-database solution that we constructed for Alaskan paleontological data (www.alaskafossil.org). The first phase of this effort was to compile a paleontological bibliography for Nevada and portions of adjacent states in the Great Basin that has recently been completed. In addition, we are also compiling paleontological reports (known as E&R reports) of the U.S. Geological Survey, which are another extensive source of legacy data for this region. Initial population of the database benefited from a recently published conodont data set and is otherwise focused on Devonian and Mississippian localities because strata of this age host important sedimentary exhalative (sedex) Au, Zn, and barite resources and enormous Carlin-type Au deposits. In addition, these strata are the most important petroleum source rocks in the region, and record the transition from extension to contraction associated with the Antler orogeny, the Alamo meteorite impact, and biotic crises associated with global oceanic anoxic events. The finished product will provide an invaluable tool for future geologic mapping, paleontological research, and mineral resource investigations in the Great Basin, making paleontological data acquired over nearly the past 150 yr readily available over the World Wide Web. A description of the structure of the database and the web interface developed for this effort are provided herein. This database is being used as a model for a National Paleontological Database (which we are currently developing for the U.S. Geological Survey) as well as for other paleontological databases now being developed in other parts of the globe. © 2008 Geological Society of America.

  2. The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

    SciTech Connect

    Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika; Tanaka, Yoshihiro; Teranishi, Kristen S.; Sunagawa, Shinichi; Wong, Mike; Stillman, Jonathon H.

    2010-01-27

    Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66 percent of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in

  3. NASA Records Database

    NASA Technical Reports Server (NTRS)

    Callac, Christopher; Lunsford, Michelle

    2005-01-01

    The NASA Records Database, comprising a Web-based application program and a database, is used to administer an archive of paper records at Stennis Space Center. The system begins with an electronic form, into which a user enters information about records that the user is sending to the archive. The form is "smart": it provides instructions for entering information correctly and prompts the user to enter all required information. Once complete, the form is digitally signed and submitted to the database. The system determines which storage locations are not in use, assigns the user's boxes of records to some of them, and enters these assignments in the database. Thereafter, the software tracks the boxes and can be used to locate them. By use of search capabilities of the software, specific records can be sought by box storage locations, accession numbers, record dates, submitting organizations, or details of the records themselves. Boxes can be marked with such statuses as checked out, lost, transferred, and destroyed. The system can generate reports showing boxes awaiting destruction or transfer. When boxes are transferred to the National Archives and Records Administration (NARA), the system can automatically fill out NARA records-transfer forms. Currently, several other NASA Centers are considering deploying the NASA Records Database to help automate their records archives.

  4. ADANS database specification

    SciTech Connect

    1997-01-16

    The purpose of the Air Mobility Command (AMC) Deployment Analysis System (ADANS) Database Specification (DS) is to describe the database organization and storage allocation and to provide the detailed data model of the physical design and information necessary for the construction of the parts of the database (e.g., tables, indexes, rules, defaults). The DS includes entity relationship diagrams, table and field definitions, reports on other database objects, and a description of the ADANS data dictionary. ADANS is the automated system used by Headquarters AMC and the Tanker Airlift Control Center (TACC) for airlift planning and scheduling of peacetime and contingency operations as well as for deliberate planning. ADANS also supports planning and scheduling of Air Refueling Events by the TACC and the unit-level tanker schedulers. ADANS receives input in the form of movement requirements and air refueling requests. It provides a suite of tools for planners to manipulate these requirements/requests against mobility assets and to develop, analyze, and distribute schedules. Analysis tools are provided for assessing the products of the scheduling subsystems, and editing capabilities support the refinement of schedules. A reporting capability provides formatted screen, print, and/or file outputs of various standard reports. An interface subsystem handles message traffic to and from external systems. The database is an integral part of the functionality summarized above.

  5. The Chandra Bibliography Database

    NASA Astrophysics Data System (ADS)

    Rots, A. H.; Winkelman, S. L.; Paltani, S.; Blecksmith, S. E.; Bright, J. D.

    2004-07-01

    Early in the mission, the Chandra Data Archive started the development of a bibliography database, tracking publications in refereed journals and on-line conference proceedings that are based on Chandra observations, allowing our users to link directly to articles in the ADS from our archive, and to link to the relevant data in the archive from the ADS entries. Subsequently, we have been working closely with the ADS and other data centers, in the context of the ADEC-ITWG, on standardizing the literature-data linking. We have also extended our bibliography database to include all Chandra-related articles, and we are keeping track of the number of citations of each paper. In addition to providing valuable services to our users, this database allows us to extract a wide variety of statistical information. The project comprises five components: the bibliography database proper, a maintenance database, an interactive maintenance tool, a user browsing interface, and a web services component for exchanging information with the ADS. All of these elements are nearly mission-independent and we intend to make the package as a whole available for use by other data centers. The capabilities thus provided represent support for an essential component of the Virtual Observatory.

  6. FishTraits Database

    USGS Publications Warehouse

    Angermeier, Paul L.; Frimpong, Emmanuel A.

    2009-01-01

    The need for integrated and widely accessible sources of species traits data to facilitate studies of ecology, conservation, and management has motivated development of traits databases for various taxa. In spite of the increasing number of traits-based analyses of freshwater fishes in the United States, no consolidated database of traits of this group exists publicly, and much useful information on these species is documented only in obscure sources. The largely inaccessible and unconsolidated traits information makes large-scale analysis involving many fishes and/or traits particularly challenging. FishTraits is a database of >100 traits for 809 (731 native and 78 exotic) fish species found in freshwaters of the conterminous United States, including 37 native families and 145 native genera. The database contains information on four major categories of traits: (1) trophic ecology, (2) body size and reproductive ecology (life history), (3) habitat associations, and (4) salinity and temperature tolerances. Information on geographic distribution and conservation status is also included. Together, we refer to the traits, distribution, and conservation status information as attributes. Descriptions of attributes are available here. Many sources were consulted to compile attributes, including state and regional species accounts and other databases.

  7. Shuttle Hypervelocity Impact Database

    NASA Technical Reports Server (NTRS)

    Hyde, James L.; Christiansen, Eric L.; Lear, Dana M.

    2011-01-01

    With three missions outstanding, the Shuttle Hypervelocity Impact Database has nearly 3000 entries. The data is divided into tables for crew module windows, payload bay door radiators and thermal protection system regions, with window impacts comprising just over half the records. In general, the database provides dimensions of hypervelocity impact damage, a component level location (i.e., window number or radiator panel number) and the orbiter mission when the impact occurred. Additional detail on the type of particle that produced the damage site is provided when sampling data and definitive analysis results are available. Details and insights on the contents of the database including examples of descriptive statistics will be provided. Post flight impact damage inspection and sampling techniques that were employed during the different observation campaigns will also be discussed. Potential enhancements to the database structure and availability of the data for other researchers will be addressed in the Future Work section. A related database of returned surfaces from the International Space Station will also be introduced.

  8. Shuttle Hypervelocity Impact Database

    NASA Technical Reports Server (NTRS)

    Hyde, James L.; Christiansen, Eric L.; Lear, Dana M.

    2011-01-01

    With three flights remaining on the manifest, the shuttle hypervelocity impact database has over 2800 entries. The data is currently divided into tables for crew module windows, payload bay door radiators and thermal protection system regions, with window impacts comprising just over half the records. In general, the database provides dimensions of hypervelocity impact damage, a component level location (i.e., window number or radiator panel number) and the orbiter mission when the impact occurred. Additional detail on the type of particle that produced the damage site is provided when sampling data and definitive analysis results are available. The paper will provide details and insights on the contents of the database including examples of descriptive statistics using the impact data. A discussion of post flight impact damage inspection and sampling techniques that were employed during the different observation campaigns will be presented. Future work to be discussed will be possible enhancements to the database structure and availability of the data for other researchers. A related database of ISS returned surfaces that is under development will also be introduced.

  9. PIECE 2.0: an update for the plant gene structure comparison and evolution database

    PubMed Central

    Wang, Yi; Xu, Ling; Thilmony, Roger; You, Frank M.; Gu, Yong Q.; Coleman-Derr, Devin

    2017-01-01

    PIECE (Plant Intron Exon Comparison and Evolution) is a web-accessible database that houses intron and exon information of plant genes. PIECE serves as a resource for biologists interested in comparing intron–exon organization and provides valuable insights into the evolution of gene structure in plant genomes. Recently, we updated PIECE to a new version, PIECE 2.0 (http://probes.pw.usda.gov/piece or http://aegilops.wheat.ucdavis.edu/piece). PIECE 2.0 contains annotated genes from 49 sequenced plant species as compared to 25 species in the previous version. In the current version, we also added several new features: (i) a new viewer was developed to show phylogenetic trees displayed along with the structure of individual genes; (ii) genes in the phylogenetic tree can now also be grouped according to KOG (the annotation of Eukaryotic Orthologous Groups) and KO (KEGG Orthology) in addition to Pfam domains; (iii) information on intronless genes is now included in the database; (iv) a statistical summary of global gene structure information for each species and its comparison with other species was added; and (v) an improved GSDraw tool was implemented in the web server to enhance the analysis and display of gene structure. The updated PIECE 2.0 database will be a valuable resource for the plant research community for the study of gene structure and evolution. PMID:27742820

  10. De novo transcriptome assembly databases for the butterfly orchid Phalaenopsis equestris.

    PubMed

    Niu, Shan-Ce; Xu, Qing; Zhang, Guo-Qiang; Zhang, Yong-Qiang; Tsai, Wen-Chieh; Hsu, Jui-Ling; Liang, Chieh-Kai; Luo, Yi-Bo; Liu, Zhong-Jian

    2016-09-27

    Orchids are renowned for their spectacular flowers and ecological adaptations. After the sequencing of the genome of the tropical epiphytic orchid Phalaenopsis equestris, we combined Illumina HiSeq2000 for RNA-Seq and Trinity for de novo assembly to characterize the transcriptomes for 11 diverse P. equestris tissues representing the root, stem, leaf, flower buds, column, lip, petal, sepal and three developmental stages of seeds. Our aims were to contribute to a better understanding of the molecular mechanisms driving the analysed tissue characteristics and to enrich the available data for P. equestris. Here, we present three databases. The first dataset is the RNA-Seq raw reads, which can be used to execute new experiments with different analysis approaches. The other two datasets allow different types of searches for candidate homologues. The second dataset includes the sets of assembled unigenes and predicted coding sequences and proteins, enabling a sequence-based search. The third dataset consists of the annotation results of the aligned unigenes versus the Nonredundant (Nr) protein database, Kyoto Encyclopaedia of Genes and Genomes (KEGG) and Clusters of Orthologous Groups (COG) databases with low e-values, enabling a name-based search.

  11. De novo transcriptome assembly databases for the butterfly orchid Phalaenopsis equestris

    PubMed Central

    Niu, Shan-Ce; Xu, Qing; Zhang, Guo-Qiang; Zhang, Yong-Qiang; Tsai, Wen-Chieh; Hsu, Jui-Ling; Liang, Chieh-Kai; Luo, Yi-Bo; Liu, Zhong-Jian

    2016-01-01

    Orchids are renowned for their spectacular flowers and ecological adaptations. After the sequencing of the genome of the tropical epiphytic orchid Phalaenopsis equestris, we combined Illumina HiSeq2000 for RNA-Seq and Trinity for de novo assembly to characterize the transcriptomes for 11 diverse P. equestris tissues representing the root, stem, leaf, flower buds, column, lip, petal, sepal and three developmental stages of seeds. Our aims were to contribute to a better understanding of the molecular mechanisms driving the analysed tissue characteristics and to enrich the available data for P. equestris. Here, we present three databases. The first dataset is the RNA-Seq raw reads, which can be used to execute new experiments with different analysis approaches. The other two datasets allow different types of searches for candidate homologues. The second dataset includes the sets of assembled unigenes and predicted coding sequences and proteins, enabling a sequence-based search. The third dataset consists of the annotation results of the aligned unigenes versus the Nonredundant (Nr) protein database, Kyoto Encyclopaedia of Genes and Genomes (KEGG) and Clusters of Orthologous Groups (COG) databases with low e-values, enabling a name-based search. PMID:27673730

  12. VIEWCACHE: An incremental database access method for autonomous interoperable databases

    NASA Technical Reports Server (NTRS)

    Roussopoulos, Nick; Sellis, Timoleon

    1991-01-01

    The objective is to illustrate the concept of incremental access to distributed databases. An experimental database management system, ADMS, which has been developed at the University of Maryland, in College Park, uses VIEWCACHE, a database access method based on incremental search. VIEWCACHE is a pointer-based access method that provides a uniform interface for accessing distributed databases and catalogues. The compactness of the pointer structures formed during database browsing and the incremental access method allow the user to search and do inter-database cross-referencing with no actual data movement between database sites. Once the search is complete, the set of collected pointers pointing to the desired data are dereferenced.
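
    The pointer-based idea described above can be illustrated with a minimal sketch. It is not the ADMS or VIEWCACHE implementation: the class names, the in-memory "sites" dictionary, and the refine step are all hypothetical stand-ins, assuming only what the abstract states (pointers are collected during the search, and data are fetched once at the end).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Pointer:
    """Hypothetical compact reference to a row held at some database site."""
    site: str
    row_id: int


@dataclass
class ViewCache:
    """Toy stand-in for an incremental, pointer-based access method.

    `sites` maps a site name to {row_id: record}. In a real distributed
    system the predicates would be evaluated at the remote sites; here the
    sites are plain dictionaries so the sketch stays self-contained.
    """
    sites: dict
    pointers: list = field(default_factory=list)

    def search(self, predicate):
        # Initial pass: keep only pointers to matching rows, not the rows.
        self.pointers = [
            Pointer(site, row_id)
            for site, rows in self.sites.items()
            for row_id, record in rows.items()
            if predicate(record)
        ]
        return self

    def refine(self, predicate):
        # Incremental step: narrow the existing pointer set instead of
        # re-scanning every site from scratch.
        self.pointers = [
            p for p in self.pointers
            if predicate(self.sites[p.site][p.row_id])
        ]
        return self

    def dereference(self):
        # Final step: fetch the actual records for the collected pointers.
        return [self.sites[p.site][p.row_id] for p in self.pointers]


# Illustrative data only; titles and years are made up for the example.
sites = {
    "site_a": {1: {"title": "Orbit report", "year": 1990},
               2: {"title": "Engine log", "year": 1985}},
    "site_b": {7: {"title": "Orbit survey", "year": 1991}},
}
cache = ViewCache(sites).search(lambda r: "Orbit" in r["title"])
cache.refine(lambda r: r["year"] > 1990)
records = cache.dereference()
```

    Keeping only (site, row) pairs until the final step is what lets browsing and cross-referencing proceed without moving records between sites; the single dereference call is where data transfer would occur.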

  13. Open Geoscience Database

    NASA Astrophysics Data System (ADS)

    Bashev, A.

    2012-04-01

    Currently there is an enormous number of geoscience databases. Unfortunately, the only users of the majority of these databases are their developers. There are several reasons for that: incompatibility, specificity of tasks and objects, and so on. However, the main obstacles to wide usage of geoscience databases are complexity for developers and complication for users. The complexity of architecture leads to high costs that block public access. The complication prevents users from understanding when and how to use the database. Only databases associated with Google Maps do not have these drawbacks, but they could hardly be called "geoscience" databases. Nevertheless, an open and simple geoscience database is necessary, at least for educational purposes (see our abstract for ESSI20/EOS12). We developed a database and a web interface to work with it, and it is now accessible at maps.sch192.ru. In this database a result is a value of a parameter (no matter which) at a station with a certain position, associated with metadata: the date when the result was obtained; the type of station (lake, soil, etc.); and the contributor that sent the result. Each contributor has its own profile, which allows the reliability of the data to be estimated. The results can be represented on a Google Maps space image as a point in a certain position, coloured according to the value of the parameter. There are default colour scales, and each registered user can create their own scale. The results can also be extracted to a *.csv file. For both types of representation one can select the data by date, object type, parameter type, area and contributor. The data are uploaded in *.csv format: Name of the station; Latitude(dd.dddddd); Longitude(ddd.dddddd); Station type; Parameter type; Parameter value; Date(yyyy-mm-dd). The contributor is recognised while entering. This is the minimal set of features that is required to connect a value of a parameter with a position and see the results. All the complicated data
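
    As an illustration of the upload format listed above, the following sketch parses rows with the seven named fields. It assumes a semicolon delimiter (the abstract separates the field names with semicolons, but the actual file delimiter is not stated), and the Result class, parse_results function, and sample row are hypothetical.

```python
import csv
from dataclasses import dataclass
from datetime import date
from io import StringIO


@dataclass
class Result:
    """One uploaded measurement; field names are illustrative."""
    station: str
    latitude: float
    longitude: float
    station_type: str
    parameter_type: str
    value: float
    observed: date


def parse_results(text, delimiter=";"):
    """Parse upload rows; the semicolon delimiter is an assumption."""
    results = []
    for row in csv.reader(StringIO(text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        fields = [f.strip() for f in row]
        if len(fields) != 7:
            raise ValueError(f"expected 7 fields, got {len(fields)}: {row}")
        name, lat, lon, stype, ptype, value, day = fields
        lat, lon = float(lat), float(lon)
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            raise ValueError(f"coordinates out of range: {lat}, {lon}")
        results.append(
            Result(name, lat, lon, stype, ptype, float(value),
                   date.fromisoformat(day))  # expects yyyy-mm-dd
        )
    return results


# Made-up sample row, for illustration only.
sample = "Lake A; 55.751244; 037.618423; lake; pH; 7.2; 2012-04-01\n"
parsed = parse_results(sample)
```

    Validating the coordinate ranges and the date at upload time is one simple way to keep every stored value tied to a usable position, which the abstract names as the minimal requirement.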

  14. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1992-04-30

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134a, R-141b, R-142b, R-143a, R-152a, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses polyalkylene glycol (PAG), ester, and other lubricants. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits.

  15. The PROSITE database

    PubMed Central

    Hulo, Nicolas; Bairoch, Amos; Bulliard, Virginie; Cerutti, Lorenzo; De Castro, Edouard; Langendijk-Genevaux, Petra S.; Pagni, Marco; Sigrist, Christian J. A.

    2006-01-01

    The PROSITE database consists of a large collection of biologically meaningful signatures that are described as patterns or profiles. Each signature is linked to a documentation that provides useful biological information on the protein family, domain or functional site identified by the signature. The PROSITE database is now complemented by a series of rules that can give more precise information about specific residues. During the last 2 years, the documentation and the ScanProsite web pages were redesigned to add more functionalities. The latest version of PROSITE (release 19.11 of September 27, 2005) contains 1329 patterns and 552 profile entries. Over the past 2 years more than 200 domains have been added, and now 52% of UniProtKB/Swiss-Prot entries (release 48.1 of September 27, 2005) have a cross-reference to a PROSITE entry. The database is accessible at . PMID:16381852

  16. Medical database security evaluation.

    PubMed

    Pangalos, G J

    1993-01-01

    Users of medical information systems need confidence in the security of the system they are using. They also need a method to evaluate and compare its security capabilities. Every system has its own requirements for maintaining confidentiality, integrity and availability. In order to meet these requirements a number of security functions must be specified covering areas such as access control, auditing, error recovery, etc. Appropriate confidence in these functions is also required. The 'trust' in trusted computer systems rests on their ability to prove that their secure mechanisms work as advertised and cannot be disabled or diverted. The general framework and requirements for medical database security and a number of parameters of the evaluation problem are presented and discussed. The problem of database security evaluation is then discussed, and a number of specific proposals are presented, based on a number of existing medical database security systems.

  17. Mouse genome database 2016

    PubMed Central

    Bult, Carol J.; Eppig, Janan T.; Blake, Judith A.; Kadin, James A.; Richardson, Joel E.

    2016-01-01

    The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data. PMID:26578600

  18. Mouse genome database 2016.

    PubMed

    Bult, Carol J; Eppig, Janan T; Blake, Judith A; Kadin, James A; Richardson, Joel E

    2016-01-04

    The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data.

  19. Enhancing medical database semantics.

    PubMed Central

    Leão, B. de F.; Pavan, A.

    1995-01-01

    Medical databases deal with dynamic, heterogeneous and fuzzy data. The modeling of such a complex domain demands powerful semantic data modeling methodologies. This paper describes GSM-Explorer, a CASE tool that allows for the creation of relational databases using semantic data modeling techniques. GSM-Explorer fully incorporates the Generic Semantic Data Model (GSM), enabling knowledge engineers to model the application domain with the abstraction mechanisms of generalization/specialization, association and aggregation. The tool generates a structure that implements persistent database-objects through the automatic generation of customized ANSI SQL scripts that sustain the semantics defined at the higher level. This paper emphasizes the system architecture and the mapping of the semantic model into relational tables. The present status of the project and its further developments are discussed in the Conclusions. PMID:8563288

  20. National Ambient Radiation Database

    SciTech Connect

    Dziuban, J.; Sears, R.

    2003-02-25

    The U.S. Environmental Protection Agency (EPA) recently developed a searchable database and website for the Environmental Radiation Ambient Monitoring System (ERAMS) data. This site contains nationwide radiation monitoring data for air particulates, precipitation, drinking water, surface water and pasteurized milk. This site provides location-specific as well as national information on environmental radioactivity across several media. It provides high quality data for assessing public exposure and environmental impacts resulting from nuclear emergencies and provides baseline data during routine conditions. The database and website are accessible at www.epa.gov/enviro/. This site contains (1) a query for the general public, which is easy to use and limits the amount of information provided but includes the ability to graph the data with risk benchmarks; (2) a query for more technical users, which allows access to all of the data in the database; and (3) background information on ERAMS.

  1. The Neotoma Paleoecology Database

    NASA Astrophysics Data System (ADS)

    Grimm, E. C.; Ashworth, A. C.; Barnosky, A. D.; Betancourt, J. L.; Bills, B.; Booth, R.; Blois, J.; Charles, D. F.; Graham, R. W.; Goring, S. J.; Hausmann, S.; Smith, A. J.; Williams, J. W.; Buckland, P.

    2015-12-01

    The Neotoma Paleoecology Database (www.neotomadb.org) is a multiproxy, open-access, relational database that includes fossil data for the past 5 million years (the late Neogene and Quaternary Periods). Modern distributional data for various organisms are also being made available for calibration and paleoecological analyses. The project is a collaborative effort among individuals from more than 20 institutions worldwide, including domain scientists representing a spectrum of Pliocene-Quaternary fossil data types, as well as experts in information technology. Working groups are active for diatoms, insects, ostracodes, pollen and plant macroscopic remains, testate amoebae, rodent middens, vertebrates, age models, geochemistry and taphonomy. Groups are also active in developing online tools for data analyses and for developing modules for teaching at different levels. A key design concept of NeotomaDB is that stewards for various data types are able to remotely upload and manage data. Cooperatives for different kinds of paleo data, or from different regions, can appoint their own stewards. Over the past year, much progress has been made on development of the steward software-interface that will enable this capability. The steward interface uses web services that provide access to the database. More generally, these web services enable remote programmatic access to the database, which both desktop and web applications can use and which provide real-time access to the most current data. Use of these services can alleviate the need to download the entire database, which can be out-of-date as soon as new data are entered. In general, the Neotoma web services deliver data either from an entire table or from the results of a view. Upon request, new web services can be quickly generated. Future developments will likely expand the spatial and temporal dimensions of the database. NeotomaDB is open to receiving new datasets and stewards from the global Quaternary community

  2. Database Management System

    NASA Technical Reports Server (NTRS)

    1990-01-01

    In 1981 Wayne Erickson founded Microrim, Inc, a company originally focused on marketing a microcomputer version of RIM (Relational Information Manager). Dennis Comfort joined the firm and is now vice president, development. The team developed an advanced spinoff from the NASA system they had originally created, a microcomputer database management system known as R:BASE 4000. Microrim added many enhancements and developed a series of R:BASE products for various environments. R:BASE is now the second largest selling line of microcomputer database management software in the world.

  3. JICST Factual Database(1)

    NASA Astrophysics Data System (ADS)

    Kurosawa, Shinji

    This paper outlines the JICST factual database (JOIS-F), which JICST started in January 1988, and its online service. First, the author describes the circumstances from 1973, when planning started, to the present, and the database's relation to the "Project by Special Coordination Funds for Promoting Science and Technology". Secondly, the databases now under development, with services planned to start in fiscal 1988 or fiscal 1989, are described; they cover DNA, metallic material intensity, crystal structure, chemical substance regulations, and so forth. Lastly, the online service is briefly explained.

  4. Drycleaner Database - Region 7

    EPA Pesticide Factsheets

    THIS DATA ASSET NO LONGER ACTIVE: This is metadata documentation for the Region 7 Drycleaner Database (R7DryClnDB), which tracks all Region 7 drycleaners who notify Region 7 that they are subject to Maximum Achievable Control Technology (MACT) standards. The Air and Waste Management Division is the primary managing entity for this database. This work falls under objectives for EPA's 2003-2008 Strategic Plan (Goal 4) for Healthy Communities & Ecosystems, which are to reduce chemical and/or pesticide risks at facilities.

  5. The Genopolis Microarray Database

    PubMed Central

    Splendiani, Andrea; Brandizi, Marco; Even, Gael; Beretta, Ottavio; Pavelka, Norman; Pelizzola, Mattia; Mayhaus, Manuel; Foti, Maria; Mauri, Giancarlo; Ricciardi-Castagnoli, Paola

    2007-01-01

    Background: Gene expression databases are key resources for microarray data management and analysis, and the importance of a proper annotation of their content is well understood. Public repositories as well as microarray database systems that can be implemented by single laboratories exist. However, there is not yet a tool that can easily support a collaborative environment where different users with different rights of access to data can interact to define a common highly coherent content. The scope of the Genopolis database is to provide a resource that allows different groups performing microarray experiments related to a common subject to create a common coherent knowledge base and to analyse it. The Genopolis database has been implemented as a dedicated system for the scientific community studying dendritic and macrophage cell functions and host-parasite interactions. Results: The Genopolis Database system allows the community to build an object based MIAME compliant annotation of their experiments and to store images, raw and processed data from the Affymetrix GeneChip® platform. It supports dynamic definition of controlled vocabularies and provides automated and supervised steps to control the coherence of data and annotations. It allows a precise control of the visibility of the database content to different sub groups in the community and facilitates exports of its content to public repositories. It provides an interactive user interface for data analysis: this allows users to visualize data matrices based on functional lists and sample characterization, and to navigate to other data matrices defined by similarity of expression values as well as functional characterizations of genes involved. A collaborative environment is also provided for the definition and sharing of functional annotation by users. Conclusion: The Genopolis Database supports a community in building a common coherent knowledge base and analysing it. This fills a gap between a local

  6. Databases for plant phosphoproteomics.

    PubMed

    Schulze, Waltraud X; Yao, Qiuming; Xu, Dong

    2015-01-01

    Phosphorylation is the most studied posttranslational modification involved in signal transduction in stress responses, development, and growth. In recent years, large-scale phosphoproteomic studies were carried out using various model plants and several growth and stress conditions. Here we present an overview of online resources for plant phosphoproteomic databases: PhosPhAt as a resource for Arabidopsis phosphoproteins, P3DB as a resource expanding to crop plants, and Medicago PhosphoProtein Database as a resource for the model plant Medicago truncatula.

  7. Weathering Database Technology

    ERIC Educational Resources Information Center

    Snyder, Robert

    2005-01-01

    Collecting weather data is a traditional part of a meteorology unit at the middle level. However, making connections between the data and weather conditions can be a challenge. One way to make these connections clearer is to enter the data into a database. This allows students to quickly compare different fields of data and recognize which…

  8. Danish Gynecological Cancer Database

    PubMed Central

    Sørensen, Sarah Mejer; Bjørn, Signe Frahm; Jochumsen, Kirsten Marie; Jensen, Pernille Tine; Thranov, Ingrid Regitze; Hare-Bruun, Helle; Seibæk, Lene; Høgdall, Claus

    2016-01-01

    Aim of database The Danish Gynecological Cancer Database (DGCD) is a nationwide clinical cancer database and its aim is to monitor the treatment quality of Danish gynecological cancer patients, and to generate data for scientific purposes. DGCD also records detailed data on the diagnostic measures for gynecological cancer. Study population DGCD was initiated January 1, 2005, and includes all patients treated at Danish hospitals for cancer of the ovaries, peritoneum, fallopian tubes, cervix, vulva, vagina, and uterus, including rare histological types. Main variables DGCD data are organized within separate data forms as follows: clinical data, surgery, pathology, pre- and postoperative care, complications, follow-up visits, and final quality check. DGCD is linked with additional data from the Danish “Pathology Registry”, the “National Patient Registry”, and the “Cause of Death Registry” using the unique Danish personal identification number (CPR number). Descriptive data Data from DGCD and registers are available online in the Statistical Analysis Software portal. The DGCD forms cover almost all possible clinical variables used to describe gynecological cancer courses. The only limitation is the registration of oncological treatment data, which is incomplete for a large number of patients. Conclusion The very complete collection of available data from more registries forms one of the unique strengths of DGCD compared to many other clinical databases, and provides unique possibilities for validation and completeness of data. The success of the DGCD is illustrated through annual reports, high coverage, and several peer-reviewed DGCD-based publications. PMID:27822089

  9. Uranium Location Database

    EPA Pesticide Factsheets

    A GIS compiled locational database in Microsoft Access of ~15,000 mines with uranium occurrence or production, primarily in the western United States. The metadata was cooperatively compiled from Federal and State agency data sets and enables the user to conduct geographic and analytical studies on mine impacts on the public and environment.

  10. The Exoplanet Orbit Database

    NASA Astrophysics Data System (ADS)

    Wright, J. T.; Fakhouri, O.; Marcy, G. W.; Han, E.; Feng, Y.; Johnson, John Asher; Howard, A. W.; Fischer, D. A.; Valenti, J. A.; Anderson, J.; Piskunov, N.

    2011-04-01

    We present a database of well-determined orbital parameters of exoplanets, and their host stars’ properties. This database comprises spectroscopic orbital elements measured for 427 planets orbiting 363 stars from radial velocity and transit measurements as reported in the literature. We have also compiled fundamental transit parameters, stellar parameters, and the method used for each planet’s discovery. This Exoplanet Orbit Database includes all planets with robust, well measured orbital parameters reported in peer-reviewed articles. The database is available in a searchable, filterable, and sortable form online through the Exoplanets Data Explorer table, and the data can be plotted and explored through the Exoplanet Data Explorer plotter. We use the Data Explorer to generate publication-ready plots, giving three examples of the signatures of exoplanet migration and dynamical evolution: We illustrate the character of the apparent correlation between mass and period in exoplanet orbits, the different selection biases between radial velocity and transit surveys, and that the multiplanet systems show a distinct semimajor-axis distribution from apparently singleton systems.

  11. Patent Family Databases.

    ERIC Educational Resources Information Center

    Simmons, Edlyn S.

    1985-01-01

    Reports on retrieval of patent information online and includes definition of patent family, basic and equivalent patents, "parents and children" applications, designated states, patent family databases--International Patent Documentation Center, World Patents Index, APIPAT (American Petroleum Institute), CLAIMS (IFI/Plenum). A table…

  12. Diatomic Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 114 Diatomic Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 121 diatomic molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty, and reference are given for each transition reported.

  13. High Performance Buildings Database

    DOE Data Explorer

    The High Performance Buildings Database is a shared resource for the building industry, a unique central repository of in-depth information and data on high-performance, green building projects across the United States and abroad. The database includes information on the energy use, environmental performance, design process, finances, and other aspects of each project. Members of the design and construction teams are listed, as are sources for additional information. In total, up to twelve screens of detailed information are provided for each project profile. Projects range in size from small single-family homes or tenant fit-outs within buildings to large commercial and institutional buildings and even entire campuses. The database is a data repository as well. A series of Web-based data-entry templates allows anyone to enter information about a building project into the database. Once a project has been submitted, each of the partner organizations can review the entry and choose whether or not to publish that particular project on its own Web site.

  14. MARC and Relational Databases.

    ERIC Educational Resources Information Center

    Llorens, Jose; Trenor, Asuncion

    1993-01-01

    Discusses the use of MARC format in relational databases and addresses problems of incompatibilities. A solution is presented that is in accordance with Open Systems Interconnection (OSI) standards and is based on experiences at the library of the Universidad Politecnica de Valencia (Spain). (four references) (EA)

  15. Databases and data mining

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Over the course of the past decade, the breadth of information that is made available through online resources for plant biology has increased astronomically, as have the interconnectedness among databases, online tools, and methods of data acquisition and analysis. For maize researchers, the numbe...

  16. Hydrocarbon Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 115 Hydrocarbon Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 91 hydrocarbon molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty and reference are given for each transition reported.

  17. Danish Urogynaecological Database

    PubMed Central

    Hansen, Ulla Darling; Gradel, Kim Oren; Larsen, Michael Due

    2016-01-01

    The Danish Urogynaecological Database is established in order to ensure high quality of treatment for patients undergoing urogynecological surgery. The database contains details of all women in Denmark undergoing incontinence surgery or pelvic organ prolapse surgery amounting to ~5,200 procedures per year. The variables are collected along the course of treatment of the patient from the referral to a postoperative control. Main variables are prior obstetrical and gynecological history, symptoms, symptom-related quality of life, objective urogynecological findings, type of operation, complications if relevant, implants used if relevant, 3–6-month postoperative recording of symptoms, if any. A set of clinical quality indicators is being maintained by the steering committee for the database and is published in an annual report which also contains extensive descriptive statistics. The database has a completeness of over 90% of all urogynecological surgeries performed in Denmark. Some of the main variables have been validated using medical records as gold standard. The positive predictive value was above 90%. The data are used as a quality monitoring tool by the hospitals and in a number of scientific studies of specific urogynecological topics, broader epidemiological topics, and the use of patient reported outcome measures. PMID:27826217

  18. Food composition databases

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Food composition is the determination of what is in the foods we eat and is the critical bridge between nutrition, health promotion and disease prevention and food production. Compilation of data into useable databases is essential to the development of dietary guidance for individuals and populat...

  19. Redis database administration tool

    SciTech Connect

    Martinez, J. J.

    2013-02-13

    MyRedis is a product of the Lorenz subproject under the ASC Scientific Data Management effort. MyRedis is a web-based utility designed to allow easy administration of instances of Redis databases. It can be used to view and manipulate data as well as run commands directly against a variety of different Redis hosts.

  20. The CEBAF Element Database

    SciTech Connect

    Theodore Larrieu, Christopher Slominski, Michele Joyce

    2011-03-01

    With the inauguration of the CEBAF Element Database (CED) in Fall 2010, Jefferson Lab computer scientists have taken a step toward the eventual goal of a model-driven accelerator. Once fully populated, the database will be the primary repository of information used for everything from generating lattice decks to booting control computers to building controls screens. A requirement influencing the CED design is that it provide access to not only present, but also future and past configurations of the accelerator. To accomplish this, an introspective database schema was designed that allows new elements, types, and properties to be defined on-the-fly with no changes to table structure. Used in conjunction with Oracle Workspace Manager, it allows users to query data from any time in the database history with the same tools used to query the present configuration. Users can also check out workspaces to use as staging areas for upcoming machine configurations. All access to the CED is through a well-documented Application Programming Interface (API) that is translated automatically from original C++ source code into native libraries for scripting languages such as Perl, PHP, and Tcl, making access to the CED easy and ubiquitous.
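    The idea of a schema in which new element types and properties are rows rather than columns can be illustrated with a minimal sketch. This is not the CED schema or its API: the table names, the helper functions (`build_catalog`, `define_type`, `add_element`, `get_properties`), the sample "Quadrupole" type and the use of SQLite are all assumptions made for illustration, and the versioning that the abstract attributes to Oracle Workspace Manager is not modelled.

```python
import sqlite3

def build_catalog(conn):
    """Create generic tables: types, properties and values are stored as rows."""
    conn.executescript("""
        CREATE TABLE element_type (
            id INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE property (
            id INTEGER PRIMARY KEY,
            type_id INTEGER NOT NULL REFERENCES element_type(id),
            name TEXT NOT NULL,
            UNIQUE (type_id, name)
        );
        CREATE TABLE element (
            id INTEGER PRIMARY KEY,
            type_id INTEGER NOT NULL REFERENCES element_type(id),
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE element_value (
            element_id INTEGER NOT NULL REFERENCES element(id),
            property_id INTEGER NOT NULL REFERENCES property(id),
            value TEXT,
            PRIMARY KEY (element_id, property_id)
        );
    """)

def define_type(conn, type_name, property_names):
    """Register a new element type and its properties without altering any table."""
    cur = conn.execute("INSERT INTO element_type (name) VALUES (?)", (type_name,))
    type_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO property (type_id, name) VALUES (?, ?)",
        [(type_id, prop) for prop in property_names],
    )
    return type_id

def add_element(conn, type_name, element_name, values):
    """Insert one element and its property values (stored as text)."""
    (type_id,) = conn.execute(
        "SELECT id FROM element_type WHERE name = ?", (type_name,)
    ).fetchone()
    cur = conn.execute(
        "INSERT INTO element (type_id, name) VALUES (?, ?)", (type_id, element_name)
    )
    element_id = cur.lastrowid
    for prop, value in values.items():
        (prop_id,) = conn.execute(
            "SELECT id FROM property WHERE type_id = ? AND name = ?", (type_id, prop)
        ).fetchone()
        conn.execute(
            "INSERT INTO element_value VALUES (?, ?, ?)",
            (element_id, prop_id, str(value)),
        )
    return element_id

def get_properties(conn, element_name):
    """Return {property name: value} for the named element."""
    rows = conn.execute(
        """
        SELECT p.name, v.value
        FROM element e
        JOIN element_value v ON v.element_id = e.id
        JOIN property p ON p.id = v.property_id
        WHERE e.name = ?
        """,
        (element_name,),
    ).fetchall()
    return dict(rows)

# Hypothetical sample data, not taken from the CED.
conn = sqlite3.connect(":memory:")
build_catalog(conn)
define_type(conn, "Quadrupole", ["length_m", "gradient"])
add_element(conn, "Quadrupole", "Q1", {"length_m": 0.3, "gradient": 1.5})
props = get_properties(conn, "Q1")
```

    In this layout, adding another element type later only requires a further call to `define_type`; no `ALTER TABLE` statement is needed, which is the property the abstract describes.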

  1. Triatomic Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 117 Triatomic Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 55 triatomic molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty and reference are given for each transition reported.

  2. Identification of genes and signaling pathways associated with diabetic neuropathy using a weighted correlation network analysis

    PubMed Central

    Li, Ya; Ma, Weiguo; Xie, Chuanqing; Zhang, Min; Yin, Xiaohong; Wang, Fenfen; Xu, Jie; Shi, Bingyin

    2016-01-01

    Abstract Background: The molecular mechanisms behind diabetic neuropathy remain to be investigated. Methods: This is a secondary study on a microarray dataset (GSE24290) downloaded from Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI), which included 18 nerve tissue samples of progressing diabetic neuropathy (fiber loss ≥500 fibers/mm²) and 17 nerve tissue samples of nonprogressing diabetic neuropathy (fiber loss ≤100 fibers/mm²). Differentially expressed genes (DEGs) were screened between progressing and nonprogressing diabetic neuropathy. With the DEGs obtained, a weighted gene coexpression network analysis was conducted to identify gene clusters associated with diabetic neuropathy. Diabetes-related microRNAs (miRNAs) and their target genes were predicted and mapped to the genes in the gene clusters identified. Consequently, a miRNA–gene network was constructed, for which gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis was performed. Potential drugs for treatment of diabetic neuropathy were also predicted. Results: A total of 370 upregulated and 379 downregulated DEGs were screened between nonprogressing and progressing diabetic neuropathy. The miRNAs hsa-miR-377, hsa-miR-216a, and hsa-miR-217 were associated with diabetes. Inflammation was the most significant GO term. The peroxisome proliferator-activated receptor (PPAR) pathway and the adenosine monophosphate (AMP)-activated protein kinase (AMPK) signaling pathway were KEGG pathways significantly enriched with PPAR gamma (PPARG), stearoyl-CoA desaturase (SCD), cluster of differentiation 36 (CD36), and phosphoenolpyruvate carboxykinase 1 (PCK1). Conclusion: The study suggests that PPARG, SCD, CD36, PCK1, the AMPK pathway, and the PPAR pathway may be involved in progression of diabetic neuropathy. PMID:27893688
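    The abstract does not say which statistic was used for the KEGG enrichment analysis. A common choice for pathway over-representation is a one-sided hypergeometric test, sketched below under that assumption. The function name `hypergeom_enrichment_p` and the background size, pathway size and overlap in the usage lines are hypothetical; only the 749 DEGs (370 upregulated plus 379 downregulated) come from the abstract.

```python
from math import comb

def hypergeom_enrichment_p(total_genes, pathway_genes, selected_genes, overlap):
    """One-sided hypergeometric p-value for pathway over-representation.

    Returns P(X >= overlap), where X is the number of pathway genes found when
    selected_genes are drawn without replacement from total_genes, of which
    pathway_genes belong to the pathway.
    """
    upper = min(pathway_genes, selected_genes)
    denominator = comb(total_genes, selected_genes)
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, selected_genes - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denominator

# Hypothetical inputs: 20,000 background genes, a 70-gene pathway, and
# 10 pathway genes among the 749 DEGs reported in the abstract.
p_value = hypergeom_enrichment_p(20000, 70, 749, 10)
```

    In practice the resulting p-values are usually corrected for multiple testing across all pathways examined; that step is omitted from this sketch.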

  3. JDD, Inc. Database

    NASA Technical Reports Server (NTRS)

    Miller, David A., Jr.

    2004-01-01

    JDD, Inc. is a maintenance and custodial contracting company whose mission is to provide their clients in the private and government sectors "quality construction, construction management and cleaning services in the most efficient and cost effective manners, (JDD, Inc. Mission Statement)." This company provides facilities support for Fort Riley in Fort Riley, Kansas and the NASA John H. Glenn Research Center at Lewis Field here in Cleveland, Ohio. JDD, Inc. is owned and operated by James Vaughn, who started as a painter at NASA Glenn and has been working here for the past seventeen years. This summer I worked under Devan Anderson, who is the safety manager for JDD, Inc. in the Logistics and Technical Information Division (LTID) at Glenn Research Center. The LTID provides all transportation, secretarial, security needs and contract management of these various services for the center. As a safety manager, my mentor provides Occupational Safety and Health Administration (OSHA) compliance to all JDD, Inc. employees and handles all other issues (Environmental Protection Agency issues, workers compensation, safety and health training) relating to job safety. My summer assignment was not considered "groundbreaking research" like that done by many other summer interns in the past, but it is just as important and beneficial to JDD, Inc. I initially created a database using a Microsoft Excel program to classify and categorize data pertaining to numerous safety training certification courses instructed by our safety manager during the course of the fiscal year. This early portion of the database consisted of only data (training field index, employees who were present at these training courses and those who were absent) from the training certification courses. Once I completed this phase of the database, I decided to expand the database and add as many dimensions to it as possible. Throughout the last seven weeks, I have been compiling more data from day to day operations and been adding the

  4. BIND—The Biomolecular Interaction Network Database

    PubMed Central

    Bader, Gary D.; Donaldson, Ian; Wolting, Cheryl; Ouellette, B. F. Francis; Pawson, Tony; Hogue, Christopher W. V.

    2001-01-01

    The Biomolecular Interaction Network Database (BIND; http://binddb.org) is a database designed to store full descriptions of interactions, molecular complexes and pathways. Development of the BIND 2.0 data model has led to the incorporation of virtually all components of molecular mechanisms including interactions between any two molecules composed of proteins, nucleic acids and small molecules. Chemical reactions, photochemical activation and conformational changes can also be described. Everything from small molecule biochemistry to signal transduction is abstracted in such a way that graph theory methods may be applied for data mining. The database can be used to study networks of interactions, to map pathways across taxonomic branches and to generate information for kinetic simulations. BIND anticipates the coming large influx of interaction information from high-throughput proteomics efforts including detailed information about post-translational modifications from mass spectrometry. Version 2.0 of the BIND data model is discussed as well as implementation, content and the open nature of the BIND project. The BIND data specification is available as ASN.1 and XML DTD. PMID:11125103

  5. Tautomerism in large databases

    PubMed Central

    Sitzmann, Markus; Ihlenfeldt, Wolf-Dietrich

    2010-01-01

    We have used the Chemical Structure DataBase (CSDB) of the NCI CADD Group, an aggregated collection of over 150 small-molecule databases totaling 103.5 million structure records, to conduct tautomerism analyses on one of the largest currently existing sets of real (i.e. not computer-generated) compounds. This analysis was carried out using calculable chemical structure identifiers developed by the NCI CADD Group, based on hash codes available in the chemoinformatics toolkit CACTVS and a newly developed scoring scheme to define a canonical tautomer for any encountered structure. CACTVS’s tautomerism definition, a set of 21 transform rules expressed in SMIRKS line notation, was used, which takes a comprehensive stance as to the possible types of tautomeric interconversion included. Tautomerism was found to be possible for more than 2/3 of the unique structures in the CSDB. A total of 680 million tautomers were calculated from, and including, the original structure records. Tautomerism overlap within the same individual database (i.e. at least one other entry was present that was really only a different tautomeric representation of the same compound) was found at an average rate of 0.3% of the original structure records, with values as high as nearly 2% for some of the databases in CSDB. Projected onto the set of unique structures (by FICuS identifier), this still occurred in about 1.5% of the cases. Tautomeric overlap across all constituent databases in CSDB was found for nearly 10% of the records in the collection. PMID:20512400

  6. JICST Factual Database JICST Chemical Substance Safety Regulation Database

    NASA Astrophysics Data System (ADS)

    Abe, Atsushi; Sohma, Tohru

    JICST Chemical Substance Safety Regulation Database is based on the Database of Safety Laws for Chemical Compounds constructed by Japan Chemical Industry Ecology-Toxicology & Information Center (JETOC) sponsored by the Science and Technology Agency in 1987. JICST has modified the JETOC database system, added data and started the online service through JOIS-F (JICST Online Information Service-Factual database) in January 1990. The JICST database comprises eighty-three laws and fourteen hundred compounds. The authors outline the database, data items, files and search commands. An example of an online session is presented.

  7. Gramene database in 2010: updates and extensions.

    PubMed

    Youens-Clark, Ken; Buckler, Ed; Casstevens, Terry; Chen, Charles; Declerck, Genevieve; Derwent, Paul; Dharmawardhana, Palitha; Jaiswal, Pankaj; Kersey, Paul; Karthikeyan, A S; Lu, Jerry; McCouch, Susan R; Ren, Liya; Spooner, William; Stein, Joshua C; Thomason, Jim; Wei, Sharon; Ware, Doreen

    2011-01-01

    Now in its 10th year, the Gramene database (http://www.gramene.org) has grown from its primary focus on rice, the first fully-sequenced grass genome, to become a resource for major model and crop plants including Arabidopsis, Brachypodium, maize, sorghum, poplar and grape in addition to several species of rice. Gramene began with the addition of an Ensembl genome browser and has expanded in the last decade to become a robust resource for plant genomics hosting a wide array of data sets including quantitative trait loci (QTL), metabolic pathways, genetic diversity, genes, proteins, germplasm, literature, ontologies and a fully-structured markers and sequences database integrated with genome browsers and maps from various published studies (genetic, physical, bin, etc.). In addition, Gramene now hosts a variety of web services including a Distributed Annotation Server (DAS), BLAST and a public MySQL database. Twice a year, Gramene releases a major build of the database and makes interim releases to correct errors or to make important updates to software and/or data.

  8. Protein Model Database

    SciTech Connect

    Fidelis, K; Adzhubej, A; Kryshtafovych, A; Daniluk, P

    2005-02-23

    The phenomenal success of the genome sequencing projects reveals the power of completeness in revolutionizing biological science. Currently it is possible to sequence entire organisms at a time, allowing for a systemic rather than fractional view of their organization and the various genome-encoded functions. There is an international plan to move towards a similar goal in the area of protein structure. This will not be achieved by experiment alone, but rather by a combination of efforts in crystallography, NMR spectroscopy, and computational modeling. Only a small fraction of structures are expected to be identified experimentally, the remainder to be modeled. Presently there is no organized infrastructure to critically evaluate and present these data to the biological community. The goal of the Protein Model Database project is to create such infrastructure, including (1) a public database of theoretically derived protein structures; (2) reliable annotation of protein model quality; (3) novel structure analysis tools; and (4) access to the highest quality modeling techniques available.

  9. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1992-11-09

    The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents on compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. A computerized version is available that includes retrieval software.

  10. Surgical research using national databases.

    PubMed

    Alluri, Ram K; Leland, Hyuma; Heckmann, Nathanael

    2016-10-01

    Recent changes in healthcare and advances in technology have increased the use of large-volume national databases in surgical research. These databases have been used to develop perioperative risk stratification tools, assess postoperative complications, calculate costs, and investigate numerous other topics across multiple surgical specialties. The results of these studies contain variable information but are subject to unique limitations. The use of large-volume national databases is increasing in popularity, and thorough understanding of these databases will allow for a more sophisticated and better educated interpretation of studies that utilize such databases. This review will highlight the composition, strengths, and weaknesses of commonly used national databases in surgical research.

  11. Surgical research using national databases

    PubMed Central

    Leland, Hyuma; Heckmann, Nathanael

    2016-01-01

    Recent changes in healthcare and advances in technology have increased the use of large-volume national databases in surgical research. These databases have been used to develop perioperative risk stratification tools, assess postoperative complications, calculate costs, and investigate numerous other topics across multiple surgical specialties. The results of these studies contain variable information but are subject to unique limitations. The use of large-volume national databases is increasing in popularity, and thorough understanding of these databases will allow for a more sophisticated and better educated interpretation of studies that utilize such databases. This review will highlight the composition, strengths, and weaknesses of commonly used national databases in surgical research. PMID:27867945

  12. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1996-07-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.

  13. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1999-01-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.

  14. COBE Astronomical Databases

    NASA Astrophysics Data System (ADS)

    Freedman, I.; Raugh, A. C.; Cheng, E. S.

    A project to store and convert external astronomical survey maps to the Cosmic Background Explorer (COBE) spacecraft pixelization is described. Established software is reused in order to reduce development costs. The proposed packages and systems include the Image Reduction and Analysis Facility (IRAF), Interactive Data Language Astronomy Library (IDL), the FITSIO data transfer package and the Astronomical Image Processing System (AIPS). The software structure of the astronomical databases, projected conversion schemes, quality assurance procedures and outstanding problems will be discussed.

  15. Developing customer databases.

    PubMed

    Rao, S K; Shenbaga, S

    2000-01-01

    There is a growing consensus among pharmaceutical companies that more product and customer-specific approaches to marketing and selling a new drug can result in substantial increases in sales. Marketers and researchers taking a proactive micro-marketing approach to identifying, profiling, and communicating with target customers are likely to facilitate such approaches and outcomes. This article provides a working framework for creating customer databases that can be effectively mined to achieve a variety of such marketing and sales force objectives.

  16. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1996-11-15

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.

  17. The Danish Melanoma Database

    PubMed Central

    Hölmich, Lisbet Rosenkrantz; Klausen, Siri; Spaun, Eva; Schmidt, Grethe; Gad, Dorte; Svane, Inge Marie; Schmidt, Henrik; Lorentzen, Henrik Frank; Ibfelt, Else Helene

    2016-01-01

    Aim of database The aim of the database is to monitor and improve the treatment and survival of melanoma patients. Study population All Danish patients with cutaneous melanoma and in situ melanomas must be registered in the Danish Melanoma Database (DMD). In 2014, 2,525 patients with invasive melanoma and 780 with in situ tumors were registered. The coverage is currently 93% compared with the Danish Pathology Register. Main variables The main variables include demographic, clinical, and pathological characteristics, including Breslow’s tumor thickness, ± ulceration, mitoses, and tumor–node–metastasis stage. Information about the date of diagnosis, treatment, type of surgery, including safety margins, results of lymphoscintigraphy in patients for whom this was indicated (tumors > T1a), results of sentinel node biopsy, pathological evaluation hereof, and follow-up information, including recurrence, nature, and treatment hereof is registered. In case of death, the cause and date are included. Currently, all data are entered manually; however, data catchment from the existing registries is planned to be included shortly. Descriptive data The DMD is an old research database, but new as a clinical quality register. The coverage is high, and the performance in the five Danish regions is quite similar due to strong adherence to guidelines provided by the Danish Melanoma Group. The list of monitored indicators is constantly expanding, and annual quality reports are issued. Several important scientific studies are based on DMD data. Conclusion DMD holds unique detailed information about tumor characteristics, the surgical treatment, and follow-up of Danish melanoma patients. Registration and monitoring is currently expanding to encompass even more clinical parameters to benefit both patient treatment and research. PMID:27822097

  18. Electronic Journals as Databases

    NASA Astrophysics Data System (ADS)

    Holl, A.

    2004-07-01

    The Information Bulletin on Variable Stars is a bulletin fully available in electronic form. We are working on converting the text, tables and figures of the papers published into a database, and, at the same time, making them accessible and addressable. IBVS Data Service will provide information on variable stars --- like finding charts, light curves --- and will be VO compatible. Other services could link to individual figures, data files, etc. this way.

  19. Real Time Baseball Database

    NASA Astrophysics Data System (ADS)

    Fukue, Yasuhiro

    The author describes the system outline, features and operations of the "Nikkan Sports Realtime Baseball Database", which was developed and is operated by Nikkan Sports Shimbun, K. K. The system allows numerical data from professional baseball games to be entered as the games proceed and updates the data in real time. Besides serving as a supporting tool for preparing newspapers, it is also available to broadcasting media and to general users through NTT dial Q2 and other services.

  20. The Danish Sarcoma Database

    PubMed Central

    Jørgensen, Peter Holmberg; Lausten, Gunnar Schwarz; Pedersen, Alma B

    2016-01-01

    Aim The aim of the database is to gather information about sarcomas treated in Denmark in order to continuously monitor and improve the quality of sarcoma treatment in a local, a national, and an international perspective. Study population Patients in Denmark diagnosed with a sarcoma, both skeletal and extraskeletal, have been registered since 2009. Main variables The database contains information about appearance of symptoms; date of receiving referral to a sarcoma center; date of first visit; whether surgery has been performed elsewhere before referral, diagnosis, and treatment; tumor characteristics such as location, size, malignancy grade, and growth pattern; details on treatment (kind of surgery, amount of radiation therapy, type and duration of chemotherapy); complications of treatment; local recurrence and metastases; and comorbidity. In addition, several quality indicators are registered in order to measure the quality of care provided by the hospitals and make comparisons between hospitals and with international standards. Descriptive data Demographic patient-specific data such as age, sex, region of living, comorbidity, World Health Organization’s International Classification of Diseases – tenth edition codes and TNM Classification of Malignant Tumours, and date of death (after yearly coupling to the Danish Civil Registration System). Data quality and completeness are currently secured. Conclusion The Danish Sarcoma Database is population based and includes sarcomas occurring in Denmark since 2009. It is a valuable tool for monitoring sarcoma incidence and quality of treatment and its improvement, postoperative complications, and recurrence within 5 years follow-up. The database is also a valuable research tool to study the impact of technical and medical interventions on prognosis of sarcoma patients. PMID:27822116

  1. Unified Database Development Program.

    DTIC Science & Technology

    1984-03-01

    unified database (UDB) program was to develop an automated system that would be useful to those responsible for the design, development, testing, and...weapon system design. Background The Air Force is concerned with the lack of adequate logistics consideration during the weapon system design process. To...produce a weapon system with optimal cost and mission effectiveness, logistics factors must be considered very early and throughout the system design

  2. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1997-02-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on various refrigerants. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.

  3. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1998-08-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on many refrigerants including propane, ammonia, water, carbon dioxide, propylene, ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.

  4. The Cambridge Structural Database.

    PubMed

    Groom, Colin R; Bruno, Ian J; Lightfoot, Matthew P; Ward, Suzanna C

    2016-04-01

    The Cambridge Structural Database (CSD) contains a complete record of all published organic and metal-organic small-molecule crystal structures. The database has been in operation for over 50 years and continues to be the primary means of sharing structural chemistry data and knowledge across disciplines. As well as structures that are made public to support scientific articles, it includes many structures published directly as CSD Communications. All structures are processed both computationally and by expert structural chemistry editors prior to entering the database. A key component of this processing is the reliable association of the chemical identity of the structure studied with the experimental data. This important step helps ensure that data is widely discoverable and readily reusable. Content is further enriched through selective inclusion of additional experimental data. Entries are available to anyone through free CSD community web services. Linking services developed and maintained by the CCDC, combined with the use of standard identifiers, facilitate discovery from other resources. Data can also be accessed through CCDC and third party software applications and through an application programming interface.

  5. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1993-04-30

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents to accelerate availability of the information and will be completed or replaced in future updates.

  6. State Analysis Database Tool

    NASA Technical Reports Server (NTRS)

    Rasmussen, Robert; Bennett, Matthew

    2006-01-01

    The State Analysis Database Tool software establishes a productive environment for collaboration among software and system engineers engaged in the development of complex interacting systems. The tool embodies State Analysis, a model-based system engineering methodology founded on a state-based control architecture (see figure). A state represents a momentary condition of an evolving system, and a model may describe how a state evolves and is affected by other states. The State Analysis methodology is a process for capturing system and software requirements in the form of explicit models and states, and defining goal-based operational plans consistent with the models. Requirements, models, and operational concerns have traditionally been documented in a variety of system engineering artifacts that address different aspects of a mission's lifecycle. In State Analysis, requirements, models, and operations information are State Analysis artifacts that are consistent and stored in a State Analysis Database. The tool includes a back-end database, a multi-platform front-end client, and Web-based administrative functions. The tool is structured to prompt an engineer to follow the State Analysis methodology, to encourage state discovery and model description, and to make software requirements and operations plans consistent with model descriptions.

  7. The Cambridge Structural Database

    PubMed Central

    Groom, Colin R.; Bruno, Ian J.; Lightfoot, Matthew P.; Ward, Suzanna C.

    2016-01-01

    The Cambridge Structural Database (CSD) contains a complete record of all published organic and metal–organic small-molecule crystal structures. The database has been in operation for over 50 years and continues to be the primary means of sharing structural chemistry data and knowledge across disciplines. As well as structures that are made public to support scientific articles, it includes many structures published directly as CSD Communications. All structures are processed both computationally and by expert structural chemistry editors prior to entering the database. A key component of this processing is the reliable association of the chemical identity of the structure studied with the experimental data. This important step helps ensure that data is widely discoverable and readily reusable. Content is further enriched through selective inclusion of additional experimental data. Entries are available to anyone through free CSD community web services. Linking services developed and maintained by the CCDC, combined with the use of standard identifiers, facilitate discovery from other resources. Data can also be accessed through CCDC and third party software applications and through an application programming interface. PMID:27048719

  8. Fiber pixelated image database

    NASA Astrophysics Data System (ADS)

    Shinde, Anant; Perinchery, Sandeep Menon; Matham, Murukeshan Vadakke

    2016-08-01

    Imaging of physically inaccessible parts of the body such as the colon at micron-level resolution is highly important in diagnostic medical imaging. Though flexible endoscopes based on the imaging fiber bundle are used for such diagnostic procedures, their inherent honeycomb-like structure creates fiber pixelation effects. This impedes the observer from perceiving the information from an image captured and hinders the direct use of image processing and machine intelligence techniques on the recorded signal. Significant efforts have been made by researchers in the recent past in the development and implementation of pixelation removal techniques. However, researchers have often used their own set of images without making source data available which subdued their usage and adaptability universally. A database of pixelated images is the current requirement to meet the growing diagnostic needs in the healthcare arena. An innovative fiber pixelated image database is presented, which consists of pixelated images that are synthetically generated and experimentally acquired. Sample space encompasses test patterns of different scales, sizes, and shapes. It is envisaged that this proposed database will alleviate the current limitations associated with relevant research and development and would be of great help for researchers working on comb structure removal algorithms.

  9. The Listeria monocytogenes strain 10403S BioCyc database.

    PubMed

    Orsi, Renato H; Bergholz, Teresa M; Wiedmann, Martin; Boor, Kathryn J

    2015-01-01

    Listeria monocytogenes is a food-borne pathogen of humans and other animals. The striking ability to survive several stresses usually used for food preservation makes L. monocytogenes one of the biggest concerns to the food industry, while the high mortality of listeriosis in specific groups of humans makes it a great concern for public health. Previous studies have shown that a regulatory network involving alternative sigma (σ) factors and transcription factors is pivotal to stress survival. However, few studies have evaluated the metabolic networks controlled by these regulatory mechanisms. The L. monocytogenes BioCyc database uses the strain 10403S as a model. Computer-generated initial annotation for all genes also allowed for identification, annotation and display of predicted reactions and pathways carried out by a single cell. Further ongoing manual curation based on published data as well as database mining for selected genes allowed the more refined annotation of functions, which, in turn, allowed for annotation of new pathways and fine-tuning of previously defined pathways to more L. monocytogenes-specific pathways. Using RNA-Seq data, several transcription start sites and promoter regions were mapped to the 10403S genome and annotated within the database. Additionally, the identification of promoter regions and a comprehensive review of available literature allowed the annotation of several regulatory interactions involving σ factors and transcription factors. The L. monocytogenes 10403S BioCyc database is a new resource for researchers studying Listeria and related organisms. It allows users to (i) have a comprehensive view of all reactions and pathways predicted to take place within the cell in the cellular overview, as well as to (ii) upload their own data, such as differential expression data, to visualize the data in the scope of predicted pathways and regulatory networks and to carry on enrichment analyses using several different annotations

  10. Generalized Database Management System Support for Numeric Database Environments.

    ERIC Educational Resources Information Center

    Dominick, Wayne D.; Weathers, Peggy G.

    1982-01-01

    This overview of potential for utilizing database management systems (DBMS) within numeric database environments highlights: (1) major features, functions, and characteristics of DBMS; (2) applicability to numeric database environment needs and user needs; (3) current applications of DBMS technology; and (4) research-oriented and…

  11. Design and Performance of a Xenobiotic Metabolism Database Manager for Building Metabolic Pathway Databases

    EPA Science Inventory

    A major challenge for scientists and regulators is accounting for the metabolic activation of chemicals that may lead to increased toxicity. Reliable forecasting of chemical metabolism is a critical factor in estimating a chemical’s toxic potential. Research is underway to develo...

  12. SmallSat Database

    NASA Technical Reports Server (NTRS)

    Petropulos, Dolores; Bittner, David; Murawski, Robert; Golden, Bert

    2015-01-01

    The SmallSat has an unrealized potential in both the private industry and in the federal government. Currently over 70 companies, 50 universities and 17 governmental agencies are involved in SmallSat research and development. In 1994, the U.S. Army Missile and Defense mapped the moon using smallSat imagery. Since then Smart Phones have introduced this imagery to the people of the world as diverse industries watched this trend. The deployment cost of smallSats is also greatly reduced compared to traditional satellites due to the fact that multiple units can be deployed in a single mission. Imaging payloads have become more sophisticated, smaller and lighter. In addition, the growth of small technology obtained from private industries has led to the more widespread use of smallSats. This includes greater revisit rates in imagery, significantly lower costs, the ability to update technology more frequently and the ability to decrease vulnerability to enemy attacks. The popularity of smallSats shows a changing mentality in this fast paced world of tomorrow. What impact has this created on the NASA communication networks now and in future years? In this project, we are developing the SmallSat Relational Database which can support a simulation of smallSats within the NASA SCaN Compatibility Environment for Networks and Integrated Communications (SCENIC) Modeling and Simulation Lab. The NASA Space Communications and Networks (SCaN) Program can use this modeling to project required network support needs in the next 10 to 15 years. The SmallSat Relational Database could model smallSats just as the other SCaN databases model the more traditional larger satellites, with a few exceptions, one being that the smallSat Database is designed to be built-to-order. The SmallSat database holds various hardware configurations that can be used to model a smallSat. It will require significant effort to develop as the research material can only be populated by hand to obtain the unique data

  13. High Temperature Superconducting Materials Database

    National Institute of Standards and Technology Data Gateway

    SRD 149 NIST High Temperature Superconducting Materials Database (Web, free access)   The NIST High Temperature Superconducting Materials Database (WebHTS) provides evaluated thermal, mechanical, and superconducting property data for oxides and other nonconventional superconductors.

  14. Mobile Source Observation Database (MSOD)

    EPA Pesticide Factsheets

    The Mobile Source Observation Database (MSOD) is a relational database developed by the Assessment and Standards Division (ASD) of the U.S. EPA Office of Transportation and Air Quality (formerly the Office of Mobile Sources).

  15. A Case for Database Filesystems

    SciTech Connect

    Adams, P A; Hax, J C

    2009-05-13

    Data intensive science is offering new challenges and opportunities for Information Technology and traditional relational databases in particular. Database filesystems offer the potential to store Level Zero data and analyze Level 1 and Level 3 data within the same database system [2]. Scientific data is typically composed of both unstructured files and scalar data. Oracle SecureFiles is a new database filesystem feature in Oracle Database 11g that is specifically engineered to deliver high performance and scalability for storing unstructured or file data inside the Oracle database. SecureFiles presents the best of both the filesystem and the database worlds for unstructured content. Data stored inside SecureFiles can be queried or written at performance levels comparable to that of traditional filesystems while retaining the advantages of the Oracle database.

  16. ThermoData Engine Database

    National Institute of Standards and Technology Data Gateway

    SRD 103 NIST ThermoData Engine Database (PC database for purchase)   ThermoData Engine is the first product fully implementing all major principles of the concept of dynamic data evaluation formulated at NIST/TRC.

  17. The Reach Address Database (RAD)

    EPA Pesticide Factsheets

    The Reach Address Database (RAD) stores reach address information for each Water Program feature that has been linked to the underlying surface water features (streams, lakes, etc) in the National Hydrology Database (NHD) Plus dataset.

  18. Pleurochrysome: A Web Database of Pleurochrysis Transcripts and Orthologs Among Heterogeneous Algae.

    PubMed

    Yamamoto, Naoki; Kudo, Toru; Fujiwara, Shoko; Takatsuka, Yukiko; Hirokawa, Yasutaka; Tsuzuki, Mikio; Takano, Tomoyuki; Kobayashi, Masaaki; Suda, Kunihiro; Asamizu, Erika; Yokoyama, Koji; Shibata, Daisuke; Tabata, Satoshi; Yano, Kentaro

    2016-01-01

    Pleurochrysis is a coccolithophorid genus, which belongs to the Coccolithales in the Haptophyta. The genus has been used extensively for biological research, together with Emiliania in the Isochrysidales, to understand distinctive features between the two coccolithophorid-including orders. However, molecular biological research on Pleurochrysis such as elucidation of the molecular mechanism behind coccolith formation has not made great progress at least in part because of lack of comprehensive gene information. To provide such information to the research community, we built an open web database, the Pleurochrysome (http://bioinf.mind.meiji.ac.jp/phapt/), which currently stores 9,023 unique gene sequences (designated as UNIGENEs) assembled from expressed sequence tag sequences of P. haptonemofera as core information. The UNIGENEs were annotated with gene sequences sharing significant homology, conserved domains, Gene Ontology, KEGG Orthology, predicted subcellular localization, open reading frames and orthologous relationship with genes of 10 other algal species, a cyanobacterium and the yeast Saccharomyces cerevisiae. This sequence and annotation information can be easily accessed via several search functions. Besides fundamental functions such as BLAST and keyword searches, this database also offers search functions to explore orthologous genes in the 12 organisms and to seek novel genes. The Pleurochrysome will promote molecular biological and phylogenetic research on coccolithophorids and other haptophytes by helping scientists mine data from the primary transcriptome of P. haptonemofera.

  19. Hydrogen Leak Detection Sensor Database

    NASA Technical Reports Server (NTRS)

    Baker, Barton D.

    2010-01-01

    This slide presentation reviews the characteristics of the Hydrogen Sensor database. The database is the result of NASA's continuing interest in and improvement of its ability to detect and assess gas leaks in space applications. The database specifics and a snapshot of an entry in the database are reviewed. Attempts were made to determine the applicability of each of the 65 sensors for ground and/or vehicle use.

  20. Microbial Properties Database Editor Tutorial

    EPA Science Inventory

    A Microbial Properties Database Editor (MPDBE) has been developed to help consolidate microbial-relevant data to populate a microbial database and support a database editor by which an authorized user can modify physico-microbial properties related to microbial indicators and pat...

  1. Scientific and Technical Document Database

    National Institute of Standards and Technology Data Gateway

    NIST Scientific and Technical Document Database (PC database for purchase)   The images in NIST Special Database 20 contain a very rich set of graphic elements from scientific and technical documents, such as graphs, tables, equations, two column text, maps, pictures, footnotes, annotations, and arrays of such elements.

  2. Choosing among the physician databases.

    PubMed

    Heller, R H

    1988-04-01

    Prudent examination and knowing how to ask the "right questions" can enable hospital marketers and planners to find the most accurate and appropriate database. The author compares the comprehensive AMA physician database with the less expensive MEDEC database to determine their strengths and weaknesses.

  3. A Spanish American War Database.

    ERIC Educational Resources Information Center

    Hands, Edmund

    1992-01-01

    Discusses a database used by honors high school U.S. history students learning about the Spanish-American War. Reports that the students compiled the database. Includes some of the historical background of the war, questions for study, a database key, and a table showing U.S. senators' votes relating to the War. (SG)

  4. EMU Lessons Learned Database

    NASA Technical Reports Server (NTRS)

    Matthews, Kevin M., Jr.; Crocker, Lori; Cupples, J. Scott

    2011-01-01

    As manned space exploration takes on the task of traveling beyond low Earth orbit, many problems arise that must be solved in order to make the journey possible. One major task is protecting humans from the harsh space environment. The current method of protecting astronauts during Extravehicular Activity (EVA) is through use of the specially designed Extravehicular Mobility Unit (EMU). As more rigorous EVA conditions need to be endured at new destinations, the suit will need to be tailored and improved in order to accommodate the astronaut. The objective behind the EMU Lessons Learned Database (LLD) is to be able to create a tool which will assist in the development of next-generation EMUs, along with maintenance and improvement of the current EMU, by compiling data from Failure Investigation and Analysis Reports (FIARs) which have information on past suit failures. FIARs use a system of codes that give more information on the aspects of the failure, but if one is unfamiliar with the EMU they will be unable to decipher the information. A goal of the EMU LLD is to not only compile the information, but to present it in a user-friendly, organized, searchable database accessible to all familiarity levels with the EMU; both newcomers and veterans alike. The EMU LLD originally started as an Excel database, which allowed easy navigation and analysis of the data through pivot charts. Creating an entry requires access to the Problem Reporting And Corrective Action database (PRACA), which contains the original FIAR data for all hardware. FIAR data are then transferred to, defined, and formatted in the LLD. Work is being done to create a web-based version of the LLD in order to increase accessibility to all of Johnson Space Center (JSC), which includes converting entries from Excel to the HTML format. FIARs related to the EMU have been completed in the Excel version, and now focus has shifted to expanding FIAR data in the LLD to include EVA tools and support hardware such as

  5. DDRprot: a database of DNA damage response-related proteins

    PubMed Central

    Andrés-León, Eduardo; Cases, Ildefonso; Arcas, Aida; Rojas, Ana M.

    2016-01-01

    The DNA Damage Response (DDR) signalling network is an essential system that protects the genome’s integrity. The DDRprot database presented here is a resource that integrates manually curated information on the human DDR network and its sub-pathways. For each particular DDR protein, we present detailed information about its function. If involved in post-translational modifications (PTMs) with each other, we depict the position of the modified residue/s in the three-dimensional structures, when resolved structures are available for the proteins. All this information is linked to the original publication from where it was obtained. Phylogenetic information is also shown, including time of emergence and conservation across 47 selected species, family trees and sequence alignments of homologues. The DDRprot database can be queried by different criteria: pathways, species, evolutionary age or involvement in PTMs. Sequence searches using hidden Markov models can also be used. Database URL: http://ddr.cbbio.es. PMID:27577567

  6. Plant Reactome: a resource for plant pathways and comparative analysis

    PubMed Central

    Naithani, Sushma; Preece, Justin; D'Eustachio, Peter; Gupta, Parul; Amarasinghe, Vindhya; Dharmawardhana, Palitha D.; Wu, Guanming; Fabregat, Antonio; Elser, Justin L.; Weiser, Joel; Keays, Maria; Fuentes, Alfonso Munoz-Pomer; Petryszak, Robert; Stein, Lincoln D.; Ware, Doreen; Jaiswal, Pankaj

    2017-01-01

    Plant Reactome (http://plantreactome.gramene.org/) is a free, open-source, curated plant pathway database portal, provided as part of the Gramene project. The database provides intuitive bioinformatics tools for the visualization, analysis and interpretation of pathway knowledge to support genome annotation, genome analysis, modeling, systems biology, basic research and education. Plant Reactome employs the structural framework of a plant cell to show metabolic, transport, genetic, developmental and signaling pathways. We manually curate molecular details of pathways in these domains for reference species Oryza sativa (rice) supported by published literature and annotation of well-characterized genes. Two hundred twenty-two rice pathways, 1025 reactions associated with 1173 proteins, 907 small molecules and 256 literature references have been curated to date. These reference annotations were used to project pathways for 62 model, crop and evolutionarily significant plant species based on gene homology. Database users can search and browse various components of the database, visualize curated baseline expression of pathway-associated genes provided by the Expression Atlas and upload and analyze their Omics datasets. The database also offers data access via Application Programming Interfaces (APIs) and in various standardized pathway formats, such as SBML and BioPAX. PMID:27799469

  7. Construction of file database management

    SciTech Connect

    MERRILL,KYLE J.

    2000-03-01

    This work created a database for tracking data analysis files from multiple lab techniques and equipment stored on a central file server. Experimental details appropriate for each file type are pulled from the file header and stored in a searchable database. The database also stores specific location and self-directory structure for each data file. Queries can be run on the database according to file type, sample type or other experimental parameters. The database was constructed in Microsoft Access and Visual Basic was used for extraction of information from the file header.
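
    The workflow in this abstract (pull experimental details from each file header, store them in a searchable table, then query by file type or sample type) can be sketched as follows. This is only an illustration under stated assumptions: the original system was built in Microsoft Access and Visual Basic, whereas the sketch swaps in Python with SQLite, and the "Key: value" header layout, the table and column names, and the sample file paths are all hypothetical.

```python
import sqlite3

def parse_header(text):
    """Collect 'Key: value' pairs from the lines before the first blank line.
    The header layout is an assumption made for this sketch."""
    meta = {}
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the (hypothetical) header block
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip().lower()] = value.strip()
    return meta

def build_index(files):
    """Store one row per data file: its location plus selected header fields."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_files ("
        "path TEXT PRIMARY KEY, file_type TEXT, sample_type TEXT)"
    )
    for path, text in files:
        meta = parse_header(text)
        conn.execute(
            "INSERT INTO data_files VALUES (?, ?, ?)",
            (path, meta.get("technique"), meta.get("sample")),
        )
    conn.commit()
    return conn

def find_by_type(conn, file_type):
    """Return the stored locations of all files of one type, sorted by path."""
    rows = conn.execute(
        "SELECT path FROM data_files WHERE file_type = ? ORDER BY path",
        (file_type,),
    )
    return [row[0] for row in rows]

# Hypothetical file contents standing in for files on the central server.
FILES = [
    ("server/xrd/run1.dat", "Technique: XRD\nSample: alloy\n\n1 2 3"),
    ("server/ftir/run2.dat", "Technique: FTIR\nSample: polymer\n\n4 5 6"),
    ("server/xrd/run3.dat", "Technique: XRD\nSample: polymer\n\n7 8 9"),
]

conn = build_index(FILES)
print(find_by_type(conn, "XRD"))
```

    Keeping the file location in the same row as the extracted header fields is what lets a query on experimental parameters return the place where each matching file is stored.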

  8. Databases as an information service

    NASA Technical Reports Server (NTRS)

    Vincent, D. A.

    1983-01-01

    The relationship of databases to information services, and the range of information services users and their needs for information is explored and discussed. It is argued that for database information to be valuable to a broad range of users, it is essential that access methods be provided that are relatively unstructured and natural to information services users who are interested in the information contained in databases, but who are not willing to learn and use traditional structured query languages. Unless this ease of use of databases is considered in the design and application process, the potential benefits from using database systems may not be realized.

  9. NLCD 2011 database

    EPA Pesticide Factsheets

    National Land Cover Database 2011 (NLCD 2011) is the most recent national land cover product created by the Multi-Resolution Land Characteristics (MRLC) Consortium. NLCD 2011 provides - for the first time - the capability to assess wall-to-wall, spatially explicit, national land cover changes and trends across the United States from 2001 to 2011. As with two previous NLCD land cover products NLCD 2011 keeps the same 16-class land cover classification scheme that has been applied consistently across the United States at a spatial resolution of 30 meters. NLCD 2011 is based primarily on a decision-tree classification of circa 2011 Landsat satellite data. This dataset is associated with the following publication: Homer, C., J. Dewitz, L. Yang, S. Jin, P. Danielson, G. Xian, J. Coulston, N. Herold, J. Wickham, and K. Megown. Completion of the 2011 National Land Cover Database for the Conterminous United States – Representing a Decade of Land Cover Change Information. PHOTOGRAMMETRIC ENGINEERING AND REMOTE SENSING. American Society for Photogrammetry and Remote Sensing, Bethesda, MD, USA, 81(0): 345-354, (2015).

  10. Asbestos Exposure Assessment Database

    NASA Technical Reports Server (NTRS)

    Arcot, Divya K.

    2010-01-01

    Exposure to particular hazardous materials in a work environment is dangerous to the employees who work directly with or around the materials as well as those who come in contact with them indirectly. In order to maintain a national standard for safe working environments and protect worker health, the Occupational Safety and Health Administration (OSHA) has set forth numerous precautionary regulations. NASA has been proactive in adhering to these regulations by implementing standards which are often stricter than regulation limits and administering frequent health risk assessments. The primary objective of this project is to create the infrastructure for an Asbestos Exposure Assessment Database specific to NASA Johnson Space Center (JSC) which will compile all of the exposure assessment data into a well-organized, navigable format. The data includes Sample Types, Sample Durations, Crafts of those from whom samples were collected, Job Performance Requirements (JPR) numbers, Phase Contrast Microscopy (PCM) and Transmission Electron Microscopy (TEM) results and qualifiers, Personal Protective Equipment (PPE), and names of industrial hygienists who performed the monitoring. This database will allow NASA to provide OSHA with specific information demonstrating that JSC's work procedures are protective enough to minimize the risk of future disease from the exposures. The data has been collected by the NASA contractors Computer Sciences Corporation (CSC) and Wyle Laboratories. The personal exposure samples were collected from devices worn by laborers working at JSC and by building occupants located in asbestos-containing buildings.

  11. The ITPA disruption database

    NASA Astrophysics Data System (ADS)

    Eidietis, N. W.; Gerhardt, S. P.; Granetz, R. S.; Kawano, Y.; Lehnen, M.; Lister, J. B.; Pautasso, G.; Riccardo, V.; Tanna, R. L.; Thornton, A. J.; ITPA Disruption Database Participants, The

    2015-06-01

    A multi-device database of disruption characteristics has been developed under the auspices of the International Tokamak Physics Activity magneto-hydrodynamics topical group. The purpose of this ITPA disruption database (IDDB) is to find the commonalities between the disruption and disruption mitigation characteristics in a wide variety of tokamaks in order to elucidate the physics underlying tokamak disruptions and to extrapolate toward much larger devices, such as ITER and future burning plasma devices. In contrast to previous smaller disruption data collation efforts, the IDDB aims to provide significant context for each shot provided, allowing exploration of a wide array of relationships between pre-disruption and disruption parameters. The IDDB presently includes contributions from nine tokamaks, including both conventional aspect ratio and spherical tokamaks. An initial parametric analysis of the available data is presented. This analysis includes current quench rates, halo current fraction and peaking, and the effectiveness of massive impurity injection. The IDDB is publicly available, with instructions for access provided herein.

  12. The TIGR Maize Database.

    PubMed

    Chan, Agnes P; Pertea, Geo; Cheung, Foo; Lee, Dan; Zheng, Li; Whitelaw, Cathy; Pontaroli, Ana C; SanMiguel, Phillip; Yuan, Yinan; Bennetzen, Jeffrey; Barbazuk, William Brad; Quackenbush, John; Rabinowicz, Pablo D

    2006-01-01

    Maize is a staple crop of the grass family and also an excellent model for plant genetics. Owing to the large size and repetitiveness of its genome, we previously investigated two approaches to accelerate gene discovery and genome analysis in maize: methylation filtration and high C(0)t selection. These techniques allow the construction of gene-enriched genomic libraries by minimizing repeat sequences due to either their methylation status or their copy number, yielding a 7-fold enrichment in genic sequences relative to a random genomic library. Approximately 900,000 gene-enriched reads from maize were generated and clustered into Assembled Zea mays (AZM) sequences. Here we report the current AZM release, which consists of approximately 298 Mb representing 243,807 sequence assemblies and singletons. In order to provide a repository of publicly available maize genomic sequences, we have created the TIGR Maize Database (http://maize.tigr.org). In this resource, we have assembled and annotated the AZMs and used available sequenced markers to anchor AZMs to maize chromosomes. We have constructed a maize repeat database and generated draft sequence assemblies of 287 maize bacterial artificial chromosome (BAC) clone sequences, which we annotated along with 172 additional publicly available BAC clones. All sequences, assemblies and annotations are available at the project website via web interfaces and FTP downloads.

  13. UGTA Photograph Database

    SciTech Connect

    NSTec Environmental Restoration

    2009-04-20

    One of the advantages of the Nevada Test Site (NTS) is that most of the geologic and hydrologic features such as hydrogeologic units (HGUs), hydrostratigraphic units (HSUs), and faults, which are important aspects of flow and transport modeling, are exposed at the surface somewhere in the vicinity of the NTS and thus are available for direct observation. However, due to access restrictions and the remote locations of many of the features, most Underground Test Area (UGTA) participants cannot observe these features directly in the field. Fortunately, National Security Technologies, LLC, geologists and their predecessors have photographed many of these features through the years. During fiscal year 2009, work was done to develop an online photograph database for use by the UGTA community. Photographs were organized, compiled, and imported into Adobe® Photoshop® Elements 7. The photographs were then assigned keyword tags such as alteration type, HGU, HSU, location, rock feature, rock type, and stratigraphic unit. Some fully tagged photographs were then selected and uploaded to the UGTA website. This online photograph database provides easy access for all UGTA participants and can help “ground truth” their analytical and modeling tasks. It also provides new participants a resource to more quickly learn the geology and hydrogeology of the NTS.

  14. IPD: the Immuno Polymorphism Database.

    PubMed

    Robinson, James; Marsh, Steven G E

    2007-01-01

    The Immuno Polymorphism Database (IPD) (http://www.ebi.ac.uk/ipd/) is a set of specialist databases related to the study of polymorphic genes in the immune system. IPD currently consists of four databases: IPD-KIR, which contains the allelic sequences of killer cell immunoglobulin-like receptors (KIRs); IPD-MHC, a database of sequences of the major histocompatibility complex (MHC) of different species; IPD-HPA, which covers alloantigens expressed only on platelets; and IPD-ESTAB, which provides access to the European Searchable Tumour Cell Line Database, a cell bank of immunologically characterized melanoma cell lines. The IPD project works with specialist groups or nomenclature committees who provide and curate individual sections before they are submitted to IPD for online publication. The IPD project stores all the data in a set of related databases. Those sections with similar data, such as IPD-KIR and IPD-MHC, share the same database structure.

  15. National Geochronological Database

    USGS Publications Warehouse

    Revised by Sloan, Jan; Henry, Christopher D.; Hopkins, Melanie; Ludington, Steve; Original database by Zartman, Robert E.; Bush, Charles A.; Abston, Carl

    2003-01-01

    The National Geochronological Data Base (NGDB) was established by the United States Geological Survey (USGS) to collect and organize published isotopic (also known as radiometric) ages of rocks in the United States. The NGDB (originally known as the Radioactive Age Data Base, RADB) was started in 1974. A committee appointed by the Director of the USGS was given the mission to investigate the feasibility of compiling the published radiometric ages for the United States into a computerized data bank for ready access by the user community. A successful pilot program, which was conducted in 1975 and 1976 for the State of Wyoming, led to a decision to proceed with the compilation of the entire United States. For each dated rock sample reported in published literature, a record containing information on sample location, rock description, analytical data, age, interpretation, and literature citation was constructed and included in the NGDB. The NGDB was originally constructed and maintained on a mainframe computer, and later converted to a Helix Express relational database maintained on an Apple Macintosh desktop computer. The NGDB and a program to search the data files were published and distributed on Compact Disc-Read Only Memory (CD-ROM) in standard ISO 9660 format as USGS Digital Data Series DDS-14 (Zartman and others, 1995). As of May 1994, the NGDB consisted of more than 18,000 records containing over 30,000 individual ages, which is believed to represent approximately one-half the number of ages published for the United States through 1991. Because the organizational unit responsible for maintaining the database was abolished in 1996, and because we wanted to provide the data in more usable formats, we have reformatted the data, checked and edited the information in some records, and provided this online version of the NGDB. This report describes the changes made to the data and formats, and provides instructions for the use of the database in geographic

  16. A Scalable Database Infrastructure

    NASA Astrophysics Data System (ADS)

    Arko, R. A.; Chayes, D. N.

    2001-12-01

    The rapidly increasing volume and complexity of MG&G data, and the growing demand from funding agencies and the user community that it be easily accessible, demand that we improve our approach to data management in order to reach a broader user base and operate more efficiently and effectively. We have chosen an approach based on industry-standard relational database management systems (RDBMS) that use community-wide data specifications, where there is a clear and well-documented external interface that allows use of general purpose as well as customized clients. Rapid prototypes assembled with this approach show significant advantages over the traditional, custom-built data management systems that often use "in-house" legacy file formats, data specifications, and access tools. We have developed an effective database prototype based on a public domain RDBMS (PostgreSQL) and metadata standard (FGDC), and used it as a template for several ongoing MG&G database management projects - including ADGRAV (Antarctic Digital Gravity Synthesis), MARGINS, the Community Review system of the Digital Library for Earth Science Education, multibeam swath bathymetry metadata, and the R/V Maurice Ewing onboard acquisition system. By using standard formats and specifications, and working from a common prototype, we are able to reuse code and deploy rapidly. Rather than spend time on low-level details such as storage and indexing (which are built into the RDBMS), we can focus on high-level details such as documentation and quality control. In addition, because many commercial off-the-shelf (COTS) and public domain data browsers and visualization tools have built-in RDBMS support, we can focus on backend development and leave the choice of a frontend client(s) up to the end user. While our prototype is running under an open source RDBMS on a single processor host, the choice of standard components allows this implementation to scale to commercial RDBMS products and multiprocessor servers as

  17. Genome wide expression analysis in HPV16 Cervical Cancer: identification of altered metabolic pathways

    PubMed Central

    Pérez-Plasencia, Carlos; Vázquez-Ortiz, Guelaguetza; López-Romero, Ricardo; Piña-Sanchez, Patricia; Moreno, José; Salcedo, Mauricio

    2007-01-01

    Background Cervical carcinoma (CC) is a leading cause of death among women worldwide. Human papilloma virus (HPV) is a major etiological factor in CC, and HPV 16 is the most frequent viral type present. Our aim was to characterize metabolic pathways altered in HPV 16 tumor samples by means of transcriptome-wide analysis and bioinformatics tools for visualizing expression data in the context of KEGG biological pathways. Results We found 2,067 genes significantly up- or down-modulated (at least 2-fold) in tumor clinical samples compared to normal tissues, representing ~3.7% of analyzed genes. Cervical carcinoma was associated with an important up-regulation of the Wnt signaling pathway, which was validated by in situ hybridization in clinical samples. Other up-regulated pathways were those of calcium signaling and MAPK signaling, as well as cell cycle-related genes. There was down-regulation of focal adhesion and TGF-β signaling, among other metabolic pathways. Conclusion This analysis of the transcriptome of HPV 16 tumors could be useful for the identification of genes and molecular pathways involved in the pathogenesis of cervical carcinoma. Understanding the possible role of these proteins in the pathogenesis of CC deserves further study. PMID:17822553
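
    The "at least 2-fold" criterion in the abstract above can be illustrated with a minimal sketch. It assumes each gene has one tumor and one normal expression value and labels a gene by its log2 fold change; the gene names, intensities and pseudocount are hypothetical, and the statistical significance testing mentioned in the abstract is not modeled here.

```python
import math


def log2_fold_change(tumor: float, normal: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of tumor to normal expression; the pseudocount avoids division by zero."""
    return math.log2((tumor + pseudocount) / (normal + pseudocount))


def classify_genes(expression: dict, min_fold: float = 2.0) -> dict:
    """Label each gene 'up', 'down' or 'unchanged' from its (tumor, normal) pair."""
    threshold = math.log2(min_fold)  # a 2-fold change corresponds to |log2FC| >= 1
    labels = {}
    for gene, (tumor, normal) in expression.items():
        lfc = log2_fold_change(tumor, normal)
        if lfc >= threshold:
            labels[gene] = "up"
        elif lfc <= -threshold:
            labels[gene] = "down"
        else:
            labels[gene] = "unchanged"
    return labels


# Hypothetical intensities for illustration only (not data from the study).
example = {
    "GENE_A": (399.0, 99.0),   # 4-fold higher in tumor
    "GENE_B": (49.0, 199.0),   # 4-fold lower in tumor
    "GENE_C": (110.0, 100.0),  # roughly unchanged
}
labels = classify_genes(example)
```

    Working on the log2 scale makes up- and down-regulation symmetric around zero, which is why a single threshold can serve both directions.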

  18. PPD - Proteome Profile Database.

    PubMed

    Sakharkar, Kishore R; Chow, Vincent T K

    2004-01-01

    With the complete sequencing of multiple genomes, there have been extensions in the methods of sequence analysis from single gene/protein-based to analyzing multiple genes and proteins simultaneously. Therefore, there is a demand for user-friendly software tools that will allow mining of these enormous datasets. PPD is a WWW-based database for comparative analysis of protein lengths in completely sequenced prokaryotic and eukaryotic genomes. PPD's core objective is to create protein classification tables based on the lengths of proteins by specifying a set of organisms and parameters. The interface can also generate information on changes in proteins of specific length distributions. This feature is of importance when the user's interest is focused on some evolutionarily related organisms or on organisms with similar or related tissue specificity or life-style. PPD is available at: PPD Home.
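
    A length-based classification table of the kind described above could look roughly like the following sketch. It is not PPD's implementation: the fixed-width bins, the function names and the organism data are all assumptions made for illustration.

```python
from collections import Counter


def length_class(length: int, bin_width: int = 100) -> str:
    """Return a label such as '100-199' for the bin that contains `length`."""
    start = (length // bin_width) * bin_width
    return f"{start}-{start + bin_width - 1}"


def classification_table(proteomes: dict, bin_width: int = 100) -> dict:
    """Map each organism to a Counter of protein counts per length bin."""
    return {
        organism: Counter(length_class(n, bin_width) for n in lengths)
        for organism, lengths in proteomes.items()
    }


# Hypothetical protein lengths (in residues) for two placeholder organisms.
proteomes = {
    "organism_1": [85, 120, 150, 310],
    "organism_2": [95, 99, 205],
}
table = classification_table(proteomes)
```

    Comparing the per-bin counts across organisms is one simple way to look at how length distributions differ between related genomes.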

  19. View generated database

    NASA Technical Reports Server (NTRS)

    Downward, James G.

    1992-01-01

    This document represents the final report for the View Generated Database (VGD) project, NAS7-1066. It documents the work done on the project up to the point at which all project work was terminated due to lack of project funds. The VGD was to provide the capability to accurately represent any real-world object or scene as a computer model. Such models include both an accurate spatial/geometric representation of surfaces of the object or scene, as well as any surface detail present on the object. Applications of such models are numerous, including acquisition and maintenance of work models for tele-autonomous systems, generation of accurate 3-D geometric/photometric models for various 3-D vision systems, and graphical models for realistic rendering of 3-D scenes via computer graphics.

  20. Ribosomal Database Project II

    DOE Data Explorer

    The Ribosomal Database Project (RDP) provides ribosome related data and services to the scientific community, including online data analysis and aligned and annotated Bacterial small-subunit 16S rRNA sequences. As of March 2008, RDP Release 10 is available and currently (August 2009) contains 1,074,075 aligned 16S rRNA sequences. Data that can be downloaded include zipped GenBank and FASTA alignment files, a histogram (in Excel) of the number of RDP sequences spanning each base position, data in the Functional Gene Pipeline Repository, and various user submitted data. The RDP-II website also provides numerous analysis tools. [From the RDP-II home page at http://rdp.cme.msu.edu/index.jsp]

  1. Danish Palliative Care Database

    PubMed Central

    Groenvold, Mogens; Adsersen, Mathilde; Hansen, Maiken Bang

    2016-01-01

    Aims The aim of the Danish Palliative Care Database (DPD) is to monitor, evaluate, and improve the clinical quality of specialized palliative care (SPC) (ie, the activity of hospital-based palliative care teams/departments and hospices) in Denmark. Study population The study population is all patients in Denmark referred to and/or in contact with SPC after January 1, 2010. Main variables The main variables in DPD are data about referral for patients admitted and not admitted to SPC, type of the first SPC contact, clinical and sociodemographic factors, multidisciplinary conference, and the patient-reported European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire-Core-15-Palliative Care questionnaire, assessing health-related quality of life. The data currently support the estimation of five quality of care indicators, ie, the proportions of 1) referred and eligible patients who were actually admitted to SPC, 2) patients who waited <10 days before admission to SPC, 3) patients who died from cancer and who obtained contact with SPC, 4) patients who were screened with European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire-Core-15-Palliative Care at admission to SPC, and 5) patients who were discussed at a multidisciplinary conference. Descriptive data In 2014, all 43 SPC units in Denmark reported their data to DPD, and all 9,434 cancer patients (100%) referred to SPC were registered in DPD. In total, 41,104 unique cancer patients were registered in DPD during the 5 years 2010–2014. Of those registered, 96% had cancer. Conclusion DPD is a national clinical quality database for SPC having clinically relevant variables and high data and patient completeness. PMID:27822111

  2. MetaBase—the wiki-database of biological databases

    PubMed Central

    Bolser, Dan M.; Chibon, Pierre-Yves; Palopoli, Nicolas; Gong, Sungsam; Jacob, Daniel; Angel, Victoria Dominguez Del; Swan, Dan; Bassi, Sebastian; González, Virginia; Suravajhala, Prashanth; Hwang, Seungwoo; Romano, Paolo; Edwards, Rob; Bishop, Bryan; Eargle, John; Shtatland, Timur; Provart, Nicholas J.; Clements, Dave; Renfro, Daniel P.; Bhak, Daeui; Bhak, Jong

    2012-01-01

    Biology is generating more data than ever. As a result, there is an ever increasing number of publicly available databases that analyse, integrate and summarize the available data, providing an invaluable resource for the biological community. As this trend continues, there is a pressing need to organize, catalogue and rate these resources, so that the information they contain can be most effectively exploited. MetaBase (MB) (http://MetaDatabase.Org) is a community-curated database containing more than 2000 commonly used biological databases. Each entry is structured using templates and can carry various user comments and annotations. Entries can be searched, listed, browsed or queried. The database was created using the same MediaWiki technology that powers Wikipedia, allowing users to contribute on many different levels. The initial release of MB was derived from the content of the 2007 Nucleic Acids Research (NAR) Database Issue. Since then, approximately 100 databases have been manually collected from the literature, and users have added information for over 240 databases. MB is synchronized annually with the static Molecular Biology Database Collection provided by NAR. To date, there have been 19 significant contributors to the project; each one is listed as an author here to highlight the community aspect of the project. PMID:22139927

  3. A Powerful Procedure for Pathway-Based Meta-analysis Using Summary Statistics Identifies 43 Pathways Associated with Type II Diabetes in European Populations

    PubMed Central

    Zhang, Han; Wheeler, William; Hyland, Paula L.; Yang, Yifan; Shi, Jianxin; Chatterjee, Nilanjan; Yu, Kai

    2016-01-01

    Meta-analysis of multiple genome-wide association studies (GWAS) has become an effective approach for detecting single nucleotide polymorphism (SNP) associations with complex traits. However, it is difficult to integrate the readily accessible SNP-level summary statistics from a meta-analysis into more powerful multi-marker testing procedures, which generally require individual-level genetic data. We developed a general procedure called Summary based Adaptive Rank Truncated Product (sARTP) for conducting gene and pathway meta-analysis that uses only SNP-level summary statistics in combination with genotype correlation estimated from a panel of individual-level genetic data. We demonstrated the validity and power advantage of sARTP through empirical and simulated data. We conducted a comprehensive pathway-based meta-analysis with sARTP on type 2 diabetes (T2D) by integrating SNP-level summary statistics from two large studies consisting of 19,809 T2D cases and 111,181 controls with European ancestry. Among 4,713 candidate pathways from which genes in neighborhoods of 170 GWAS-established T2D loci were excluded, we detected 43 T2D globally significant pathways (with Bonferroni corrected p-values < 0.05), which included the insulin signaling pathway and T2D pathway defined by KEGG, as well as the pathways defined according to specific gene expression patterns on pancreatic adenocarcinoma, hepatocellular carcinoma, and bladder carcinoma. Using summary data from 8 eastern Asian T2D GWAS with 6,952 cases and 11,865 controls, we showed that 7 out of the 43 pathways identified in European populations remained significant in eastern Asians at the false discovery rate of 0.1. We created an R package and a web-based tool for sARTP with the capability to analyze pathways with thousands of genes and tens of thousands of SNPs. PMID:27362418
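
    Two ingredients named in the abstract above can be sketched in a few lines: a rank truncated product statistic, which combines only the K smallest p-values, and a Bonferroni threshold. This is an illustration under stated assumptions, not the sARTP procedure itself: the adaptive choice of K, the use of genotype correlation and the resampling step are all omitted, the function names are hypothetical, and it is assumed that the Bonferroni correction was taken over the 4,713 candidate pathways.

```python
import math


def rank_truncated_product(p_values: list, k: int) -> float:
    """Return -sum(log p) over the k smallest p-values (larger means stronger signal)."""
    if not 1 <= k <= len(p_values):
        raise ValueError("k must be between 1 and the number of p-values")
    smallest = sorted(p_values)[:k]
    return -sum(math.log(p) for p in smallest)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that controls the family-wise error rate at alpha."""
    return alpha / n_tests


# Hypothetical SNP-level p-values for one gene, for illustration only.
snp_p_values = [0.5, 0.01, 0.1]
statistic = rank_truncated_product(snp_p_values, k=2)  # uses 0.01 and 0.1

# Raw p-value a pathway would need if the correction spans 4,713 pathways (assumption).
pathway_threshold = bonferroni_threshold(0.05, 4713)
```

    In an adaptive variant, the statistic would be evaluated at several values of K and the best one calibrated by resampling, which is the part this sketch leaves out.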

  4. OCDB: a database collecting genes, miRNAs and drugs for obsessive-compulsive disorder

    PubMed Central

    Privitera, Anna P.; Distefano, Rosario; Wefer, Hugo A.; Ferro, Alfredo; Pulvirenti, Alfredo; Giugno, Rosalba

    2015-01-01

    Obsessive-compulsive disorder (OCD) is a psychiatric condition characterized by intrusive and unwanted thoughts (obsessions) giving rise to anxiety. The patients feel obliged to perform a behavior (compulsions) induced by the obsessions. The World Health Organization ranks OCD as one of the 10 most disabling medical conditions. In the class of Anxiety Disorders, OCD is a pathology that shows a hereditary component. Consequently, an online resource collecting and integrating scientific discoveries and genetic evidence about OCD would be helpful to improve the current knowledge on this disorder. We have developed a manually curated database, OCD Database (OCDB), collecting the relations between candidate genes in OCD, microRNAs (miRNAs) involved in the pathophysiology of OCD and drugs used in its treatments. We have screened articles from PubMed and MEDLINE. For each gene, the bibliographic references with a brief description of the gene and the experimental conditions are shown. The database also lists the polymorphisms within genes and their chromosomal regions. OCDB data is enriched with both validated and predicted miRNA-target and drug-target information. Transcription factor regulations, which are also included, are taken from DAVID and TransmiR. Moreover, a scoring function ranks the relevance of data in the OCDB context. The database is also integrated with the main online resources (PubMed, Entrez-gene, HGNC, dbSNP, DrugBank, miRBase, PubChem, KEGG, Disease-ontology and ChEBI). The web interface has been developed using phpMyAdmin and Bootstrap software. This allows users (i) to browse data by category and (ii) to navigate the database by searching genes, miRNAs, drugs, SNPs, regions, drug targets and articles. The data can be exported in textual format, and the whole database in .sql or tabular format. OCDB is an essential resource to support genome-wide analysis, genetic and pharmacological studies. It also facilitates the evaluation of genetic data

  5. kpath: integration of metabolic pathway linked data

    PubMed Central

    Navas-Delgado, Ismael; García-Godoy, María Jesús; López-Camacho, Esteban; Rybinski, Maciej; Reyes-Palomares, Armando; Medina, Miguel Ángel; Aldana-Montes, José F.

    2015-01-01

    In the last few years, the Life Sciences domain has experienced rapid growth in the number of available biological databases. The heterogeneity of these databases makes data integration a challenging issue. Some integration challenges are locating resources, relationships, data formats, synonyms or ambiguity. The Linked Data approach partially solves the heterogeneity problems by introducing a uniform data representation model. Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. This article introduces kpath, a database that integrates information related to metabolic pathways. kpath also provides a navigational interface that enables not only the browsing, but also the deep use of the integrated data to build metabolic networks based on existing dispersed knowledge. This user interface has been used to showcase relationships that can be inferred from the information available in several public databases. Database URL: The public Linked Data repository can be queried at http://sparql.kpath.khaos.uma.es using the graph URI “www.khaos.uma.es/metabolic-pathways-app”. The GUI providing navigational access to kpath database is available at http://browser.kpath.khaos.uma.es. PMID:26055101
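
    As a rough illustration of how a Linked Data repository like the one above is typically queried, the sketch below builds a SPARQL Protocol GET URL from the endpoint and graph URI quoted in the abstract. It only constructs the URL and sends nothing. Whether the endpoint is still online, and whether it expects the graph through the standard `default-graph-uri` parameter, are assumptions; the generic triple query is a placeholder that does not reflect kpath's actual schema.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://sparql.kpath.khaos.uma.es"          # quoted in the abstract
GRAPH_URI = "www.khaos.uma.es/metabolic-pathways-app"  # quoted in the abstract

# Placeholder query: fetch any ten triples. The real vocabulary is not given in the abstract.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_query_url(sparql: str, endpoint: str = ENDPOINT, graph_uri: str = GRAPH_URI) -> str:
    """Build a SPARQL Protocol GET URL with URL-encoded query and graph parameters."""
    params = {"query": sparql, "default-graph-uri": graph_uri}
    return f"{endpoint}?{urlencode(params)}"


url = build_query_url(QUERY)
decoded = parse_qs(urlsplit(url).query)  # round-trip check of the encoding
```

    The resulting URL could then be fetched with any HTTP client, usually with an `Accept` header requesting a SPARQL results format such as JSON.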

  6. Enhanced Worldwide Ocean Optics Database

    DTIC Science & Technology

    2008-01-01

    Ryukyu Ridge 9 salinity, temperature, c532, "K" & bb from aBeta, Kd488, and Chl_a profiles Sept 1987 NORDA Sargasso Sea 13 K490 & Temperature...Optics Database (WOOD)1. The database shall be easy to use, Internet accessible, and frequently updated with data from recent at-sea measurements...The database shall be capable of supporting a wide range of applications, such as environmental assessments, sea test planning, and Navy applications

  7. Inorganic Crystal Structure Database (ICSD)

    National Institute of Standards and Technology Data Gateway

    SRD 84 FIZ/NIST Inorganic Crystal Structure Database (ICSD) (PC database for purchase) The Inorganic Crystal Structure Database (ICSD) is produced cooperatively by the Fachinformationszentrum Karlsruhe (FIZ) and the National Institute of Standards and Technology (NIST). The ICSD is a comprehensive collection of crystal structure data of inorganic compounds containing more than 140,000 entries and covering the literature from 1915 to the present.

  8. CycADS: an annotation database system to ease the development and update of BioCyc databases

    PubMed Central

    Vellozo, Augusto F.; Véron, Amélie S.; Baa-Puyoulet, Patrice; Huerta-Cepas, Jaime; Cottret, Ludovic; Febvay, Gérard; Calevro, Federica; Rahbé, Yvan; Douglas, Angela E.; Gabaldón, Toni; Sagot, Marie-France; Charles, Hubert; Colella, Stefano

    2011-01-01

    In recent years, genomes from an increasing number of organisms have been sequenced, but their annotation remains a time-consuming process. The BioCyc databases offer a framework for the integrated analysis of metabolic networks. The Pathway Tools software suite allows the automated construction of a database starting from an annotated genome, but it requires prior integration of all annotations into a specific summary file or into a GenBank file. To allow the easy creation and update of a BioCyc database starting from the multiple genome annotation resources available over time, we have developed an ad hoc data management system that we called Cyc Annotation Database System (CycADS). CycADS is centred on a specific database model and on a set of Java programs to import, filter and export relevant information. Data from GenBank and other annotation sources (including, for example, KAAS, PRIAM, Blast2GO and PhylomeDB) are collected into a database to be subsequently filtered and extracted to generate a complete annotation file. This file is then used to build an enriched BioCyc database using the PathoLogic program of Pathway Tools. The CycADS pipeline for annotation management was used to build the AcypiCyc database for the pea aphid (Acyrthosiphon pisum) whose genome was recently sequenced. The AcypiCyc database webpage also includes, for comparative analyses, two other metabolic reconstruction BioCyc databases generated using CycADS: TricaCyc for Tribolium castaneum and DromeCyc for Drosophila melanogaster. Owing to its flexible design, CycADS offers a powerful software tool for the generation and regular updating of enriched BioCyc databases. The CycADS system is particularly suited for metabolic gene annotation and network reconstruction in newly sequenced genomes. Because of the uniform annotation used for metabolic network reconstruction, CycADS is particularly useful for comparative analysis of the metabolism of different organisms. Database URL: http

  9. CycADS: an annotation database system to ease the development and update of BioCyc databases.

    PubMed

    Vellozo, Augusto F; Véron, Amélie S; Baa-Puyoulet, Patrice; Huerta-Cepas, Jaime; Cottret, Ludovic; Febvay, Gérard; Calevro, Federica; Rahbé, Yvan; Douglas, Angela E; Gabaldón, Toni; Sagot, Marie-France; Charles, Hubert; Colella, Stefano

    2011-01-01

    In recent years, genomes from an increasing number of organisms have been sequenced, but their annotation remains a time-consuming process. The BioCyc databases offer a framework for the integrated analysis of metabolic networks. The Pathway Tools software suite allows the automated construction of a database starting from an annotated genome, but it requires prior integration of all annotations into a specific summary file or into a GenBank file. To allow the easy creation and update of a BioCyc database starting from the multiple genome annotation resources available over time, we have developed an ad hoc data management system that we called Cyc Annotation Database System (CycADS). CycADS is centred on a specific database model and on a set of Java programs to import, filter and export relevant information. Data from GenBank and other annotation sources (including, for example, KAAS, PRIAM, Blast2GO and PhylomeDB) are collected into a database to be subsequently filtered and extracted to generate a complete annotation file. This file is then used to build an enriched BioCyc database using the PathoLogic program of Pathway Tools. The CycADS pipeline for annotation management was used to build the AcypiCyc database for the pea aphid (Acyrthosiphon pisum) whose genome was recently sequenced. The AcypiCyc database webpage also includes, for comparative analyses, two other metabolic reconstruction BioCyc databases generated using CycADS: TricaCyc for Tribolium castaneum and DromeCyc for Drosophila melanogaster. Owing to its flexible design, CycADS offers a powerful software tool for the generation and regular updating of enriched BioCyc databases. The CycADS system is particularly suited for metabolic gene annotation and network reconstruction in newly sequenced genomes. Because of the uniform annotation used for metabolic network reconstruction, CycADS is particularly useful for comparative analysis of the metabolism of different organisms. Database URL: http://www.cycadsys.org.

  10. Relativistic quantum private database queries

    NASA Astrophysics Data System (ADS)

    Sun, Si-Jia; Yang, Yu-Guang; Zhang, Ming-Ou

    2015-04-01

    Recently, Jakobi et al. (Phys Rev A 83, 022301, 2011) suggested the first practical private database query protocol (J-protocol) based on the Scarani et al. (Phys Rev Lett 92, 057901, 2004) quantum key distribution protocol. Unfortunately, the J-protocol is just a cheat-sensitive private database query protocol. In this paper, we present an idealized relativistic quantum private database query protocol based on Minkowski causality and the properties of quantum information. Also, we prove that the protocol is secure in terms of the user security and the database security.

  11. Biological Databases for Behavioral Neurobiology

    PubMed Central

    Baker, Erich J.

    2014-01-01

    Databases are, at their core, abstractions of data and their intentionally derived relationships. They serve as a central organizing metaphor and repository, supporting or augmenting nearly all bioinformatics. Behavioral domains provide a unique stage for contemporary databases, as research in this area spans diverse data types, locations, and data relationships. This chapter provides foundational information on the diversity and prevalence of databases, and on how data structures support the various needs of behavioral neuroscience analysis and interpretation. The focus is on the classes of databases, data curation, and advanced applications in bioinformatics using examples largely drawn from research efforts in behavioral neuroscience. PMID:23195119

  12. A Database for Propagation Models

    NASA Technical Reports Server (NTRS)

    Kantak, Anil V.; Rucker, James

    1997-01-01

    The Propagation Models Database is designed to allow scientists and experimenters in the propagation field to process their data through many known and accepted propagation models. The database is Excel 5.0-based software that houses user-callable models of propagation phenomena. It does not contain a database of propagation data generated by experiments. The database not only provides a powerful software tool to process the data generated by the experiments, but is also a time- and energy-saving tool for plotting results, generating tables and producing impressive and crisp hard copy for presentation and filing.

  13. Clerkship pathway

    PubMed Central

    MacLellan, Anne-Marie; Brailovsky, Carlos; Miller, François; Leboeuf, Sylvie

    2012-01-01

    Objective: To identify factors that help predict success for international medical graduates (IMGs) who train in Canadian residency programs and pass the Canadian certification examinations. Design: A retrospective analysis of 58 variables in the files of IMGs who applied to the Collège des médecins du Québec between 2000 and 2008. Setting: Quebec. Participants: Eight hundred ten IMGs who applied to the Collège des médecins du Québec through either the “equivalency pathway” (ie, starting training at a residency level) or the “clerkship pathway” (ie, relearning at the level of a medical student in the last 2 years of the MD diploma). Main outcome measures: Success factors in achieving certification. Data were analyzed using descriptive statistics and ANOVA (analysis of variance). Results: International medical graduates who chose the “clerkship pathway” had greater success on certification examinations than those who started at the residency level did. Conclusion: There are several factors that influence IMGs’ success on certification examinations, including integration issues, the acquisition of clinical decision-making skills, and the varied educational backgrounds. These factors perhaps can be better addressed by a regular clerkship pathway, in which IMGs benefit from learner-centred teaching and have more time for reflection on and understanding of the North American approach to medical education. The clerkship pathway is a useful strategy for assuring the integration of IMGs in the North American health care system. A 2-year relearning period in medical school at a clinical clerkship level deserves careful consideration. PMID:22859630

  14. Pathway Cross-Talk Analysis in Detecting Significant Pathways in Barrett’s Esophagus Patients

    PubMed Central

    Xu, Zhengyuan; Yan, Yan; He, Jian; Shan, Xinfang; Wu, Weiguo

    2017-01-01

    Background: The pathological mechanism of Barrett’s esophagus (BE) is still unclear. In the present study, pathway cross-talks were analyzed to identify hub pathways for BE, with the purpose of finding an efficient and cost-effective detection method to discover BE at its early stage and take steps to prevent its progression. Material/Methods: We collected and preprocessed gene expression profile data, original pathway data, and protein-protein interaction (PPI) data. Then, we constructed a background pathway cross-talk network (BPCN) based on the original pathway data and PPI data, and a disease pathway cross-talk network (DPCN) based on the differential pathways between the PPI data and the BE and normal control. Finally, a comprehensive analysis was conducted on these 2 networks to identify hub pathway cross-talks for BE, so as to better understand the pathological mechanism of BE from the pathway level. Results: A total of 12 411 genes, 300 pathways (6919 genes), and 787 896 PPI interactions (16 730 genes) were separately obtained from their own databases. Then, we constructed a BPCN with 300 nodes (42 293 interactions) and a DPCN with 296 nodes (15 073 interactions). We identified 4 hub pathways: AMP signaling pathway, cGMP-PKG signaling pathway, natural killer cell-mediated cytotoxicity, and osteoclast differentiation. We found that these pathways might play important roles during the occurrence and development of BE. Conclusions: We predicted that these pathways (such as AMP signaling pathway and cAMP signaling pathway) could be used as potential biomarkers for early diagnosis and therapy of BE. PMID:28263955

  15. Pathway, in silico and tissue-specific expression quantitative analyses of oesophageal squamous cell carcinoma genome-wide association studies data

    PubMed Central

    Hyland, Paula L; Zhang, Han; Yang, Qi; Yang, Howard H; Hu, Nan; Lin, Shih-Wen; Su, Hua; Wang, Lemin; Wang, Chaoyu; Ding, Ti; Fan, Jin-Hu; Qiao, You-Lin; Sung, Hyuna; Wheeler, William; Giffen, Carol; Burdett, Laurie; Wang, Zhaoming; Lee, Maxwell P; Chanock, Stephen J; Dawsey, Sanford M; Freedman, Neal D; Abnet, Christian C; Goldstein, Alisa M; Yu, Kai; Taylor, Philip R

    2016-01-01

    Background: Oesophageal cancer is the fourth leading cause of cancer death in China where essentially all cases are histologically oesophageal squamous cell carcinoma (ESCC). Agnostic pathway-based analyses of genome-wide association study (GWAS) data combined with tissue-specific expression quantitative trait loci (eQTL) analysis and publicly available functional data can identify biological pathways and/or genes enriched with functionally-relevant disease-associated variants. Method: We used the adaptive multilocus joint test to analyse 1827 pathways containing 6060 genes using GWAS data from 1942 ESCC cases and 2111 controls with Chinese ancestry. We examined the function of risk alleles using in silico and eQTL analyses in oesophageal tissues. Results: Associations with ESCC risk were observed for 36 pathways predominantly involved in apoptosis, cell cycle regulation and DNA repair and containing known GWAS-associated genes. After excluding genes with previous GWAS signals, candidate pathways (and genes) for ESCC risk included taste transduction (KEGG_hsa04742; TAS2R13, TAS2R42, TAS2R14, TAS2R46, TAS2R50), long-patch base excision repair (Reactome_pid; POLD2) and the metabolics pathway (KEGG_hsa01100; MTAP, GAPDH, DCTD, POLD2, AMDHD1). We identified and validated CASP8 rs13016963 and IDH2 rs11630814 as eQTLs, and CASP8 rs3769823 and IDH2 rs4561444 as the potential functional variants in high-linkage disequilibrium with these single nucleotide polymorphisms (SNPs), respectively. Further, IDH2 mRNA levels were down-regulated in ESCC (tumour:normal-fold change = 0.69, P = 6.75E-14). Conclusion: Agnostic pathway-based analyses and integration of multiple types of functional data provide new evidence for the contribution of genes in taste transduction and metabolism to ESCC susceptibility, and for the functionality of both established and new ESCC risk-related SNPs. PMID:26635288

  16. Public chemical compound databases.

    PubMed

    Williams, Anthony J

    2008-05-01

    The internet has rapidly become the first port of call for all information searches. The increasing array of chemistry-related resources that are now available provides chemists with a direct path to the information that was previously accessed via library services and was limited by commercial and costly resources. The diversity of the information that can be accessed online is expanding at a dramatic rate, and the support for publicly available resources offers significant opportunities in terms of the benefits to science and society. While the data online do not generally meet the quality standards of manually curated sources, there are efforts underway to gather scientists together and 'crowdsource' an improvement in the quality of the available data. This review discusses the types of public compound databases that are available online and provides a series of examples. Focus is also given to the benefits and disruptions associated with the increased availability of such data and the integration of technologies to data mine this information.

  17. iTRAQ-Based Quantitative Proteomic Analysis of Cotton Roots and Leaves Reveals Pathways Associated with Salt Stress.

    PubMed

    Chen, Tingting; Zhang, Lei; Shang, Haihong; Liu, Shaodong; Peng, Jun; Gong, Wankui; Shi, Yuzhen; Zhang, Siping; Li, Junwen; Gong, Juwu; Ge, Qun; Liu, Aiying; Ma, Huijuan; Zhao, Xinhua; Yuan, Youlu

    2016-01-01

    Salinity is a major abiotic stress that affects plant growth and development. In this study, we performed a proteomic analysis of cotton roots and leaf tissue following exposure to saline stress. 611 and 1477 proteins were differentially expressed in the roots and leaves, respectively. In the roots, 259 (42%) proteins were up-regulated and 352 (58%) were down-regulated. In the leaves, 748 (51%) proteins were up-regulated and 729 (49%) were down-regulated. On the basis of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, we concluded that the phenylalanine metabolism and starch and sucrose metabolism were active for energy homeostasis to cope with salt stress in cotton roots. Moreover, photosynthesis, pyruvate metabolism, glycolysis / gluconeogenesis, carbon fixation in photosynthetic organisms and phenylalanine metabolism were inhibited to reduce energy consumption. Characterization of the signaling pathways will help elucidate the mechanism activated by cotton in response to salt stress.

  18. kpath: integration of metabolic pathway linked data.

    PubMed

    Navas-Delgado, Ismael; García-Godoy, María Jesús; López-Camacho, Esteban; Rybinski, Maciej; Reyes-Palomares, Armando; Medina, Miguel Ángel; Aldana-Montes, José F

    2015-01-01

    In the last few years, the Life Sciences domain has experienced a rapid growth in the amount of available biological databases. The heterogeneity of these databases makes data integration a challenging issue. Some integration challenges are locating resources, relationships, data formats, synonyms or ambiguity. The Linked Data approach partially solves the heterogeneity problems by introducing a uniform data representation model. Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. This article introduces kpath, a database that integrates information related to metabolic pathways. kpath also provides a navigational interface that enables not only the browsing, but also the deep use of the integrated data to build metabolic networks based on existing disperse knowledge. This user interface has been used to showcase relationships that can be inferred from the information available in several public databases.

  19. pathDIP: an annotated resource for known and predicted human gene-pathway associations and pathway enrichment analysis

    PubMed Central

    Rahmati, Sara; Abovsky, Mark; Pastrello, Chiara; Jurisica, Igor

    2017-01-01

    Molecular pathway data are essential in current computational and systems biology research. While there are many primary and integrated pathway databases, several challenges remain, including low proteome coverage (57%), low overlap across different databases, unavailability of direct information about underlying physical connectivity of pathway members, and high fraction of protein-coding genes without any pathway annotations, i.e. ‘pathway orphans’. In order to address all these challenges, we developed pathDIP, which integrates data from 20 source pathway databases, ‘core pathways’, with physical protein–protein interactions to predict biologically relevant protein–pathway associations, referred to as ‘extended pathways’. Cross-validation determined 71% recovery rate of our predictions. Data integration and predictions increase coverage of pathway annotations for protein-coding genes to 86%, and provide novel annotations for 5732 pathway orphans. PathDIP (http://ophid.utoronto.ca/pathdip) annotates 17 070 protein-coding genes with 4678 pathways, and provides multiple query, analysis and output options. PMID:27899558

  20. PANDA: pathway and annotation explorer for visualizing and interpreting gene-centric data.

    PubMed

    Hart, Steven N; Moore, Raymond M; Zimmermann, Michael T; Oliver, Gavin R; Egan, Jan B; Bryce, Alan H; Kocher, Jean-Pierre A

    2015-01-01

    Objective. Bringing together genomics, transcriptomics, proteomics, and other -omics technologies is an important step towards developing highly personalized medicine. However, instrumentation has advanced far beyond expectations and now we are able to generate data faster than it can be interpreted. Materials and Methods. We have developed PANDA (Pathway AND Annotation) Explorer, a visualization tool that integrates gene-level annotation in the context of biological pathways to help interpret complex data from disparate sources. PANDA is a web-based application that displays data in the context of well-studied pathways like KEGG, BioCarta, and PharmGKB. PANDA represents data/annotations as icons in the graph while maintaining the other data elements (i.e., other columns for the table of annotations). Custom pathways from underrepresented diseases can be imported when existing data sources are inadequate. PANDA also allows sharing annotations among collaborators. Results. In our first use case, we show how easy it is to view supplemental data from a manuscript in the context of a user's own data. Another use case is provided describing how PANDA was leveraged to design a treatment strategy from the somatic variants found in the tumor of a patient with metastatic sarcomatoid renal cell carcinoma. Conclusion. PANDA facilitates the interpretation of gene-centric annotations by visually integrating this information with the context of biological pathways. The application can be downloaded or used directly from our website: http://bioinformaticstools.mayo.edu/research/panda-viewer/.

  1. Content Independence in Multimedia Databases.

    ERIC Educational Resources Information Center

    de Vries, Arjen P.

    2001-01-01

    Investigates the role of data management in multimedia digital libraries, and its implications for the design of database management systems. Introduces the notions of content abstraction and content independence. Proposes a blueprint of a new class of database technology, which supports the basic functionality for the management of both content…

  2. Hanford Site technical baseline database

    SciTech Connect

    Porter, P.E., Westinghouse Hanford

    1996-05-10

    This document includes a cassette tape that contains the Hanford specific files that make up the Hanford Site Technical Baseline Database as of May 10, 1996. The cassette tape also includes the delta files that delineate the differences between this revision and revision 3 (April 10, 1996) of the Hanford Site Technical Baseline Database.

  3. XCOM: Photon Cross Sections Database

    National Institute of Standards and Technology Data Gateway

    SRD 8 XCOM: Photon Cross Sections Database (Web, free access)   A web database is provided which can be used to calculate photon cross sections for scattering, photoelectric absorption and pair production, as well as total attenuation coefficients, for any element, compound or mixture (Z <= 100) at energies from 1 keV to 100 GeV.

  4. Atomic Spectroscopic Databases at NIST

    NASA Technical Reports Server (NTRS)

    Reader, J.; Kramida, A. E.; Ralchenko, Yu.

    2006-01-01

    We describe recent work at NIST to develop and maintain databases for spectra, transition probabilities, and energy levels of atoms that are astrophysically important. Our programs to critically compile these data as well as to develop a new database to compare plasma calculations for atoms that are not in local thermodynamic equilibrium are also summarized.

  5. The Student-Designed Database.

    ERIC Educational Resources Information Center

    Thomas, Rick

    1988-01-01

    This discussion of the design of data files for databases to be created by secondary school students uses AppleWorks software as an example. Steps needed to create and use a database are explained, the benefits of group activity are described, and other possible projects are listed. (LRW)

  6. Data manipulation in heterogeneous databases

    SciTech Connect

    Chatterjee, A.; Segev, A.

    1991-10-01

    Many important information systems applications require access to data stored in multiple heterogeneous databases. This paper examines a problem in inter-database data manipulation within a heterogeneous environment, where conventional techniques are no longer useful. To solve the problem, a broader definition of the join operator is proposed. Also, a method to probabilistically estimate the accuracy of the join is discussed.

  7. Database Licensing: A Future View.

    ERIC Educational Resources Information Center

    Flanagan, Michael

    1993-01-01

    Access to database information in libraries will increase as licenses for tape loading of data onto public access catalogs become more widespread. Institutions with adequate storage capacity will have full text databases, and the adoption of the Z39.50 standard, which allows differing computer systems to interface with each other, will increase…

  8. Wind turbine reliability database update.

    SciTech Connect

    Peters, Valerie A.; Hill, Roger Ray; Stinebaugh, Jennifer A.; Veers, Paul S.

    2009-03-01

    This report documents the status of the Sandia National Laboratories' Wind Plant Reliability Database. Included in this report are updates on the form and contents of the Database, which stems from a five-step process of data partnerships, data definition and transfer, data formatting and normalization, analysis, and reporting. Selected observations are also reported.

  9. The EUVE satellite survey database

    NASA Technical Reports Server (NTRS)

    Craig, N.; Chen, T.; Hawkins, I.; Fruscione, A.

    1993-01-01

    The EUVE survey database contains fundamental science data for 9000 potential source locations (pigeonholes) in the sky. The first release of the Bright Source List is now available to the public through an interface with the NASA Astrophysical Data System. We describe the database schema design and the EUVE source categorization algorithm that compares sources to the ROSAT Wide Field Camera source list.

  10. Mathematical Notation in Bibliographic Databases.

    ERIC Educational Resources Information Center

    Pasterczyk, Catherine E.

    1990-01-01

    Discusses ways in which using mathematical symbols to search online bibliographic databases in scientific and technical areas can improve search results. The representations used for Greek letters, relations, binary operators, arrows, and miscellaneous special symbols in the MathSci, Inspec, Compendex, and Chemical Abstracts databases are…

  11. The Yield of Bibliographic Databases.

    ERIC Educational Resources Information Center

    Kowalski, Kazimierz; Hackett, Timothy P.

    1992-01-01

    Demonstrates a means for estimating the number of retrieved items using well-established selective dissemination of information (SDI) profiles in the SCI, INSPEC, ISMEC, CAS, and PASCAL databases. A correlation between individual database size and number of retrieved documents in technical fields is also examined. (17 references) (LAE)

  12. GOTTCHA Database, Version 1

    SciTech Connect

    Freitas, Tracey; Chain, Patrick; Lo, Chien-Chi; Li, Po-E

    2015-08-03

    One major challenge in the field of shotgun metagenomics is the accurate identification of the organisms present within the community, based on classification of short sequence reads. Though microbial community profiling methods have emerged to attempt to rapidly classify the millions of reads output from contemporary sequencers, the combination of incomplete databases, similarity among otherwise divergent genomes, and the large volumes of sequencing data required for metagenome sequencing has led to unacceptably high false discovery rates (FDR). Here we present the application of a novel, gene-independent and signature-based metagenomic taxonomic profiling tool with significantly smaller FDR, which is also capable of classifying never-before-seen genomes into the appropriate parent taxa. The algorithm is based upon three primary computational phases: (I) genomic decomposition into bit vectors, (II) bit vector intersections to identify shared regions, and (III) bit vector subtractions to remove shared regions and reveal unique, signature regions. In the Decomposition phase, genomic data is first masked to highlight only the valid (non-ambiguous) regions and then decomposed into overlapping 24-mers. The k-mers are sorted along with their start positions, de-replicated, and then prefixed, to minimize data duplication. The prefixes are indexed and an identical data structure is created for the start positions to mimic that of the k-mer data structure. During the Intersection phase -- which is the most computationally intensive phase -- as an all-vs-all comparison is made, the number of comparisons is first reduced by four methods: (a) Prefix restriction, (b) Overlap detection, (c) Overlap restriction, and (d) Result recording. In Prefix restriction, only k-mers of the same prefix are compared. Within that group, potential overlap of k-mer suffixes that would result in a non-empty set intersection are screened for. If such an overlap exists, the region which intersects is
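
    The three phases named in the GOTTCHA abstract above (decomposition, intersection, subtraction) can be illustrated with a minimal sketch. It is not the GOTTCHA implementation: it assumes plain Python sets in place of bit vectors, omits the sorting, prefixing and overlap-restriction steps, and the function names `kmers` and `unique_signatures` are hypothetical.

```python
def kmers(seq, k=24):
    """Decomposition: return the set of overlapping k-mers made only of A, C, G, T.

    k-mers containing any other character (e.g. N) are dropped, which stands in
    for the masking of ambiguous regions described in the abstract.
    """
    valid = set("ACGT")
    seq = seq.upper()
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    }


def unique_signatures(genomes, k=24):
    """Map each genome name to the k-mers that occur in no other genome."""
    decomposed = {name: kmers(seq, k) for name, seq in genomes.items()}
    signatures = {}
    for name, own in decomposed.items():
        shared = set()
        for other, theirs in decomposed.items():
            if other != name:
                shared |= own & theirs   # Intersection: regions shared with another genome
        signatures[name] = own - shared  # Subtraction: keep only the unique regions
    return signatures


if __name__ == "__main__":
    # Toy sequences and a short k so the result is easy to inspect by hand.
    toy = {"genome_a": "ACGTAC", "genome_b": "CGTACC"}
    print(unique_signatures(toy, k=4))
```

    With the toy input, the only 4-mer unique to genome_a is ACGT and the only one unique to genome_b is TACC; the all-vs-all loop is quadratic in the number of genomes, which is the cost the prefix and overlap restrictions in the abstract are meant to reduce.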

  13. A database for propagation models

    NASA Technical Reports Server (NTRS)

    Kantak, Anil V.; Suwitra, Krisjani S.

    1992-01-01

    In June 1991, a paper at the fifteenth NASA Propagation Experimenters Meeting (NAPEX 15) was presented outlining the development of a database for propagation models. The database is designed to allow the scientists and experimenters in the propagation field to process their data through any known and accepted propagation model. The architecture of the database also incorporates the possibility of changing the standard models in the database to fit the scientist's or the experimenter's needs. The database not only provides powerful software to process the data generated by the experiments, but is also a time- and energy-saving tool for plotting results, generating tables, and producing impressive and crisp hard copy for presentation and filing.

  14. Pathways associated with lignin biosynthesis in lignomaniac jute fibres.

    PubMed

    Chakraborty, Avrajit; Sarkar, Debabrata; Satya, Pratik; Karmakar, Pran Gobinda; Singh, Nagendra Kumar

    2015-08-01

    We generated the bast transcriptomes of a deficient lignified phloem fibre mutant and its wild-type jute (Corchorus capsularis) using Illumina paired-end sequencing. A total of 34,163 wild-type and 29,463 mutant unigenes, with average lengths of 1442 and 1136 bp, respectively, were assembled de novo, ~77-79 % of which were functionally annotated. These annotated unigenes were assigned to COG (~37-40 %) and GO (~22-28 %) classifications and mapped to 189 KEGG pathways (~19-21 %). We discovered 38 and 43 isoforms of 16 and 10 genes of the upstream shikimate-aromatic amino acid and downstream monolignol biosynthetic pathways, respectively, rendered their sequence similarities, confirmed the identities of 22 of these candidate gene families by phylogenetic analyses and reconstructed the pathway leading to lignin biosynthesis in jute fibres. We also identified major genes and bast-related transcription factors involved in secondary cell wall (SCW) formation. The quantitative RT-PCRs revealed that phenylalanine ammonia-lyase 1 (CcPAL1) was co-down-regulated with several genes of the upstream shikimate pathway in mutant bast tissues at an early growth stage, although its expression relapsed to the normal level at the later growth stage. However, cinnamyl alcohol dehydrogenase 7 (CcCAD7) was strongly down-regulated in mutant bast tissues irrespective of growth stages. CcCAD7 disruption at an early growth stage was accompanied by co-up-regulation of SCW-specific genes cellulose synthase A7 (CcCesA7) and fasciclin-like arabinogalactan 6 (CcFLA6), which was predicted to be involved in coordinating the S-layers' deposition in the xylan-type jute fibres. Our results identified CAD as a promising target for developing low-lignin jute fibres using genomics-assisted molecular approaches.

  15. OCDB: a database collecting genes, miRNAs and drugs for obsessive-compulsive disorder.

    PubMed

    Privitera, Anna P; Distefano, Rosario; Wefer, Hugo A; Ferro, Alfredo; Pulvirenti, Alfredo; Giugno, Rosalba

    2015-01-01

    Obsessive-compulsive disorder (OCD) is a psychiatric condition characterized by intrusive and unwilling thoughts (obsessions) giving rise to anxiety. The patients feel obliged to perform a behavior (compulsions) induced by the obsessions. The World Health Organization ranks OCD as one of the 10 most disabling medical conditions. In the class of Anxiety Disorders, OCD is a pathology that shows a hereditary component. Consequently, an online resource collecting and integrating scientific discoveries and genetic evidence about OCD would be helpful to improve the current knowledge on this disorder. We have developed a manually curated database, OCD Database (OCDB), collecting the relations between candidate genes in OCD, microRNAs (miRNAs) involved in the pathophysiology of OCD and drugs used in its treatments. We have screened articles from PubMed and MEDLINE. For each gene, the bibliographic references with a brief description of the gene and the experimental conditions are shown. The database also lists the polymorphisms within genes and their chromosomal regions. OCDB data is enriched with both validated and predicted miRNA-target and drug-target information. The transcription factor regulations, which are also included, are taken from David and TransmiR. Moreover, a scoring function ranks the relevance of data in the OCDB context. The database is also integrated with the main online resources (PubMed, Entrez-gene, HGNC, dbSNP, DrugBank, miRBase, PubChem, KEGG, Disease-ontology and ChEBI). The web interface has been developed using phpMyAdmin and Bootstrap software. This allows users (i) to browse data by category and (ii) to navigate in the database by searching genes, miRNAs, drugs, SNPs, regions, drug targets and articles. The data can be exported in textual format as well as the whole database in .sql or tabular format. OCDB is an essential resource to support genome-wide analysis, genetic and pharmacological studies. It also facilitates the evaluation of genetic data

  16. AlzPathway: a comprehensive map of signaling pathways of Alzheimer’s disease

    PubMed Central

    2012-01-01

    Background: Alzheimer’s disease (AD) is the most common cause of dementia among the elderly. To clarify pathogenesis of AD, thousands of reports have been accumulating. However, knowledge of signaling pathways in the field of AD has not been compiled as a database before. Description: Here, we have constructed a publicly available pathway map called “AlzPathway” that comprehensively catalogs signaling pathways in the field of AD. We have collected and manually curated over 100 review articles related to AD, and have built an AD pathway map using CellDesigner. AlzPathway is currently composed of 1347 molecules and 1070 reactions in neuron, brain blood barrier, presynaptic, postsynaptic, astrocyte, and microglial cells and their cellular localizations. AlzPathway is available as both the SBML (Systems Biology Markup Language) map for CellDesigner and the high resolution image map. AlzPathway is also available as a web service (online map) based on Payao system, a community-based, collaborative web service platform for pathway model curation, enabling continuous updates by AD researchers. Conclusions: AlzPathway is the first comprehensive map of intra, inter and extra cellular AD signaling pathways which can enable mechanistic deciphering of AD pathogenesis. The AlzPathway map is accessible at http://alzpathway.org/. PMID:22647208

  17. FOAM (Functional Ontology Assignments for Metagenomes): A Hidden Markov Model (HMM) database with environmental focus

    DOE PAGES

    Prestat, Emmanuel; David, Maude M.; Hultman, Jenni; ...

    2014-09-26

    A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classify gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. ‘profiles’) were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.

  18. The Berlin Emissivity Database

    NASA Astrophysics Data System (ADS)

    Helbert, Jorn

    Remote sensing infrared spectroscopy is the principal field of investigation for planetary surfaces composition. Past, present and future missions to the solar system bodies include in their payload instruments measuring the emerging radiation in the infrared range. TES on Mars Global Surveyor and THEMIS on Mars Odyssey have in many ways changed our views of Mars. The PFS instrument on the ESA Mars Express mission has collected spectra since the beginning of 2004. In spring 2006 the VIRTIS experiment started its operation on the ESA Venus Express mission, allowing for the first time to map the surface of Venus using the 1 µm emission from the surface. The MERTIS spectrometer is included in the payload of the ESA BepiColombo mission to Mercury, scheduled for 2013. For the interpretation of the measured data an emissivity spectral library of planetary analogue materials is needed. The Berlin Emissivity Database (BED) presented here is focused on relatively fine-grained size separates, providing a realistic basis for interpretation of thermal emission spectra of planetary regoliths. The BED is therefore complementary to existing thermal emission libraries, like the ASU library for example. The BED currently contains entries for plagioclase and potassium feldspars, low Ca and high Ca pyroxenes, olivine, elemental sulphur, common martian analogues (JSC Mars-1, Salten Skov, palagonites, montmorillonite) and a lunar highland soil sample measured in the wavelength range from 3 to 50 µm as a function of particle size. For each sample, the spectra of four well defined particle size separates (<25 µm, 25-63 µm, 63-125 µm, 125-250 µm) are measured with a 4 cm-1 spectral resolution. These size separates have been selected as typical representations for most of the planetary surfaces. Following an ongoing upgrade of the Planetary Emissivity Laboratory (PEL) at DLR in Berlin measurements can be obtained at temperatures up to 500 °C - realistic for the dayside conditions

  19. Database of Properties of Meteors

    NASA Technical Reports Server (NTRS)

    Suggs, Rob; Anthea, Coster

    2006-01-01

    A database of properties of meteors, and software that provides access to the database, are being developed as a contribution to continuing efforts to model the characteristics of meteors with increasing accuracy. Such modeling is necessary for evaluation of the risk of penetration of spacecraft by meteors. For each meteor in the database, the record will include an identification, date and time, radiant properties, ballistic coefficient, radar cross section, size, density, and orbital elements. The property of primary interest in the present case is density, and one of the primary goals in this case is to derive densities of meteors from their atmospheric decelerations. The database and software are expected to be valid anywhere in the solar system. The database will incorporate new data plus results of meteoroid analyses that, heretofore, have not been readily available to the aerospace community. Taken together, the database and software constitute a model that is expected to provide improved estimates of densities and to result in improved risk analyses for interplanetary spacecraft. It is planned to distribute the database and software on a compact disk.

  20. Database tomography for commercial application

    NASA Technical Reports Server (NTRS)

    Kostoff, Ronald N.; Eberhart, Henry J.

    1994-01-01

    Database tomography is a method for extracting themes and their relationships from text. The algorithms employed begin with word frequency and word proximity analysis and build upon these results. When the word 'database' is used, think of medical or police records, patents, journals, or papers, etc. (any text information that can be computer stored). Database tomography features a full-text, user-interactive technique enabling the user to identify areas of interest, establish relationships, and map trends for a deeper understanding of an area of interest. Database tomography concepts and applications have been reported in journals and presented at conferences. One important feature of the database tomography algorithm is that it can be used on a database of any size, and will facilitate the user's ability to understand the volume of content therein. While employing the process to identify research opportunities, it became obvious that this promising technology has potential applications for business, science, engineering, law, and academe. Examples include evaluating marketing trends, strategies, relationships and associations. Also, the database tomography process would be a powerful component in the area of competitive intelligence, national security intelligence and patent analysis. User interests and involvement cannot be overemphasized.
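
    The two starting steps named in this abstract, word frequency and word proximity analysis, can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the tokenizer, and the fixed-size window are assumptions made for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(tokens, stopwords=frozenset()):
    """Count how often each non-stopword token occurs."""
    return Counter(t for t in tokens if t not in stopwords)


def proximity_counts(tokens, theme, window=3, stopwords=frozenset()):
    """Count words found within `window` tokens of each occurrence of `theme`.

    The window size is a hypothetical parameter, not taken from the abstract.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != theme:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] not in stopwords:
                counts[tokens[j]] += 1
    return counts


# Toy text, invented for this example.
sample = "Patent records describe engine design. Engine patents cite engine records."
tokens = tokenize(sample)
freq = word_frequencies(tokens)
near_engine = proximity_counts(tokens, "engine", window=1)
```

    In this sketch a high-frequency word would be picked as a candidate theme, and the words that most often fall inside its window would suggest related themes.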

  1. Unifying Memory and Database Transactions

    NASA Astrophysics Data System (ADS)

    Dias, Ricardo J.; Lourenço, João M.

    Software Transactional Memory is a concurrency control technique gaining increasing popularity, as it provides high-level concurrency control constructs and eases the development of highly multi-threaded applications. But this ease comes at the expense of restricting the operations that can be executed within a memory transaction, and operations such as terminal and file I/O are either not allowed or incur serious performance penalties. Database I/O is another example of operations that usually are not allowed within a memory transaction. This paper proposes to combine memory and database transactions in a single unified model, benefiting from the ACID properties of database transactions and from the speed of main-memory data processing. The new unified model covers, without differentiating, both memory and database operations. Thus, users are allowed to freely intertwine memory and database accesses within the same transaction, knowing that the memory and database contents will always remain consistent and that the transaction will atomically abort or commit the operations in both memory and database. This approach makes it possible to increase the granularity of the in-memory atomic actions and hence simplifies reasoning about them.
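
    The idea of committing or aborting memory and database changes together can be illustrated with a small sketch. It is not the model proposed in the paper and it is not a software transactional memory: it offers no isolation between threads. It only shows the all-or-nothing outcome for a single thread, using Python's standard sqlite3 module. The helper `unified_transaction`, the `accounts` table, and the `cache` dictionary are hypothetical names.

```python
import copy
import sqlite3
from contextlib import contextmanager


@contextmanager
def unified_transaction(conn, memory):
    """Commit or roll back an in-memory dict and a SQLite connection together."""
    snapshot = copy.deepcopy(memory)  # undo log for the in-memory state
    try:
        yield memory
        conn.commit()  # both sides become permanent
    except Exception:
        conn.rollback()  # undo the database writes
        memory.clear()  # undo the in-memory writes
        memory.update(snapshot)
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()
cache = {"alice": 100}  # in-memory copy that must stay consistent with the table


def withdraw(amount):
    """Update memory and database inside one all-or-nothing block."""
    with unified_transaction(conn, cache) as mem:
        mem["alice"] -= amount
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'",
            (amount,),
        )
        if mem["alice"] < 0:
            raise ValueError("insufficient funds")
```

    Calling `withdraw(30)` changes both the dictionary and the table, while a call that raises leaves both as they were before the block started.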

  2. Databases of the marine metagenomics.

    PubMed

    Mineta, Katsuhiko; Gojobori, Takashi

    2016-02-01

    Metagenomic data obtained from marine environments are highly useful for understanding marine microbial communities. In comparison with the conventional amplicon-based approach to metagenomics, the recent shotgun sequencing-based approach has become a powerful tool that provides an efficient way of grasping the diversity of the entire microbial community at a sampling point in the sea. However, this approach accelerates the accumulation of metagenome data as well as the increase in data complexity. Moreover, when the metagenomic approach is used for monitoring changes over time in marine environments at multiple seawater locations, metagenomic data will accumulate at enormous speed. Because this situation is becoming a reality at many marine research institutions and stations all over the world, it seems obvious that data management and analysis will be confronted by so-called Big Data issues, such as how a database can be constructed efficiently and how useful knowledge should be extracted from a vast amount of data. In this review, we outline all the major marine metagenome databases that are currently publicly available, noting that no database is devoted exclusively to marine metagenomes and that the number of metagenome databases including marine metagenome data is six, which is still unexpectedly small. We also extend our explanation to what we call reference databases, which will be useful for constructing a marine metagenome database as well as for complementing it with important information. Finally, we point out a number of challenges to be overcome in constructing a marine metagenome database.

  3. Draft secure medical database standard.

    PubMed

    Pangalos, George

    2002-01-01

    Medical database security is a particularly important issue for all healthcare establishments. Medical information systems are intended to support a wide range of pertinent health issues today, for example: assure the quality of care, support effective management of the health services institutions, monitor and contain the cost of care, implement technology into care without violating social values, ensure the equity and availability of care, preserve humanity despite the proliferation of technology, etc. In this context, medical database security aims primarily to support: high availability, accuracy and consistency of the stored data, the medical professional secrecy and confidentiality, and the protection of the privacy of the patient. These properties, though of a technical nature, basically require that the system is actually helpful for medical care and not harmful to patients. These latter properties in turn require not only that fundamental ethical principles are not violated by employing database systems, but also that they are effectively enforced by technical means. This document reviews the existing and emerging work on the security of medical database systems. It presents in detail the problems and requirements related to medical database security. It addresses the problems of medical database security policies, secure design methodologies and implementation techniques. It also describes the current legal framework and regulatory requirements for medical database security. The issue of medical database security guidelines is also examined in detail. The current national and international efforts in the area are studied. It also gives an overview of the research work in the area. The document also presents in detail what is, to our knowledge, the most complete set of security guidelines for the development and operation of medical database systems.

  4. USGS Dam Removal Science Database

    USGS Publications Warehouse

    Bellmore, J. Ryan; Vittum, Katherine; Duda, Jeff J.; Greene, Samantha L.

    2015-01-01

    This database is the result of an extensive literature search aimed at identifying documents relevant to the emerging field of dam removal science. In total the database contains 179 citations that contain empirical monitoring information associated with 130 different dam removals across the United States and abroad. Data include publications through 2014 and are supplemented with the U.S. Army Corps of Engineers National Inventory of Dams database, the U.S. Geological Survey National Water Information System and aerial photos to estimate locations when coordinates were not provided. Publications were located using the Web of Science, Google Scholar, and the Clearinghouse for Dam Removal Information.

  5. Biological databases for human research.

    PubMed

    Zou, Dong; Ma, Lina; Yu, Jun; Zhang, Zhang

    2015-02-01

    The completion of the Human Genome Project lays a foundation for systematically studying the human genome from evolutionary history to precision medicine against diseases. With the explosive growth of biological data, there is an increasing number of biological databases that have been developed in aid of human-related research. Here we present a collection of human-related biological databases and provide a mini-review by classifying them into different categories according to their data types. As human-related databases continue to grow not only in count but also in volume, challenges are ahead in big data storage, processing, exchange and curation.

  6. Can databasing optimise patient care?

    PubMed

    Trojano, Maria

    2004-09-01

    Long-term, prospective databasing of multiple sclerosis (MS) information provides a useful resource for natural history studies. Furthermore, it is the only way to address the question of whether early treatment eliminates or delays the inevitable and irreversible clinical worsening that is the hallmark of the late phase of the illness. Due to the variable nature of MS, it is useful to monitor large numbers of individuals over time. The limitations of single databases may be overcome by regional, national or international pooling of data. In this paper, the Italian Multiple Sclerosis Database Network (MSDN) and the international web-based MSBase registry are described.

  7. Prototyping a genetics deductive database

    SciTech Connect

    Hearne, C.; Cui, Zhan; Parsons, S.; Hajnal, S.

    1994-12-31

    We are developing a laboratory notebook system known as the Genetics Deductive Database. Currently our prototype provides storage for biological facts and rules with flexible access via an interactive graphical display. We have introduced a formal basis for the representation and reasoning necessary to order genome map data and handle the uncertainty inherent in biological data. We aim to support laboratory activities by introducing an experiment planner into our prototype. The Genetics Deductive Database is built using new database technology which provides an object-oriented conceptual model, a declarative rule language, and a procedural update language. This combination of features allows the implementation of consistency maintenance, automated reasoning, and data verification.

  8. International forensic automotive paint database

    NASA Astrophysics Data System (ADS)

    Bishea, Gregory A.; Buckle, Joe L.; Ryland, Scott G.

    1999-02-01

    The Technical Working Group for Materials Analysis (TWGMAT) is supporting an international forensic automotive paint database. The Federal Bureau of Investigation and the Royal Canadian Mounted Police (RCMP) are collaborating on this effort through TWGMAT. This paper outlines the support and further development of the RCMP's Automotive Paint Database, "Paint Data Query". This cooperative agreement augments and supports a current, validated, searchable, automotive paint database that is used to identify make(s), model(s), and year(s) of questioned paint samples in hit-and-run fatalities and other associated investigations involving automotive paint.

  9. Database of recent tsunami deposits

    USGS Publications Warehouse

    Peters, Robert; Jaffe, Bruce E.

    2010-01-01

    This report describes a database of sedimentary characteristics of tsunami deposits derived from published accounts of tsunami deposit investigations conducted shortly after the occurrence of a tsunami. The database contains 228 entries, each entry containing data from up to 71 categories. It includes data from 51 publications covering 15 tsunamis distributed among 16 countries. The database encompasses a wide range of depositional settings including tropical islands, beaches, coastal plains, river banks, agricultural fields, and urban environments. It includes data from both local tsunamis and teletsunamis. The data are valuable for interpreting prehistorical, historical, and modern tsunami deposits, and for the development of criteria to identify tsunami deposits in the geologic record.

  10. The Automatic Library Tracking Database

    SciTech Connect

    Fahey, Mark R; Jones, Nicholas A; Hadri, Bilel

    2010-01-01

    A library tracking database has been developed and put into production at the National Institute for Computational Sciences and the Oak Ridge Leadership Computing Facility (both located at Oak Ridge National Laboratory). The purpose of the library tracking database is to track which libraries are used at link time on Cray XT5 supercomputers. The database stores the libraries used at link time and also records the executables run in a batch job. With these data, many operationally important questions can be answered, such as which libraries are most frequently used and which users are using deprecated libraries or applications. The infrastructure design and reporting mechanisms are presented along with collected production data.
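
    The two operational questions mentioned in this abstract (which libraries are used most often, and which users link against deprecated libraries) can be phrased as simple queries. The sketch below uses Python's standard sqlite3 module with a made-up schema and made-up rows. The table names, column names, and sample values are assumptions and do not describe the actual system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE link_records (username TEXT, executable TEXT, library TEXT);
    CREATE TABLE deprecated (library TEXT PRIMARY KEY);
    """
)
# Invented sample rows: one row per (user, executable, library) seen at link time.
conn.executemany(
    "INSERT INTO link_records VALUES (?, ?, ?)",
    [
        ("ann", "sim.x", "libsci/10.4"),
        ("bob", "fft.x", "libsci/10.4"),
        ("bob", "fft.x", "fftw/3.2"),
        ("cat", "old.x", "libold/1.0"),
    ],
)
conn.execute("INSERT INTO deprecated VALUES ('libold/1.0')")
conn.commit()


def most_used_libraries(conn, limit=5):
    """Return (library, link count) pairs, most frequently linked first."""
    return conn.execute(
        "SELECT library, COUNT(*) AS n FROM link_records "
        "GROUP BY library ORDER BY n DESC, library LIMIT ?",
        (limit,),
    ).fetchall()


def users_of_deprecated(conn):
    """Return the users who linked against any library marked as deprecated."""
    rows = conn.execute(
        "SELECT DISTINCT username FROM link_records "
        "JOIN deprecated USING (library) ORDER BY username"
    )
    return [row[0] for row in rows]
```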

  11. New geothermal database for Utah

    USGS Publications Warehouse

    Blackett, Robert E.

    1993-01-01

    The Utah Geological Survey compiled a preliminary database consisting of over 800 records on thermal wells and springs in Utah with temperatures of 20 °C or greater. Each record consists of 35 fields, including location of the well or spring, temperature, depth, flow rate, and chemical analyses of water samples. Developed for applications on personal computers, the database will be useful for geochemical, statistical, and other geothermal-related studies. A preliminary map of thermal wells and springs in Utah, which accompanies the database, could eventually incorporate heat-flow information, bottom-hole temperatures from oil and gas wells, traces of Quaternary faults, and locations of young volcanic centers.

  12. Freshwater Biological Traits Database (Final Report)

    EPA Science Inventory

    EPA announced the release of the final report, Freshwater Biological Traits Database. This report discusses the development of a database of freshwater biological traits. The database combines several existing traits databases into an online format. The database is also...

  13. ASDCD: Antifungal Synergistic Drug Combination Database

    PubMed Central

    Chen, Ming; Liu, Ming-Xi; Ren, Wei; Wang, Quan-Xin; Zhang, Li-Xin; Yan, Gui-Ying

    2014-01-01

    Finding effective drugs to treat fungal infections has important clinical significance based on high mortality rates, especially in an immunodeficient population. Traditional antifungal drugs with single targets have been reported to cause serious side effects and drug resistance. Nowadays, however, drug combinations, particularly with respect to synergistic interaction, have attracted the attention of researchers. In fact, synergistic drug combinations could simultaneously affect multiple subpopulations, targets, and diseases. Therefore, a strategy that employs synergistic antifungal drug combinations could eliminate the limitations noted above and offer the opportunity to explore this emerging bioactive chemical space. However, it is first necessary to build a powerful database in order to facilitate the analysis of drug combinations. To address this gap in our knowledge, we have built the first Antifungal Synergistic Drug Combination Database (ASDCD), including previously published synergistic antifungal drug combinations, chemical structures, targets, target-related signaling pathways, indications, and other pertinent data. Its current version includes 210 antifungal synergistic drug combinations and 1225 drug-target interactions, involving 105 individual drugs from more than 12,000 references. ASDCD is freely available at http://ASDCD.amss.ac.cn. PMID:24475134

  14. EADB: An Estrogenic Activity Database for Assessing ...

    EPA Pesticide Factsheets

    Endocrine-active chemicals can potentially have adverse effects on both humans and wildlife. They can interfere with the body’s endocrine system through direct or indirect interactions with many protein targets. Estrogen receptors (ERs) are one of the major targets, and many endocrine disruptors are estrogenic and affect the normal estrogen signaling pathways. However, ERs can also serve as therapeutic targets for various medical conditions, such as menopausal symptoms, osteoporosis, and ER-positive breast cancer. Because of the decades-long interest in the safety and therapeutic utility of estrogenic chemicals, a large number of chemicals have been assayed for estrogenic activity, but these data exist in various sources and different formats that restrict the ability of regulatory and industry scientists to utilize them fully for assessing risk-benefit. To address this issue, we have developed an Estrogenic Activity Database (EADB; http://www.fda.gov/ScienceResearch/BioinformaticsTools/EstrogenicActivityDatabaseEADB/default.htm) and made it freely available to the public. EADB contains 18,114 estrogenic activity data points collected for 8212 chemicals tested in 1284 binding, reporter gene, cell proliferation, and in vivo assays in 11 different species. The chemicals cover a broad chemical structure space and the data span a wide range of activities. A set of tools allows users to access EADB and evaluate potential endocrine activity of

  15. CGMD: An integrated database of cancer genes and markers.

    PubMed

    Pradeepkiran, Jangampalli Adi; Sainath, Sri Bhashyam; Kumar, Konidala Kramthi; Balasubramanyam, Lokanada; Prabhakar, Kodali Vidya; Bhaskar, Matcha

    2015-07-10

    Integrating cancer genes and markers with experimental evidence might provide valuable information for the further investigation of crosstalk between tumor genes and markers in cancer biology. To achieve this objective, we developed a database known as the Cancer Gene Marker Database (CGMD), which integrates data on tumor genes and markers based on experimental evidence. The major goal of CGMD is to provide the following: 1) current systematic treatment approaches and recent advances in different cancer treatments; 2) the aggregation of different genes and markers by their molecular characteristics and pathway associations; and 3) free access to the data compiled by CGMD at http://cgmd.in/. The database consists of 309 genes and 206 markers, as well as a list of 40 different human cancers, with detailed descriptions of all characterized markers. CGMD provides complete cancer annotations and molecular descriptions of cancer genes and markers such as CpG islands, promoters, exons, PDB structures, active sites and domains.

  16. CGMD: An integrated database of cancer genes and markers

    PubMed Central

    Pradeepkiran, Jangampalli Adi; Sainath, Sri Bhashyam; Kramthi Kumar, Konidala; Balasubramanyam, Lokanada; Vidya Prabhakar, Kodali; Bhaskar, Matcha

    2015-01-01

    Integrating cancer genes and markers with experimental evidence might provide valuable information for the further investigation of crosstalk between tumor genes and markers in cancer biology. To achieve this objective, we developed a database known as the Cancer Gene Marker Database (CGMD), which integrates data on tumor genes and markers based on experimental evidence. The major goal of CGMD is to provide the following: 1) current systematic treatment approaches and recent advances in different cancer treatments; 2) the aggregation of different genes and markers by their molecular characteristics and pathway associations; and 3) free access to the data compiled by CGMD at http://cgmd.in/. The database consists of 309 genes and 206 markers, as well as a list of 40 different human cancers, with detailed descriptions of all characterized markers. CGMD provides complete cancer annotations and molecular descriptions of cancer genes and markers such as CpG islands, promoters, exons, PDB structures, active sites and domains. PMID:26160459

  17. Fun Databases: My Top Ten.

    ERIC Educational Resources Information Center

    O'Leary, Mick

    1992-01-01

    Provides reviews of 10 online databases: Consumer Reports; Public Opinion Online; Encyclopedia of Associations; Official Airline Guide Adventure Atlas and Events Calendar; CENDATA; Hollywood Hotline; Fearless Taster; Soap Opera Summaries; and Human Sexuality. (LRW)

  18. Development, databases and the Internet.

    PubMed

    Bard, J B; Davies, J A

    1995-11-01

    There is now a rapidly expanding population of interlinked developmental biology databases on the World Wide Web that can be readily accessed from a desk-top PC using programs such as Netscape or Mosaic. These databases cover popular organisms (Arabidopsis, Caenorhabditis, Drosophila, zebrafish, mouse, etc.) and include gene and protein sequences, lists of mutants, information on resources and techniques, and teaching aids. More complex are databases relating domains of gene expression to embryonic anatomy and these range from existing text-based systems for specific organs such as kidney, to a massive project under development, that will cover gene expression during the whole of mouse embryogenesis. In this brief article, we review selected examples of databases currently available, look forward to what will be available soon, and explain how to gain access to the World Wide Web.

  19. SUPERSITES INTEGRATED RELATIONAL DATABASE (SIRD)

    EPA Science Inventory

    As part of EPA's Particulate Matter (PM) Supersites Program (Program), the University of Maryland designed and developed the Supersites Integrated Relational Database (SIRD). Measurement data in SIRD include comprehensive air quality data from the 7 Supersite program locations f...

  20. Freshwater Biological Traits Database (Traits)

    EPA Pesticide Factsheets

    The traits database was compiled for a project on climate change effects on river and stream ecosystems. The traits data, gathered from multiple sources, focused on information published or otherwise well-documented by trustworthy sources.

  1. InterAction Database (IADB)

    Cancer.gov

    The InterAction Database includes demographic and prescription information for more than 500,000 patients in the northern and middle Netherlands and has been integrated with other systems to enhance data collection and analysis.

  2. Marine and Hydrokinetic Technology Database

    DOE Data Explorer

    DOE’s Marine and Hydrokinetic Technology Database provides up-to-date information on marine and hydrokinetic renewable energy, both in the U.S. and around the world. The database includes wave, tidal, current, and ocean thermal energy, and contains information on the various energy conversion technologies, companies active in the field, and development of projects in the water. Depending on the needs of the user, the database can present a snapshot of projects in a given region, assess the progress of a certain technology type, or provide a comprehensive view of the entire marine and hydrokinetic energy industry. Results are displayed as a list of technologies, companies, or projects. Data can be filtered by a number of criteria, including country/region, technology type, generation capacity, and technology or project stage. The database was updated in 2009 to include ocean thermal energy technologies, companies, and projects.

  3. The Danish quality database for prehospital emergency medical services

    PubMed Central

    Frischknecht Christensen, Erika; Berlac, Peter Anthony; Nielsen, Henrik; Christiansen, Christian Fynbo

    2016-01-01

    Aim of database: The aim of the Danish quality database for prehospital emergency medical services (QEMS) is to assess, monitor, and improve the quality of prehospital emergency medical service care in the entire prehospital patient pathway. The aim of this review is to describe the design and the implementation of QEMS. Study population: The study population consists of all “112 patient contacts” defined as emergency patients, where the entrance to health care is a 112 call forwarded to one of the five regional emergency medical coordination centers in Denmark since January 1, 2014. The estimated annual number of included “112 patients” is 300,000–350,000. Main variables: We defined nine quality indicators and the following variables: time stamps for emergency calls received at one of the five regional emergency medical coordination centers, dispatch of prehospital unit(s), arrival of first prehospital unit, arrival of first supplemental prehospital unit, and mission completion. Finally, professional level and type of the prehospital resource dispatched to an incident and end-of-mission status (mission completed by phone, on scene, or admission to hospital) are registered. Descriptive data: Descriptive data included age, region, and Danish Index for Emergency Care including urgency level. Conclusion: QEMS is a new database under establishment and is expected to provide the basis for quality improvement in the prehospital setting and in the entire patient care pathway, for example, by providing prehospital data for research and other quality databases. PMID:27843347

  4. ARN: analysis and prediction by adipogenic professional database.

    PubMed

    Huang, Yan; Wang, Li; Zan, Lin-Sen

    2016-08-08

    Adipogenesis is the process of cell differentiation by which mesenchymal stem cells become adipocytes. Extensive research is ongoing to identify genes, their protein products, and microRNAs that correlate with fat cell development. The existing databases have focused on certain types of regulatory factors and interactions. However, there is no relationship between the results of the experimental studies on adipogenesis and these databases because of the lack of an information center. This information fragmentation hampers the identification of key regulatory genes and pathways. Thus, it is necessary to provide an information center that is quickly and easily accessible to researchers in this field. We selected and integrated data from eight external databases based on the results of text-mining, and constructed a publicly available database and web interface (URL: http://210.27.80.93/arn/), which contained 30873 records related to adipogenic differentiation. Then, we designed an online analysis tool to analyze the experimental data or form a scientific hypothesis about adipogenesis through Swanson's literature-based discovery process. Furthermore, we calculated the "Impact Factor" ("IF") value that reflects the importance of each node by counting the numbers of relation records, expression records, and prediction records for each node. This platform can support ongoing adipogenesis research and contribute to the discovery of key regulatory genes and pathways.
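
    The record-counting idea behind the "IF" value can be sketched as follows. The abstract does not give the exact formula, so this example simply assumes an unweighted sum of the three record counts per node. The function name, the record layouts, and the toy records are all hypothetical.

```python
from collections import Counter


def impact_factors(relations, expressions, predictions):
    """Score each node by the number of records that mention it.

    relations, predictions: iterables of (node_a, node_b) pairs
    expressions: iterable of node names, one per expression record
    Assumption: every record contributes 1 to each node it mentions.
    """
    scores = Counter()
    for node_a, node_b in relations:
        scores[node_a] += 1
        scores[node_b] += 1
    for node in expressions:
        scores[node] += 1
    for node_a, node_b in predictions:
        scores[node_a] += 1
        scores[node_b] += 1
    return scores


# Toy records invented for this example; they are not taken from the database.
relations = [("PPARG", "CEBPA"), ("PPARG", "FABP4")]
expressions = ["PPARG", "CEBPA"]
predictions = [("miR-27", "PPARG")]

scores = impact_factors(relations, expressions, predictions)
ranked = scores.most_common()  # nodes ordered from highest to lowest score
```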

  5. Materials Inventory Database for the Light Water Reactor Sustainability Program

    SciTech Connect

    Kazi Ahmed; Shannon M. Bragg-Sitton

    2013-08-01

    Scientific research involves the purchasing, processing, characterization, and fabrication of many sample materials. The history of such materials can become complicated over their lifetime – materials might be cut into pieces or moved to various storage locations, for example. A database with built-in functions to track these kinds of processes facilitates well-organized research. The Material Inventory Database Accounting System (MIDAS) is an easy-to-use tracking and reference system for such items. The Light Water Reactor Sustainability Program (LWRS), which seeks to advance the long-term reliability and productivity of existing nuclear reactors in the United States through multiple research pathways, proposed MIDAS as an efficient way to organize and track all items used in its research. The database software ensures traceability of all items used in research using built-in functions which can emulate actions on tracked items – fabrication, processing, splitting, and more – by performing operations on the data. MIDAS can recover and display the complete history of any item as a simple report. To ensure the database functions suitably for the organization of research, it was developed alongside a specific experiment to test accident-tolerant nuclear fuel cladding under the LWRS Advanced Light Water Reactor Nuclear Fuels Pathway. MIDAS kept track of materials used in this experiment from receipt at the laboratory through all processes, test conduct and, ultimately, post-test analysis. By the end of this process, the database proved to be the right tool for this program. The database software will help LWRS conduct research experiments more efficiently, from simple characterization tests to in-reactor experiments. Furthermore, MIDAS is a universal tool that any other research team could use to organize their material inventory.

  6. Small Business Innovations (Integrated Database)

    NASA Technical Reports Server (NTRS)

    1992-01-01

    Because of the diversity of NASA's information systems, it was necessary to develop DAVID as a central database management system. Under a Small Business Innovation Research (SBIR) grant, Ken Wanderman and Associates, Inc. designed software tools enabling scientists to interface with DAVID and commercial database management systems, as well as artificial intelligence programs. The software has been installed at a number of data centers and is commercially available.

  7. Atomic and Molecular Databases, VAMDC

    NASA Astrophysics Data System (ADS)

    Dubernet, M. L.; Zwölf, C. M.; Moreau, N.; Ba, Y. A.

    2016-10-01

    The VAMDC Consortium is a worldwide consortium which federates Atomic and Molecular databases through an e-science infrastructure and a political organisation. About 90% of the inter-connected databases handle data that are used for the interpretation of spectra and for the modeling of media of many fields of astrophysics. This paper presents how the VAMDC Consortium is organised in order to publish atomic and molecular data for astrophysics.

  8. Air Compliance Complaint Database (ACCD)

    EPA Pesticide Factsheets

    THIS DATA ASSET NO LONGER ACTIVE: This is metadata documentation for the Region 7 Air Compliance Complaint Database (ACCD) which logs all air pollution complaints received by Region 7. It contains information about the complaint along with how the complaint was addressed. The Air and Waste Management Division is the primary managing entity for this database. This work falls under objectives for EPA's 2003-2008 Strategic Plan (Goal 1) for Clean Air & Global Climate Change, which are to achieve healthier outdoor air.

  9. Ariel Database Rule System Project

    DTIC Science & Technology

    1992-01-14

    Distribution unlimited. ABSTRACT (Maximum 200 words): The Ariel project has culminated in several advancements in active database... [4] Moez Chaabouni. A top-level discrimination network for database rule systems. Master’s thesis, Dept. of Computer Science and Eng., Wright State... Moez Chaabouni. The IBS-tree: A data structure for finding all intervals that overlap a point. Technical Report WSU-CS-90-11, Dept. of Computer

  10. World electric power plants database

    SciTech Connect

    2006-06-15

    This global database provides records for 104,000 generating units in over 220 countries. These units include installed and projected facilities, central stations and distributed plants operated by utilities, independent power companies and commercial and self-generators. Each record includes information on: geographic location and operating company; technology, fuel and boiler; generator manufacturers; steam conditions; unit capacity and age; turbine/engine; architect/engineer and constructor; and pollution control equipment. The database is issued quarterly.

  11. Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology.

    PubMed

    Karp, Peter D; Latendresse, Mario; Paley, Suzanne M; Krummenacker, Markus; Ong, Quang D; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M; Caspi, Ron

    2016-09-01

    Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms.

  12. LncRNA mediated regulation of aging pathways in Drosophila melanogaster during dietary restriction

    PubMed Central

    Tu, Jianbo; Gaur, Uma; Mao, Xueping; Fan, Xiaolan; Li, Diyan; Li, Ying; Yang, Mingyao

    2016-01-01

    It is well known that dietary restriction (DR) extends lifespan in many species. Long non-coding RNAs (lncRNAs) play an important role in the regulation of cell senescence and of important age-related signaling pathways. Here, we profiled the lncRNA and mRNA transcriptomes of fruit flies at day 7 and day 42 under DR and fully fed conditions. In total, 102 differentially expressed lncRNAs and 1406 differentially expressed coding genes were identified. Most informatively, we found a large number of differentially expressed lncRNAs and their targets enriched in GO and KEGG analyses. We discovered some new aging-related signaling pathways during DR, such as Hippo signaling pathway - fly, phototransduction - fly and protein processing in endoplasmic reticulum. The novel lncRNAs XLOC_092363 and XLOC_166557 were found to be located within the 10 kb upstream sequences of the hairy and ems promoters, respectively. Furthermore, the tissue specificity of some novel lncRNAs was analyzed at day 7 of DR in fly head, gut and fat body. Silencing of lncRNA XLOC_076307 also altered the expression level of its targets, including Gadd45 (involved in the FoxO signaling pathway). Together, the results implicate many lncRNAs that are closely associated with dietary restriction, which could provide a resource for lncRNA research in the field of aging and age-related disease. PMID:27687893

  13. DMTB: the magnetotactic bacteria database

    NASA Astrophysics Data System (ADS)

    Pan, Y.; Lin, W.

    2012-12-01

    Magnetotactic bacteria (MTB) are of interest in biogeomagnetism, rock magnetism, microbiology, biomineralization, and advanced magnetic materials because of their ability to synthesize highly ordered intracellular nano-sized magnetic minerals, magnetite or greigite. Great strides in MTB studies have been made in the past few decades. More than 600 articles concerning MTB have been published. These rapidly growing data are stimulating cross-disciplinary studies in such fields as biogeomagnetism. We have compiled the first online database for MTB, i.e., Database of Magnetotactic Bacteria (DMTB, http://database.biomnsl.com). It contains useful information on 16S rRNA gene sequences, oligonucleotides, and magnetic properties of MTB, and corresponding ecological metadata of sampling sites. The 16S rRNA gene sequences are collected from the GenBank database, while all other data are collected from the scientific literature. Rock magnetic properties for both uncultivated and cultivated MTB species are also included. In the DMTB database, data are accessible through four main interfaces: Site Sort, Phylo Sort, Oligonucleotides, and Magnetic Properties. References in each entry serve as links to specific pages within public databases. The online comprehensive DMTB will provide a very useful data resource for researchers from various disciplines, e.g., microbiology, rock magnetism and paleomagnetism, biogeomagnetism, magnetic material sciences and others.

  14. The new international GLE database

    NASA Astrophysics Data System (ADS)

    Duldig, M. L.; Watts, D. J.

    2001-08-01

    The Australian Antarctic Division has agreed to host the international GLE database. Access to the database is via a world-wide-web interface and initially covers all GLEs since the start of the 22nd solar cycle. Access restriction for recent events is controlled by password protection and these data are available only to those groups contributing data to the database. The restrictions to data will be automatically removed for events older than 2 years, in accordance with the data exchange provisions of the Antarctic Treaty. Use of the data requires acknowledgment of the database as the source of the data and acknowledgment of the specific groups that provided the data used. Furthermore, some groups that provide data to the database have specific acknowledgment requirements or wording. A new submission format has been developed that will allow easier exchange of data, although the old format will be acceptable for some time. Data download options include direct web based download and email. Data may also be viewed as listings or plots with web browsers. Search options have also been incorporated. Development of the database will be ongoing with extension to viewing and delivery options, addition of earlier data and the development of mirror sites. It is expected that two mirror sites, one in North America and one in Europe, will be developed to enable fast access for the whole cosmic ray community.

  15. Database Reports Over the Internet

    NASA Technical Reports Server (NTRS)

    Smith, Dean Lance

    2002-01-01

    Most of the summer was spent developing software that would permit existing test report forms to be printed over the web on a printer that is supported by Adobe Acrobat Reader. The data is stored in a DBMS (Data Base Management System). The client asks for the information from the database using an HTML (Hyper Text Markup Language) form in a web browser. JavaScript is used with the forms to assist the user and verify the integrity of the entered data. Queries to a database are made in SQL (Structured Query Language), a widely supported standard for making queries to databases. Java servlets, programs written in the Java programming language running under the control of network server software, interrogate the database and complete a PDF form template kept in a file. The completed report is sent to the browser requesting the report. Some errors are sent to the browser in an HTML web page, others are reported to the server. Access to the databases was restricted since the data are being transported to new DBMS software that will run on new hardware. However, the SQL queries were made to Microsoft Access, a DBMS that is available on most PCs (Personal Computers). Access does support the SQL commands that were used, and a database was created with Access that contained typical data for the report forms. Some of the problems and features are discussed below.
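
    The query step described in this record can be sketched as follows. This is only an illustration under stated assumptions: the original work used Java servlets against Microsoft Access, whereas the sketch swaps in Python's standard sqlite3 module with an in-memory database, and the table name test_reports, its columns, and both function names are hypothetical. It shows a parameterized SQL query, which keeps user-entered form values out of the SQL text.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database holding hypothetical report data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_reports ("
        "report_id INTEGER PRIMARY KEY, test_name TEXT, result TEXT)"
    )
    conn.executemany(
        "INSERT INTO test_reports VALUES (?, ?, ?)",
        [(1, "vibration", "pass"), (2, "thermal", "fail")],
    )
    conn.commit()
    return conn


def fetch_report(conn, report_id):
    """Return one report row as a dict, or None if the id is unknown."""
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    row = conn.execute(
        "SELECT report_id, test_name, result "
        "FROM test_reports WHERE report_id = ?",  # '?' is a bound parameter
        (report_id,),
    ).fetchone()
    return dict(row) if row is not None else None
```

    A caller would pass the value taken from the form field as report_id; the returned dict could then be used to fill a report template.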

  16. Rice Glycosyltransferase (GT) Phylogenomic Database

    DOE Data Explorer

    Ronald, Pamela

    The Ronald Laboratory staff at the University of California-Davis has a primary research focus on the genes of the rice plant. They study the role that genetics plays in the way rice plants respond to their environment. They created the Rice GT Database in order to integrate functional genomic information for putative rice Glycosyltransferases (GTs). This database contains information on nearly 800 putative rice GTs (gene models) identified by sequence similarity searches based on the Carbohydrate Active enZymes (CAZy) database. The Rice GT Database provides a platform to display user-selected functional genomic data on a phylogenetic tree. This includes sequence information, mutant line information, expression data, etc. An interactive chromosomal map shows the position of all rice GTs, and links to rice annotation databases are included. The format is intended to "facilitate the comparison of closely related GTs within different families, as well as perform global comparisons between sets of related families." [From http://ricephylogenomics.ucdavis.edu/cellwalls/gt/genInfo.shtml] See also the primary paper discussing this work: Peijian Cao, Laura E. Bartley, Ki-Hong Jung and Pamela C. Ronald. Construction of a Rice Glycosyltransferase Phylogenomic Database and Identification of Rice-Diverged Glycosyltransferases. Molecular Plant, 2008, 1(5): 858-877.

  17. Computational identification of altered metabolism using gene expression and metabolic pathways.

    PubMed

    Nam, Hojung; Lee, Jinwon; Lee, Doheon

    2009-07-01

    Understanding altered metabolism is an important issue because altered metabolism is often revealed as a cause or an effect in pathogenesis. It has also been shown to be an important factor in the manipulation of an organism's metabolism in metabolic engineering. Unfortunately, it is not yet possible to measure the concentration levels of all metabolites in the genome-wide scale of a metabolic network; consequently, a method that infers the alteration of metabolism is beneficial. The present study proposes a computational method that identifies genome-wide altered metabolism by analyzing functional units of KEGG pathways. As control of a metabolic pathway is accomplished by altering the activity of at least one rate-determining step enzyme, not all gene expressions of enzymes in the pathway demonstrate significant changes even if the pathway is altered. Therefore, we measure the alteration levels of a metabolic pathway by selectively observing expression levels of significantly changed genes in a pathway. The proposed method was applied to two strains of Saccharomyces cerevisiae gene expression profiles measured in very high-gravity (VHG) fermentation. The method identified altered metabolic pathways whose properties are related to ethanol and osmotic stress responses which had been known to be observed in VHG fermentation because of the high sugar concentration in growth media and high ethanol concentration in fermentation products. With the identified altered pathways, the proposed method achieved best accuracy and sensitivity rates for the Red Star (RS) strain compared to other three related studies (gene-set enrichment analysis (GSEA), significance analysis of microarray to gene set (SAM-GS), reporter metabolite), and for the CEN.PK 113-7D (CEN) strain, the proposed method and the GSEA method showed comparably similar performances.
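
    The idea of scoring a pathway only from its significantly changed genes can be sketched as below. The abstract does not give the scoring formula, so everything specific here is an assumption for illustration: the function name pathway_alteration, the use of a p-value cutoff alpha, and the choice of the mean absolute log fold-change as the score are not taken from the study.

```python
def pathway_alteration(genes, log_fc, p_values, alpha=0.05):
    """Score a pathway from its significantly changed genes only.

    genes    -- gene identifiers belonging to the pathway
    log_fc   -- dict: gene -> log fold-change between conditions
    p_values -- dict: gene -> p-value of the expression change
    alpha    -- significance cutoff (assumed, not from the study)
    """
    # Keep only genes whose change passes the significance cutoff.
    selected = [g for g in genes if g in p_values and p_values[g] < alpha]
    if not selected:
        return 0.0  # no significantly changed gene: treat as unaltered
    # Assumed score: mean absolute log fold-change of the selected genes.
    return sum(abs(log_fc[g]) for g in selected) / len(selected)
```

    Because unchanged genes are excluded rather than averaged in, a pathway controlled through one or two rate-determining enzymes is not diluted by the many enzymes whose expression stays flat.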

  18. Tripal v1.1: a standards-based toolkit for construction of online genetic and genomic databases.

    PubMed

    Sanderson, Lacey-Anne; Ficklin, Stephen P; Cheng, Chun-Huai; Jung, Sook; Feltus, Frank A; Bett, Kirstin E; Main, Dorrie

    2013-01-01

    Tripal is an open-source freely available toolkit for construction of online genomic and genetic databases. It aims to facilitate development of community-driven biological websites by integrating the GMOD Chado database schema with Drupal, a popular website creation and content management software. Tripal provides a suite of tools for interaction with a Chado database and display of content therein. The tools are designed to be generic to support the various ways in which data may be stored in Chado. Previous releases of Tripal have supported organisms, genomic libraries, biological stocks, stock collections and genomic features, their alignments and annotations. Also, Tripal and its extension modules provided loaders for commonly used file formats such as FASTA, GFF, OBO, GAF, BLAST XML, KEGG heir files and InterProScan XML. Default generic templates were provided for common views of biological data, which could be customized using an open Application Programming Interface to change the way data are displayed. Here, we report additional tools and functionality that are part of release v1.1 of Tripal. These include (i) a new bulk loader that allows a site curator to import data stored in a custom tab delimited format; (ii) full support of every Chado table for Drupal Views (a powerful tool allowing site developers to construct novel displays and search pages); (iii) new modules including 'Feature Map', 'Genetic', 'Publication', 'Project', 'Contact' and the 'Natural Diversity' modules. Tutorials, mailing lists, download and set-up instructions, extension modules and other documentation can be found at the Tripal website located at http://tripal.info. DATABASE URL: http://tripal.info/.

  19. Electron Inelastic-Mean-Free-Path Database

    National Institute of Standards and Technology Data Gateway

    SRD 71 NIST Electron Inelastic-Mean-Free-Path Database (PC database, no charge)   This database provides values of electron inelastic mean free paths (IMFPs) for use in quantitative surface analyses by AES and XPS.

  20. The Comparative Toxicogenomics Database: update 2017

    PubMed Central

    Davis, Allan Peter; Grondin, Cynthia J.; Johnson, Robin J.; Sciaky, Daniela; King, Benjamin L.; McMorran, Roy; Wiegers, Jolene; Wiegers, Thomas C.; Mattingly, Carolyn J.

    2017-01-01

    The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) provides information about interactions between chemicals and gene products, and their relationships to diseases. Core CTD content (chemical-gene, chemical-disease and gene-disease interactions manually curated from the literature) are integrated with each other as well as with select external datasets to generate expanded networks and predict novel associations. Today, core CTD includes more than 30.5 million toxicogenomic connections relating chemicals/drugs, genes/proteins, diseases, taxa, Gene Ontology (GO) annotations, pathways, and gene interaction modules. In this update, we report a 33% increase in our core data content since 2015, describe our new exposure module (that harmonizes exposure science information with core toxicogenomic data) and introduce a novel dataset of GO-disease inferences (that identify common molecular underpinnings for seemingly unrelated pathologies). These advancements centralize and contextualize real-world chemical exposures with molecular pathways to help scientists generate testable hypotheses in an effort to understand the etiology and mechanisms underlying environmentally influenced diseases. PMID:27651457

  1. Curation and Computational Design of Bioenergy-Related Metabolic Pathways

    SciTech Connect

    Karp, Peter D.

    2014-09-12

    Pathway Tools is a systems-biology software package written by SRI International (SRI) that produces Pathway/Genome Databases (PGDBs) for organisms with a sequenced genome. Pathway Tools also provides a wide range of capabilities for analyzing predicted metabolic networks and user-generated omics data. More than 5,000 academic, industrial, and government groups have licensed Pathway Tools. This user community includes researchers at all three DOE bioenergy centers, as well as academic and industrial metabolic engineering (ME) groups. An integral part of the Pathway Tools software is MetaCyc, a large, multiorganism database of metabolic pathways and enzymes that SRI and its academic collaborators manually curate. This project included two main goals: I. Enhance the MetaCyc content of bioenergy-related enzymes and pathways. II. Develop computational tools for engineering metabolic pathways that satisfy specified design goals, in particular for bioenergy-related pathways. In part I, SRI proposed to significantly expand the coverage of bioenergy-related metabolic information in MetaCyc, followed by the generation of organism-specific PGDBs for all energy-relevant organisms sequenced at the DOE Joint Genome Institute (JGI). Part I objectives included: 1: Expand the content of MetaCyc to include bioenergy-related enzymes and pathways. 2: Enhance the Pathway Tools software to enable display of complex polymer degradation processes. 3: Create new PGDBs for the energy-related organisms sequenced by JGI, update existing PGDBs with new MetaCyc content, and make these data available to JBEI via the BioCyc website. In part II, SRI proposed to develop an efficient computational tool for the engineering of metabolic pathways. Part II objectives included: 4: Develop computational tools for generating metabolic pathways that satisfy specified design goals, enabling users to specify parameters such as starting and ending compounds, and preferred or disallowed intermediate compounds
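
    A route search constrained by a starting compound, an ending compound and disallowed intermediates can be sketched with a breadth-first search. This is a minimal illustration, not the Pathway Tools algorithm: the function name find_route, the dictionary representation of reactions, and the compound names used in any example are hypothetical, and real tools also weigh factors such as atom conservation and enzyme availability.

```python
from collections import deque


def find_route(reactions, start, goal, disallowed=frozenset()):
    """Return the shortest compound path from start to goal, or None.

    reactions  -- dict: substrate -> iterable of products reachable
                  in one reaction step (hypothetical representation)
    disallowed -- compounds that must not appear anywhere on the route
    """
    if start in disallowed or goal in disallowed:
        return None
    queue = deque([[start]])  # each queue item is a path so far
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for product in reactions.get(path[-1], ()):
            if product not in seen and product not in disallowed:
                seen.add(product)
                queue.append(path + [product])
    return None  # no route satisfies the constraints
```

    Breadth-first order means the first route found uses the fewest reaction steps; preferred intermediates could be handled by ranking candidate routes afterwards.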

  2. The IPD and IMGT/HLA database: allele variant databases.

    PubMed

    Robinson, James; Halliwell, Jason A; Hayhurst, James D; Flicek, Paul; Parham, Peter; Marsh, Steven G E

    2015-01-01

    The Immuno Polymorphism Database (IPD) was developed to provide a centralized system for the study of polymorphism in genes of the immune system. Through the IPD project we have established a central platform for the curation and publication of locus-specific databases involved either directly or related to the function of the Major Histocompatibility Complex in a number of different species. We have collaborated with specialist groups or nomenclature committees that curate the individual sections before they are submitted to IPD for online publication. IPD consists of five core databases, with the IMGT/HLA Database as the primary database. Through the work of the various nomenclature committees, the HLA Informatics Group and in collaboration with the European Bioinformatics Institute we are able to provide public access to this data through the website http://www.ebi.ac.uk/ipd/. The IPD project continues to develop with new tools being added to address scientific developments, such as Next Generation Sequencing, and to address user feedback and requests. Regular updates to the website ensure that new and confirmatory sequences are dispersed to the immunogenetics community, and the wider research and clinical communities.

  3. The IPD and IMGT/HLA database: allele variant databases

    PubMed Central

    Robinson, James; Halliwell, Jason A.; Hayhurst, James D.; Flicek, Paul; Parham, Peter; Marsh, Steven G. E.

    2015-01-01

    The Immuno Polymorphism Database (IPD) was developed to provide a centralized system for the study of polymorphism in genes of the immune system. Through the IPD project we have established a central platform for the curation and publication of locus-specific databases involved either directly or related to the function of the Major Histocompatibility Complex in a number of different species. We have collaborated with specialist groups or nomenclature committees that curate the individual sections before they are submitted to IPD for online publication. IPD consists of five core databases, with the IMGT/HLA Database as the primary database. Through the work of the various nomenclature committees, the HLA Informatics Group and in collaboration with the European Bioinformatics Institute we are able to provide public access to this data through the website http://www.ebi.ac.uk/ipd/. The IPD project continues to develop with new tools being added to address scientific developments, such as Next Generation Sequencing, and to address user feedback and requests. Regular updates to the website ensure that new and confirmatory sequences are dispersed to the immunogenetics community, and the wider research and clinical communities. PMID:25414341

  4. De Novo Transcriptome Analysis Provides Insights into Immune Related Genes and the RIG-I-Like Receptor Signaling Pathway in the Freshwater Planarian (Dugesia japonica)

    PubMed Central

    Deng, Hongkuan; Zhang, Yichao; Sun, Xiaowen; Zhu, Guangzhong; Liu, Baohua; Zhao, Bosheng

    2016-01-01

    Background: The freshwater planarian Dugesia japonica (D. japonica) possesses an extraordinary ability to regenerate lost organs or body parts. Interestingly, in the process of regeneration, there is little wound infection, suggesting that D. japonica has a formidable innate immune system. The importance of the immune system prompted us to search for immune-related genes and RIG-I-like receptor signaling pathways. Results: Transcriptome sequencing of D. japonica was performed on an Illumina HiSeq 2000 platform. A total of 27,180 transcripts were obtained with the Trinity assembler. CEGMA analysis and mapping of all trimmed reads back to the assembly showed that our transcriptome assembly covered most of the whole transcriptome. Of the 27,180 transcripts, 23,888 contained an ORF (open reading frame) and were highly similar to those in Schistosoma mansoni by BLASTX analysis. 8,079 transcripts (29.7%) and 8,668 transcripts (31.9%) were annotated by Blast2GO and KEGG, respectively. A DYNLRB-like gene was cloned to verify its role in the immune response. Finally, the expression patterns of 4 genes (RIG-I, TRAF3, TRAF6, P38) in the RIG-I-like receptor signaling pathway were detected, and the results showed they are very likely to be involved in the planarian immune response. Conclusion: RNA-Seq analysis based on next-generation sequencing technology was an efficient approach to discover critical genes and to understand their corresponding biological functions. Through GO and KEGG analysis, several critical and conserved signaling pathways and genes related to the RIG-I-like receptor signaling pathway were identified. Four candidate genes were selected to identify their expression dynamics in the process of pathogen stimulation. These annotated transcripts of D. japonica provide a useful resource for subsequent investigation of other important pathways. PMID:26986572

  5. Reactome: a knowledgebase of biological pathways

    PubMed Central

    Joshi-Tope, G.; Gillespie, M.; Vastrik, I.; D'Eustachio, P.; Schmidt, E.; de Bono, B.; Jassal, B.; Gopinath, G.R.; Wu, G.R.; Matthews, L.; Lewis, S.; Birney, E.; Stein, L.

    2005-01-01

    Reactome, located at http://www.reactome.org is a curated, peer-reviewed resource of human biological processes. Given the genetic makeup of an organism, the complete set of possible reactions constitutes its reactome. The basic unit of the Reactome database is a reaction; reactions are then grouped into causal chains to form pathways. The Reactome data model allows us to represent many diverse processes in the human system, including the pathways of intermediary metabolism, regulatory pathways, and signal transduction, and high-level processes, such as the cell cycle. Reactome provides a qualitative framework, on which quantitative data can be superimposed. Tools have been developed to facilitate custom data entry and annotation by expert biologists, and to allow visualization and exploration of the finished dataset as an interactive process map. Although our primary curational domain is pathways from Homo sapiens, we regularly create electronic projections of human pathways onto other organisms via putative orthologs, thus making Reactome relevant to model organism research communities. The database is publicly available under open source terms, which allows both its content and its software infrastructure to be freely used and redistributed. PMID:15608231

  6. Reactome Pathway Analysis to Enrich Biological Discovery in Proteomics Datasets

    PubMed Central

    Haw, Robin; Hermjakob, Henning; D’Eustachio, Peter; Stein, Lincoln

    2012-01-01

    Reactome (http://www.reactome.org) is an open source, expert-authored, peer-reviewed, manually curated database of reactions, pathways and biological processes. We provide an intuitive web-based user interface to pathway knowledge and a suite of data analysis tools. The Pathway Browser is a Systems Biology Graphical Notation (SBGN)-like visualization system that supports manual navigation of pathways by zooming, scrolling and event highlighting, and that exploits PSI Common Query Interface (PSICQUIC) web services to overlay pathways with molecular interaction data from the Reactome Functional Interaction (FI) Network and interaction databases such as IntAct, ChEMBL, and BioGRID. Pathway and Expression Analysis tools employ web services to provide ID mapping, pathway assignment and over-representation analysis of user-supplied datasets. By applying Ensembl Compara to curated human proteins and reactions, Reactome generates pathway inferences for 20 other species. The Species Comparison tool provides a summary of results for each of these species as a table showing numbers of orthologous proteins found by pathway from which users can navigate to inferred details for specific proteins and reactions. Reactome’s diverse pathway knowledge and suite of data analysis tools provide a platform for data mining, modeling and the analysis of large-scale proteomics datasets. PMID:21751369
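
    Over-representation analysis of a user-supplied dataset can be sketched with a one-sided hypergeometric test. The abstract does not state which statistical test Reactome applies, so the hypergeometric test is an assumption here (it is a common choice for this kind of analysis), and the function name overrepresentation_p and its parameter names are hypothetical.

```python
from math import comb


def overrepresentation_p(population, in_pathway, sample, hits):
    """P-value that a sample contains at least `hits` pathway members.

    population -- total number of annotated proteins (N)
    in_pathway -- number of those that belong to the pathway (K)
    sample     -- size of the user-supplied protein list (n)
    hits       -- number of list members found in the pathway (k)
    """
    # Upper tail of the hypergeometric distribution: P(X >= k).
    upper = min(sample, in_pathway)
    favourable = sum(
        comb(in_pathway, i) * comb(population - in_pathway, sample - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, sample)
```

    In practice one p-value is computed per pathway and the set is then corrected for multiple testing, for example with a false discovery rate procedure.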

  7. The challenge of constructing, classifying, and representing metabolic pathways.

    PubMed

    Caspi, Ron; Dreher, Kate; Karp, Peter D

    2013-08-01

    Scientists, educators, and students benefit from having free and centralized access to the wealth of metabolic information that has been gathered over the decades. Curators of the MetaCyc database work to present this information in an easily understandable pathway-based framework. MetaCyc is used not only as an encyclopedic resource for metabolic information but also as a template for the pathway prediction software that generates pathway/genome databases for thousands of organisms with sequenced genomes (available at www.biocyc.org). Curators need to define pathway boundaries and classify pathways within a broader pathway ontology to maximize the utility of the pathways to both users and the pathway prediction software. These seemingly simple tasks pose several challenges. This review describes these challenges as well as the criteria that need to be considered, and the rules that have been developed by MetaCyc curators as they make decisions regarding the representation and classification of metabolic pathway information in MetaCyc. The functional consequences of these decisions in regard to pathway prediction in new species are also discussed.

  8. Towards imaging metabolic pathways in tissues.

    PubMed

    Dekker, Tim J A; Jones, Emrys A; Corver, Willem E; van Zeijl, René J M; Deelder, André M; Tollenaar, Rob A E M; Mesker, Wilma E; Morreau, Hans; McDonnell, Liam A

    2015-03-01

    Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry imaging using 9-aminoacridine as the matrix leads to the detection of low mass metabolites and lipids directly from cancer tissues. These included lactate and pyruvate for studying the Warburg effect, as well as succinate and fumarate, metabolites whose accumulation is associated with specific syndromes. By using the pathway information present in the human metabolome database, it was possible to identify regions within tumor tissue samples with distinct metabolic signatures that were consistent with known tumor biology. We present a data analysis workflow for assessing metabolic pathways in their histopathological context.

  9. The Chicago Thoracic Oncology Database Consortium: A Multisite Database Initiative

    PubMed Central

    Carey, George B; Tan, Yi-Hung Carol; Bokhary, Ujala; Itkonen, Michelle; Szeto, Kyle; Wallace, James; Campbell, Nicholas; Hensing, Thomas; Salgia, Ravi

    2016-01-01

    Objective: An increasing amount of clinical data is available to biomedical researchers, but specifically designed database and informatics infrastructures are needed to handle this data effectively. Multiple research groups should be able to pool and share this data in an efficient manner. The Chicago Thoracic Oncology Database Consortium (CTODC) was created to standardize data collection and facilitate the pooling and sharing of data at institutions throughout Chicago and across the world. We assessed the CTODC by conducting a proof of principle investigation on lung cancer patients who took erlotinib. This study does not look into epidermal growth factor receptor (EGFR) mutations and tyrosine kinase inhibitors, but rather it discusses the development and utilization of the database involved. Methods: We have implemented the Thoracic Oncology Program Database Project (TOPDP) Microsoft Access, the Thoracic Oncology Research Program (TORP) Velos, and the TORP REDCap databases for translational research efforts. Standard operating procedures (SOPs) were created to document the construction and proper utilization of these databases. These SOPs have been made available freely to other institutions that have implemented their own databases patterned on these SOPs. Results: A cohort of 373 lung cancer patients who took erlotinib was identified. The EGFR mutation statuses of patients were analyzed. Out of the 70 patients that were tested, 55 had mutations while 15 did not. In terms of overall survival and duration of treatment, the cohort demonstrated that EGFR-mutated patients had a longer duration of erlotinib treatment and longer overall survival compared to their EGFR wild-type counterparts who received erlotinib. Discussion: The investigation successfully yielded data from all institutions of the CTODC. While the investigation identified challenges, such as the difficulty of data transfer and potential duplication of patient data, these issues can be resolved

  10. REDIdb: the RNA editing database.

    PubMed

    Picardi, Ernesto; Regina, Teresa Maria Rosaria; Brennicke, Axel; Quagliariello, Carla

    2007-01-01

    The RNA Editing Database (REDIdb) is an interactive, web-based database created and designed with the aim to allocate RNA editing events such as substitutions, insertions and deletions occurring in a wide range of organisms. The database contains both fully and partially sequenced DNA molecules for which editing information is available either by experimental inspection (in vitro) or by computational detection (in silico). Each record of REDIdb is organized in a specific flat-file containing a description of the main characteristics of the entry, a feature table with the editing events and related details and a sequence zone with both the genomic sequence and the corresponding edited transcript. REDIdb is a relational database in which the browsing and identification of editing sites has been simplified by means of two facilities to either graphically display genomic or cDNA sequences or to show the corresponding alignment. In both cases, all editing sites are highlighted in colour and their relative positions are detailed by mousing over. New editing positions can be directly submitted to REDIdb after a user-specific registration to obtain authorized secure access. This first version of REDIdb database stores 9964 editing events and can be freely queried at http://biologia.unical.it/py_script/search.html.

  11. WDDD: Worm Developmental Dynamics Database.

    PubMed

    Kyoda, Koji; Adachi, Eru; Masuda, Eriko; Nagai, Yoko; Suzuki, Yoko; Oguro, Taeko; Urai, Mitsuru; Arai, Ryoko; Furukawa, Mari; Shimada, Kumiko; Kuramochi, Junko; Nagai, Eriko; Onami, Shuichi

    2013-01-01

    During animal development, cells undergo dynamic changes in position and gene expression. A collection of quantitative information about morphological dynamics under a wide variety of gene perturbations would provide a rich resource for understanding the molecular mechanisms of development. Here, we created a database, the Worm Developmental Dynamics Database (http://so.qbic.riken.jp/wddd/), which stores a collection of quantitative information about cell division dynamics in early Caenorhabditis elegans embryos with single genes silenced by RNA-mediated interference. The information contains the three-dimensional coordinate values of the outlines of nuclear regions and the dynamics of the outlines over time. The database provides free access to 50 sets of quantitative data for wild-type embryos and 136 sets of quantitative data for RNA-mediated interference embryos corresponding to 72 of the 97 essential embryonic genes on chromosome III. The database also provides sets of four-dimensional differential interference contrast microscopy images on which the quantitative data were based. The database will provide a novel opportunity for the development of computational methods to obtain fresh insights into the mechanisms of development. The quantitative information and microscopy images can be synchronously viewed through a web browser, which is designed for easy access by experimental biologists.

  12. An Alaska Soil Carbon Database

    NASA Astrophysics Data System (ADS)

    Johnson, Kristofer; Harden, Jennifer

    2009-05-01

    Database Collaborator's Meeting; Fairbanks, Alaska, 4 March 2009; Soil carbon pools in northern high-latitude regions and their response to climate changes are highly uncertain, and collaboration is required from field scientists and modelers to establish baseline data for carbon cycle studies. The Global Change Program at the U.S. Geological Survey has funded a 2-year effort to establish a soil carbon network and database for Alaska based on collaborations from numerous institutions. To initiate a community effort, a workshop for the development of an Alaska soil carbon database was held at the University of Alaska Fairbanks. The database will be a resource for spatial and biogeochemical models of Alaska ecosystems and will serve as a prototype for a nationwide community project: the National Soil Carbon Network (http://www.soilcarb.net). Studies will benefit from the combination of multiple academic and government data sets. This collaborative effort is expected to identify data gaps and uncertainties more comprehensively. Future applications of information contained in the database will identify specific vulnerabilities of soil carbon in Alaska to climate change, disturbance, and vegetation change.

  13. Developing a DNA variant database.

    PubMed

    Fung, David C Y

    2008-01-01

    Disease- and locus-specific variant databases have been a valuable resource to clinical and research geneticists. With the recent rapid developments in technologies, the number of DNA variants detected in a typical molecular genetics laboratory easily exceeds 1,000. To keep track of the growing inventory of DNA variants, many laboratories employ information technology to store the data as well as distributing the data and its associated information to clinicians and researchers via the Web. While it is a valuable resource, the hosting of a web-accessible database requires collaboration between bioinformaticians and biologists and careful planning to ensure its usability and availability. In this chapter, a series of tutorials on building a local DNA variant database out of a sample dataset will be provided. However, this tutorial will not include programming details on building a web interface and on constructing the web application necessary for web hosting. Instead, an introduction to the two commonly used methods for hosting web-accessible variant databases will be described. Apart from the tutorials, this chapter will also consider the resources and planning required for making a variant database project successful.

  14. WDDD: Worm Developmental Dynamics Database

    PubMed Central

    Kyoda, Koji; Adachi, Eru; Masuda, Eriko; Nagai, Yoko; Suzuki, Yoko; Oguro, Taeko; Urai, Mitsuru; Arai, Ryoko; Furukawa, Mari; Shimada, Kumiko; Kuramochi, Junko; Nagai, Eriko; Onami, Shuichi

    2013-01-01

    During animal development, cells undergo dynamic changes in position and gene expression. A collection of quantitative information about morphological dynamics under a wide variety of gene perturbations would provide a rich resource for understanding the molecular mechanisms of development. Here, we created a database, the Worm Developmental Dynamics Database (http://so.qbic.riken.jp/wddd/), which stores a collection of quantitative information about cell division dynamics in early Caenorhabditis elegans embryos with single genes silenced by RNA-mediated interference. The information contains the three-dimensional coordinate values of the outlines of nuclear regions and the dynamics of the outlines over time. The database provides free access to 50 sets of quantitative data for wild-type embryos and 136 sets of quantitative data for RNA-mediated interference embryos corresponding to 72 of the 97 essential embryonic genes on chromosome III. The database also provides sets of four-dimensional differential interference contrast microscopy images on which the quantitative data were based. The database will provide a novel opportunity for the development of computational methods to obtain fresh insights into the mechanisms of development. The quantitative information and microscopy images can be synchronously viewed through a web browser, which is designed for easy access by experimental biologists. PMID:23172286

  15. The Giardia genome project database.

    PubMed

    McArthur, A G; Morrison, H G; Nixon, J E; Passamaneck, N Q; Kim, U; Hinkle, G; Crocker, M K; Holder, M E; Farr, R; Reich, C I; Olsen, G E; Aley, S B; Adam, R D; Gillin, F D; Sogin, M L

    2000-08-15

    The Giardia genome project database provides an online resource for Giardia lamblia (WB strain, clone C6) genome sequence information. The database includes edited single-pass reads, the results of BLASTX searches, and details of progress towards sequencing the entire 12 million-bp Giardia genome. Pre-sorted BLASTX results can be retrieved based on keyword searches and BLAST searches of the high throughput Giardia data can be initiated from the web site or through NCBI. Descriptions of the genomic DNA libraries, project protocols and summary statistics are also available. Although the Giardia genome project is ongoing, new sequences are made available on a bi-monthly basis to ensure that researchers have access to information that may assist them in the search for genes and their biological function. The current URL of the Giardia genome project database is www.mbl.edu/Giardia.

  16. Searching NCBI Databases Using Entrez.

    PubMed

    Gibney, Gretchen; Baxevanis, Andreas D

    2011-10-01

    One of the most widely used interfaces for the retrieval of information from biological databases is the NCBI Entrez system. Entrez capitalizes on the fact that there are pre-existing, logical relationships between the individual entries found in numerous public databases. The existence of such natural connections, mostly biological in nature, argued for the development of a method through which all the information about a particular biological entity could be found without having to sequentially visit and query disparate databases. Two basic protocols describe simple, text-based searches, illustrating the types of information that can be retrieved through the Entrez system. An alternate protocol builds upon the first basic protocol, using additional, built-in features of the Entrez system, and providing alternative ways to issue the initial query. The support protocol reviews how to save frequently issued queries. Finally, Cn3D, a structure visualization tool, is also discussed.
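
    The text-based searches described above can also be issued programmatically. The following is a minimal sketch, assuming NCBI's E-utilities ESearch endpoint is used as the programmatic counterpart of the Entrez web interface; the abstract does not describe this route, and the helper name `build_esearch_url` and the example query are hypothetical. The sketch only builds the request URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# ESearch endpoint of NCBI's E-utilities (an assumption of this sketch;
# the abstract itself only covers the Entrez web interface).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL for a simple text query against one database."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode percent-escapes the query and joins the key=value pairs.
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical query: PubMed records that mention KEGG pathways.
    print(build_esearch_url("pubmed", "kegg pathway", retmax=5))
```

    Keeping URL construction separate from the network call makes the query easy to inspect or save before it is sent, which loosely mirrors the idea of storing frequently issued queries.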

  17. Searching NCBI databases using Entrez.

    PubMed

    Baxevanis, Andreas D

    2008-12-01

    One of the most widely used interfaces for the retrieval of information from biological databases is the NCBI Entrez system. Entrez capitalizes on the fact that there are pre-existing, logical relationships between the individual entries found in numerous public databases. The existence of such natural connections, mostly biological in nature, argued for the development of a method through which all the information about a particular biological entity could be found without having to sequentially visit and query disparate databases. Two Basic Protocols describe simple, text-based searches, illustrating the types of information that can be retrieved through the Entrez system. An Alternate Protocol builds upon the first Basic Protocol, using additional, built-in features of the Entrez system, and providing alternative ways to issue the initial query. The Support Protocol reviews how to save frequently issued queries. Finally, Cn3D, a structure visualization tool, is also discussed.

  18. Searching NCBI databases using Entrez.

    PubMed

    Gibney, Gretchen; Baxevanis, Andreas D

    2011-06-01

    One of the most widely used interfaces for the retrieval of information from biological databases is the NCBI Entrez system. Entrez capitalizes on the fact that there are pre-existing, logical relationships between the individual entries found in numerous public databases. The existence of such natural connections, mostly biological in nature, argued for the development of a method through which all the information about a particular biological entity could be found without having to sequentially visit and query disparate databases. Two basic protocols describe simple, text-based searches, illustrating the types of information that can be retrieved through the Entrez system. An alternate protocol builds upon the first basic protocol, using additional, built-in features of the Entrez system, and providing alternative ways to issue the initial query. The support protocol reviews how to save frequently issued queries. Finally, Cn3D, a structure visualization tool, is also discussed.

  19. Stratospheric emissions effects database development

    NASA Technical Reports Server (NTRS)

    Baughcum, Steven L.; Henderson, Stephen C.; Hertel, Peter S.; Maggiora, Debra R.; Oncina, Carlos A.

    1994-01-01

    This report describes the development of a stratospheric emissions effects database (SEED) of aircraft fuel burn and emissions from projected Year 2015 subsonic aircraft fleets and from projected fleets of high-speed civil transports (HSCT's). This report also describes the development of a similar database of emissions from Year 1990 scheduled commercial passenger airline and air cargo traffic. The objective of this work was to initiate, develop, and maintain an engineering database for use by atmospheric scientists conducting the Atmospheric Effects of Stratospheric Aircraft (AESA) modeling studies. Fuel burn and emissions of nitrogen oxides (NO(x) as NO2), carbon monoxide, and hydrocarbons (as CH4) have been calculated on a 1-degree latitude x 1-degree longitude x 1-kilometer altitude grid and delivered to NASA as electronic files. This report describes the assumptions and methodology for the calculations and summarizes the results of these calculations.
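
    As an illustration of the 1-degree latitude x 1-degree longitude x 1-kilometer altitude grid mentioned above, the following minimal sketch bins point samples of fuel burn into grid cells. The index origin (90 S, 180 W, sea level), the function names and the sample values are assumptions made for this example; the report's actual file layout is not given in the abstract.

```python
import math
from collections import defaultdict


def grid_cell(lat_deg: float, lon_deg: float, alt_km: float) -> tuple:
    """Map a position to (lat, lon, alt) indices on a 1 deg x 1 deg x 1 km grid."""
    if not -90.0 <= lat_deg <= 90.0:
        raise ValueError("latitude must lie in [-90, 90]")
    if alt_km < 0.0:
        raise ValueError("altitude must be non-negative")
    # Latitude index 0..179; the north pole is folded into the last cell.
    lat_idx = min(int(math.floor(lat_deg + 90.0)), 179)
    # Longitude index 0..359; longitudes wrap around at +/-180 degrees.
    lon_idx = min(int(math.floor((lon_deg + 180.0) % 360.0)), 359)
    alt_idx = int(math.floor(alt_km))
    return lat_idx, lon_idx, alt_idx


def accumulate(samples) -> dict:
    """Sum a quantity (e.g. fuel burn in kg) per grid cell.

    `samples` is an iterable of (lat_deg, lon_deg, alt_km, amount) tuples.
    """
    totals = defaultdict(float)
    for lat, lon, alt, amount in samples:
        totals[grid_cell(lat, lon, alt)] += amount
    return dict(totals)
```

    A sparse dictionary keyed by cell index is used here because most cells of a global grid contain no flight segments; a dense array would be an equally valid choice.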

  20. ADASS Web Database XML Project

    NASA Astrophysics Data System (ADS)

    Barg, M. I.; Stobie, E. B.; Ferro, A. J.; O'Neil, E. J.

    In the spring of 2000, at the request of the ADASS Program Organizing Committee (POC), we began organizing information from previous ADASS conferences in an effort to create a centralized database. The beginnings of this database originated from data (invited speakers, participants, papers, etc.) extracted from HyperText Markup Language (HTML) documents from past ADASS host sites. Unfortunately, not all HTML documents are well formed and parsing them proved to be an iterative process. It was evident at the beginning that if these Web documents were organized in a standardized way, such as XML (Extensible Markup Language), the processing of this information across the Web could be automated, more efficient, and less error prone. This paper will briefly review the many programming tools available for processing XML, including Java, Perl and Python, and will explore the mapping of relational data from our MySQL database to XML.
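
    The mapping of relational data to XML mentioned above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the authors' implementation: the rows are supplied as in-memory tuples rather than read from MySQL, and the table name, column names and the `row` element name are hypothetical.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(table: str, columns: list, rows: list) -> str:
    """Serialize relational rows as XML: one <row> per record, one child per column."""
    root = ET.Element(table)
    for row in rows:
        record = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            field = ET.SubElement(record, column)
            # A SQL NULL (None) is left as an empty element.
            if value is not None:
                field.text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical table of conference papers.
    print(rows_to_xml("papers", ["id", "title"], [(1, "XML & ADASS")]))
```

    Because ElementTree escapes reserved characters such as `&` on output, the result is well formed by construction, which addresses the parsing difficulty the abstract reports for hand-written HTML.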

  1. The Life Support Database system

    NASA Technical Reports Server (NTRS)

    Likens, William C.

    1991-01-01

    The design and implementation of the database system are described with specific reference to data available from the Build-1 version and techniques for its utilization. The review of the initial documents for the Life Support Database is described in terms of title format and sequencing, and the users are defined as participants in NASA-sponsored life-support research. The software and hardware selections are based respectively on referential integrity and compatibility, and the implementation of the user interface is achieved by means of an applications-programming tool. The current Beta-Test implementation of the system includes several thousand acronyms and bibliographic references as well as chemical properties and exposure limits, equipment, construction materials, and mission data. In spite of modifications in the database the system is found to be effective and a potentially significant resource for the aerospace community.

  2. DOE Global Energy Storage Database

    DOE Data Explorer

    The DOE International Energy Storage Database has more than 400 documented energy storage projects from 34 countries around the world. The database provides free, up-to-date information on grid-connected energy storage projects and relevant state and federal policies. More than 50 energy storage technologies are represented worldwide, including multiple battery technologies, compressed air energy storage, flywheels, gravel energy storage, hydrogen energy storage, pumped hydroelectric, superconducting magnetic energy storage, and thermal energy storage. The policy section of the database shows 18 federal and state policies addressing grid-connected energy storage, from rules and regulations to tariffs and other financial incentives. It is funded through DOE’s Sandia National Laboratories, and has been operating since January 2012.

  3. A veterinary digital anatomical database.

    PubMed

    Snell, J R; Green, R; Stott, G; Van Baerle, S

    1991-01-01

    This paper describes the Veterinary Digital Anatomical Database Project. The purpose of the project is to investigate the construction and use of digitally stored anatomical models. We will be discussing the overall project goals and the results to date. Digital anatomical models are 3 dimensional, solid model representations of normal anatomy. The digital representations are electronically stored and can be manipulated and displayed on a computer graphics workstation. A digital database of anatomical structures can be used in conjunction with gross dissection in teaching normal anatomy to first year students in the professional curriculum. The computer model gives students the opportunity to "discover" relationships between anatomical structures that may have been destroyed or may not be obvious in the gross dissection. By using a digital database, the student will have the ability to view and manipulate anatomical structures in ways that are not available through interactive video disk (IVD). IVD constrains the student to preselected views and sections stored on the disk.

  4. National Residential Efficiency Measures Database

    DOE Data Explorer

    The National Residential Efficiency Measures Database is a publicly available, centralized resource of residential building retrofit measures and costs for the U.S. building industry. With support from the U.S. Department of Energy, NREL developed this tool to help users determine the most cost-effective retrofit measures for improving energy efficiency of existing homes. Software developers who require residential retrofit performance and cost data for applications that evaluate residential efficiency measures are the primary audience for this database. In addition, home performance contractors and manufacturers of residential materials and equipment may find this information useful. The database offers the following types of retrofit measures: 1) Appliances, 2) Domestic Hot Water, 3) Enclosure, 4) Heating, Ventilating, and Air Conditioning (HVAC), 5) Lighting, 6) Miscellaneous.

  5. The new IAGOS Database Portal

    NASA Astrophysics Data System (ADS)

    Boulanger, Damien; Gautron, Benoit; Thouret, Valérie; Fontaine, Alain

    2016-04-01

    IAGOS (In-service Aircraft for a Global Observing System) is a European Research Infrastructure which aims at the provision of long-term, regular and spatially resolved in situ observations of the atmospheric composition. IAGOS observation systems are deployed on a fleet of commercial aircraft. The IAGOS database is an essential part of the global atmospheric monitoring network. It contains IAGOS-core data and IAGOS-CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container) data. The IAGOS Database Portal (http://www.iagos.fr, damien.boulanger@obs-mip.fr) is part of the French atmospheric chemistry data center AERIS (http://www.aeris-data.fr). The new IAGOS Database Portal was released in December 2015. The main improvement is the implementation of interoperability with international portals and other databases in order to improve IAGOS data discovery. In the frame of the IGAS project (IAGOS for the Copernicus Atmospheric Service), a data network has been set up. It is composed of three data centers: the IAGOS database in Toulouse; the HALO research aircraft database at DLR (https://halo-db.pa.op.dlr.de); and the CAMS data center in Jülich (http://join.iek.fz-juelich.de). The CAMS (Copernicus Atmospheric Monitoring Service) project is a prominent user of the IGAS data network. The new portal provides improved and new services such as download in NetCDF or NASA Ames formats, plotting tools (maps, time series, vertical profiles, etc.) and user management. Value-added products are available on the portal: back trajectories, origin of air masses, co-location with satellite data, etc. The link with the CAMS data center, through JOIN (Jülich OWS Interface), allows model outputs to be combined with IAGOS data for inter-comparison. Finally, IAGOS metadata has been standardized (ISO 19115) and now provides complete information about data traceability and quality.

  6. The Danish Cardiac Rehabilitation Database

    PubMed Central

    Zwisler, Ann-Dorthe; Rossau, Henriette Knold; Nakano, Anne; Foghmar, Sussie; Eichhorst, Regina; Prescott, Eva; Cerqueira, Charlotte; Soja, Anne Merete Boas; Gislason, Gunnar H; Larsen, Mogens Lytken; Andersen, Ulla Overgaard; Gustafsson, Ida; Thomsen, Kristian K; Boye Hansen, Lene; Hammer, Signe; Viggers, Lone; Christensen, Bo; Kvist, Birgitte; Lindström Egholm, Cecilie; May, Ole

    2016-01-01

    Aim of database: The Danish Cardiac Rehabilitation Database (DHRD) aims to improve the quality of cardiac rehabilitation (CR) to the benefit of patients with coronary heart disease (CHD). Study population: Hospitalized patients with CHD with stenosis on coronary angiography treated with percutaneous coronary intervention, coronary artery bypass grafting, or medication alone. Reporting is mandatory for all hospitals in Denmark delivering CR. The database was initially implemented in 2013 and was fully running from August 14, 2015, thus comprising data at a patient level from the latter date onward. Main variables: Patient-level data are registered by clinicians at the time of entry to CR directly into an online system with simultaneous linkage to other central patient registers. Follow-up data are entered after 6 months. The main variables collected are related to key outcome and performance indicators of CR: referral and adherence, lifestyle, patient-related outcome measures, risk factor control, and medication. Program-level online data are collected every third year. Descriptive data: Based on administrative data, approximately 14,000 patients with CHD are hospitalized at 35 hospitals annually, with 75% receiving one or more outpatient rehabilitation services by 2015. The database has not yet been running for a full year, which explains the use of approximations. Conclusion: The DHRD is an online, national quality improvement database on CR, aimed at patients with CHD. Mandatory registration of data at both patient level as well as program level is done on the database. DHRD aims to systematically monitor the quality of CR over time, in order to improve the quality of CR throughout Denmark to benefit patients. PMID:27822083

  7. The CATDAT damaging earthquakes database

    NASA Astrophysics Data System (ADS)

    Daniell, J. E.; Khazai, B.; Wenzel, F.; Vervaeck, A.

    2011-08-01

    The global CATDAT damaging earthquakes and secondary effects (tsunami, fire, landslides, liquefaction and fault rupture) database was developed to validate, remove discrepancies, and expand greatly upon existing global databases; and to better understand the trends in vulnerability, exposure, and possible future impacts of such historic earthquakes. Lack of consistency and errors in other earthquake loss databases frequently cited and used in analyses was a major shortcoming in the view of the authors which needed to be improved upon. Over 17 000 sources of information have been utilised, primarily in the last few years, to present data from over 12 200 damaging earthquakes historically, with over 7000 earthquakes since 1900 examined and validated before insertion into the database. Each validated earthquake includes seismological information, building damage, ranges of social losses to account for varying sources (deaths, injuries, homeless, and affected), and economic losses (direct, indirect, aid, and insured). Globally, a slightly increasing trend in economic damage due to earthquakes is not consistent with the greatly increasing exposure. The 1923 Great Kanto (214 billion USD damage; 2011 HNDECI-adjusted dollars) compared to the 2011 Tohoku (>300 billion USD at time of writing), 2008 Sichuan and 1995 Kobe earthquakes show the increasing concern for economic loss in urban areas as the trend should be expected to increase. Many economic and social loss values not reported in existing databases have been collected. Historical GDP (Gross Domestic Product), exchange rate, wage information, population, HDI (Human Development Index), and insurance information have been collected globally to form comparisons. This catalogue is the largest known cross-checked global historic damaging earthquake database and should have far-reaching consequences for earthquake loss estimation, socio-economic analysis, and the global reinsurance field.

  8. Integrating pathway analysis and genetics of gene expression for genome-wide association study of basal cell carcinoma.

    PubMed

    Zhang, Mingfeng; Liang, Liming; Morar, Nilesh; Dixon, Anna L; Lathrop, G Mark; Ding, Jun; Moffatt, Miriam F; Cookson, William O C; Kraft, Peter; Qureshi, Abrar A; Han, Jiali

    2012-04-01

    Genome-wide association studies (GWASs) have primarily focused on marginal effects for individual markers and have incorporated external functional information only after identifying robust statistical associations. We applied a new approach combining the genetics of gene expression and functional classification of genes to the GWAS of basal cell carcinoma (BCC) to identify potential biological pathways associated with BCC. We first identified 322,324 expression-associated single-nucleotide polymorphisms (eSNPs) from two existing GWASs of global gene expression in lymphoblastoid cell lines (n = 955), and evaluated the association of these functionally annotated SNPs with BCC among 2,045 BCC cases and 6,013 controls in Caucasians. We then grouped them into 99 KEGG pathways for pathway analysis and identified two pathways associated with BCC with p value <0.05 and false discovery rate (FDR) <0.5: the autoimmune thyroid disease pathway (mainly HLA class I and II antigens, p < 0.001, FDR = 0.24) and Janus kinase-signal transducer and activator of transcription (JAK-STAT) signaling pathway (p = 0.02, FDR = 0.49). Seventy-nine (25.7%) out of 307 significant eSNPs in the JAK-STAT pathway were associated with BCC risk (p < 0.05) in an independent replication set of 278 BCC cases and 1,262 controls. In addition, the association of JAK-STAT signaling pathway was marginally validated using 16,691 eSNPs identified from 110 normal skin samples (p = 0.08). Based on the evidence of biological functions of the JAK-STAT pathway on oncogenesis, it is plausible that this pathway is involved in BCC pathogenesis.

  9. Diaretinopathy database – A Gene database for diabetic retinopathy

    PubMed Central

    Vidhya, Gopalakrishnan; Anusha, Bhaskar

    2014-01-01

    Diabetic retinopathy is a microvascular complication of diabetes mellitus and is a major cause of adult blindness. Despite advances in diagnosis and treatment, the pathogenesis of diabetic retinopathy is not well understood. Results from epidemiological studies of diabetic patients suggest that there are familial predispositions to diabetes and to diabetic retinopathy. Therefore the main purpose of this database is to help both scientists and doctors in studying the candidate genes responsible for causing diabetic retinopathy. For each candidate gene, the official symbol, chromosome map, number of exons, GT-AG introns, motif, polymorphic variation and 3D structure are given. In addition to the molecular class and function of these genes, this database also provides links to download the corresponding nucleotide and amino acid sequences in FASTA format, which may be further used for computational approaches. Therefore this database will increase the understanding of the genetics underlying the development or progression of diabetic retinopathy and will have an impact on future diagnostic, prevention and intervention strategies. Availability: The database is freely available at http://diaretinopathydatabase.com PMID:24966527

  10. SEISMIC-REFLECTOR DATABASE SOFTWARE.

    USGS Publications Warehouse

    Wright, Evelyn L.; Hosom, John-Paul

    1986-01-01

    The seismic data analysis (SDA) software system facilitates generation of marine seismic reflector databases composed of reflector depths, travel times, root-mean-square and interval velocities, geographic coordinates, and identifying information. System processes include digitizing of seismic profiles and velocity semblance curves, merging of velocity and navigation data with profile travel-time data, calculation of reflector depths in meters, profile and map graphic displays, data editing and smoothing, and entry of finalized data into a comprehensive database. An overview of concepts, file structures, and programs is presented.
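
    To make the relationship between travel times, root-mean-square velocities, interval velocities and reflector depths concrete, the following minimal sketch uses the standard Dix conversion. The abstract does not state which method the SDA software applies, so the Dix equation, the function names and the units (two-way travel time in seconds, velocities in meters per second) are assumptions of this example.

```python
import math


def dix_interval_velocities(twt_s: list, v_rms: list) -> list:
    """Interval velocity (m/s) of each layer from RMS velocities (Dix equation).

    twt_s: increasing two-way travel times (s) to each reflector.
    v_rms: RMS velocity (m/s) measured down to each reflector.
    """
    v_int = []
    prev_t, prev_v = 0.0, 0.0
    for t, v in zip(twt_s, v_rms):
        dt = t - prev_t
        if dt <= 0.0:
            raise ValueError("travel times must be strictly increasing")
        numerator = v * v * t - prev_v * prev_v * prev_t
        if numerator < 0.0:
            raise ValueError("RMS velocities are not physically consistent")
        v_int.append(math.sqrt(numerator / dt))
        prev_t, prev_v = t, v
    return v_int


def reflector_depths(twt_s: list, v_int: list) -> list:
    """Depth (m) of each reflector: sum of layer thicknesses v_int * dt / 2."""
    depths = []
    depth, prev_t = 0.0, 0.0
    for t, v in zip(twt_s, v_int):
        depth += v * (t - prev_t) / 2.0  # halve because the time is two-way
        depths.append(depth)
        prev_t = t
    return depths
```

    For example, two layers with interval velocities of 1500 m/s and 2500 m/s, each spanning 1 s of two-way time, place the reflectors at 750 m and 2000 m.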

  11. Data exploration systems for databases

    NASA Technical Reports Server (NTRS)

    Greene, Richard J.; Hield, Christopher

    1992-01-01

    Data exploration systems apply machine learning techniques, multivariate statistical methods, information theory, and database theory to databases to identify significant relationships among the data and summarize information. The result of applying data exploration systems should be a better understanding of the structure of the data and a perspective of the data enabling an analyst to form hypotheses for interpreting the data. This paper argues that data exploration systems need a minimum amount of domain knowledge to guide both the statistical strategy and the interpretation of the resulting patterns discovered by these systems.

  12. CD-ROM-aided Databases

    NASA Astrophysics Data System (ADS)

    Masuyama, Keiichi

    CD-ROM has rapidly evolved as a new information medium with large capacity. In the U.S. it is predicted that it will become a two hundred billion yen market in three years, and thus CD-ROM is a strategic target of the database industry. Here in Japan, the movement toward its commercialization has been active since this year. Will the CD-ROM business ever conquer the information market as an on-disk database or electronic publication? Referring to some cases of applications in the U.S., the author views the marketability and future trend of this new optical disk medium.

  13. DDD: Dynamic Database for Diatomics

    NASA Technical Reports Server (NTRS)

    Schwenke, David

    2004-01-01

    We have developed a web-based database containing spectra of diatomic molecules. All data is computed from first principles, and if a user requests data for a molecule/ion that is not in the database, new calculations are automatically carried out on that species. Rotational, vibrational, and electronic transitions are included. Different levels of accuracy can be selected, from qualitatively correct to the best calculations that can be carried out. The user can view and modify spectroscopic constants, view potential energy curves, download detailed high temperature linelists, or view synthetic spectra.

  14. Coal quality databases: Practical applications

    SciTech Connect

    Finkelman, R.B.; Gross, P.M.K.

    1999-07-01

    Domestic and worldwide coal use will be influenced by concerns about the effects of coal combustion on the local, regional and global environment. Reliable coal quality data can help decision-makers to better assess risks and determine impacts of coal constituents on technological behavior, economic byproduct recovery, and environmental and human health issues. The US Geological Survey (USGS) maintains an existing coal quality database (COALQUAL) that contains analyses of approximately 14,000 coal samples from every major coal-producing basin in the US. For each sample, the database contains results of proximate and ultimate analyses; sulfur form data; and major, minor, and trace element concentrations for approximately 70 elements.

  15. Quality control of EUVE databases

    NASA Technical Reports Server (NTRS)

    John, L. M.; Drake, J.

    1992-01-01

    The publicly accessible databases for the Extreme Ultraviolet Explorer include: the EUVE Archive mailserver; the CEA ftp site; the EUVE Guest Observer Mailserver; and the Astronomical Data System node. The EUVE Performance Assurance team is responsible for verifying that these public EUVE databases are working properly, and that the public availability of EUVE data contained therein does not infringe any data rights which may have been assigned. In this poster, we describe the Quality Assurance (QA) procedures we have developed from the approach of QA as a service organization, thus reflecting the overall EUVE philosophy of Quality Assurance integrated into normal operating procedures, rather than imposed as an external, post facto, control mechanism.

  16. Prokaryotic Genomes from Microbes Online Database

    DOE Data Explorer

    Alm, Eric J.; Huang, Katherine H.; Price, Morgan N.; Koche, Richard P.; Keller, Keith; Dubchak, Inna L.; Arkin, Adam P.

    To describe the potential functions of genes, MicrobesOnline includes protein family analyses (from InterPro and COG), metabolic maps (from KEGG), links to research papers (from UniProt and PubMed), and operon predictions for every genome. To examine each gene's evolutionary history, MicrobesOnline includes precomputed phylogenetic trees for all the gene families. It displays gene trees with genomic context or it compares the gene tree to the species tree. The tools provided with MicrobesOnline allow users to: compute customized motifs, sequence alignments, and phylogenetic trees; change expression patterns in metabolic maps; annotate genes in various ways. A browse tree tool and a genome browser are available, along with specialized search capabilities. (Specialized Interface)

  17. Federal Register Document Image Database, Volume 1

    National Institute of Standards and Technology Data Gateway

    NIST Federal Register Document Image Database, Volume 1 (PC database for purchase)   NIST has produced a new document image database for evaluating document analysis and recognition technologies and information retrieval systems. NIST Special Database 25 contains page images from the 1994 Federal Register and much more.

  18. Building Databases for Education. ERIC Digest.

    ERIC Educational Resources Information Center

    Klausmeier, Jane A.

    This digest provides a brief explanation of what a database is; explains how a database can be used; identifies important factors that should be considered when choosing database management system software; and provides citations to sources for finding reviews and evaluations of database management software. The digest is concerned primarily with…

  19. WMC Database Evaluation. Case Study Report

    SciTech Connect

    Palounek, Andrea P. T

    2015-10-29

    The WMC Database is ultimately envisioned to hold a collection of experimental data, design information, and information from computational models. This project was a first attempt at using the Database to access experimental data and extract information from it. This evaluation shows that the Database concept is sound and robust, and that the Database, once fully populated, should remain eminently usable for future researchers.

  20. Online Petroleum Industry Bibliographic Databases: A Review.

    ERIC Educational Resources Information Center

    Anderson, Margaret B.

    This paper discusses the present status of the bibliographic database industry, reviews the development of online databases of interest to the petroleum industry, and considers future developments in online searching and their effect on libraries and information centers. Three groups of databases are described: (1) databases developed by the…

  1. The care pathway: concepts and theories: an introduction

    PubMed Central

    Schrijvers, Guus; van Hoorn, Arjan; Huiskes, Nicolette

    2012-01-01

    This article first addresses the definition of a (care) pathway and then describes theories since the 1950s. It ends with a discussion of the theoretical advantages and disadvantages of care pathways for patients and professionals. The objective of this paper is to provide a theoretical base for empirical studies on care pathways. The knowledge for this chapter is based on several books on pathways, which we found by searching in the digital encyclopedia Wikipedia. Although this is not usual in scientific publications, this method was used because books are not searchable through databases such as PubMed. From 2005, we performed a literature search on PubMed and other literature databases with the keywords integrated care pathway, clinical pathway, critical pathway, theory, research, and evaluation. One of the inspirational sources was the website of the European Pathway Association (EPA) and its journal, the International Journal of Care Pathways. The authors visited several sites for this paper. These are mentioned as illustrations of a concept or theory. Most of them have English websites with more information. The URLs of these websites are not cited as references in this paper because their content changes fast, sometimes every day. PMID:23593066

  2. Protein Analysis of Sapienic Acid-Treated Porphyromonas gingivalis Suggests Differential Regulation of Multiple Metabolic Pathways

    PubMed Central

    Dawson, Deborah V.; Blanchette, Derek R.; Drake, David R.; Wertz, Philip W.; Brogden, Kim A.

    2015-01-01

    ABSTRACT Lipids endogenous to skin and mucosal surfaces exhibit potent antimicrobial activity against Porphyromonas gingivalis, an important colonizer of the oral cavity implicated in periodontitis. Our previous work demonstrated the antimicrobial activity of the fatty acid sapienic acid (C16:1Δ6) against P. gingivalis and found that sapienic acid treatment alters both protein and lipid composition from those in controls. In this study, we further examined whole-cell protein differences between sapienic acid-treated bacteria and untreated controls, and we utilized open-source functional association and annotation programs to explore potential mechanisms for the antimicrobial activity of sapienic acid. Our analyses indicated that sapienic acid treatment induces a unique stress response in P. gingivalis resulting in differential expression of proteins involved in a variety of metabolic pathways. This network of differentially regulated proteins was enriched in protein-protein interactions (P = 2.98 × 10⁻⁸), including six KEGG pathways (P value ranges, 2.30 × 10⁻⁵ to 0.05) and four Gene Ontology (GO) molecular functions (P value ranges, 0.02 to 0.04), with multiple suggestive enriched relationships in KEGG pathways and GO molecular functions. Upregulated metabolic pathways suggest increases in energy production, lipid metabolism, iron acquisition and processing, and respiration. Combined with a suggested preferential metabolism of serine, which is necessary for fatty acid biosynthesis, these data support our previous findings that the site of sapienic acid antimicrobial activity is likely at the bacterial membrane. IMPORTANCE P. gingivalis is an important opportunistic pathogen implicated in periodontitis. Affecting nearly 50% of the population, periodontitis is treatable, but the resulting damage is irreversible and eventually progresses to tooth loss. There is a great need for natural products that can be used to treat and/or prevent the overgrowth of

  3. 3-MCPD 1-Palmitate Induced Tubular Cell Apoptosis In Vivo via JNK/p53 Pathways.

    PubMed

    Liu, Man; Huang, Guoren; Wang, Thomas T Y; Sun, Xiangjun; Yu, Liangli Lucy

    2016-05-01

    Fatty acid esters of 3-chloro-1,2-propanediol (3-MCPD esters) are a group of processing-induced food contaminants with nephrotoxicity, but the molecular mechanism(s) remains unclear. This study investigated whether and how the JNK/p53 pathway may play a role in the nephrotoxic effect of 3-MCPD esters using 3-MCPD 1-palmitate (MPE) as a probe compound in Sprague Dawley rats. Microarray analysis of the kidney from the Sprague Dawley rats treated with MPE, using Gene Ontology categories and KEGG pathways, revealed that MPE altered mRNA expressions of the genes involved in the mitogen-activated protein kinase (JNK and ERK), p53, and apoptotic signal transduction pathways. The changes in the mRNA expressions were confirmed by qRT-PCR and Western blot analyses and were consistent with the induction of tubular cell apoptosis as determined by histopathological, TUNEL, and immunohistochemistry analyses in the kidneys of the Sprague Dawley rats. Additionally, p53 knockout attenuated the apoptosis, and the apoptosis-related protein bax expression and cleaved caspase-3 activation induced by MPE in the p53 knockout C57BL/6 mice, whereas JNK inhibitor SP600125 but not ERK inhibitor U0126 inhibited MPE-induced apoptosis, supporting the conclusion that JNK/p53 might play a critical role in the tubular cell apoptosis induced by MPE and other 3-MCPD fatty acid esters.

  4. NLTE4 Plasma Population Kinetics Database

    National Institute of Standards and Technology Data Gateway

    SRD 159 NLTE4 Plasma Population Kinetics Database (Web database for purchase)   This database contains benchmark results for simulation of plasma population kinetics and emission spectra. The data were contributed by the participants of the 4th Non-LTE Code Comparison Workshop who have unrestricted access to the database. The only limitation for other users is in hidden labeling of the output results. Guest users can proceed to the database entry page without entering userid and password.

  5. Guide on Logical Database Design.

    ERIC Educational Resources Information Center

    Fong, Elizabeth N.; And Others

    This report discusses an iterative methodology for logical database design (LDD). The methodology includes four phases: local information-flow modeling, global information-flow modeling, conceptual schema design, and external schema modeling. These phases are intended to make maximum use of available information and user expertise, including the…

  6. Data-Based Teacher Development.

    ERIC Educational Resources Information Center

    Borg, Simon

    1998-01-01

    Describes how data from English language teaching (ELT) classroom research can be exploited in teacher development activities. The contribution data-based activities can make to teacher development is outlined, and examples that illustrate the principles underlying their design are presented. A case is made for using such activities to facilitate…

  7. Safeguarding Databases Basic Concepts Revisited.

    ERIC Educational Resources Information Center

    Cardinali, Richard

    1995-01-01

    Discusses issues of database security and integrity, including computer crime and vandalism, human error, computer viruses, employee and user access, and personnel policies. Suggests some precautions to minimize system vulnerability such as careful personnel screening, audit systems, passwords, and building and software security systems. (JKP)

  8. The New NRL Crystallographic Database

    NASA Astrophysics Data System (ADS)

    Mehl, Michael; Curtarolo, Stefano; Hicks, David; Toher, Cormac; Levy, Ohad; Hart, Gus

    For many years the Naval Research Laboratory maintained an online graphical database of crystal structures for a wide variety of materials. This database has now been redesigned, updated and integrated with the AFLOW framework for high throughput computational materials discovery (http://materials.duke.edu/aflow.html). For each structure we provide an image showing the atomic positions; the primitive vectors of the lattice and the basis vectors of every atom in the unit cell; the space group and Wyckoff positions; Pearson symbols; common names; and Strukturbericht designations, where available. References for each structure are provided, as well as a Crystallographic Information File (CIF). The database currently includes almost 300 entries and will be continuously updated and expanded. It enables easy search of the various structures based on their underlying symmetries, either by Bravais lattice, Pearson symbol, Strukturbericht designation or commonly used prototypes. The talk will describe the features of the database, and highlight its utility for high throughput computational materials design. Work at NRL is funded by a Contract with the Duke University Department of Mechanical Engineering.

  9. Databases and the Professional Evaluator.

    ERIC Educational Resources Information Center

    Schellenberg, Stephen J.

    The role of the professional evaluator within a school district is essentially to provide data for use in informed decision making. In School District 4J in Eugene, Oregon, this role involves performing tasks in three basic categories: (1) maintaining and interpreting ongoing databases, (2) finding and analyzing information to answer specific…

  10. Using Databases in History Teaching.

    ERIC Educational Resources Information Center

    Knight, P.; Timmins, G.

    1986-01-01

    Discusses advantages and limitations of database software in meeting the educational objectives of history instruction; reviews five currently available computer programs (FACTFILE, QUEST, QUARRY BANK 1851, Census Analysis, and Beta Base); highlights major considerations that arise in designing such programs; and describes their classroom use.…

  11. Technostress: Surviving a Database Crash.

    ERIC Educational Resources Information Center

    Dobb, Linda S.

    1990-01-01

    Discussion of technostress in libraries focuses on a database crash at California Polytechnic State University, San Luis Obispo. Steps taken to restore the data are explained, strategies for handling technological accidents are suggested, the impact on library staff is discussed, and a 10-item annotated bibliography on technostress is provided.…

  12. Database Transformations for Biological Applications

    SciTech Connect

    Overton, C.; Davidson, S. B.; Buneman, P.; Tannen, V.

    2001-04-11

    The goal of this project was to develop tools to facilitate data transformations between heterogeneous data sources found throughout biomedical applications. Such transformations are necessary when sharing data between different groups working on related problems as well as when querying data spread over different databases, files and software analysis packages.

  13. Online Databases. ASCII Full Texts.

    ERIC Educational Resources Information Center

    Tenopir, Carol

    1995-01-01

    Defines the American Standard Code for Information Interchange (ASCII) full text, and reviews its past, present, and future uses in libraries. Discusses advantages, disadvantages, and uses of searchable and nonsearchable full-text databases. Also comments on full-text CD-ROM products and on technological advancements made by library vendors. (JMV)

  14. Maize Genetics and Genomics Database

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The 2007 report for MaizeGDB lists the new hires who will focus on curation/outreach and the genome sequence, respectively. Currently all sequence in the database comes from a PlantGDB pipeline and is presented with deep links to external resources such as PlantGDB, Dana Farber, GenBank, the Arizona...

  15. Interactive bibliographical database on color

    NASA Astrophysics Data System (ADS)

    Caivano, Jose L.

    2002-06-01

    The paper describes the methodology and results of a project under development, aimed at building an interactive bibliographical database on color in all fields of application: philosophy, psychology, semiotics, education, anthropology, physical and natural sciences, biology, medicine, technology, industry, architecture and design, arts, linguistics, geography, history. The project is initially based upon an already developed bibliography, published in different journals, updated on various occasions, and now available on the Internet, with more than 2,000 entries. The interactive database will expand that bibliography, incorporating hyperlinks and contents (indexes, abstracts, keywords, introductions, or eventually the complete document), and devising mechanisms for information retrieval. The sources to be included are: books, doctoral dissertations, multimedia publications, reference works. The main arrangement will be chronological, but the design of the database will allow rearrangements or selections by different fields: subject, Decimal Classification System, author, language, country, publisher, etc. A further project is to develop another database, including color-specialized journals or newsletters, and articles on color published in international journals, arranged in this case by journal name and date of publication, but also allowing rearrangements or selections by author, subject and keywords.

  16. Worldwide Ocean Optics Database (WOOD)

    DTIC Science & Technology

    2001-09-30

    user can obtain values computed from empirical algorithms (e.g., beam attenuation estimated from diffuse attenuation and backscatter data). Error ...from empirical algorithms (e.g., beam attenuation estimated from diffuse attenuation and backscatter data). Error estimates will also be provided for...properties, including diffuse attenuation, beam attenuation, and scattering. The database shall be easy to use, Internet accessible, and frequently updated

  17. The NASA Fireball Network Database

    NASA Technical Reports Server (NTRS)

    Moser, Danielle E.

    2011-01-01

    The NASA Meteoroid Environment Office (MEO) has been operating an automated video fireball network since late-2008. Since that time, over 1,700 multi-station fireballs have been observed. A database containing orbital data and trajectory information on all these events has recently been compiled and is currently being mined for information. Preliminary results are presented here.

  18. Begin: Online Database Searching Now!

    ERIC Educational Resources Information Center

    Lodish, Erica K.

    1986-01-01

    Because of the increasing importance of online databases, school library media specialists are encouraged to introduce students to online searching. Four books that would help media specialists gain a basic background are reviewed and it is noted that although they are very technical, they can be adapted to individual needs. (EM)

  19. Danish Colorectal Cancer Group Database

    PubMed Central

    Ingeholm, Peter; Gögenur, Ismail; Iversen, Lene H

    2016-01-01

    Aim of database: The aim of the database, which has existed for registration of all patients with colorectal cancer in Denmark since 2001, is to improve the prognosis for this patient group. Study population: All Danish patients with newly diagnosed colorectal cancer who are either diagnosed or treated in a surgical department of a public Danish hospital. Main variables: The database comprises an array of surgical, radiological, oncological, and pathological variables. The surgeons record data such as diagnostics performed, including type and results of radiological examinations, lifestyle factors, comorbidity and performance, treatment including the surgical procedure, urgency of surgery, and intra- and postoperative complications within 30 days after surgery. The pathologists record data such as tumor type, number of lymph nodes and metastatic lymph nodes, surgical margin status, and other pathological risk factors. Descriptive data: The database has had >95% completeness in including patients with colorectal adenocarcinoma, with >54,000 patients registered so far, approximately one-third rectal cancers and two-thirds colon cancers, and an overrepresentation of men among rectal cancer patients. The stage distribution was more or less constant until 2014, with a tendency toward a lower rate of stage IV and a higher rate of stage I after introduction of the national screening program in 2014. The 30-day mortality rate after elective surgery has been reduced from >7% in 2001–2003 to <2% since 2013. Conclusion: The database is a national population-based clinical database with high patient and data completeness for the perioperative period. The resolution of data is high for description of the patient at the time of diagnosis, including comorbidities, and for characterizing diagnosis, surgical interventions, and short-term outcomes. The database does not have high-resolution oncological data and does not register recurrences after primary surgery. The Danish

  20. Toward An Unstructured Mesh Database

    NASA Astrophysics Data System (ADS)

    Rezaei Mahdiraji, Alireza; Baumann, Peter

    2014-05-01

    Unstructured meshes are used in several application domains such as earth sciences (e.g., seismology), medicine, oceanography, climate modeling, and GIS as approximate representations of physical objects. Meshes subdivide a domain into smaller geometric elements (called cells) which are glued together by incidence relationships. The subdivision of a domain allows computational manipulation of complicated physical structures. For instance, seismologists model earthquakes using elastic wave propagation solvers on hexahedral meshes. The hexahedral mesh contains several hundred million grid points and millions of hexahedral cells. Each vertex node in the hexahedral mesh stores a multitude of data fields. To run simulations on such meshes, one needs to iterate over all the cells, iterate over the cells incident to a given cell, retrieve coordinates of cells, assign data values to cells, etc. Although meshes are used in many application domains, to the best of our knowledge there is no database vendor that supports unstructured mesh features. Currently, the main tools for querying and manipulating unstructured meshes are mesh libraries, e.g., CGAL and GRAL. Mesh libraries are dedicated libraries which include mesh algorithms and can be run on mesh representations. The libraries do not scale with dataset size, do not have a declarative query language, and need deep C++ knowledge for query implementations. Furthermore, due to high coupling between the implementations and input file structure, the implementations are less reusable and costly to maintain. A dedicated mesh database offers the following advantages: 1) declarative querying, 2) ease of maintenance, 3) hiding mesh storage structure from applications, and 4) transparent query optimization. To design a mesh database, the first challenge is to define a suitable generic data model for unstructured meshes. We proposed the ImG-Complexes data model as a generic topological mesh data model which extends the incidence graph model to multi