pathway interaction database: Topics by Science.gov

Sample records for pathway interaction database

Kinase Pathway Database: An Integrated Protein-Kinase and NLP-Based Protein-Interaction Resource

PubMed Central

Koike, Asako; Kobayashi, Yoshiyuki; Takagi, Toshihisa

2003-01-01

Protein kinases play a crucial role in the regulation of cellular functions. Various kinds of information about these molecules are important for understanding signaling pathways and organism characteristics. We have developed the Kinase Pathway Database, an integrated database covering the major completely sequenced eukaryotes. It contains the classification of protein kinases and their functional conservation, ortholog tables among species, protein–protein, protein–gene, and protein–compound interaction data, domain information, and structural information. It also provides an automatic pathway graphic image interface. The protein, gene, and compound interactions are automatically extracted from abstracts for all genes and proteins by natural-language processing (NLP). The automatic extraction method uses phrase patterns and the GENA protein, gene, and compound name dictionary, which was developed by our group. With this database, pathways are easily compared among species using data on more than 47,000 protein interactions and protein kinase ortholog tables. The database is available for querying and browsing at http://kinasedb.ontology.ims.u-tokyo.ac.jp/. PMID:12799355
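The combination of a name dictionary and phrase patterns described in the abstract above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the mini-dictionary, the verb list, the regular expression, and the example sentence are hypothetical and are not taken from the Kinase Pathway Database or the GENA dictionary.

```python
import re

# Hypothetical mini-dictionary of names (a stand-in for a curated name dictionary).
NAME_DICTIONARY = {"MEK1", "ERK2", "RAF1", "AKT1"}

# Hypothetical interaction verbs used to build one simple phrase pattern.
INTERACTION_VERBS = ("phosphorylates", "activates", "inhibits", "binds")

# Pattern of the form "<name> <verb> <name>".
PATTERN = re.compile(
    r"\b(?P<a>[A-Za-z0-9-]+)\s+(?P<verb>"
    + "|".join(INTERACTION_VERBS)
    + r")\s+(?P<b>[A-Za-z0-9-]+)\b"
)


def extract_interactions(text):
    """Return (agent, verb, target) triples whose two names are both in the dictionary."""
    triples = []
    for match in PATTERN.finditer(text):
        a, b = match.group("a"), match.group("b")
        if a in NAME_DICTIONARY and b in NAME_DICTIONARY:
            triples.append((a, match.group("verb"), b))
    return triples


# Made-up example sentence, used only to exercise the pattern.
sentence = "RAF1 phosphorylates MEK1, and MEK1 activates ERK2 in stimulated cells."
print(extract_interactions(sentence))
```

The dictionary check is what keeps ordinary words that happen to precede a verb from being reported as interaction partners; a real system would need far richer patterns and name normalization.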
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks.

PubMed

Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui

2017-01-01

The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but also has the potential to hypothesize new ones. We show that our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS-Bcl-2 and RAS-AKT, and finds significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways. PMID:29049295
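One way to picture a consensus network is as the set of edges that recur across many individually learned networks. The sketch below is a minimal illustration under a stated assumption: the abstract does not define "network resolution", so here it is treated as the minimum fraction of networks in which an edge must appear. The function name `consensus_network` and the toy edge sets are hypothetical and are not the authors' implementation.

```python
from collections import Counter


def consensus_network(networks, resolution):
    """Keep directed edges that appear in at least `resolution` (0..1) of the networks.

    Assumption: "resolution" is modelled as an edge-frequency threshold.
    """
    if not networks:
        return set()
    counts = Counter(edge for net in networks for edge in set(net))
    cutoff = resolution * len(networks)
    return {edge for edge, n in counts.items() if n >= cutoff}


# Three toy networks, standing in for networks learned under different
# randomized topological orderings (the edges are illustrative only).
networks = [
    {("JAK", "STAT"), ("PI3K", "AKT"), ("RAS", "AKT")},
    {("JAK", "STAT"), ("PI3K", "AKT")},
    {("JAK", "STAT"), ("AKT", "mTOR"), ("RAS", "AKT")},
]

print(sorted(consensus_network(networks, resolution=0.6)))
print(sorted(consensus_network(networks, resolution=1.0)))
```

Raising the threshold yields a sparser, more conservative network; lowering it admits edges that only some of the learned networks support.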
HPIminer: A text mining system for building and visualizing human protein interaction networks and pathways.

PubMed

Subramani, Suresh; Kalpana, Raja; Monickaraj, Pankaj Moses; Natarajan, Jeyakumar

2015-04-01

Knowledge of protein-protein interactions (PPI) and their related pathways is important for understanding the biological functions of the living cell. Such information on human proteins is highly desirable for understanding the mechanisms of several diseases such as cancer, diabetes, and Alzheimer's disease. Because much of that information is buried in biomedical literature, an automated text mining system for visualizing human PPI and pathways is highly desirable. In this paper, we present HPIminer, a text mining system for visualizing human protein interactions and pathways from biomedical literature. HPIminer extracts human PPI information and PPI pairs from biomedical literature, and visualizes their associated interactions, networks and pathways using two curated databases, HPRD and KEGG. To our knowledge, HPIminer is the first system to build interaction networks from literature as well as curated databases. Further, the new interactions mined only from literature and not reported earlier in databases are highlighted as new. A comparative study with other similar tools shows that the resultant network is more informative and provides additional information on interacting proteins and their associated networks. Copyright © 2015 Elsevier Inc. All rights reserved.
HBVPathDB: a database of HBV infection-related molecular interaction network.

PubMed

Zhang, Yi; Bo, Xiao-Chen; Yang, Jing; Wang, Sheng-Qi

2005-03-21

The aim was to describe the interactions between molecules and genes of hepatitis B virus (HBV) and its host, in order to understand how viral and host genes and molecules are networked to form a biological system and to elucidate the mechanism of HBV infection. The knowledge of HBV infection-related reactions was organized into various kinds of pathways with carefully drawn graphs in HBVPathDB. Pathway information is stored in a relational database management system (DBMS), which is currently the most efficient way to manage large amounts of data, and queries are implemented with Structured Query Language (SQL). The search engine is written in Personal Home Page (PHP) with embedded SQL, and the web retrieval interface for searching is developed in Hypertext Markup Language (HTML). We present the first version of HBVPathDB, an HBV infection-related molecular interaction network database composed of 306 pathways involving 1,050 molecules. With carefully drawn graphs, pathway information stored in HBVPathDB can be browsed in an intuitive way. We developed an easy-to-use interface for flexible access to the details of the database. Convenient software is implemented to query and browse the pathway information of HBVPathDB. The search engine of the database supports four search page layout options: category search, gene search, description search, and unitized search. The database is freely available at http://www.bio-inf.net/HBVPathDB/HBV/. HBVPathDB already contains a considerable amount of pathway information related to HBV infection, which is suitable for in-depth analysis of the molecular interaction network of virus and host. HBVPathDB integrates pathway datasets with convenient software for query, browsing, and visualization, which gives users more opportunity to identify key regulatory molecules as potential drug targets and to explore the possible mechanism of HBV infection based on gene expression datasets.
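The relational storage and SQL-based search described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the HBVPathDB schema, so the table names, column names, and placeholder rows are hypothetical, and Python's built-in `sqlite3` module stands in for the PHP and DBMS stack used by the authors.

```python
import sqlite3

# In-memory database with a hypothetical three-table schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pathway (
        id INTEGER PRIMARY KEY, name TEXT, category TEXT, description TEXT
    );
    CREATE TABLE molecule (id INTEGER PRIMARY KEY, symbol TEXT);
    CREATE TABLE pathway_molecule (
        pathway_id INTEGER REFERENCES pathway(id),
        molecule_id INTEGER REFERENCES molecule(id)
    );
    """
)

# Placeholder rows; these are not real HBVPathDB records.
conn.executemany(
    "INSERT INTO pathway VALUES (?, ?, ?, ?)",
    [
        (1, "Toy pathway A", "signaling", "placeholder description A"),
        (2, "Toy pathway B", "replication", "placeholder description B"),
    ],
)
conn.executemany("INSERT INTO molecule VALUES (?, ?)", [(1, "GENE1"), (2, "GENE2")])
conn.executemany("INSERT INTO pathway_molecule VALUES (?, ?)", [(1, 1), (1, 2), (2, 2)])


def gene_search(conn, symbol):
    """Return the names of pathways that involve the given gene symbol."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM pathway AS p
        JOIN pathway_molecule AS pm ON pm.pathway_id = p.id
        JOIN molecule AS m ON m.id = pm.molecule_id
        WHERE m.symbol = ?
        ORDER BY p.name
        """,
        (symbol,),
    ).fetchall()
    return [row[0] for row in rows]


def category_search(conn, category):
    """Return the names of pathways filed under the given category."""
    rows = conn.execute(
        "SELECT name FROM pathway WHERE category = ? ORDER BY name", (category,)
    ).fetchall()
    return [row[0] for row in rows]


print(gene_search(conn, "GENE2"))
print(category_search(conn, "signaling"))
```

The parameterized `?` placeholders keep user-supplied search terms out of the SQL text, which matters for any search page exposed on the web.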
VisANT 3.0: new modules for pathway visualization, editing, prediction and construction.

PubMed

Hu, Zhenjun; Ng, David M; Yamada, Takuji; Chen, Chunnuan; Kawashima, Shuichi; Mellor, Joe; Linghu, Bolan; Kanehisa, Minoru; Stuart, Joshua M; DeLisi, Charles

2007-07-01

With the integration of the KEGG and Predictome databases as well as two search engines for coexpressed genes/proteins using data sets obtained from the Stanford Microarray Database (SMD) and Gene Expression Omnibus (GEO) database, VisANT 3.0 supports exploratory pathway analysis, which includes multi-scale visualization of multiple pathways, editing and annotating pathways using a KEGG-compatible visual notation, and visualization of expression data in the context of pathways. Expression levels are represented either by color intensity or by nodes with an embedded expression profile. Multiple experiments can be navigated or animated. Known KEGG pathways can be enriched by querying either coexpressed components of known pathway members or proteins with known physical interactions. Predicted pathways for genes/proteins with unknown functions can be inferred from coexpression or physical interaction data. Pathways produced in VisANT can be saved in a computer-readable XML format (VisML), as graphic images, or as high-resolution Scalable Vector Graphics (SVG). Pathways in VisML format can be securely shared within an interested group or published online using a simple Web link. VisANT is freely available at http://visant.bu.edu.
MIMO: an efficient tool for molecular interaction maps overlap

PubMed Central

2013-01-01

Background: Molecular pathways represent an ensemble of interactions occurring among molecules within the cell and between cells. The identification of similarities between molecular pathways across organisms and functions has a critical role in understanding complex biological processes. To infer such novel information, the comparison of molecular pathways must account for imperfect matches (flexibility) and efficiently handle complex network topologies. To date, these characteristics are only partially available in tools designed to compare molecular interaction maps. Results: Our approach, MIMO (Molecular Interaction Maps Overlap), addresses the first problem by allowing the introduction of gaps and mismatches between query and template pathways and, when necessary, permits supervised queries incorporating a priori biological information. It addresses the second issue by relying directly on the rich graph topology described in the Systems Biology Markup Language (SBML) standard, and uses multidigraphs to efficiently handle multiple queries on biological graph databases. The algorithm has been successfully used here to highlight the contact points between various human pathways in the Reactome database. Conclusions: MIMO offers a flexible and efficient graph-matching tool for comparing complex biological pathways. PMID:23672344
Text mining for metabolic pathways, signaling cascades, and protein networks.

PubMed

Hoffmann, Robert; Krallinger, Martin; Andres, Eduardo; Tamames, Javier; Blaschke, Christian; Valencia, Alfonso

2005-05-10

The complexity of the information stored in databases and publications on metabolic and signaling pathways, the high throughput of experimental data, and the growing number of publications make it imperative to provide systems to help the researcher navigate through these interrelated information resources. Text-mining methods have started to play a key role in the creation and maintenance of links between the information stored in biological databases and its original sources in the literature. These links will be extremely useful for database updating and curation, especially if a number of technical problems can be solved satisfactorily, including the identification of protein and gene names (entities in general) and the characterization of their types of interactions. The first generation of openly accessible text-mining systems, such as iHOP (Information Hyperlinked over Proteins), provides additional functions to facilitate the reconstruction of protein interaction networks, combine database and text information, and support the scientist in the formulation of novel hypotheses. The next challenge is the generation of comprehensive information regarding the general function of signaling pathways and protein interaction networks.
WikiPathways: a multifaceted pathway database bridging metabolomics to other omics research.

PubMed

Slenter, Denise N; Kutmon, Martina; Hanspers, Kristina; Riutta, Anders; Windsor, Jacob; Nunes, Nuno; Mélius, Jonathan; Cirillo, Elisa; Coort, Susan L; Digles, Daniela; Ehrhart, Friederike; Giesbertz, Pieter; Kalafati, Marianthi; Martens, Marvin; Miller, Ryan; Nishida, Kozo; Rieswijk, Linda; Waagmeester, Andra; Eijssen, Lars M T; Evelo, Chris T; Pico, Alexander R; Willighagen, Egon L

2018-01-04

WikiPathways (wikipathways.org) captures the collective knowledge represented in biological pathways. By providing a curated, machine-readable database, it enables omics data analysis and visualization. WikiPathways and other pathway databases are used to analyze experimental data by research groups in many fields. Due to the open and collaborative nature of the WikiPathways platform, our content keeps growing and is getting more accurate, making WikiPathways a reliable and rich pathway database. Previously, however, the focus was primarily on genes and proteins, leaving many metabolites with only limited annotation. Recent curation efforts focused on improving the annotation of metabolism and metabolic pathways by associating unmapped metabolites with database identifiers and providing more detailed interaction knowledge. Here, we report the outcomes of the continued growth and curation efforts, such as a doubling of the number of annotated metabolite nodes in WikiPathways. Furthermore, we introduce an OpenAPI documentation of our web services and the FAIR (Findable, Accessible, Interoperable and Reusable) annotation of resources to increase the interoperability of the knowledge encoded in these pathways and experimental omics data. New search options, monthly downloads, more links to metabolite databases, and new portals make pathway knowledge more easily accessible to individual researchers and research communities. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
BioPAX – A community standard for pathway data sharing

PubMed Central

Demir, Emek; Cary, Michael P.; Paley, Suzanne; Fukuda, Ken; Lemer, Christian; Vastrik, Imre; Wu, Guanming; D’Eustachio, Peter; Schaefer, Carl; Luciano, Joanne; Schacherer, Frank; Martinez-Flores, Irma; Hu, Zhenjun; Jimenez-Jacinto, Veronica; Joshi-Tope, Geeta; Kandasamy, Kumaran; Lopez-Fuentes, Alejandra C.; Mi, Huaiyu; Pichler, Elgar; Rodchenkov, Igor; Splendiani, Andrea; Tkachev, Sasha; Zucker, Jeremy; Gopinath, Gopal; Rajasimha, Harsha; Ramakrishnan, Ranjani; Shah, Imran; Syed, Mustafa; Anwar, Nadia; Babur, Ozgun; Blinov, Michael; Brauner, Erik; Corwin, Dan; Donaldson, Sylva; Gibbons, Frank; Goldberg, Robert; Hornbeck, Peter; Luna, Augustin; Murray-Rust, Peter; Neumann, Eric; Reubenacker, Oliver; Samwald, Matthias; van Iersel, Martijn; Wimalaratne, Sarala; Allen, Keith; Braun, Burk; Whirl-Carrillo, Michelle; Dahlquist, Kam; Finney, Andrew; Gillespie, Marc; Glass, Elizabeth; Gong, Li; Haw, Robin; Honig, Michael; Hubaut, Olivier; Kane, David; Krupa, Shiva; Kutmon, Martina; Leonard, Julie; Marks, Debbie; Merberg, David; Petri, Victoria; Pico, Alex; Ravenscroft, Dean; Ren, Liya; Shah, Nigam; Sunshine, Margot; Tang, Rebecca; Whaley, Ryan; Letovksy, Stan; Buetow, Kenneth H.; Rzhetsky, Andrey; Schachter, Vincent; Sobral, Bruno S.; Dogrusoz, Ugur; McWeeney, Shannon; Aladjem, Mirit; Birney, Ewan; Collado-Vides, Julio; Goto, Susumu; Hucka, Michael; Le Novère, Nicolas; Maltsev, Natalia; Pandey, Akhilesh; Thomas, Paul; Wingender, Edgar; Karp, Peter D.; Sander, Chris; Bader, Gary D.

2010-01-01

BioPAX (Biological Pathway Exchange) is a standard language to represent biological pathways at the molecular and cellular level. Its major use is to facilitate the exchange of pathway data (http://www.biopax.org). Pathway data captures our understanding of biological processes, but its rapid growth necessitates development of databases and computational tools to aid interpretation. However, the current fragmentation of pathway information across many databases with incompatible formats presents barriers to its effective use. BioPAX solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. BioPAX was created through a community process. Through BioPAX, millions of interactions organized into thousands of pathways across many organisms, from a growing number of sources, are available. Thus, large amounts of pathway data are available in a computable form to support visualization, analysis and biological discovery. PMID:20829833
The Pathway Tools software.

PubMed

Karp, Peter D; Paley, Suzanne; Romero, Pedro

2002-01-01

Bioinformatics requires reusable software tools for creating model-organism databases (MODs). The Pathway Tools is a reusable, production-quality software environment for creating a type of MOD called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc (see http://ecocyc.org) integrates our evolving understanding of the genes, proteins, metabolic network, and genetic network of an organism. This paper provides an overview of the four main components of the Pathway Tools: The PathoLogic component supports creation of new PGDBs from the annotated genome of an organism. The Pathway/Genome Navigator provides query, visualization, and Web-publishing services for PGDBs. The Pathway/Genome Editors support interactive updating of PGDBs. The Pathway Tools ontology defines the schema of PGDBs. The Pathway Tools makes use of the Ocelot object database system for data management services for PGDBs. The Pathway Tools has been used to build PGDBs for 13 organisms within SRI and by external users.
The systematic annotation of the three main GPCR families in Reactome.

PubMed

Jassal, Bijay; Jupe, Steven; Caudy, Michael; Birney, Ewan; Stein, Lincoln; Hermjakob, Henning; D'Eustachio, Peter

2010-07-29

Reactome is an open-source, freely available database of human biological pathways and processes. A major goal of our work is to provide an integrated view of cellular signalling processes that spans from ligand-receptor interactions to molecular readouts at the level of metabolic and transcriptional events. To this end, we have built the first catalogue of all human G protein-coupled receptors (GPCRs) known to bind endogenous or natural ligands. The UniProt database has records for 797 proteins classified as GPCRs and sorted into families A/1, B/2 and C/3 on the basis of amino acid sequence. To these records we have added details from the IUPHAR database and our own manual curation of relevant literature to create reactions in which 563 GPCRs bind ligands and also interact with specific G-proteins to initiate signalling cascades. We believe the remaining 234 GPCRs are true orphans. The Reactome GPCR pathway can be viewed as a detailed interactive diagram and can be exported in many forms. It provides a template for the orthology-based inference of GPCR reactions for diverse model organism species, and can be overlaid with protein-protein interaction and gene expression datasets to facilitate overrepresentation studies and other forms of pathway analysis. Database URL: http://www.reactome.org.
EcoCyc: a comprehensive database resource for Escherichia coli

PubMed Central

Keseler, Ingrid M.; Collado-Vides, Julio; Gama-Castro, Socorro; Ingraham, John; Paley, Suzanne; Paulsen, Ian T.; Peralta-Gil, Martín; Karp, Peter D.

2005-01-01

The EcoCyc database (http://EcoCyc.org/) is a comprehensive source of information on the biology of the prototypical model organism Escherichia coli K12. The mission for EcoCyc is to contain both computable descriptions of, and detailed comments describing, all genes, proteins, pathways and molecular interactions in E. coli. Through ongoing manual curation, extensive information such as summary comments, regulatory information, literature citations and evidence types has been extracted from 8862 publications and added to Version 8.5 of the EcoCyc database. The EcoCyc database can be accessed through a World Wide Web interface, while the downloadable Pathway Tools software and data files enable computational exploration of the data and provide enhanced querying capabilities that web interfaces cannot support. For example, EcoCyc contains carefully curated information that can be used as training sets for bioinformatics prediction of entities such as promoters, operons, genetic networks, transcription factor binding sites, metabolic pathways, functionally related genes, protein complexes and protein–ligand interactions. PMID:15608210
Reconstruction of metabolic pathways by combining probabilistic graphical model-based and knowledge-based methods

PubMed Central

2014-01-01

Automatic reconstruction of metabolic pathways for an organism from genomics and transcriptomics data has been a challenging and important problem in bioinformatics. Traditionally, known reference pathways are mapped onto organism-specific ones based on genome annotation and protein homology. However, this simple knowledge-based mapping method might produce incomplete pathways and generally cannot predict unknown new relations and reactions. In contrast, ab initio metabolic network construction methods can predict novel reactions and interactions, but their accuracy tends to be low, leading to many false positives. Here we combine existing pathway knowledge and a new ab initio Bayesian probabilistic graphical model in a novel fashion to improve automatic reconstruction of metabolic networks. Specifically, we built a knowledge database containing known, individual gene/protein interactions and metabolic reactions extracted from existing reference pathways. Known reactions and interactions were then used as constraints for Bayesian network learning methods to predict metabolic pathways. Using individual reactions and interactions extracted from different pathways of many organisms to guide pathway construction is new and improves both the coverage and accuracy of metabolic pathway construction. We applied this probabilistic knowledge-based approach to construct metabolic networks from yeast gene expression data and compared its results with 62 known metabolic networks in the KEGG database. The experiment showed that the method improved the coverage of metabolic network construction over the traditional reference pathway mapping method and was more accurate than pure ab initio methods. PMID:25374614
Integrating In Silico Resources to Map a Signaling Network

PubMed Central

Liu, Hanqing; Beck, Tim N.; Golemis, Erica A.; Serebriiskii, Ilya G.

2013-01-01

The abundance of publicly available life science databases offers a wealth of information that can support interpretation of experimentally derived data and greatly enhance hypothesis generation. Protein interaction and functional networks are not simply new renditions of existing data: they provide the opportunity to gain insights into the specific physical and functional role a protein plays as part of the biological system. In this chapter, we describe different in silico tools that can quickly and conveniently retrieve data from existing data repositories and discuss how the available tools are best utilized for different purposes. While emphasizing protein-protein interaction databases (e.g., BioGrid and IntAct), we also introduce metasearch platforms such as STRING and GeneMANIA, pathway databases (e.g., BioCarta and Pathway Commons), text mining approaches (e.g., PubMed and Chilibot), and resources for drug-protein interactions, genetic information for model organisms and gene expression information based on microarray data mining. Furthermore, we provide a simple step-by-step protocol for building customized protein-protein interaction networks in Cytoscape, a powerful network assembly and visualization program, integrating data retrieved from these various databases. As we illustrate, generation of composite interaction networks enables investigators to extract significantly more information about a given biological system than utilization of a single database or sole reliance on primary literature. PMID:24233784
Functional Interaction Network Construction and Analysis for Disease Discovery.

PubMed

Wu, Guanming; Haw, Robin

2017-01-01

Network-based approaches project seemingly unrelated genes or proteins onto a large-scale network context, thereby providing a holistic visualization and analysis platform for genomic data generated from high-throughput experiments, reducing the dimensionality of data by using network modules, and increasing statistical power. Based on the Reactome database, the most popular and comprehensive open-source biological pathway knowledgebase, we have developed a highly reliable protein functional interaction network covering around 60% of total human genes and an app called ReactomeFIViz for Cytoscape, the most popular biological network visualization and analysis platform. In this chapter, we describe the detailed procedures by which this functional interaction network is constructed: integrating multiple external data sources, extracting functional interactions from human curated pathway databases, building a machine learning classifier called a Naïve Bayesian Classifier, predicting interactions based on the trained Naïve Bayesian Classifier, and finally constructing the functional interaction database. We also provide an example of how to use ReactomeFIViz to perform network-based data analysis for a list of genes.
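The classifier step named above can be illustrated with a small Bernoulli naive Bayes model. This is a minimal sketch under stated assumptions, not the ReactomeFIViz pipeline: the three binary evidence features (for example coexpression, a shared interacting domain, an interaction between orthologs), the toy training pairs, and the function names are all hypothetical.

```python
import math


def train_naive_bayes(samples, labels):
    """Fit a Bernoulli naive Bayes model with Laplace (add-one) smoothing.

    samples: list of tuples of 0/1 features; labels: list of 0/1 classes.
    """
    n_features = len(samples[0])
    model = {}
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        prior = (len(rows) + 1) / (len(samples) + 2)
        likelihoods = [
            (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
            for j in range(n_features)
        ]
        model[cls] = (prior, likelihoods)
    return model


def predict_proba(model, features):
    """Return P(class = 1 | features) under the naive independence assumption."""
    log_scores = {}
    for cls, (prior, likelihoods) in model.items():
        score = math.log(prior)
        for x, p in zip(features, likelihoods):
            score += math.log(p if x else 1.0 - p)
        log_scores[cls] = score
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    return exp_scores[1] / (exp_scores[0] + exp_scores[1])


# Hypothetical training pairs: label 1 = pair annotated as functionally
# interacting in curated pathways, label 0 = pair treated as non-interacting.
samples = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 0, 0), (0, 1, 0), (0, 0, 1)]
labels = [1, 1, 1, 0, 0, 0]

model = train_naive_bayes(samples, labels)
print(predict_proba(model, (1, 1, 1)))  # strong evidence for an interaction
print(predict_proba(model, (0, 0, 0)))  # weak evidence for an interaction
```

The smoothing keeps every estimated probability strictly between 0 and 1, so a feature value never seen in one class cannot zero out the whole score.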
Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology

PubMed Central

Paley, Suzanne M.; Krummenacker, Markus; Latendresse, Mario; Dale, Joseph M.; Lee, Thomas J.; Kaipa, Pallavi; Gilham, Fred; Spaulding, Aaron; Popescu, Liviu; Altman, Tomer; Paulsen, Ian; Keseler, Ingrid M.; Caspi, Ron

2010-01-01

Pathway Tools is a production-quality software environment for creating a type of model-organism database called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc integrates the evolving understanding of the genes, proteins, metabolic network and regulatory network of an organism. This article provides an overview of Pathway Tools capabilities. The software performs multiple computational inferences including prediction of metabolic pathways, prediction of metabolic pathway hole fillers and prediction of operons. It enables interactive editing of PGDBs by DB curators. It supports web publishing of PGDBs, and provides a large number of query and visualization tools. The software also supports comparative analyses of PGDBs, and provides several systems biology analyses of PGDBs including reachability analysis of metabolic networks, and interactive tracing of metabolites through a metabolic network. More than 800 PGDBs have been created using Pathway Tools by scientists around the world, many of which are curated DBs for important model organisms. Those PGDBs can be exchanged using a peer-to-peer DB sharing system called the PGDB Registry. PMID:19955237
In silico study of protein to protein interaction analysis of AMP-activated protein kinase and mitochondrial activity in three different farm animal species

NASA Astrophysics Data System (ADS)

Prastowo, S.; Widyas, N.

2018-03-01

AMP-activated protein kinase (AMPK) is a cellular energy sensor that responds to ATP and AMP concentrations. This protein interacts with mitochondria to regulate the generation of energy for cell metabolism. This paper therefore compares the protein-protein interactions of AMPK and mitochondrial activity genes in the metabolism of three domesticated farm animals: cattle (Bos taurus), pig (Sus scrofa) and chicken (Gallus gallus). The in silico study was done using STRING V.10, a prominent protein interaction database, followed by a comparison of biological function in the KEGG PATHWAY database. A set of 12 genes was used as input for the analysis: PRKAA1, PRKAA2, PRKAB1, PRKAB2, PRKAG1, PRKAG2, PRKAG3, PPARGC1, ACC, CPT1B, NRF2 and SOD. The first 7 genes belong to the AMPK family, while the last 5 are mitochondrial activity genes. The protein interaction results show 11, 8 and 5 metabolism-related pathways in Bos taurus, Sus scrofa and Gallus gallus, respectively. The top pathway in Bos taurus is the AMPK signaling pathway (10 genes), in Sus scrofa the Adipocytokine signaling pathway (8 genes) and in Gallus gallus the FoxO signaling pathway (5 genes). Moreover, the pathways common to all 3 species are the Adipocytokine signaling pathway, the Insulin signaling pathway and the FoxO signaling pathway. The genes clustered in the Adipocytokine and Insulin signaling pathways are PRKAA2, PPARGC1A, PRKAB1 and PRKAG2, while those in the FoxO signaling pathway are PRKAA2, PRKAB1 and PRKAG2. Accordingly, PRKAA2, PRKAB1 and PRKAG2 are the common genes. Based on this bioinformatics analysis, protein-protein interactions show distinct metabolic differences between species. However, further validation is needed to give a clear explanation.
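The "common genes" reported above follow from a set intersection of the per-pathway gene lists given in the abstract. A minimal sketch, in which the variable names and the helper `common_members` are illustrative and only the gene symbols come from the abstract:

```python
from functools import reduce

# Gene lists exactly as stated in the abstract above.
adipocytokine_and_insulin = {"PRKAA2", "PPARGC1A", "PRKAB1", "PRKAG2"}
foxo = {"PRKAA2", "PRKAB1", "PRKAG2"}


def common_members(*gene_sets):
    """Return the genes present in every one of the given sets."""
    return reduce(set.intersection, gene_sets)


print(sorted(common_members(adipocytokine_and_insulin, foxo)))
# ['PRKAA2', 'PRKAB1', 'PRKAG2']
```

PPARGC1A drops out because it is listed for the Adipocytokine and Insulin signaling pathways but not for the FoxO signaling pathway.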
A novel method to identify hub pathways of rheumatoid arthritis based on differential pathway networks.

PubMed

Wei, Shi-Tong; Sun, Yong-Hua; Zong, Shi-Hua

2017-09-01

The aim of the current study was to identify hub pathways of rheumatoid arthritis (RA) using a novel method based on differential pathway network (DPN) analysis. The present study proposed a DPN in which a protein-protein interaction (PPI) network was integrated with pathway-pathway interactions. Pathway data were obtained from a background PPI network and the Reactome pathway database. Subsequently, pathway interactions were extracted from the pathway data by building randomized gene-gene interactions, and a weight value was assigned to each pathway interaction using the Spearman correlation coefficient (SCC) to identify differential pathway interactions. Differential pathway interactions were visualized using Cytoscape to construct a DPN. Topological analysis was conducted to identify hub pathways, defined as those within the top 5% of the degree distribution of the DPN. Modules of the DPN were mined using ClusterONE. A total of 855 pathways were selected to build pathway interactions. By retaining pathway interactions with weight values >0.7, a DPN with 312 nodes and 791 edges was obtained. Topological degree analysis revealed 15 hub pathways for RA based on the DPN, such as heparan sulfate/heparin-glycosaminoglycan (HS-GAG) degradation, HS-GAG metabolism and keratan sulfate degradation. Furthermore, hub pathways were also important in modules, which validated the significance of hub pathways. In conclusion, the proposed method is a computationally efficient way to identify hub pathways of RA; it identified 15 hub pathways that may be potential biomarkers and provide insight for future investigation and treatment of RA.
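The filtering and hub-selection steps in the abstract above can be sketched as a weight cutoff followed by a degree ranking. This is a hedged illustration, not the authors' code: the function name `hub_pathways`, the toy edge list, and the rounding rule (taking the floor of 5% of the node count) are assumptions, and the edge weights are assumed to have been computed beforehand.

```python
import math
from collections import Counter


def hub_pathways(weighted_edges, weight_cutoff=0.7, top_fraction=0.05):
    """Keep edges with weight > cutoff, then return the highest-degree nodes.

    The number of hubs is floor(top_fraction * number_of_nodes); this rounding
    rule is an assumption, since the abstract does not state one.
    """
    kept = [(a, b) for a, b, w in weighted_edges if w > weight_cutoff]
    degree = Counter()
    for a, b in kept:
        degree[a] += 1
        degree[b] += 1
    n_hubs = math.floor(top_fraction * len(degree))
    # Sort by descending degree, breaking ties alphabetically for determinism.
    ranked = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n_hubs]]


# Toy star-shaped network: P0 is linked to P1..P19 with a high weight, and one
# extra low-weight edge is dropped by the cutoff.
edges = [("P0", f"P{i}", 0.9) for i in range(1, 20)] + [("P1", "P2", 0.3)]
print(hub_pathways(edges))  # ['P0']

# Under the same rounding assumption, 5% of the 312 nodes reported above gives 15.
print(math.floor(0.05 * 312))  # 15
```

With that rounding rule, the 312-node network yields 15 hubs, which is consistent with the 15 hub pathways reported; the original study may have applied a different selection rule.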
SpirPro: A Spirulina proteome database and web-based tools for the analysis of protein-protein interactions at the metabolic level in Spirulina (Arthrospira) platensis C1.

PubMed

Senachak, Jittisak; Cheevadhanarak, Supapon; Hongsthong, Apiradee

2015-07-29

Spirulina (Arthrospira) platensis is the only cyanobacterium that in addition to being studied at the molecular level and subjected to gene manipulation, can also be mass cultivated in outdoor ponds for commercial use as a food supplement. Thus, encountering environmental changes, including temperature stresses, is common during the mass production of Spirulina. The use of cyanobacteria as an experimental platform, especially for photosynthetic gene manipulation in plants and bacteria, is becoming increasingly important. Understanding the mechanisms and protein-protein interaction networks that underlie low- and high-temperature responses is relevant to Spirulina mass production. To accomplish this goal, high-throughput techniques such as OMICs analyses are used. Thus, large datasets must be collected, managed and subjected to information extraction. Therefore, databases including (i) proteomic analysis and protein-protein interaction (PPI) data and (ii) domain/motif visualization tools are required for potential use in temperature response models for plant chloroplasts and photosynthetic bacteria. A web-based repository was developed including an embedded database, SpirPro, and tools for network visualization. Proteome data were analyzed integrated with protein-protein interactions and/or metabolic pathways from KEGG. The repository provides various information, ranging from raw data (2D-gel images) to associated results, such as data from interaction and/or pathway analyses. This integration allows in silico analyses of protein-protein interactions affected at the metabolic level and, particularly, analyses of interactions between and within the affected metabolic pathways under temperature stresses for comparative proteomic analysis. The developed tool, which is coded in HTML with CSS/JavaScript and depicted in Scalable Vector Graphics (SVG), is designed for interactive analysis and exploration of the constructed network. 
SpirPro is publicly available on the web at http://spirpro.sbi.kmutt.ac.th . SpirPro is an analysis platform containing an integrated proteome and PPI database that provides the most comprehensive data on this cyanobacterium at the systematic level. As an integrated database, SpirPro can be applied in various analyses, such as temperature stress response networking analysis in cyanobacterial models and interacting domain-domain analysis between proteins of interest.

Modelling the structure of a ceRNA-theoretical, bipartite microRNA-mRNA interaction network regulating intestinal epithelial cellular pathways using R programming.

PubMed

Robinson, J M; Henderson, W A

2018-01-12

We report a method using functional-molecular databases and network modelling to identify hypothetical mRNA-miRNA interaction networks regulating intestinal epithelial barrier function. The model forms a data-analysis component of our cell culture experiments, which produce RNA expression data from the Nanostring Technologies nCounter® system. The epithelial tight-junction (TJ) and actin cytoskeleton interact as molecular components of the intestinal epithelial barrier. Upstream regulation of TJ-cytoskeleton interaction is effected by the Rac/Rock/Rho signaling pathway and other associated pathways which may be activated or suppressed by extracellular signaling from growth factors, hormones, and immune receptors. Pathway activations affect epithelial homeostasis, contributing to degradation of the epithelial barrier associated with osmotic dysregulation, inflammation, and tumor development. The complexity underlying miRNA-mRNA interaction networks represents a roadblock for prediction and validation of competing-endogenous RNA network function. We developed a network model to identify hypothetical co-regulatory motifs in a miRNA-mRNA interaction network related to epithelial function. An mRNA-miRNA interaction list was generated using the KEGG and miRWalk2.0 databases. R code was developed to quantify and visualize inherent network structures. We identified a sub-network of genes associated with cellular proliferation and cancer, including c-MYC and Cyclin D, that share a high number of targeting miRNAs.
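The idea of finding genes that share many targeting miRNAs in a bipartite interaction list can be sketched as follows. The authors worked in R; Python is used here only for illustration. The function name and every miRNA and gene identifier below are placeholders, not real targeting relationships.

```python
from collections import defaultdict
from itertools import combinations


def shared_mirna_counts(pairs):
    """Count miRNAs shared by each pair of genes in a bipartite edge list.

    pairs: iterable of (mirna, gene) tuples.
    Returns {(gene_1, gene_2): n_shared} for gene pairs sharing at least one miRNA.
    """
    mirnas_by_gene = defaultdict(set)
    for mirna, gene in pairs:
        mirnas_by_gene[gene].add(mirna)

    counts = {}
    for g1, g2 in combinations(sorted(mirnas_by_gene), 2):
        shared = mirnas_by_gene[g1] & mirnas_by_gene[g2]
        if shared:
            counts[(g1, g2)] = len(shared)
    return counts


# Invented placeholder interactions, for illustration only.
toy_pairs = [
    ("miR-x1", "GENE_A"), ("miR-x2", "GENE_A"),
    ("miR-x1", "GENE_B"), ("miR-x2", "GENE_B"),
    ("miR-x1", "GENE_C"), ("miR-x3", "GENE_C"),
]
print(shared_mirna_counts(toy_pairs))
# {('GENE_A', 'GENE_B'): 2, ('GENE_A', 'GENE_C'): 1, ('GENE_B', 'GENE_C'): 1}
```

Gene pairs with high counts are candidates for the kind of co-regulated sub-network the abstract describes.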
Investigating multiple dysregulated pathways in rheumatoid arthritis based on pathway interaction network.

PubMed

Song, Xian-Dong; Song, Xian-Xu; Liu, Gui-Bo; Ren, Chun-Hui; Sun, Yuan-Bo; Liu, Ke-Xin; Liu, Bo; Liang, Shuang; Zhu, Zhu

2018-03-01

Traditional methods of identifying biomarkers in rheumatoid arthritis (RA) have focused on differentially expressed pathways or individual pathways, and therefore neglect the interactions between pathways. To better understand the pathogenesis of RA, we aimed to identify dysregulated pathway sets using a pathway interaction network (PIN), which considers interactions among pathways. Firstly, RA-related gene expression profile data, protein-protein interaction (PPI) data and pathway data were retrieved from the corresponding databases. Secondly, principal component analysis was used to calculate the activity of each pathway, and a seed pathway was then identified from the pathway activity data. A PIN was then constructed based on the gene expression profile, pathway data, and PPI information. Finally, the dysregulated pathways were extracted from the PIN based on the seed pathway using support vector machines and an area under the curve (AUC) index. The PIN comprised a total of 854 pathways and 1064 pathway interactions. The greatest change in the activity score between RA and control samples was observed in the pathway of epigenetic regulation of gene expression, which was extracted and regarded as the seed pathway. Starting with this seed pathway, one maximum pathway set containing 10 dysregulated pathways was extracted from the PIN, with an AUC of 0.8249, indicating that this pathway set could distinguish RA from the controls. These 10 dysregulated pathways might be potential biomarkers for RA diagnosis and treatment in the future.
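Seed selection, as described, amounts to picking the pathway whose activity score changes most between disease and control samples. The Python sketch below assumes the per-sample activity scores have already been computed (by PCA in the study) and measures "change" as the absolute difference of group means; that measure, the function name, and the toy numbers are assumptions made for illustration.

```python
from statistics import mean


def pick_seed_pathway(activity):
    """Return the pathway with the largest disease-vs-control activity change.

    activity: {pathway_name: (disease_scores, control_scores)}.
    Change is taken as |mean(disease) - mean(control)|, which is an assumption.
    """
    def change(name):
        disease, control = activity[name]
        return abs(mean(disease) - mean(control))

    return max(activity, key=change)


# Invented activity scores for placeholder pathways.
toy_activity = {
    "pathway_A": ([2.0, 2.4, 2.2], [0.5, 0.7, 0.6]),
    "pathway_B": ([1.0, 1.1, 0.9], [1.0, 0.9, 1.1]),
}
print(pick_seed_pathway(toy_activity))  # pathway_A
```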
NRF2-ome: an integrated web resource to discover protein interaction and regulatory networks of NRF2.

PubMed

Türei, Dénes; Papp, Diána; Fazekas, Dávid; Földvári-Nagy, László; Módos, Dezső; Lenti, Katalin; Csermely, Péter; Korcsmáros, Tamás

2013-01-01

NRF2 is the master transcriptional regulator of oxidative and xenobiotic stress responses. NRF2 has important roles in carcinogenesis, inflammation, and neurodegenerative diseases. We developed an online resource, NRF2-ome, to provide an integrated and systems-level database for NRF2. The database contains manually curated and predicted interactions of NRF2 as well as data from external interaction databases. We integrated the NRF2 interactome with NRF2 target genes, NRF2-regulating TFs, and miRNAs. We connected NRF2-ome to signaling pathways to allow mapping of upstream NRF2 regulatory components that could directly or indirectly influence NRF2 activity, totaling 35,967 protein-protein and signaling interactions. The user-friendly website allows researchers without a computational background to search, browse, and download the database. The database can be downloaded in SQL, CSV, BioPAX, SBML, PSI-MI, and Cytoscape CYS file formats. We illustrated the applicability of the website by suggesting a posttranscriptional negative feedback of NRF2 by MAFG protein and raised the possibility of a connection between NRF2 and the JAK/STAT pathway through STAT1 and STAT3. NRF2-ome can also be used as an evaluation tool to help researchers and drug developers to understand the hidden regulatory mechanisms in the complex network of NRF2.
The Biomolecular Interaction Network Database and related tools 2005 update

PubMed Central

Alfarano, C.; Andrade, C. E.; Anthony, K.; Bahroos, N.; Bajec, M.; Bantoft, K.; Betel, D.; Bobechko, B.; Boutilier, K.; Burgess, E.; Buzadzija, K.; Cavero, R.; D'Abreo, C.; Donaldson, I.; Dorairajoo, D.; Dumontier, M. J.; Dumontier, M. R.; Earles, V.; Farrall, R.; Feldman, H.; Garderman, E.; Gong, Y.; Gonzaga, R.; Grytsan, V.; Gryz, E.; Gu, V.; Haldorsen, E.; Halupa, A.; Haw, R.; Hrvojic, A.; Hurrell, L.; Isserlin, R.; Jack, F.; Juma, F.; Khan, A.; Kon, T.; Konopinsky, S.; Le, V.; Lee, E.; Ling, S.; Magidin, M.; Moniakis, J.; Montojo, J.; Moore, S.; Muskat, B.; Ng, I.; Paraiso, J. P.; Parker, B.; Pintilie, G.; Pirone, R.; Salama, J. J.; Sgro, S.; Shan, T.; Shu, Y.; Siew, J.; Skinner, D.; Snyder, K.; Stasiuk, R.; Strumpf, D.; Tuekam, B.; Tao, S.; Wang, Z.; White, M.; Willis, R.; Wolting, C.; Wong, S.; Wrong, A.; Xin, C.; Yao, R.; Yates, B.; Zhang, S.; Zheng, K.; Pawson, T.; Ouellette, B. F. F.; Hogue, C. W. V.

2005-01-01

The Biomolecular Interaction Network Database (BIND) (http://bind.ca) archives biomolecular interaction, reaction, complex and pathway information. Our aim is to curate the details about molecular interactions that arise from published experimental research and to provide this information, as well as tools to enable data analysis, freely to researchers worldwide. BIND data are curated into a comprehensive machine-readable archive of computable information and provide users with methods to discover interactions and molecular mechanisms. BIND has worked to develop new methods for visualization that amplify the underlying annotation of genes and proteins to facilitate the study of molecular interaction networks. BIND has maintained an open database policy since its inception in 1999. Data growth has proceeded at a tremendous rate, approaching over 100 000 records. New services provided include a new BIND Query and Submission interface, a Standard Object Access Protocol service and the Small Molecule Interaction Database (http://smid.blueprint.org) that allows users to determine probable small molecule binding sites of new sequences and examine conserved binding residues. PMID:15608229
Benchmarking pathway interaction network for colorectal cancer to identify dysregulated pathways.

PubMed

Wang, Q; Shi, C-J; Lv, S-H

2017-03-30

Different pathways act synergistically to participate in many biological processes. Thus, the purpose of our study was to extract dysregulated pathways to investigate the pathogenesis of colorectal cancer (CRC) based on the functional dependency among pathways. Protein-protein interaction (PPI) information and pathway data were retrieved from the STRING and Reactome databases, respectively. After genes were aligned to the pathways, the activity of each pathway was calculated using principal component analysis (PCA), and the seed pathway was identified. Subsequently, we constructed the pathway interaction network (PIN), in which each node represented a biological pathway, based on the gene expression profile, PPI data, and pathway data. Dysregulated pathways were then selected from the PIN according to classification performance and the seed pathway. A PIN including 11,960 interactions was constructed to identify dysregulated pathways. Interestingly, the interaction between the mRNA splicing and mRNA splicing-major pathways had the highest score, 719.8167. The maximum change in activity score between CRC and normal samples appeared in the DNA replication pathway, which was selected as the seed pathway. Starting with this seed pathway, a pathway set containing 30 dysregulated pathways was obtained with an area under the curve score of 0.8598. The mRNA splicing, mRNA splicing-major pathway, and RNA polymerase I pathways had the maximum number of genes, 107. Moreover, we found that these 30 pathways showed crosstalk with one another. The results suggest that these dysregulated pathways might be used as biomarkers to diagnose CRC.
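The area under the curve used to score a pathway set can be read as the probability that a randomly chosen disease sample receives a higher classifier score than a randomly chosen normal sample. A minimal Python sketch of that pairwise definition follows. The function name and toy scores are illustrative assumptions and do not reproduce the study's classifier or data.

```python
def auc(pos_scores, neg_scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Runs in O(len(pos) * len(neg)) time,
    which is fine for small illustrative inputs.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Invented classifier scores: 8 of the 9 pairs are ranked correctly.
disease = [0.9, 0.8, 0.4]
normal = [0.7, 0.3, 0.2]
print(round(auc(disease, normal), 4))  # 0.8889
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.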
An editor for pathway drawing and data visualization in the Biopathways Workbench.

PubMed

Byrnes, Robert W; Cotter, Dawn; Maer, Andreia; Li, Joshua; Nadeau, David; Subramaniam, Shankar

2009-10-02

Pathway models serve as the basis for much of systems biology. They are often built using programs designed for the purpose. Constructing new models generally requires simultaneous access to experimental data of diverse types, to databases of well-characterized biological compounds and molecular intermediates, and to reference model pathways. However, few if any software applications provide all such capabilities within a single user interface. The Pathway Editor is a program written in the Java programming language that allows de-novo pathway creation and downloading of LIPID MAPS (Lipid Metabolites and Pathways Strategy) and KEGG lipid metabolic pathways, and of measured time-dependent changes to lipid components of metabolism. Accessed through Java Web Start, the program downloads pathways from the LIPID MAPS Pathway database (Pathway) as well as from the LIPID MAPS web server http://www.lipidmaps.org. Data arises from metabolomic (lipidomic), microarray, and protein array experiments performed by the LIPID MAPS consortium of laboratories and is arranged by experiment. Facility is provided to create, connect, and annotate nodes and processes on a drawing panel with reference to database objects and time course data. Node and interaction layout as well as data display may be configured in pathway diagrams as desired. Users may extend diagrams, and may also read and write data and non-lipidomic KEGG pathways to and from files. Pathway diagrams in XML format, containing database identifiers referencing specific compounds and experiments, can be saved to a local file for subsequent use. The program is built upon a library of classes, referred to as the Biopathways Workbench, that convert between different file formats and database objects. An example of this feature is provided in the form of read/construct/write access to models in SBML (Systems Biology Markup Language) contained in the local file system. 
Inclusion of access to multiple experimental data types and of pathway diagrams within a single interface, automatic updating through connectivity to an online database, and a focus on annotation, including reference to standardized lipid nomenclature as well as common lipid names, supports the view that the Pathway Editor represents a significant, practicable contribution to current pathway modeling tools.
A Web Tool for Generating High Quality Machine-readable Biological Pathways.

PubMed

Ramirez-Gaona, Miguel; Marcu, Ana; Pon, Allison; Grant, Jason; Wu, Anthony; Wishart, David S

2017-02-08

PathWhiz is a web server built to facilitate the creation of colorful, interactive, visually pleasing pathway diagrams that are rich in biological information. The pathways generated by this online application are machine-readable and fully compatible with essentially all web-browsers and computer operating systems. It uses a specially developed, web-enabled pathway drawing interface that permits the selection and placement of different combinations of pre-drawn biological or biochemical entities to depict reactions, interactions, transport processes and binding events. This palette of entities consists of chemical compounds, proteins, nucleic acids, cellular membranes, subcellular structures, tissues, and organs. All of the visual elements in it can be interactively adjusted and customized. Furthermore, because this tool is a web server, all pathways and pathway elements are publicly accessible. This kind of pathway "crowd sourcing" means that PathWhiz already contains a large and rapidly growing collection of previously drawn pathways and pathway elements. Here we describe a protocol for the quick and easy creation of new pathways and the alteration of existing pathways. To further facilitate pathway editing and creation, the tool contains replication and propagation functions. The replication function allows existing pathways to be used as templates to create or edit new pathways. The propagation function allows one to take an existing pathway and automatically propagate it across different species. Pathways created with this tool can be "re-styled" into different formats (KEGG-like or text-book like), colored with different backgrounds, exported to BioPAX, SBGN-ML, SBML, or PWML data exchange formats, and downloaded as PNG or SVG images. The pathways can easily be incorporated into online databases, integrated into presentations, posters or publications, or used exclusively for online visualization and exploration. 
This protocol has been successfully applied to generate over 2,000 pathway diagrams, which are now found in many online databases including HMDB, DrugBank, SMPDB, and ECMDB.
NAViGaTing the Micronome – Using Multiple MicroRNA Prediction Databases to Identify Signalling Pathway-Associated MicroRNAs

PubMed Central

Shirdel, Elize A.; Xie, Wing; Mak, Tak W.; Jurisica, Igor

2011-01-01

Background MicroRNAs are a class of small RNAs known to regulate gene expression at the transcript level, the protein level, or both. Since microRNA binding is sequence-based but possibly structure-specific, work in this area has resulted in multiple databases storing predicted microRNA:target relationships computed using diverse algorithms. We integrate prediction databases, compare predictions to in vitro data, and use cross-database predictions to model the microRNA:transcript interactome – referred to as the micronome – to study microRNA involvement in well-known signalling pathways as well as associations with disease. We make this data freely available with a flexible user interface as our microRNA Data Integration Portal — mirDIP (http://ophid.utoronto.ca/mirDIP). Results mirDIP integrates prediction databases to elucidate accurate microRNA:target relationships. Using NAViGaTOR to produce interaction networks implicating microRNAs in literature-based, KEGG-based and Reactome-based pathways, we find these signalling pathway networks have significantly more microRNA involvement compared to chance (p<0.05), suggesting microRNAs co-target many genes in a given pathway. Further examination of the micronome shows two distinct classes of microRNAs; universe microRNAs, which are involved in many signalling pathways; and intra-pathway microRNAs, which target multiple genes within one signalling pathway. We find universe microRNAs to have more targets (p<0.0001), to be more studied (p<0.0002), and to have higher degree in the KEGG cancer pathway (p<0.0001), compared to intra-pathway microRNAs. Conclusions Our pathway-based analysis of mirDIP data suggests microRNAs are involved in intra-pathway signalling. We identify two distinct classes of microRNAs, suggesting a hierarchical organization of microRNAs co-targeting genes both within and between pathways, and implying differential involvement of universe and intra-pathway microRNAs at the disease level. PMID:21364759
DOMMINO 2.0: integrating structurally resolved protein-, RNA-, and DNA-mediated macromolecular interactions

PubMed Central

Kuang, Xingyan; Dhroso, Andi; Han, Jing Ginger; Shyu, Chi-Ren; Korkin, Dmitry

2016-01-01

Macromolecular interactions are formed between proteins, DNA and RNA molecules. Being a principal building block in macromolecular assemblies and pathways, the interactions underlie most cellular functions. Malfunctioning of macromolecular interactions is also linked to a number of diseases. Structural knowledge of the macromolecular interaction allows one to understand the interaction’s mechanism, determine its functional implications and characterize the effects of genetic variations, such as single nucleotide polymorphisms, on the interaction. Unfortunately, until now the interactions mediated by different types of macromolecules, e.g. protein–protein interactions or protein–DNA interactions, have been collected into individual and unrelated structural databases. This presents a significant obstacle in the analysis of macromolecular interactions. For instance, the homogeneous structural interaction databases prevent scientists from studying structural interactions of different types but occurring in the same macromolecular complex. Here, we introduce DOMMINO 2.0, a structural Database Of Macro-Molecular INteractiOns. Compared to DOMMINO 1.0, a comprehensive database on protein-protein interactions, DOMMINO 2.0 includes the interactions between all three basic types of macromolecules extracted from PDB files. DOMMINO 2.0 is automatically updated on a weekly basis. It currently includes ∼1 040 000 interactions between two polypeptide subunits (e.g. domains, peptides, termini and interdomain linkers), ∼43 000 RNA-mediated interactions, and ∼12 000 DNA-mediated interactions. All protein structures in the database are annotated using SCOP and SUPERFAMILY family annotation. As a result, protein-mediated interactions involving protein domains, interdomain linkers, C- and N- termini, and peptides are identified. 
Our database provides an intuitive web interface, allowing one to investigate interactions at three different resolution levels: whole subunit network, binary interaction and interaction interface. Database URL: http://dommino.org PMID:26827237
cPath: open source software for collecting, storing, and querying biological pathways.

PubMed

Cerami, Ethan G; Bader, Gary D; Gross, Benjamin E; Sander, Chris

2006-11-13

Biological pathways, including metabolic pathways, protein interaction networks, signal transduction pathways, and gene regulatory networks, are currently represented in over 220 diverse databases. These data are crucial for the study of specific biological processes, including human diseases. Standard exchange formats for pathway information, such as BioPAX, CellML, SBML and PSI-MI, enable convenient collection of this data for biological research, but mechanisms for common storage and communication are required. We have developed cPath, an open source database and web application for collecting, storing, and querying biological pathway data. cPath makes it easy to aggregate custom pathway data sets available in standard exchange formats from multiple databases, present pathway data to biologists via a customizable web interface, and export pathway data via a web service to third-party software, such as Cytoscape, for visualization and analysis. cPath is software only, and does not include new pathway information. Key features include: a built-in identifier mapping service for linking identical interactors and linking to external resources; built-in support for PSI-MI and BioPAX standard pathway exchange formats; a web service interface for searching and retrieving pathway data sets; and thorough documentation. The cPath software is freely available under the LGPL open source license for academic and commercial use. cPath is a robust, scalable, modular, professional-grade software platform for collecting, storing, and querying biological pathways. It can serve as the core data handling component in information systems for pathway visualization, analysis and modeling.
The Pathway Coexpression Network: Revealing pathway relationships

PubMed Central

Tanzi, Rudolph E.

2018-01-01

A goal of genomics is to understand the relationships between biological processes. Pathways contribute to functional interplay within biological processes through complex but poorly understood interactions. However, limited functional references for global pathway relationships exist. Pathways from databases such as KEGG and Reactome provide discrete annotations of biological processes. Their relationships are currently either inferred from gene set enrichment within specific experiments, or by simple overlap, linking pathway annotations that have genes in common. Here, we provide a unifying interpretation of functional interaction between pathways by systematically quantifying coexpression between 1,330 canonical pathways from the Molecular Signatures Database (MSigDB) to establish the Pathway Coexpression Network (PCxN). We estimated the correlation between canonical pathways valid in a broad context using a curated collection of 3,207 microarrays from 72 normal human tissues. PCxN accounts for shared genes between annotations to estimate significant correlations between pathways with related functions rather than with similar annotations. We demonstrate that PCxN provides novel insight into mechanisms of complex diseases using an Alzheimer’s Disease (AD) case study. PCxN retrieved pathways significantly correlated with an expert curated AD gene list. These pathways have known associations with AD and were significantly enriched for genes independently associated with AD. As a further step, we show how PCxN complements the results of gene set enrichment methods by revealing relationships between enriched pathways, and by identifying additional highly correlated pathways. PCxN revealed that correlated pathways from an AD expression profiling study include functional clusters involved in cell adhesion and oxidative stress. PCxN provides expanded connections to pathways from the extracellular matrix. 
PCxN provides a powerful new framework for interrogation of global pathway relationships. Comprehensive exploration of PCxN can be performed at http://pcxn.org/. PMID:29554099
Databases for Microbiologists

DOE PAGES

Zhulin, Igor B.

2015-05-26

Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. Finally, the purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists.
Databases for Microbiologists

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhulin, Igor B.

Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. Finally, the purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists.
Databases for Microbiologists

PubMed Central

2015-01-01

Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists. PMID:26013493
Metabolic pathways for the whole community.

PubMed

Hanson, Niels W; Konwar, Kishori M; Hawley, Alyse K; Altman, Tomer; Karp, Peter D; Hallam, Steven J

2014-07-22

A convergence of high-throughput sequencing and computational power is transforming biology into information science. Despite these technological advances, converting bits and bytes of sequence information into meaningful insights remains a challenging enterprise. Biological systems operate on multiple hierarchical levels from genomes to biomes. Holistic understanding of biological systems requires agile software tools that permit comparative analyses across multiple information levels (DNA, RNA, protein, and metabolites) to identify emergent properties, diagnose system states, or predict responses to environmental change. Here we adopt the MetaPathways annotation and analysis pipeline and Pathway Tools to construct environmental pathway/genome databases (ePGDBs) that describe microbial community metabolism using MetaCyc, a highly curated database of metabolic pathways and components covering all domains of life. We evaluate Pathway Tools' performance on three datasets with different complexity and coding potential, including simulated metagenomes, a symbiotic system, and the Hawaii Ocean Time-series. We define accuracy and sensitivity relationships between read length, coverage and pathway recovery and evaluate the impact of taxonomic pruning on ePGDB construction and interpretation. Resulting ePGDBs provide interactive metabolic maps, predict emergent metabolic pathways associated with biosynthesis and energy production and differentiate between genomic potential and phenotypic expression across defined environmental gradients. This multi-tiered analysis provides the user community with specific operating guidelines, performance metrics and prediction hazards for more reliable ePGDB construction and interpretation. Moreover, it demonstrates the power of Pathway Tools in predicting metabolic interactions in natural and engineered ecosystems.
Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology

PubMed Central

Latendresse, Mario; Paley, Suzanne M.; Krummenacker, Markus; Ong, Quang D.; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M.; Caspi, Ron

2016-01-01

Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms. PMID:26454094
MEGADOCK-Web: an integrated database of high-throughput structure-based protein-protein interaction predictions.

PubMed

Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka

2018-05-08

Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structures because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database to gather prediction results and their predicted 3D complex structures and to make them easily accessible. Although several databases exist that provide predicted PPIs, the previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times the number of PPI predictions than previous databases and enables users to conduct PPI predictions that cannot be found in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. 
MEGADOCK-Web provides a huge amount of comprehensive PPI predictions based on docking calculations with biochemical pathways and enables users to easily and quickly assess PPI feasibilities by archiving PPI predictions. MEGADOCK-Web also promotes the discovery of new PPIs and protein functions and is freely available for use at http://www.bi.cs.titech.ac.jp/megadock-web/ .
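The two figures quoted in this abstract are consistent with each other if "all possible combinations" is read as all unordered pairs of distinct chains. That reading is an assumption, not something the abstract states; the short check below only illustrates the arithmetic.

```python
from math import comb

# Number of protein chains reported in the abstract.
n_chains = 7528

# Assumption: one prediction per unordered pair of distinct chains,
# i.e. "n choose 2" = n * (n - 1) / 2.
n_pairs = comb(n_chains, 2)

print(n_pairs)  # 28331628, the number of predicted PPIs quoted above
```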
cPath: open source software for collecting, storing, and querying biological pathways

PubMed Central

Cerami, Ethan G; Bader, Gary D; Gross, Benjamin E; Sander, Chris

2006-01-01

Background: Biological pathways, including metabolic pathways, protein interaction networks, signal transduction pathways, and gene regulatory networks, are currently represented in over 220 diverse databases. These data are crucial for the study of specific biological processes, including human diseases. Standard exchange formats for pathway information, such as BioPAX, CellML, SBML and PSI-MI, enable convenient collection of this data for biological research, but mechanisms for common storage and communication are required. Results: We have developed cPath, an open source database and web application for collecting, storing, and querying biological pathway data. cPath makes it easy to aggregate custom pathway data sets available in standard exchange formats from multiple databases, present pathway data to biologists via a customizable web interface, and export pathway data via a web service to third-party software, such as Cytoscape, for visualization and analysis. cPath is software only, and does not include new pathway information. Key features include: a built-in identifier mapping service for linking identical interactors and linking to external resources; built-in support for PSI-MI and BioPAX standard pathway exchange formats; a web service interface for searching and retrieving pathway data sets; and thorough documentation. The cPath software is freely available under the LGPL open source license for academic and commercial use. Conclusion: cPath is a robust, scalable, modular, professional-grade software platform for collecting, storing, and querying biological pathways. It can serve as the core data handling component in information systems for pathway visualization, analysis and modeling. PMID:17101041
MetNetAPI: A flexible method to access and manipulate biological network data from MetNet

PubMed Central

2010-01-01

Background: Convenient programmatic access to different biological databases allows automated integration of scientific knowledge. Many databases support a function to download files or data snapshots, or a webservice that offers "live" data. However, the functionality that a database offers cannot be represented in a static data download file, and webservices may consume considerable computational resources from the host server. Results: MetNetAPI is a versatile Application Programming Interface (API) to the MetNetDB database. It abstracts, captures and retains operations away from a biological network repository and website. A range of database functions, previously only available online, can be immediately (and independently from the website) applied to a dataset of interest. Data is available in four layers: molecular entities, localized entities (linked to a specific organelle), interactions, and pathways. Navigation between these layers is intuitive (e.g. one can request the molecular entities in a pathway, as well as request in what pathways a specific entity participates). Data retrieval can be customized: Network objects allow the construction of new and integration of existing pathways and interactions, which can be uploaded back to our server. In contrast to webservices, the computational demand on the host server is limited to processing data-related queries only. Conclusions: An API provides several advantages to a systems biology software platform. MetNetAPI illustrates an interface with a central repository of data that represents the complex interrelationships of a metabolic and regulatory network. As an alternative to data-dumps and webservices, it allows access to a current and "live" database and exposes analytical functions to application developers. Yet it only requires limited resources on the server-side (thin server/fat client setup).
The API is available for Java, Microsoft .NET and R programming environments and offers flexible query and broad data-retrieval methods. Data retrieval can be customized to client needs and the API offers a framework to construct and manipulate user-defined networks. The design principles can be used as a template to build programmable interfaces for other biological databases. The API software and tutorials are available at http://www.metnetonline.org/api. PMID:21083943
The Listeria monocytogenes strain 10403S BioCyc database

PubMed Central

Orsi, Renato H.; Bergholz, Teresa M.; Wiedmann, Martin; Boor, Kathryn J.

2015-01-01

Listeria monocytogenes is a food-borne pathogen of humans and other animals. The striking ability to survive several stresses usually used for food preservation makes L. monocytogenes one of the biggest concerns to the food industry, while the high mortality of listeriosis in specific groups of humans makes it a great concern for public health. Previous studies have shown that a regulatory network involving alternative sigma (σ) factors and transcription factors is pivotal to stress survival. However, few studies have evaluated the metabolic networks controlled by these regulatory mechanisms. The L. monocytogenes BioCyc database uses the strain 10403S as a model. Computer-generated initial annotation for all genes also allowed for identification, annotation and display of predicted reactions and pathways carried out by a single cell. Further ongoing manual curation based on published data as well as database mining for selected genes allowed the more refined annotation of functions, which, in turn, allowed for annotation of new pathways and fine-tuning of previously defined pathways to more L. monocytogenes-specific pathways. Using RNA-Seq data, several transcription start sites and promoter regions were mapped to the 10403S genome and annotated within the database. Additionally, the identification of promoter regions and a comprehensive review of available literature allowed the annotation of several regulatory interactions involving σ factors and transcription factors. The L. monocytogenes 10403S BioCyc database is a new resource for researchers studying Listeria and related organisms.
It allows users to (i) have a comprehensive view of all reactions and pathways predicted to take place within the cell in the cellular overview, as well as to (ii) upload their own data, such as differential expression data, to visualize the data in the scope of predicted pathways and regulatory networks and to carry on enrichment analyses using several different annotations available within the database. Database URL: http://biocyc.org/organism-summary?object=10403S_RAST PMID:25819074

Directed random walks and constraint programming reveal active pathways in hepatocyte growth factor signaling.

PubMed

Kittas, Aristotelis; Delobelle, Aurélien; Schmitt, Sabrina; Breuhahn, Kai; Guziolowski, Carito; Grabe, Niels

2016-01-01

An effective means to analyze mRNA expression data is to take advantage of established knowledge from pathway databases, using methods such as pathway-enrichment analyses. However, pathway databases are not case-specific and expression data could be used to infer gene-regulation patterns in the context of specific pathways. In addition, canonical pathways may not always describe the signaling mechanisms properly, because interactions can frequently occur between genes in different pathways. Relatively few methods have been proposed to date for generating and analyzing such networks, preserving the causality between gene interactions and reasoning over the qualitative logic of regulatory effects. We present an algorithm (MCWalk) integrated with a logic programming approach, to discover subgraphs in large-scale signaling networks by random walks in a fully automated pipeline. As an exemplary application, we uncover the signal transduction mechanisms in a gene interaction network describing hepatocyte growth factor-stimulated cell migration and proliferation from gene-expression measured with microarray and RT-qPCR using in-house perturbation experiments in a keratinocyte-fibroblast co-culture. The resulting subgraphs illustrate possible associations of hepatocyte growth factor receptor c-Met nodes, differentially expressed genes and cellular states. Using perturbation experiments and Answer Set programming, we are able to select those which are more consistent with the experimental data. We discover key regulator nodes by measuring the frequency with which they are traversed when connecting signaling between receptors and significantly regulated genes and predict their expression-shift consistently with the measured data. The Java implementation of MCWalk is publicly available under the MIT license at: https://bitbucket.org/akittas/biosubg. © 2015 FEBS.
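The idea of ranking regulator nodes by how often random walks traverse them on the way from receptors to regulated genes can be illustrated with a small sketch. This is not the MCWalk implementation: the graph, node names, walk parameters, and the function `traversal_frequency` are all hypothetical, and the logic-programming (Answer Set Programming) consistency step is omitted.

```python
import random
from collections import Counter

def traversal_frequency(graph, sources, targets, n_walks=1000, max_steps=20, seed=0):
    """Fraction of successful walks (source -> any target) that pass through each node."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter()
    successful = 0
    for _ in range(n_walks):
        node = rng.choice(sources)
        path = [node]
        for _ in range(max_steps):
            successors = graph.get(node, [])
            if not successors:
                break  # dead end: this walk fails
            node = rng.choice(successors)
            path.append(node)
            if node in targets:
                break
        if path[-1] in targets:
            successful += 1
            # Count each intermediate node once per walk (exclude source and target).
            counts.update(set(path[1:-1]))
    if successful == 0:
        return {}
    return {n: c / successful for n, c in counts.items()}

# Hypothetical toy network: a receptor signals through A or B to C, which regulates gene_X.
toy_graph = {
    "receptor": ["A", "B"],
    "A": ["C"],
    "B": ["C", "dead_end"],
    "C": ["gene_X"],
}
freq = traversal_frequency(toy_graph, sources=["receptor"], targets={"gene_X"})
for name, value in sorted(freq.items(), key=lambda kv: -kv[1]):
    print(name, round(value, 3))
```

In this toy network every successful walk must pass through C, so C is traversed most often; nodes that appear only on failed walks (here `dead_end`) never appear in the result.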
Update of KDBI: Kinetic Data of Bio-molecular Interaction database

PubMed Central

Kumar, Pankaj; Han, B. C.; Shi, Z.; Jia, J.; Wang, Y. P.; Zhang, Y. T.; Liang, L.; Liu, Q. F.; Ji, Z. L.; Chen, Y. Z.

2009-01-01

Knowledge of the kinetics of biomolecular interactions is important for facilitating the study of cellular processes and underlying molecular events, and is essential for quantitative study and simulation of biological systems. Kinetic Data of Bio-molecular Interaction database (KDBI) has been developed to provide information about experimentally determined kinetic data of protein–protein, protein–nucleic acid, protein–ligand, nucleic acid–ligand binding or reaction events described in the literature. To accommodate increasing demand for studying and simulating biological systems, numerous improvements and updates have been made to KDBI, including new ways to access data by pathway and molecule names, data files in Systems Biology Markup Language format, a more efficient search engine, access to published parameter sets of simulation models of 63 pathways, and a 2.3-fold increase of data (19 263 entries of 10 532 distinctive biomolecular binding and 11 954 interaction events, involving 2635 proteins/protein complexes, 847 nucleic acids, 1603 small molecules and 45 multi-step processes). KDBI is publicly available at http://bidd.nus.edu.sg/group/kdbi/kdbi.asp. PMID:18971255
BIND: the Biomolecular Interaction Network Database

PubMed Central

Bader, Gary D.; Betel, Doron; Hogue, Christopher W. V.

2003-01-01

The Biomolecular Interaction Network Database (BIND: http://bind.ca) archives biomolecular interaction, complex and pathway information. A web-based system is available to query, view and submit records. BIND continues to grow with the addition of individual submissions as well as interaction data from the PDB and a number of large-scale interaction and complex mapping experiments using yeast two hybrid, mass spectrometry, genetic interactions and phage display. We have developed a new graphical analysis tool that provides users with a view of the domain composition of proteins in interaction and complex records to help relate functional domains to protein interactions. An interaction network clustering tool has also been developed to help focus on regions of interest. Continued input from users has helped further mature the BIND data specification, which now includes the ability to store detailed information about genetic interactions. The BIND data specification is available as ASN.1 and XML DTD. PMID:12519993
Pathway collages: personalized multi-pathway diagrams.

PubMed

Paley, Suzanne; O'Maille, Paul E; Weaver, Daniel; Karp, Peter D

2016-12-13

Metabolic pathway diagrams are a classical way of visualizing a linked cascade of biochemical reactions. However, to understand some biochemical situations, viewing a single pathway is insufficient, whereas viewing the entire metabolic network results in information overload. How do we enable scientists to rapidly construct personalized multi-pathway diagrams that depict a desired collection of interacting pathways that emphasize particular pathway interactions? We define software for constructing personalized multi-pathway diagrams called pathway-collages using a combination of manual and automatic layouts. The user specifies a set of pathways of interest for the collage from a Pathway/Genome Database. Layouts for the individual pathways are generated by the Pathway Tools software, and are sent to a Javascript Pathway Collage application implemented using Cytoscape.js. That application allows the user to re-position pathways; define connections between pathways; change visual style parameters; and paint metabolomics, gene expression, and reaction flux data onto the collage to obtain a desired multi-pathway diagram. We demonstrate the use of pathway collages in two application areas: a metabolomics study of pathogen drug response, and an Escherichia coli metabolic model. Pathway collages enable facile construction of personalized multi-pathway diagrams.
Genes2Networks: connecting lists of gene symbols using mammalian protein interactions databases.

PubMed

Berger, Seth I; Posner, Jeremy M; Ma'ayan, Avi

2007-10-04

In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP), generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes. Genes2Networks is a software system that integrates the content of ten mammalian interaction network datasets. Filtering techniques to prune low-confidence interactions were implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from "seed" lists of human Entrez gene symbols. The output includes a dynamic, linkable, three-color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Genes2Networks is powerful web-based software that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes. This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in common pathways or protein complexes.
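One simple way to picture how a seed list can be connected through intermediate nodes is to collect the non-seed nodes that lie on short paths between pairs of seeds. The sketch below uses breadth-first search over an undirected edge list; it is an illustration under that assumption, not the algorithm used by Genes2Networks, and the function names, the `max_len` cutoff, and the toy edges are hypothetical. The statistical significance test mentioned in the abstract is not reproduced.

```python
from collections import deque
from itertools import combinations

def shortest_path(adj, start, goal):
    """Breadth-first search; returns one shortest path as a list of nodes, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in adj.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

def connect_seeds(edges, seeds, max_len=2):
    """Non-seed nodes on shortest paths of at most max_len edges between seed pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    intermediates = set()
    for s, t in combinations(sorted(seeds), 2):
        path = shortest_path(adj, s, t)
        if path is not None and len(path) - 1 <= max_len:
            intermediates.update(n for n in path[1:-1] if n not in seeds)
    return intermediates

# Hypothetical interactions: S1 and S2 share the partner X; S3 is far from both.
toy_edges = [("S1", "X"), ("X", "S2"), ("S2", "Y"), ("Y", "Z"),
             ("Z", "W"), ("W", "V"), ("V", "S3")]
result = connect_seeds(toy_edges, seeds={"S1", "S2", "S3"}, max_len=2)
print(result)  # {'X'}
```

Only X is reported, because the paths that reach S3 are longer than the cutoff of two edges.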
An overview of bioinformatics methods for modeling biological pathways in yeast

PubMed Central

Hou, Jie; Acharya, Lipi; Zhu, Dongxiao

2016-01-01

The advent of high-throughput genomics techniques, along with the completion of genome sequencing projects, identification of protein–protein interactions and reconstruction of genome-scale pathways, has accelerated the development of systems biology research in the yeast organism Saccharomyces cerevisiae. In particular, discovery of biological pathways in yeast has become an important forefront in systems biology, which aims to understand the interactions among molecules within a cell leading to certain cellular processes in response to a specific environment. While the existing theoretical and experimental approaches enable the investigation of well-known pathways involved in metabolism, gene regulation and signal transduction, bioinformatics methods offer new insights into computational modeling of biological pathways. A wide range of computational approaches has been proposed in the past for reconstructing biological pathways from high-throughput datasets. Here we review selected bioinformatics approaches for modeling biological pathways in S. cerevisiae, including metabolic pathways, gene-regulatory pathways and signaling pathways. We start with reviewing the research on biological pathways followed by discussing key biological databases. In addition, several representative computational approaches for modeling biological pathways in yeast are discussed. PMID:26476430
Identification of core pathways based on attractor and crosstalk in ischemic stroke.

PubMed

Diao, Xiufang; Liu, Aijuan

2018-02-01

Ischemic stroke is a leading cause of mortality and disability around the world. It is an important task to identify dysregulated pathways which infer molecular and functional insights existing in high-throughput experimental data. The gene expression profile E-GEOD-16561 was collected. Pathways were obtained from the Kyoto Encyclopedia of Genes and Genomes database, and Retrieval of Interacting Genes was used to download protein-protein interaction sets. Attractor and crosstalk approaches were applied to screen dysregulated pathways. A total of 20 differentially expressed genes were identified in ischemic stroke. Thirty-nine significant differential pathways were identified according to P<0.01, 28 pathways were identified with RP<0.01, and 17 pathways were identified with impact factor >250. On the basis of the three criteria, 11 significant dysfunctional pathways were identified. Among them, Epstein-Barr virus infection was the most significant differential pathway. In conclusion, with the method based on attractor and crosstalk, significantly dysfunctional pathways were identified. These pathways are expected to shed light on the molecular mechanism of ischemic stroke and represent novel potential therapeutic targets for ischemic stroke treatment.
Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology.

PubMed

Karp, Peter D; Latendresse, Mario; Paley, Suzanne M; Krummenacker, Markus; Ong, Quang D; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M; Caspi, Ron

2016-09-01

Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
ARMOUR - A Rice miRNA: mRNA Interaction Resource.

PubMed

Sanan-Mishra, Neeti; Tripathi, Anita; Goswami, Kavita; Shukla, Rohit N; Vasudevan, Madavan; Goswami, Hitesh

2018-01-01

ARMOUR was developed as A Rice miRNA:mRNA interaction resource. This informative and interactive database includes the experimentally validated expression profiles of miRNAs under different developmental and abiotic stress conditions across seven Indian rice cultivars. This comprehensive database covers 689 known and 1664 predicted novel miRNAs and their expression profiles in more than 38 different tissues or conditions along with their predicted/known target transcripts. The understanding of miRNA:mRNA interactome in regulation of functional cellular machinery is supported by the sequence information of the mature and hairpin structures. ARMOUR provides flexibility to users in querying the database using multiple ways like known gene identifiers, gene ontology identifiers, KEGG identifiers and also allows on the fly fold change analysis and sequence search query with inbuilt BLAST algorithm. ARMOUR database provides a cohesive platform for novel and mature miRNAs and their expression in different experimental conditions and allows searching for their interacting mRNA targets, GO annotation and their involvement in various biological pathways. The ARMOUR database includes a provision for adding more experimental data from users, with an aim to develop it as a platform for sharing and comparing experimental data contributed by research groups working on rice.
Key genes and pathways in measles and their interaction with environmental chemicals.

PubMed

Zhang, Rongqiang; Jiang, Hualin; Li, Fengying; Su, Ning; Ding, Yi; Mao, Xiang; Ren, Dan; Wang, Jing

2018-06-01

The aim of the present study was to explore key genes that may have a role in the pathology of measles virus infection and to clarify the interaction networks between environmental factors and differentially expressed genes (DEGs). After screening the database of the Gene Expression Omnibus of the National Center for Biotechnology Information, the dataset GSE5808 was downloaded and analyzed. A global normalization method was performed to minimize data inconsistencies and heterogeneity. DEGs during different stages of measles virus infection were explored using R software (v3.4.0). Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the DEGs were performed using Cytoscape 3.4.0 software. A protein-protein interaction (PPI) network of the DEGs was obtained from the STRING database v9.05. A total of 43 DEGs were obtained from four analyzed sample groups, including 10 highly expressed genes and 33 genes with decreased expression. The most enriched pathways based on KEGG analysis were fatty acid elongation, cytokine-cytokine receptor interaction and RNA degradation. The genes mentioned in the PPI network were mainly associated with protein binding and chemokine activity. A total of 219 chemicals were identified that may, jointly or on their own, interact with the 6 DEGs between the control group and patients with measles (at hospital entry), including benzo(a)pyrene (BaP) and tetrachlorodibenzodioxin (TCDD). In conclusion, the present study revealed that chemokines and environmental chemicals, e.g. BaP and TCDD, may affect the development of measles.
Kinetic Modeling using BioPAX ontology

PubMed Central

Ruebenacker, Oliver; Moraru, Ion. I.; Schaff, James C.; Blinov, Michael L.

2010-01-01

Thousands of biochemical interactions are available for download from curated databases such as Reactome, Pathway Interaction Database and other sources in the Biological Pathways Exchange (BioPAX) format. However, the BioPAX ontology does not encode the necessary information for kinetic modeling and simulation. The current standard for kinetic modeling is the Systems Biology Markup Language (SBML), but only a small number of models are available in SBML format in public repositories. Additionally, reusing and merging SBML models presents a significant challenge, because often each element has a value only in the context of the given model, and information encoding biological meaning is absent. We describe a software system that enables a variety of operations facilitating the use of BioPAX data to create kinetic models that can be visualized, edited, and simulated using the Virtual Cell (VCell), including improved conversion to SBML (for use with other simulation tools that support this format). PMID:20862270
Prediction of novel target genes and pathways involved in bevacizumab-resistant colorectal cancer

PubMed Central

Makondi, Precious Takondwa; Lee, Chia-Hwa; Huang, Chien-Yu; Chu, Chi-Ming; Chang, Yu-Jia

2018-01-01

Bevacizumab combined with cytotoxic chemotherapy is the backbone of metastatic colorectal cancer (mCRC) therapy; however, its treatment efficacy is hampered by therapeutic resistance. Therefore, understanding the mechanisms underlying bevacizumab resistance is crucial to increasing the therapeutic efficacy of bevacizumab. The Gene Expression Omnibus (GEO) database (dataset, GSE86525) was used to identify the key genes and pathways involved in bevacizumab-resistant mCRC. The GEO2R web tool was used to identify differentially expressed genes (DEGs). Functional and pathway enrichment analyses of the DEGs were performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). Protein–protein interaction (PPI) networks were established using the Search Tool for the Retrieval of Interacting Genes/Proteins database (STRING) and visualized using Cytoscape software. A total of 124 DEGs were obtained, 57 of which were upregulated and 67 of which were downregulated. PPI network analysis showed that seven upregulated genes and nine downregulated genes exhibited high PPI degrees. In the functional enrichment, the DEGs were mainly enriched in negative regulation of phosphate metabolic process and positive regulation of cell cycle process gene ontologies (GOs); the enriched pathways were the phosphoinositide 3-kinase-serine/threonine kinase signaling pathway, bladder cancer, and microRNAs in cancer. Cyclin-dependent kinase inhibitor 1A (CDKN1A), toll-like receptor 4 (TLR4), CD19 molecule (CD19), breast cancer 1, early onset (BRCA1), platelet-derived growth factor subunit A (PDGFA), and matrix metallopeptidase 1 (MMP1) were the DEGs involved in the pathways and the PPIs. The clinical validation of the DEGs in mCRC (TNM clinical stages 3 and 4) revealed that high PDGFA expression levels were associated with poor overall survival, whereas high BRCA1 and MMP1 expression levels were associated with favorable progression-free survival (PFS).
The identified genes and pathways can be potential targets and predictors of therapeutic resistance and prognosis in bevacizumab-treated patients with mCRC. PMID:29342159
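The "PPI degree" of a gene in such a network is the number of distinct partners it interacts with. A minimal sketch of that calculation follows; the edge list and the node names P1 to P5 are hypothetical placeholders, and the abstract does not state what degree cutoff counted as "high", so none is assumed here.

```python
from collections import Counter

def node_degrees(edges):
    """Degree of each node in an undirected network, ignoring duplicates and self-loops."""
    # frozenset makes ("P1", "P2") and ("P2", "P1") the same interaction.
    unique_pairs = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degrees = Counter()
    for pair in unique_pairs:
        for node in pair:
            degrees[node] += 1
    return degrees

# Hypothetical interactions, including a duplicate and a self-interaction.
toy_edges = [("P1", "P2"), ("P2", "P1"), ("P1", "P3"),
             ("P1", "P4"), ("P3", "P4"), ("P5", "P5")]
degrees = node_degrees(toy_edges)

# Rank nodes from most to least connected; ties are broken by name.
ranked = sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked)  # [('P1', 3), ('P3', 2), ('P4', 2), ('P2', 1)]
```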
The Listeria monocytogenes strain 10403S BioCyc database.

PubMed

Orsi, Renato H; Bergholz, Teresa M; Wiedmann, Martin; Boor, Kathryn J

2015-01-01

Listeria monocytogenes is a food-borne pathogen of humans and other animals. The striking ability to survive several stresses usually used for food preservation makes L. monocytogenes one of the biggest concerns to the food industry, while the high mortality of listeriosis in specific groups of humans makes it a great concern for public health. Previous studies have shown that a regulatory network involving alternative sigma (σ) factors and transcription factors is pivotal to stress survival. However, few studies have evaluated the metabolic networks controlled by these regulatory mechanisms. The L. monocytogenes BioCyc database uses the strain 10403S as a model. Computer-generated initial annotation for all genes also allowed for identification, annotation and display of predicted reactions and pathways carried out by a single cell. Further ongoing manual curation based on published data as well as database mining for selected genes allowed the more refined annotation of functions, which, in turn, allowed for annotation of new pathways and fine-tuning of previously defined pathways to more L. monocytogenes-specific pathways. Using RNA-Seq data, several transcription start sites and promoter regions were mapped to the 10403S genome and annotated within the database. Additionally, the identification of promoter regions and a comprehensive review of available literature allowed the annotation of several regulatory interactions involving σ factors and transcription factors. The L. monocytogenes 10403S BioCyc database is a new resource for researchers studying Listeria and related organisms.
It allows users to (i) have a comprehensive view of all reactions and pathways predicted to take place within the cell in the cellular overview, as well as to (ii) upload their own data, such as differential expression data, to visualize the data in the scope of predicted pathways and regulatory networks and to carry on enrichment analyses using several different annotations available within the database. © The Author(s) 2015. Published by Oxford University Press.
Exploring of the molecular mechanism of rhinitis via bioinformatics methods

PubMed Central

Song, Yufen; Yan, Zhaohui

2018-01-01

The aim of this study was to analyze gene expression profiles for exploring the function and regulatory network of differentially expressed genes (DEGs) in pathogenesis of rhinitis by a bioinformatics method. The gene expression profile of GSE43523 was downloaded from the Gene Expression Omnibus database. The dataset contained 7 seasonal allergic rhinitis samples and 5 non-allergic normal samples. DEGs between rhinitis samples and normal samples were identified via the limma package of R. The WebGestalt database was used to identify enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the DEGs. The differentially co-expressed pairs of the DEGs were identified via the DCGL package in R, and the differential co-expression network was constructed based on these pairs. A protein-protein interaction (PPI) network of the DEGs was constructed based on the Search Tool for the Retrieval of Interacting Genes database. A total of 263 DEGs were identified in rhinitis samples compared with normal samples, including 125 downregulated ones and 138 upregulated ones. The DEGs were enriched in 7 KEGG pathways. A total of 308 differential co-expression gene pairs were obtained. A differential co-expression network was constructed, containing 212 nodes. In total, 148 PPI pairs of the DEGs were identified, and a PPI network was constructed based on these pairs. Bioinformatics methods could help us identify significant genes and pathways related to the pathogenesis of rhinitis. Steroid biosynthesis pathway and metabolic pathways might play important roles in the development of allergic rhinitis (AR). Genes such as CDC42 effector protein 5, solute carrier family 39 member A11 and PR/SET domain 10 might be also associated with the pathogenesis of AR, which provided references for the molecular mechanisms of AR. PMID:29257233
An overview of computational life science databases & exchange formats of relevance to chemical biology research

PubMed Central

Hall, Aaron Smalter; Shan, Yunfeng; Lushington, Gerald; Visvanathan, Mahesh

2016-01-01

Databases and exchange formats describing biological entities such as chemicals and proteins, along with their relationships, are a critical component of research in life sciences disciplines, including chemical biology wherein information about small molecule properties converges with cellular and molecular biology. Databases for storing biological entities are growing not only in size, but also in type, with many similarities between them and often subtle differences. The data formats available to describe and exchange these entities are numerous as well. In general, each format is optimized for a particular purpose or database, and hence some understanding of these formats is required when choosing one for research purposes. This paper reviews a selection of different databases and data formats with the goal of summarizing their purposes, features, and limitations. Databases are reviewed under the categories of 1) protein interactions, 2) metabolic pathways, 3) chemical interactions, and 4) drug discovery. Representation formats will be discussed according to those describing chemical structures, and those describing genomic/proteomic entities. PMID:22934944
An overview of computational life science databases & exchange formats of relevance to chemical biology research.

PubMed

Smalter Hall, Aaron; Shan, Yunfeng; Lushington, Gerald; Visvanathan, Mahesh

2013-03-01

Databases and exchange formats describing biological entities such as chemicals and proteins, along with their relationships, are a critical component of research in life sciences disciplines, including chemical biology wherein information about small molecule properties converges with cellular and molecular biology. Databases for storing biological entities are growing not only in size, but also in type, with many similarities between them and often subtle differences. The data formats available to describe and exchange these entities are numerous as well. In general, each format is optimized for a particular purpose or database, and hence some understanding of these formats is required when choosing one for research purposes. This paper reviews a selection of different databases and data formats with the goal of summarizing their purposes, features, and limitations. Databases are reviewed under the categories of 1) protein interactions, 2) metabolic pathways, 3) chemical interactions, and 4) drug discovery. Representation formats will be discussed according to those describing chemical structures, and those describing genomic/proteomic entities.
GPCR & company: databases and servers for GPCRs and interacting partners.

PubMed

Kowalsman, Noga; Niv, Masha Y

2014-01-01

G-protein-coupled receptors (GPCRs) are a large superfamily of membrane receptors that are involved in a wide range of signaling pathways. To fulfill their tasks, GPCRs interact with a variety of partners, including small molecules, lipids and proteins. They are accompanied by different proteins during all phases of their life cycle. Therefore, GPCR interactions with their partners are of great interest in basic cell-signaling research and in drug discovery. Due to the rapid development of computers and internet communication, knowledge and data can be easily shared within the worldwide research community via freely available databases and servers. These provide an abundance of biological, chemical and pharmacological information. This chapter describes the available web resources for investigating GPCR interactions. We review about 40 freely available databases and servers, and provide a few sentences about the essence and the data they supply. For simplification, the databases and servers were grouped under the following topics: general GPCR-ligand interactions; particular families of GPCRs and their ligands; GPCR oligomerization; GPCR interactions with intracellular partners; and structural information on GPCRs. In conclusion, a multitude of useful tools are currently available. Summary tables are provided to ease navigation between the numerous and partially overlapping resources. Suggestions for future enhancements of the online tools include the addition of links from general to specialized databases and enabling usage of a user-supplied template for GPCR structural modeling.
Predicting Protein Relationships to Human Pathways through a Relational Learning Approach Based on Simple Sequence Features.

PubMed

García-Jiménez, Beatriz; Pons, Tirso; Sanchis, Araceli; Valencia, Alfonso

2014-01-01

Biological pathways are important elements of systems biology and in the past decade, an increasing number of pathway databases have been set up to document the growing understanding of complex cellular processes. Although more genome-sequence data are becoming available, a large fraction of it remains functionally uncharacterized. Thus, it is important to be able to predict the mapping of poorly annotated proteins to original pathway models. We have developed a Relational Learning-based Extension (RLE) system to investigate pathway membership through a function prediction approach that mainly relies on combinations of simple properties attributed to each protein. RLE searches for proteins with molecular similarities to specific pathway components. Using RLE, we associated 383 uncharacterized proteins to 28 pre-defined human Reactome pathways, demonstrating relative confidence after proper evaluation. Indeed, in specific cases manual inspection of the database annotations and the related literature supported the proposed classifications. Examples of possible additional components of the Electron transport system, Telomere maintenance and Integrin cell surface interactions pathways are discussed in detail. All predicted human proteins for Reactome releases 30 (2009) and 40 (2012) are available at http://rle.bioinfo.cnio.es.
A novel method to identify pathways associated with renal cell carcinoma based on a gene co-expression network

PubMed Central

RUAN, XIYUN; LI, HONGYUN; LIU, BO; CHEN, JIE; ZHANG, SHIBAO; SUN, ZEQIANG; LIU, SHUANGQING; SUN, FAHAI; LIU, QINGYONG

2015-01-01

The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson’s correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842, 371, 2,883 and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson’s correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient to identify pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425
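The Pearson's correlation step mentioned in the abstract above can be illustrated with a minimal sketch. It covers only the correlation-based detection of co-expressed pairs, not the EB, WGCNA or rank-based merging steps, and the gene names, expression values and the 0.9 threshold are hypothetical choices made for illustration rather than values from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sx * sy)


def coexpressed_pairs(expr, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold."""
    pairs = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            pairs.append((g1, g2, r))
    return pairs


# Hypothetical toy expression profiles (one value per sample).
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 1.0, 3.0, 2.0],
}

for g1, g2, r in coexpressed_pairs(expr):
    print(f"{g1} - {g2}: r = {r:.3f}")
```

With these toy profiles only the GENE_A and GENE_B pair passes the threshold, since GENE_B is an exact multiple of GENE_A.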
Functional Analysis of OMICs Data and Small Molecule Compounds in an Integrated "Knowledge-Based" Platform.

PubMed

Dubovenko, Alexey; Nikolsky, Yuri; Rakhmatulin, Eugene; Nikolskaya, Tatiana

2017-01-01

Analysis of NGS and other sequencing data, gene variants, gene expression, proteomics, and other high-throughput (OMICs) data is challenging because of its biological complexity and high level of technical and biological noise. One way to deal with both problems is to perform analysis with a high fidelity annotated knowledgebase of protein interactions, pathways, and functional ontologies. This knowledgebase has to be structured in a computer-readable format and must include software tools for managing experimental data, analysis, and reporting. Here, we present MetaCore™ and Key Pathway Advisor (KPA), an integrated platform for functional data analysis. On the content side, MetaCore and KPA encompass a comprehensive database of molecular interactions of different types, pathways, network models, and ten functional ontologies covering human, mouse, and rat genes. The analytical toolkit includes tools for gene/protein list enrichment analysis, statistical "interactome" tool for the identification of over- and under-connected proteins in the dataset, and a biological network analysis module made up of network generation algorithms and filters. The suite also features Advanced Search, an application for combinatorial search of the database content, as well as a Java-based tool called Pathway Map Creator for drawing and editing custom pathway maps. Applications of MetaCore and KPA include molecular mode of action of disease research, identification of potential biomarkers and drug targets, pathway hypothesis generation, analysis of biological effects for novel small molecule compounds and clinical applications (analysis of large cohorts of patients, and translational and personalized medicine).

Autophagy Regulatory Network - a systems-level bioinformatics resource for studying the mechanism and regulation of autophagy.

PubMed

Türei, Dénes; Földvári-Nagy, László; Fazekas, Dávid; Módos, Dezső; Kubisch, János; Kadlecsik, Tamás; Demeter, Amanda; Lenti, Katalin; Csermely, Péter; Vellai, Tibor; Korcsmáros, Tamás

2015-01-01

Autophagy is a complex cellular process having multiple roles, depending on tissue, physiological, or pathological conditions. Major post-translational regulators of autophagy are well known; however, they have not yet been collected comprehensively. The precise and context-dependent regulation of autophagy necessitates additional regulators, including transcriptional and post-transcriptional components that are listed in various datasets. Prompted by the lack of systems-level autophagy-related information, we manually collected the literature and integrated external resources to gain a high coverage autophagy database. We developed an online resource, Autophagy Regulatory Network (ARN; http://autophagy-regulation.org), to provide an integrated and systems-level database for autophagy research. ARN contains manually curated, imported, and predicted interactions of autophagy components (1,485 proteins with 4,013 interactions) in humans. We listed 413 transcription factors and 386 miRNAs that could regulate autophagy components or their protein regulators. We also connected the above-mentioned autophagy components and regulators with signaling pathways from the SignaLink 2 resource. The user-friendly website of ARN allows researchers without computational background to search, browse, and download the database. The database can be downloaded in SQL, CSV, BioPAX, SBML, PSI-MI, and Cytoscape CYS file formats. ARN has the potential to facilitate the experimental validation of novel autophagy components and regulators. In addition, ARN helps the investigation of transcription factors, miRNAs and signaling pathways implicated in the control of the autophagic pathway. The list of such known and predicted regulators could be important in pharmacological attempts against cancer and neurodegenerative diseases.
Analysis of molecular pathways in pancreatic ductal adenocarcinomas with a bioinformatics approach.

PubMed

Wang, Yan; Li, Yan

2015-01-01

Pancreatic ductal adenocarcinoma (PDAC) is a leading cause of cancer death worldwide. Our study aimed to reveal molecular mechanisms. Microarray data of GSE15471 (including 39 matching pairs of pancreatic tumor tissues and patient-matched normal tissues) was downloaded from Gene Expression Omnibus (GEO) database. We identified differentially expressed genes (DEGs) in PDAC tissues compared with normal tissues by limma package in R language. Then GO and KEGG pathway enrichment analyses were conducted with online DAVID. In addition, principal component analysis was performed and a protein-protein interaction network was constructed to study relationships between the DEGs through database STRING. A total of 532 DEGs were identified in the 38 PDAC tissues compared with 33 normal tissues. The results of principal component analysis of the top 20 DEGs could differentiate the PDAC tissues from normal tissues directly. In the PPI network, 8 of the 20 DEGs were all key genes of the collagen family. Additionally, FN1 (fibronectin 1) was also a hub node in the network. The genes of the collagen family as well as FN1 were significantly enriched in complement and coagulation cascades, ECM-receptor interaction and focal adhesion pathways. Our results suggest that genes of collagen family and FN1 may play an important role in PDAC progression. Meanwhile, these DEGs and enriched pathways, such as complement and coagulation cascades, ECM-receptor interaction and focal adhesion may be important molecular mechanisms involved in the development and progression of PDAC.
PathwayAccess: CellDesigner plugins for pathway databases.

PubMed

Van Hemert, John L; Dickerson, Julie A

2010-09-15

CellDesigner provides a user-friendly interface for graphical biochemical pathway description. Many pathway databases are not directly exportable to CellDesigner models. PathwayAccess is an extensible suite of CellDesigner plugins, which connect CellDesigner directly to pathway databases using respective Java application programming interfaces. The process is streamlined for creating new PathwayAccess plugins for specific pathway databases. Three PathwayAccess plugins, MetNetAccess, BioCycAccess and ReactomeAccess, directly connect CellDesigner to the pathway databases MetNetDB, BioCyc and Reactome. PathwayAccess plugins enable CellDesigner users to expose pathway data to analytical CellDesigner functions, curate their pathway databases and visually integrate pathway data from different databases using standard Systems Biology Markup Language and Systems Biology Graphical Notation. Implemented in Java, PathwayAccess plugins run with CellDesigner version 4.0.1 and were tested on Ubuntu Linux, Windows XP and 7, and MacOSX. Source code, binaries, documentation and video walkthroughs are freely available at http://vrac.iastate.edu/~jlv.
Target identification in Fusobacterium nucleatum by subtractive genomics approach and enrichment analysis of host-pathogen protein-protein interactions.

PubMed

Kumar, Amit; Thotakura, Pragna Lakshmi; Tiwary, Basant Kumar; Krishna, Ramadas

2016-05-12

Fusobacterium nucleatum, a well studied bacterium in periodontal diseases, appendicitis, gingivitis, osteomyelitis and pregnancy complications has recently gained attention due to its association with colorectal cancer (CRC) progression. Treatment with berberine was shown to reverse F. nucleatum-induced CRC progression in mice by balancing the growth of opportunistic pathogens in tumor microenvironment. Intestinal microbiota imbalance and the infections caused by F. nucleatum might be regulated by therapeutic intervention. Hence, we aimed to predict drug target proteins in F. nucleatum, through subtractive genomics approach and host-pathogen protein-protein interactions (HP-PPIs). We also carried out enrichment analysis of host interacting partners to hypothesize the possible mechanisms involved in CRC progression due to F. nucleatum. In subtractive genomics approach, the essential, virulence and resistance related proteins were retrieved from RefSeq proteome of F. nucleatum by searching against Database of Essential Genes (DEG), Virulence Factor Database (VFDB) and Antibiotic Resistance Gene-ANNOTation (ARG-ANNOT) tool respectively. A subsequent hierarchical screening to identify non-human homologous, metabolic pathway-independent/pathway-specific and druggable proteins resulted in eight pathway-independent and 27 pathway-specific druggable targets. Co-aggregation of F. nucleatum with host induces proinflammatory gene expression thereby potentiates tumorigenesis. Hence, proteins from IBDsite, a database for inflammatory bowel disease (IBD) research and those involved in colorectal adenocarcinoma as interpreted from The Cancer Genome Atlas (TCGA) were retrieved to predict drug targets based on HP-PPIs with F. nucleatum proteome. Prediction of HP-PPIs exhibited 186 interactions contributed by 103 host and 76 bacterial proteins. Bacterial interacting partners were accounted as putative targets. 
Enrichment analysis of host interacting partners showed statistically enriched terms that were in positive correlation with CRC, atherosclerosis, cardiovascular, osteoporosis, Alzheimer's and other diseases. Subtractive genomics analysis provided a set of target proteins suggested to be indispensable for survival and pathogenicity of F. nucleatum. These target proteins might be considered for designing potent inhibitors to abrogate F. nucleatum infections. From enrichment analysis, it was hypothesized that F. nucleatum infection might enhance CRC progression by simultaneously regulating multiple signaling cascades which could lead to up-regulation of proinflammatory responses, oncogenes, modulation of host immune defense mechanism and suppression of the DNA repair system.
An overview of bioinformatics methods for modeling biological pathways in yeast.

PubMed

Hou, Jie; Acharya, Lipi; Zhu, Dongxiao; Cheng, Jianlin

2016-03-01

The advent of high-throughput genomics techniques, along with the completion of genome sequencing projects, identification of protein-protein interactions and reconstruction of genome-scale pathways, has accelerated the development of systems biology research in the yeast organism Saccharomyces cerevisiae. In particular, discovery of biological pathways in yeast has become an important forefront in systems biology, which aims to understand the interactions among molecules within a cell leading to certain cellular processes in response to a specific environment. While the existing theoretical and experimental approaches enable the investigation of well-known pathways involved in metabolism, gene regulation and signal transduction, bioinformatics methods offer new insights into computational modeling of biological pathways. A wide range of computational approaches has been proposed in the past for reconstructing biological pathways from high-throughput datasets. Here we review selected bioinformatics approaches for modeling biological pathways in S. cerevisiae, including metabolic pathways, gene-regulatory pathways and signaling pathways. We start with reviewing the research on biological pathways followed by discussing key biological databases. In addition, several representative computational approaches for modeling biological pathways in yeast are discussed. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.
HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species.

PubMed

López, Yosvany; Nakai, Kenta; Patil, Ashwini

2015-01-01

HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of physical, genetic and predicted interactions. Automated integration of interactions is further complicated by varying levels of accuracy of database content and lack of adherence to standard formats. To address these issues, the latest version of HitPredict provides a manually curated dataset of 398 696 physical associations between 70 808 proteins from 105 species. Manual confirmation was used to resolve all issues encountered during data integration. For improved reliability assessment, this version combines a new score derived from the experimental information of the interactions with the original score based on the features of the interacting proteins. The combined interaction score performs better than either of the individual scores in HitPredict as well as the reliability score of another similar database. HitPredict provides a web interface to search proteins and visualize their interactions, and the data can be downloaded for offline analysis. Data usability has been enhanced by mapping protein identifiers across multiple reference databases. Thus, the latest version of HitPredict provides a significantly larger, more reliable and usable dataset of protein-protein interactions from several species for the study of gene groups. Database URL: http://hintdb.hgc.jp/htp. © The Author(s) 2015. Published by Oxford University Press.
The BioGRID Interaction Database: 2011 update

PubMed Central

Stark, Chris; Breitkreutz, Bobby-Joe; Chatr-aryamontri, Andrew; Boucher, Lorrie; Oughtred, Rose; Livstone, Michael S.; Nixon, Julie; Van Auken, Kimberly; Wang, Xiaodong; Shi, Xiaoqi; Reguly, Teresa; Rust, Jennifer M.; Winter, Andrew; Dolinski, Kara; Tyers, Mike

2011-01-01

The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans (http://www.thebiogrid.org). BioGRID currently holds 347 966 interactions (170 162 genetic, 177 804 protein) curated from both high-throughput data sets and individual focused studies, as derived from over 23 000 publications in the primary literature. Complete coverage of the entire literature is maintained for budding yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe) and thale cress (Arabidopsis thaliana), and efforts to expand curation across multiple metazoan species are underway. The BioGRID houses 48 831 human protein interactions that have been curated from 10 247 publications. Current curation drives are focused on particular areas of biology to enable insights into conserved networks and pathways that are relevant to human health. The BioGRID 3.0 web interface contains new search and display features that enable rapid queries across multiple data types and sources. An automated Interaction Management System (IMS) is used to prioritize, coordinate and track curation across international sites and projects. BioGRID provides interaction data to several model organism databases, resources such as Entrez-Gene and other interaction meta-databases. The entire BioGRID 3.0 data collection may be downloaded in multiple file formats, including PSI MI XML. Source code for BioGRID 3.0 is freely available without any restrictions. PMID:21071413
The Comparative Toxicogenomics Database: update 2017.

PubMed

Davis, Allan Peter; Grondin, Cynthia J; Johnson, Robin J; Sciaky, Daniela; King, Benjamin L; McMorran, Roy; Wiegers, Jolene; Wiegers, Thomas C; Mattingly, Carolyn J

2017-01-04

The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) provides information about interactions between chemicals and gene products, and their relationships to diseases. Core CTD content (chemical-gene, chemical-disease and gene-disease interactions manually curated from the literature) are integrated with each other as well as with select external datasets to generate expanded networks and predict novel associations. Today, core CTD includes more than 30.5 million toxicogenomic connections relating chemicals/drugs, genes/proteins, diseases, taxa, Gene Ontology (GO) annotations, pathways, and gene interaction modules. In this update, we report a 33% increase in our core data content since 2015, describe our new exposure module (that harmonizes exposure science information with core toxicogenomic data) and introduce a novel dataset of GO-disease inferences (that identify common molecular underpinnings for seemingly unrelated pathologies). These advancements centralize and contextualize real-world chemical exposures with molecular pathways to help scientists generate testable hypotheses in an effort to understand the etiology and mechanisms underlying environmentally influenced diseases. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
PathNER: a tool for systematic identification of biological pathway mentions in the literature

PubMed Central

2013-01-01

Background Biological pathways are central to many biomedical studies and are frequently discussed in the literature. Several curated databases have been established to collate the knowledge of molecular processes constituting pathways. Yet, there has been little focus on enabling systematic detection of pathway mentions in the literature. Results We developed a tool, named PathNER (Pathway Named Entity Recognition), for the systematic identification of pathway mentions in the literature. PathNER is based on soft dictionary matching and rules, with the dictionary generated from public pathway databases. The rules utilise general pathway-specific keywords, syntactic information and gene/protein mentions. Detection results from both components are merged. On a gold-standard corpus, PathNER achieved an F1-score of 84%. To illustrate its potential, we applied PathNER on a collection of articles related to Alzheimer's disease to identify associated pathways, highlighting cases that can complement an existing manually curated knowledgebase. Conclusions In contrast to existing text-mining efforts that target the automatic reconstruction of pathway details from molecular interactions mentioned in the literature, PathNER focuses on identifying specific named pathway mentions. These mentions can be used to support large-scale curation and pathway-related systems biology applications, as demonstrated in the example of Alzheimer's disease. PathNER is implemented in Java and made freely available online at http://sourceforge.net/projects/pathner/. PMID:24555844
Key genes and pathways in measles and their interaction with environmental chemicals

PubMed Central

Zhang, Rongqiang; Jiang, Hualin; Li, Fengying; Su, Ning; Ding, Yi; Mao, Xiang; Ren, Dan; Wang, Jing

2018-01-01

The aim of the present study was to explore key genes that may have a role in the pathology of measles virus infection and to clarify the interaction networks between environmental factors and differentially expressed genes (DEGs). After screening the database of the Gene Expression Omnibus of the National Center for Biotechnology Information, the dataset GSE5808 was downloaded and analyzed. A global normalization method was performed to minimize data inconsistencies and heterogeneity. DEGs during different stages of measles virus infection were explored using R software (v3.4.0). Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the DEGs were performed using Cytoscape 3.4.0 software. A protein-protein interaction (PPI) network of the DEGs was obtained from the STRING database v9.05. A total of 43 DEGs were obtained from four analyzed sample groups, including 10 highly expressed genes and 33 genes with decreased expression. The most enriched pathways based on KEGG analysis were fatty acid elongation, cytokine-cytokine receptor interaction and RNA degradation. The genes mentioned in the PPI network were mainly associated with protein binding and chemokine activity. A total of 219 chemicals were identified that may, jointly or on their own, interact with the 6 DEGs between the control group and patients with measles (at hospital entry), including benzo(a)pyrene (BaP) and tetrachlorodibenzodioxin (TCDD). In conclusion, the present study revealed that chemokines and environmental chemicals, e.g. BaP and TCDD, may affect the development of measles. PMID:29805511
An advanced web query interface for biological databases

PubMed Central

Latendresse, Mario; Karp, Peter D.

2010-01-01

Although most web-based biological databases (DBs) offer some type of web-based form to allow users to author DB queries, these query forms are quite restricted in the complexity of DB queries that they can formulate. They can typically query only one DB, and can query only a single type of object at a time (e.g. genes) with no possible interaction between the objects—that is, in SQL parlance, no joins are allowed between DB objects. Writing precise queries against biological DBs is usually left to a programmer skillful enough in complex DB query languages like SQL. We present a web interface for building precise queries for biological DBs that can construct much more precise queries than most web-based query forms, yet that is user friendly enough to be used by biologists. It supports queries containing multiple conditions, and connecting multiple object types without using the join concept, which is unintuitive to biologists. This interactive web interface is called the Structured Advanced Query Page (SAQP). Users interactively build up a wide range of query constructs. Interactive documentation within the SAQP describes the schema of the queried DBs. The SAQP is based on BioVelo, a query language based on list comprehension. The SAQP is part of the Pathway Tools software and is available as part of several bioinformatics web sites powered by Pathway Tools, including the BioCyc.org site that contains more than 500 Pathway/Genome DBs. PMID:20624715
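The idea of a query language based on list comprehension, as described in the abstract above, can be illustrated with a rough sketch. This is plain Python and not BioVelo syntax. The gene and pathway records, their field names and the filter condition are all hypothetical and serve only to show how one comprehension can combine multiple conditions across two object types without an explicit join.

```python
# Hypothetical in-memory records standing in for two database object types.
genes = [
    {"id": "g1", "name": "geneA", "pathway_ids": ["p1"]},
    {"id": "g2", "name": "geneB", "pathway_ids": ["p1", "p2"]},
    {"id": "g3", "name": "geneC", "pathway_ids": []},
]
pathways = {
    "p1": "tryptophan biosynthesis",
    "p2": "serine degradation",
}

# One comprehension walks genes, follows their pathway links and filters on
# a property of the linked pathway; no explicit join clause is written.
result = [
    (gene["name"], pathways[pid])
    for gene in genes
    for pid in gene["pathway_ids"]
    if "tryptophan" in pathways[pid]
]

print(result)
```

The nested `for` clauses play the role that a join would play in SQL, which is the kind of query a single-object web form typically cannot express.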
Groups: knowledge spreadsheets for symbolic biocomputing.

PubMed

Travers, Michael; Paley, Suzanne M; Shrager, Jeff; Holland, Timothy A; Karp, Peter D

2013-01-01

Knowledge spreadsheets (KSs) are a visual tool for interactive data analysis and exploration. They differ from traditional spreadsheets in that rather than being oriented toward numeric data, they work with symbolic knowledge representation structures and provide operations that take into account the semantics of the application domain. 'Groups' is an implementation of KSs within the Pathway Tools system. Groups allows Pathway Tools users to define a group of objects (e.g. groups of genes or metabolites) from a Pathway/Genome Database. Groups can be transformed (e.g. by transforming a metabolite group to the group of pathways in which those metabolites are substrates); combined through set operations; analysed (e.g. through enrichment analysis); and visualized (e.g. by painting onto a metabolic map diagram). Users of the Pathway Tools-based BioCyc.org website have made extensive use of Groups, and an informal survey of Groups users suggests that Groups has achieved the goal of allowing biologists themselves to perform some data manipulations that previously would have required the assistance of a programmer. Database URL: BioCyc.org.
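The group operations described in the abstract above (transformation and set combination) can be sketched with ordinary Python sets. This is an illustration of the concept only, not the Pathway Tools implementation. The metabolite and pathway identifiers and the `substrate_of` mapping are hypothetical.

```python
# Hypothetical mapping: metabolite -> pathways in which it is a substrate.
substrate_of = {
    "metabolite_1": {"pathway_X", "pathway_Y"},
    "metabolite_2": {"pathway_Y"},
    "metabolite_3": {"pathway_Z"},
}


def transform_to_pathways(metabolites):
    """Transform a metabolite group into the group of pathways using them."""
    return set().union(*(substrate_of.get(m, set()) for m in metabolites))


group_a = {"metabolite_1", "metabolite_2"}
group_b = {"metabolite_2", "metabolite_3"}

pathways_a = transform_to_pathways(group_a)
pathways_b = transform_to_pathways(group_b)

# Groups combined through set operations.
shared = pathways_a & pathways_b    # pathways reached from both groups
either = pathways_a | pathways_b    # pathways reached from at least one group
only_a = pathways_a - pathways_b    # pathways reached only from group A

print(sorted(shared), sorted(either), sorted(only_a))
```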
Pathway Distiller - multisource biological pathway consolidation

PubMed Central

2012-01-01

Background: One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets. Methods: After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity to find static pathway clusters independent of any given experiment. Results: We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. Additionally, a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to allow researchers access to the methods and example microarray data described in this manuscript, and the ability to analyze their own gene list by using our unique consolidation methods. Conclusions: By combining several pathway systems, implementing different but complementary pathway consolidation methods, and providing a user-friendly web-accessible tool, we have enabled users to extract functional explanations of their genome-wide experiments. PMID:23134636
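The idea behind Enrichment Consolidation (iteratively combining pathways whose signature gene sets are similar) can be sketched as a greedy merge on Jaccard similarity. The similarity measure, the 0.5 threshold, the merge order and all pathway and gene names below are assumptions made for illustration; the abstract does not specify the authors' actual procedure.

```python
def jaccard(a, b):
    """Jaccard similarity of two gene sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def consolidate(pathways, signature, threshold=0.5):
    """Greedily merge pathways whose signature-gene overlaps are similar.

    pathways  : dict mapping pathway name -> iterable of gene symbols
    signature : iterable of genes from the experiment
    Returns a list of (set of pathway names, set of signature genes).
    """
    signature = set(signature)
    # Restrict every pathway to the genes that appear in the signature list.
    clusters = [({name}, set(genes) & signature) for name, genes in pathways.items()]
    clusters = [c for c in clusters if c[1]]  # drop pathways with no overlap

    merged = True
    while merged:  # repeat until no pair is similar enough to merge
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if jaccard(clusters[i][1], clusters[j][1]) >= threshold:
                    clusters[i] = (
                        clusters[i][0] | clusters[j][0],
                        clusters[i][1] | clusters[j][1],
                    )
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


# Hypothetical input: three pathways and a four-gene signature.
toy_pathways = {
    "pathway_A": {"g1", "g2", "g3"},
    "pathway_B": {"g1", "g2", "g4"},
    "pathway_C": {"g7", "g8"},
}
concepts = consolidate(toy_pathways, signature={"g1", "g2", "g3", "g7"})
```

In this toy input, pathway_A and pathway_B share two of three signature genes and collapse into one concept, while pathway_C stays separate.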
Pathway Distiller - multisource biological pathway consolidation.

PubMed

Doderer, Mark S; Anguiano, Zachry; Suresh, Uthra; Dashnamoorthy, Ravi; Bishop, Alexander J R; Chen, Yidong

2012-01-01

One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets. After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary, but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity, that finds static pathway clusters independent of any given experiment. We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. 
Additionally, a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to allow researchers access to the methods and example microarray data described in this manuscript, and the ability to analyze their own gene list by using our unique consolidation methods. By combining several pathway systems, implementing different but complementary pathway consolidation methods, and providing a user-friendly web-accessible tool, we have enabled users to extract functional explanations of their genome-wide experiments.
Reconstruction of a Functional Human Gene Network, with an Application for Prioritizing Positional Candidate Genes

PubMed Central

Franke, Lude; Bakel, Harm van; Fokkens, Like; de Jong, Edwin D.; Egmont-Petersen, Michael; Wijmenga, Cisca

2006-01-01

Most common genetic disorders have a complex inheritance and may result from variants in many genes, each contributing only weak effects to the disease. Pinpointing these disease genes within the myriad of susceptibility loci identified in linkage studies is difficult because these loci may contain hundreds of genes. However, in any disorder, most of the disease genes will be involved in only a few different molecular pathways. If we know something about the relationships between the genes, we can assess whether some genes (which may reside in different loci) functionally interact with each other, indicating a joint basis for the disease etiology. There are various repositories of information on pathway relationships. To consolidate this information, we developed a functional human gene network that integrates information on genes and the functional relationships between genes, based on data from the Kyoto Encyclopedia of Genes and Genomes, the Biomolecular Interaction Network Database, Reactome, the Human Protein Reference Database, the Gene Ontology database, predicted protein-protein interactions, human yeast two-hybrid interactions, and microarray coexpressions. We applied this network to interrelate positional candidate genes from different disease loci and then tested 96 heritable disorders for which the Online Mendelian Inheritance in Man database reported at least three disease genes. Artificial susceptibility loci, each containing 100 genes, were constructed around each disease gene, and we used the network to rank these genes on the basis of their functional interactions. By following up the top five genes per artificial locus, we were able to detect at least one known disease gene in 54% of the loci studied, representing a 2.8-fold increase over random selection. 
This suggests that our method can significantly reduce the cost and effort of pinpointing true disease genes in analyses of disorders for which numerous loci have been reported but for which most of the genes are unknown. PMID:16685651
LASSO-ing Potential Nuclear Receptor Agonists and Antagonists: A New Computational Method for Database Screening

EPA Science Inventory

Nuclear receptors (NRs) are important biological macromolecular transcription factors that are implicated in multiple biological pathways and may interact with other xenobiotics that are endocrine disruptors present in the environment. Examples of important NRs include the androg...
Investigation of candidate genes for osteoarthritis based on gene expression profiles.

PubMed

Dong, Shuanghai; Xia, Tian; Wang, Lei; Zhao, Qinghua; Tian, Jiwei

2016-12-01

To explore the mechanism of osteoarthritis (OA) and provide valid biological information for further investigation. Gene expression profile of GSE46750 was downloaded from Gene Expression Omnibus database. The Linear Models for Microarray Data (limma) package (Bioconductor project, http://www.bioconductor.org/packages/release/bioc/html/limma.html) was used to identify differentially expressed genes (DEGs) in inflamed OA samples. Gene Ontology function enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways enrichment analysis of DEGs were performed based on Database for Annotation, Visualization and Integrated Discovery data, and protein-protein interaction (PPI) network was constructed based on the Search Tool for the Retrieval of Interacting Genes/Proteins database. Regulatory network was screened based on Encyclopedia of DNA Elements. Molecular Complex Detection was used for sub-network screening. Two sub-networks with highest node degree were integrated with transcriptional regulatory network and KEGG functional enrichment analysis was processed for 2 modules. In total, 401 up- and 196 down-regulated DEGs were obtained. Up-regulated DEGs were involved in inflammatory response, while down-regulated DEGs were involved in cell cycle. PPI network with 2392 protein interactions was constructed. Moreover, 10 genes including Interleukin 6 (IL6) and Aurora B kinase (AURKB) were found to be outstanding in PPI network. There are 214 up- and 8 down-regulated transcription factor (TF)-target pairs in the TF regulatory network. Module 1 had TFs including SPI1, PRDM1, and FOS, while module 2 contained FOSL1. The nodes in module 1 were enriched in chemokine signaling pathway, while the nodes in module 2 were mainly enriched in cell cycle. 
The screened DEGs including IL6, AGT, and AURKB might be potential biomarkers for gene therapy for OA by being regulated by TFs such as FOS and SPI1, and participating in the cell cycle and cytokine-cytokine receptor interaction pathway. Copyright © 2016 Turkish Association of Orthopaedics and Traumatology. Production and hosting by Elsevier B.V. All rights reserved.
A Novel Biclustering Approach to Association Rule Mining for Predicting HIV-1–Human Protein Interactions

PubMed Central

Mukhopadhyay, Anirban; Maulik, Ujjwal; Bandyopadhyay, Sanghamitra

2012-01-01

Identification of potential viral-host protein interactions is a vital and useful approach towards development of new drugs targeting those interactions. In recent days, computational tools are being utilized for predicting viral-host interactions. Recently a database containing records of experimentally validated interactions between a set of HIV-1 proteins and a set of human proteins has been published. The problem of predicting new interactions based on this database is usually posed as a classification problem. However, posing the problem as a classification one suffers from the lack of biologically validated negative interactions. Therefore it will be beneficial to use the existing database for predicting new viral-host interactions without the need of negative samples. Motivated by this, in this article, the HIV-1–human protein interaction database has been analyzed using association rule mining. The main objective is to identify a set of association rules both among the HIV-1 proteins and among the human proteins, and use these rules for predicting new interactions. In this regard, a novel association rule mining technique based on biclustering has been proposed for discovering frequent closed itemsets followed by the association rules from the adjacency matrix of the HIV-1–human interaction network. Novel HIV-1–human interactions have been predicted based on the discovered association rules and tested for biological significance. For validation of the predicted new interactions, gene ontology-based and pathway-based studies have been performed. These studies show that the human proteins which are predicted to interact with a particular viral protein share many common biological activities. Moreover, literature survey has been used for validation purpose to identify some predicted interactions that are already validated experimentally but not present in the database. Comparison with other prediction methods is also discussed. PMID:22539940
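To make the association-rule idea concrete, the sketch below computes support and confidence for a rule among human proteins, treating each viral protein's set of interaction partners as one transaction. It is a plain rule evaluation, not the biclustering-based technique proposed in the paper, and the identifiers (v1..v4, h1..h3) and the 0.6 confidence cut-off are invented placeholders.

```python
# Hypothetical adjacency data: viral protein -> human proteins it interacts with.
INTERACTIONS = {
    "v1": {"h1", "h2", "h3"},
    "v2": {"h1", "h2"},
    "v3": {"h1", "h3"},
    "v4": {"h2"},
}


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / base


def predict(transactions, antecedent, consequent, min_conf):
    """Viral proteins that match the antecedent but lack the consequent,
    reported only when the rule is confident enough."""
    if confidence(antecedent, consequent, transactions) < min_conf:
        return []
    ante, cons = set(antecedent), set(consequent)
    return sorted(
        v for v, items in transactions.items()
        if ante <= items and not cons <= items
    )


# Rule {h1} -> {h2}: candidates are viral proteins that bind h1 but not yet h2.
candidates = predict(INTERACTIONS, {"h1"}, {"h2"}, min_conf=0.6)
```

Because only observed (positive) interactions enter the calculation, no negative examples are needed, which is the motivation the abstract gives for preferring rule mining over classification.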
DenHunt - A Comprehensive Database of the Intricate Network of Dengue-Human Interactions

PubMed Central

Arjunan, Selvam; Sastri, Narayan P.; Chandra, Nagasuma

2016-01-01

Dengue virus (DENV) is a human pathogen and its etiology has been widely established. There are many interactions between DENV and human proteins that have been reported in literature. However, no publicly accessible resource for efficiently retrieving the information is yet available. In this study, we mined all publicly available dengue–human interactions that have been reported in the literature into a database called DenHunt. We retrieved 682 direct interactions of human proteins with dengue viral components, 382 indirect interactions and 4120 differentially expressed human genes in dengue infected cell lines and patients. We have illustrated the importance of DenHunt by mapping the dengue–human interactions on to the host interactome and observed that the virus targets multiple host functional complexes of important cellular processes such as metabolism, immune system and signaling pathways suggesting a potential role of these interactions in viral pathogenesis. We also observed that 7 percent of the dengue virus interacting human proteins are also associated with other infectious and non-infectious diseases. Finally, the understanding that comes from such analyses could be used to design better strategies to counteract the diseases caused by dengue virus. The whole dataset has been catalogued in a searchable database, called DenHunt (http://proline.biochem.iisc.ernet.in/DenHunt/). PMID:27618709
DenHunt - A Comprehensive Database of the Intricate Network of Dengue-Human Interactions.

PubMed

Karyala, Prashanthi; Metri, Rahul; Bathula, Christopher; Yelamanchi, Syam K; Sahoo, Lipika; Arjunan, Selvam; Sastri, Narayan P; Chandra, Nagasuma

2016-09-01

Dengue virus (DENV) is a human pathogen and its etiology has been widely established. There are many interactions between DENV and human proteins that have been reported in literature. However, no publicly accessible resource for efficiently retrieving the information is yet available. In this study, we mined all publicly available dengue-human interactions that have been reported in the literature into a database called DenHunt. We retrieved 682 direct interactions of human proteins with dengue viral components, 382 indirect interactions and 4120 differentially expressed human genes in dengue infected cell lines and patients. We have illustrated the importance of DenHunt by mapping the dengue-human interactions on to the host interactome and observed that the virus targets multiple host functional complexes of important cellular processes such as metabolism, immune system and signaling pathways suggesting a potential role of these interactions in viral pathogenesis. We also observed that 7 percent of the dengue virus interacting human proteins are also associated with other infectious and non-infectious diseases. Finally, the understanding that comes from such analyses could be used to design better strategies to counteract the diseases caused by dengue virus. The whole dataset has been catalogued in a searchable database, called DenHunt (http://proline.biochem.iisc.ernet.in/DenHunt/).

Identification of key genes and pathways associated with neuropathic pain in uninjured dorsal root ganglion by using bioinformatic analysis.

PubMed

Chen, Chao-Jin; Liu, De-Zhao; Yao, Wei-Feng; Gu, Yu; Huang, Fei; Hei, Zi-Qing; Li, Xiang

2017-01-01

Neuropathic pain is a complex chronic condition occurring post-nervous system damage. The transcriptional reprogramming of injured dorsal root ganglia (DRGs) drives neuropathic pain. However, few comparative analyses using high-throughput platforms have investigated uninjured DRG in neuropathic pain, and potential interactions among differentially expressed genes (DEGs) and pathways were not taken into consideration. The aim of this study was to identify changes in genes and pathways associated with neuropathic pain in uninjured L4 DRG after L5 spinal nerve ligation (SNL) by using bioinformatic analysis. The microarray profile GSE24982 was downloaded from the Gene Expression Omnibus database to identify DEGs between DRGs in SNL and sham rats. The prioritization for these DEGs was performed using the Toppgene database followed by gene ontology and pathway enrichment analyses. The relationships among DEGs from the protein interactive perspective were analyzed using protein-protein interaction (PPI) network and module analysis. Real-time polymerase chain reaction (PCR) and Western blotting were used to confirm the expression of DEGs in the rodent neuropathic pain model. A total of 206 DEGs that might play a role in neuropathic pain were identified in L4 DRG, of which 75 were upregulated and 131 were downregulated. The upregulated DEGs were enriched in biological processes related to transcription regulation and molecular functions such as DNA binding, cell cycle, and the FoxO signaling pathway. Ctnnb1 protein had the highest connectivity degrees in the PPI network. The in vivo studies also validated that mRNA and protein levels of Ctnnb1 were upregulated in both L4 and L5 DRGs. This study provides insight into the functional gene sets and pathways associated with neuropathic pain in L4 uninjured DRG after L5 SNL, which might promote our understanding of the molecular mechanisms underlying the development of neuropathic pain.
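The "connectivity degree" used to rank proteins in a PPI network is simply the number of distinct interaction partners per node. A minimal sketch follows; the edge list is a made-up toy network (the partner names GeneA..GeneC are placeholders, not results from the study).

```python
from collections import Counter


def node_degrees(edges):
    """Count distinct partners per node, ignoring duplicate edges and self-loops."""
    unique = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degrees = Counter()
    for pair in unique:
        for node in pair:
            degrees[node] += 1
    return degrees


# Hypothetical undirected interactions; the last edge duplicates the second.
toy_edges = [
    ("Ctnnb1", "GeneA"),
    ("Ctnnb1", "GeneB"),
    ("Ctnnb1", "GeneC"),
    ("GeneA", "GeneB"),
    ("GeneB", "Ctnnb1"),
]

degrees = node_degrees(toy_edges)
hub, hub_degree = degrees.most_common(1)[0]  # node with the highest degree
```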
SuperTarget goes quantitative: update on drug–target interactions

PubMed Central

Hecker, Nikolai; Ahmed, Jessica; von Eichborn, Joachim; Dunkel, Mathias; Macha, Karel; Eckert, Andreas; Gilson, Michael K.; Bourne, Philip E.; Preissner, Robert

2012-01-01

There are at least two good reasons for the on-going interest in drug–target interactions: first, drug-effects can only be fully understood by considering a complex network of interactions to multiple targets (so-called off-target effects) including metabolic and signaling pathways; second, it is crucial to consider drug-target-pathway relations for the identification of novel targets for drug development. To address this on-going need, we have developed a web-based data warehouse named SuperTarget, which integrates drug-related information associated with medical indications, adverse drug effects, drug metabolism, pathways and Gene Ontology (GO) terms for target proteins. At present, the updated database contains >6000 target proteins, which are annotated with >330 000 relations to 196 000 compounds (including approved drugs); the vast majority of interactions include binding affinities and pointers to the respective literature sources. The user interface provides tools for drug screening and target similarity inclusion. A query interface enables the user to pose complex queries, for example, to find drugs that target a certain pathway, interacting drugs that are metabolized by the same cytochrome P450 or drugs that target proteins within a certain affinity range. SuperTarget is available at http://bioinformatics.charite.de/supertarget. PMID:22067455
Cytoscape: a software environment for integrated models of biomolecular interaction networks.

PubMed

Shannon, Paul; Markiel, Andrew; Ozier, Owen; Baliga, Nitin S; Wang, Jonathan T; Ramage, Daniel; Amin, Nada; Schwikowski, Benno; Ideker, Trey

2003-11-01

Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.
SolCyc: a database hub at the Sol Genomics Network (SGN) for the manual curation of metabolic networks in Solanum and Nicotiana specific databases

PubMed Central

Foerster, Hartmut; Bombarely, Aureliano; Battey, James N D; Sierro, Nicolas; Ivanov, Nikolai V; Mueller, Lukas A

2018-01-01

SolCyc is the entry portal to pathway/genome databases (PGDBs) for major species of the Solanaceae family hosted at the Sol Genomics Network. Currently, SolCyc comprises six organism-specific PGDBs for tomato, potato, pepper, petunia, tobacco and one Rubiaceae, coffee. The metabolic networks of those PGDBs have been computationally predicted by the PathoLogic component of the Pathway Tools software using the manually curated multi-domain database MetaCyc (http://www.metacyc.org/) as reference. SolCyc has been recently extended by taxon-specific databases, i.e. the family-specific SolanaCyc database, containing only curated data pertinent to species of the nightshade family, and NicotianaCyc, a genus-specific database that stores all relevant metabolic data of the Nicotiana genus. Through manual curation of the published literature, new metabolic pathways have been created in those databases, which are complemented by the continuously updated, relevant species-specific pathways from MetaCyc. At present, SolanaCyc comprises 199 pathways and 29 superpathways and NicotianaCyc accounts for 72 pathways and 13 superpathways. Curator-maintained, taxon-specific databases such as SolanaCyc and NicotianaCyc are characterized by an enrichment of data specific to these taxa and are free of falsely predicted pathways. Both databases have been used to update recently created Nicotiana-specific databases for Nicotiana tabacum, Nicotiana benthamiana, Nicotiana sylvestris and Nicotiana tomentosiformis by propagating verifiable data into those PGDBs. In addition, in-depth curation of the pathways in N. tabacum has been carried out, which resulted in the elimination of 156 pathways from the 569 pathways predicted by Pathway Tools. Together, in-depth curation of the predicted pathway network and the supplementation with curated data from taxon-specific databases have substantially improved the curation status of the species-specific N. tabacum PGDB. The implementation of this strategy will significantly advance the curation status of all organism-specific databases in SolCyc, improving database accuracy, data analysis and visualization of biochemical networks in those species. Database URL: https://solgenomics.net/tools/solcyc/ PMID:29762652
Interleukins and their signaling pathways in the Reactome biological pathway database.

PubMed

Jupe, Steve; Ray, Keith; Roca, Corina Duenas; Varusai, Thawfeek; Shamovsky, Veronica; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning

2018-04-01

There is a wealth of biological pathway information available in the scientific literature, but it is spread across many thousands of publications. Alongside publications that contain definitive experimental discoveries are many others that have been dismissed as spurious, found to be irreproducible, or are contradicted by later results and consequently now considered controversial. Many descriptions and images of pathways are incomplete stylized representations that assume the reader is an expert and familiar with the established details of the process, which are consequently not fully explained. Pathway representations in publications frequently do not represent a complete, detailed, and unambiguous description of the molecules involved; their precise posttranslational state; or a full account of the molecular events they undergo while participating in a process. Although this might be sufficient to be interpreted by an expert reader, the lack of detail makes such pathways less useful and difficult to understand for anyone unfamiliar with the area and of limited use as the basis for computational models. Reactome was established as a freely accessible knowledge base of human biological pathways. It is manually populated with interconnected molecular events that fully detail the molecular participants linked to published experimental data and background material by using a formal and open data structure that facilitates computational reuse. These data are accessible on a Web site in the form of pathway diagrams that have descriptive summaries and annotations and as downloadable data sets in several formats that can be reused with other computational tools. The entire database and all supporting software can be downloaded and reused under a Creative Commons license. Pathways are authored by expert biologists who work with Reactome curators and editorial staff to represent the consensus in the field. 
Pathways are represented as interactive diagrams that include as much molecular detail as possible and are linked to literature citations that contain supporting experimental details. All newly created events undergo a peer-review process before they are added to the database and made available on the associated Web site. New content is added quarterly. The 63rd release of Reactome in December 2017 contains 10,996 human proteins participating in 11,426 events in 2,179 pathways. In addition, analytic tools allow data set submission for the identification and visualization of pathway enrichment and representation of expression profiles as an overlay on Reactome pathways. Protein-protein and compound-protein interactions from several sources, including custom user data sets, can be added to extend pathways. Pathway diagrams and analytic result displays can be downloaded as editable images, human-readable reports, and files in several standard formats that are suitable for computational reuse. Reactome content is available programmatically through a REpresentational State Transfer (REST)-based content service and as a Neo4J graph database. Signaling pathways for IL-1 to IL-38 are hierarchically classified within the pathway "signaling by interleukins." The classification used is largely derived from Akdis et al. The addition to Reactome of a complete set of the known human interleukins, their receptors, and established signaling pathways linked to annotations of relevant aspects of immune function provides a significant computationally accessible resource of information about this important family. This information can be extended easily as new discoveries become accepted as the consensus in the field. A key aim for the future is to increase coverage of gene expression changes induced by interleukin signaling. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Transcriptome profiling identified differentially expressed genes and pathways associated with tamoxifen resistance in human breast cancer

PubMed Central

Men, Xin; Ma, Jun; Wu, Tong; Pu, Junyi; Wen, Shaojia; Shen, Jianfeng; Wang, Xun; Wang, Yamin; Chen, Chao; Dai, Penggao

2018-01-01

Tamoxifen (TAM) resistance is an important clinical problem in the treatment of breast cancer. To identify the mechanism of TAM resistance in estrogen receptor (ER)-positive breast cancer, we screened the transcriptome using RNA-seq and compared the gene expression profiles of the MCF-7 mammary carcinoma cell line and the TAM-resistant cell line TAMR/MCF-7. In total, 52 significant differentially expressed genes (DEGs) were identified, including SLIT2, ROBO, LHX, KLF, VEGFC, BAMBI, LAMA1, FLT4, PNMT, DHRS2, MAOA and ALDH. The DEGs were annotated in the GO, COG and KEGG databases. Annotation of the function of the DEGs in the KEGG database revealed the top three pathways enriched with the most DEGs: pathways in cancer, the PI3K-AKT pathway, and focal adhesion. We then compared the gene expression profiles of clinical progressive disease (PD) and complete response (CR) cases from The Cancer Genome Atlas (TCGA). Ten common DEGs were identified by combining the clinical and cellular analysis results. A protein-protein interaction network was applied to analyze the association of the ER signaling pathway with the 10 DEGs. Three significant genes (GFRA3, NPY1R and PTPRN2) were closely related to ER-related pathways. These significant DEGs regulated many biological activities such as cell proliferation and survival, motility and migration, and tumor cell invasion. The interactions between these DEGs and the drug resistance phenomenon need to be elucidated at a functional level in further studies. Based on our findings, we believe that these DEGs could be therapeutic targets, which can be explored to develop new treatment options. PMID:29423105
ChemiRs: a web application for microRNAs and chemicals.

PubMed

Su, Emily Chia-Yu; Chen, Yu-Sing; Tien, Yun-Cheng; Liu, Jeff; Ho, Bing-Ching; Yu, Sung-Liang; Singh, Sher

2016-04-18

MicroRNAs (miRNAs) are non-coding RNAs of about 22 nucleotides that affect various cellular functions and play a regulatory role in different organisms, including humans. To date, more than 2500 mature human miRNAs have been discovered and registered, but there is still a lack of information or algorithms to reveal the relations among miRNAs, environmental chemicals and human health. Chemicals in the environment affect our health and daily life, and some of them can lead to diseases by interfering with biological pathways. We developed a creditable online web server, ChemiRs, for predicting interactions and relations among miRNAs, chemicals and pathways. The database not only compares gene lists affected by chemicals and miRNAs, but also incorporates curated pathways to identify possible interactions. Here, we manually retrieved associations of miRNAs and chemicals from biomedical literature. We developed an online system, ChemiRs, which contains miRNAs, diseases, Medical Subject Heading (MeSH) terms, chemicals, genes, pathways and PubMed IDs. We connected each miRNA to miRBase, and every current gene symbol to HUGO Gene Nomenclature Committee (HGNC) for genome annotation. Human pathway information is also provided from the KEGG and REACTOME databases. Information about Gene Ontology (GO) is queried from GO Online SQL Environment (GOOSE). With a user-friendly interface, the web application is easy to use. Multiple query results can be easily integrated and exported as report documents in PDF format. Association analysis of miRNAs and chemicals can help us understand the pathogenesis of chemical components. ChemiRs is freely available for public use at http://omics.biol.ntnu.edu.tw/ChemiRs .
Drug-Path: a database for drug-induced pathways

PubMed Central

Zeng, Hui; Cui, Qinghua

2015-01-01

Some databases for drug-associated pathways have been built and are publicly available. However, the pathways curated in most of these databases are drug-action or drug-metabolism pathways. In recent years, high-throughput technologies such as microarray and RNA-sequencing have produced many drug-induced gene expression profiles. Interestingly, drug-induced gene expression profiles frequently show distinct patterns, indicating that drugs normally induce the activation or repression of distinct pathways. Therefore, these pathways contribute to the study of drug mechanisms and drug repurposing. Here, we present Drug-Path, a database of drug-induced pathways, which was generated by KEGG pathway enrichment analysis for drug-induced upregulated genes and downregulated genes based on drug-induced gene expression datasets in Connectivity Map. Drug-Path provides user-friendly interfaces to retrieve, visualize and download the drug-induced pathway data in the database. In addition, the genes deregulated by a given drug are highlighted in the pathways. All data were organized using SQLite. The web site was implemented using Django, a Python web framework. Finally, we believe that this database will be useful for related research. Database URL: http://www.cuilab.cn/drugpath PMID:26130661
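A pipeline of this kind (pathway enrichment followed by storage in SQLite) might look roughly like the sketch below. The hypergeometric upper-tail test is one common choice for enrichment analysis, assumed here because the abstract does not state which statistic was used; the table layout, column names, drug and pathway labels are likewise hypothetical and do not describe the actual Drug-Path schema.

```python
import sqlite3
from math import comb


def hypergeom_pvalue(N, K, n, k):
    """P(X >= k): chance of seeing at least k pathway genes among n
    deregulated genes, with K pathway genes in a background of N."""
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def build_db(rows):
    """Create an in-memory SQLite table of (drug, pathway, direction, pvalue)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE drug_pathway ("
        "drug TEXT, pathway TEXT, direction TEXT, pvalue REAL)"
    )
    conn.executemany("INSERT INTO drug_pathway VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def significant_pathways(conn, drug, alpha=0.05):
    """Pathways enriched for one drug below the chosen p-value cut-off."""
    cur = conn.execute(
        "SELECT pathway, direction, pvalue FROM drug_pathway "
        "WHERE drug = ? AND pvalue < ? ORDER BY pvalue",
        (drug, alpha),
    )
    return cur.fetchall()


# Toy numbers: background of 10 genes, a pathway with 3 genes,
# and 3 upregulated genes that all fall inside that pathway.
p_up = hypergeom_pvalue(N=10, K=3, n=3, k=3)
p_down = hypergeom_pvalue(N=10, K=3, n=3, k=0)  # no overlap required
conn = build_db([
    ("drug_X", "pathway_1", "up", p_up),
    ("drug_X", "pathway_2", "down", p_down),
])
hits = significant_pathways(conn, "drug_X")
```

Keeping upregulated and downregulated results as separate rows with a direction column reflects the abstract's point that enrichment was run separately on the two gene lists.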
Meta-All: a system for managing metabolic pathway information.

PubMed

Weise, Stephan; Grosse, Ivo; Klukas, Christian; Koschützki, Dirk; Scholz, Uwe; Schreiber, Falk; Junker, Björn H

2006-10-23

Many attempts are being made to understand biological subjects at a systems level. A major resource for these approaches are biological databases, storing manifold information about DNA, RNA and protein sequences including their functional and structural motifs, molecular markers, mRNA expression levels, metabolite concentrations, protein-protein interactions, phenotypic traits or taxonomic relationships. The use of these databases is often hampered by the fact that they are designed for special application areas and thus lack universality. Databases on metabolic pathways, which provide an increasingly important foundation for many analyses of biochemical processes at a systems level, are no exception to the rule. Data stored in central databases such as KEGG, BRENDA or SABIO-RK is often limited to read-only access. If experimentalists want to store their own data, possibly still under investigation, there are two possibilities. They can either develop their own information system for managing their own data, which is very time-consuming and costly, or they can try to store their data in existing systems, which is often restricted. Hence, an out-of-the-box information system for managing metabolic pathway data is needed. We have designed META-ALL, an information system that allows the management of metabolic pathways, including reaction kinetics, detailed locations, environmental factors and taxonomic information. Data can be stored together with quality tags and in different parallel versions. META-ALL uses Oracle DBMS and Oracle Application Express. We provide the META-ALL information system for download and use. In this paper, we describe the database structure and give information about the tools for submitting and accessing the data. As a first application of META-ALL, we show how the information contained in a detailed kinetic model can be stored and accessed. META-ALL is a system for managing information about metabolic pathways. It facilitates the handling of pathway-related data and is designed to help biochemists and molecular biologists in their daily research. It is available on the Web at http://bic-gh.de/meta-all and can be downloaded free of charge and installed locally.
Meta-All: a system for managing metabolic pathway information

PubMed Central

Weise, Stephan; Grosse, Ivo; Klukas, Christian; Koschützki, Dirk; Scholz, Uwe; Schreiber, Falk; Junker, Björn H

2006-01-01

Background: Many attempts are being made to understand biological subjects at a systems level. Biological databases are a major resource for these approaches, storing manifold information about DNA, RNA and protein sequences, including their functional and structural motifs, molecular markers, mRNA expression levels, metabolite concentrations, protein-protein interactions, phenotypic traits or taxonomic relationships. The use of these databases is often hampered by the fact that they are designed for special application areas and thus lack universality. Databases on metabolic pathways, which provide an increasingly important foundation for many analyses of biochemical processes at a systems level, are no exception to the rule. Data stored in central databases such as KEGG, BRENDA or SABIO-RK is often limited to read-only access. If experimentalists want to store their own data, possibly still under investigation, there are two possibilities. They can either develop their own information system for managing their data, which is very time-consuming and costly, or they can try to store their data in existing systems, which is often restricted. Hence, an out-of-the-box information system for managing metabolic pathway data is needed. Results: We have designed META-ALL, an information system that allows the management of metabolic pathways, including reaction kinetics, detailed locations, environmental factors and taxonomic information. Data can be stored together with quality tags and in different parallel versions. META-ALL uses Oracle DBMS and Oracle Application Express. We provide the META-ALL information system for download and use. In this paper, we describe the database structure and give information about the tools for submitting and accessing the data. As a first application of META-ALL, we show how the information contained in a detailed kinetic model can be stored and accessed.
Conclusion: META-ALL is a system for managing information about metabolic pathways. It facilitates the handling of pathway-related data and is designed to help biochemists and molecular biologists in their daily research. It is available on the Web at http://bic-gh.de/meta-all and can be downloaded free of charge and installed locally. PMID:17059592
PAGER 2.0: an update to the pathway, annotated-list and gene-signature electronic repository for Human Network Biology

PubMed Central

Yue, Zongliang; Zheng, Qi; Neylon, Michael T; Yoo, Minjae; Shin, Jimin; Zhao, Zhiying; Tan, Aik Choon

2018-01-01

Integrative Gene-set, Network and Pathway Analysis (GNPA) is a powerful data analysis approach developed to help interpret high-throughput omics data. In PAGER 1.0, we demonstrated that researchers can gain unbiased and reproducible biological insights with the introduction of PAGs (Pathways, Annotated-lists and Gene-signatures) as the basic data representation elements. In PAGER 2.0, we improve the utility of integrative GNPA by significantly expanding the coverage of PAGs and PAG-to-PAG relationships in the database, defining a new metric to quantify PAG data qualities, and developing new software features to simplify online integrative GNPA. Specifically, we included 84,282 PAGs spanning 24 different data sources that cover human diseases, published gene-expression signatures, drug–gene, miRNA–gene interactions, pathways and tissue-specific gene expressions. We introduced a new normalized Cohesion Coefficient (nCoCo) score to assess the biological relevance of genes inside a PAG, and RP-score to rank genes and assign gene-specific weights inside a PAG. The companion web interface contains numerous features to help users query and navigate the database content. The database content can be freely downloaded and is compatible with third-party Gene Set Enrichment Analysis tools. We expect PAGER 2.0 to become a major resource in integrative GNPA. PAGER 2.0 is available at http://discovery.informatics.uab.edu/PAGER/. PMID:29126216
Rhizoma Dioscoreae extract protects against alveolar bone loss by regulating the cell cycle: A predictive study based on the protein‑protein interaction network.

PubMed

Zhang, Zhi-Guo; Song, Chang-Heng; Zhang, Fang-Zhen; Chen, Yan-Jing; Xiang, Li-Hua; Xiao, Gary Guishan; Ju, Da-Hong

2016-06-01

Rhizoma Dioscoreae extract (RDE) exhibits a protective effect on alveolar bone loss in ovariectomized (OVX) rats. The aim of this study was to predict the pathways or targets that are regulated by RDE, by re‑assessing our previously reported data and conducting a protein‑protein interaction (PPI) network analysis. In total, 383 differentially expressed genes (≥3‑fold) between alveolar bone samples from the RDE and OVX group rats were identified, and a PPI network was constructed based on these genes. Furthermore, four molecular clusters (A‑D) in the PPI network with the smallest P‑values were detected by the molecular complex detection (MCODE) algorithm. Using the Database for Annotation, Visualization and Integrated Discovery (DAVID) and Ingenuity Pathway Analysis (IPA) tools, two molecular clusters (A and B) were found to be enriched for biological processes in Gene Ontology (GO). Only cluster A was associated with biological pathways in the IPA database. GO and pathway analysis results showed that cluster A, associated with cell cycle regulation, was the most important molecular cluster in the PPI network. In addition, cyclin‑dependent kinase 1 (CDK1) may be a key molecule achieving the cell‑cycle‑regulatory function of cluster A. From the PPI network analysis, it was predicted that delayed cell cycle progression in excessive alveolar bone remodeling via downregulation of CDK1 may be another mechanism underlying the anti‑osteopenic effect of RDE on alveolar bone.
SPIKE – a database, visualization and analysis tool of cellular signaling pathways

PubMed Central

Elkon, Ran; Vesterman, Rita; Amit, Nira; Ulitsky, Igor; Zohar, Idan; Weisz, Mali; Mass, Gilad; Orlev, Nir; Sternberg, Giora; Blekhman, Ran; Assa, Jackie; Shiloh, Yosef; Shamir, Ron

2008-01-01

Background: Biological signaling pathways that govern cellular physiology form an intricate web of tightly regulated interlocking processes. Data on these regulatory networks are accumulating at an unprecedented pace. The assimilation, visualization and interpretation of these data have become a major challenge in biological research, and once met, will greatly boost our ability to understand cell functioning on a systems level. Results: To cope with this challenge, we are developing the SPIKE knowledge-base of signaling pathways. SPIKE contains three main software components: 1) A database (DB) of biological signaling pathways. Carefully curated information from the literature and data from large public sources constitute distinct tiers of the DB. 2) A visualization package that allows interactive graphic representations of regulatory interactions stored in the DB and superposition of functional genomic and proteomic data on the maps. 3) An algorithmic inference engine that analyzes the networks for novel functional interplays between network components. SPIKE is designed and implemented as a community tool and therefore provides a user-friendly interface that allows registered users to upload data to SPIKE DB. Our vision is that the DB will be populated by a distributed and highly collaborative effort undertaken by multiple groups in the research community, where each group contributes data in its field of expertise. Conclusion: The integrated capabilities of SPIKE make it a powerful platform for the analysis of signaling networks and the integration of knowledge on such networks with omics data. PMID:18289391
PyPathway: Python Package for Biological Network Analysis and Visualization.

PubMed

Xu, Yang; Luo, Xiao-Chun

2018-05-01

Life science studies represent one of the biggest generators of large data sets, mainly because of rapid advances in sequencing technology. Biological networks, including interaction networks and human-curated pathways, are essential to understanding these high-throughput data sets. Biological network analysis offers a method to explore systematically not only the molecular complexity of a particular disease but also the molecular relationships among apparently distinct phenotypes. Currently, several packages have been developed for the Python community, such as BioPython and Goatools. However, tools to perform comprehensive network analysis and visualization are still needed. Here, we have developed PyPathway, an extensible free and open source Python package for functional enrichment analysis, network modeling, and network visualization. The network process module supports various interaction network and pathway databases such as Reactome, WikiPathway, STRING, and BioGRID. The network analysis module implements overrepresentation analysis, gene set enrichment analysis, network-based enrichment, and de novo network modeling. Finally, the visualization and data publishing modules enable users to share their analysis by using an easy web application. For package availability, see the first Reference.
Insights into the molecular mechanisms of Polygonum multiflorum Thunb-induced liver injury: a computational systems toxicology approach.

PubMed

Wang, Yin-Yin; Li, Jie; Wu, Zeng-Rui; Zhang, Bo; Yang, Hong-Bin; Wang, Qin; Cai, Ying-Chun; Liu, Gui-Xia; Li, Wei-Hua; Tang, Yun

2017-05-01

An increasing number of cases of herb-induced liver injury (HILI) have been reported, presenting new clinical challenges. In this study, taking Polygonum multiflorum Thunb (PmT) as an example, we proposed a computational systems toxicology approach to explore the molecular mechanisms of HILI. First, the chemical components of PmT were extracted from 3 main TCM databases as well as the literature related to natural products. Then, the known targets were collected through data integration, and the potential compound-target interactions (CTIs) were predicted using our substructure-drug-target network-based inference (SDTNBI) method. After screening for hepatotoxicity-related genes by assessing the symptoms of HILI, a compound-target interaction network was constructed. A scoring function, namely, Ascore, was developed to estimate the toxicity of chemicals in the liver. We conducted network analysis to determine the possible mechanisms of the biphasic effects using the analysis tools, including BiNGO, pathway enrichment, organ distribution analysis and predictions of interactions with CYP450 enzymes. Among the chemical components of PmT, 54 components with good intestinal absorption were used for analysis, and 2939 CTIs were obtained. After analyzing the mRNA expression data in the BioGPS database, 1599 CTIs and 125 targets related to liver diseases were identified. In the top 15 compounds, seven with Ascore values >3000 (emodin, quercetin, apigenin, resveratrol, gallic acid, kaempferol and luteolin) were obviously associated with hepatotoxicity. The results from the pathway enrichment analysis suggest that multiple interactions between apoptosis and metabolism may underlie PmT-induced liver injury. Many of the pathways have been verified in specific compounds, such as glutathione metabolism, cytochrome P450 metabolism, and the p53 pathway, among others. 
Hepatitis symptoms, the perturbation of nine bile acids and yellow or tawny urine also had corresponding pathways, justifying our method. In conclusion, this computational systems toxicology method reveals possible toxic components and could be very helpful for understanding the mechanisms of HILI. In this way, the method might also facilitate the identification of novel hepatotoxic herbs.
atBioNet--an integrated network analysis tool for genomics and biomarker discovery.

PubMed

Ding, Yijun; Chen, Minjun; Liu, Zhichao; Ding, Don; Ye, Yanbin; Zhang, Min; Kelly, Reagan; Guo, Li; Su, Zhenqiang; Harris, Stephen C; Qian, Feng; Ge, Weigong; Fang, Hong; Xu, Xiaowei; Tong, Weida

2012-07-20

Large amounts of mammalian protein-protein interaction (PPI) data have been generated and are available for public use. From a systems biology perspective, protein/gene interactions encode the key mechanisms distinguishing disease and health, and such mechanisms can be uncovered through network analysis. An effective network analysis tool should integrate different content-specific PPI databases into a comprehensive network format with a user-friendly platform to identify key functional modules/pathways and the underlying mechanisms of disease and toxicity. atBioNet integrates seven publicly available PPI databases into a network-specific knowledge base. Knowledge expansion is achieved by expanding a user-supplied protein/gene list with interactions from its integrated PPI network. The statistically significant functional modules are determined by applying a fast network-clustering algorithm (SCAN: a Structural Clustering Algorithm for Networks). The functional modules can be visualized either separately or together in the context of the whole network. Integration of pathway information enables enrichment analysis and assessment of the biological function of modules. Three case studies are presented using publicly available disease gene signatures as a basis to discover new biomarkers for acute leukemia, systemic lupus erythematosus, and breast cancer. The results demonstrated that atBioNet can not only identify functional modules and pathways related to the studied diseases, but this information can also be used to hypothesize novel biomarkers for future analysis. atBioNet is a free web-based network analysis tool that provides a systematic insight into protein/gene interactions through examining significant functional modules. The identified functional modules are useful for determining underlying mechanisms of disease and biomarker discovery. It can be accessed at: http://www.fda.gov/ScienceResearch/BioinformaticsTools/ucm285284.htm.
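The SCAN algorithm named in the abstract above groups nodes by structural similarity, that is, by how much of their neighborhoods two nodes share. The sketch below illustrates only that similarity measure as it is usually defined for SCAN (each node counts as a member of its own neighborhood). It is not atBioNet's code, and the function name and the toy adjacency map are hypothetical.

```python
from math import sqrt
from typing import Dict, Set


def structural_similarity(adjacency: Dict[str, Set[str]], u: str, v: str) -> float:
    """Shared closed neighborhood of u and v, normalized by the geometric mean of their sizes."""
    # Closed neighborhoods: the node's neighbors plus the node itself.
    gamma_u = adjacency.get(u, set()) | {u}
    gamma_v = adjacency.get(v, set()) | {v}
    return len(gamma_u & gamma_v) / sqrt(len(gamma_u) * len(gamma_v))


# Hypothetical toy network: a triangle (a, b, c) plus an isolated node d.
toy_network = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": set(),
}
triangle_score = structural_similarity(toy_network, "a", "b")
isolated_score = structural_similarity(toy_network, "a", "d")
```

In a full SCAN run, pairs whose similarity reaches a chosen threshold are treated as structurally connected and clusters are grown from nodes with enough such neighbors; both parameters are user choices that the abstract does not report.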
Drug-Path: a database for drug-induced pathways.

PubMed

Zeng, Hui; Qiu, Chengxiang; Cui, Qinghua

2015-01-01

Some databases for drug-associated pathways have been built and are publicly available. However, the pathways curated in most of these databases are drug-action or drug-metabolism pathways. In recent years, high-throughput technologies such as microarray and RNA-sequencing have produced many drug-induced gene expression profiles. Interestingly, drug-induced gene expression profiles frequently show distinct patterns, indicating that drugs normally induce the activation or repression of distinct pathways. Therefore, these pathways contribute to the study of drug mechanisms and drug repurposing. Here, we present Drug-Path, a database of drug-induced pathways, which was generated by KEGG pathway enrichment analysis for drug-induced upregulated genes and downregulated genes based on drug-induced gene expression datasets in Connectivity Map. Drug-Path provides user-friendly interfaces to retrieve, visualize and download the drug-induced pathway data in the database. In addition, the genes deregulated by a given drug are highlighted in the pathways. All data were organized using SQLite. The web site was implemented using Django, a Python web framework. Finally, we believe that this database will be useful for related research. © The Author(s) 2015. Published by Oxford University Press.
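Pathway enrichment of the kind described above is commonly computed as a one-sided hypergeometric (over-representation) test. The sketch below is a minimal illustration under that assumption; the abstract does not state which test Drug-Path used, and the function name and parameters are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background: int, n_pathway: int,
                       n_selected: int, n_overlap: int) -> float:
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_background: genes in the background set (e.g. all measured genes)
    n_pathway:    background genes annotated to the pathway
    n_selected:   genes in the query list (e.g. drug-induced upregulated genes)
    n_overlap:    query genes that also belong to the pathway
    """
    total = comb(n_background, n_selected)
    upper = min(n_pathway, n_selected)
    # Sum the probability mass of every outcome at least as extreme as the one observed.
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

Upregulated and downregulated gene lists would each be tested separately against every pathway, and the resulting p-values would normally be corrected for multiple testing before pathways are reported.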
Creation of a Genome-Wide Metabolic Pathway Database for Populus trichocarpa Using a New Approach for Reconstruction and Curation of Metabolic Pathways for Plants

PubMed Central

Zhang, Peifen; Dreher, Kate; Karthikeyan, A.; Chi, Anjo; Pujar, Anuradha; Caspi, Ron; Karp, Peter; Kirkup, Vanessa; Latendresse, Mario; Lee, Cynthia; Mueller, Lukas A.; Muller, Robert; Rhee, Seung Yon

2010-01-01

Metabolic networks reconstructed from sequenced genomes or transcriptomes can help visualize and analyze large-scale experimental data, predict metabolic phenotypes, discover enzymes, engineer metabolic pathways, and study metabolic pathway evolution. We developed a general approach for reconstructing metabolic pathway complements of plant genomes. Two new reference databases were created and added to the core of the infrastructure: a comprehensive, all-plant reference pathway database, PlantCyc, and a reference enzyme sequence database, RESD, for annotating metabolic functions of protein sequences. PlantCyc (version 3.0) includes 714 metabolic pathways and 2,619 reactions from over 300 species. RESD (version 1.0) contains 14,187 literature-supported enzyme sequences from across all kingdoms. We used RESD, PlantCyc, and MetaCyc (an all-species reference metabolic pathway database), in conjunction with the pathway prediction software Pathway Tools, to reconstruct a metabolic pathway database, PoplarCyc, from the recently sequenced genome of Populus trichocarpa. PoplarCyc (version 1.0) contains 321 pathways with 1,807 assigned enzymes. Comparing PoplarCyc (version 1.0) with AraCyc (version 6.0, Arabidopsis [Arabidopsis thaliana]) showed comparable numbers of pathways distributed across all domains of metabolism in both databases, except for a higher number of AraCyc pathways in secondary metabolism and a 1.5-fold increase in carbohydrate metabolic enzymes in PoplarCyc. Here, we introduce these new resources and demonstrate the feasibility of using them to identify candidate enzymes for specific pathways and to analyze metabolite profiling data through concrete examples. These resources can be searched by text or BLAST, browsed, and downloaded from our project Web site (http://plantcyc.org). PMID:20522724
Wnt pathway curation using automated natural language processing: combining statistical methods with partial and full parse for knowledge extraction.

PubMed

Santos, Carlos; Eggle, Daniela; States, David J

2005-04-15

Wnt signaling is a very active area of research with highly relevant publications appearing at a rate of more than one per day. Building and maintaining databases describing signal transduction networks is a time-consuming and demanding task that requires careful literature analysis and extensive domain-specific knowledge. For instance, more than 50 factors involved in Wnt signal transduction have been identified as of late 2003. In this work we describe a natural language processing (NLP) system that is able to identify references to biological interaction networks in free text and automatically assembles a protein association and interaction map. A 'gold standard' set of names and assertions was derived by manual scanning of the Wnt genes website (http://www.stanford.edu/~rnusse/wntwindow.html) including 53 interactions involved in Wnt signaling. This system was used to analyze a corpus of peer-reviewed articles related to Wnt signaling including 3369 Pubmed and 1230 full text papers. Names for key Wnt-pathway associated proteins and biological entities are identified using a chi-squared analysis of noun phrases over-represented in the Wnt literature as compared to the general signal transduction literature. Interestingly, we identified several instances where generic terms were used on the website when more specific terms occur in the literature, and one typographic error on the Wnt canonical pathway. Using the named entity list and performing an exhaustive assertion extraction of the corpus, 34 of the 53 interactions in the 'gold standard' Wnt signaling set were successfully identified (64% recall). In addition, the automated extraction found several interactions involving key Wnt-related molecules which were missing or different from those in the canonical diagram, and these were confirmed by manual review of the text. 
These results suggest that a combination of NLP techniques for information extraction can form a useful first-pass tool for assisting human annotation and maintenance of signal pathway databases. The pipeline software components are freely available on request to the authors. Contact: dstates@umich.edu; http://stateslab.bioinformatics.med.umich.edu/software.html.
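The recall figure quoted in the abstract above follows directly from the two counts it reports. The short sketch below reproduces the arithmetic; the function name is illustrative only.

```python
def recall(true_positives: int, total_relevant: int) -> float:
    """Fraction of gold-standard items that the extraction recovered."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return true_positives / total_relevant


# Counts quoted in the abstract: 34 of the 53 gold-standard interactions were found.
wnt_recall = recall(34, 53)
print(f"{wnt_recall:.0%}")  # prints "64%"
```

Recall alone says nothing about false positives; the abstract does not report a precision figure, so none is computed here.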
Tcof1-Related Molecular Networks in Treacher Collins Syndrome.

PubMed

Dai, Jiewen; Si, Jiawen; Wang, Minjiao; Huang, Li; Fang, Bing; Shi, Jun; Wang, Xudong; Shen, Guofang

2016-09-01

Treacher Collins syndrome (TCS) is a rare, autosomal-dominant disorder characterized by craniofacial deformities, and is primarily caused by mutations in the Tcof1 gene. This article aimed to perform a comprehensive literature review and systematic bioinformatic analysis of Tcof1-related molecular networks in TCS. First, the up- and down-regulated genes in Tcof1 heterozygous haploinsufficient mutant mouse embryos and in Tcof1 knockdown and Tcof1 over-expressing neuroblastoma N1E-115 cells were obtained from the Gene Expression Omnibus database. The GeneDecks database was used to calculate the 500 genes most closely related to Tcof1. Then, the relationships between 4 gene sets (a predicted set and sets comparing the wildtype with the 3 Gene Expression Omnibus datasets) were analyzed using the DAVID, GeneMANIA and STRING databases. The analysis results showed that the Tcof1-related genes were enriched in various biological processes, including cell proliferation, apoptosis, cell cycle, differentiation, and migration. They were also enriched in several signaling pathways, such as the ribosome, p53, cell cycle, and WNT signaling pathways. Additionally, these genes clearly had direct or indirect interactions with Tcof1 and with each other. The findings of the literature review and bioinformatic analysis imply that special attention should be given to these pathways, as they may offer target points for TCS therapies.

Network pharmacology-based prediction of active compounds and molecular targets in Yijin-Tang acting on hyperlipidaemia and atherosclerosis.

PubMed

Lee, A Yeong; Park, Won; Kang, Tae-Wook; Cha, Min Ho; Chun, Jin Mi

2018-07-15

Yijin-Tang (YJT) is a traditional prescription for the treatment of hyperlipidaemia, atherosclerosis and other ailments related to dampness phlegm, a typical pathological symptom of abnormal body fluid metabolism in Traditional Korean Medicine. However, a holistic network pharmacology approach to understanding the therapeutic mechanisms underlying hyperlipidaemia and atherosclerosis has not been pursued. To examine the network pharmacological potential effects of YJT on hyperlipidaemia and atherosclerosis, we analysed components, performed target prediction and network analysis, and investigated interacting pathways using a network pharmacology approach. Information on compounds in herbal medicines was obtained from public databases, and oral bioavailability and drug-likeness were screened using absorption, distribution, metabolism, and excretion (ADME) criteria. Compounds and genes were linked using the STITCH database, and genes related to hyperlipidaemia and atherosclerosis were gathered using the GeneCards database. Human genes were identified and subjected to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Network analysis identified 447 compounds in five herbal medicines that were subjected to ADME screening, and 21 compounds and 57 genes formed the main pathways linked to hyperlipidaemia and atherosclerosis. Among them, 10 compounds (naringenin, nobiletin, hesperidin, galangin, glycyrrhizin, homogentisic acid, stigmasterol, 6-gingerol, quercetin and glabridin) were linked to more than four genes, and are bioactive compounds and key chemicals. Core genes in this network were CASP3, CYP1A1, CYP1A2, MMP2 and MMP9. The compound-target gene network revealed close interactions between multiple components and multiple targets, and facilitates a better understanding of the potential therapeutic effects of YJT.
Pharmacological network analysis can help to explain the potential effects of YJT for treating dampness phlegm-related diseases such as hyperlipidaemia and atherosclerosis. Copyright © 2018 Elsevier B.V. All rights reserved.
Identification of key genes related to high-risk gastrointestinal stromal tumors using bioinformatics analysis.

PubMed

Jin, Shuan; Zhu, Wenhua; Li, Jun

2018-01-01

The purpose of this study was to identify predictive biomarkers for clinical therapy and prognostic evaluation of high-risk gastrointestinal stromal tumors (GISTs). In this study, microarray data GSE31802 were used to identify differentially expressed genes (DEGs) between high-risk GISTs and low-risk GISTs. Then, enrichment analysis of DEGs was conducted based on the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway databases. In addition, the transcription factors and cancer-related genes among the DEGs were screened according to the TRANSFAC, TSGene, and TAG databases. Finally, a protein-protein interaction (PPI) network was constructed and analyzed to look for critical genes involved in high-risk GISTs. A total of forty DEGs were obtained, and these genes were mainly involved in four pathways: melanogenesis, neuroactive ligand-receptor interaction, malaria, and hematopoietic cell lineage. The enriched biological processes were related to the regulation of insulin secretion, integrin activation, and the neuropeptide signaling pathway. Transcription factor analysis of DEGs indicated that POU domain, class 2, associating factor 1 (POU2AF1) was significantly downregulated in high-risk GISTs. In the PPI network of DEGs, ten genes with high degrees, such as PNOC, P2RY14, and SELP, formed local networks. Four genes (POU2AF1, PNOC, P2RY14, and SELP) might be used as biomarkers for prognosis of high-risk GISTs.
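Selecting "genes with high degrees" from a PPI network, as described above, amounts to counting how many distinct interaction partners each gene has. The sketch below shows one way to do that with the standard library, assuming an undirected network given as gene pairs. The function name and the placeholder gene labels are hypothetical and do not come from the study's data.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_hubs(edges: Iterable[Tuple[str, str]], n: int) -> List[Tuple[str, int]]:
    """Return the n nodes with the most distinct interaction partners."""
    # Treat edges as undirected, drop self-loops and collapse duplicate pairs.
    unique_edges = {frozenset((a, b)) for a, b in edges if a != b}
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    # Sort by descending degree, then by name, so ties are broken deterministically.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:n]


# Placeholder labels only; these are not interactions reported in the abstract.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G3", "G1"), ("G1", "G4"), ("G2", "G3")]
hubs = top_hubs(toy_edges, 2)
```

Degree is only one of several centrality measures; the abstract does not say whether others were considered.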
Differential reconstructed gene interaction networks for deriving toxicity threshold in chemical risk assessment.

PubMed

Yang, Yi; Maxwell, Andrew; Zhang, Xiaowei; Wang, Nan; Perkins, Edward J; Zhang, Chaoyang; Gong, Ping

2013-01-01

Pathway alterations reflected as changes in gene expression regulation and gene interaction can result from cellular exposure to toxicants. Such information is often used to elucidate toxicological modes of action. From a risk assessment perspective, alterations in biological pathways are a rich resource for setting toxicant thresholds, which may be more sensitive and mechanism-informed than traditional toxicity endpoints. Here we developed a novel differential networks (DNs) approach to connect pathway perturbation with toxicity threshold setting. Our DNs approach consists of 6 steps: time-series gene expression data collection, identification of altered genes, gene interaction network reconstruction, differential edge inference, mapping of genes with differential edges to pathways, and establishment of causal relationships between chemical concentration and perturbed pathways. A one-sample Gaussian process model and a linear regression model were used to identify genes that exhibited significant profile changes across an entire time course and between treatments, respectively. Interaction networks of differentially expressed (DE) genes were reconstructed for different treatments using a state space model and then compared to infer differential edges/interactions. DE genes possessing differential edges were mapped to biological pathways in databases such as KEGG pathways. Using the DNs approach, we analyzed a time-series Escherichia coli live cell gene expression dataset consisting of 4 treatments (control, 10, 100, 1000 mg/L naphthenic acids, NAs) and 18 time points. Through comparison of reconstructed networks and construction of differential networks, 80 genes were identified as DE genes with a significant number of differential edges, and 22 KEGG pathways were altered in a concentration-dependent manner. Some of these pathways were perturbed to a degree as high as 70% even at the lowest exposure concentration, implying a high sensitivity of our DNs approach. 
Findings from this proof-of-concept study suggest that our approach has a great potential in providing a novel and sensitive tool for threshold setting in chemical risk assessment. In future work, we plan to analyze more time-series datasets with a full spectrum of concentrations and sufficient replications per treatment. The pathway alteration-derived thresholds will also be compared with those derived from apical endpoints such as cell growth rate.
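The differential edge inference step listed above compares the network reconstructed for one treatment with the network reconstructed for another. The sketch below illustrates the idea as a plain set comparison, assuming each network is already available as an undirected edge list. That is a simplification of the state space model output described in the abstract, and all names here are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[str]


def edge_set(pairs: Iterable[Tuple[str, str]]) -> Set[Edge]:
    """Normalize gene pairs into undirected edges, dropping self-loops."""
    return {frozenset(pair) for pair in pairs if pair[0] != pair[1]}


def differential_edges(control: Iterable[Tuple[str, str]],
                       treated: Iterable[Tuple[str, str]]) -> Dict[str, Set[Edge]]:
    """Edges present in only one of the two reconstructed networks."""
    control_edges, treated_edges = edge_set(control), edge_set(treated)
    return {
        "gained": treated_edges - control_edges,
        "lost": control_edges - treated_edges,
    }


def differential_edge_counts(diff: Dict[str, Set[Edge]]) -> Counter:
    """Number of differential edges touching each gene."""
    counts = Counter()
    for edge in diff["gained"] | diff["lost"]:
        counts.update(edge)
    return counts
```

Genes with many differential edges would then be mapped to pathways, which is the next step in the six-step list; how "a significant number" of differential edges was defined is not stated in the abstract.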
Redundancy control in pathway databases (ReCiPa): an application for improving gene-set enrichment analysis in Omics studies and "Big data" biology.

PubMed

Vivar, Juan C; Pemu, Priscilla; McPherson, Ruth; Ghosh, Sujoy

2013-08-01

Unparalleled technological advances have fueled an explosive growth in the scope and scale of biological data and have propelled life sciences into the realm of "Big Data" that cannot be managed or analyzed by conventional approaches. Big Data in the life sciences are driven primarily via a diverse collection of 'omics'-based technologies, including genomics, proteomics, metabolomics, transcriptomics, metagenomics, and lipidomics. Gene-set enrichment analysis is a powerful approach for interrogating large 'omics' datasets, leading to the identification of biological mechanisms associated with observed outcomes. While several factors influence the results from such analysis, the impact from the contents of pathway databases is often under-appreciated. Pathway databases often contain variously named pathways that overlap with one another to varying degrees. Ignoring such redundancies during pathway analysis can lead to the designation of several pathways as being significant due to high content-similarity, rather than truly independent biological mechanisms. Statistically, such dependencies also result in correlated p values and overdispersion, leading to biased results. We investigated the level of redundancies in multiple pathway databases and observed large discrepancies in the nature and extent of pathway overlap. This prompted us to develop the application, ReCiPa (Redundancy Control in Pathway Databases), to control redundancies in pathway databases based on user-defined thresholds. Analysis of genomic and genetic datasets, using ReCiPa-generated overlap-controlled versions of KEGG and Reactome pathways, led to a reduction in redundancy among the top-scoring gene-sets and allowed for the inclusion of additional gene-sets representing possibly novel biological mechanisms.
Using obesity as an example, bioinformatic analysis further demonstrated that gene-sets identified from overlap-controlled pathway databases show stronger evidence of prior association to obesity compared to pathways identified from the original databases.
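Controlling redundancy "based on user-defined thresholds" can be pictured as merging pathways whose gene sets overlap by more than a chosen fraction. The sketch below is one possible reading, not ReCiPa's actual procedure: the overlap metric (shared genes as a fraction of the smaller set), the greedy merge order and all names are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, Set


def overlap(a: Set[str], b: Set[str]) -> float:
    """Shared genes as a fraction of the smaller gene set (an assumed metric)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def control_redundancy(pathways: Dict[str, Set[str]],
                       threshold: float) -> Dict[str, Set[str]]:
    """Greedily merge the most-overlapping pair until no pair exceeds the threshold."""
    merged = {name: set(genes) for name, genes in pathways.items()}
    while True:
        best = max(
            combinations(sorted(merged), 2),
            key=lambda pair: overlap(merged[pair[0]], merged[pair[1]]),
            default=None,
        )
        if best is None or overlap(merged[best[0]], merged[best[1]]) <= threshold:
            return merged
        first, second = best
        # The merged pathway keeps both source names so its origin stays traceable.
        merged[f"{first}+{second}"] = merged.pop(first) | merged.pop(second)


# Hypothetical gene sets for illustration.
toy_pathways = {"A": {"g1", "g2", "g3"}, "B": {"g2", "g3"}, "C": {"g9"}}
controlled = control_redundancy(toy_pathways, threshold=0.5)
```

With a threshold of 0.5, the toy sets A and B are merged because B is wholly contained in A, while C is left alone; raising the threshold to 1.0 leaves all three untouched under this rule.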
AOP-DB Frontend: A user interface for the Adverse Outcome Pathways Database.

EPA Science Inventory

The EPA Adverse Outcome Pathway Database (AOP-DB) is a database resource that aggregates association relationships between AOPs, genes, chemicals, diseases, pathways, species orthology information, ontologies. The AOP-DB frontend is a simple yet powerful AOP-DB user interface in...
Incorporating Information of microRNAs into Pathway Analysis in a Genome-Wide Association Study of Bipolar Disorder

PubMed Central

Shih, Wei-Liang; Kao, Chung-Feng; Chuang, Li-Chung; Kuo, Po-Hsiu

2012-01-01

MicroRNAs (miRNAs) are known to be important post-transcriptional regulators that are involved in the etiology of complex psychiatric traits. The present study aimed to incorporate miRNA information into pathway analysis using a genome-wide association dataset to identify relevant biological pathways for bipolar disorder (BPD). We selected psychiatric- and neurological-associated miRNAs (N = 157) from the PhenomiR database. The miRNA target gene (miTG) predictions were obtained from microRNA.org. Canonical pathways (N = 4,051) were downloaded from the Molecular Signatures Database. We employed a novel weighting scheme for miTGs in pathway analysis using the gene set enrichment analysis and sum-statistic methods. Under four statistical scenarios, 38 significantly enriched pathways (P-value < 0.01 after multiple testing correction) were identified for the risk of developing BPD, including ion channel-associated pathways (e.g., gated channel activity, ion transmembrane transporter activity, and ion channel activity) and nervous system-related biological processes (e.g., nervous system development, cytoskeleton, and neuroactive ligand receptor interaction). Among them, 19 were identified only when the weighting scheme was applied. Many miRNA-targeted genes were functionally related to ion channels, collagen, and axonal growth and guidance, which have previously been suggested to be associated with BPD. Some of these genes are linked to the regulation of miRNA machinery in the literature. Our findings provide support for the potential involvement of miRNAs in the psychopathology of BPD. Further investigations to elucidate the functions and mechanisms of identified candidate pathways are needed. PMID:23264780
Exercise-driven metabolic pathways in healthy cartilage.

PubMed

Blazek, A D; Nam, J; Gupta, R; Pradhan, M; Perera, P; Weisleder, N L; Hewett, T E; Chaudhari, A M; Lee, B S; Leblebicioglu, B; Butterfield, T A; Agarwal, S

2016-07-01

Exercise is vital for maintaining cartilage integrity in healthy joints. Here we examined the exercise-driven transcriptional regulation of genes in healthy rat articular cartilage to dissect the metabolic pathways responsible for the potential benefits of exercise. Transcriptome-wide gene expression in the articular cartilage of healthy Sprague-Dawley female rats exercised daily (low intensity treadmill walking) for 2, 5, or 15 days was compared to that of non-exercised rats, using Affymetrix GeneChip arrays. The Database for Annotation, Visualization and Integrated Discovery (DAVID) was used for Gene Ontology (GO)-term enrichment and Functional Annotation analysis of differentially expressed genes (DEGs). The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway mapper was used to identify the metabolic pathways regulated by exercise. Microarray analysis revealed that exercise induced 644 DEGs in healthy articular cartilage. The DAVID bioinformatics tool demonstrated a high prevalence of functional annotation clusters with greater enrichment scores and GO-terms associated with extracellular matrix (ECM) biosynthesis/remodeling and inflammation/immune response. The KEGG database revealed that exercise regulates 147 metabolic pathways representing molecular interaction networks for Metabolism, Genetic Information Processing, Environmental Information Processing, Cellular Processes, Organismal Systems, and Diseases. These pathways collectively supported the complex regulation of the beneficial effects of exercise on the cartilage. Overall, the findings highlight that exercise is a robust transcriptional regulator of a wide array of metabolic pathways in healthy cartilage. The major actions of exercise involve ECM biosynthesis/cartilage strengthening and attenuation of inflammatory pathways to provide prophylaxis against onset of arthritic diseases in healthy cartilage. Copyright © 2016 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
VISIBIOweb: visualization and layout services for BioPAX pathway models

PubMed Central

Dilek, Alptug; Belviranli, Mehmet E.; Dogrusoz, Ugur

2010-01-01

With recent advancements in techniques for cellular data acquisition, information on cellular processes has been increasing at a dramatic rate. Visualization is critical to analyzing and interpreting complex information; representing cellular processes or pathways is no exception. VISIBIOweb is a free, open-source, web-based pathway visualization and layout service for pathway models in BioPAX format. With VISIBIOweb, one can obtain well-laid-out views of pathway models using the standard notation of the Systems Biology Graphical Notation (SBGN), and can embed such views within one's web pages as desired. Pathway views may be navigated using zoom and scroll tools; pathway object properties, including any external database references available in the data, may be inspected interactively. The automatic layout component of VISIBIOweb may also be accessed programmatically from other tools using Hypertext Transfer Protocol (HTTP). The web site is free and open to all users and there is no login requirement. It is available at: http://visibioweb.patika.org. PMID:20460470
IntPath--an integrated pathway gene relationship database for model organisms and important pathogens.

PubMed

Zhou, Hufeng; Jin, Jingjing; Zhang, Haojun; Yi, Bo; Wozniak, Michal; Wong, Limsoon

2012-01-01

Pathway data are important for understanding the relationship between genes, proteins and many other molecules in living organisms. Pathway gene relationships are crucial information for guidance, prediction, reference and assessment in biochemistry, computational biology, and medicine. Many well-established databases--e.g., KEGG, WikiPathways, and BioCyc--are dedicated to collecting pathway data for public access. However, the effectiveness of these databases is hindered by issues such as incompatible data formats, inconsistent molecular representations, inconsistent molecular relationship representations, inconsistent referrals to pathway names, and incomplete data from different databases. In this paper, we overcome these issues through extraction, normalization and integration of pathway data from several major public databases (KEGG, WikiPathways, BioCyc, etc.). We build a database that not only hosts our integrated pathway gene relationship data for public access but also maintains the necessary updates in the long run. This public repository is named IntPath (Integrated Pathway gene relationship database for model organisms and important pathogens). Four organisms--S. cerevisiae, M. tuberculosis H37Rv, H. sapiens and M. musculus--are included in this version (V2.0) of IntPath. IntPath uses the "full unification" approach to ensure no deletion and no introduced noise in this process. Therefore, IntPath contains much richer pathway-gene and pathway-gene pair relationships and a much larger number of non-redundant genes and gene pairs than any of the single-source databases. The gene relationships of each gene (measured by average node degree) per pathway are significantly richer. The gene relationships in each pathway (measured by average number of gene pairs per pathway) are also considerably richer in the integrated pathways. Moderate manual curation is involved to remove errors and noise from the source data (e.g., the gene ID errors in WikiPathways and relationship errors in KEGG). We turn complicated and incompatible XML data formats and inconsistent gene and gene relationship representations from different source databases into normalized and unified pathway-gene and pathway-gene pair relationships neatly recorded in simple tab-delimited text format and MySQL tables, which facilitates convenient automatic computation and large-scale referencing in many related studies. IntPath data can be downloaded in text format or as a MySQL dump. IntPath data can also be retrieved and analyzed conveniently through a web service by local programs or through a web interface by mouse clicks. Several useful analysis tools are also provided in IntPath. We have overcome in IntPath the issues of compatibility, consistency, and comprehensiveness that often hamper effective use of pathway databases. We have included four organisms in the current release of IntPath. Our methodology and programs described in this work can be easily applied to other organisms; and we will include more model organisms and important pathogens in future releases of IntPath. IntPath maintains regular updates and is freely available at http://compbio.ddns.comp.nus.edu.sg:8080/IntPath.
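The two richness measures named in the abstract (gene pairs per pathway and average node degree) are easy to compute from a tab-delimited table of pathway-gene pairs. The sketch below assumes a hypothetical three-column layout (pathway, gene A, gene B); the actual IntPath file layout is not given in the abstract, and the rows shown are made up.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows: pathway <TAB> geneA <TAB> geneB (not real IntPath data).
TSV = "P1\tG1\tG2\nP1\tG2\tG3\nP2\tG1\tG4\n"

def pathway_stats(handle):
    """Return {pathway: (number of gene pairs, average node degree)}."""
    pairs = defaultdict(set)
    for pathway, a, b in csv.reader(handle, delimiter="\t"):
        # frozenset makes the pair unordered, so A-B and B-A count once.
        pairs[pathway].add(frozenset((a, b)))
    stats = {}
    for pathway, edges in pairs.items():
        genes = set().union(*edges)
        # Each undirected pair contributes one degree to each of its two genes.
        stats[pathway] = (len(edges), 2 * len(edges) / len(genes))
    return stats

stats = pathway_stats(io.StringIO(TSV))
```

Here P1 has two gene pairs over three genes (average degree 4/3) and P2 has one pair over two genes (average degree 1.0). Averaging the per-pathway values across a database would give summary figures of the kind the abstract compares.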
Identification of Biological Targets of Therapeutic Intervention for Hepatocellular Carcinoma by Integrated Bioinformatical Analysis.

PubMed

Hu, Wei Qi; Wang, Wei; Fang, Di Long; Yin, Xue Feng

2018-05-24

BACKGROUND We screened the potential molecular targets and investigated the molecular mechanisms of hepatocellular carcinoma (HCC). MATERIAL AND METHODS Microarray data of GSE47786, including samples of the HepG2 human hepatoma cell line treated with 40 μM berberine and control cell samples treated with 0.08% DMSO, were downloaded from the GEO database. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed; the protein-protein interaction (PPI) networks were constructed using the STRING database and Cytoscape; the genetic alteration, neighboring gene networks, and survival analysis of hub genes were explored with cBioPortal; and the mRNA expression levels of hub genes were obtained from the Oncomine databases. RESULTS A total of 56 upregulated and 8 downregulated DEGs were identified. The GO analysis results were significantly enriched in cell-cycle arrest, regulation of transcription, DNA-dependent, protein amino acid phosphorylation, cell cycle, and apoptosis. The KEGG pathway analysis showed that DEGs were enriched in the MAPK signaling pathway, ErbB signaling pathway, and p53 signaling pathway. JUN, EGR1, MYC, and CDKN1A were identified as hub genes in PPI networks. The genetic alteration of hub genes was mainly concentrated in amplification. TP53, NDRG1, and MAPK15 were found in neighboring gene networks. Altered genes had worse overall survival and disease-free survival than unaltered genes. The expressions of EGR1, MYC, and CDKN1A were significantly increased, but expression of JUN was not, in the Roessler Liver datasets. CONCLUSIONS We found that JUN, EGR1, MYC, and CDKN1A might be used as diagnostic and therapeutic molecular biomarkers and broaden our understanding of the molecular mechanisms of HCC.
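The abstract does not say which statistical test its enrichment analysis relied on. A common choice for this kind of over-representation analysis is the upper tail of the hypergeometric distribution, sketched below with Python's standard library; the function name and the toy counts are assumptions and are not taken from the study.

```python
from math import comb

def hypergeom_pvalue(N, K, n, k):
    """P(X >= k): chance of seeing at least k pathway genes among n DEGs.

    N = genes in the background, K = background genes in the pathway,
    n = differentially expressed genes, k = DEGs that fall in the pathway.
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Toy numbers (made up): 10 background genes, 5 in the pathway, 5 DEGs, all 5 in it.
p = hypergeom_pvalue(N=10, K=5, n=5, k=5)
```

In practice the p-values for all tested pathways would then be adjusted for multiple testing before any pathway is called enriched.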
Plant Reactome: a resource for plant pathways and comparative analysis

PubMed Central

Naithani, Sushma; Preece, Justin; D'Eustachio, Peter; Gupta, Parul; Amarasinghe, Vindhya; Dharmawardhana, Palitha D.; Wu, Guanming; Fabregat, Antonio; Elser, Justin L.; Weiser, Joel; Keays, Maria; Fuentes, Alfonso Munoz-Pomer; Petryszak, Robert; Stein, Lincoln D.; Ware, Doreen; Jaiswal, Pankaj

2017-01-01

Plant Reactome (http://plantreactome.gramene.org/) is a free, open-source, curated plant pathway database portal, provided as part of the Gramene project. The database provides intuitive bioinformatics tools for the visualization, analysis and interpretation of pathway knowledge to support genome annotation, genome analysis, modeling, systems biology, basic research and education. Plant Reactome employs the structural framework of a plant cell to show metabolic, transport, genetic, developmental and signaling pathways. We manually curate molecular details of pathways in these domains for reference species Oryza sativa (rice) supported by published literature and annotation of well-characterized genes. Two hundred twenty-two rice pathways, 1025 reactions associated with 1173 proteins, 907 small molecules and 256 literature references have been curated to date. These reference annotations were used to project pathways for 62 model, crop and evolutionarily significant plant species based on gene homology. Database users can search and browse various components of the database, visualize curated baseline expression of pathway-associated genes provided by the Expression Atlas and upload and analyze their Omics datasets. The database also offers data access via Application Programming Interfaces (APIs) and in various standardized pathway formats, such as SBML and BioPAX. PMID:27799469
Abnormal DNA methylation may contribute to the progression of osteosarcoma.

PubMed

Chen, Xiao-Gang; Ma, Liang; Xu, Jia-Xin

2018-01-01

The identification of optimal methylation biomarkers to achieve maximum diagnostic ability remains a challenge. The present study aimed to elucidate the potential molecular mechanisms underlying osteosarcoma (OS) using DNA methylation analysis. Based on the GSE36002 dataset obtained from the Gene Expression Omnibus database, differentially methylated genes were extracted between patients with OS and controls using t‑tests. Subsequently, hierarchical clustering was performed to segregate the samples into two distinct clusters, OS and normal. Gene Ontology (GO) and pathway enrichment analyses for differentially methylated genes were performed using the Database for Annotation, Visualization and Integrated Discovery tool. A protein‑protein interaction (PPI) network was established, followed by hub gene identification. Using the cut‑off threshold of ≥0.2 average β‑value difference, 3,725 unique CpGs (2,862 genes) were identified to be differentially methylated between the OS and normal groups. Among these 2,862 genes, 510 genes were differentially hypermethylated and 2,352 were differentially hypomethylated. The differentially hypermethylated genes were primarily involved in 20 GO terms, and the top 3 terms were associated with potassium ion transport. For differentially hypomethylated genes, GO functions principally included passive transmembrane transporter activity, channel activity and metal ion transmembrane transporter activity. In addition, a total of 10 significant pathways were enriched by differentially hypomethylated genes; notably, neuroactive ligand‑receptor interaction was the most significant pathway. Based on a connectivity degree >90, 7 hub genes were selected from the PPI network, including neuromedin U (NMU; degree=103) and NMU receptor 1 (NMUR1; degree=103). 
Functional terms (potassium ion transport, transmembrane transporter activity, and neuroactive ligand‑receptor interaction) and hub genes (NMU and NMUR1) may serve as potential targets for the treatment and diagnosis of OS.
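The ≥0.2 average β-value difference cut-off described above amounts to a simple filter over per-CpG group means. The sketch below shows only that filtering step, not the t-test the study also applied; the CpG identifiers, β-values and function name are made up for illustration.

```python
from statistics import mean

def differential_cpgs(beta_case, beta_ctrl, cutoff=0.2):
    """Return {cpg: delta} for CpGs whose mean beta-value differs by >= cutoff.

    delta > 0 means hypermethylated in cases; delta < 0 means hypomethylated.
    """
    out = {}
    for cpg in beta_case:
        delta = mean(beta_case[cpg]) - mean(beta_ctrl[cpg])
        if abs(delta) >= cutoff:
            out[cpg] = delta
    return out

# Hypothetical beta-values for three CpG sites in two samples per group.
case = {"cg1": [0.8, 0.9], "cg2": [0.5, 0.5], "cg3": [0.1, 0.2]}
ctrl = {"cg1": [0.3, 0.4], "cg2": [0.45, 0.55], "cg3": [0.6, 0.7]}
dm = differential_cpgs(case, ctrl)
```

In this toy input, cg1 passes as hypermethylated, cg3 as hypomethylated, and cg2 is dropped because its group means are equal.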
EuPathDB: the eukaryotic pathogen genomics database resource

PubMed Central

Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y.; Brestelli, John; Brunk, Brian P.; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C.; Lawrence, Cris; Li, Wei; Pinney, Deborah F.; Pulman, Jane A.; Roos, David S.; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J.; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie

2017-01-01

The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host–pathogen interactions. PMID:27903906
Aberrant methylation patterns affect the molecular pathogenesis of rheumatoid arthritis.

PubMed

Lin, Yang; Luo, Zhengqiang

2017-05-01

This study aims to investigate DNA methylation signatures in fibroblast-like synoviocytes (FLS) from patients with rheumatoid arthritis (RA), and to explore the relationship with transcription factors (TFs) that help to distinguish RA from osteoarthritis (OA). Microarray dataset of GSE46346, including six FLS samples from patients with RA and five FLS samples from patients with OA, was downloaded from the Gene Expression Omnibus database. RA and OA samples were screened for differentially methylated loci (DMLs). The corresponding differentially methylated genes (DMGs) were identified, followed by Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Gene Ontology (GO) enrichment analysis. A transcriptional regulatory network was built with TFs and their corresponding DMGs. Overall, 280 hypomethylated loci and 561 hypermethylated loci were screened. Genes containing hypermethylated loci were enriched in pathways in cancer, ECM-receptor interaction, focal adhesion and neurotrophin signaling pathways. Genes containing hypomethylated loci were enriched in the neurotrophin signaling pathway. Moreover, we found that CCCTC-binding factor (CTCF), Yin Yang 1 (YY1), v-myc avian myelocytomatosis viral oncogene homolog (c-MYC), and early growth response 1 (EGR1) were important TFs in the transcriptional regulatory network. Therefore, DMGs might participate in the neurotrophin signaling pathway, pathways in cancer, ECM-receptor interaction and focal adhesion pathways in RA. Furthermore, CTCF, c-MYC, YY1, and EGR1 may play important roles in RA through regulating DMGs. Copyright © 2017 Elsevier B.V. All rights reserved.
IntPath--an integrated pathway gene relationship database for model organisms and important pathogens

PubMed Central

2012-01-01

Background Pathway data are important for understanding the relationship between genes, proteins and many other molecules in living organisms. Pathway gene relationships are crucial information for guidance, prediction, reference and assessment in biochemistry, computational biology, and medicine. Many well-established databases--e.g., KEGG, WikiPathways, and BioCyc--are dedicated to collecting pathway data for public access. However, the effectiveness of these databases is hindered by issues such as incompatible data formats, inconsistent molecular representations, inconsistent molecular relationship representations, inconsistent referrals to pathway names, and incomplete data from different databases. Results In this paper, we overcome these issues through extraction, normalization and integration of pathway data from several major public databases (KEGG, WikiPathways, BioCyc, etc.). We build a database that not only hosts our integrated pathway gene relationship data for public access but also maintains the necessary updates in the long run. This public repository is named IntPath (Integrated Pathway gene relationship database for model organisms and important pathogens). Four organisms--S. cerevisiae, M. tuberculosis H37Rv, H. sapiens and M. musculus--are included in this version (V2.0) of IntPath. IntPath uses the "full unification" approach to ensure no deletion and no introduced noise in this process. Therefore, IntPath contains much richer pathway-gene and pathway-gene pair relationships and a much larger number of non-redundant genes and gene pairs than any of the single-source databases. The gene relationships of each gene (measured by average node degree) per pathway are significantly richer. The gene relationships in each pathway (measured by average number of gene pairs per pathway) are also considerably richer in the integrated pathways. Moderate manual curation is involved to remove errors and noise from the source data (e.g., the gene ID errors in WikiPathways and relationship errors in KEGG). We turn complicated and incompatible XML data formats and inconsistent gene and gene relationship representations from different source databases into normalized and unified pathway-gene and pathway-gene pair relationships neatly recorded in simple tab-delimited text format and MySQL tables, which facilitates convenient automatic computation and large-scale referencing in many related studies. IntPath data can be downloaded in text format or as a MySQL dump. IntPath data can also be retrieved and analyzed conveniently through a web service by local programs or through a web interface by mouse clicks. Several useful analysis tools are also provided in IntPath. Conclusions We have overcome in IntPath the issues of compatibility, consistency, and comprehensiveness that often hamper effective use of pathway databases. We have included four organisms in the current release of IntPath. Our methodology and programs described in this work can be easily applied to other organisms; and we will include more model organisms and important pathogens in future releases of IntPath. IntPath maintains regular updates and is freely available at http://compbio.ddns.comp.nus.edu.sg:8080/IntPath. PMID:23282057
A Systems Biology-Based Investigation into the Pharmacological Mechanisms of Sheng-ma-bie-jia-tang Acting on Systemic Lupus Erythematosus by Multi-Level Data Integration.

PubMed

Huang, Lin; Lv, Qi; Liu, Fenfen; Shi, Tieliu; Wen, Chengping

2015-11-12

Sheng-ma-bie-jia-tang (SMBJT) is a Traditional Chinese Medicine (TCM) formula that is widely used for the treatment of Systemic Lupus Erythematosus (SLE) in China. However, the molecular mechanism behind this formula remains unknown. Here, we systematically analyzed targets of the ingredients in SMBJT to evaluate its potential molecular mechanism. First, we collected 1,267 targets from our previously published database, the Traditional Chinese Medicine Integrated Database (TCMID). Next, we conducted gene ontology and pathway enrichment analyses for these targets and determined that they were enriched in metabolism (amino acids, fatty acids, etc.) and signaling pathways (chemokines, Toll-like receptors, adipocytokines, etc.). Ninety-six targets, which are known SLE disease proteins, were identified as essential targets, and the remaining 1,171 targets were defined as common targets of this formula. The essential targets directly interacted with SLE disease proteins. In addition, some common targets had essential connections to both key targets and SLE disease proteins in enriched signaling pathways, e.g., the Toll-like receptor signaling pathway. We also found distinct functions of essential and common targets in immune system processes. This multi-level approach to deciphering the underlying mechanism of SMBJT treatment of SLE offers a new perspective that will further our understanding of TCM formulas.
miRwayDB: a database for experimentally validated microRNA-pathway associations in pathophysiological conditions

PubMed Central

Das, Sankha Subhra; Saha, Pritam

2018-01-01

Abstract MicroRNAs (miRNAs) are well known as key regulators of diverse biological pathways. A body of experimental evidence has shown that abnormal miRNA expression profiles are responsible for various pathophysiological conditions by modulating genes in disease-associated pathways. In spite of the rapid increase in research data confirming such associations, scientists still do not have access to a consolidated database offering these miRNA-pathway association details for critical diseases. We have developed miRwayDB, a database providing comprehensive information on experimentally validated miRNA-pathway associations in various pathophysiological conditions, utilizing data collected from published literature. To the best of our knowledge, it is the first database that provides information about experimentally validated miRNA-mediated pathway dysregulation as seen specifically in critical human diseases and hence indicative of a cause-and-effect relationship in most cases. The current version of miRwayDB collects an exhaustive list of miRNA-pathway association entries for 76 critical disease conditions by reviewing 663 published articles. Each database entry contains complete information on the name of the pathophysiological condition, associated miRNA(s), experimental sample type(s), regulation pattern (up/down) of miRNA, pathway association(s), targeted member of dysregulated pathway(s) and a brief description. In addition, miRwayDB provides miRNA, gene and pathway scores to evaluate the role of miRNA-regulated pathways in various pathophysiological conditions. The database can also be used for other biomedical approaches such as validation of computational analysis, integrated analysis and prediction of computational models. It also offers a submission page to submit novel data from recently published studies. We believe that miRwayDB will be a useful tool for the miRNA research community. Database URL: http://www.mirway.iitkgp.ac.in PMID:29688364
ZikaBase: An integrated ZIKV- Human Interactome Map database.

PubMed

Gurumayum, Sanathoi; Brahma, Rahul; Naorem, Leimarembi Devi; Muthaiyan, Mathavan; Gopal, Jeyakodi; Venkatesan, Amouda

2018-01-15

Re-emergence of ZIKV has caused infections in more than 1.5 million people. The molecular mechanism and pathogenesis of ZIKV are not well explored, owing to the lack of an adequate model and of publicly accessible resources providing a ZIKV-Human protein interactome map. This study attempted to curate ZIKV-Human interacting proteins from the published literature and RNA-Seq data. Eleven direct interactions and 12 associated genes were retrieved from the literature, and 3742 Differentially Expressed Genes (DEGs) were obtained from RNA-Seq analysis. The genes were analyzed to construct the ZIKV-Human Interactome Map. The importance of the study is illustrated by enrichment analysis, which showed that the direct-interaction and associated genes are enriched in viral entry into host cell. Also, ZIKV infection modulates 32% signal and 27% immune system pathways. The integrated database, ZikaBase, has been developed to help the virology research community and is accessible at https://test5.bicpu.edu.in. Copyright © 2017 Elsevier Inc. All rights reserved.
Identification of hub subnetwork based on topological features of genes in breast cancer

PubMed Central

ZHUANG, DA-YONG; JIANG, LI; HE, QING-QING; ZHOU, PENG; YUE, TAO

2015-01-01

The aim of this study was to provide functional insight into the identification of hub subnetworks by aggregating the behavior of genes connected in a protein-protein interaction (PPI) network. We applied a protein network-based approach to identify subnetworks which may provide new insight into the functions of pathways involved in breast cancer rather than individual genes. Five groups of breast cancer data were downloaded from the Gene Expression Omnibus (GEO) database of high-throughput gene expression data and analyzed to identify gene signatures using the genome-wide global significance (GWGS) method. A PPI network was constructed using Cytoscape, and clusters that focused on highly connected nodes were obtained using the molecular complex detection (MCODE) clustering algorithm. Pathway analysis was performed to assess the functional relevance of selected gene signatures based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Topological centrality was used to characterize the biological importance of gene signatures, pathways and clusters. The results revealed that cluster1, as well as the cell cycle and oocyte meiosis pathways, were significant subnetworks in the analysis of degree and other centralities, and that most hub nodes were distributed within them. The most important hub nodes, those with top-ranked centrality, were also similar to the genes common to the intersection of these three subnetworks, which was viewed as a hub subnetwork that is more reproducible than individual critical genes selected without network information. This hub subnetwork was attributed to a single biological process that is essential for cell growth and death. This increased the accuracy of identifying gene interactions that took place within the same functional process and was potentially useful for the development of biomarkers and networks for breast cancer. PMID:25573623
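Degree, the simplest of the topological centralities mentioned above, counts how many interaction partners each node has in the PPI network. A minimal sketch follows; the edge list, node names and degree threshold are placeholders and are not taken from the study.

```python
from collections import Counter

def degree(edges):
    """Count interaction partners per node in an undirected edge list."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

def hubs(edges, min_degree):
    """Return the nodes whose degree is at least min_degree, sorted by name."""
    return sorted(node for node, d in degree(edges).items() if d >= min_degree)

# Placeholder network: node "A" interacts with every other node.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
hub_nodes = hubs(edges, min_degree=3)
```

Other centralities (betweenness, closeness and so on) need shortest-path computations and are usually taken from a graph library rather than written by hand.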

miRnalyze: an interactive database linking tool to unlock intuitive microRNA regulation of cell signaling pathways

PubMed Central

Subhra Das, Sankha; James, Mithun; Paul, Sandip

2017-01-01

Abstract The various pathophysiological processes occurring in living systems are known to be orchestrated by delicate interplays and cross-talks between different genes and their regulators. Among the various regulators of genes, there is a class of small non-coding RNA molecules known as microRNAs. Although the relative simplicity of miRNAs and their ability to modulate cellular processes make them attractive therapeutic candidates, their presence in large numbers makes it challenging for experimental researchers to interpret the intricacies of the molecular processes they regulate. Most of the existing bioinformatic tools fail to address these challenges. Here, we present a new web resource ‘miRnalyze’ that has been specifically designed to directly identify the putative regulation of cell signaling pathways by miRNAs. The tool integrates miRNA-target predictions with signaling cascade members by utilizing the TargetScanHuman 7.1 miRNA-target prediction tool and the KEGG pathway database, and thus provides researchers with in-depth insights into modulation of signal transduction pathways by miRNAs. miRnalyze is capable of identifying common miRNAs targeting more than one gene in the same signaling pathway, a feature that further increases the probability of modulating the pathway and downstream reactions when using miRNA modulators. Additionally, miRnalyze can sort miRNAs according to the seed-match types and TargetScan context++ score, thus providing a hierarchical list of the most valuable miRNAs. Furthermore, in order to provide users with comprehensive information regarding miRNAs, genes and pathways, miRnalyze also links to expression data of miRNAs (miRmine) and genes (TiGER) and proteome abundance (PaxDb) data. To validate the capability of the tool, we have documented the correlation of miRnalyze’s prediction with experimental confirmation studies. Database URL: http://www.mirnalyze.in PMID:28365733
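The "common miRNA" feature described above can be pictured as a join between miRNA-target predictions and pathway membership. The sketch below is not miRnalyze's implementation; the miRNA and gene names, the function name and the `min_genes` parameter are hypothetical stand-ins for TargetScan predictions and a KEGG pathway gene list.

```python
from collections import defaultdict

def common_mirnas(targets, pathway_genes, min_genes=2):
    """Map each miRNA to the pathway genes it targets, keeping those with >= min_genes hits."""
    hits = defaultdict(set)
    for mirna, gene in targets:
        if gene in pathway_genes:
            hits[mirna].add(gene)
    return {m: sorted(g) for m, g in hits.items() if len(g) >= min_genes}

# Hypothetical predictions (miRNA, target gene) and one pathway's members.
targets = [
    ("miR-A", "GENE1"), ("miR-A", "GENE2"), ("miR-A", "GENE9"),
    ("miR-B", "GENE1"),
    ("miR-C", "GENE8"),
]
pathway = {"GENE1", "GENE2", "GENE3"}
shared = common_mirnas(targets, pathway)
```

Only miR-A is returned here, because it is the only miRNA predicted to target two members of the pathway. A ranking step, such as sorting by a prediction score, could be layered on top of this result.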
GraphSAW: a web-based system for graphical analysis of drug interactions and side effects using pharmaceutical and molecular data.

PubMed

Shoshi, Alban; Hoppe, Tobias; Kormeier, Benjamin; Ogultarhan, Venus; Hofestädt, Ralf

2015-02-28

Adverse drug reactions are one of the most common causes of death in industrialized Western countries. Nowadays, empirical data from clinical studies for the approval and monitoring of drugs, as well as molecular databases, are available. The integration of database information is a promising method for providing well-founded knowledge to avoid adverse drug reactions. This paper presents our web-based decision support system GraphSAW, which analyzes and evaluates drug interactions and side effects based on data from two commercial and two freely available molecular databases. The system is able to analyze single and combined drug-drug interactions, drug-molecule interactions as well as single and cumulative side effects. In addition, it allows exploring associative networks of drugs, molecules, metabolic pathways, and diseases in an intuitive way. The molecular medication analysis includes the capabilities of the features above. A statistical evaluation of the integrated data and of the top 20 drugs concerning drug interactions and side effects is performed. The results of the data analysis give an overview of all theoretically possible drug interactions and side effects. The evaluation shows a mismatch between pharmaceutical and molecular databases. The concordance was about 12% for drug interactions and 9% for drug side effects. An application case with prescription data of 11 patients is presented in order to demonstrate the functionality of the system under real conditions. For each patient at least two interactions occurred in every medication and about 8% of total diseases were possibly induced by drug therapy. GraphSAW (http://tunicata.techfak.uni-bielefeld.de/graphsaw/) is meant to be a web-based system for health professionals and researchers. GraphSAW provides comprehensive drug-related knowledge and an improved medication analysis which may support efforts to reduce the risk of medication errors and numerous drastic side effects.
Plant Reactome: a resource for plant pathways and comparative analysis.

PubMed

Naithani, Sushma; Preece, Justin; D'Eustachio, Peter; Gupta, Parul; Amarasinghe, Vindhya; Dharmawardhana, Palitha D; Wu, Guanming; Fabregat, Antonio; Elser, Justin L; Weiser, Joel; Keays, Maria; Fuentes, Alfonso Munoz-Pomer; Petryszak, Robert; Stein, Lincoln D; Ware, Doreen; Jaiswal, Pankaj

2017-01-04

Plant Reactome (http://plantreactome.gramene.org/) is a free, open-source, curated plant pathway database portal, provided as part of the Gramene project. The database provides intuitive bioinformatics tools for the visualization, analysis and interpretation of pathway knowledge to support genome annotation, genome analysis, modeling, systems biology, basic research and education. Plant Reactome employs the structural framework of a plant cell to show metabolic, transport, genetic, developmental and signaling pathways. We manually curate molecular details of pathways in these domains for reference species Oryza sativa (rice) supported by published literature and annotation of well-characterized genes. Two hundred twenty-two rice pathways, 1025 reactions associated with 1173 proteins, 907 small molecules and 256 literature references have been curated to date. These reference annotations were used to project pathways for 62 model, crop and evolutionarily significant plant species based on gene homology. Database users can search and browse various components of the database, visualize curated baseline expression of pathway-associated genes provided by the Expression Atlas and upload and analyze their Omics datasets. The database also offers data access via Application Programming Interfaces (APIs) and in various standardized pathway formats, such as SBML and BioPAX. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Signaling gateway molecule pages—a data model perspective

PubMed Central

Dinasarapu, Ashok Reddy; Saunders, Brian; Ozerlat, Iley; Azam, Kenan; Subramaniam, Shankar

2011-01-01

Summary: The Signaling Gateway Molecule Pages (SGMP) database provides highly structured data on proteins which exist in different functional states participating in signal transduction pathways. A molecule page starts with a state of a native protein, without any modification and/or interactions. New states are formed with every post-translational modification or interaction with one or more proteins, small molecules or class molecules and with each change in cellular location. State transitions are caused by a combination of one or more modifications, interactions and translocations which then might be associated with one or more biological processes. In a characterized biological state, a molecule can function as one of several entities or their combinations, including channel, receptor, enzyme, transcription factor and transporter. We have also exported SGMP data to the Biological Pathway Exchange (BioPAX) and Systems Biology Markup Language (SBML) as well as in our custom XML. Availability: SGMP is available at www.signaling-gateway.org/molecule. Contact: shankar@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21505029
MoCha: Molecular Characterization of Unknown Pathways.

PubMed

Lobo, Daniel; Hammelman, Jennifer; Levin, Michael

2016-04-01

Automated methods for the reverse-engineering of complex regulatory networks are paving the way for the inference of mechanistic comprehensive models directly from experimental data. These novel methods can infer not only the relations and parameters of the known molecules defined in their input datasets, but also unknown components and pathways identified as necessary by the automated algorithms. Identifying the molecular nature of these unknown components is a crucial step for making testable predictions and experimentally validating the models, yet no specific and efficient tools exist to aid in this process. To this end, we present here MoCha (Molecular Characterization), a tool optimized for the search of unknown proteins and their pathways from a given set of known interacting proteins. MoCha uses the comprehensive dataset of protein-protein interactions provided by the STRING database, which currently includes more than a billion interactions from over 2,000 organisms. MoCha is highly optimized, performing typical searches within seconds. We demonstrate the use of MoCha with the characterization of unknown components from reverse-engineered models from the literature. MoCha is useful for working on network models by hand or as a downstream step of a model inference engine workflow and represents a valuable and efficient tool for the characterization of unknown pathways using known data from thousands of organisms. MoCha and its source code are freely available online under the GPLv3 license.
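The kind of search described in this abstract can be illustrated with a toy example. The sketch below is not MoCha's algorithm and does not use the STRING API; it assumes a small, hypothetical in-memory interaction map and simply ranks proteins outside the query set by how many of the known query proteins they interact with.

```python
from collections import Counter

# Hypothetical toy interaction map (undirected); not real STRING data.
INTERACTIONS = {
    "A": {"B", "X", "Y"},
    "B": {"A", "X"},
    "C": {"X", "Z"},
    "X": {"A", "B", "C"},
    "Y": {"A"},
    "Z": {"C"},
}

def candidate_partners(known, interactions):
    """Rank proteins outside `known` by the number of known proteins they touch."""
    known = set(known)
    counts = Counter()
    for protein in known:
        for neighbor in interactions.get(protein, ()):
            if neighbor not in known:
                counts[neighbor] += 1
    # Sort by descending hit count, then by name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

ranked = candidate_partners(["A", "B", "C"], INTERACTIONS)
print(ranked)
```

A candidate that touches every known protein (here the hypothetical "X") is the most plausible stand-in for an unknown component; a real tool would also weigh interaction confidence scores.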
ChemProt-2.0: visual navigation in a disease chemical biology database

PubMed Central

Kim Kjærulff, Sonny; Wich, Louis; Kringelum, Jens; Jacobsen, Ulrik P.; Kouskoumvekaki, Irene; Audouze, Karine; Lund, Ole; Brunak, Søren; Oprea, Tudor I.; Taboureau, Olivier

2013-01-01

ChemProt-2.0 (http://www.cbs.dtu.dk/services/ChemProt-2.0) is a publicly available compilation of multiple chemical–protein annotation resources integrated with diseases and clinical outcomes information. The database has been updated to >1.15 million compounds with 5.32 million bioactivity measurements for 15 290 proteins. Each protein is linked to quality-scored human protein–protein interaction data based on more than half a million interactions, for studying diseases and biological outcomes (diseases, pathways and GO terms) through protein complexes. In ChemProt-2.0, therapeutic effects as well as adverse drug reactions have been integrated, allowing proteins associated with clinical outcomes to be suggested. New chemical structure fingerprints were computed based on the similarity ensemble approach. Protein sequence similarity search was also integrated to evaluate the promiscuity of proteins, which can help in the prediction of off-target effects. Finally, the database was integrated into a visual interface that enables navigation of the pharmacological space for small molecules. Filtering options were included in order to facilitate and guide dynamic search of specific queries. PMID:23185041
Modeling of cell signaling pathways in macrophages by semantic networks

PubMed Central

Hsing, Michael; Bellenson, Joel L; Shankey, Conor; Cherkasov, Artem

2004-01-01

Background Substantial amounts of data on cell signaling, metabolic, gene regulatory and other biological pathways have been accumulated in literature and electronic databases. Conventionally, this information is stored in the form of pathway diagrams and can be characterized as highly "compartmental" (i.e. individual pathways are not connected into more general networks). Current approaches for representing pathways are limited in their capacity to model molecular interactions in their spatial and temporal context. Moreover, the critical knowledge of cause-effect relationships among signaling events is not reflected by most conventional approaches for manipulating pathways. Results We have applied a semantic network (SN) approach to develop and implement a model for cell signaling pathways. The semantic model has mapped biological concepts to a set of semantic agents and relationships, and characterized cell signaling events and their participants in the hierarchical and spatial context. In particular, the available information on the behaviors and interactions of the PI3K enzyme family has been integrated into the SN environment and a cell signaling network in human macrophages has been constructed. A SN-application has been developed to manipulate the locations and the states of molecules and to observe their actions under different biological scenarios. The approach allowed qualitative simulation of cell signaling events involving PI3Ks and identified pathways of molecular interactions that led to known cellular responses as well as other potential responses during bacterial invasions in macrophages. Conclusions We concluded from our results that the semantic network is an effective method to model cell signaling pathways. The semantic model allows proper representation and integration of information on biological structures and their interactions at different levels. 
The reconstruction of the cell signaling network in the macrophage allowed detailed investigation of connections among various essential molecules and reflected the cause-effect relationships among signaling events. The simulation demonstrated the dynamics of the semantic network, where a change of states on a molecule can alter its function and potentially cause a chain-reaction effect in the system. PMID:15494071
Domain fusion analysis by applying relational algebra to protein sequence and domain databases

PubMed Central

Truong, Kevin; Ikura, Mitsuhiko

2003-01-01

Background Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. Results This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at . Conclusion As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time. PMID:12734020
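To make the relational idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, columns, and rows are hypothetical and are not taken from the paper's schema; the query only shows how a self-join can find two separate proteins in one organism whose domains occur together in a single protein elsewhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE protein_domain (protein TEXT, organism TEXT, domain TEXT);
INSERT INTO protein_domain VALUES
  ('P1', 'yeast', 'D1'),
  ('P2', 'yeast', 'D2'),
  ('P3', 'yeast', 'D3'),
  ('F1', 'human', 'D1'),
  ('F1', 'human', 'D2');
""")

# a, b: two distinct proteins from the same organism carrying different domains.
# f1, f2: two domain rows of one other protein that carries both domains (the fusion).
QUERY = """
SELECT DISTINCT a.protein, b.protein, f1.protein
FROM protein_domain AS a
JOIN protein_domain AS b
  ON a.organism = b.organism AND a.protein < b.protein
JOIN protein_domain AS f1
  ON f1.domain = a.domain AND f1.protein NOT IN (a.protein, b.protein)
JOIN protein_domain AS f2
  ON f2.protein = f1.protein AND f2.domain = b.domain
WHERE a.domain <> b.domain
ORDER BY a.protein, b.protein, f1.protein
"""

links = conn.execute(QUERY).fetchall()
print(links)
```

Because the whole analysis is a single query, rerunning it after the domain table is refreshed updates the predictions, which is the property the abstract highlights. A stricter version would also require that neither partner protein already contains the other's domain.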
Reconstruction of metabolic pathways for the cattle genome

PubMed Central

Seo, Seongwon; Lewin, Harris A

2009-01-01

Background Metabolic reconstruction of microbial, plant and animal genomes is a necessary step toward understanding the evolutionary origins of metabolism and species-specific adaptive traits. The aims of this study were to reconstruct conserved metabolic pathways in the cattle genome and to identify metabolic pathways with missing genes and proteins. The MetaCyc database and PathwayTools software suite were chosen for this work because they are widely used and easy to implement. Results An amalgamated cattle genome database was created using the NCBI and Ensembl cattle genome databases (based on build 3.1) as data sources. PathwayTools was used to create a cattle-specific pathway genome database, which was followed by comprehensive manual curation for the reconstruction of metabolic pathways. The curated database, CattleCyc 1.0, consists of 217 metabolic pathways. A total of 64 mammalian-specific metabolic pathways were modified from the reference pathways in MetaCyc, and two pathways previously identified but missing from MetaCyc were added. Comparative analysis of metabolic pathways revealed the absence of mammalian genes for 22 metabolic enzymes whose activity was reported in the literature. We also identified six human metabolic protein-coding genes for which the cattle ortholog is missing from the sequence assembly. Conclusion CattleCyc is a powerful tool for understanding the biology of ruminants and other cetartiodactyl species. In addition, the approach used to develop CattleCyc provides a framework for the metabolic reconstruction of other newly sequenced mammalian genomes. It is clear that metabolic pathway analysis strongly reflects the quality of the underlying genome annotations. Thus, having well-annotated genomes from many mammalian species hosted in BioCyc will facilitate the comparative analysis of metabolic pathways among different species and a systems approach to comparative physiology. PMID:19284618
RaMP: A Comprehensive Relational Database of Metabolomics Pathways for Pathway Enrichment Analysis of Genes and Metabolites

PubMed Central

Zhang, Bofei; Hu, Senyang; Baskin, Elizabeth; Patt, Andrew; Siddiqui, Jalal K.

2018-01-01

The value of metabolomics in translational research is undeniable, and metabolomics data are increasingly generated in large cohorts. The functional interpretation of disease-associated metabolites though is difficult, and the biological mechanisms that underlie cell type or disease-specific metabolomics profiles are oftentimes unknown. To help fully exploit metabolomics data and to aid in its interpretation, analysis of metabolomics data with other complementary omics data, including transcriptomics, is helpful. To facilitate such analyses at a pathway level, we have developed RaMP (Relational database of Metabolomics Pathways), which combines biological pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome, WikiPathways, and the Human Metabolome DataBase (HMDB). To the best of our knowledge, an off-the-shelf, public database that maps genes and metabolites to biochemical/disease pathways and can readily be integrated into other existing software is currently lacking. For consistent and comprehensive analysis, RaMP enables batch and complex queries (e.g., list all metabolites involved in glycolysis and lung cancer), can readily be integrated into pathway analysis tools, and supports pathway overrepresentation analysis given a list of genes and/or metabolites of interest. For usability, we have developed a RaMP R package (https://github.com/Mathelab/RaMP-DB), including a user-friendly RShiny web application, that supports basic simple and batch queries, pathway overrepresentation analysis given a list of genes or metabolites of interest, and network visualization of gene-metabolite relationships. The package also includes the raw database file (mysql dump), thereby providing a stand-alone downloadable framework for public use and integration with other tools. In addition, the Python code needed to recreate the database on another system is also publicly available (https://github.com/Mathelab/RaMP-BackEnd). 
Updates for databases in RaMP will be checked multiple times a year and RaMP will be updated accordingly. PMID:29470400
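Pathway overrepresentation analysis is often carried out with a hypergeometric (one-sided Fisher's exact) test. The abstract does not state which statistic RaMP uses, so the sketch below is only an illustration of that common approach; the function name and the numbers in the example are made up.

```python
from math import comb

def overrepresentation_p(population, pathway_size, query_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population   -- total number of analytes (genes or metabolites) considered
    pathway_size -- how many of them are annotated to the pathway
    query_size   -- size of the user's list of interest
    overlap      -- how many analytes in the list fall in the pathway
    """
    upper = min(pathway_size, query_size)
    total = comb(population, query_size)
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: 20 analytes, 5 in the pathway, a list of 4 with 3 hits.
p_value = overrepresentation_p(20, 5, 4, 3)
print(round(p_value, 4))
```

In practice the p-values from many pathways would also be corrected for multiple testing before being interpreted.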
RaMP: A Comprehensive Relational Database of Metabolomics Pathways for Pathway Enrichment Analysis of Genes and Metabolites.

PubMed

Zhang, Bofei; Hu, Senyang; Baskin, Elizabeth; Patt, Andrew; Siddiqui, Jalal K; Mathé, Ewy A

2018-02-22

The value of metabolomics in translational research is undeniable, and metabolomics data are increasingly generated in large cohorts. The functional interpretation of disease-associated metabolites though is difficult, and the biological mechanisms that underlie cell type or disease-specific metabolomics profiles are oftentimes unknown. To help fully exploit metabolomics data and to aid in its interpretation, analysis of metabolomics data with other complementary omics data, including transcriptomics, is helpful. To facilitate such analyses at a pathway level, we have developed RaMP (Relational database of Metabolomics Pathways), which combines biological pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome, WikiPathways, and the Human Metabolome DataBase (HMDB). To the best of our knowledge, an off-the-shelf, public database that maps genes and metabolites to biochemical/disease pathways and can readily be integrated into other existing software is currently lacking. For consistent and comprehensive analysis, RaMP enables batch and complex queries (e.g., list all metabolites involved in glycolysis and lung cancer), can readily be integrated into pathway analysis tools, and supports pathway overrepresentation analysis given a list of genes and/or metabolites of interest. For usability, we have developed a RaMP R package (https://github.com/Mathelab/RaMP-DB), including a user-friendly RShiny web application, that supports basic simple and batch queries, pathway overrepresentation analysis given a list of genes or metabolites of interest, and network visualization of gene-metabolite relationships. The package also includes the raw database file (mysql dump), thereby providing a stand-alone downloadable framework for public use and integration with other tools. In addition, the Python code needed to recreate the database on another system is also publicly available (https://github.com/Mathelab/RaMP-BackEnd). 
Updates for databases in RaMP will be checked multiple times a year and RaMP will be updated accordingly.
SoyFN: a knowledge database of soybean functional networks.

PubMed

Xu, Yungang; Guo, Maozu; Liu, Xiaoyan; Wang, Chunyu; Liu, Yang

2014-01-01

Many databases for soybean genomic analysis have been built and made publicly available, but few of them contain knowledge specifically targeting the omics-level gene-gene, gene-microRNA (miRNA) and miRNA-miRNA interactions. Here, we present SoyFN, a knowledge database of soybean functional gene networks and miRNA functional networks. SoyFN provides user-friendly interfaces to retrieve, visualize, analyze and download the functional networks of soybean genes and miRNAs. In addition, it incorporates much information about KEGG pathways, gene ontology annotations and 3'-UTR sequences as well as many useful tools including SoySearch, ID mapping, Genome Browser, eFP Browser and promoter motif scan. SoyFN is a schema-free database that can be accessed as a Web service from any modern programming language using a simple Hypertext Transfer Protocol call. The Web site is implemented in Java, JavaScript, PHP, HTML and Apache, with all major browsers supported. We anticipate that this database will be useful for members of research communities both in soybean experimental science and bioinformatics. Database URL: http://nclab.hit.edu.cn/SoyFN.
Assessing co-regulation of directly linked genes in biological networks using microarray time series analysis.

PubMed

Del Sorbo, Maria Rosaria; Balzano, Walter; Donato, Michele; Draghici, Sorin

2013-11-01

Differential expression of genes detected with the analysis of high throughput genomic experiments is a commonly used intermediate step for the identification of signaling pathways involved in the response to different biological conditions. The impact analysis was the first approach for the analysis of signaling pathways involved in a certain biological process that was able to take into account not only the magnitude of the expression change of the genes but also the topology of signaling pathways, including the type of each interaction between the genes. In the impact analysis, signaling pathways are represented as weighted directed graphs with genes as nodes and the interactions between genes as edges. Edge weights are represented by a β factor, the regulatory efficiency, which is assumed to be equal to 1 in inductive interactions between genes and equal to -1 in repressive interactions. This study presents a similarity analysis between gene expression time series aimed at finding correspondences with the regulatory efficiency, i.e. the β factor as found in a widely used pathway database. Here, we focused on correlations among genes directly connected in signaling pathways, assuming that the expression variations of upstream genes impact immediately downstream genes in a short time interval and without significant influences by the interactions with other genes. Time series were processed using three different similarity metrics. The first metric is based on bit string matching; the second one is a specific application of Dynamic Time Warping to detect similarities even in the presence of stretching and delays; the third one is a quantitative comparative analysis resulting from an evaluation of the frequency domain representation of time series: the similarity metric is the correlation between dominant spectral components. 
These three approaches are tested on real data and pathways, and a comparison is performed using Information Retrieval benchmark tools, indicating the frequency approach as the best similarity metric among the three, for its ability to detect the correlation based on the correspondence of the most significant frequency components. Copyright © 2013. Published by Elsevier Ireland Ltd.
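Of the three metrics, Dynamic Time Warping is the easiest to show in a few lines. The following is a textbook DTW distance, not the authors' specific application of it; the two series are invented to mimic an upstream gene and a downstream gene that responds one time step later.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

upstream = [0, 1, 2, 1, 0]
downstream = [0, 0, 1, 2, 1, 0]  # same shape, delayed by one step
print(dtw_distance(upstream, downstream))
```

Because warping absorbs the delay, the delayed copy is at distance zero from the original, whereas a point-by-point comparison would report a mismatch. For a repressive edge (β = -1) one could compare the upstream series with the negated downstream series instead, which is an assumption of this sketch rather than a detail given in the abstract.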
Systematic reconstruction of TRANSPATH data into Cell System Markup Language

PubMed Central

Nagasaki, Masao; Saito, Ayumu; Li, Chen; Jeong, Euna; Miyano, Satoru

2008-01-01

Background Many biological repositories store information based on experimental study of the biological processes within a cell, such as protein-protein interactions, metabolic pathways, signal transduction pathways, or regulations of transcription factors and miRNA. Unfortunately, it is difficult to directly use such information when generating simulation-based models. Thus, modeling rules for encoding biological knowledge into system-dynamics-oriented standardized formats would be very useful for fully understanding cellular dynamics at the system level. Results We selected the TRANSPATH database, a manually curated high-quality pathway database, which provides a plentiful source of cellular events in humans, mice, and rats, collected from over 31,500 publications. In this work, we have developed 16 modeling rules based on hybrid functional Petri net with extension (HFPNe), which is suitable for graphically representing and simulating biological processes. In the modeling rules, each Petri net element is incorporated with Cell System Ontology (CSO) to enable semantic interoperability of models. As a formal ontology for biological pathway modeling with dynamics, CSO also defines biological terminology and corresponding icons. By combining HFPNe with the CSO features, it is possible to convert TRANSPATH data into simulation-based and semantically valid models. The results are encoded into a biological pathway format, Cell System Markup Language (CSML), which eases the exchange and integration of biological data and models. Conclusion By using the 16 modeling rules, 97% of the reactions in TRANSPATH are converted into simulation-based models represented in CSML. This reconstruction demonstrates that it is possible to use our rules to generate quantitative models from static pathway descriptions. PMID:18570683
Systematic reconstruction of TRANSPATH data into cell system markup language.

PubMed

Nagasaki, Masao; Saito, Ayumu; Li, Chen; Jeong, Euna; Miyano, Satoru

2008-06-23

Many biological repositories store information based on experimental study of the biological processes within a cell, such as protein-protein interactions, metabolic pathways, signal transduction pathways, or regulations of transcription factors and miRNA. Unfortunately, it is difficult to directly use such information when generating simulation-based models. Thus, modeling rules for encoding biological knowledge into system-dynamics-oriented standardized formats would be very useful for fully understanding cellular dynamics at the system level. We selected the TRANSPATH database, a manually curated high-quality pathway database, which provides a plentiful source of cellular events in humans, mice, and rats, collected from over 31,500 publications. In this work, we have developed 16 modeling rules based on hybrid functional Petri net with extension (HFPNe), which is suitable for graphically representing and simulating biological processes. In the modeling rules, each Petri net element is incorporated with Cell System Ontology (CSO) to enable semantic interoperability of models. As a formal ontology for biological pathway modeling with dynamics, CSO also defines biological terminology and corresponding icons. By combining HFPNe with the CSO features, it is possible to convert TRANSPATH data into simulation-based and semantically valid models. The results are encoded into a biological pathway format, Cell System Markup Language (CSML), which eases the exchange and integration of biological data and models. By using the 16 modeling rules, 97% of the reactions in TRANSPATH are converted into simulation-based models represented in CSML. This reconstruction demonstrates that it is possible to use our rules to generate quantitative models from static pathway descriptions.
KIDFamMap: a database of kinase-inhibitor-disease family maps for kinase inhibitor selectivity and binding mechanisms

PubMed Central

Chiu, Yi-Yuan; Lin, Chih-Ta; Huang, Jhang-Wei; Hsu, Kai-Cheng; Tseng, Jen-Hu; You, Syuan-Ren; Yang, Jinn-Moon

2013-01-01

Kinases play central roles in signaling pathways and are promising therapeutic targets for many diseases. Designing selective kinase inhibitors is an emergent and challenging task, because kinases share an evolutionarily conserved ATP-binding site. KIDFamMap (http://gemdock.life.nctu.edu.tw/KIDFamMap/) is the first database to explore kinase-inhibitor families (KIFs) and kinase-inhibitor-disease (KID) relationships for kinase inhibitor selectivity and mechanisms. This database includes 1208 KIFs, 962 KIDs, 55 603 kinase-inhibitor interactions (KIIs), 35 788 kinase inhibitors, 399 human protein kinases, 339 diseases and 638 disease allelic variants. Here, a KIF can be defined as follows: (i) the kinases in the KIF have significant sequence similarity, (ii) the inhibitors in the KIF have significant topology similarity and (iii) the KIIs in the KIF have significant interaction similarity. The KIIs within a KIF are often conserved on some consensus KIDFamMap anchors, which represent conserved interactions between the kinase subsites and consensus moieties of their inhibitors. Our experimental results reveal that the members of a KIF often possess similar inhibition profiles. The KIDFamMap anchors can reflect kinase conformation types, kinase functions and kinase inhibitor selectivity. We believe that KIDFamMap provides biological insights into kinase inhibitor selectivity and binding mechanisms. PMID:23193279
ARACNe-based inference, using curated microarray data, of Arabidopsis thaliana root transcriptional regulatory networks

PubMed Central

2014-01-01

Background Uncovering the complex transcriptional regulatory networks (TRNs) that underlie plant and animal development remains a challenge. However, a vast amount of data from public microarray experiments is available, which can be subject to inference algorithms in order to recover reliable TRN architectures. Results In this study we present a simple bioinformatics methodology that uses public, carefully curated microarray data and the mutual information algorithm ARACNe in order to obtain a database of transcriptional interactions. We used data from Arabidopsis thaliana root samples to show that the transcriptional regulatory networks derived from this database successfully recover previously identified root transcriptional modules and to propose new transcription factors for the SHORT ROOT/SCARECROW and PLETHORA pathways. We further show that these networks are a powerful tool to integrate and analyze high-throughput expression data, as exemplified by our analysis of a SHORT ROOT induction time-course microarray dataset, and are a reliable source for the prediction of novel root gene functions. In particular, we used our database to predict novel genes involved in root secondary cell-wall synthesis and identified the MADS-box TF XAL1/AGL12 as an unexpected participant in this process. Conclusions This study demonstrates that network inference using carefully curated microarray data yields reliable TRN architectures. In contrast to previous efforts to obtain root TRNs, that have focused on particular functional modules or tissues, our root transcriptional interactions provide an overview of the transcriptional pathways present in Arabidopsis thaliana roots and will likely yield a plethora of novel hypotheses to be tested experimentally. PMID:24739361
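ARACNe scores gene pairs by mutual information (MI) and then prunes likely indirect edges with the data processing inequality (DPI): in a fully connected triplet, the weakest edge is treated as indirect. The sketch below shows only that pruning step in one simple formulation, with a hypothetical `tolerance` parameter; MI estimation is omitted, and the gene names and MI values are invented rather than taken from the study.

```python
from itertools import combinations

def apply_dpi(mi, tolerance=0.0):
    """Drop the weakest edge of every fully connected triplet.

    mi -- dict mapping frozenset({gene1, gene2}) to a mutual-information score.
    Edges are marked against the original scores and removed at the end,
    so the outcome does not depend on the order in which triplets are visited.
    """
    genes = sorted({gene for pair in mi for gene in pair})
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        edges = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if not all(edge in mi for edge in edges):
            continue  # DPI only applies when all three edges are present
        weakest = min(edges, key=lambda edge: mi[edge])
        others = [mi[edge] for edge in edges if edge != weakest]
        if mi[weakest] < min(others) * (1.0 - tolerance):
            to_remove.add(weakest)
    return {edge: score for edge, score in mi.items() if edge not in to_remove}

# Hypothetical scores: A regulates B, B regulates C, and A-C is only indirect.
scores = {
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("A", "C")): 0.3,
}
pruned = apply_dpi(scores)
print(sorted(tuple(sorted(edge)) for edge in pruned))
```

The weak A-C edge is removed while the two stronger edges survive, which is how the approach favors direct regulatory interactions over correlations that pass through an intermediate gene.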
The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases

PubMed Central

Caspi, Ron; Altman, Tomer; Dale, Joseph M.; Dreher, Kate; Fulcher, Carol A.; Gilham, Fred; Kaipa, Pallavi; Karthikeyan, Athikkattuvalasu S.; Kothari, Anamika; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A.; Paley, Suzanne; Popescu, Liviu; Pujar, Anuradha; Shearer, Alexander G.; Zhang, Peifen; Karp, Peter D.

2010-01-01

The MetaCyc database (MetaCyc.org) is a comprehensive and freely accessible resource for metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are experimentally determined, small-molecule metabolic pathways and are curated from the primary scientific literature. With more than 1400 pathways, MetaCyc is the largest collection of metabolic pathways currently available. Pathway reactions are linked to one or more well-characterized enzymes, and both pathways and enzymes are annotated with reviews, evidence codes, and literature citations. BioCyc (BioCyc.org) is a collection of more than 500 organism-specific Pathway/Genome Databases (PGDBs). Each BioCyc PGDB contains the full genome and predicted metabolic network of one organism. The network, which is predicted by the Pathway Tools software using MetaCyc as a reference, consists of metabolites, enzymes, reactions and metabolic pathways. BioCyc PGDBs also contain additional features, such as predicted operons, transport systems, and pathway hole-fillers. The BioCyc Web site offers several tools for the analysis of the PGDBs, including Omics Viewers that enable visualization of omics datasets on two different genome-scale diagrams and tools for comparative analysis. The BioCyc PGDBs generated by SRI are offered for adoption by any party interested in curation of metabolic, regulatory, and genome-related information about an organism. PMID:19850718
Protein-protein interaction analysis of Alzheimer's disease and NAFLD based on systems biology methods unhide common ancestor pathways.

PubMed

Karbalaei, Reza; Allahyari, Marzieh; Rezaei-Tavirani, Mostafa; Asadzadeh-Aghdaei, Hamid; Zali, Mohammad Reza

2018-01-01

This study reconstructs and analyzes networks for two diseases, NAFLD and Alzheimer's disease, and examines their relationship using systems biology methods. NAFLD and Alzheimer's disease are two complex diseases with increasing prevalence and high costs for countries. There are some reports of a relationship and of shared spreading pathways between these two diseases. In addition, they have some similar risk factors, particularly lifestyle factors such as diet and exercise. A systems biology approach can therefore help to uncover their relationship. The DisGeNET and STRING databases were the sources for disease genes and network construction. Three plugins of the Cytoscape software, ClusterONE, ClueGO and CluePedia, were used to analyze and cluster the networks and to perform pathway enrichment. An R package was used to determine the best centrality method. Finally, hub and bottleneck nodes were defined based on degree and betweenness. NAFLD and Alzheimer's disease shared 190 genes, which were used to construct a network with the STRING database. The resulting network contained 182 nodes and 2591 edges and comprised four clusters. Separate enrichment of these clusters led to carbohydrate metabolism, long-chain fatty acid, and regulation of JAK-STAT and IL-17 signaling pathways, respectively. Seven genes were selected as hub-bottlenecks: IL6, AKT1, TP53, TNF, JUN, VEGFA and PPARG. Enrichment of these proteins and their first neighbors in the network using the OMIM database pointed to diabetes and obesity as ancestors of NAFLD and AD. Systems biology methods, specifically PPI networks, can be useful for analyzing complicated related diseases. Finding hub and bottleneck proteins should be a goal in drug design and in the introduction of disease markers.
Reactome graph database: Efficient access to complex pathway data

PubMed Central

Korninger, Florian; Viteri, Guilherme; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D’Eustachio, Peter

2018-01-01

Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types. PMID:29377902
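The kind of query the abstract has in mind, one that traverses highly interconnected data, can be pictured with a small in-memory sketch. It is not Cypher and does not use the Reactome data model or the ContentService. The node labels, the `HAS_EVENT` mapping, and the helper names are hypothetical. The point is only that following edges directly replaces the repeated self-joins a relational table of parent/child rows would need.

```python
from collections import deque

# Hypothetical toy hierarchy: each pathway lists its sub-pathways and reactions.
HAS_EVENT = {
    "Pathway:P1": ["Pathway:P1.1", "Reaction:R1"],
    "Pathway:P1.1": ["Reaction:R2", "Reaction:R3"],
    "Pathway:P2": ["Reaction:R4"],
}


def descendants(graph, root):
    """Breadth-first traversal returning every event reachable from root."""
    seen = {root}
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def reactions_under(graph, pathway):
    """All reactions at any depth below a pathway."""
    return [n for n in descendants(graph, pathway) if n.startswith("Reaction:")]


if __name__ == "__main__":
    print(reactions_under(HAS_EVENT, "Pathway:P1"))
```

The traversal depth does not need to be known in advance, which is the property that makes variable-depth hierarchy queries awkward to express as a fixed number of joins.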

Reactome graph database: Efficient access to complex pathway data.

PubMed

Fabregat, Antonio; Korninger, Florian; Viteri, Guilherme; Sidiropoulos, Konstantinos; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning

2018-01-01

Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods.

PubMed

Tuo, Youlin; An, Ning; Zhang, Ming

2018-03-01

The aim of the present study was to investigate the feature genes in metastatic breast cancer samples. A total of 5 expression profiles of metastatic breast cancer samples were downloaded from the Gene Expression Omnibus database, which were then analyzed using the MetaQC and MetaDE packages in R language. The feature genes between metastasis and non‑metastasis samples were screened under the threshold of P<0.05. Based on the protein‑protein interactions (PPIs) in the Biological General Repository for Interaction Datasets, Human Protein Reference Database and Biomolecular Interaction Network Database, the PPI network of the feature genes was constructed. The feature genes identified by topological characteristics were then used for support vector machine (SVM) classifier training and verification. The accuracy of the SVM classifier was then evaluated using another independent dataset from The Cancer Genome Atlas database. Finally, function and pathway enrichment analyses for genes in the SVM classifier were performed. A total of 541 feature genes were identified between metastatic and non‑metastatic samples. The top 10 genes with the highest betweenness centrality values in the PPI network of feature genes were Nuclear RNA Export Factor 1, cyclin‑dependent kinase 2 (CDK2), myelocytomatosis proto‑oncogene protein (MYC), Cullin 5, SHC Adaptor Protein 1, Clathrin heavy chain, Nucleolin, WD repeat domain 1, proteasome 26S subunit non‑ATPase 2 and telomeric repeat binding factor 2. The cyclin‑dependent kinase inhibitor 1A (CDKN1A), E2F transcription factor 1 (E2F1), and MYC interacted with CDK2. The SVM classifier constructed by the top 30 feature genes was able to distinguish metastatic samples from non‑metastatic samples [correct rate, specificity, positive predictive value and negative predictive value >0.89; sensitivity >0.84; area under the receiver operating characteristic curve (AUROC) >0.96]. 
The verification of the SVM classifier in an independent dataset (35 metastatic samples and 143 non‑metastatic samples) revealed an accuracy of 94.38% and AUROC of 0.958. Cell cycle-associated functions and pathways were the most significant terms of the 30 feature genes. An SVM classifier was constructed to assess the possibility of breast cancer metastasis, which presented high accuracy in several independent datasets. CDK2, CDKN1A, E2F1 and MYC were indicated as the potential feature genes in metastatic breast cancer.
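The performance measures named in this abstract (correct rate, sensitivity, specificity, positive and negative predictive value) all derive from a confusion matrix. The abstract reports only the class sizes (35 and 143) and the accuracy. An accuracy of 94.38% on 178 samples corresponds to 168 correct calls, but the split of the 10 errors between classes is not given, so the counts in this sketch are hypothetical and chosen only to match those totals.

```python
def classifier_metrics(tp, fn, tn, fp):
    """Standard binary-classification measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical split: 35 metastatic (tp + fn) and 143 non-metastatic (tn + fp)
# samples with 168 of 178 classified correctly. The real split is not reported.
metrics = classifier_metrics(tp=30, fn=5, tn=138, fp=5)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

With strongly imbalanced classes such as these, accuracy alone can hide weak sensitivity, which is presumably why the abstract lists the measures separately.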
In-Depth Transcriptome Sequencing of Mexican Lime Trees Infected with Candidatus Phytoplasma aurantifolia.

PubMed

Mardi, Mohsen; Karimi Farsad, Laleh; Gharechahi, Javad; Salekdeh, Ghasem Hosseini

2015-01-01

Witches' broom disease of acid lime greatly affects the production of Mexican lime in Iran. It is caused by a phytoplasma (Candidatus Phytoplasma aurantifolia). However, the molecular mechanisms that underlie phytoplasma pathogenicity and the mode of interactions with host plants are largely unknown. Here, high-throughput transcriptome sequencing was conducted to explore gene expression signatures associated with phytoplasma infection in Mexican lime trees. We assembled 78,185 unique transcript sequences (unigenes) with an average length of 530 nt. Of these, 41,805 (53.4%) were annotated against the NCBI non-redundant (nr) protein database using a BLASTx search (e-value ≤ 1e-5). When the abundances of unigenes in healthy and infected plants were compared, 2,805 transcripts showed significant differences (false discovery rate ≤ 0.001 and log2 ratio ≥ 1.5). These differentially expressed genes (DEGs) were significantly enriched in 43 KEGG metabolic and regulatory pathways. The up-regulated DEGs were mainly categorized into pathways with possible implication in plant-pathogen interaction, including cell wall biogenesis and degradation, sucrose metabolism, secondary metabolism, hormone biosynthesis and signalling, amino acid and lipid metabolism, while down-regulated DEGs were predominantly enriched in ubiquitin proteolysis and oxidative phosphorylation pathways. Our analysis provides novel insight into the molecular pathways that are deregulated during the host-pathogen interaction in Mexican lime trees infected by phytoplasma. The findings can be valuable for unravelling the molecular mechanisms of plant-phytoplasma interactions and can pave the way for engineering lime trees with resistance to witches' broom disease.
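The differential-expression criterion quoted above (false discovery rate ≤ 0.001 and log2 ratio ≥ 1.5) can be written as a simple filter. This is a sketch, not the authors' pipeline. It assumes the threshold applies to the absolute log2 ratio, since both up- and down-regulated genes are reported, and that abundances are strictly positive. The function name and the example values are hypothetical.

```python
import math


def is_deg(healthy, infected, fdr, min_abs_log2=1.5, max_fdr=0.001):
    """Return True if a transcript passes both thresholds.

    healthy, infected: positive abundance values for the two conditions.
    fdr: false discovery rate already computed for this transcript.
    """
    log2_ratio = math.log2(infected / healthy)
    return fdr <= max_fdr and abs(log2_ratio) >= min_abs_log2


if __name__ == "__main__":
    # (healthy abundance, infected abundance, FDR), made-up values.
    examples = [(10, 40, 0.0005), (10, 20, 0.0005), (40, 10, 0.0001), (10, 40, 0.01)]
    for healthy, infected, fdr in examples:
        print(healthy, infected, fdr, is_deg(healthy, infected, fdr))
```

A log2 ratio of 1.5 corresponds to roughly a 2.8-fold change, so a plain two-fold change (log2 ratio of 1) does not pass this filter.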
Text mining-based in silico drug discovery in oral mucositis caused by high-dose cancer therapy.

PubMed

Kirk, Jon; Shah, Nirav; Noll, Braxton; Stevens, Craig B; Lawler, Marshall; Mougeot, Farah B; Mougeot, Jean-Luc C

2018-08-01

Oral mucositis (OM) is a major dose-limiting side effect of chemotherapy and radiation used in cancer treatment. Due to the complex nature of OM, currently available drug-based treatments are of limited efficacy. Our objectives were (i) to determine genes and molecular pathways associated with OM and wound healing using computational tools and publicly available data and (ii) to identify drugs formulated for topical use targeting the relevant OM molecular pathways. OM and wound healing-associated genes were determined by text mining, and the intersection of the two gene sets was selected for gene ontology analysis using the GeneCodis program. Protein interaction network analysis was performed using STRING-db. Enriched gene sets belonging to the identified pathways were queried against the Drug-Gene Interaction database to find drug candidates for topical use in OM. Our analysis identified 447 genes common to both the "OM" and "wound healing" text mining concepts. Gene enrichment analysis yielded 20 genes representing six pathways and targetable by a total of 32 drugs which could possibly be formulated for topical application. A manual search on ClinicalTrials.gov confirmed no relevant pathway/drug candidate had been overlooked. Twenty-five of the 32 drugs can directly affect the PTGS2 (COX-2) pathway, the pathway that has been targeted in previous clinical trials with limited success. Drug discovery using in silico text mining and pathway analysis tools can facilitate the identification of existing drugs that have the potential of topical administration to improve OM treatment.
Web-based metabolic network visualization with a zooming user interface

PubMed Central

2011-01-01

Background: Displaying complex metabolic-map diagrams in Web browsers, and allowing users to interact with them to query and overlay expression data, is challenging. Description: We present a Web-based metabolic-map diagram, called the Cellular Overview, which can be interactively explored by the user. The main characteristic of this application is the zooming user interface, which enables the user to focus on appropriate granularities of the network at will. Various searching commands are available to visually highlight sets of reactions, pathways, enzymes, metabolites, and so on. Expression data from single or multiple experiments can be overlaid on the diagram, which we call the Omics Viewer capability. The application provides Web services to highlight the diagram and to invoke the Omics Viewer. This application is written entirely in JavaScript for client browsers and connects to a Pathway Tools Web server to retrieve data and diagrams. It uses the OpenLayers library to display tiled diagrams. Conclusions: This new online tool is capable of displaying large and complex metabolic-map diagrams in a very interactive manner. This application is available as part of the Pathway Tools software that powers multiple metabolic databases, including BioCyc.org; the Cellular Overview is accessible under the Tools menu. PMID:21595965
Screening of differentially expressed genes between multiple trauma patients with and without sepsis.

PubMed

Ji, S C; Pan, Y T; Lu, Q Y; Sun, Z Y; Liu, Y Z

2014-03-17

The purpose of this study was to identify critical genes associated with septic multiple trauma by comparing peripheral whole blood samples from multiple trauma patients with and without sepsis. A microarray data set was downloaded from the Gene Expression Omnibus (GEO) database. This data set included 70 samples, 36 from multiple trauma patients with sepsis and 34 from multiple trauma patients without sepsis (as a control set). The data were preprocessed, and differentially expressed genes (DEGs) were then screened using packages of the R language. Functional analysis of DEGs was performed with DAVID. Interaction networks were then established for the most up- and down-regulated genes using HitPredict. Pathway-enrichment analysis was conducted for genes in the networks using WebGestalt. Fifty-eight DEGs were identified. The expression levels of PLAU (down-regulated) and MMP8 (up-regulated) presented the largest fold-changes, and interaction networks were established for these genes. Further analysis revealed that PLAT (plasminogen activator, tissue) and SERPINF2 (serpin peptidase inhibitor, clade F, member 2), which interact with PLAU, play important roles in the pathway of the complement and coagulation cascade. We hypothesize that PLAU is a major regulator of the complement and coagulation cascade, and down-regulation of PLAU results in dysfunction of the pathway, causing sepsis.
Comparison of human cell signaling pathway databases—evolution, drawbacks and challenges

PubMed Central

Chowdhury, Saikat; Sarkar, Ram Rup

2015-01-01

Elucidating the complexities of cell signaling pathways is of immense importance for understanding various biological phenomena, such as the dynamics of gene/protein expression regulation, cell fate determination, embryogenesis and disease progression. The successful completion of the Human Genome Project has also helped experimental and theoretical biologists to analyze various important pathways. To advance this study, during the past two decades, systematic collections of pathway data from experimental studies have been compiled and distributed freely by several databases, which also integrate various computational tools for further analysis. Despite significant advancements, several drawbacks and challenges remain, such as pathway data heterogeneity, annotation, regular updates and automated image reconstruction, which motivated us to perform a thorough review of 24 popular and actively functioning cell signaling databases. Based on two major characteristics, pathway information and technical details, freely accessible data from commercial and academic databases are examined to understand their evolution and enrichment. This review not only helps to identify some novel and useful features that are not yet included in any of the databases, but also highlights their current limitations and proposes reasonable solutions for future database development, which could be useful to the whole scientific community. PMID:25632107
Identification of compound-protein interactions through the analysis of gene ontology, KEGG enrichment for proteins and molecular fragments of compounds.

PubMed

Chen, Lei; Zhang, Yu-Hang; Zheng, Mingyue; Huang, Tao; Cai, Yu-Dong

2016-12-01

Compound-protein interactions play important roles in every cell via the recognition and regulation of specific functional proteins. The correct identification of compound-protein interactions can lead to a good comprehension of this complicated system and provide useful input for the investigation of various attributes of compounds and proteins. In this study, we attempted to understand this system by extracting properties from both proteins and compounds, in which proteins were represented by gene ontology and KEGG pathway enrichment scores and compounds were represented by molecular fragments. Advanced feature selection methods, including minimum redundancy maximum relevance, incremental feature selection, and the basic machine learning algorithm random forest, were used to analyze these properties and extract core factors for the determination of actual compound-protein interactions. Compound-protein interactions reported in The Binding Databases were used as positive samples. To improve the reliability of the results, the analytic procedure was executed five times using different negative samples. Simultaneously, five optimal prediction methods based on a random forest and yielding maximum MCCs of approximately 77.55 % were constructed and may be useful tools for the prediction of compound-protein interactions. This work provides new clues to understanding the system of compound-protein interactions by analyzing extracted core features. Our results indicate that compound-protein interactions are related to biological processes involving immune, developmental and hormone-associated pathways.
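The Matthews correlation coefficient (MCC) used above to score the random forest predictors is computed from the four confusion-matrix counts. The abstract reports it as a percentage, so approximately 77.55% corresponds to an MCC of about 0.7755. The counts in the example below are hypothetical and are not taken from the study.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, 1]; 0.0 if undefined."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts for one run with balanced positive and negative samples.
    print(round(mcc(tp=90, tn=88, fp=12, fn=10), 4))
```

Unlike accuracy, the MCC stays informative when the positive and negative sets differ in size, which matters when negative samples are drawn repeatedly as described in the abstract.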
PATIKAweb: a Web interface for analyzing biological pathways through advanced querying and visualization.

PubMed

Dogrusoz, U; Erson, E Z; Giral, E; Demir, E; Babur, O; Cetintas, A; Colak, R

2006-02-01

PATIKAweb provides a Web interface for retrieving and analyzing biological pathways in the PATIKA database, which contains data integrated from various prominent public pathway databases. It features a user-friendly interface, dynamic visualization and automated layout, advanced graph-theoretic queries for extracting biologically important phenomena, local persistence capability and exporting facilities to various pathway exchange formats.
Pathway enrichment analysis approach based on topological structure and updated annotation of pathway.

PubMed

Yang, Qian; Wang, Shuyuan; Dai, Enyu; Zhou, Shunheng; Liu, Dianming; Liu, Haizhou; Meng, Qianqian; Jiang, Bin; Jiang, Wei

2017-08-16

Pathway enrichment analysis has been widely used to identify cancer risk pathways, and contributes to elucidating the mechanism of tumorigenesis. However, most of the existing approaches use outdated pathway information and neglect the complex gene interactions within pathways. Here, we first briefly reviewed the existing widely used pathway enrichment analysis approaches, and then proposed a novel topology-based pathway enrichment analysis (TPEA) method, which integrated topological properties and global upstream/downstream positions of genes in pathways. We compared TPEA with four widely used pathway enrichment analysis tools, including database for annotation, visualization and integrated discovery (DAVID), gene set enrichment analysis (GSEA), centrality-based pathway enrichment (CePa) and signaling pathway impact analysis (SPIA), by analyzing six gene expression profiles of three tumor types (colorectal cancer, thyroid cancer and endometrial cancer). As a result, we identified several well-known cancer risk pathways that could not be obtained by the existing tools, and the results of TPEA were more stable than those of the other tools in analyzing different data sets of the same cancer. Ultimately, we developed an R package to implement TPEA, which can update KEGG pathway information online and is available at the Comprehensive R Archive Network (CRAN): https://cran.r-project.org/web/packages/TPEA/. © The Author 2017. Published by Oxford University Press.
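For orientation, the simplest form of pathway enrichment is an over-representation test that ignores topology, the baseline that topology-based methods such as TPEA aim to improve on. The sketch below is that baseline, a one-sided hypergeometric test. It is not the TPEA method, whose scoring is not described in the abstract, and the function and parameter names are assumptions.

```python
from math import comb


def enrichment_p_value(population, pathway_size, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `pathway_size` belong to the pathway."""
    upper = min(pathway_size, selected)
    favourable = sum(
        comb(pathway_size, i) * comb(population - pathway_size, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(population, selected)


if __name__ == "__main__":
    # Made-up numbers: 20000 genes, a 100-gene pathway, 500 selected genes,
    # 10 of which fall in the pathway.
    print(enrichment_p_value(20000, 100, 500, 10))
```

This test treats every gene in a pathway as interchangeable, so a change in an upstream regulator counts the same as a change in a terminal gene; that is the limitation the abstract refers to when it mentions neglected gene interactions.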
Evolution of the NASA/IPAC Extragalactic Database (NED) into a Data Mining Discovery Engine

NASA Astrophysics Data System (ADS)

Mazzarella, Joseph M.; NED Team

2017-06-01

We review recent advances and ongoing work in evolving the NASA/IPAC Extragalactic Database (NED) beyond an object reference database into a data mining discovery engine. Updates to the infrastructure and data integration techniques are enabling more than a 10-fold expansion; NED will soon contain over a billion objects with their fundamental attributes fused across the spectrum via cross-identifications among the largest sky surveys (e.g., GALEX, SDSS, 2MASS, AllWISE, EMU), and over 100,000 smaller but scientifically important catalogs and journal articles. The recent discovery of super-luminous spiral galaxies exemplifies the opportunities for data mining and science discovery directly from NED's rich data synthesis. Enhancements to the user interface, including new APIs, VO protocols, and queries involving derived physical quantities, are opening new pathways for panchromatic studies of large galaxy samples. Examples are shown of graphics characterizing the content of NED, as well as initial steps in exploring the database via interactive statistical visualizations.
Domain fusion analysis by applying relational algebra to protein sequence and domain databases.

PubMed

Truong, Kevin; Ikura, Mitsuhiko

2003-05-06

Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From the scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.
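The idea of expressing domain fusion analysis in standard SQL can be sketched with Python's built-in sqlite3 module. The schema, the table and column names, and the toy rows are all hypothetical, and the query is one plausible formulation rather than the authors' own. Two proteins are reported as putatively linked when a third, composite protein carries a domain from each of them and neither partner already carries the other's domain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE protein_domain (protein TEXT, domain TEXT)")
conn.executemany(
    "INSERT INTO protein_domain VALUES (?, ?)",
    [
        ("fused1", "DomX"), ("fused1", "DomY"),  # composite protein
        ("protA", "DomX"),                       # carries only DomX
        ("protB", "DomY"),                       # carries only DomY
        ("protC", "DomZ"),                       # unrelated
    ],
)

# Self-joins pick two different domains of one composite protein (c1, c2),
# then a partner for each domain (a, b). The NOT EXISTS clauses drop
# partners that themselves already contain both domains.
QUERY = """
SELECT a.protein AS partner_a, b.protein AS partner_b, c1.protein AS composite
FROM protein_domain AS c1
JOIN protein_domain AS c2
  ON c2.protein = c1.protein AND c1.domain < c2.domain
JOIN protein_domain AS a
  ON a.domain = c1.domain AND a.protein <> c1.protein
JOIN protein_domain AS b
  ON b.domain = c2.domain AND b.protein <> c1.protein
WHERE a.protein <> b.protein
  AND NOT EXISTS (SELECT 1 FROM protein_domain AS x
                  WHERE x.protein = a.protein AND x.domain = c2.domain)
  AND NOT EXISTS (SELECT 1 FROM protein_domain AS y
                  WHERE y.protein = b.protein AND y.domain = c1.domain)
ORDER BY partner_a, partner_b, composite
"""

links = conn.execute(QUERY).fetchall()

if __name__ == "__main__":
    print(links)
```

Because the whole analysis is a single declarative query, rerunning it after the domain table is refreshed is enough to update the predictions, which matches the abstract's point about keeping results current.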
SignaLink 2 – a signaling pathway resource with multi-layered regulatory networks

PubMed Central

2013-01-01

Background: Signaling networks in eukaryotes are made up of upstream and downstream subnetworks. The upstream subnetwork contains the intertwined network of signaling pathways, while the downstream regulatory part contains transcription factors and their binding sites on the DNA as well as microRNAs and their mRNA targets. Currently, most signaling and regulatory databases contain only a subsection of this network, making comprehensive analyses highly time-consuming and dependent on specific data handling expertise. The need for detailed mapping of signaling systems is also supported by the fact that several drug development failures were caused by undiscovered cross-talk or regulatory effects of drug targets. We previously created a uniformly curated signaling pathway resource, SignaLink, to facilitate the analysis of pathway cross-talks. Here, we present SignaLink 2, which significantly extends the coverage and applications of its predecessor. Description: We developed a novel concept to integrate and utilize different subsections (i.e., layers) of the signaling network. The multi-layered (onion-like) database structure is made up of signaling pathways, their pathway regulators (e.g., scaffold and endocytotic proteins) and modifier enzymes (e.g., phosphatases, ubiquitin ligases), as well as transcriptional and post-transcriptional regulators of all of these components. The user-friendly website allows the interactive exploration of how each signaling protein is regulated. The customizable download page enables the analysis of any user-specified part of the signaling network. Compared to other signaling resources, distinctive features of SignaLink 2 are the following: 1) it involves experimental data not only from humans but also from two invertebrate model organisms, C. elegans and D. melanogaster; 2) it combines manual curation with large-scale datasets; 3) it provides confidence scores for each interaction; 4) it operates a customizable download page with multiple file formats (e.g., BioPAX, Cytoscape, SBML). Non-profit users can access SignaLink 2 free of charge at http://SignaLink.org. Conclusions: With SignaLink 2 as a single resource, users can effectively analyze signaling pathways, scaffold proteins, modifier enzymes, transcription factors and miRNAs that are important in the regulation of signaling processes. This integrated resource allows the systems-level examination of how cross-talks and signaling flow are regulated, and provides data for cross-species comparisons and drug discovery analyses. PMID:23331499
SignaLink 2 - a signaling pathway resource with multi-layered regulatory networks.

PubMed

Fazekas, Dávid; Koltai, Mihály; Türei, Dénes; Módos, Dezső; Pálfy, Máté; Dúl, Zoltán; Zsákai, Lilian; Szalay-Bekő, Máté; Lenti, Katalin; Farkas, Illés J; Vellai, Tibor; Csermely, Péter; Korcsmáros, Tamás

2013-01-18

Signaling networks in eukaryotes are made up of upstream and downstream subnetworks. The upstream subnetwork contains the intertwined network of signaling pathways, while the downstream regulatory part contains transcription factors and their binding sites on the DNA as well as microRNAs and their mRNA targets. Currently, most signaling and regulatory databases contain only a subsection of this network, making comprehensive analyses highly time-consuming and dependent on specific data handling expertise. The need for detailed mapping of signaling systems is also supported by the fact that several drug development failures were caused by undiscovered cross-talk or regulatory effects of drug targets. We previously created a uniformly curated signaling pathway resource, SignaLink, to facilitate the analysis of pathway cross-talks. Here, we present SignaLink 2, which significantly extends the coverage and applications of its predecessor. We developed a novel concept to integrate and utilize different subsections (i.e., layers) of the signaling network. The multi-layered (onion-like) database structure is made up of signaling pathways, their pathway regulators (e.g., scaffold and endocytotic proteins) and modifier enzymes (e.g., phosphatases, ubiquitin ligases), as well as transcriptional and post-transcriptional regulators of all of these components. The user-friendly website allows the interactive exploration of how each signaling protein is regulated. The customizable download page enables the analysis of any user-specified part of the signaling network. Compared to other signaling resources, distinctive features of SignaLink 2 are the following: 1) it involves experimental data not only from humans but also from two invertebrate model organisms, C. elegans and D. melanogaster; 2) it combines manual curation with large-scale datasets; 3) it provides confidence scores for each interaction; 4) it operates a customizable download page with multiple file formats (e.g., BioPAX, Cytoscape, SBML). Non-profit users can access SignaLink 2 free of charge at http://SignaLink.org. With SignaLink 2 as a single resource, users can effectively analyze signaling pathways, scaffold proteins, modifier enzymes, transcription factors and miRNAs that are important in the regulation of signaling processes. This integrated resource allows the systems-level examination of how cross-talks and signaling flow are regulated, and provides data for cross-species comparisons and drug discovery analyses.
SorghumFDB: sorghum functional genomics database with multidimensional network analysis.

PubMed

Tian, Tian; You, Qi; Zhang, Liwei; Yi, Xin; Yan, Hengyu; Xu, Wenying; Su, Zhen

2016-01-01

Sorghum (Sorghum bicolor [L.] Moench) has excellent agronomic traits and biological properties, such as heat and drought-tolerance. It is a C4 grass and potential bioenergy-producing plant, which makes it an important crop worldwide. With the sorghum genome sequence released, it is essential to establish a sorghum functional genomics data mining platform. We collected genomic data and some functional annotations to construct a sorghum functional genomics database (SorghumFDB). SorghumFDB integrated knowledge of sorghum gene family classifications (transcription regulators/factors, carbohydrate-active enzymes, protein kinases, ubiquitins, cytochrome P450, monolignol biosynthesis related enzymes, R-genes and organelle-genes), detailed gene annotations, miRNA and target gene information, orthologous pairs in the model plants Arabidopsis, rice and maize, gene loci conversions and a genome browser. We further constructed a dynamic network of multidimensional biological relationships, comprised of the co-expression data, protein-protein interactions and miRNA-target pairs. We took effective measures to combine the network, gene set enrichment and motif analyses to determine the key regulators that participate in related metabolic pathways, such as the lignin pathway, which is a major biological process in bioenergy-producing plants. Database URL: http://structuralbiology.cau.edu.cn/sorghum/index.html. © The Author(s) 2016. Published by Oxford University Press.
Prediction of Oncogenic Interactions and Cancer-Related Signaling Networks Based on Network Topology

PubMed Central

Acencio, Marcio Luis; Bovolenta, Luiz Augusto; Camilo, Esther; Lemke, Ney

2013-01-01

Cancer has been increasingly recognized as a systems biology disease since many investigators have demonstrated that this malignant phenotype emerges from abnormal protein-protein, regulatory and metabolic interactions induced by simultaneous structural and regulatory changes in multiple genes and pathways. Therefore, the identification of oncogenic interactions and cancer-related signaling networks is crucial for better understanding cancer. As experimental techniques for determining such interactions and signaling networks are labor-intensive and time-consuming, the development of a computational approach capable of accomplishing this task would be of great value. For this purpose, we present here a novel computational approach based on network topology and machine learning capable of predicting oncogenic interactions and extracting relevant cancer-related signaling subnetworks from an integrated network of human gene interactions (INHGI). This approach, called graph2sig, is twofold: first, it assigns oncogenic scores to all interactions in the INHGI and then these oncogenic scores are used as edge weights to extract oncogenic signaling subnetworks from INHGI. Regarding the prediction of oncogenic interactions, we showed that graph2sig is able to recover 89% of known oncogenic interactions with a precision of 77%. Moreover, the interactions that received high oncogenic scores are enriched in genes for which mutations have been causally implicated in cancer. We also demonstrated that graph2sig is potentially useful in extracting oncogenic signaling subnetworks: more than 80% of constructed subnetworks contain more than 50% of original interactions in their corresponding oncogenic linear pathways present in the KEGG PATHWAY database. In addition, the potential oncogenic signaling subnetworks discovered by graph2sig are supported by experimental evidence. Taken together, these results suggest that graph2sig can be a useful tool for cancer researchers interested in detecting the signaling networks most prone to contribute to the emergence of the malignant phenotype. PMID:24204854
Network pharmacology-based identification of key pharmacological pathways of Yin-Huang-Qing-Fei capsule acting on chronic bronchitis.

PubMed

Yu, Guohua; Zhang, Yanqiong; Ren, Weiqiong; Dong, Ling; Li, Junfang; Geng, Ya; Zhang, Yi; Li, Defeng; Xu, Haiyu; Yang, Hongjun

2017-01-01

For decades in China, the Yin-Huang-Qing-Fei capsule (YHQFC) has been widely used in the treatment of chronic bronchitis, with good curative effects. Owing to the complexity of traditional Chinese herbal formulas, the pharmacological mechanism of YHQFC remains unclear. To address this problem, a network pharmacology-based strategy was proposed in this study. First, the putative target profile of YHQFC was predicted using MedChem Studio, based on structural and functional similarities of all available YHQFC components to the known drugs obtained from the DrugBank database. Then, an interaction network was constructed using links between putative YHQFC targets and known therapeutic targets of chronic bronchitis. Following the calculation of four topological features (degree, betweenness, closeness, and coreness) of each node in the network, 475 major putative targets of YHQFC and their topological importance were identified. In addition, a pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes pathway database indicated that the major putative targets of YHQFC are significantly associated with various pathways involved in anti-inflammation processes, immune responses, and pathological changes caused by asthma. More interestingly, eight major putative targets of YHQFC (interleukin [IL]-3, IL-4, IL-5, IL-10, IL-13, FCER1G, CCL11, and EPX) were demonstrated to be associated with the inflammatory process that occurs during the progression of asthma. Finally, a molecular docking simulation was performed, and the results showed that 17 pairs of chemical components and candidate YHQFC targets involved in the asthma pathway had strong binding efficiencies. In conclusion, this network pharmacology-based investigation revealed that YHQFC may attenuate the inflammatory reaction of chronic bronchitis by regulating its candidate targets, which may be implicated in the major pathological processes of the asthma pathway.
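Of the four topological features listed in this abstract, closeness and coreness are the two less commonly spelled out. The sketch below computes both on a toy network. The network, the node names, and the function names are hypothetical, and the abstract does not state which software or normalisation the authors used, so this shows the definitions only.

```python
from collections import deque

# Toy undirected network: a triangle T1-T2-T3 with a pendant node T4 on T3.
TOY_NET = {
    "T1": ["T2", "T3"],
    "T2": ["T1", "T3"],
    "T3": ["T1", "T2", "T4"],
    "T4": ["T3"],
}


def closeness(graph, node):
    """(reachable nodes) / (sum of shortest-path distances to them)."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def coreness(graph):
    """Core number of each node: repeatedly peel the lowest-degree node."""
    remaining = {v: set(neigh) for v, neigh in graph.items()}
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=lambda u: len(remaining[u]))
        k = max(k, len(remaining[v]))   # core numbers never decrease while peeling
        core[v] = k
        for w in remaining.pop(v):
            remaining[w].discard(v)
    return core


if __name__ == "__main__":
    print({v: round(closeness(TOY_NET, v), 3) for v in TOY_NET})
    print(coreness(TOY_NET))
```

In the toy network the pendant node T4 has coreness 1 while the triangle nodes have coreness 2, which shows how coreness separates peripheral targets from those embedded in a densely connected core.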
Identification of transcriptional factors and key genes in primary osteoporosis by DNA microarray.

PubMed

Xie, Wengui; Ji, Lixin; Zhao, Teng; Gao, Pengfei

2015-05-09

A number of genes have been identified as related to primary osteoporosis, while less is known about the comprehensive interactions between regulating genes and proteins. We aimed to identify the differentially expressed genes (DEGs) and regulatory effects of transcription factors (TFs) involved in primary osteoporosis. The gene expression profile GSE35958 was obtained from the Gene Expression Omnibus database, including 5 primary osteoporosis and 4 normal bone tissues. The differentially expressed genes between primary osteoporosis and normal bone tissues were identified by the same package in R language. The TFs of these DEGs were predicted with the Essaghir A method. DAVID (The Database for Annotation, Visualization and Integrated Discovery) was applied to perform the GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis of DEGs. After analyzing regulatory effects, a regulatory network was built between TFs and the related DEGs. A total of 579 DEGs were screened, including 310 up-regulated genes and 269 down-regulated genes in primary osteoporosis samples. In GO terms, more up-regulated genes were enriched in transcription regulator activity, and secondly in transcription factor activity. A total of 10 significant pathways were enriched in KEGG analysis, including colorectal cancer, Wnt signaling pathway, focal adhesion, and MAPK signaling pathway. Moreover, a total of 7 TFs were enriched, of which CTNNB1, SP1, and TP53 regulated most up-regulated DEGs. The discovery of the enriched TFs might contribute to the understanding of the mechanism of primary osteoporosis. Further research on genes and TFs related to the Wnt signaling pathway and MAPK pathway is urgently needed for clinical diagnosis and directing treatment of primary osteoporosis.
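The pathway enrichment step in this abstract can be illustrated with a one-sided hypergeometric test, the calculation that commonly underlies over-representation analysis. This is a minimal sketch under stated assumptions and not necessarily the statistic DAVID itself reports; the function name and the toy counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, sample_size, overlap):
    """Probability of seeing at least `overlap` pathway genes in the sample.

    population   -- number of background genes (hypothetical parameter name)
    pathway_size -- background genes annotated to the pathway
    sample_size  -- number of genes in the DEG list
    overlap      -- DEGs that are annotated to the pathway
    """
    upper = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    # Sum the upper tail P(X = k) for k = overlap .. upper.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers only: 10 background genes, 5 in the pathway, 2 DEGs, both in the pathway.
p_value = hypergeom_enrichment_p(population=10, pathway_size=5, sample_size=2, overlap=2)
print(round(p_value, 4))
```

In practice the p-values for all tested pathways would also be corrected for multiple testing, which this sketch leaves out.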
Network pharmacology-based identification of key pharmacological pathways of Yin–Huang–Qing–Fei capsule acting on chronic bronchitis

PubMed Central

Yu, Guohua; Zhang, Yanqiong; Ren, Weiqiong; Dong, Ling; Li, Junfang; Geng, Ya; Zhang, Yi; Li, Defeng; Xu, Haiyu; Yang, Hongjun

2017-01-01

For decades in China, the Yin–Huang–Qing–Fei capsule (YHQFC) has been widely used in the treatment of chronic bronchitis, with good curative effects. Owing to the complexity of traditional Chinese herbal formulas, the pharmacological mechanism of YHQFC remains unclear. To address this problem, a network pharmacology-based strategy was proposed in this study. At first, the putative target profile of YHQFC was predicted using MedChem Studio, based on structural and functional similarities of all available YHQFC components to the known drugs obtained from the DrugBank database. Then, an interaction network was constructed using links between putative YHQFC targets and known therapeutic targets of chronic bronchitis. Following the calculation of four topological features (degree, betweenness, closeness, and coreness) of each node in the network, 475 major putative targets of YHQFC and their topological importance were identified. In addition, a pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes pathway database indicated that the major putative targets of YHQFC are significantly associated with various pathways involved in anti-inflammation processes, immune responses, and pathological changes caused by asthma. More interestingly, eight major putative targets of YHQFC (interleukin [IL]-3, IL-4, IL-5, IL-10, IL-13, FCER1G, CCL11, and EPX) were demonstrated to be associated with the inflammatory process that occurs during the progression of asthma. Finally, a molecular docking simulation was performed and the results exhibited that 17 pairs of chemical components and candidate YHQFC targets involved in asthma pathway had strong binding efficiencies. In conclusion, this network pharmacology-based investigation revealed that YHQFC may attenuate the inflammatory reaction of chronic bronchitis by regulating its candidate targets, which may be implicated in the major pathological processes of the asthma pathway. PMID:28053519
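Two of the four topological features named in this abstract, degree and coreness, can be computed with a few lines of plain Python. The sketch below is illustrative only: the node labels and edges are hypothetical, the helper names are assumptions, and it does not reproduce the software or network used in the study. Betweenness and closeness are omitted for brevity.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree(adj):
    """Number of direct neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def coreness(adj):
    """Core number of each node, found by repeatedly peeling the lowest-degree node."""
    deg = {node: len(neighbours) for node, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=lambda n: (deg[n], n))
        k = max(k, deg[node])  # the core level never decreases during peeling
        core[node] = k
        remaining.remove(node)
        for neighbour in adj[node]:
            if neighbour in remaining:
                deg[neighbour] -= 1
    return core


# Hypothetical toy network: a triangle A-B-C with one pendant node D.
graph = build_graph([("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")])
print(degree(graph))
print(coreness(graph))
```

Nodes that score highly on several such measures at once are the kind of candidates the abstract refers to as major putative targets.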
Differential gene expression analysis in glioblastoma cells and normal human brain cells based on GEO database.

PubMed

Wang, Anping; Zhang, Guibin

2017-11-01

The differentially expressed genes between glioblastoma (GBM) cells and normal human brain cells were investigated in order to perform pathway analysis and protein interaction network analysis for the differentially expressed genes. GSE12657 and GSE42656 gene chips, which contain gene expression profiles of GBM, were obtained from the Gene Expression Omnibus (GEO) database of the National Center for Biotechnology Information (NCBI). The 'limma' package in R was used to analyze the differentially expressed genes in the two gene chips, and gene integration was performed using the 'RobustRankAggreg' package. Finally, pheatmap software was used for heatmap analysis, and Cytoscape, DAVID, STRING and KOBAS were used for protein-protein interaction, Gene Ontology (GO) and KEGG analyses. The results were as follows: i) 702 differentially expressed genes were identified in GSE12657, of which 548 were significantly upregulated and 154 were significantly downregulated (p<0.01, fold-change >1), and 1,854 differentially expressed genes were identified in GSE42656, of which 1,068 were significantly upregulated and 786 were significantly downregulated (p<0.01, fold-change >1). A total of 167 differentially expressed genes, including 100 upregulated genes and 67 downregulated genes, were identified after gene integration, and the genes showed significantly different expression levels in GBM compared with normal human brain cells (p<0.05). ii) Interactions between the protein products of 101 differentially expressed genes were identified using STRING and an expression network was established. A key gene, called CALM3, was identified by Cytoscape software. iii) GO enrichment analysis showed that differentially expressed genes were mainly enriched in 'neurotransmitter:sodium symporter activity' and 'neurotransmitter transporter activity', which can affect the activity of neurotransmitter transportation. 
KEGG pathway analysis showed that the differentially expressed genes were mainly enriched in 'protein processing in endoplasmic reticulum', which can affect protein processing in endoplasmic reticulum. The results showed that: i) 167 differentially expressed genes were identified from two gene chips after integration; and ii) a protein interaction network was established, and GO and KEGG pathway analyses were successfully performed to identify and annotate the key gene, which provides new insights for studies on GBM at the gene level.
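The thresholding step reported in this abstract (p<0.01, fold-change >1) can be sketched as a simple filter. This is an illustration under assumptions: the abstract does not say whether the fold-change cutoff is on a log2 scale, so the sketch assumes an absolute log2 fold change greater than 1; the function name and the records are hypothetical, and this is not the limma or RobustRankAggreg procedure itself.

```python
def split_degs(records, p_cutoff=0.01, lfc_cutoff=1.0):
    """Split (gene, log2_fold_change, p_value) records into up- and downregulated lists.

    A gene is kept only if p_value < p_cutoff and |log2_fold_change| > lfc_cutoff
    (the log2 interpretation of the cutoff is an assumption).
    """
    up, down = [], []
    for gene, log2_fc, p_value in records:
        if p_value >= p_cutoff or abs(log2_fc) <= lfc_cutoff:
            continue  # fails the significance or the effect-size threshold
        (up if log2_fc > 0 else down).append(gene)
    return up, down


# Hypothetical records, not data from the study.
records = [
    ("g1", 2.3, 0.001),   # significant, upregulated
    ("g2", -1.8, 0.004),  # significant, downregulated
    ("g3", 0.4, 0.0005),  # significant but small effect
    ("g4", 3.0, 0.2),     # large effect but not significant
]
up, down = split_degs(records)
print(up, down)
```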

DrugPath: a database for academic investigators to match oncology molecular targets with drugs in development.

PubMed

Shah, Eric D; Fisch, Brandon M A; Arceci, Robert J; Buckley, Jonathan D; Reaman, Gregory H; Sorensen, Poul H; Triche, Timothy J; Reynolds, C Patrick

2014-05-01

Academic laboratories are developing increasingly large amounts of data that describe the genomic landscape and gene expression patterns of various types of cancers. Such data can potentially identify novel oncology molecular targets in cancer types that may not be the primary focus of a drug sponsor's initial research for an investigational new drug. Obtaining preclinical data that point toward the potential for a given molecularly targeted agent, or a novel combination of agents requires knowledge of drugs currently in development in both the academic and commercial sectors. We have developed the DrugPath database ( http://www.drugpath.org ) as a comprehensive, free-of-charge resource for academic investigators to identify agents being developed in academics or industry that may act against molecular targets of interest. DrugPath data on molecular targets overlay the Michigan Molecular Interactions ( http://mimi.ncibi.org ) gene-gene interaction map to facilitate identification of related agents in the same pathway. The database catalogs 2,081 drug development programs representing 751 drug sponsors and 722 molecular and genetic targets. DrugPath should assist investigators in identifying and obtaining drugs acting on specific molecular targets for biological and preclinical therapeutic studies.
Pathway Analysis and Omics Data Visualization Using Pathway Genome Databases: FragariaCyc, a Case Study.

PubMed

Naithani, Sushma; Jaiswal, Pankaj

2017-01-01

The species-specific plant Pathway Genome Databases (PGDBs) based on the BioCyc platform provide a conceptual model of the cellular metabolic network of an organism. Such frameworks allow analysis of genome-scale expression data to understand changes in the overall metabolism of an organism (or organs, tissues, and cells) in response to various intrinsic (e.g. developmental and differentiation) and/or extrinsic signals (e.g. pathogens and abiotic stresses) from the surrounding environment. Using FragariaCyc, a pathway database for the diploid strawberry Fragaria vesca, we show (1) the basic navigation across a PGDB; (2) a case study of pathway comparison across plant species; and (3) an example of RNA-Seq data analysis using the Omics Viewer tool. The protocols described here generally apply to other Pathway Tools-based PGDBs.
Role of miR-452-5p in the tumorigenesis of prostate cancer: A study based on the Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), and bioinformatics analysis.

PubMed

Gao, Li; Zhang, Li-Jie; Li, Sheng-Hua; Wei, Li-Li; Luo, Bin; He, Rong-Quan; Xia, Shuang

2018-03-06

MiR-452-5p has been reported to be down-regulated in prostate cancer, affecting the development of this type of cancer. However, the molecular mechanism of miR-452-5p in prostate cancer remains unclear. Therefore, we investigated the network of target genes of miR-452-5p in prostate cancer using bioinformatics analyses. We first analyzed the expression profiles and prognostic value of miR-452-5p in prostate cancer tissues from a public database. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and PANTHER pathway analyses, together with a disease ontology (DO) analysis, were performed to find the molecular functions of the target genes from GSE datasets and miRWalk. Finally, we validated hub genes from the protein-protein interaction (PPI) networks of the target genes in the Human Protein Atlas (HPA) database and Gene Expression Profiling Interactive Analysis (GEPIA). The optimal target genes were narrowed down by taking the intersection of up-regulated genes from GEPIA, down-regulated genes from GSE datasets, and predicted genes in miRWalk. Based on mining of GEO and ArrayExpress microarray chips and miRNA-Seq data in the TCGA database, which includes 1007 prostate cancer samples and 387 non-cancer samples, miR-452-5p is shown to be down-regulated in prostate cancer. GO, KEGG, and PANTHER pathway analyses suggested that the target genes might participate in important biological processes, such as transforming growth factor beta signaling and the positive regulation of brown fat cell differentiation and mesenchymal cell differentiation, as well as the Ras signaling pathway and pathways regulating the pluripotency of stem cells and arrhythmogenic right ventricular cardiomyopathy (ARVC). Nine genes (GABBR, PNISR, NTSR1, DOCK1, EREG, SFRP1, PTGS2, LEF1, and BMP2) were defined as hub genes in the PPI network. Three genes (FAM174B, SLC30A4, and SLIT1) were jointly shared by GEPIA, the GSE datasets, and miRWalk. 
Down-regulated miR-452-5p might play an essential role in the tumorigenesis of prostate cancer. Copyright © 2018. Published by Elsevier GmbH.
HMDB 3.0--The Human Metabolome Database in 2013.

PubMed

Wishart, David S; Jewison, Timothy; Guo, An Chi; Wilson, Michael; Knox, Craig; Liu, Yifeng; Djoumbou, Yannick; Mandal, Rupasri; Aziat, Farid; Dong, Edison; Bouatra, Souhaila; Sinelnikov, Igor; Arndt, David; Xia, Jianguo; Liu, Philip; Yallou, Faizath; Bjorndahl, Trent; Perez-Pineiro, Rolando; Eisner, Roman; Allen, Felicity; Neveu, Vanessa; Greiner, Russ; Scalbert, Augustin

2013-01-01

The Human Metabolome Database (HMDB) (www.hmdb.ca) is a resource dedicated to providing scientists with the most current and comprehensive coverage of the human metabolome. Since its first release in 2007, the HMDB has been used to facilitate research for nearly 1000 published studies in metabolomics, clinical biochemistry and systems biology. The most recent release of HMDB (version 3.0) has been significantly expanded and enhanced over the 2009 release (version 2.0). In particular, the number of annotated metabolite entries has grown from 6500 to more than 40,000 (a 600% increase). This enormous expansion is a result of the inclusion of both 'detected' metabolites (those with measured concentrations or experimental confirmation of their existence) and 'expected' metabolites (those for which biochemical pathways are known or human intake/exposure is frequent but the compound has yet to be detected in the body). The latest release also has greatly increased the number of metabolites with biofluid or tissue concentration data, the number of compounds with reference spectra and the number of data fields per entry. In addition to this expansion in data quantity, new database visualization tools and new data content have been added or enhanced. These include better spectral viewing tools, more powerful chemical substructure searches, an improved chemical taxonomy and better, more interactive pathway maps. This article describes these enhancements to the HMDB, which was previously featured in the 2009 NAR Database Issue. (Note to referees, HMDB 3.0 will go live on 18 September 2012.).
Identification and analysis of potential targets in Streptococcus sanguinis using computer aided protein data analysis

PubMed Central

Chowdhury, Md Rabiul Hossain; Bhuiyan, Md IqbalKaiser; Saha, Ayan; Mosleh, Ivan MHAI; Mondol, Sobuj; Ahmed, C M Sabbir

2014-01-01

Purpose: Streptococcus sanguinis is a Gram-positive, facultative aerobic bacterium that is a member of the viridans streptococcus group. It is found in human mouths in dental plaque, which accounts for both dental cavities and bacterial endocarditis, and which entails a mortality rate of 25%. Although a range of remedial mediators have been found to control this organism, the effectiveness of agents such as penicillin, amoxicillin, trimethoprim–sulfamethoxazole, and erythromycin, was observed. The emphasis of this investigation was on finding substitute and efficient remedial approaches for the total destruction of this bacterium. Materials and methods: In this computational study, various databases and online software were used to ascertain some specific targets of S. sanguinis. Particularly, the Kyoto Encyclopedia of Genes and Genomes databases were applied to determine human nonhomologous proteins, as well as the metabolic pathways involved with those proteins. Different software such as Phyre2, CastP, DoGSiteScorer, the Protein Function Predictor server, and STRING were utilized to evaluate the probable active drug binding site with its known function and protein–protein interaction. Results: In this study, among 218 essential proteins of this pathogenic bacterium, 81 nonhomologous proteins were accrued, and 15 proteins that are unique in several metabolic pathways of S. sanguinis were isolated through metabolic pathway analysis. Furthermore, four essentially membrane-bound unique proteins that are involved in distinct metabolic pathways were revealed by this research. Active sites and druggable pockets of these selected proteins were investigated with bioinformatic techniques. In addition, this study also mentions the activity of those proteins, as well as their interactions with the other proteins. Conclusion: Our findings helped to identify the type of protein to be considered as an efficient drug target. 
This study will pave the way for researchers to develop and discover more effective and specific therapeutic agents against S. sanguinis. PMID:25473301
Identification and analysis of potential targets in Streptococcus sanguinis using computer aided protein data analysis.

PubMed

Chowdhury, Md Rabiul Hossain; Bhuiyan, Md IqbalKaiser; Saha, Ayan; Mosleh, Ivan Mhai; Mondol, Sobuj; Ahmed, C M Sabbir

2014-01-01

Streptococcus sanguinis is a Gram-positive, facultative aerobic bacterium that is a member of the viridans streptococcus group. It is found in human mouths in dental plaque, which accounts for both dental cavities and bacterial endocarditis, and which entails a mortality rate of 25%. Although a range of remedial mediators have been found to control this organism, the effectiveness of agents such as penicillin, amoxicillin, trimethoprim-sulfamethoxazole, and erythromycin, was observed. The emphasis of this investigation was on finding substitute and efficient remedial approaches for the total destruction of this bacterium. In this computational study, various databases and online software were used to ascertain some specific targets of S. sanguinis. Particularly, the Kyoto Encyclopedia of Genes and Genomes databases were applied to determine human nonhomologous proteins, as well as the metabolic pathways involved with those proteins. Different software such as Phyre2, CastP, DoGSiteScorer, the Protein Function Predictor server, and STRING were utilized to evaluate the probable active drug binding site with its known function and protein-protein interaction. In this study, among 218 essential proteins of this pathogenic bacterium, 81 nonhomologous proteins were accrued, and 15 proteins that are unique in several metabolic pathways of S. sanguinis were isolated through metabolic pathway analysis. Furthermore, four essentially membrane-bound unique proteins that are involved in distinct metabolic pathways were revealed by this research. Active sites and druggable pockets of these selected proteins were investigated with bioinformatic techniques. In addition, this study also mentions the activity of those proteins, as well as their interactions with the other proteins. Our findings helped to identify the type of protein to be considered as an efficient drug target. 
This study will pave the way for researchers to develop and discover more effective and specific therapeutic agents against S. sanguinis.
Effects of 5-h multimodal stress on the molecules and pathways involved in dendritic morphology and cognitive function.

PubMed

Xu, Yiran; Cheng, Xiaorui; Cui, Xiuliang; Wang, Tongxing; Liu, Gang; Yang, Ruishang; Wang, Jianhui; Bo, Xiaochen; Wang, Shengqi; Zhou, Wenxia; Zhang, Yongxiang

2015-09-01

Stress induces cognitive impairments, which are likely related to the damaged dendritic morphology in the brain. Treatments for stress-induced impairments remain limited because the molecules and pathways underlying these impairments are unknown. Therefore, the aim of this study was to find the potential molecules and pathways related to damage of the dendritic morphology induced by stress. To do this, we detected gene expression, constructed a protein-protein interaction (PPI) network, and analyzed the molecular pathways in the brains of mice exposed to 5-h multimodal stress. The results showed that stress increased plasma corticosterone concentration, decreased cognitive function, damaged dendritic morphologies, and altered APBB1, CLSTN1, KCNA4, NOTCH3, PLAU, RPS6KA1, SYP, TGFB1, KCNA1, NTRK3, and SNCA expression in the brains of mice. Further analyses found that the abnormal expressions of CLSTN1, PLAU, NOTCH3, and TGFB1 induced by stress were related to alterations in the dendritic morphology. These four genes demonstrated interactions with 55 other genes, and configured a closed PPI network. Molecular pathway analyses using the Database for Annotation, Visualization, and Integrated Discovery (DAVID), specifically the gene ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG), each identified three pathways that were significantly enriched in the gene list of the PPI network, with genes belonging to the Notch and transforming growth factor-beta (TGF-B) signaling pathways being the most enriched. Our results suggest that TGFB1, PLAU, NOTCH3, and CLSTN1 may be related to the alterations in dendritic morphology induced by stress, and imply that the Notch and TGF-B signaling pathways may be involved. Copyright © 2015 Elsevier Inc. All rights reserved.
Screening for genes and subnetworks associated with pancreatic cancer based on the gene expression profile.

PubMed

Long, Jin; Liu, Zhe; Wu, Xingda; Xu, Yuanhong; Ge, Chunlin

2016-05-01

The present study aimed to screen for potential genes and subnetworks associated with pancreatic cancer (PC) using the gene expression profile. The expression profile GSE16515 was downloaded from the Gene Expression Omnibus database, which included 36 PC tissue samples and 16 normal samples. The limma package in R was used to screen differentially expressed genes (DEGs), which were grouped as up- and downregulated genes. Then, PFSNet was applied to perform subnetwork analysis for all the DEGs. Moreover, Gene Ontology (GO) and REACTOME pathway enrichment analysis of up- and downregulated genes was performed, followed by protein-protein interaction (PPI) network construction using the Search Tool for the Retrieval of Interacting Genes. In total, 1,989 DEGs including 1,461 up- and 528 downregulated genes were screened out. Subnetworks including pancreatic cancer in PC tissue samples and intercellular adhesion in normal samples were identified, respectively. A total of 8 significant REACTOME pathways for upregulated DEGs, such as hemostasis and cell cycle, mitotic, were identified. Moreover, 4 significant REACTOME pathways for downregulated DEGs, including regulation of β-cell development and transmembrane transport of small molecules, were screened out. Additionally, DEGs with high connectivity degrees, such as CCNA2 (cyclin A2) and PBK (PDZ binding kinase), of the module in the protein-protein interaction network were mainly enriched in the cell-division cycle. CCNA2 and PBK of the module and their related pathway (cell-division cycle), and two subnetworks (pancreatic cancer and intercellular adhesion subnetworks) may be pivotal for further understanding of the molecular mechanism of PC.
The NCBI BioSystems database.

PubMed

Geer, Lewis Y; Marchler-Bauer, Aron; Geer, Renata C; Han, Lianyi; He, Jane; He, Siqian; Liu, Chunlei; Shi, Wenyao; Bryant, Stephen H

2010-01-01

The NCBI BioSystems database, found at http://www.ncbi.nlm.nih.gov/biosystems/, centralizes and cross-links existing biological systems databases, increasing their utility and target audience by integrating their pathways and systems into NCBI resources. This integration allows users of NCBI's Entrez databases to quickly categorize proteins, genes and small molecules by metabolic pathway, disease state or other BioSystem type, without requiring time-consuming inference of biological relationships from the literature or multiple experimental datasets.
VitisCyc: a metabolic pathway knowledgebase for grapevine (Vitis vinifera)

PubMed Central

Naithani, Sushma; Raja, Rajani; Waddell, Elijah N.; Elser, Justin; Gouthu, Satyanarayana; Deluc, Laurent G.; Jaiswal, Pankaj

2014-01-01

We have developed VitisCyc, a grapevine-specific metabolic pathway database that allows researchers to (i) search and browse the database for its various components such as metabolic pathways, reactions, compounds, genes and proteins, (ii) compare grapevine metabolic networks with other publicly available plant metabolic networks, and (iii) upload, visualize and analyze high-throughput data such as transcriptomes, proteomes, metabolomes etc. using OMICs-Viewer tool. VitisCyc is based on the genome sequence of the nearly homozygous genotype PN40024 of Vitis vinifera “Pinot Noir” cultivar with 12X v1 annotations and was built on BioCyc platform using Pathway Tools software and MetaCyc reference database. Furthermore, VitisCyc was enriched for plant-specific pathways and grape-specific metabolites, reactions and pathways. Currently VitisCyc harbors 68 super pathways, 362 biosynthesis pathways, 118 catabolic pathways, 5 detoxification pathways, 36 energy related pathways and 6 transport pathways, 10,908 enzymes, 2912 enzymatic reactions, 31 transport reactions and 2024 compounds. VitisCyc, as a community resource, can aid in the discovery of candidate genes and pathways that are regulated during plant growth and development, and in response to biotic and abiotic stress signals generated from a plant's immediate environment. VitisCyc version 3.18 is available online at http://pathways.cgrb.oregonstate.edu. PMID:25538713
Sig2BioPAX: Java tool for converting flat files to BioPAX Level 3 format.

PubMed

Webb, Ryan L; Ma'ayan, Avi

2011-03-21

The World Wide Web plays a critical role in enabling molecular, cell, systems and computational biologists to exchange, search, visualize, integrate, and analyze experimental data. Such efforts can be further enhanced through the development of semantic web concepts. The semantic web idea is to enable machines to understand data through the development of protocol free data exchange formats such as Resource Description Framework (RDF) and the Web Ontology Language (OWL). These standards provide formal descriptors of objects, object properties and their relationships within a specific knowledge domain. However, the overhead of converting datasets typically stored in data tables such as Excel, text or PDF into RDF or OWL formats is not trivial for non-specialists and as such produces a barrier to seamless data exchange between researchers, databases and analysis tools. This problem is particularly of importance in the field of network systems biology where biochemical interactions between genes and their protein products are abstracted to networks. For the purpose of converting biochemical interactions into the BioPAX format, which is the leading standard developed by the computational systems biology community, we developed an open-source command line tool that takes as input tabular data describing different types of molecular biochemical interactions. The tool converts such interactions into the BioPAX level 3 OWL format. We used the tool to convert several existing and new mammalian networks of protein interactions, signalling pathways, and transcriptional regulatory networks into BioPAX. Some of these networks were deposited into PathwayCommons, a repository for consolidating and organizing biochemical networks. The software tool Sig2BioPAX is a resource that enables experimental and computational systems biologists to contribute their identified networks and pathways of molecular interactions for integration and reuse with the rest of the research community.
Systems approach for the selection of micro-RNAs as therapeutic biomarkers of anti-EGFR monoclonal antibody treatment in colorectal cancer

NASA Astrophysics Data System (ADS)

Deyati, Avisek; Bagewadi, Shweta; Senger, Philipp; Hofmann-Apitius, Martin; Novac, Natalia

2015-01-01

miRNA plays an important role in tumourigenesis by regulating the expression of oncogenes and tumour suppressors, thereby affecting cell proliferation and differentiation, apoptosis, invasion and angiogenesis. miRNAs are potential biomarkers for diagnosis, prognosis and therapies of different forms of cancer. However, the relationship between the response of cancer patients towards targeted therapy and the resulting modifications of the miRNA transcriptome in the context of pathway regulation is poorly understood. With ever-increasing pathways and miRNA-mRNA interaction databases, freely available mRNA and miRNA expression data in multiple cancer therapies have produced an unprecedented opportunity to decipher the role of miRNAs in early prediction of therapeutic efficacy in diseases. Efficient translation of -omics data and accumulated knowledge to clinical decision-making is of paramount scientific and public health interest. Well-structured translational algorithms are needed to bridge the gap from databases to decisions. Herein, we present a novel SMARTmiR algorithm to prospectively predict the role of miRNA as a therapeutic biomarker for an anti-EGFR monoclonal antibody, i.e. cetuximab, treatment in colorectal cancer.
BioNetSim: a Petri net-based modeling tool for simulations of biochemical processes.

PubMed

Gao, Junhui; Li, Li; Wu, Xiaolin; Wei, Dong-Qing

2012-03-01

BioNetSim, a Petri net-based software tool for modeling and simulating biochemical processes, has been developed. Its design and implementation are presented in this paper, including logic construction and real-time access to KEGG (Kyoto Encyclopedia of Genes and Genomes) and the BioModel database. Furthermore, glycolysis is simulated as an example of its application. BioNetSim is a helpful tool for researchers to download data, model biological networks, and simulate complicated biochemical processes. Gene regulatory networks, metabolic pathways, signaling pathways, and kinetics of cell interaction are all available in BioNetSim, which makes modeling more efficient and effective. Similar to other Petri net-based software tools, BioNetSim does well in graphic application and mathematical construction. Moreover, it shows several notable advantages. (1) It creates models in a database. (2) It provides real-time access to KEGG and BioModel and transfers data to a Petri net. (3) It provides qualitative analysis, such as computation of constants. (4) It generates graphs for tracing the concentration of every molecule during the simulation processes.
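The Petri net idea behind this kind of tool can be shown with a very small sketch: places hold tokens (here, molecule counts) and a transition fires only when every input place holds enough tokens. The code below is an illustration under assumptions and is not BioNetSim's implementation; the function names, the dictionary layout, and the token counts are hypothetical. The example transition is the first step of glycolysis (glucose + ATP yields glucose-6-phosphate + ADP).

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in transition["consume"].items())


def fire(marking, transition):
    """Return the new marking after firing; the input marking is left unchanged."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, n in transition["consume"].items():
        new_marking[place] = new_marking.get(place, 0) - n
    for place, n in transition["produce"].items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking


# Hypothetical token counts for a single hexokinase step.
hexokinase = {
    "consume": {"glucose": 1, "ATP": 1},
    "produce": {"G6P": 1, "ADP": 1},
}
start = {"glucose": 2, "ATP": 1, "G6P": 0, "ADP": 0}
after = fire(start, hexokinase)
print(after)
print(enabled(after, hexokinase))  # ATP is exhausted, so the step cannot fire again
```

A simulator would repeat this firing rule over many transitions and record the marking at each step, which is what allows concentrations to be traced over time.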
Metabolic Pathway Assignment of Plant Genes based on Phylogenetic Profiling–A Feasibility Study

PubMed Central

Weißenborn, Sandra; Walther, Dirk

2017-01-01

Despite many developed experimental and computational approaches, functional gene annotation remains challenging. With the rapidly growing number of sequenced genomes, the concept of phylogenetic profiling, which predicts functional links between genes that share a common co-occurrence pattern across different genomes, has gained renewed attention as it promises to annotate gene functions based on presence/absence calls alone. We applied phylogenetic profiling to the problem of metabolic pathway assignments of plant genes with a particular focus on secondary metabolism pathways. We determined phylogenetic profiles for 40,960 metabolic pathway enzyme genes with assigned EC numbers from 24 plant species based on sequence and pathway annotation data from KEGG and Ensembl Plants. For gene sequence family assignments, needed to determine the presence or absence of particular gene functions in the given plant species, we included data of all 39 species available at the Ensembl Plants database and established gene families based on pairwise sequence identities and annotation information. Aside from performing profiling comparisons, we used machine learning approaches to predict pathway associations from phylogenetic profiles alone. Selected metabolic pathways were indeed found to be composed of gene families of greater than expected phylogenetic profile similarity. This was particularly evident for primary metabolism pathways, whereas for secondary pathways, both the available annotation in different species as well as the abstraction of functional association via distinct pathways proved limiting. While phylogenetic profile similarity was generally not found to correlate with gene co-expression, direct physical interactions of proteins were reflected by a significantly increased profile similarity suggesting an application of phylogenetic profiling methods as a filtering step in the identification of protein-protein interactions. 
This feasibility study highlights the potential and challenges associated with phylogenetic profiling methods for the detection of functional relationships between genes as well as the need to enlarge the set of plant genes with proven secondary metabolism involvement as well as the limitations of distinct pathways as abstractions of relationships between genes. PMID:29163570
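The profile comparison described in this abstract can be illustrated with a minimal sketch. It assumes each gene family is reduced to a presence/absence vector across genomes and that similarity is measured with the Jaccard index; the abstract does not name the similarity measure the authors used, and the family names and profiles below are hypothetical.

```python
from itertools import combinations


def jaccard(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles (tuples of 0/1)."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


# Hypothetical gene families scored across five genomes (1 = present).
profiles = {
    "famA": (1, 1, 0, 1, 0),
    "famB": (1, 1, 0, 1, 0),
    "famC": (0, 0, 1, 0, 1),
    "famD": (1, 1, 0, 0, 0),
}

# Rank all family pairs by profile similarity, most similar first.
ranked = sorted(
    ((jaccard(profiles[x], profiles[y]), x, y) for x, y in combinations(profiles, 2)),
    reverse=True,
)
for score, x, y in ranked:
    print(f"{x}-{y}: {score:.2f}")
```

Pairs with identical co-occurrence patterns (here famA and famB) rise to the top of the ranking, which is the signal phylogenetic profiling treats as a candidate functional link.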
EDdb: a web resource for eating disorder and its application to identify an extended adipocytokine signaling pathway related to eating disorder.

PubMed

Zhao, Min; Li, XiaoMo; Qu, Hong

2013-12-01

Eating disorder is a group of physiological and psychological disorders affecting approximately 1% of the female population worldwide. Although the genetic epidemiology of eating disorder is becoming increasingly clear with accumulated studies, the underlying molecular mechanisms are still unclear. Recently, integration of various high-throughput data expanded the range of candidate genes and started to generate hypotheses for understanding potential pathogenesis in complex diseases. This article presents EDdb (Eating Disorder database), the first evidence-based gene resource for eating disorder. Fifty-nine experimentally validated genes from the literature in relation to eating disorder were collected as the core dataset. Another four datasets with 2824 candidate genes across 601 genome regions were expanded based on the core dataset using different criteria (e.g., protein-protein interactions, shared cytobands, and related complex diseases). Based on human protein-protein interaction data, we reconstructed a potential molecular sub-network related to eating disorder. Furthermore, with an integrative pathway enrichment analysis of genes in EDdb, we identified an extended adipocytokine signaling pathway in eating disorder. Three genes in EDdb (ADIPO (adiponectin), TNF (tumor necrosis factor) and NR3C1 (nuclear receptor subfamily 3, group C, member 1)) link the KEGG (Kyoto Encyclopedia of Genes and Genomes) "adipocytokine signaling pathway" with the BioCarta "visceral fat deposits and the metabolic syndrome" pathway to form a joint pathway. In total, the joint pathway contains 43 genes, among which 39 genes are related to eating disorder. As the first comprehensive gene resource for eating disorder, EDdb ( http://eddb.cbi.pku.edu.cn ) enables the exploration of gene-disease relationships and cross-talk mechanisms between related disorders. 
Through pathway statistical studies, we revealed that abnormal body weight caused by eating disorder and obesity may both be related to dysregulation of the novel joint pathway of adipocytokine signaling. In addition, this joint pathway may be the common pathway for body weight regulation in complex human diseases related to unhealthy lifestyle.
Graphite Web: web tool for gene set analysis exploiting pathway topology

PubMed Central

Sales, Gabriele; Calura, Enrica; Martini, Paolo; Romualdi, Chiara

2013-01-01

Graphite web is a novel web tool for pathway analyses and network visualization for gene expression data of both microarray and RNA-seq experiments. Several pathway analyses have been proposed either in the univariate or in the global and multivariate context to tackle the complexity and the interpretation of expression results. These methods can be further divided into ‘topological’ and ‘non-topological’ methods according to their ability to gain power from pathway topology. Biological pathways are, in fact, not only gene lists but can be represented through a network where genes and connections are, respectively, nodes and edges. To this day, the most used approaches are non-topological and univariate, although they miss the relationships among genes. In contrast, topological and multivariate approaches are more powerful but difficult to use for researchers without bioinformatic skills. Here we present Graphite web, the first public web server for pathway analysis on gene expression data that combines topological and multivariate pathway analyses with an efficient system of interactive network visualizations for easy results interpretation. Specifically, Graphite web implements five different gene set analyses on three model organisms and two pathway databases. Graphite Web is freely available at http://graphiteweb.bio.unipd.it/. PMID:23666626
The NCBI BioSystems database

PubMed Central

Geer, Lewis Y.; Marchler-Bauer, Aron; Geer, Renata C.; Han, Lianyi; He, Jane; He, Siqian; Liu, Chunlei; Shi, Wenyao; Bryant, Stephen H.

2010-01-01

The NCBI BioSystems database, found at http://www.ncbi.nlm.nih.gov/biosystems/, centralizes and cross-links existing biological systems databases, increasing their utility and target audience by integrating their pathways and systems into NCBI resources. This integration allows users of NCBI’s Entrez databases to quickly categorize proteins, genes and small molecules by metabolic pathway, disease state or other BioSystem type, without requiring time-consuming inference of biological relationships from the literature or multiple experimental datasets. PMID:19854944
PAMDB: a comprehensive Pseudomonas aeruginosa metabolome database.

PubMed

Huang, Weiliang; Brewer, Luke K; Jones, Jace W; Nguyen, Angela T; Marcu, Ana; Wishart, David S; Oglesby-Sherrouse, Amanda G; Kane, Maureen A; Wilks, Angela

2018-01-04

The Pseudomonas aeruginosa Metabolome Database (PAMDB, http://pseudomonas.umaryland.edu) is a searchable, richly annotated metabolite database specific to P. aeruginosa. P. aeruginosa is a soil organism and significant opportunistic pathogen that adapts to its environment through a versatile energy metabolism network. Furthermore, P. aeruginosa is a model organism for the study of biofilm formation, quorum sensing, and bioremediation processes, each of which is dependent on unique pathways and metabolites. The PAMDB is modelled on the Escherichia coli (ECMDB), yeast (YMDB) and human (HMDB) metabolome databases and contains >4370 metabolites and 938 pathways with links to over 1260 genes and proteins. The database information was compiled from electronic databases, journal articles and mass spectrometry (MS) metabolomic data obtained in our laboratories. For each metabolite entered, we provide detailed compound descriptions, names and synonyms, structural and physicochemical information, nuclear magnetic resonance (NMR) and MS spectra, enzymes and pathway information, as well as gene and protein sequences. The database allows extensive searching via chemical names, structure and molecular weight, together with gene, protein and pathway relationships. The PAMDB and its future iterations will provide a valuable resource to biologists, natural product chemists and clinicians in identifying active compounds, potential biomarkers and clinical diagnostics. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Linking disease-associated genes to regulatory networks via promoter organization

PubMed Central

Döhr, S.; Klingenhoff, A.; Maier, H.; de Angelis, M. Hrabé; Werner, T.; Schneider, R.

2005-01-01

Pathway- or disease-associated genes may participate in more than one transcriptional co-regulation network. Such gene groups can be readily obtained by literature analysis or by high-throughput techniques such as microarrays or protein-interaction mapping. We developed a strategy that defines regulatory networks by in silico promoter analysis, finding potentially co-regulated subgroups without a priori knowledge. Pairs of transcription factor binding sites conserved in orthologous genes (vertically) as well as in promoter sequences of co-regulated genes (horizontally) were used as seeds for the development of promoter models representing potential co-regulation. This approach was applied to a Maturity Onset Diabetes of the Young (MODY)-associated gene list, which yielded two models connecting functionally interacting genes within MODY-related insulin/glucose signaling pathways. Additional genes functionally connected to our initial gene list were identified by database searches with these promoter models. Thus, data-driven in silico promoter analysis allowed integrating molecular mechanisms with biological functions of the cell. PMID:15701758
1-CMDb: A Curated Database of Genomic Variations of the One-Carbon Metabolism Pathway.

PubMed

Bhat, Manoj K; Gadekar, Veerendra P; Jain, Aditya; Paul, Bobby; Rai, Padmalatha S; Satyamoorthy, Kapaettu

2017-01-01

The one-carbon metabolism pathway is vital in maintaining tissue homeostasis by driving the critical reactions of folate and methionine cycles. A myriad of genetic and epigenetic events mark the rate of reactions in a tissue-specific manner. Integration of these to predict and provide personalized health management requires robust computational tools that can process multiomics data. The DNA sequences that may determine the chain of biological events and the endpoint reactions within one-carbon metabolism genes remain to be comprehensively recorded. Hence, we designed the one-carbon metabolism database (1-CMDb) as a platform to interrogate its association with a host of human disorders. DNA sequence and network information of a total of 48 genes were extracted from a literature survey and KEGG pathway that are involved in the one-carbon folate-mediated pathway. The information generated, collected, and compiled for all these genes from the UCSC genome browser included the single nucleotide polymorphisms (SNPs), CpGs, copy number variations (CNVs), and miRNAs, and a comprehensive database was created. Furthermore, a significant correlation analysis was performed for SNPs in the pathway genes. Detailed data of SNPs, CNVs, CpG islands, and miRNAs for 48 folate pathway genes were compiled. The SNPs in CNVs (9670), CpGs (984), and miRNAs (14) were also compiled for all pathway genes. The SIFT score, the prediction and PolyPhen score, as well as the prediction for each of the SNPs were tabulated and represented for folate pathway genes. Also included in the database for folate pathway genes were the links to 124 various phenotypes and disease associations as reported in the literature and from publicly available information. 
A comprehensive database was generated consisting of genomic elements within and among SNPs, CNVs, CpGs, and miRNAs of one-carbon metabolism pathways to facilitate (a) single source of information and (b) integration into large-genome scale network analysis to be developed in the future by the scientific community. The database can be accessed at http://slsdb.manipal.edu/ocm/. © 2017 S. Karger AG, Basel.

Transcriptome analysis reveals enrichment of genes associated with auditory system in swimbladder of channel catfish.

PubMed

Yang, Yujia; Wang, Xiaozhu; Liu, Yang; Fu, Qiang; Tian, Changxu; Wu, Chenglong; Shi, Huitong; Yuan, Zihao; Tan, Suxu; Liu, Shikai; Gao, Dongya; Dunham, Rex; Liu, Zhanjiang

2018-04-30

In aquatic organisms, hearing is an important sense for acoustic communications and detection of sound-emitting predators and prey. Channel catfish is a dominant aquaculture species in the United States. As channel catfish can hear sounds of relatively high frequency, it serves as a good model for studying auditory mechanisms. In catfishes, Weberian ossicles connect the swimbladder to the inner ear to transfer the forced vibrations and improve hearing ability. In this study, we examined the transcriptional profiles of channel catfish swimbladder and four other tissues (gill, liver, skin, and intestine). We identified a total of 1777 genes that exhibited a preferential expression pattern in the swimbladder of channel catfish. Based on Gene Ontology enrichment analysis, many of the swimbladder-enriched genes were categorized into sensory perception of sound, auditory behavior, response to auditory stimulus, or detection of mechanical stimulus involved in sensory perception of sound, such as coch, kcnq4, sptbn1, sptbn4, dnm1, ush2a, and col11a1. Six signaling pathways associated with hearing (Glutamatergic synapse, GABAergic synapse pathways, Axon guidance, cAMP signaling pathway, Ionotropic glutamate receptor pathway, and Metabotropic glutamate receptor group III pathway) were over-represented in KEGG and PANTHER databases. Protein interaction prediction revealed an interactive relationship among the swimbladder-enriched genes and genes involved in sensory perception of sound. This study identified a set of genes and signaling pathways associated with the auditory system in the swimbladder of channel catfish and provides resources for further study on the biological and physiological roles in catfish swimbladder. Copyright © 2018 Elsevier Inc. All rights reserved.
CARFMAP: A Curated Pathway Map of Cardiac Fibroblasts.

PubMed

Nim, Hieu T; Furtado, Milena B; Costa, Mauro W; Kitano, Hiroaki; Rosenthal, Nadia A; Boyd, Sarah E

2015-01-01

The adult mammalian heart contains multiple cell types that work in unison under tightly regulated conditions to maintain homeostasis. Cardiac fibroblasts are a significant and unique population of non-muscle cells in the heart that have recently gained substantial interest in the cardiac biology community. To better understand this renaissance cell, it is essential to systematically survey what has been known in the literature about the cellular and molecular processes involved. We have built CARFMAP (http://visionet.erc.monash.edu.au/CARFMAP), an interactive cardiac fibroblast pathway map derived from the biomedical literature using a software-assisted manual data collection approach. CARFMAP is an information-rich interactive tool that enables cardiac biologists to explore the large body of literature in various creative ways. There is surprisingly little overlap between the cardiac fibroblast pathway map, a foreskin fibroblast pathway map, and a whole mouse organism signalling pathway map from the REACTOME database. Among the use cases of CARFMAP is a common task in our cardiac biology laboratory of identifying new genes that are (1) relevant to cardiac literature, and (2) differentially regulated in high-throughput assays. From the expression profiles of mouse cardiac and tail fibroblasts, we employed CARFMAP to characterise cardiac fibroblast pathways. Using CARFMAP in conjunction with transcriptomic data, we generated a stringent list of six genes that would not have been singled out using bioinformatics analyses alone. Experimental validation showed that five genes (Mmp3, Il6, Edn1, Pdgfc and Fgf10) are differentially regulated in the cardiac fibroblast. CARFMAP is a powerful tool for systems analyses of cardiac fibroblasts, facilitating systems-level cardiovascular research.
Pivotal role of the muscle-contraction pathway in cryptorchidism and evidence for genomic connections with cardiomyopathy pathways in RASopathies.

PubMed

Cannistraci, Carlo V; Ogorevc, Jernej; Zorc, Minja; Ravasi, Timothy; Dovc, Peter; Kunej, Tanja

2013-02-14

Cryptorchidism is the most frequent congenital disorder in male children; however, the genetic causes of cryptorchidism remain poorly investigated. Comparative integratomics combined with a systems biology approach was employed to elucidate genetic factors and molecular pathways underlying testis descent. Literature mining was performed to collect genomic loci associated with cryptorchidism in seven mammalian species. Information regarding the collected candidate genes was stored in a MySQL relational database. A genomic view of the loci was presented using the Flash GViewer web tool (http://gmod.org/wiki/Flashgviewer/). DAVID Bioinformatics Resources 6.7 was used for pathway enrichment analysis. The Cytoscape plug-in PiNGO 1.11 was employed for protein-network-based prediction of novel candidate genes. Relevant protein-protein interactions were confirmed and visualized using the STRING database (version 9.0). The developed cryptorchidism gene atlas includes 217 candidate loci (genes, regions involved in chromosomal mutations, and copy number variations) identified at the genomic, transcriptomic, and proteomic level. Human orthologs of the collected candidate loci were presented using a genomic map viewer. The cryptorchidism gene atlas is freely available online: http://www.integratomics-time.com/cryptorchidism/. Pathway analysis suggested the presence of twelve enriched pathways associated with the list of 179 literature-derived candidate genes. Additionally, a list of 43 network-predicted novel candidate genes was significantly associated with four enriched pathways. Joint pathway analysis of the collected and predicted candidate genes revealed the pivotal importance of the muscle-contraction pathway in cryptorchidism and evidence for genomic associations with cardiomyopathy pathways in RASopathies. The developed gene atlas represents an important resource for the scientific community researching genetics of cryptorchidism.
The collected data will further facilitate development of novel genetic markers and could be of interest for functional studies in animals and humans. The proposed network-based systems biology approach elucidates molecular mechanisms underlying co-presence of cryptorchidism and cardiomyopathy in RASopathies. Such an approach could also aid in molecular explanation of co-presence of diverse and apparently unrelated clinical manifestations in other syndromes.
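The idea behind pathway enrichment analysis of a candidate gene list can be sketched with a plain hypergeometric over-representation test. This is not DAVID's own scoring, which uses a modified Fisher's exact test. The candidate list size of 179 comes from the abstract; the background size, pathway size, and overlap are hypothetical.

```python
from math import comb


def enrichment_p(background, pathway_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from `background` genes, of which `pathway_size` belong to the pathway."""
    upper = min(pathway_size, list_size)
    total = comb(background, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, a 100-gene pathway,
# a 179-gene candidate list, 8 of which fall in the pathway.
p = enrichment_p(background=20000, pathway_size=100, list_size=179, overlap=8)
print(f"p = {p:.3g}")
```

In practice one such test is run per pathway, so the resulting p-values need a multiple-testing correction before any pathway is called enriched.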
Predicting miRNA targets for head and neck squamous cell carcinoma using an ensemble method.

PubMed

Gao, Hong; Jin, Hui; Li, Guijun

2018-01-01

This study aimed to uncover potential microRNA (miRNA) targets in head and neck squamous cell carcinoma (HNSCC) using an ensemble method which combined 3 different methods: Pearson's correlation coefficient (PCC), Lasso and a causal inference method (i.e., intervention calculus when the directed acyclic graph (DAG) is absent [IDA]), based on Borda count election. The Borda count election method was used to integrate the top 100 predicted targets of each miRNA generated by individual methods. Afterwards, to validate the performance ability of our method, we checked the TarBase v6.0, miRecords v2013, miRWalk v2.0 and miRTarBase v4.5 databases to validate predictions for miRNAs. Pathway enrichment analysis of target genes in the top 1,000 miRNA-messenger RNA (mRNA) interactions was conducted to focus on significant KEGG pathways. Finally, we extracted target genes based on occurrence frequency ≥3. Based on an absolute value of PCC >0.7, we found 33 miRNAs and 288 mRNAs for further analysis. We extracted 10 target genes with predicted frequencies not less than 3. The target gene MYO5C possessed the highest frequency, which was predicted by 7 different miRNAs. Significantly, a total of 8 pathways were identified; the pathways of cytokine-cytokine receptor interaction and chemokine signaling pathway were the most significant. We successfully predicted target genes and pathways for HNSCC relying on miRNA expression data, mRNA expression profile, an ensemble method and pathway information. Our results may offer new information for the diagnosis and estimation of the prognosis of HNSCC.
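The Borda count step that merges the per-method rankings can be sketched as follows. This is a minimal illustration, assuming the common scoring rule in which an item at 0-based position p of a top-k list earns k - p points; the abstract does not spell out the exact scoring variant, and the gene names and rankings are hypothetical.

```python
from collections import defaultdict


def borda_merge(rankings, top_k):
    """Merge ranked lists: an item at 0-based position p earns top_k - p points."""
    scores = defaultdict(int)
    for ranking in rankings:
        for position, item in enumerate(ranking[:top_k]):
            scores[item] += top_k - position
    # Sort by total points (descending), then by name for a stable tie-break.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical top-3 target lists for one miRNA from the three methods.
pcc_rank = ["GENE1", "GENE2", "GENE3"]
lasso_rank = ["GENE2", "GENE1", "GENE4"]
ida_rank = ["GENE2", "GENE3", "GENE1"]

merged = borda_merge([pcc_rank, lasso_rank, ida_rank], top_k=3)
print(merged)
# → [('GENE2', 8), ('GENE1', 6), ('GENE3', 3), ('GENE4', 1)]
```

A target that ranks well under all three methods accumulates the most points, so the merged list favors predictions on which the methods agree.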
Systems Genetics Analysis of GWAS reveals Novel Associations between Key Biological Processes and Coronary Artery Disease

PubMed Central

Ghosh, Sujoy; Vivar, Juan; Nelson, Christopher P; Willenborg, Christina; Segrè, Ayellet V; Mäkinen, Ville-Petteri; Nikpay, Majid; Erdmann, Jeannette; Blankenberg, Stefan; O'Donnell, Christopher; März, Winfried; Laaksonen, Reijo; Stewart, Alexandre FR; Epstein, Stephen E; Shah, Svati H; Granger, Christopher B; Hazen, Stanley L; Kathiresan, Sekar; Reilly, Muredach P; Yang, Xia; Quertermous, Thomas; Samani, Nilesh J; Schunkert, Heribert; Assimes, Themistocles L; McPherson, Ruth

2016-01-01

Objective: Genome-wide association (GWA) studies have identified multiple genetic variants affecting the risk of coronary artery disease (CAD). However, individually these explain only a small fraction of the heritability of CAD and for most, the causal biological mechanisms remain unclear. We sought to obtain further insights into potential causal processes of CAD by integrating large-scale GWA data with expertly curated databases of core human pathways and functional networks. Approaches and Results: Employing pathways (gene sets) from Reactome, we carried out a two-stage gene set enrichment analysis strategy. From a meta-analyzed discovery cohort of 7 CAD GWAS data sets (9,889 cases/11,089 controls), nominally significant gene-sets were tested for replication in a meta-analysis of 9 additional studies (15,502 cases/55,730 controls) from the CARDIoGRAM Consortium. A total of 32 of 639 Reactome pathways tested showed convincing association with CAD (replication p<0.05). These pathways resided in 9 of 21 core biological processes represented in Reactome, and included pathways relevant to extracellular matrix integrity, innate immunity, axon guidance, and signaling by PDGF, NOTCH, and the TGF-β/SMAD receptor complex. Many of these pathways had strengths of association comparable to those observed in lipid transport pathways. Network analysis of unique genes within the replicated pathways further revealed several interconnected functional and topologically interacting modules representing novel associations (e.g. semaphorin regulated axonal guidance pathway) besides confirming known processes (lipid metabolism). The connectivity in the observed networks was statistically significant compared to random networks (p<0.001). Network centrality analysis (‘degree’ and ‘betweenness’) further identified genes (e.g. NCAM1, FYN, FURIN etc.) likely to play critical roles in the maintenance and functioning of several of the replicated pathways.
Conclusions: These findings provide novel insights into how genetic variation, interpreted in the context of biological processes and functional interactions among genes, may help define the genetic architecture of CAD. PMID:25977570
A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data

PubMed Central

Chen, Yi-Hau

2017-01-01

Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data remains difficult because of limited sample size. This limitation also leads to the practice of using a competitive null as a common approach, which fundamentally treats genes or proteins as independent units. The independence assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, sample covariance may not be a precise estimate if the sample size is very limited, which is usually the case for the data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed from the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways with sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication.
We implemented the T2-statistic into an R package T2GA, which is available at https://github.com/roqe/T2GA. PMID:28622336
A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data.

PubMed

Lai, En-Yu; Chen, Yi-Hau; Wu, Kun-Pin

2017-06-01

Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data remains difficult because of limited sample size. This limitation also leads to the practice of using a competitive null as a common approach, which fundamentally treats genes or proteins as independent units. The independence assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, sample covariance may not be a precise estimate if the sample size is very limited, which is usually the case for the data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed from the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways with sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication.
We implemented the T2-statistic into an R package T2GA, which is available at https://github.com/roqe/T2GA.
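To make the idea of a multivariate statistic with an externally supplied covariance matrix concrete, here is a minimal sketch. It is not the T2GA implementation. It assumes the statistic takes the classic Hotelling-style quadratic form d' S^-1 d, that variances are fixed at 1, and that off-diagonal covariance terms are filled directly from pairwise confidence scores; all of these are illustrative assumptions, and the numbers are made up.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def t2_statistic(mean_log_ratios, covariance):
    """Quadratic form d' S^-1 d; large values suggest the pathway as a whole shifted."""
    weighted = solve(covariance, mean_log_ratios)
    return sum(d * w for d, w in zip(mean_log_ratios, weighted))


# Hypothetical pathway of three proteins: observed mean log2 ratios, unit
# variances, and off-diagonal terms taken from made-up confidence scores.
d = [1.2, 0.9, -0.1]
scores = {(0, 1): 0.8, (0, 2): 0.1, (1, 2): 0.2}
cov = [[1.0 if i == j else scores[tuple(sorted((i, j)))] for j in range(3)] for i in range(3)]

print(round(t2_statistic(d, cov), 3))
```

Because the covariance comes from prior knowledge rather than from the samples, the statistic stays computable even when only a handful of replicates are available, which is the situation the abstract describes for mass spectrometry data.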
Automated detection of discourse segment and experimental types from the text of cancer pathway results sections.

PubMed

Burns, Gully A P C; Dasigi, Pradeep; de Waard, Anita; Hovy, Eduard H

2016-01-01

Automated machine-reading biocuration systems typically use sentence-by-sentence information extraction to construct meaning representations for use by curators. This does not directly reflect the typical discourse structure used by scientists to construct an argument from the experimental data available within an article, and is therefore less likely to correspond to representations typically used in biomedical informatics systems (let alone to the mental models that scientists have). In this study, we develop Natural Language Processing methods to locate, extract, and classify the individual passages of text from articles' Results sections that refer to experimental data. In our domain of interest (molecular biology studies of cancer signal transduction pathways), individual articles may contain as many as 30 small-scale individual experiments describing a variety of findings, upon which authors base their overall research conclusions. Our system automatically classifies discourse segments in these texts into seven categories (fact, hypothesis, problem, goal, method, result, implication) with an F-score of 0.68. These segments describe the essential building blocks of scientific discourse to (i) provide context for each experiment, (ii) report experimental details and (iii) explain the data's meaning in context. We evaluate our system on text passages from articles that were curated in molecular biology databases (the Pathway Logic Datum repository, the Molecular Interaction MINT and INTACT databases) linking individual experiments in articles to the type of assay used (coprecipitation, phosphorylation, translocation etc.). We use supervised machine learning techniques on text passages containing unambiguous references to experiments to obtain baseline F1 scores of 0.59 for MINT, 0.71 for INTACT and 0.63 for Pathway Logic.
Although preliminary, these results support the notion that targeting information extraction methods to experimental results could provide accurate, automated methods for biocuration. We also suggest the need for finer-grained curation of experimental methods used when constructing molecular biology databases. © The Author(s) 2016. Published by Oxford University Press.
Microarray analysis reveals key genes and pathways in Tetralogy of Fallot

PubMed Central

He, Yue-E; Qiu, Hui-Xian; Jiang, Jian-Bing; Wu, Rong-Zhou; Xiang, Ru-Lian; Zhang, Yuan-Hai

2017-01-01

The aim of the present study was to identify key genes that may be involved in the pathogenesis of Tetralogy of Fallot (TOF) using bioinformatics methods. The GSE26125 microarray dataset, which includes cardiovascular tissue samples derived from 16 children with TOF and five healthy age-matched control infants, was downloaded from the Gene Expression Omnibus database. Differential expression analysis was performed between TOF and control samples to identify differentially expressed genes (DEGs) using Student's t-test and the R/limma package, with a log2 fold-change of >2 and a false discovery rate of <0.01 set as thresholds. The biological functions of DEGs were analyzed using the ToppGene database. The ReactomeFIViz application was used to construct functional interaction (FI) networks, and the genes in each module were subjected to pathway enrichment analysis. The iRegulon plugin was used to identify transcription factors predicted to regulate the DEGs in the FI network, and the gene-transcription factor pairs were then visualized using Cytoscape software. A total of 878 DEGs were identified, including 848 upregulated genes and 30 downregulated genes. The gene FI network contained seven function modules, which were all comprised of upregulated genes. Genes in Module 1 were enriched in the following three neurological disorder-associated signaling pathways: Parkinson's disease, Alzheimer's disease and Huntington's disease. Genes in Modules 0, 3 and 5 were dominantly enriched in pathways associated with ribosomes and protein translation. The X-box binding protein 1 transcription factor was demonstrated to be involved in the regulation of genes encoding the subunits of cytoplasmic and mitochondrial ribosomes, as well as genes involved in neurodegenerative disorders. Therefore, dysfunction of genes involved in signaling pathways associated with neurodegenerative disorders, ribosome function and protein translation may contribute to the pathogenesis of TOF.
PMID:28713939
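The thresholding step in this abstract (log2 fold-change >2 and false discovery rate <0.01) can be sketched as below. This is a minimal illustration, assuming the false discovery rate is obtained by Benjamini-Hochberg adjustment and that the fold-change cutoff applies to the absolute value, since the abstract reports both up- and downregulated genes; it does not reproduce limma's moderated statistics, and the gene names and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (FDR) in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene results: (gene, log2 fold-change, raw p-value).
results = [
    ("geneA", 3.1, 0.0001),
    ("geneB", 2.4, 0.004),
    ("geneC", -2.6, 0.0005),
    ("geneD", 0.7, 0.0002),
    ("geneE", 2.2, 0.2),
]

fdr = benjamini_hochberg([p for _, _, p in results])
degs = [
    gene
    for (gene, lfc, _), q in zip(results, fdr)
    if abs(lfc) > 2 and q < 0.01
]
print(degs)
# → ['geneA', 'geneB', 'geneC']
```

Here geneD is dropped for its small fold-change despite a low p-value, and geneE is dropped for failing the FDR cutoff despite a large fold-change, which shows why both thresholds are applied together.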
Longitudinal associations between adult children's relations with parents and intimate partners.

PubMed

Johnson, Matthew D; Galovan, Adam M; Horne, Rebecca M; Min, Joohong; Walper, Sabine

2017-10-01

Drawing on 5 waves of multiple-informant data gathered from focal participants and their parents and intimate partners (n = 360 families) who completed annual surveys in the German Family Panel (pairfam) study, the present investigation examined bidirectional associations between the development of adults' conflictual and intimate interactions with their parents and intimate partners. Autoregressive cross-lagged latent change score modeling results revealed a robust pattern of coordinated development between parent-adult child and couple conflictual and intimate interactions: increases in conflict and intimacy in one relationship were contemporaneously intertwined with changes in the other relationship. Additionally, prior couple intimacy and conflict predicted future parent-adult child relations in 7 out of 14 cross-lagged pathways examined, but parent-adult child conflict and intimacy was only associated with future couple interactions in 1 pathway. These associations were not moderated by the gender of parents or the adult child or whether the adult child was a young adult or nearing midlife. Frequency of contact between parents and the adult child moderated some associations. Adults simultaneously juggle ties with parents and intimate partners, and this study provides strong evidence supporting the coordinated development of conflictual and intimate patterns of interaction in each relationship. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
The interpersonal theory of suicide: A systematic review and meta-analysis of a decade of cross-national research.

PubMed

Chu, Carol; Buchman-Schmitt, Jennifer M; Stanley, Ian H; Hom, Melanie A; Tucker, Raymond P; Hagan, Christopher R; Rogers, Megan L; Podlogar, Matthew C; Chiurliza, Bruno; Ringer, Fallon B; Michaels, Matthew S; Patros, Connor H G; Joiner, Thomas E

2017-12-01

Over the past decade, the interpersonal theory of suicide has contributed to substantial advances in the scientific and clinical understanding of suicide and related conditions. The interpersonal theory of suicide posits that suicidal desire emerges when individuals experience intractable feelings of perceived burdensomeness and thwarted belongingness and near-lethal or lethal suicidal behavior occurs in the presence of suicidal desire and capability for suicide. A growing number of studies have tested these posited pathways in various samples; however, these findings have yet to be evaluated meta-analytically. This paper aimed to (a) conduct a systematic review of the unpublished and published, peer-reviewed literature examining the relationship between interpersonal theory constructs and suicidal thoughts and behaviors, (b) conduct meta-analyses testing the interpersonal theory hypotheses, and (c) evaluate the influence of various moderators on these relationships. Four electronic bibliographic databases were searched through the end of March, 2016: PubMed, Medline, PsycINFO, and Web of Science. Hypothesis-driven meta-analyses using random effects models were conducted using 122 distinct unpublished and published samples. Findings supported the interpersonal theory: the interaction between thwarted belongingness and perceived burdensomeness was significantly associated with suicidal ideation; and the interaction between thwarted belongingness, perceived burdensomeness, and capability for suicide was significantly related to a greater number of prior suicide attempts. However, effect sizes for these interactions were modest. Alternative configurations of theory variables were similarly useful for predicting suicide risk as theory-consistent pathways. We conclude with limitations and recommendations for the interpersonal theory as a framework for understanding the suicidal spectrum. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
MGDB: a comprehensive database of genes involved in melanoma.

PubMed

Zhang, Di; Zhu, Rongrong; Zhang, Hanqian; Zheng, Chun-Hou; Xia, Junfeng

2015-01-01

The Melanoma Gene Database (MGDB) is a manually curated catalog of molecular genetic data relating to genes involved in melanoma. The main purpose of this database is to establish a network of melanoma related genes and to facilitate the mechanistic study of melanoma tumorigenesis. The entries describing the relationships between melanoma and genes in the current release were manually extracted from PubMed abstracts; to date, the release contains 527 human melanoma genes (422 protein-coding and 105 non-coding genes). Each melanoma gene was annotated in seven different aspects (General Information, Expression, Methylation, Mutation, Interaction, Pathway and Drug). In addition, manually curated literature references have also been provided to support the inclusion of the gene in MGDB and establish its association with melanoma. MGDB has a user-friendly web interface with multiple browse and search functions. We hope MGDB will enrich our knowledge about melanoma genetics and serve as a useful complement to the existing public resources. Database URL: http://bioinfo.ahu.edu.cn:8080/Melanoma/index.jsp. © The Author(s) 2015. Published by Oxford University Press.
HMDB 4.0: the human metabolome database for 2018

PubMed Central

Feunang, Yannick Djoumbou; Marcu, Ana; Guo, An Chi; Liang, Kevin; Vázquez-Fresno, Rosa; Sajed, Tanvir; Johnson, Daniel; Li, Carin; Karu, Naama; Sayeeda, Zinat; Lo, Elvis; Assempour, Nazanin; Berjanskii, Mark; Singhal, Sandeep; Arndt, David; Liang, Yonjie; Badran, Hasan; Grant, Jason; Serra-Cayuela, Arnau; Liu, Yifeng; Mandal, Rupa; Neveu, Vanessa; Pon, Allison; Knox, Craig; Wilson, Michael; Manach, Claudine; Scalbert, Augustin

2018-01-01

The Human Metabolome Database or HMDB (www.hmdb.ca) is a web-enabled metabolomic database containing comprehensive information about human metabolites along with their biological roles, physiological concentrations, disease associations, chemical reactions, metabolic pathways, and reference spectra. First described in 2007, the HMDB is now considered the standard metabolomic resource for human metabolic studies. Over the past decade the HMDB has continued to grow and evolve in response to emerging needs for metabolomics researchers and continuing changes in web standards. This year's update, HMDB 4.0, represents the most significant upgrade to the database in its history. For instance, the number of fully annotated metabolites has increased by nearly threefold, the number of experimental spectra has grown by almost fourfold and the number of illustrated metabolic pathways has grown by a factor of almost 60. Significant improvements have also been made to the HMDB’s chemical taxonomy, chemical ontology, spectral viewing, and spectral/text searching tools. A great deal of brand new data has also been added to HMDB 4.0. This includes large quantities of predicted MS/MS and GC–MS reference spectral data as well as predicted (physiologically feasible) metabolite structures to facilitate novel metabolite identification. Additional information on metabolite-SNP interactions and the influence of drugs on metabolite levels (pharmacometabolomics) has also been added. Many other important improvements in the content, the interface, and the performance of the HMDB website have been made and these should greatly enhance its ease of use and its potential applications in nutrition, biochemistry, clinical chemistry, clinical genetics, medicine, and metabolomics science. PMID:29140435
From genomics to chemical genomics: new developments in KEGG

PubMed Central

Kanehisa, Minoru; Goto, Susumu; Hattori, Masahiro; Aoki-Kinoshita, Kiyoko F.; Itoh, Masumi; Kawashima, Shuichi; Katayama, Toshiaki; Araki, Michihiro; Hirakawa, Mika

2006-01-01

The increasing amount of genomic and molecular information is the basis for understanding higher-order biological systems, such as the cell and the organism, and their interactions with the environment, as well as for medical, industrial and other practical applications. The KEGG resource provides a reference knowledge base for linking genomes to biological systems, categorized as building blocks in the genomic space (KEGG GENES) and the chemical space (KEGG LIGAND), and wiring diagrams of interaction networks and reaction networks (KEGG PATHWAY). A fourth component, KEGG BRITE, has been formally added to the KEGG suite of databases. This reflects our attempt to computerize functional interpretations as part of the pathway reconstruction process based on the hierarchically structured knowledge about the genomic, chemical and network spaces. In accordance with the new chemical genomics initiatives, the scope of KEGG LIGAND has been significantly expanded to cover both endogenous and exogenous molecules. Specifically, RPAIR contains curated chemical structure transformation patterns extracted from known enzymatic reactions, which would enable analysis of genome-environment interactions, such as the prediction of new reactions and new enzyme genes that would degrade new environmental compounds. Additionally, drug information is now stored separately and linked to new KEGG DRUG structure maps. PMID:16381885
Drug-drug interactions as a result of co-administering Δ9-THC and CBD with other psychotropic agents.

PubMed

Rong, Carola; Carmona, Nicole E; Lee, Yena L; Ragguett, Renee-Marie; Pan, Zihang; Rosenblat, Joshua D; Subramaniapillai, Mehala; Shekotikhina, Margarita; Almatham, Fahad; Alageel, Asem; Mansur, Rodrigo; Ho, Roger C; McIntyre, Roger S

2018-01-01

To determine, via narrative, non-systematic review of pre-clinical and clinical studies, whether the effect of cannabis on hepatic biotransformation pathways would be predicted to result in clinically significant drug-drug interactions (DDIs) with commonly prescribed psychotropic agents. Areas covered: A non-systematic literature search was conducted using the following databases: PubMed, PsycInfo, and Scopus from inception to January 2017. The search term cannabis was cross-referenced with the terms drug interactions, cytochrome, cannabinoids, cannabidiol, and medical marijuana. Pharmacological, molecular, and physiologic studies evaluating the pharmacokinetics of Δ9-tetrahydrocannabinol (Δ9-THC) and cannabidiol (CBD), both in vitro and in vivo, were included. Bibliographies were also manually searched for additional citations that were relevant to the overarching aim of this paper. Expert opinion: Δ9-Tetrahydrocannabinol and CBD are substrates and inhibitors of cytochrome P450 enzymatic pathways relevant to the biotransformation of commonly prescribed psychotropic agents. The high frequency and increasing use of cannabis invites the need for healthcare providers to familiarize themselves with potential DDIs in persons receiving select psychotropic agents, and additionally consuming medical marijuana and/or recreational marijuana.
Controversial roles played by toll like receptor 4 in urinary bladder cancer; A systematic review.

PubMed

Afsharimoghaddam, Amin; Soleimani, Mohammad; Lashay, Alireza; Dehghani, Mahdi; Sepehri, Zahra

2016-08-01

Urinary bladder cancer (UBC) is a prevalent human cancer. The main mechanisms that lead to eradication or progression of the disease have yet to be clarified. Toll like receptor (TLR) 4 is a membrane receptor which is expressed on either immune cells or tumor cells. This review article aimed to clarify the main mechanisms played by TLR4 and its related intracellular pathways in the outcome of UBC. The PubMed, Scopus and Google Scholar databases were searched for research articles that evaluated the roles played by TLR4 and its related intracellular pathways in the outcome of UBC. Information collected from the related articles revealed that TLR4 participates either in the induction of immune responses against UBC or in the development of the malignancy. There are limited investigations regarding the genetic variations of TLR4 in UBC. According to the results, it seems that the outcome of TLR4/ligand interaction depends on several factors, including TLR4 ligand doses, interaction of TLR4 with its ligands on immune cells or tumor cells, and simultaneous interaction of other TLRs with their ligands. Copyright © 2016 Elsevier Inc. All rights reserved.
SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics

PubMed Central

2013-01-01

Background: Alternative splicing is an important and widespread mechanism for generating protein diversity and regulating protein expression. High-throughput identification and analysis of alternative splicing at the protein level has more advantages than at the mRNA level. The combination of an alternative splicing database and tandem mass spectrometry provides a powerful technique for identification, analysis and characterization of potential novel alternative splicing protein isoforms from proteomics. Therefore, based on the peptidomic database of human protein isoforms for proteomics experiments, our objective is to design a new alternative splicing database to 1) provide more coverage of genes, transcripts and alternative splicing, 2) exclusively focus on the alternative splicing, and 3) perform context-specific alternative splicing analysis. Results: We used a three-step pipeline to create a synthetic alternative splicing database (SASD) to identify novel alternative splicing isoforms and interpret them in the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing. First, we extracted information on gene structures of all genes in the Ensembl Genes 71 database and incorporated the Integrated Pathway Analysis Database. Then, we compiled artificial splicing transcripts. Lastly, we translated the artificial transcripts into alternative splicing peptides. The SASD is a comprehensive database containing 56,630 genes (Ensembl gene IDs), 95,260 transcripts (Ensembl transcript IDs), and 11,919,779 Alternative Splicing peptides, and also covering about 1,956 pathways, 6,704 diseases, 5,615 drugs, and 52 organs. The database has a web-based user interface that allows users to search, display and download alternative splicing related to a single gene/transcript/protein, custom gene set, pathway, disease, drug, or organ. Moreover, the quality of the database was validated by comparison to other known databases and by two case studies: 1) in liver cancer and 2) in breast cancer. Conclusions: The SASD provides the scientific community with an efficient means to identify, analyze, and characterize novel Exon Skipping and Intron Retention protein isoforms from mass spectrometry and interpret them in the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing. PMID:24267658
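The second and third steps of the pipeline described above (compiling artificial splicing transcripts, then translating them into peptides) can be illustrated with a small sketch. It is not the SASD implementation. It covers only single exon skipping (not intron retention), assumes the exons are already in frame, uses the standard genetic code, and the names exon_skipping_transcripts and translate are hypothetical.

```python
from itertools import product

# Standard genetic code, laid out in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def exon_skipping_transcripts(exons):
    """Yield (skipped_index, transcript) with one internal exon removed."""
    for i in range(1, len(exons) - 1):
        yield i, "".join(exons[:i] + exons[i + 1:])


def translate(seq):
    """Translate from position 0 and stop at the first stop codon."""
    peptide = []
    for pos in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Toy three-exon gene; the sequences are made up for illustration.
exons = ["ATGGCT", "GAAGAA", "TGGTAA"]
canonical = translate("".join(exons))
skipped = {i: translate(t) for i, t in exon_skipping_transcripts(exons)}
```

A peptide that appears only in the skipped isoform, such as one spanning the new exon 1 to exon 3 junction, is the kind of sequence a mass spectrometry search could use to detect the isoform.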
Transcriptomic analysis of flower development in tea (Camellia sinensis (L.)).

PubMed

Liu, Feng; Wang, Yu; Ding, Zhaotang; Zhao, Lei; Xiao, Jun; Wang, Linjun; Ding, Shibo

2017-10-05

Flowering is a critical and complicated process in plant development, involving interactions of numerous endogenous and environmental factors, but little is known about the complex network regulating flower development in tea plants. In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptomic analysis assembles gene-related information involved in reproductive growth of C. sinensis. Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicated that metabolic pathways, biosynthesis of secondary metabolites, and plant hormone signal transduction were enriched among the DEGs. Furthermore, 207 flowering-associated unigenes were identified from our database. Some transcription factors, such as WRKY, ERF, bHLH, MYB and MADS-box were shown to be up-regulated in floral transition, which might play the role of progression of flowering. Furthermore, 14 genes were selected for confirmation of expression levels using quantitative real-time PCR (qRT-PCR). The comprehensive transcriptomic analysis presents fundamental information on the genes and pathways which are involved in flower development in C. sinensis. Our data also provided a useful database for further research of tea and other species of plants. Copyright © 2017 Elsevier B.V. All rights reserved.
Analyzing Gene Expression Profiles with Preliminary Validations in Cardiac Hypertrophy Induced by Pressure-overload.

PubMed

Gao, Jing; Li, Yuhong; Wang, Tongmei; Shi, Zhuo; Zhang, Yiqi; Liu, Shuang; Wen, Pushuai; Ma, Chunyan

2018-03-06

The aim of this study was to identify the key genes involved in the cardiac hypertrophy (CH) induced by pressure overload. The mRNA microarray datasets GSE5500 and GSE18801 were downloaded from the GEO database, and differentially expressed genes (DEGs) were screened using the Limma package; then, functional and pathway enrichment analyses were performed for common DEGs using the DAVID database. Furthermore, the top DEGs were validated using qPCR in hypertrophic heart tissue induced by isoprenaline (ISO). A total of 113 common DEGs with absolute fold change >0.5, including 60 significantly up-regulated DEGs and 53 down-regulated DEGs, were obtained. GO term enrichment analysis suggested that the common up-regulated DEGs were mainly enriched in neutrophil chemotaxis, extracellular fibril organization and cell proliferation, and the common down-regulated genes were significantly enriched in ion transport, endoplasmic reticulum and dendritic spine. KEGG pathway analysis found that the common DEGs were mainly enriched in ECM-receptor interaction, phagosome, and focal adhesion. Additionally, the expression of Mfap4, Ltbp2, Aspn, Serpina3n, and Cnksr1 was up-regulated in the model of cardiac hypertrophy, while the expression of Anp32a was down-regulated. The current study identified the key deregulated genes and pathways involved in CH, which could shed new light on the mechanism of CH.
Bioinformatics functional analysis of let-7a, miR-34a, and miR-199a/b reveals novel insights into immune system pathways and cancer hallmarks for hepatocellular carcinoma.

PubMed

Soliman, Bangly; Salem, Ahmed; Ghazy, Mohamed; Abu-Shahba, Nourhan; El Hefnawi, Mahmoud

2018-05-01

Let-7a, miR-34a, and miR-199a/b have gained great attention as master regulators of cellular processes. In particular, these three micro-RNAs act as potential onco-suppressors for hepatocellular carcinoma. Bioinformatics can reveal the functionality of these micro-RNAs through target prediction and functional annotation analysis. In the current study, in silico analysis using innovative servers (miRror Suite, DAVID, miRGator V3.0, GeneTrail) has demonstrated the combinatorial and the individual target genes of these micro-RNAs and further explored their roles in hepatocellular carcinoma progression. There were 87 common target messenger RNAs (p ≤ 0.05) that were predicted to be regulated by the three micro-RNAs using the miRror 2.0 target prediction tool. In addition, the functional enrichment analysis of these targets, performed with the DAVID functional annotation and REACTOME tools, revealed two major immune-related pathways, eight hepatocellular carcinoma hallmarks-linked pathways, and two pathways that mediate interconnected processes between the immune system and hepatocellular carcinoma hallmarks. Moreover, a protein-protein interaction network for the predicted common targets was obtained by using the STRING database. The individual analysis of target genes and pathways for the three micro-RNAs of interest using the miRGator V3.0 and GeneTrail servers revealed some novel predicted target oncogenes such as SOX4, which we validated experimentally, in addition to some regulated pathways of the immune system and hepatocarcinogenesis such as the insulin signaling pathway and the adipocytokine signaling pathway. In general, our results demonstrate that let-7a, miR-34a, and miR-199a/b have novel interactions in different immune system pathways and major hepatocellular carcinoma hallmarks. Thus, our findings shed more light on the roles of these miRNAs as cancer silencers.

Targetome Analysis Revealed Involvement of MiR-126 in Neurotrophin Signaling Pathway: A Possible Role in Prevention of Glioma Development.

PubMed

Rouigari, Maedeh; Dehbashi, Moein; Ghaedi, Kamran; Pourhossein, Meraj

2018-07-01

For the first time, we used molecular signaling pathway enrichment analysis to determine possible involvement of miR-126 and IRS-1 in the neurotrophin pathway. In this prospective study, validated and predicted targets (targetome) of miR-126 were collected by searching the miRTarBase (http://mirtarbase.mbc.nctu.edu.tw/) and miRWalk 2.0 databases, respectively. Then, approximate expression of miR-126 targets in glioma tissue was examined using the UniGene database (http://www.ncbi.nlm.nih.gov/unigene). In silico molecular pathway enrichment analysis was carried out with the DAVID 6.7 database (http://david.abcc.ncifcrf.gov/) to explore which signaling pathways are related to miR-126 targeting and how miR-126 contributes to glioma development. MiR-126 exerts a variety of functions in cancer pathogenesis via suppression of the expression of target genes including PI3K, KRAS, EGFL7, IRS-1 and VEGF. Our bioinformatic studies using the DAVID database showed the involvement of miR-126 target genes in several signaling pathways including cancer pathogenesis, neurotrophin functions, glioma formation, insulin function, focal adhesion production, chemokine synthesis and secretion and regulation of the actin cytoskeleton. Taken together, we concluded that miR-126 enhances the formation of glioma cancer stem cells, probably via down regulation of IRS-1 in the neurotrophin signaling pathway. Copyright © by Royan Institute. All rights reserved.
A dedicated database system for handling multi-level data in systems biology.

PubMed

Pornputtapong, Natapol; Wanichthanarak, Kwanjeera; Nilsson, Avlant; Nookaew, Intawat; Nielsen, Jens

2014-01-01

Advances in high-throughput technologies have enabled extensive generation of multi-level omics data. These data are crucial for systems biology research, though they are complex, heterogeneous, highly dynamic, incomplete and distributed among public databases. This leads to difficulties in data accessibility and often results in errors when data are merged and integrated from varied resources. Therefore, integration and management of systems biological data remain very challenging. To overcome this, we designed and developed a dedicated database system that can serve and solve the vital issues in data management and hereby facilitate data integration, modeling and analysis in systems biology within a sole database. In addition, a yeast data repository was implemented as an integrated database environment which is operated by the database system. Two applications were implemented to demonstrate extensibility and utilization of the system. Both illustrate how the user can access the database via the web query function and implemented scripts. These scripts are specific for two sample cases: 1) Detecting the pheromone pathway in protein interaction networks; and 2) Finding metabolic reactions regulated by Snf1 kinase. In this study we present the design of database system which offers an extensible environment to efficiently capture the majority of biological entities and relations encountered in systems biology. Critical functions and control processes were designed and implemented to ensure consistent, efficient, secure and reliable transactions. The two sample cases on the yeast integrated data clearly demonstrate the value of a sole database environment for systems biology research.
An Approach for Identification of Novel Drug Targets in Streptococcus pyogenes SF370 Through Pathway Analysis.

PubMed

Singh, Satendra; Singh, Dev Bukhsh; Singh, Anamika; Gautam, Budhayash; Ram, Gurudayal; Dwivedi, Seema; Ramteke, Pramod W

2016-12-01

Streptococcus pyogenes is one of the most important pathogens as it is involved in various infections affecting the upper respiratory tract and skin. Due to the emergence of multidrug resistance and cross-resistance, S. pyogenes is becoming more pathogenic and dangerous. In the present study, an in silico comparative analysis of a total of 65 metabolic pathways of the host (Homo sapiens) and the pathogen was performed. Initially, 486 paralogous enzymes were identified so that they could be removed from the list of possible drug targets. The 105 enzymes of the biochemical pathways of S. pyogenes from the KEGG metabolic pathway database were compared with the proteins from Homo sapiens by performing a BLASTP search against the non-redundant database restricted to the Homo sapiens subset. Out of these, 83 enzymes were identified as non-homologous to human proteins, of which 30 enzymes of inadequate amino acid length were excluded from further processing. Essential enzymes were then mined from the remaining 53 enzymes. Finally, 28 essential enzymes were identified in S. pyogenes SF370 (serotype M1). In the subcellular localization study, 18 enzymes were predicted to have cytoplasmic localization and ten enzymes membrane localization. These ten enzymes with putative membrane localization should be of particular interest. Acyl-carrier-protein S-malonyltransferase, DNA polymerase III subunit beta and dihydropteroate synthase are novel drug targets and thus can be used to design potential inhibitors against S. pyogenes infection. The 3D structure of dihydropteroate synthase was modeled and validated, and can be used for virtual screening and interaction studies of potential inhibitors with the target enzyme.
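The filtering cascade in the abstract above (drop human homologs, drop short sequences, keep essential enzymes) amounts to three successive filters. The sketch below shows only that logic, on precomputed annotations. The Enzyme record, the E-value cutoff of 1e-3 and the minimum length of 100 residues are assumptions chosen for illustration, since the abstract does not state the thresholds used, and the toy records are not real S. pyogenes data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Enzyme:
    name: str
    length: int                          # sequence length in amino acids
    best_human_evalue: Optional[float]   # None means no BLASTP hit in human
    essential: bool                      # from an essentiality annotation


def candidate_targets(enzymes, evalue_cutoff=1e-3, min_length=100):
    """Return names of enzymes that survive all three filters, in order."""
    # 1) Keep enzymes without a significant human homolog.
    non_homologous = [
        e for e in enzymes
        if e.best_human_evalue is None or e.best_human_evalue > evalue_cutoff
    ]
    # 2) Drop sequences considered too short (hypothetical cutoff).
    long_enough = [e for e in non_homologous if e.length >= min_length]
    # 3) Keep only enzymes annotated as essential.
    return [e.name for e in long_enough if e.essential]


# Toy records with placeholder names.
toy = [
    Enzyme("enzA", 250, None, True),    # passes every filter
    Enzyme("enzB", 300, 1e-50, True),   # strong human homolog, removed
    Enzyme("enzC", 60, None, True),     # too short, removed
    Enzyme("enzD", 400, 5.0, False),    # not essential, removed
    Enzyme("enzE", 180, 0.5, True),     # weak hit only, kept
]
targets = candidate_targets(toy)
```

Keeping each filter as its own step makes it easy to report the count remaining after every stage, as the abstract does (105, 83, 53, 28).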
The chemokine receptor CCR1 is identified in mast cell-derived exosomes.

PubMed

Liang, Yuting; Qiao, Longwei; Peng, Xia; Cui, Zelin; Yin, Yue; Liao, Huanjin; Jiang, Min; Li, Li

2018-01-01

Mast cells are important effector cells of the immune system, and mast cell-derived exosomes carrying RNAs play a role in immune regulation. However, the molecular function of mast cell-derived exosomes is currently unknown, and here, we identify differentially expressed genes (DEGs) in mast cells and exosomes. We isolated mast cell-derived exosomes through differential centrifugation and screened the DEGs from mast cell-derived exosomes, using the GSE25330 array dataset downloaded from the Gene Expression Omnibus database. Biochemical pathways were analyzed by Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis using the online tool DAVID. DEG-associated protein-protein interaction networks (PPIs) were constructed using the STRING database and Cytoscape software. The genes identified from these bioinformatics analyses were verified by qRT-PCR and Western blot in mast cells and exosomes. We identified 2121 DEGs (843 up-regulated and 1278 down-regulated genes) in HMC-1 cell-derived exosomes and HMC-1 cells. The up-regulated DEGs were classified into two significant modules. The chemokine receptor CCR1 was screened as a hub gene and enriched in the cytokine-mediated signaling pathway in module one. Seven genes, including CCR1, CD9, KIT, TGFBR1, TLR9, TPSAB1 and TPSB2, were screened and validated through qRT-PCR analysis. We have achieved a comprehensive view of the pivotal genes and pathways in mast cells and exosomes and identified CCR1 as a hub gene in mast cell-derived exosomes. Our results provide novel clues with respect to the biological processes through which mast cell-derived exosomes modulate immune responses.
Columba: an integrated database of proteins, structures, and annotations.

PubMed

Trissl, Silke; Rother, Kristian; Müller, Heiko; Steinke, Thomas; Koch, Ina; Preissner, Robert; Frömmel, Cornelius; Leser, Ulf

2005-03-31

Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures. COLUMBA currently integrates twelve different databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. The database can be searched using either keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for instance, participate in a particular pathway, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and whose structures have a resolution under a defined threshold. The results of queries are provided in both machine-readable extensible markup language and human-readable format. The structures themselves can be viewed interactively on the web. The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. It allows queries to be combined across a number of structure-related databases not covered by other projects at present. Thus, information on both many and few protein structures can be used efficiently. The web interface for COLUMBA is available at http://www.columba-db.de.
A Brief Review of RNA–Protein Interaction Database Resources

PubMed Central

Yi, Ying; Zhao, Yue; Huang, Yan; Wang, Dong

2017-01-01

RNA–Protein interactions play critical roles in various biological processes. By collecting and analyzing the RNA–Protein interactions and binding sites from experiments and predictions, RNA–Protein interaction databases have become an essential resource for the exploration of the transcriptional and post-transcriptional regulatory network. Here, we briefly review several widely used RNA–Protein interaction database resources developed in recent years to provide a guide to these databases. The content and major functions of the databases are presented. The brief descriptions of the databases help users to quickly choose the database containing the information they are interested in. In short, these RNA–Protein interaction database resources are continually updated, but the current state shows the efforts to identify and analyze the large amount of RNA–Protein interactions. PMID:29657278
The Hippo/YAP pathway interacts with EGFR signaling and HPV oncoproteins to regulate cervical cancer progression

PubMed Central

He, Chunbo; Mao, Dagan; Hua, Guohua; Lv, Xiangmin; Chen, Xingcheng; Angeletti, Peter C; Dong, Jixin; Remmenga, Steven W; Rodabaugh, Kerry J; Zhou, Jin; Lambert, Paul F; Yang, Peixin; Davis, John S; Wang, Cheng

2015-01-01

The Hippo signaling pathway controls organ size and tumorigenesis through a kinase cascade that inactivates Yes-associated protein (YAP). Here, we show that YAP plays a central role in controlling the progression of cervical cancer. Our results suggest that YAP expression is associated with a poor prognosis for cervical cancer. TGF-α and amphiregulin (AREG), via EGFR, inhibit the Hippo signaling pathway and activate YAP to induce cervical cancer cell proliferation and migration. Activated YAP allows for up-regulation of TGF-α, AREG, and EGFR, forming a positive signaling loop to drive cervical cancer cell proliferation. HPV E6 protein, a major etiological molecule of cervical cancer, maintains high YAP protein levels in cervical cancer cells by preventing proteasome-dependent YAP degradation to drive cervical cancer cell proliferation. Results from human cervical cancer genomic databases and an accepted transgenic mouse model strongly support the clinical relevance of the discovered feed-forward signaling loop. Our study indicates that combined targeting of the Hippo and the ERBB signaling pathways represents a novel therapeutic strategy for prevention and treatment of cervical cancer. PMID:26417066
Exploring consumer exposure pathways and patterns of use for chemicals in the environment through the Chemical/Product Categories Database

EPA Pesticide Factsheets

Exploring consumer exposure pathways and patterns of use for chemicals in the environment through the Chemical/Product Categories Database (CPCat). Presented by Kathie Dionisio, Sc.D., NERL, US EPA, Research Triangle Park, NC (1/23/2014).
Metabolome searcher: a high throughput tool for metabolite identification and metabolic pathway mapping directly from mass spectrometry and using genome restriction.

PubMed

Dhanasekaran, A Ranjitha; Pearson, Jon L; Ganesan, Balasubramanian; Weimer, Bart C

2015-02-25

Mass spectrometric analysis of microbial metabolism provides a long list of possible compounds. Restricting the identification of the possible compounds to those produced by the specific organism would benefit the identification process. Currently, identification of mass spectrometry (MS) data is commonly done using empirically derived compound databases. Unfortunately, most databases contain relatively few compounds, leaving long lists of unidentified molecules. Incorporating genome-encoded metabolism enables MS output identification that may not be included in databases. Using an organism's genome as a database restricts metabolite identification to only those compounds that the organism can produce. To address the challenge of metabolomic analysis from MS data, a web-based application to directly search genome-constructed metabolic databases was developed. The user query returns a genome-restricted list of possible compound identifications along with the putative metabolic pathways based on the name, formula, SMILES structure, and the compound mass as defined by the user. Multiple queries can be done simultaneously by submitting a text file created by the user or obtained from the MS analysis software. The user can also provide parameters specific to the experiment's MS analysis conditions, such as mass deviation, adducts, and detection mode during the query so as to provide additional levels of evidence to produce the tentative identification. The query results are provided as an HTML page and downloadable text file of possible compounds that are restricted to a specific genome. Hyperlinks provided in the HTML file connect the user to the curated metabolic databases housed in ProCyc, a Pathway Tools platform, as well as the KEGG Pathway database for visualization and metabolic pathway analysis. Metabolome Searcher, a web-based tool, facilitates putative compound identification of MS output based on genome-restricted metabolic capability. 
This enables researchers to rapidly extend the possible identifications of large data sets for metabolites that are not in compound databases. Putative compound names with their associated metabolic pathways from metabolomics data sets are returned to the user for additional biological interpretation and visualization. This novel approach enables compound identification by restricting the possible masses to those encoded in the genome.
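As an illustration of the genome-restricted mass query described in this abstract, the following minimal sketch matches an observed m/z value against a small compound list using an adduct table and a ppm mass-deviation window. It is not Metabolome Searcher's code: the compound list, the adduct table, and the function name `match_mz` are assumptions made for illustration only.

```python
# Hypothetical genome-restricted compound list: name -> monoisotopic mass (Da).
# In the tool described above this list would be derived from an organism's
# genome-encoded metabolism; here it is a hand-written toy example.
GENOME_COMPOUNDS = {
    "pyruvate": 88.01604,
    "L-alanine": 89.04768,
    "L-lactate": 90.03169,
}

# Mass shifts (Da) for a few common singly charged adducts, per detection mode.
ADDUCTS = {
    "positive": {"[M+H]+": 1.007276, "[M+Na]+": 22.989218},
    "negative": {"[M-H]-": -1.007276},
}


def match_mz(observed_mz, mode="positive", ppm_tol=10.0):
    """Return (compound, adduct, ppm error) tuples within the ppm tolerance."""
    hits = []
    for adduct, shift in ADDUCTS[mode].items():
        # Remove the adduct shift to recover the candidate neutral mass.
        neutral = observed_mz - shift
        for name, mass in GENOME_COMPOUNDS.items():
            ppm = abs(neutral - mass) / mass * 1e6
            if ppm <= ppm_tol:
                hits.append((name, adduct, round(ppm, 2)))
    # Best (smallest deviation) candidates first.
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    print(match_mz(90.054956, mode="positive"))
    print(match_mz(87.008764, mode="negative"))
```

Restricting the candidate list to compounds the organism can produce is what keeps the hit list short; widening `ppm_tol` or adding adducts trades specificity for recall.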
Concept mapping One-Carbon Metabolism to model future ontologies for nutrient-gene-phenotype interactions.

PubMed

Joslin, A C; Green, R; German, J B; Lange, M C

2014-09-01

Advances in the development of bioinformatic tools continue to improve investigators' ability to interrogate, organize, and derive knowledge from large amounts of heterogeneous information. These tools often require advanced technical skills not possessed by life scientists. User-friendly, low-barrier-to-entry methods of visualizing nutrigenomics information are yet to be developed. We utilized concept mapping software from the Institute for Human and Machine Cognition to create a conceptual model of diet and health-related data that provides a foundation for future nutrigenomics ontologies describing published nutrient-gene/polymorphism-phenotype data. In this model, maps containing phenotype, nutrient, gene product, and genetic polymorphism interactions are visualized as triples of two concepts linked together by a linking phrase. These triples, or "knowledge propositions," contextualize aggregated data and information into easy-to-read knowledge maps. Maps of these triples enable visualization of genes spanning the One-Carbon Metabolism (OCM) pathway, their sequence variants, and multiple literature-mined associations including concepts relevant to nutrition, phenotypes, and health. The concept map development process documents the incongruity of information derived from pathway databases versus literature resources. This conceptual model highlights the importance of incorporating information about genes in upstream pathways that provide substrates, as well as downstream pathways that utilize products of the pathway under investigation, in this case OCM. Other genes and their polymorphisms, such as TCN2 and FUT2, although not directly involved in OCM, potentially alter OCM pathway functionality. These upstream gene products regulate substrates such as B12. Constellations of polymorphisms affecting the functionality of genes along OCM, together with substrate and cofactor availability, may impact resultant phenotypes. 
These conceptual maps provide a foundational framework for development of nutrient-gene/polymorphism-phenotype ontologies and systems visualization.
Differences in gene expression profiles and signaling pathways in rhabdomyolysis-induced acute kidney injury.

PubMed

Geng, Xiaodong; Wang, Yuanda; Hong, Quan; Yang, Jurong; Zheng, Wei; Zhang, Gang; Cai, Guangyan; Chen, Xiangmei; Wu, Di

2015-01-01

Rhabdomyolysis is a threatening syndrome because it causes the breakdown of skeletal muscle. Muscle destruction leads to the release of myoglobin, intracellular proteins, and electrolytes into the circulation. The aim of this study was to investigate the differences in gene expression profiles and signaling pathways upon rhabdomyolysis-induced acute kidney injury (AKI). In this study, we used glycerol-induced renal injury as a model of rhabdomyolysis-induced AKI. We analyzed data and relevant information from the Gene Expression Omnibus database (No: GSE44925). The gene expression data for three untreated mice were compared to data for five mice with rhabdomyolysis-induced AKI. The expression profiling of the three untreated mice and the five rhabdomyolysis-induced AKI mice was performed using microarray analysis. We examined the levels of Cyp3a13, Rela, Aldh7a1, Jun, CD14, and Cdkn1a using RT-PCR to determine the accuracy of the microarray results. The microarray analysis showed that there were 1050 downregulated and 659 upregulated genes in the rhabdomyolysis-induced AKI mice compared to the control group. The interactions of all differentially expressed genes in the Signal-Net were analyzed. Cyp3a13 and Rela had the most interactions with other genes. The data showed that Rela and Aldh7a1 were the key nodes and had important positions in the Signal-Net. The genes Jun, CD14, and Cdkn1a were also significantly upregulated. The pathway analysis classified the differentially expressed genes into 71 downregulated and 48 upregulated pathways including the PI3K/Akt, MAPK, and NF-κB signaling pathways. The results of this study indicate that the NF-κB, MAPK, PI3K/Akt, and apoptotic pathways are regulated in rhabdomyolysis-induced AKI.
Elucidation of metabolic pathways from enzyme classification data.

PubMed

McDonald, Andrew G; Tipton, Keith F

2014-01-01

The IUBMB Enzyme List is widely used by other databases as a source for avoiding ambiguity in the recognition of enzymes as catalytic entities. However, it was not designed for metabolic pathway tracing, which has become increasingly important in systems biology. A Reactions Database has been created from the material in the Enzyme List to allow reactions to be searched by substrate/product, and pathways to be traced from any selected starting/seed substrate. An extensive synonym glossary allows searches by many of the alternative names, including accepted abbreviations, by which a chemical compound may be known. This database was necessary for the development of the application Reaction Explorer ( http://www.reaction-explorer.org ), which was written in Real Studio ( http://www.realsoftware.com/realstudio/ ) to search the Reactions Database and draw metabolic pathways from reactions selected by the user. Having input the name of the starting compound (the "seed"), the user is presented with a list of all reactions containing that compound and then selects the product of interest as the next point on the ensuing graph. The pathway diagram is then generated as the process iterates. A contextual menu is provided, which allows the user: (1) to remove a compound from the graph, along with all associated links; (2) to search the reactions database again for additional reactions involving the compound; (3) to search for the compound within the Enzyme List.
Pathway and network-based analysis of genome-wide association studies and RT-PCR validation in polycystic ovary syndrome

PubMed Central

Shen, Haoran; Liang, Zhou; Zheng, Saihua; Li, Xuelian

2017-01-01

The purpose of this study was to identify promising candidate genes and pathways in polycystic ovary syndrome (PCOS). Microarray dataset GSE34526 obtained from the Gene Expression Omnibus database includes 7 granulosa cell samples from PCOS patients, and 3 normal granulosa cell samples. Differentially expressed genes (DEGs) were screened between PCOS and normal samples. Pathway enrichment analysis was conducted for DEGs using ClueGO and CluePedia plugin of Cytoscape. A Reactome functional interaction (FI) network of the DEGs was built using ReactomeFIViz, and then network modules were extracted, followed by pathway enrichment analysis for the modules. Expression of DEGs in granulosa cell samples was measured using quantitative RT-PCR. A total of 674 DEGs were retained, which were significantly enriched with inflammation and immune-related pathways. Eight modules were extracted from the Reactome FI network. Pathway enrichment analysis revealed significant pathways of each module: module 0, Regulation of RhoA activity and Signaling by Rho GTPases pathways shared ARHGAP4 and ARHGAP9; module 2, GlycoProtein VI-mediated activation cascade pathway was enriched with RHOG; module 3, Thromboxane A2 receptor signaling, Chemokine signaling pathway, CXCR4-mediated signaling events pathways were enriched with LYN, the hub gene of module 3. Results of RT-PCR confirmed the finding of the bioinformatic analysis that ARHGAP4, ARHGAP9, RHOG and LYN were significantly upregulated in PCOS. RhoA-related pathways, GlycoProtein VI-mediated activation cascade pathway, ARHGAP4, ARHGAP9, RHOG and LYN may be involved in the pathogenesis of PCOS. PMID:28949383
Screening of potential genes contributing to the macrocycle drug resistance of C. albicans via microarray analysis

PubMed Central

Yang, Jing; Zhang, Wei; Sun, Jian; Xi, Zhiqin; Qiao, Zusha; Zhang, Jinyu; Wang, Yan; Ji, Ying; Feng, Wenli

2017-01-01

The aim of the present study was to investigate the potential genes involved in drug resistance of Candida albicans (C. albicans) by performing microarray analysis. The gene expression profile of GSE65396 was downloaded from the Gene Expression Omnibus, including a control, 15-min and 45-min macrocyclic compound RF59-treated group with three repeats for each. Following preprocessing using RMA, the differentially expressed genes (DEGs) were screened using the Limma package. Subsequently, the Kyoto Encyclopedia of Genes and Genomes pathways of these genes were analyzed using the Database for Annotation, Visualization and Integrated Discovery. Based on interactions estimated by the Search Tool for the Retrieval of Interacting Genes, the protein-protein interaction (PPI) network was visualized using Cytoscape. Subnetwork analysis was performed using ReactomeFI. A total of 154 upregulated and 27 downregulated DEGs were identified in the 15-min treated group, compared with the control, and 235 upregulated and 233 downregulated DEGs were identified in the 45-min treated group, compared with the control. The upregulated DEGs were significantly enriched in the ribosome pathway. Based on the PPI network, PRP5, RCL1, NOP13, NOP4 and MRT4 were the top five nodes in the 15-min treated comparison. GIS2, URA3, NOP58, ELP3 and PLP7 were the top five nodes in the 45-min treated comparison, and its subnetwork was significantly enriched in the ribosome pathway. The macrocyclic compound RF59 had a notable effect on the ribosome and its associated pathways of C. albicans. RCL1, NOP4, MRT4, GIS2 and NOP58 may be important in RF59-resistance. PMID:28944888
A comprehensive map of the influenza A virus replication cycle

PubMed Central

2013-01-01

Background Influenza is a common infectious disease caused by influenza viruses. Annual epidemics cause severe illnesses, deaths, and economic loss around the world. To better defend against influenza viral infection, it is essential to understand its mechanisms and associated host responses. Many studies have been conducted to elucidate these mechanisms; however, the overall picture remains incompletely understood. A systematic understanding of influenza viral infection in host cells is needed to facilitate the identification of influential host response mechanisms and potential drug targets. Description We constructed a comprehensive map of the influenza A virus (‘IAV’) life cycle (‘FluMap’) by undertaking a literature-based, manual curation approach. Based on information obtained from publicly available pathway databases, updated with literature-based information and input from expert virologists and immunologists, FluMap is currently composed of 960 factors (i.e., proteins, mRNAs etc.) and 456 reactions, and is annotated with ~500 papers and curation comments. In addition to detailing the type of molecular interactions, isolate/strain specific data are also available. The FluMap was built with the pathway editor CellDesigner in standard SBML (Systems Biology Markup Language) format and visualized as an SBGN (Systems Biology Graphical Notation) diagram. It is also available as a web service (online map) based on the iPathways+ system to enable community discussion by influenza researchers. We also demonstrate computational network analyses to identify targets using the FluMap. Conclusion The FluMap is a comprehensive pathway map that can serve as a graphically presented knowledge-base and as a platform to analyze functional interactions between IAV and host factors. Publicly available webtools will allow continuous updating to ensure the most reliable representation of the host-virus interaction network. 
The FluMap is available at http://www.influenza-x.org/flumap/. PMID:24088197
Enhancing a Pathway-Genome Database (PGDB) to capture subcellular localization of metabolites and enzymes: the nucleotide-sugar biosynthetic pathways of Populus trichocarpa.

PubMed

Nag, Ambarish; Karpinets, Tatiana V; Chang, Christopher H; Bar-Peled, Maor

2012-01-01

Understanding how cellular metabolism works and is regulated requires that the underlying biochemical pathways be adequately represented and integrated with large metabolomic data sets to establish a robust network model. Genetically engineering energy crops to be less recalcitrant to saccharification requires detailed knowledge of plant polysaccharide structures and a thorough understanding of the metabolic pathways involved in forming and regulating cell-wall synthesis. Nucleotide-sugars are building blocks for synthesis of cell wall polysaccharides. The biosynthesis of nucleotide-sugars is catalyzed by a multitude of enzymes that reside in different subcellular organelles, and precise representation of these pathways requires accurate capture of this biological compartmentalization. The lack of simple localization cues in genomic sequence data and annotations however leads to missing compartmentalization information for eukaryotes in automatically generated databases, such as the Pathway-Genome Databases (PGDBs) of the SRI Pathway Tools software that drives much biochemical knowledge representation on the internet. In this report, we provide an informal mechanism using the existing Pathway Tools framework to integrate protein and metabolite sub-cellular localization data with the existing representation of the nucleotide-sugar metabolic pathways in a prototype PGDB for Populus trichocarpa. The enhanced pathway representations have been successfully used to map SNP abundance data to individual nucleotide-sugar biosynthetic genes in the PGDB. The manually curated pathway representations are more conducive to the construction of a computational platform that will allow the simulation of natural and engineered nucleotide-sugar precursor fluxes into specific recalcitrant polysaccharide(s). 
Database URL: The curated Populus PGDB is available in the BESC public portal at http://cricket.ornl.gov/cgi-bin/beocyc_home.cgi and the nucleotide-sugar biosynthetic pathways can be directly accessed at http://cricket.ornl.gov:1555/PTR/new-image?object=SUGAR-NUCLEOTIDES.
Enhancing a Pathway-Genome Database (PGDB) to capture subcellular localization of metabolites and enzymes: the nucleotide-sugar biosynthetic pathways of Populus trichocarpa

PubMed Central

Nag, Ambarish; Karpinets, Tatiana V.; Chang, Christopher H.; Bar-Peled, Maor

2012-01-01

Understanding how cellular metabolism works and is regulated requires that the underlying biochemical pathways be adequately represented and integrated with large metabolomic data sets to establish a robust network model. Genetically engineering energy crops to be less recalcitrant to saccharification requires detailed knowledge of plant polysaccharide structures and a thorough understanding of the metabolic pathways involved in forming and regulating cell-wall synthesis. Nucleotide-sugars are building blocks for synthesis of cell wall polysaccharides. The biosynthesis of nucleotide-sugars is catalyzed by a multitude of enzymes that reside in different subcellular organelles, and precise representation of these pathways requires accurate capture of this biological compartmentalization. The lack of simple localization cues in genomic sequence data and annotations however leads to missing compartmentalization information for eukaryotes in automatically generated databases, such as the Pathway-Genome Databases (PGDBs) of the SRI Pathway Tools software that drives much biochemical knowledge representation on the internet. In this report, we provide an informal mechanism using the existing Pathway Tools framework to integrate protein and metabolite sub-cellular localization data with the existing representation of the nucleotide-sugar metabolic pathways in a prototype PGDB for Populus trichocarpa. The enhanced pathway representations have been successfully used to map SNP abundance data to individual nucleotide-sugar biosynthetic genes in the PGDB. The manually curated pathway representations are more conducive to the construction of a computational platform that will allow the simulation of natural and engineered nucleotide-sugar precursor fluxes into specific recalcitrant polysaccharide(s). 
Database URL: The curated Populus PGDB is available in the BESC public portal at http://cricket.ornl.gov/cgi-bin/beocyc_home.cgi and the nucleotide-sugar biosynthetic pathways can be directly accessed at http://cricket.ornl.gov:1555/PTR/new-image?object=SUGAR-NUCLEOTIDES. PMID:22465851
Identifying pathways affected by cancer mutations.

PubMed

Iengar, Prathima

2017-12-16

Mutations in 15 cancers, sourced from the COSMIC Whole Genomes database, and 297 human pathways, arranged into pathway groups based on the processes they orchestrate, and sourced from the KEGG pathway database, have together been used to identify pathways affected by cancer mutations. Genes studied in ≥15, and mutated in ≥10 samples of a cancer have been considered recurrently mutated, and pathways with recurrently mutated genes have been considered affected in the cancer. Novel doughnut plots have been presented which enable visualization of the extent to which pathways and genes, in each pathway group, are targeted, in each cancer. The 'organismal systems' pathway group (including organism-level pathways; e.g., nervous system) is the most targeted, more than even the well-recognized signal transduction, cell-cycle and apoptosis, and DNA repair pathway groups. The important, yet poorly-recognized, role played by the group merits attention. Pathways affected in ≥7 cancers yielded insights into processes affected. Copyright © 2017 Elsevier Inc. All rights reserved.
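The two thresholds stated in this abstract (a gene studied in ≥15 samples and mutated in ≥10 samples of a cancer counts as recurrently mutated, and a pathway containing such a gene counts as affected) can be sketched as a simple filter. This is an illustrative reading of the abstract, not the author's code; the gene names, pathway names, and counts below are hypothetical.

```python
def recurrently_mutated(gene_stats, min_studied=15, min_mutated=10):
    """Return genes studied in >= min_studied and mutated in >= min_mutated samples.

    gene_stats maps gene -> (samples_studied, samples_mutated) for one cancer.
    """
    return {
        gene
        for gene, (studied, mutated) in gene_stats.items()
        if studied >= min_studied and mutated >= min_mutated
    }


def affected_pathways(pathways, recurrent_genes):
    """Return pathway -> sorted recurrently mutated member genes, for affected pathways."""
    affected = {}
    for name, members in pathways.items():
        shared = members & recurrent_genes
        if shared:
            affected[name] = sorted(shared)
    return affected


if __name__ == "__main__":
    # Hypothetical per-gene counts for a single cancer type.
    stats = {"GENE_A": (40, 12), "GENE_B": (40, 3), "GENE_C": (12, 11)}
    # Hypothetical pathway membership (in the study, taken from KEGG).
    pathways = {
        "pathway_1": {"GENE_A", "GENE_B"},
        "pathway_2": {"GENE_B", "GENE_C"},
    }
    recurrent = recurrently_mutated(stats)
    print(recurrent)
    print(affected_pathways(pathways, recurrent))
```

Note that GENE_C is excluded even though it is mutated in 11 samples, because it was studied in fewer than 15.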
The Reactome Pathway Knowledgebase

PubMed Central

Jupe, Steven; Matthews, Lisa; Sidiropoulos, Konstantinos; Gillespie, Marc; Garapati, Phani; Haw, Robin; Jassal, Bijay; Korninger, Florian; May, Bruce; Milacic, Marija; Roca, Corina Duenas; Rothfels, Karen; Sevilla, Cristoffer; Shamovsky, Veronica; Shorser, Solomon; Varusai, Thawfeek; Viteri, Guilherme; Weiser, Joel

2018-01-01

Abstract The Reactome Knowledgebase (https://reactome.org) provides molecular details of signal transduction, transport, DNA replication, metabolism, and other cellular processes as an ordered network of molecular transformations—an extended version of a classic metabolic map, in a single consistent data model. Reactome functions both as an archive of biological processes and as a tool for discovering unexpected functional relationships in data such as gene expression profiles or somatic mutation catalogues from tumor cells. To support the continued brisk growth in the size and complexity of Reactome, we have implemented a graph database, improved performance of data analysis tools, and designed new data structures and strategies to boost diagram viewer performance. To make our website more accessible to human users, we have improved pathway display and navigation by implementing interactive Enhanced High Level Diagrams (EHLDs) with an associated icon library, and subpathway highlighting and zooming, in a simplified and reorganized web site with adaptive design. To encourage re-use of our content, we have enabled export of pathway diagrams as ‘PowerPoint’ files. PMID:29145629
Investigating ego modules and pathways in osteosarcoma by integrating the EgoNet algorithm and pathway analysis.

PubMed

Chen, X Y; Chen, Y H; Zhang, L J; Wang, Y; Tong, Z C

2017-02-16

Osteosarcoma (OS) is the most common primary bone malignancy, but current therapies are far from effective for all patients. A better understanding of the pathological mechanism of OS may help to achieve new treatments for this tumor. Hence, the objective of this study was to investigate ego modules and pathways in OS utilizing EgoNet algorithm and pathway-related analysis, and reveal pathological mechanisms underlying OS. The EgoNet algorithm comprises four steps: constructing background protein-protein interaction (PPI) network (PPIN) based on gene expression data and PPI data; extracting differential expression network (DEN) from the background PPIN; identifying ego genes according to topological features of genes in reweighted DEN; and collecting ego modules using module search by ego gene expansion. Consequently, we obtained 5 ego modules (Modules 2, 3, 4, 5, and 6) in total. After applying the permutation test, all presented statistical significance between OS and normal controls. Finally, pathway enrichment analysis combined with Reactome pathway database was performed to investigate pathways, and Fisher's exact test was conducted to capture ego pathways for OS. The ego pathway for Module 2 was CLEC7A/inflammasome pathway, while for Module 3 a tetrasaccharide linker sequence was required for glycosaminoglycan (GAG) synthesis, and for Module 6 was the Rho GTPase cycle. Interestingly, genes in Modules 4 and 5 were enriched in the same pathway, the 2-LTR circle formation. In conclusion, the ego modules and pathways might be potential biomarkers for OS therapeutic index, and give great insight of the molecular mechanism underlying this tumor.

Investigating ego modules and pathways in osteosarcoma by integrating the EgoNet algorithm and pathway analysis

PubMed Central

Chen, X.Y.; Chen, Y.H.; Zhang, L.J.; Wang, Y.; Tong, Z.C.

2017-01-01

Osteosarcoma (OS) is the most common primary bone malignancy, but current therapies are far from effective for all patients. A better understanding of the pathological mechanism of OS may help to achieve new treatments for this tumor. Hence, the objective of this study was to investigate ego modules and pathways in OS utilizing EgoNet algorithm and pathway-related analysis, and reveal pathological mechanisms underlying OS. The EgoNet algorithm comprises four steps: constructing background protein-protein interaction (PPI) network (PPIN) based on gene expression data and PPI data; extracting differential expression network (DEN) from the background PPIN; identifying ego genes according to topological features of genes in reweighted DEN; and collecting ego modules using module search by ego gene expansion. Consequently, we obtained 5 ego modules (Modules 2, 3, 4, 5, and 6) in total. After applying the permutation test, all presented statistical significance between OS and normal controls. Finally, pathway enrichment analysis combined with Reactome pathway database was performed to investigate pathways, and Fisher's exact test was conducted to capture ego pathways for OS. The ego pathway for Module 2 was CLEC7A/inflammasome pathway, while for Module 3 a tetrasaccharide linker sequence was required for glycosaminoglycan (GAG) synthesis, and for Module 6 was the Rho GTPase cycle. Interestingly, genes in Modules 4 and 5 were enriched in the same pathway, the 2-LTR circle formation. In conclusion, the ego modules and pathways might be potential biomarkers for OS therapeutic index, and give great insight of the molecular mechanism underlying this tumor. PMID:28225867
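The final step of this record, testing whether a module's genes are over-represented in a pathway with Fisher's exact test, can be sketched with the standard library alone. The one-sided test is equivalent to a hypergeometric tail probability. The function name and the example counts are assumptions for illustration and are not taken from the study.

```python
from math import comb


def enrichment_p(module_size, pathway_size, overlap, background_size):
    """One-sided Fisher's exact p-value for pathway enrichment.

    Probability of seeing at least `overlap` pathway genes when drawing
    `module_size` genes at random from `background_size` genes, of which
    `pathway_size` belong to the pathway (hypergeometric upper tail).
    """
    max_overlap = min(module_size, pathway_size)
    total = comb(background_size, module_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(background_size - pathway_size, module_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: a 20-gene module sharing 6 genes with a
    # 50-gene pathway, against a background of 10,000 genes.
    print(enrichment_p(module_size=20, pathway_size=50,
                       overlap=6, background_size=10_000))
```

In practice the p-values for all tested pathways would also be corrected for multiple testing, which this sketch omits.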
Large-scale integrative network-based analysis identifies common pathways disrupted by copy number alterations across cancers

PubMed Central

2013-01-01

Background Many large-scale studies analyzed high-throughput genomic data to identify altered pathways essential to the development and progression of specific types of cancer. However, no previous study has been extended to provide a comprehensive analysis of pathways disrupted by copy number alterations across different human cancers. Towards this goal, we propose a network-based method to integrate copy number alteration data with human protein-protein interaction networks and pathway databases to identify pathways that are commonly disrupted in many different types of cancer. Results We applied our approach to a data set of 2,172 cancer patients across 16 different types of cancers, and discovered a set of commonly disrupted pathways, which are likely essential for tumor formation in majority of the cancers. We also identified pathways that are only disrupted in specific cancer types, providing molecular markers for different human cancers. Analysis with independent microarray gene expression datasets confirms that the commonly disrupted pathways can be used to identify patient subgroups with significantly different survival outcomes. We also provide a network view of disrupted pathways to explain how copy number alterations affect pathways that regulate cell growth, cycle, and differentiation for tumorigenesis. Conclusions In this work, we demonstrated that the network-based integrative analysis can help to identify pathways disrupted by copy number alterations across 16 types of human cancers, which are not readily identifiable by conventional overrepresentation-based and other pathway-based methods. All the results and source code are available at http://compbio.cs.umn.edu/NetPathID/. PMID:23822816
Global Metabolic Reconstruction and Metabolic Gene Evolution in the Cattle Genome

PubMed Central

Kim, Woonsu; Park, Hyesun; Seo, Seongwon

2016-01-01

The sequence of the cattle genome provided a valuable opportunity to systematically link genetic and metabolic traits of cattle. The objectives of this study were 1) to reconstruct genome-scale cattle-specific metabolic pathways based on the most recent and updated cattle genome build and 2) to identify duplicated metabolic genes in the cattle genome for better understanding of metabolic adaptations in cattle. A bioinformatic pipeline of an organism for amalgamating genomic annotations from multiple sources was updated. Using this, an amalgamated cattle genome database based on UMD_3.1 was created. The amalgamated cattle genome database is composed of a total of 33,292 genes: 19,123 consensus genes between NCBI and Ensembl databases, 8,410 and 5,493 genes only found in NCBI or Ensembl, respectively, and 266 genes from NCBI scaffolds. A metabolic reconstruction of the cattle genome and cattle pathway genome database (PGDB) was also developed using Pathway Tools, followed by an intensive manual curation. The manual curation filled or revised 68 pathway holes, deleted 36 metabolic pathways, and added 23 metabolic pathways. Consequently, the curated cattle PGDB contains 304 metabolic pathways, 2,460 reactions including 2,371 enzymatic reactions, and 4,012 enzymes. Furthermore, this study identified eight duplicated genes in 12 metabolic pathways in the cattle genome compared to human and mouse. Some of these duplicated genes are related to specific hormone biosynthesis and detoxification. The updated genome-scale metabolic reconstruction is a useful tool for understanding biology and metabolic characteristics in cattle. There have been significant improvements in the quality of cattle genome annotations and the MetaCyc database. The duplicated metabolic genes in the cattle genome compared to human and mouse imply evolutionary changes in the cattle genome and provide useful information for further research on understanding metabolic adaptations of cattle. 
PMID:26992093
An attempt to understand glioma stem cell biology through centrality analysis of a protein interaction network.

PubMed

Mallik, Mrinmay Kumar

2018-02-07

Biological networks can be analyzed using "Centrality Analysis" to identify the more influential nodes and interactions in the network. This study was undertaken to create and visualize a biological network comprising protein-protein interactions (PPIs) amongst proteins which are preferentially over-expressed in the glioma cancer stem cell component (GCSC) of glioblastomas as compared to the glioma non-stem cancer cell (GNSC) component and then to analyze this network through centrality analyses (CA) in order to identify the essential proteins in this network and their interactions. In addition, this study proposes a new centrality analysis method pertaining exclusively to transcription factors (TFs) and interactions amongst them. Moreover, the relevant molecular functions, biological processes and biochemical pathways amongst these proteins were sought through enrichment analysis. A protein interaction network was created using a list of proteins which have been shown to be preferentially expressed or over-expressed in GCSCs isolated from glioblastomas as compared to the GNSCs. This list, comprising 38 proteins and created using manual literature mining, was submitted to the Reactome FIViz tool, a web-based application integrated into Cytoscape, an open source software platform for visualizing and analyzing molecular interaction networks and biological pathways, to produce the network. This network was subjected to centrality analyses utilizing ranked lists of six centrality measures using the FIViz application and (for the first time) a dedicated centrality analysis plug-in, CytoNCA. The interactions exclusively amongst the transcription factors were analyzed through a newly proposed centrality analysis method called "Gene Expression Associated Degree Centrality Analysis (GEADCA)". Enrichment analysis was performed using the "network function analysis" tool on Reactome. 
The CA was able to identify a small set of proteins with consistently high centrality ranks that is indicative of their strong influence in the protein-protein interaction network. Similarly, the newly proposed GEADCA helped identify the transcription factors with high centrality values indicative of their key roles in transcriptional regulation. The enrichment studies provided a list of molecular functions, biological processes and biochemical pathways associated with the constructed network. The study shows how pathway-based databases may be used to create and analyze a relevant protein interaction network in glioma cancer stem cells and identify the essential elements within it to gather insights into the molecular interactions that regulate the properties of glioma stem cells. How these insights may be utilized to help the development of future research towards formulation of new management strategies has been discussed from a theoretical standpoint. Copyright © 2017 Elsevier Ltd. All rights reserved.
Identification and functional analysis of risk-related microRNAs for the prognosis of patients with bladder urothelial carcinoma.

PubMed

Gao, Ji; Li, Hongyan; Liu, Lei; Song, Lide; Lv, Yanting; Han, Yuping

2017-12-01

The aim of the present study was to investigate risk-related microRNAs (miRs) for bladder urothelial carcinoma (BUC) prognosis. Clinical and microRNA expression data downloaded from the Cancer Genome Atlas were utilized for survival analysis. Risk factor estimation was performed using Cox's proportional regression analysis. A microRNA-regulated target gene network was constructed and presented using Cytoscape. In addition, the Database for Annotation, Visualization and Integrated Discovery was used for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway enrichment, followed by protein-protein interaction (PPI) network analysis. Finally, the K-clique method was applied to analyze sub-pathways. A total of 16 significant microRNAs, including hsa-miR-3622a and hsa-miR-29a, were identified (P<0.05). Following Cox's proportional regression analysis, hsa-miR-29a was screened as a prognostic marker of BUC risk (P=0.0449). A regulation network of hsa-miR-29a comprising 417 target genes was constructed. These target genes were primarily enriched in GO terms, including collagen fibril organization, extracellular matrix (ECM) organization and pathways, such as focal adhesion (P<0.05). A PPI network including 197 genes and 510 interactions, was constructed. The top 21 genes in the network module were enriched in GO terms, including collagen fibril organization and pathways, such as ECM receptor interaction (P<0.05). Finally, 4 sub-pathways of cysteine and methionine metabolism, including paths 00270_4, 00270_1, 00270_2 and 00270_5, were obtained (P<0.01) and identified to be enriched through DNA (cytosine-5)-methyltransferase (DNMT)3A, DNMT3B, methionine adenosyltransferase 2α (MAT2A) and spermine synthase (SMS). The identified microRNAs, particularly hsa-miR-29a and its 4 associated target genes DNMT3A, DNMT3B, MAT2A and SMS, may participate in the prognostic risk mechanism of BUC.
In Silico Enhancing M. tuberculosis Protein Interaction Networks in STRING To Predict Drug-Resistance Pathways and Pharmacological Risks.

PubMed

Mei, Suyu

2018-05-04

Bacterial protein-protein interaction (PPI) networks are important for revealing the machinery of signal transduction and drug resistance within bacterial cells. The database STRING has collected a large number of bacterial pathogen PPI networks, but most of the data are of low quality and have not been experimentally or computationally validated, which restricts their further biomedical applications. We exploit the experimental data via four solutions to enhance the quality of M. tuberculosis H37Rv (MTB) PPI networks in STRING. Computational results show that the experimental data derived jointly by two-hybrid and copurification approaches are the most reliable for training an L2-regularized logistic regression model for MTB PPI network validation. On the basis of the validated MTB PPI networks, we further study three problems via a breadth-first graph search algorithm: (1) discovery of MTB drug-resistance pathways by searching for the paths between known drug-target genes and drug-resistance genes, (2) choosing potential cotarget genes by searching for the critical genes located on multiple pathways, and (3) choosing essential drug-target genes via analysis of the network degree distribution. In addition, we combine the validated MTB PPI networks with human PPI networks to analyze the potential pharmacological risks of known and candidate drug-target genes from the point of view of systems pharmacology. The evidence from protein structure alignment demonstrates that the drugs that act on MTB target genes could also adversely act on human signaling pathways.
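The path search described in item (1) of the abstract above can be illustrated with a minimal sketch. It assumes the validated PPI network is available as an undirected adjacency list; the node names (`target_gene`, `resistance_gene`, `p1`, and so on) and the function `bfs_shortest_path` are hypothetical and are not taken from the paper, which does not publish its code in this record.

```python
from collections import deque


def bfs_shortest_path(graph, source, target):
    """Return one shortest path from source to target, or None if unreachable.

    graph is an adjacency list: {node: [neighbor, ...]}.
    """
    if source == target:
        return [source]
    parents = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == target:
                # Walk back through the parent links to rebuild the path.
                path = [target]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Toy undirected network with placeholder names (not real MTB genes).
toy_ppi = {
    "target_gene": ["p1", "p2"],
    "p1": ["target_gene", "p3"],
    "p2": ["target_gene"],
    "p3": ["p1", "resistance_gene"],
    "resistance_gene": ["p3"],
    "isolated": [],
}

path = bfs_shortest_path(toy_ppi, "target_gene", "resistance_gene")
unreachable = bfs_shortest_path(toy_ppi, "target_gene", "isolated")
```

Because breadth-first search visits nodes in order of distance from the source, the first time the target is reached the recovered path has the minimum number of interaction steps.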
BioWarehouse: a bioinformatics database warehouse toolkit

PubMed Central

Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David WJ; Tenenbaum, Jessica D; Karp, Peter D

2006-01-01

Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the database integration problem for bioinformatics. PMID:16556315
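The kind of multi-database query mentioned in the abstract above (the fraction of enzyme activities with no known sequence) can be sketched with an in-memory SQLite database. This is only an illustration: the tables `enzyme_activity` and `protein_sequence`, their columns, and the toy rows are assumptions made here and do not reflect the actual BioWarehouse schema or its data.

```python
import sqlite3


def build_toy_warehouse():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE protein_sequence (ec_number TEXT, sequence TEXT)")
    # Placeholder identifiers, not real EC numbers or sequences.
    activities = ["toy-1", "toy-2", "toy-3", "toy-4", "toy-5"]
    sequences = [("toy-1", "MKT"), ("toy-1", "MKS"), ("toy-2", "MAA"), ("toy-4", "MGL")]
    conn.executemany("INSERT INTO enzyme_activity VALUES (?)", [(a,) for a in activities])
    conn.executemany("INSERT INTO protein_sequence VALUES (?, ?)", sequences)
    return conn


def fraction_without_sequence(conn):
    """Fraction of enzyme activities that have no row in protein_sequence."""
    total, missing = conn.execute(
        """
        SELECT COUNT(*),
               SUM(CASE WHEN s.ec_number IS NULL THEN 1 ELSE 0 END)
        FROM enzyme_activity AS e
        LEFT JOIN (SELECT DISTINCT ec_number FROM protein_sequence) AS s
               ON s.ec_number = e.ec_number
        """
    ).fetchone()
    return missing / total if total else 0.0


conn = build_toy_warehouse()
fraction = fraction_without_sequence(conn)
```

The `LEFT JOIN` keeps every enzyme activity and leaves the joined column `NULL` where no sequence exists, so counting the `NULL` rows gives the number of activities lacking a sequence. In the toy data two of five activities have no sequence.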
BioWarehouse: a bioinformatics database warehouse toolkit.

PubMed

Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David W J; Tenenbaum, Jessica D; Karp, Peter D

2006-03-23

This article addresses the problem of interoperation of heterogeneous bioinformatics databases. We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. BioWarehouse embodies significant progress on the database integration problem for bioinformatics.
The identification of key genes and pathways in hepatocellular carcinoma by bioinformatics analysis of high-throughput data.

PubMed

Zhang, Chaoyang; Peng, Li; Zhang, Yaqin; Liu, Zhaoyang; Li, Wenling; Chen, Shilian; Li, Guancheng

2017-06-01

Liver cancer is a serious threat to public health and has a fairly complicated pathogenesis. Therefore, the identification of key genes and pathways is of much importance for clarifying the molecular mechanism of hepatocellular carcinoma (HCC) initiation and progression. An HCC-associated gene expression dataset was downloaded from the Gene Expression Omnibus database. The statistical software R was used for significance analysis of differentially expressed genes (DEGs) between liver cancer samples and normal samples. Gene Ontology (GO) term enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, based on R software, were applied to identify pathways in which DEGs were significantly enriched. Cytoscape software was used for the construction of a protein-protein interaction (PPI) network and for module analysis to find the hub genes and key pathways. Finally, weighted correlation network analysis (WGCNA) was conducted to further screen critical gene modules with similar expression patterns and explore their biological significance. Significance analysis identified 1230 DEGs with fold change >2, including 632 significantly down-regulated DEGs and 598 significantly up-regulated DEGs. GO term enrichment analysis suggested that up-regulated DEGs were significantly enriched in immune response, cell adhesion, cell migration, type I interferon signaling pathway, and cell proliferation, and that the down-regulated DEGs were mainly enriched in response to endoplasmic reticulum stress and endoplasmic reticulum unfolded protein response. KEGG pathway analysis found DEGs significantly enriched in five pathways including complement and coagulation cascades, focal adhesion, ECM-receptor interaction, antigen processing and presentation, and protein processing in endoplasmic reticulum. The top 10 hub genes in HCC, obtained from the PPI network, were GMPS, ACACA, ALB, TGFB1, KRAS, ERBB2, BCL2, EGFR, STAT3, and CD8A. The top 3 gene interaction modules in the PPI network were enriched in immune response, organ development, and response to other organism, respectively. WGCNA revealed that the eight confirmed gene modules were significantly enriched in monooxygenase and oxidoreductase activity, response to endoplasmic reticulum stress, type I interferon signaling pathway, processing, presentation and binding of peptide antigen, cellular response to cadmium and zinc ion, cell locomotion and differentiation, ribonucleoprotein complex and RNA processing, and immune system process, respectively. In conclusion, we identified some key genes and pathways closely related to HCC initiation and progression by a series of bioinformatics analyses on DEGs. These screened genes and pathways provide a more detailed molecular mechanism underlying HCC occurrence and progression, and hold promise as biomarkers and potential therapeutic targets.
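Pathway enrichment of a DEG list, as used in the abstract above, is commonly scored with a hypergeometric over-representation test. The sketch below shows that standard test only; it is an assumption that it resembles what the R packages used by the authors compute, and the counts in the example are made up for illustration.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric variable.

    population: number of background genes
    successes:  background genes annotated to the pathway
    draws:      size of the DEG list
    observed:   DEGs that fall in the pathway
    """
    max_overlap = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, max_overlap + 1)
    )
    return tail / total


# Made-up counts: 20 background genes, 5 in the pathway,
# 4 DEGs, of which 3 are in the pathway.
p_value = hypergeom_upper_tail(population=20, successes=5, draws=4, observed=3)
```

A small p-value indicates that the overlap between the DEG list and the pathway is larger than expected by chance; in practice the p-values for all tested pathways would also be corrected for multiple testing.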
HMDB 4.0: the human metabolome database for 2018.

PubMed

Wishart, David S; Feunang, Yannick Djoumbou; Marcu, Ana; Guo, An Chi; Liang, Kevin; Vázquez-Fresno, Rosa; Sajed, Tanvir; Johnson, Daniel; Li, Carin; Karu, Naama; Sayeeda, Zinat; Lo, Elvis; Assempour, Nazanin; Berjanskii, Mark; Singhal, Sandeep; Arndt, David; Liang, Yonjie; Badran, Hasan; Grant, Jason; Serra-Cayuela, Arnau; Liu, Yifeng; Mandal, Rupa; Neveu, Vanessa; Pon, Allison; Knox, Craig; Wilson, Michael; Manach, Claudine; Scalbert, Augustin

2018-01-04

The Human Metabolome Database or HMDB (www.hmdb.ca) is a web-enabled metabolomic database containing comprehensive information about human metabolites along with their biological roles, physiological concentrations, disease associations, chemical reactions, metabolic pathways, and reference spectra. First described in 2007, the HMDB is now considered the standard metabolomic resource for human metabolic studies. Over the past decade the HMDB has continued to grow and evolve in response to emerging needs for metabolomics researchers and continuing changes in web standards. This year's update, HMDB 4.0, represents the most significant upgrade to the database in its history. For instance, the number of fully annotated metabolites has increased by nearly threefold, the number of experimental spectra has grown by almost fourfold and the number of illustrated metabolic pathways has grown by a factor of almost 60. Significant improvements have also been made to the HMDB's chemical taxonomy, chemical ontology, spectral viewing, and spectral/text searching tools. A great deal of brand new data has also been added to HMDB 4.0. This includes large quantities of predicted MS/MS and GC-MS reference spectral data as well as predicted (physiologically feasible) metabolite structures to facilitate novel metabolite identification. Additional information on metabolite-SNP interactions and the influence of drugs on metabolite levels (pharmacometabolomics) has also been added. Many other important improvements in the content, the interface, and the performance of the HMDB website have been made and these should greatly enhance its ease of use and its potential applications in nutrition, biochemistry, clinical chemistry, clinical genetics, medicine, and metabolomics science. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
miRPathDB: a new dictionary on microRNAs and target pathways.

PubMed

Backes, Christina; Kehl, Tim; Stöckel, Daniel; Fehlmann, Tobias; Schneider, Lara; Meese, Eckart; Lenhof, Hans-Peter; Keller, Andreas

2017-01-04

In the last decade, miRNAs and their regulatory mechanisms have been intensively studied and many tools for the analysis of miRNAs and their targets have been developed. We previously presented a dictionary on single miRNAs and their putative target pathways. Since then, the number of miRNAs has tripled and the knowledge on miRNAs and targets has grown substantially. This, along with changes in pathway resources such as KEGG, leads to an improved understanding of miRNAs, their target genes and related pathways. Here, we introduce the miRNA Pathway Dictionary Database (miRPathDB), freely accessible at https://mpd.bioinf.uni-sb.de/. With the database we aim to complement available target pathway web-servers by providing researchers easy access to information on which pathways are regulated by a miRNA, which miRNAs target a pathway and how specific these regulations are. The database contains a large number of miRNAs (2595 human miRNAs), different miRNA target sets (14 773 experimentally validated target genes as well as 19 281 predicted target genes) and a broad selection of functional biochemical categories (KEGG-, WikiPathways-, BioCarta-, SMPDB-, PID-, Reactome pathways, functional categories from gene ontology (GO), protein families from Pfam and chromosomal locations totaling 12 875 categories). In addition to Homo sapiens, Mus musculus data are also stored and can be compared to human target pathways. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Information technology in chemistry research and education: Part I. Ab initio studies on the hydrolysis of aromatic diazonium ions. Part II. Theoretical study and molecular modeling of non-covalent interactions. Part III. Applying information technology in chemistry education

NASA Astrophysics Data System (ADS)

Wu, Zhengyu

Part I of this dissertation studies the bonding in chemical reactions, while Part II studies the bonding related to inter- and intra-molecular interactions. Part III studies the application of information technology in chemistry education. Part I of this dissertation (chapter 1 and chapter 2) focuses on theoretical studies of the mechanism of the hydrolysis reactions of benzenediazonium ion and guaninediazonium ion. The major conclusion is that in hydrolysis reactions the "unimolecular mechanism" actually has to involve the reacting solvent molecule. Therefore, the unimolecular pathway can only serve as a conceptual model and will not happen in reality. Chapter 1 concludes that the hydrolysis reaction of benzenediazonium ion takes the direct SN2Ar mechanism via a transition state but without going through a pre-coordination complex. Chapter 2 concludes that the formation of xanthine from the dediazoniation reaction of guaninediazonium ion in water takes the SN2Ar pathway without a transition state, and that oxanine might come from an intermediate formed by the bimolecular deprotonation of the H atom on N3 of guaninediazonium ion synchronized with the pyrimidine ring opening reaction. Part II of this dissertation includes chapters 3, 4, and 5. Chapter 3 studies the quadrupole moment of benzene and quadrupole-quadrupole interactions. We concluded that the quadrupole-quadrupole interaction is important in arene-arene interactions. Our study shows that the most stable structure of the benzene dimer is the point-to-face T-shaped structure. Chapter 4 studies the intermolecular interactions that result in the disorder of the crystal of 4-Chloroacetophenone-(4-methoxyphenylethylidene). We analyzed all the nearest neighbor interactions within that crystal and found that the crystal structure is determined by its thermodynamic properties. Our calculation reproduced the percentage of parallel alignment of the crystal.
Part III of this dissertation is focused on the application of database management system and computer technology on chemistry education. A database-supported webtool was developed to support the creation of news portfolio and peer reviews online. The responses to an in-class survey show that students embrace the use of this webtool for its conceptually clear design and its easiness of use.
Search for novel remedies to augment radiation resistance of inhabitants of Fukushima and Chernobyl disasters: identifying DNA repair protein XRCC4 inhibitors.

PubMed

Sun, Mao-Feng; Chen, Hsin-Yi; Tsai, Fuu-Jen; Lui, Shu-Hui; Chen, Chih-Yi; Chen, Calvin Yu-Chian

2011-10-01

Two nuclear plant disasters occurring within a span of 25 years threaten health and genome integrity both in Fukushima and Chernobyl. The search for remedies capable of enhancing DNA repair efficiency and radiation resistance in humans appears to be an urgent problem. XRCC4 is an important enhancer in promoting the repair pathway triggered by DNA double-strand breaks (DSBs). In the context of radiation therapy, active XRCC4 could reduce the DSB-mediated apoptotic effect on cancer cells. Hence, developing XRCC4 inhibitors could possibly enhance radiotherapy outcomes. In this study, we screened the traditional Chinese medicine (TCM) database, TCM Database@Taiwan, and identified three potent inhibitor agents against XRCC4. Through molecular dynamics simulation, we determined that the protein-ligand interactions were focused at Lys188 on chain A and Lys187 on chain B. Intriguingly, the hydrogen bonds for all three ligands fluctuated frequently but were held at close approximation. The pi-cation interactions and ionic interactions mediated by o-hydroxyphenyl and carboxyl functional groups, respectively, were demonstrated to play critical roles in stabilizing binding conformations. Based on these results, we report the identification of potential radiotherapy enhancers from TCM. We further characterized the key binding elements for inhibiting XRCC4 activity.
A Systems Biology Approach Reveals Converging Molecular Mechanisms that Link Different POPs to Common Metabolic Diseases.

PubMed

Ruiz, Patricia; Perlina, Ally; Mumtaz, Moiz; Fowler, Bruce A

2016-07-01

A number of epidemiological studies have identified statistical associations between persistent organic pollutants (POPs) and metabolic diseases, but testable hypotheses regarding underlying molecular mechanisms to explain these linkages have not been published. We assessed the underlying mechanisms of POPs that have been associated with metabolic diseases; three well-known POPs [2,3,7,8-tetrachlorodibenzodioxin (TCDD), 2,2´,4,4´,5,5´-hexachlorobiphenyl (PCB 153), and 4,4´-dichlorodiphenyldichloroethylene (p,p´-DDE)] were studied. We used advanced database search tools to delineate testable hypotheses and to guide laboratory-based research studies into underlying mechanisms by which this POP mixture could produce or exacerbate metabolic diseases. For our searches, we used proprietary systems biology software (MetaCore™/MetaDrug™) to conduct advanced search queries for the underlying interactions database, followed by directional network construction to identify common mechanisms for these POPs within two or fewer interaction steps downstream of their primary targets. These common downstream pathways belong to various cytokine and chemokine families with experimentally well-documented causal associations with type 2 diabetes. Our systems biology approach allowed identification of converging pathways leading to activation of common downstream targets. To our knowledge, this is the first study to propose an integrated global set of step-by-step molecular mechanisms for a combination of three common POPs using a systems biology approach, which may link POP exposure to diseases. Experimental evaluation of the proposed pathways may lead to development of predictive biomarkers of the effects of POPs, which could translate into disease prevention and effective clinical treatment strategies. Ruiz P, Perlina A, Mumtaz M, Fowler BA. 2016. A systems biology approach reveals converging molecular mechanisms that link different POPs to common metabolic diseases. 
Environ Health Perspect 124:1034-1041; http://dx.doi.org/10.1289/ehp.1510308.
Xtalk: a path-based approach for identifying crosstalk between signaling pathways

PubMed Central

Tegge, Allison N.; Sharp, Nicholas; Murali, T. M.

2016-01-01

Motivation: Cells communicate with their environment via signal transduction pathways. On occasion, the activation of one pathway can produce an effect downstream of another pathway, a phenomenon known as crosstalk. Existing computational methods to discover such pathway pairs rely on simple overlap statistics. Results: We present Xtalk, a path-based approach for identifying pairs of pathways that may crosstalk. Xtalk computes the statistical significance of the average length of multiple short paths that connect receptors in one pathway to the transcription factors in another. By design, Xtalk reports the precise interactions and mechanisms that support the identified crosstalk. We applied Xtalk to signaling pathways in the KEGG and NCI-PID databases. We manually curated a gold standard set of 132 crosstalking pathway pairs and a set of 140 pairs that did not crosstalk, for which Xtalk achieved an area under the receiver operator characteristic curve of 0.65, a 12% improvement over the closest competing approach. The area under the receiver operator characteristic curve varied with the pathway, suggesting that crosstalk should be evaluated on a pathway-by-pathway level. We also analyzed an extended set of 658 pathway pairs in KEGG and a set of more than 7000 pathway pairs in NCI-PID. For the top-ranking pairs, we found substantial support in the literature (81% for KEGG and 78% for NCI-PID). We provide examples of networks computed by Xtalk that accurately recovered known mechanisms of crosstalk. Availability and implementation: The XTALK software is available at http://bioinformatics.cs.vt.edu/~murali/software. Crosstalk networks are available at http://graphspace.org/graphs?tags=2015-bioinformatics-xtalk. Contact: ategge@vt.edu, murali@cs.vt.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26400040
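The evaluation metric quoted in the abstract above, the area under the receiver operator characteristic curve, can be computed directly from the scores assigned to gold-standard positive and negative pathway pairs. The sketch below uses the standard pairwise (Mann-Whitney) formulation; the function name `roc_auc` and the example scores are illustrative assumptions and are not taken from the Xtalk software.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one; ties count 0.5.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up scores for crosstalking (positive) and non-crosstalking (negative) pairs.
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the reported value of 0.65. The quadratic loop is adequate for a gold standard of a few hundred pairs; a rank-based implementation would be preferable for much larger sets.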
Bioinformatics analysis on molecular mechanism of rheum officinale in treatment of jaundice

NASA Astrophysics Data System (ADS)

Shan, Si; Tu, Jun; Nie, Peng; Yan, Xiaojun

2017-01-01

Objective: To study the molecular mechanism of Rheum officinale in the treatment of jaundice by building molecular networks and comparing canonical pathways. Methods: Target proteins of Rheum officinale and jaundice-related genes were retrieved online from the PubChem and Gene databases, respectively. Molecular network and canonical pathway comparison analyses were performed with Ingenuity Pathway Analysis (IPA). Results: The molecular networks of Rheum officinale and jaundice were complex and multifunctional. A total of 40 target proteins of Rheum officinale and 33 Homo sapiens genes related to jaundice were found in the databases. There were 19 pathways common to both networks. Rheum officinale could regulate endothelial differentiation, Interleukin-1B (IL-1B) and Tumor Necrosis Factor (TNF) in these pathways. Conclusions: Rheum officinale treats jaundice by regulating many effective nodes of the apoptotic pathway and pathways related to cellular immunity.
Systems Genetics Analysis of Genome-Wide Association Study Reveals Novel Associations Between Key Biological Processes and Coronary Artery Disease.

PubMed

Ghosh, Sujoy; Vivar, Juan; Nelson, Christopher P; Willenborg, Christina; Segrè, Ayellet V; Mäkinen, Ville-Petteri; Nikpay, Majid; Erdmann, Jeannette; Blankenberg, Stefan; O'Donnell, Christopher; März, Winfried; Laaksonen, Reijo; Stewart, Alexandre F R; Epstein, Stephen E; Shah, Svati H; Granger, Christopher B; Hazen, Stanley L; Kathiresan, Sekar; Reilly, Muredach P; Yang, Xia; Quertermous, Thomas; Samani, Nilesh J; Schunkert, Heribert; Assimes, Themistocles L; McPherson, Ruth

2015-07-01

Genome-wide association studies have identified multiple genetic variants affecting the risk of coronary artery disease (CAD). However, individually these explain only a small fraction of the heritability of CAD and for most, the causal biological mechanisms remain unclear. We sought to obtain further insights into potential causal processes of CAD by integrating large-scale GWA data with expertly curated databases of core human pathways and functional networks. Using pathways (gene sets) from Reactome, we carried out a 2-stage gene set enrichment analysis strategy. From a meta-analyzed discovery cohort of 7 CAD genome-wide association study data sets (9889 cases/11 089 controls), nominally significant gene sets were tested for replication in a meta-analysis of 9 additional studies (15 502 cases/55 730 controls) from the Coronary ARtery DIsease Genome wide Replication and Meta-analysis (CARDIoGRAM) Consortium. A total of 32 of 639 Reactome pathways tested showed convincing association with CAD (replication P<0.05). These pathways resided in 9 of 21 core biological processes represented in Reactome, and included pathways relevant to extracellular matrix (ECM) integrity, innate immunity, axon guidance, and signaling by PDGF (platelet-derived growth factor), NOTCH, and the transforming growth factor-β/SMAD receptor complex. Many of these pathways had strengths of association comparable to those observed in lipid transport pathways. Network analysis of unique genes within the replicated pathways further revealed several interconnected functional and topologically interacting modules representing novel associations (eg, semaphorin-regulated axonal guidance pathway) besides confirming known processes (lipid metabolism). The connectivity in the observed networks was statistically significant compared with random networks (P<0.001). 
Network centrality analysis (degree and betweenness) further identified genes (eg, NCAM1, FYN, FURIN, etc) likely to play critical roles in the maintenance and functioning of several of the replicated pathways. These findings provide novel insights into how genetic variation, interpreted in the context of biological processes and functional interactions among genes, may help define the genetic architecture of CAD. © 2015 American Heart Association, Inc.
Identification of Key Pathways and Genes in L4 Dorsal Root Ganglion (DRG) After Sciatic Nerve Injury via Microarray Analysis.

PubMed

Zhao, He; Duan, Li-Jun; Sun, Qing-Ling; Gao, Yu-Shan; Yang, Yong-Dong; Tang, Xiang-Sheng; Zhao, Ding-Yan; Xiong, Yang; Hu, Zhen-Guo; Li, Chuan-Hong; Chen, Si-Xue; Liu, Tao; Yu, Xing

2018-04-19

Peripheral nerve injury (PNI) has devastating consequences. The dorsal root ganglion, as a pivotal locus, participates in the process of neuropathic pain and nerve regeneration. In recent years, gene sequencing technology has seen a rapid rise in the biomedicine field. We therefore attempt to gain insight into the mechanism of neuropathic pain and nerve regeneration at the transcriptional level and to explore novel genes through bioinformatics analysis. The gene expression profiles of GSE96051 were downloaded from the GEO database. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed, and a protein-protein interaction (PPI) network of the differentially expressed genes (DEGs) was constructed with Cytoscape software. Our results showed that the IL-6 and Jun genes and the MAPK, apoptosis, and P53 signaling pathways play vital modulatory roles in nerve regeneration and neuropathic pain. Notably, 13 hub genes associated with neuropathic pain and nerve regeneration, including Ccl12, Ppp1r15a, Cdkn1a, Atf3, Nts, Dusp1, Ccl7, Csf, Gadd45a, Serpine1, Timp1, have rarely been reported in the PubMed database; these genes may provide new directions for experimental research and clinical study. Our results may provide deeper insight into the mechanism and a promising therapeutic target. The next step is to focus on experimental work to verify the novel genes among the 13 hub genes.
SZDB: A Database for Schizophrenia Genetic Research

PubMed Central

Wu, Yong; Yao, Yong-Gang

2017-01-01

Schizophrenia (SZ) is a debilitating brain disorder with a complex genetic architecture. Genetic studies, especially recent genome-wide association studies (GWAS), have identified multiple variants (loci) conferring risk to SZ. However, how to efficiently extract meaningful biological information from bulk genetic findings of SZ remains a major challenge. There is a pressing need to integrate multiple layers of data from various sources, eg, genetic findings from GWAS, copy number variations (CNVs), association and linkage studies, gene expression, protein–protein interaction (PPI), co-expression, expression quantitative trait loci (eQTL), and Encyclopedia of DNA Elements (ENCODE) data, to provide a comprehensive resource to facilitate the translation of genetic findings into SZ molecular diagnosis and mechanism study. Here we developed the SZDB database (http://www.szdb.org/), a comprehensive resource for SZ research. SZ genetic data, gene expression data, network-based data, brain eQTL data, and SNP function annotation information were systematically extracted, curated and deposited in SZDB. In-depth analyses and systematic integration were performed to identify top prioritized SZ genes and enriched pathways. We further showed that genes implicated in SZ are highly co-expressed in human brain and that proteins encoded by the prioritized SZ risk genes interact significantly. The user-friendly SZDB provides high-confidence candidate variants and genes for further functional characterization. More importantly, SZDB provides convenient online tools for data searching and browsing, data integration, and customized data analyses. PMID:27451428
A Systems Biology Approach to Reveal Putative Host-Derived Biomarkers of Periodontitis by Network Topology Characterization of MMP-REDOX/NO and Apoptosis Integrated Pathways.

PubMed

Zeidán-Chuliá, Fares; Gürsoy, Mervi; Neves de Oliveira, Ben-Hur; Özdemir, Vural; Könönen, Eija; Gürsoy, Ulvi K

2015-01-01

Periodontitis, a formidable global health burden, is a common chronic disease that destroys tooth-supporting tissues. Biomarkers of the early phase of this progressive disease are of utmost importance for global health. In this context, saliva represents a non-invasive biosample. By using systems biology tools, we aimed to (1) identify an integrated interactome between matrix metalloproteinase (MMP)-REDOX/nitric oxide (NO) and apoptosis upstream pathways of periodontal inflammation, and (2) characterize the attendant topological network properties to uncover putative biomarkers to be tested in saliva from patients with periodontitis. Hence, we first generated a protein-protein network model of interactions ("BIOMARK" interactome) by using the STRING 10 database, a search tool for the retrieval of interacting genes/proteins, with "Experiments" and "Databases" as input options and a confidence score of 0.400. Second, we determined the centrality values (closeness, stress, degree or connectivity, and betweenness) for the "BIOMARK" members by using the Cytoscape software. We found Ubiquitin C (UBC), Jun proto-oncogene (JUN), and matrix metalloproteinase-14 (MMP14) as the most central hub- and non-hub-bottlenecks among the 211 genes/proteins of the whole interactome. We conclude that UBC, JUN, and MMP14 are likely an optimal candidate group of host-derived biomarkers, in combination with oral pathogenic bacteria-derived proteins, for detecting periodontitis at its early phase by using salivary samples from patients. These findings therefore have broader relevance for systems medicine in global health as well.
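Two of the centrality measures named in the abstract above, degree and closeness, can be sketched for an unweighted, connected network as follows. The definitions used here are the common textbook ones (closeness as the reciprocal of the average shortest-path length); it is an assumption that they match the Cytoscape settings used by the authors, and the toy star network with its node names is purely illustrative.

```python
from collections import deque


def degree_centrality(graph):
    """Number of direct neighbors for every node in an adjacency list."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def closeness_centrality(graph, node):
    """(reachable nodes - 1) / sum of shortest-path distances from node."""
    distances = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total else 0.0


# Toy star-shaped network: one hub connected to three leaves.
toy_network = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}

degrees = degree_centrality(toy_network)
hub_closeness = closeness_centrality(toy_network, "hub")
leaf_closeness = closeness_centrality(toy_network, "a")
```

In the star network the hub has the highest degree and the highest closeness, which is the pattern a hub protein would show in a PPI network. Betweenness and stress centrality require counting shortest paths through each node and are omitted from this sketch.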

The chemokine receptor CCR1 is identified in mast cell-derived exosomes

PubMed Central

Liang, Yuting; Qiao, Longwei; Peng, Xia; Cui, Zelin; Yin, Yue; Liao, Huanjin; Jiang, Min; Li, Li

2018-01-01

Mast cells are important effector cells of the immune system, and mast cell-derived exosomes carrying RNAs play a role in immune regulation. However, the molecular function of mast cell-derived exosomes is currently unknown, and here, we identify differentially expressed genes (DEGs) in mast cells and exosomes. We isolated mast cell-derived exosomes through differential centrifugation and screened the DEGs from mast cell-derived exosomes, using the GSE25330 array dataset downloaded from the Gene Expression Omnibus database. Biochemical pathways were analyzed by Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis on the online tool DAVID. DEG-associated protein-protein interaction networks (PPIs) were constructed using the STRING database and Cytoscape software. The genes identified from these bioinformatics analyses were verified by qRT-PCR and Western blot in mast cells and exosomes. We identified 2121 DEGs (843 up-regulated and 1278 down-regulated genes) in HMC-1 cell-derived exosomes and HMC-1 cells. The up-regulated DEGs were classified into two significant modules. The chemokine receptor CCR1 was screened as a hub gene and enriched in the cytokine-mediated signaling pathway in module one. Seven genes, including CCR1, CD9, KIT, TGFBR1, TLR9, TPSAB1 and TPSB2, were screened and validated through qRT-PCR analysis. We have achieved a comprehensive view of the pivotal genes and pathways in mast cells and exosomes and identified CCR1 as a hub gene in mast cell-derived exosomes. Our results provide novel clues with respect to the biological processes through which mast cell-derived exosomes modulate immune responses. PMID:29511430
Exploring pathway interactions in insulin resistant mouse liver

PubMed Central

2011-01-01

Background Complex phenotypes such as insulin resistance involve different biological pathways that may interact and influence each other. Interpretation of related experimental data would be facilitated by identifying relevant pathway interactions in the context of the dataset. Results We developed an analysis approach to study interactions between pathways by integrating gene and protein interaction networks, biological pathway information and high-throughput data. This approach was applied to a transcriptomics dataset to investigate pathway interactions in insulin resistant mouse liver in response to a glucose challenge. We identified regulated pathway interactions at different time points following the glucose challenge and also studied the underlying protein interactions to find possible mechanisms and key proteins involved in pathway cross-talk. A large number of pathway interactions were found for the comparison between the two diet groups at t = 0. The initial response to the glucose challenge (t = 0.6) was typified by an acute stress response and pathway interactions showed large overlap between the two diet groups, while the pathway interaction networks for the late response were more dissimilar. Conclusions Studying pathway interactions provides a new perspective on the data that complements established pathway analysis methods such as enrichment analysis. This study provided new insights into how interactions between pathways may be affected by insulin resistance. In addition, the analysis approach described here can be generally applied to different types of high-throughput data and will therefore be useful for analysis of other complex datasets as well. PMID:21843341
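
One simple way to make the idea of a "pathway interaction" concrete is to count protein interactions that connect genes unique to one pathway with genes unique to another. The sketch below does only that. The pathway names, gene identifiers, and edge list are hypothetical, and the abstract does not say that the study scored interactions this way or how it layered expression data on top.

```python
from itertools import combinations

# Hypothetical pathway membership and protein interactions (illustration only).
PATHWAYS = {
    "glycolysis": {"G1", "G2", "G3"},
    "insulin_signaling": {"I1", "I2", "G3"},
    "stress_response": {"S1", "S2"},
}
INTERACTIONS = [("G1", "I1"), ("G2", "I2"), ("G1", "G2"), ("I1", "S1"), ("G3", "S2")]


def cross_pathway_edges(pathways, interactions):
    """Count interactions linking genes exclusive to each pathway of a pair."""
    counts = {}
    for p, q in combinations(sorted(pathways), 2):
        only_p = pathways[p] - pathways[q]  # drop shared genes to avoid trivial links
        only_q = pathways[q] - pathways[p]
        counts[(p, q)] = sum(
            1
            for a, b in interactions
            if (a in only_p and b in only_q) or (a in only_q and b in only_p)
        )
    return counts


counts = cross_pathway_edges(PATHWAYS, INTERACTIONS)
for pair, n in counts.items():
    print(pair, n)
```

Restricting the edge list to genes that change at a given time point would be one way to bring the transcriptomics data into such a count; that step is not shown.
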
Environmental surveillance and monitoring. The next frontiers ...

EPA Pesticide Factsheets

High throughput toxicity testing (HTT) technologies along with the world-wide web are revolutionizing both generation and access to data regarding the bioactivities that chemicals can elicit when they interact with specific proteins, genes, or other targets in the body of an organism. However, to date, most of the focus has been on the application of such data to assessment of individual chemicals. We suggest that environmental surveillance and monitoring represent the next frontiers for HTT. Resources already exist in curated databases of chemical-biological interactions, including highly standardized quantitative dose-response data generated from nascent HTT programs like ToxCast and Tox21, to link chemicals detected through environmental analytical chemistry to known biological activities. The emergence of the adverse outcome pathway framework and associated knowledgebase, which link molecular or pathway-level perturbations of biological systems to adverse outcomes traditionally considered in risk assessment and regulatory decision-making through a series of measurable biological changes, provides a critical link between activity and hazard. Furthermore, environmental samples can be directly analyzed via HTT platforms to provide an unprecedented breadth of biological activity characterization that integrates the effects of all compounds present in a mixture, whether known or not. Novel application of these chemical-biological interaction data provide an oppor
How much do we know about the coupling of G-proteins to serotonin receptors?

PubMed Central

2014-01-01

Serotonin receptors are G-protein-coupled receptors (GPCRs) involved in a variety of psychiatric disorders. G-proteins, heterotrimeric complexes that couple to multiple receptors, are activated when their receptor is bound by the appropriate ligand. Activation triggers a cascade of further signalling events that ultimately result in cell function changes. Each of the several known G-protein types can activate multiple pathways. Interestingly, since several G-proteins can couple to the same serotonin receptor type, receptor activation can result in induction of different pathways. To reach a better understanding of the role, interactions and expression of G-proteins a literature search was performed in order to list all the known heterotrimeric combinations and serotonin receptor complexes. Public databases were analysed to collect transcript and protein expression data relating to G-proteins in neural tissues. Only a very small number of heterotrimeric combinations and G-protein-receptor complexes out of the possible thousands suggested by expression data analysis have been examined experimentally. In addition this has mostly been obtained using insect, hamster, rat and, to a lesser extent, human cell lines. Besides highlighting which interactions have not been explored, our findings suggest additional possible interactions that should be examined based on our expression data analysis. PMID:25011628
How much do we know about the coupling of G-proteins to serotonin receptors?

PubMed

Giulietti, Matteo; Vivenzio, Viviana; Piva, Francesco; Principato, Giovanni; Bellantuono, Cesario; Nardi, Bernardo

2014-07-10

Serotonin receptors are G-protein-coupled receptors (GPCRs) involved in a variety of psychiatric disorders. G-proteins, heterotrimeric complexes that couple to multiple receptors, are activated when their receptor is bound by the appropriate ligand. Activation triggers a cascade of further signalling events that ultimately result in cell function changes. Each of the several known G-protein types can activate multiple pathways. Interestingly, since several G-proteins can couple to the same serotonin receptor type, receptor activation can result in induction of different pathways. To reach a better understanding of the role, interactions and expression of G-proteins a literature search was performed in order to list all the known heterotrimeric combinations and serotonin receptor complexes. Public databases were analysed to collect transcript and protein expression data relating to G-proteins in neural tissues. Only a very small number of heterotrimeric combinations and G-protein-receptor complexes out of the possible thousands suggested by expression data analysis have been examined experimentally. In addition this has mostly been obtained using insect, hamster, rat and, to a lesser extent, human cell lines. Besides highlighting which interactions have not been explored, our findings suggest additional possible interactions that should be examined based on our expression data analysis.
A database of human genes and a gene network involved in response to tick-borne encephalitis virus infection.

PubMed

Ignatieva, Elena V; Igoshin, Alexander V; Yudin, Nikolay S

2017-12-28

Tick-borne encephalitis is caused by the neurotropic, positive-sense RNA virus, tick-borne encephalitis virus (TBEV). TBEV infection can lead to a variety of clinical manifestations ranging from slight fever to severe neurological illness. Very little is known about genetic factors predisposing to severe forms of disease caused by TBEV. The aims of the study were to compile a catalog of human genes involved in response to TBEV infection and to rank genes from the catalog based on the number of neighbors in the network of pairwise interactions involving these genes and TBEV RNA or proteins. Based on manual review and curation of scientific publications, a catalog comprising 140 human genes involved in response to TBEV infection was developed. To provide access to data on all genes, the TBEVHostDB web resource ( http://icg.nsc.ru/TBEVHostDB/ ) was created. We reconstructed a network formed by pairwise interactions between the TBEV virion itself, viral RNA and viral proteins and 140 genes/proteins from TBEVHostDB. Genes were ranked according to the number of interactions in the network. Two genes/proteins (CCR5 and IFNAR1) had the maximal number of interactions. The subnetworks formed by CCR5 and IFNAR1 and their neighbors were found to be fragments of two key pathways functioning during the course of tick-borne encephalitis: (1) the attenuation of the interferon-I signaling pathway by the TBEV NS5 protein that targeted peptidase D; (2) the proinflammation and tissue damage pathway triggered by chemokine receptor CCR5 interacting with CD4, CCL3, CCL4, CCL2. Among nine genes associated with severe forms of TBEV infection, three genes/proteins (CCR5, IL10, ARID1B) were found to have protein-protein interactions within the network, and two genes/proteins (IFNL3 and the aforementioned IL10) were up- or down-regulated in response to TBEV infection.
Based on this finding, potential mechanisms for participation of CCR5, IL10, ARID1B, and IFNL3 in the host response to TBEV infection were suggested. A database comprising 140 human genes involved in response to TBEV infection was compiled, and the TBEVHostDB web resource, providing access to all genes, was created. This is the first effort to integrate and unify data on genetic factors that may predispose to severe forms of diseases caused by TBEV. The TBEVHostDB could potentially be used for assessment of risk factors for severe forms of tick-borne encephalitis and for the design of personalized pharmacological strategies for the treatment of TBEV infection.
Pathway and network-based analysis of genome-wide association studies and RT-PCR validation in polycystic ovary syndrome.

PubMed

Shen, Haoran; Liang, Zhou; Zheng, Saihua; Li, Xuelian

2017-11-01

The purpose of this study was to identify promising candidate genes and pathways in polycystic ovary syndrome (PCOS). Microarray dataset GSE34526 obtained from the Gene Expression Omnibus database includes 7 granulosa cell samples from PCOS patients, and 3 normal granulosa cell samples. Differentially expressed genes (DEGs) were screened between PCOS and normal samples. Pathway enrichment analysis was conducted for DEGs using the ClueGO and CluePedia plugins of Cytoscape. A Reactome functional interaction (FI) network of the DEGs was built using ReactomeFIViz, and then network modules were extracted, followed by pathway enrichment analysis for the modules. Expression of DEGs in granulosa cell samples was measured using quantitative RT-PCR. A total of 674 DEGs were retained, which were significantly enriched with inflammation and immune-related pathways. Eight modules were extracted from the Reactome FI network. Pathway enrichment analysis revealed significant pathways of each module: module 0, Regulation of RhoA activity and Signaling by Rho GTPases pathways shared ARHGAP4 and ARHGAP9; module 2, GlycoProtein VI-mediated activation cascade pathway was enriched with RHOG; module 3, Thromboxane A2 receptor signaling, Chemokine signaling pathway, CXCR4-mediated signaling events pathways were enriched with LYN, the hub gene of module 3. Results of RT-PCR confirmed the finding of the bioinformatic analysis that ARHGAP4, ARHGAP9, RHOG and LYN were significantly upregulated in PCOS. RhoA-related pathways, GlycoProtein VI-mediated activation cascade pathway, ARHGAP4, ARHGAP9, RHOG and LYN may be involved in the pathogenesis of PCOS.
Stabilizing Effect of Sweep on Low-Frequency STBLI Unsteadiness

NASA Astrophysics Data System (ADS)

Adler, Michael; Gaitonde, Datta

2017-11-01

A Large-Eddy Simulation database is generated to examine unsteady shock/turbulent boundary-layer-interaction (STBLI) mechanisms in a Mach 2 swept-compression-corner. Such interactions exhibit open separation, with separation relief from the sweep, and lack the closed mean recirculation found in spanwise-homogeneous STBLIs. We find that the swept interaction lacks the low-frequency coherent shock unsteadiness, two decades below incoming turbulent boundary layer scales, that is a principal feature of comparable closed separation STBLIs. Rather, the prominent unsteady content is a mid-frequency regime that develops in the separated shear layer and scales weakly with the local separation length. Additionally, a linear perturbation analysis of the unsteady flow indicates that the feedback pathway (associated with an absolute instability in spanwise-homogeneous interactions) is absent in swept-compression-corner interactions. This suggests that (1) the linear oscillator is an essential component of low-frequency unsteadiness in interactions with closed separation; (2) low-frequency control efforts should be focused on disrupting this oscillator; and (3) the introduction of 3D effects constitutes one mechanism to disrupt the oscillator.
Virtual Interactomics of Proteins from Biochemical Standpoint

PubMed Central

Kubrycht, Jaroslav; Sigler, Karel; Souček, Pavel

2012-01-01

Virtual interactomics represents a rapidly developing scientific area on the boundary line of bioinformatics and interactomics. Protein-related virtual interactomics comprises instrumental tools for prediction, simulation, and networking of the majority of interactions important for structural and individual reproduction, differentiation, recognition, signaling, regulation, and metabolic pathways of cells and organisms. Here, we describe the main areas of virtual protein interactomics, that is, structurally based comparative analysis and prediction of functionally important interacting sites, mimotope-assisted and combined epitope prediction, molecular (protein) docking studies, and investigation of protein interaction networks. Detailed information about some interesting methodological approaches and online accessible programs or databases is displayed in our tables. A considerable part of the text deals with the searches for common conserved or functionally convergent protein regions and subgraphs of conserved interaction networks, new outstanding trends and clinically interesting results. In agreement with the presented data and relationships, virtual interactomic tools improve our scientific knowledge, help us to formulate working hypotheses, and frequently also mediate variously important in silico simulations. PMID:22928109
The phylogenetic analysis of tetraspanins projects the evolution of cell-cell interactions from unicellular to multicellular organisms.

PubMed

Huang, Shengfeng; Yuan, Shaochun; Dong, Meiling; Su, Jing; Yu, Cuiling; Shen, Yang; Xie, Xiaojin; Yu, Yanhong; Yu, Xuesong; Chen, Shangwu; Zhang, Shicui; Pontarotti, Pierre; Xu, Anlong

2005-12-01

In animals, the tetraspanins are a large superfamily of membrane proteins that play important roles in organizing various cell-cell and matrix-cell interactions and signal pathways based on such interactions. However, their origin and evolution largely remain elusive and most of the family's members are functionally unknown or less known due to difficulties of study, such as functional redundancy. In this study, we rebuilt the family's phylogeny with sequences retrieved from online databases and our cDNA library of amphioxus. We reveal that, in addition to in metazoans, various tetraspanins are extensively expressed in protozoan amoebae, fungi, and plants. We also discuss the structural evolution of tetraspanin's major extracellular domain and the relation between tetraspanin's duplication and functional redundancy. Finally, we elucidate the coevolution of tetraspanins and eukaryotes and suggest that tetraspanins play important roles in the unicell-to-multicell transition. In short, the study of tetraspanin in a phylogenetic context helps us understand the evolution of intercellular interactions.
Bayesian network analyses of resistance pathways against efavirenz and nevirapine

PubMed Central

Deforche, Koen; Camacho, Ricardo J.; Grossman, Zehave; Soares, Marcelo A.; Laethem, Kristel Van; Katzenstein, David A.; Harrigan, P. Richard; Kantor, Rami; Shafer, Robert; Vandamme, Anne-Mieke

2016-01-01

Objective To clarify the role of novel mutations selected by treatment with efavirenz or nevirapine, and investigate the influence of HIV-1 subtype on nonnucleoside reverse transcriptase inhibitor (nNRTI) resistance pathways. Design By finding direct dependencies between treatment-selected mutations, the involvement of these mutations as minor or major resistance mutations against efavirenz, nevirapine, or coadministrated nucleoside analogue reverse transcriptase inhibitors (NRTIs) is hypothesized. In addition, direct dependencies were investigated between treatment-selected mutations and polymorphisms, some of which are linked with subtype, and between NRTI and nNRTI resistance pathways. Methods Sequences from a large collaborative database of various subtypes were jointly analyzed to detect mutations selected by treatment. Using Bayesian network learning, direct dependencies were investigated between treatment-selected mutations, NRTI and nNRTI treatment history, and known NRTI resistance mutations. Results Several novel minor resistance mutations were found: 28K and 196R (for resistance against efavirenz), 101H and 138Q (nevirapine), and 31L (lamivudine). Robust interactions between NRTI mutations (65R, 74V, 75I/M, and 184V) and nNRTI resistance mutations (100I, 181C, 190E and 230L) may affect resistance development to particular treatment combinations. For example, an interaction between 65R and 181C predicts that the nevirapine and tenofovir and lamivudine/emtricitabine combination should be more prone to failure than efavirenz and tenofovir and lamivudine/emtricitabine. Conclusion Bayesian networks were helpful in untangling the selection of mutations by NRTI versus nNRTI treatment, and in discovering interactions between resistance mutations within and between these two classes of inhibitors. PMID:18832874
[Validation of interaction databases in psychopharmacotherapy].

PubMed

Hahn, M; Roll, S C

2018-03-01

Drug-drug interaction databases are an important tool to increase drug safety in polypharmacy. Several drug interaction databases are available, but it is unclear which one shows the best results and therefore increases safety for the users of the databases and the patients. So far, there has been no validation of German drug interaction databases. This study validated German drug interaction databases with regard to the number of hits, mechanisms of drug interaction, references, clinical advice, and severity of the interaction. A total of 36 drug interactions which were published in the last 3-5 years were checked in 5 different databases. Besides the number of hits, it was also documented whether the mechanism was correct, clinical advice was given, primary literature was cited, and the severity level of the drug-drug interaction was given. All databases showed weaknesses regarding the hit rate of the tested drug interactions, with a maximum of 67.7% hits. The highest score in this validation was achieved by MediQ with 104 out of 180 points. PsiacOnline achieved 83 points, arznei-telegramm® 58, ifap index® 54 and the ABDA-database 49 points. Based on this validation, MediQ seems to be the most suitable database for the field of psychopharmacotherapy. However, it also needs improvement with respect to the hit rate so that users can rely on the results and thereby increase drug therapy safety.
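
The point totals reported above are easier to compare as shares of the maximum. The short sketch below does that arithmetic. It assumes that the 180-point maximum corresponds to 36 interactions scored on the five documented criteria, one point each; the abstract does not state the scoring rule explicitly, so this is a reading of it rather than a confirmed scheme.

```python
# Assumption: 36 interactions x 5 criteria (hit, mechanism, clinical advice,
# primary literature, severity), one point each, giving the 180-point maximum.
MAX_POINTS = 36 * 5

# Point totals as reported in the abstract.
SCORES = {
    "MediQ": 104,
    "PsiacOnline": 83,
    "arznei-telegramm": 58,
    "ifap index": 54,
    "ABDA-database": 49,
}


def percent_of_max(points, max_points=MAX_POINTS):
    """Express a point total as a percentage of the maximum, to one decimal."""
    return round(100 * points / max_points, 1)


ranking = sorted(SCORES, key=SCORES.get, reverse=True)
for name in ranking:
    print(f"{name}: {SCORES[name]}/{MAX_POINTS} = {percent_of_max(SCORES[name])}%")
```

On this reading, even the top-ranked database reaches less than 60% of the available points.
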
The Hippo/YAP pathway interacts with EGFR signaling and HPV oncoproteins to regulate cervical cancer progression.

PubMed

He, Chunbo; Mao, Dagan; Hua, Guohua; Lv, Xiangmin; Chen, Xingcheng; Angeletti, Peter C; Dong, Jixin; Remmenga, Steven W; Rodabaugh, Kerry J; Zhou, Jin; Lambert, Paul F; Yang, Peixin; Davis, John S; Wang, Cheng

2015-11-01

The Hippo signaling pathway controls organ size and tumorigenesis through a kinase cascade that inactivates Yes-associated protein (YAP). Here, we show that YAP plays a central role in controlling the progression of cervical cancer. Our results suggest that YAP expression is associated with a poor prognosis for cervical cancer. TGF-α and amphiregulin (AREG), via EGFR, inhibit the Hippo signaling pathway and activate YAP to induce cervical cancer cell proliferation and migration. Activated YAP allows for up-regulation of TGF-α, AREG, and EGFR, forming a positive signaling loop to drive cervical cancer cell proliferation. HPV E6 protein, a major etiological molecule of cervical cancer, maintains high YAP protein levels in cervical cancer cells by preventing proteasome-dependent YAP degradation to drive cervical cancer cell proliferation. Results from human cervical cancer genomic databases and an accepted transgenic mouse model strongly support the clinical relevance of the discovered feed-forward signaling loop. Our study indicates that combined targeting of the Hippo and the ERBB signaling pathways represents a novel therapeutic strategy for prevention and treatment of cervical cancer. © 2015 The Authors. Published under the terms of the CC BY 4.0 license.
Challenges in horizontal model integration.

PubMed

Kolczyk, Katrin; Conradi, Carsten

2016-03-11

Systems Biology has motivated dynamic models of important intracellular processes at the pathway level, for example, in signal transduction and cell cycle control. To answer important biomedical questions, however, one has to go beyond the study of isolated pathways towards the joint study of interacting signaling pathways or the joint study of signal transduction and cell cycle control. Thereby the reuse of established models is preferable, as it will generally reduce the modeling effort and increase the acceptance of the combined model in the field. Obtaining a combined model can be challenging, especially if the submodels are large and/or come from different working groups (as is generally the case, when models stored in established repositories are used). To support this task, we describe a semi-automatic workflow based on established software tools. In particular, two frequent challenges are described: identification of the overlap and subsequent (re)parameterization of the integrated model. The reparameterization step is crucial, if the goal is to obtain a model that can reproduce the data explained by the individual models. For demonstration purposes we apply our workflow to integrate two signaling pathways (EGF and NGF) from the BioModels Database.
Challenges of the information age: the impact of false discovery on pathway identification.

PubMed

Rog, Colin J; Chekuri, Srinivasa C; Edgerton, Mary E

2012-11-21

Pathways with members that have known relevance to a disease are used to support hypotheses generated from analyses of gene expression and proteomic studies. Using cancer as an example, the pitfalls of searching pathways databases as support for genes and proteins that could represent false discoveries are explored. The frequency with which networks could be generated from 100 instances each of randomly selected five-gene and ten-gene sets as input to MetaCore, a commercial pathways database, was measured. A PubMed search enumerated cancer-related literature published for any gene in the networks. Using three, two, and one maximum intervening steps between input genes to populate the network, networks were generated with frequencies of 97%, 77%, and 7% using ten-gene sets and 73%, 27%, and 1% using five-gene sets. PubMed reported an average of 4225 cancer-related articles per network gene. This can be attributed to the richly populated pathways databases and the interest in the molecular basis of cancer. As information sources become enriched, they are more likely to generate plausible mechanisms for false discoveries.
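
The experiment in this abstract can be mimicked on a synthetic graph to see why denser knowledge bases and longer allowed paths make random gene sets look connected. The sketch below is an illustration under stated assumptions: the random graph stands in for a pathways database, a "network" is taken to mean that all input genes become linked when pairs separated by at most k intermediate nodes are joined, and none of the function names correspond to MetaCore or any real API.

```python
import random
from collections import deque


def random_graph(n_nodes, n_edges, seed=0):
    """Hypothetical stand-in for a pathways database: a random undirected graph."""
    rng = random.Random(seed)
    graph = {i: set() for i in range(n_nodes)}
    edges = 0
    while edges < n_edges:
        u, v = rng.sample(range(n_nodes), 2)
        if v not in graph[u]:
            graph[u].add(v)
            graph[v].add(u)
            edges += 1
    return graph


def within_steps(graph, src, max_dist):
    """Nodes reachable from src using at most max_dist edges (bounded BFS)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if dist[u] == max_dist:
            continue
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def forms_network(graph, genes, max_intervening):
    """Assumed criterion: all input genes end up in one linked group."""
    genes = list(genes)
    reach = {g: within_steps(graph, g, max_intervening + 1) for g in genes}
    seen, stack = {genes[0]}, [genes[0]]
    while stack:
        g = stack.pop()
        for h in genes:
            if h not in seen and h in reach[g]:
                seen.add(h)
                stack.append(h)
    return len(seen) == len(genes)


def network_frequency(graph, set_size, max_intervening, trials=100, seed=0):
    """Fraction of random gene sets that satisfy forms_network."""
    rng = random.Random(seed)  # same seed => same gene sets for every k
    nodes = sorted(graph)
    hits = sum(
        forms_network(graph, rng.sample(nodes, set_size), max_intervening)
        for _ in range(trials)
    )
    return hits / trials


graph = random_graph(n_nodes=200, n_edges=400)
freqs = {k: network_frequency(graph, set_size=5, max_intervening=k) for k in (1, 2, 3)}
print(freqs)
```

Because the same random gene sets are reused for each value of k, the frequency can only stay the same or rise as more intervening steps are allowed, which mirrors the direction of the trend reported in the abstract.
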
Interactome of the hepatitis C virus: Literature mining with ANDSystem.

PubMed

Saik, Olga V; Ivanisenko, Timofey V; Demenkov, Pavel S; Ivanisenko, Vladimir A

2016-06-15

A study of the molecular genetics mechanisms of host-pathogen interactions is of paramount importance in developing drugs against viral diseases. Currently, the literature contains a huge amount of information that describes interactions between HCV and human proteins. In addition, there are many factual databases that contain experimentally verified data on HCV-host interactions. The sources of such data are the original data along with the data manually extracted from the literature. However, the manual analysis of scientific publications is time consuming and, because of this, databases created with such an approach often do not have complete information. One of the most promising methods to provide actualisation and completeness of information is text mining. Here, with the use of a previously developed method by the authors using ANDSystem, an automated extraction of information on the interactions between HCV and human proteins was conducted. As a data source for the text mining approach, PubMed abstracts and full text articles were used. Additionally, external factual databases were analyzed. On the basis of this analysis, a special version of ANDSystem, extended with the HCV interactome, was created. The HCV interactome contains information about the interactions between 969 human and 11 HCV proteins. Among the 969 proteins, 153 'new' proteins were found not previously referred to in any external databases of protein-protein interactions for HCV-host interactions. Thus, the extended ANDSystem possesses a more comprehensive detailing of HCV-host interactions versus other existing databases. It was interesting that HCV proteins more preferably interact with human proteins that were already involved in a large number of protein-protein interactions as well as those associated with many diseases. Among human proteins of the HCV interactome, there were a large number of proteins regulated by microRNAs. 
It turned out that the results obtained for protein-protein interactions and microRNA regulation did not depend on how well the proteins were studied, while protein-disease interactions appeared to be dependent on the level of study. In particular, the mean number of diseases linked to well-studied proteins (proteins were considered well-studied if they were mentioned in 50 or more PubMed publications) from the HCV interactome was 20.8, significantly exceeding the mean number of associations with diseases (10.1) for the total set of well-studied human proteins present in ANDSystem. For poorly-studied proteins from the HCV interactome (each protein was referred to in fewer than 50 publications), the distribution of the number of associated diseases did not differ significantly from the corresponding distribution for poorly-studied proteins in the total set of human proteins stored in ANDSystem. Here, the average numbers of associations with diseases for the HCV interactome and the total set of human proteins were 0.3 and 0.2, respectively. Thus, ANDSystem, extended with the HCV interactome, can be helpful in a wide range of issues related to analyzing HCV-host interactions in the search for anti-HCV drug targets. The demo version of the extended ANDSystem covered here, containing only interactions between human proteins, genes, metabolites, diseases, miRNAs and molecular-genetic pathways, as well as interactions between human proteins/genes and HCV proteins, is freely available at the following web address: http://www-bionet.sscc.ru/psd/andhcv/. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
The European Bioinformatics Institute's data resources: towards systems biology.

PubMed

Brooksbank, Catherine; Cameron, Graham; Thornton, Janet

2005-01-01

Genomic and post-genomic biological research has provided fine-grain insights into the molecular processes of life, but also threatens to drown biomedical researchers in data. Moreover, as new high-throughput technologies are developed, the types of data that are gathered en masse are diversifying. The need to collect, store and curate all this information in ways that allow its efficient retrieval and exploitation is greater than ever. The European Bioinformatics Institute's (EBI's) databases and tools have evolved to meet the changing needs of molecular biologists: since we last wrote about our services in the 2003 issue of Nucleic Acids Research, we have launched new databases covering protein-protein interactions (IntAct), pathways (Reactome) and small molecules (ChEBI). Our existing core databases have continued to evolve to meet the changing needs of biomedical researchers, and we have developed new data-access tools that help biologists to move intuitively through the different data types, thereby helping them to put the parts together to understand biology at the systems level. The EBI's data resources are all available on our website at http://www.ebi.ac.uk.
'RetinoGenetics': a comprehensive mutation database for genes related to inherited retinal degeneration.

PubMed

Ran, Xia; Cai, Wei-Jun; Huang, Xiu-Feng; Liu, Qi; Lu, Fan; Qu, Jia; Wu, Jinyu; Jin, Zi-Bing

2014-01-01

Inherited retinal degeneration (IRD), a leading cause of human blindness worldwide, is exceptionally heterogeneous, both clinically and genetically. During the past decades, tremendous efforts have been made to explore this complex heterogeneity, and a large number of mutations have been identified in different genes underlying IRD with the significant advancement of sequencing technology. In this study, we developed a comprehensive database, 'RetinoGenetics', which contains informative knowledge about all known IRD-related genes and mutations for IRD. 'RetinoGenetics' currently contains 4270 mutations in 186 genes, with detailed information associated with 164 phenotypes from 934 publications and various types of functional annotations. Extensive annotations were then performed for each gene using various resources, including Gene Ontology, KEGG pathways, protein-protein interaction, mutational annotations and gene-disease network. Furthermore, by using the search functions, convenient browsing ways and intuitive graphical displays, 'RetinoGenetics' could serve as a valuable resource for unveiling the genetic basis of IRD. Taken together, 'RetinoGenetics' is an integrative, informative and updatable resource for IRD-related genetic predispositions. Database URL: http://www.retinogenetics.org/. © The Author(s) 2014. Published by Oxford University Press.
The European Bioinformatics Institute's data resources: towards systems biology

PubMed Central

Brooksbank, Catherine; Cameron, Graham; Thornton, Janet

2005-01-01

Genomic and post-genomic biological research has provided fine-grain insights into the molecular processes of life, but also threatens to drown biomedical researchers in data. Moreover, as new high-throughput technologies are developed, the types of data that are gathered en masse are diversifying. The need to collect, store and curate all this information in ways that allow its efficient retrieval and exploitation is greater than ever. The European Bioinformatics Institute's (EBI's) databases and tools have evolved to meet the changing needs of molecular biologists: since we last wrote about our services in the 2003 issue of Nucleic Acids Research, we have launched new databases covering protein–protein interactions (IntAct), pathways (Reactome) and small molecules (ChEBI). Our existing core databases have continued to evolve to meet the changing needs of biomedical researchers, and we have developed new data-access tools that help biologists to move intuitively through the different data types, thereby helping them to put the parts together to understand biology at the systems level. The EBI's data resources are all available on our website at http://www.ebi.ac.uk. PMID:15608238
Discovery of multiple interacting partners of gankyrin, a proteasomal chaperone and an oncoprotein--evidence for a common hot spot site at the interface and its functional relevance.

PubMed

Nanaware, Padma P; Ramteke, Manoj P; Somavarapu, Arun K; Venkatraman, Prasanna

2014-07-01

Gankyrin, a non-ATPase component of the proteasome and a chaperone of proteasome assembly, is also an oncoprotein. Gankyrin regulates a variety of oncogenic signaling pathways in cancer cells and accelerates degradation of tumor suppressor proteins p53 and Rb. Therefore gankyrin may be a unique hub integrating signaling networks with the degradation pathway. To identify new interactions that may be crucial in consolidating its role as an oncogenic hub, the crystal structure of the gankyrin-proteasome ATPase complex was used to predict novel interacting partners. EEVD, a four amino acid linear sequence, seems to be a hot spot site at this interface. By searching for EEVD in exposed regions of human proteins in the PDB database, we predicted 34 novel interactions. Eight proteins were tested and seven of them were found to interact with gankyrin. Affinity of four interactions is high enough for endogenous detection. Others require gankyrin overexpression in HEK 293 cells or occur endogenously in the breast cancer cell line MDA-MB-435, reflecting lower affinity or presence of a deregulated network. Mutagenesis and peptide inhibition confirm that EEVD is the common hot spot site at these interfaces and therefore a potential polypharmacological drug target. In MDA-MB-231 cells in which the endogenous CLIC1 is silenced, trans-expression of Wt protein (CLIC1_EEVD) and not the hot spot site mutant (CLIC1_AAVA) resulted in significant rescue of the migratory potential. Our approach can be extended to identify novel functionally relevant protein-protein interactions, in expansion of oncogenic networks and in identifying potential therapeutic targets. © 2013 Wiley Periodicals, Inc.
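The motif search described in this abstract can be illustrated with a minimal sketch. It only scans sequences for the linear motif EEVD; it does not check solvent exposure in PDB structures, which the study also required. The function name and the toy sequences are hypothetical and are not taken from the paper.

```python
import re

def find_motif(sequences, motif="EEVD"):
    """Return {protein_id: [0-based start positions]} for sequences containing the motif."""
    # A zero-width lookahead lets finditer report overlapping occurrences too.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    hits = {}
    for protein_id, seq in sequences.items():
        positions = [m.start() for m in pattern.finditer(seq.upper())]
        if positions:
            hits[protein_id] = positions
    return hits

# Hypothetical toy sequences, for illustration only.
toy_sequences = {
    "protA": "MKTEEVDLLG",
    "protB": "MSSGAAVAKK",
    "protC": "EEVDAAEEVD",
}

print(find_motif(toy_sequences))
```

In a real screen the candidate list would then be filtered by structural exposure of the motif, a step this sketch leaves out.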

Effect of curcumin on aged Drosophila melanogaster: a pathway prediction analysis.

PubMed

Zhang, Zhi-guo; Niu, Xu-yan; Lu, Ai-ping; Xiao, Gary Guishan

2015-02-01

To re-analyze the data published in order to explore plausible biological pathways that can be used to explain the anti-aging effect of curcumin. Microarray data generated from another study investigating the effect of curcumin on extending the lifespan of Drosophila melanogaster were further used for pathway prediction analysis. The differentially expressed genes were identified by using GeneSpring GX with a criterion of 3.0-fold change. Two Cytoscape plugins, BisoGenet and molecular complex detection (MCODE), were used to establish the protein-protein interaction (PPI) network based upon differential genes in order to detect highly connected regions. The function annotation clustering tool of Database for Annotation, Visualization and Integrated Discovery (DAVID) was used for pathway analysis. A total of 87 genes expressed differentially in D. melanogaster treated with curcumin were identified, among which 50 were significantly up-regulated and 37 were remarkably down-regulated. Based upon these differential genes, a PPI network was constructed with 1,082 nodes and 2,412 edges. Five highly connected regions in the PPI network were detected by the MCODE algorithm, suggesting that the anti-aging effect of curcumin may be underpinned by five different pathways including Notch signaling pathway, basal transcription factors, cell cycle regulation, ribosome, Wnt signaling pathway, and p53 pathway. Genes and their associated pathways in D. melanogaster treated with the anti-aging agent curcumin were identified using the PPI network and the MCODE algorithm, suggesting that curcumin may be developed as an alternative therapeutic medicine for treating aging-associated diseases.
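A fold-change criterion like the 3.0-fold cutoff mentioned in this abstract can be sketched as follows. This is not GeneSpring GX's implementation; it assumes positive, linear-scale expression values, and the gene names and numbers are hypothetical.

```python
def classify_by_fold_change(expr, threshold=3.0):
    """Split genes into up- and down-regulated lists using a fold-change cutoff.

    expr maps gene -> (control, treated), both on a linear scale.
    """
    up, down = [], []
    for gene, (control, treated) in expr.items():
        if control <= 0 or treated <= 0:
            continue  # a ratio is undefined for non-positive values; skip such genes
        ratio = treated / control
        if ratio >= threshold:
            up.append(gene)
        elif ratio <= 1.0 / threshold:
            down.append(gene)
    return sorted(up), sorted(down)

# Hypothetical expression values, for illustration only.
toy_expr = {
    "geneA": (100.0, 350.0),  # 3.5-fold up
    "geneB": (300.0, 90.0),   # 0.3-fold, i.e. more than 3-fold down
    "geneC": (200.0, 260.0),  # 1.3-fold, below the cutoff
    "geneD": (50.0, 150.0),   # exactly 3.0-fold up
}

up_genes, down_genes = classify_by_fold_change(toy_expr)
print(up_genes, down_genes)
```

Whether a gene sitting exactly at the cutoff counts as differential is a design choice; this sketch includes it.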
PathMAPA: a tool for displaying gene expression and performing statistical tests on metabolic pathways at multiple levels for Arabidopsis.

PubMed

Pan, Deyun; Sun, Ning; Cheung, Kei-Hoi; Guan, Zhong; Ma, Ligeng; Holford, Matthew; Deng, Xingwang; Zhao, Hongyu

2003-11-07

To date, many genomic and pathway-related tools and databases have been developed to analyze microarray data. In published web-based applications to date, however, complex pathways have been displayed with static image files that may not be up-to-date or are time-consuming to rebuild. In addition, gene expression analyses focus on individual probes and genes with little or no consideration of pathways. These approaches reveal little information about pathways that are key to a full understanding of the building blocks of biological systems. Therefore, there is a need to provide useful tools that can generate pathways without manually building images and allow gene expression data to be integrated and analyzed at pathway levels for such experimental organisms as Arabidopsis. We have developed PathMAPA, a web-based application written in Java that can be easily accessed over the Internet. An Oracle database is used to store, query, and manipulate the large amounts of data that are involved. PathMAPA allows its users to (i) upload and populate microarray data into a database; (ii) integrate gene expression with enzymes of the pathways; (iii) generate pathway diagrams without building image files manually; (iv) visualize gene expressions for each pathway at enzyme, locus, and probe levels; and (v) perform statistical tests at pathway, enzyme and gene levels. PathMAPA can be used to examine Arabidopsis thaliana gene expression patterns associated with metabolic pathways. PathMAPA provides two unique features for the gene expression analysis of Arabidopsis thaliana: (i) automatic generation of pathways associated with gene expression and (ii) statistical tests at pathway level. The first feature allows for the periodical updating of genomic data for pathways, while the second feature can provide insight into how treatments affect relevant pathways for the selected experiment(s).
PathMAPA: a tool for displaying gene expression and performing statistical tests on metabolic pathways at multiple levels for Arabidopsis

PubMed Central

Pan, Deyun; Sun, Ning; Cheung, Kei-Hoi; Guan, Zhong; Ma, Ligeng; Holford, Matthew; Deng, Xingwang; Zhao, Hongyu

2003-01-01

Background To date, many genomic and pathway-related tools and databases have been developed to analyze microarray data. In published web-based applications to date, however, complex pathways have been displayed with static image files that may not be up-to-date or are time-consuming to rebuild. In addition, gene expression analyses focus on individual probes and genes with little or no consideration of pathways. These approaches reveal little information about pathways that are key to a full understanding of the building blocks of biological systems. Therefore, there is a need to provide useful tools that can generate pathways without manually building images and allow gene expression data to be integrated and analyzed at pathway levels for such experimental organisms as Arabidopsis. Results We have developed PathMAPA, a web-based application written in Java that can be easily accessed over the Internet. An Oracle database is used to store, query, and manipulate the large amounts of data that are involved. PathMAPA allows its users to (i) upload and populate microarray data into a database; (ii) integrate gene expression with enzymes of the pathways; (iii) generate pathway diagrams without building image files manually; (iv) visualize gene expressions for each pathway at enzyme, locus, and probe levels; and (v) perform statistical tests at pathway, enzyme and gene levels. PathMAPA can be used to examine Arabidopsis thaliana gene expression patterns associated with metabolic pathways. Conclusion PathMAPA provides two unique features for the gene expression analysis of Arabidopsis thaliana: (i) automatic generation of pathways associated with gene expression and (ii) statistical tests at pathway level. The first feature allows for the periodical updating of genomic data for pathways, while the second feature can provide insight into how treatments affect relevant pathways for the selected experiment(s). PMID:14604444
MetaMapR: pathway independent metabolomic network analysis incorporating unknowns.

PubMed

Grapov, Dmitry; Wanichthanarak, Kwanjeera; Fiehn, Oliver

2015-08-15

Metabolic network mapping is a widely used approach for integration of metabolomic experimental results with biological domain knowledge. However, current approaches can be limited by biochemical domain or pathway knowledge which results in sparse disconnected graphs for real world metabolomic experiments. MetaMapR integrates enzymatic transformations with metabolite structural similarity, mass spectral similarity and empirical associations to generate richly connected metabolic networks. This open source, web-based or desktop software, written in the R programming language, leverages KEGG and PubChem databases to derive associations between metabolites even in cases where biochemical domain or molecular annotations are unknown. Network calculation is enhanced through an interface to the Chemical Translation System, which allows metabolite identifier translation between >200 common biochemical databases. Analysis results are presented as interactive visualizations or can be exported as high-quality graphics and numerical tables which can be imported into common network analysis and visualization tools. Freely available at http://dgrapov.github.io/MetaMapR/. Requires R and a modern web browser. Installation instructions, tutorials and application examples are available at http://dgrapov.github.io/MetaMapR/. ofiehn@ucdavis.edu. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
A computational platform to maintain and migrate manual functional annotations for BioCyc databases.

PubMed

Walsh, Jesse R; Sen, Taner Z; Dickerson, Julie A

2014-10-12

BioCyc databases are an important resource for information on biological pathways and genomic data. Such databases represent the accumulation of biological data, some of which has been manually curated from literature. An essential feature of these databases is the continuing data integration as new knowledge is discovered. As functional annotations are improved, scalable methods are needed for curators to manage annotations without detailed knowledge of the specific design of the BioCyc database. We have developed CycTools, a software tool which allows curators to maintain functional annotations in a model organism database. This tool builds on existing software to improve and simplify annotation data imports of user provided data into BioCyc databases. Additionally, CycTools automatically resolves synonyms and alternate identifiers contained within the database into the appropriate internal identifiers. Automating steps in the manual data entry process can improve curation efforts for major biological databases. The functionality of CycTools is demonstrated by transferring GO term annotations from MaizeCyc to matching proteins in CornCyc, both maize metabolic pathway databases available at MaizeGDB, and by creating strain specific databases for metabolic engineering.
VitisExpDB: a database resource for grape functional genomics.

PubMed

Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L

2008-02-28

The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores approximately 320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of approximately 20,000 non-redundant set of ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. The developed database provides genomic resource to grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website http://cropdisease.ars.usda.gov/vitis_at/main-page.htm.
VitisExpDB: A database resource for grape functional genomics

PubMed Central

Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L

2008-01-01

Background The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. Description VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores ~320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of ~20,000 non-redundant set of ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. Conclusion The developed database provides genomic resource to grape community for functional analysis of genes in the collection and for the grape genome annotation and gene function identification. The VitisExpDB database is available through our website http://cropdisease.ars.usda.gov/vitis_at/main-page.htm. PMID:18307813
The FMRP regulon: from targets to disease convergence

PubMed Central

Fernández, Esperanza; Rajan, Nicholas; Bagni, Claudia

2013-01-01

The fragile X mental retardation protein (FMRP) is an RNA-binding protein that regulates mRNA metabolism. FMRP has been largely studied in the brain, where the absence of this protein leads to fragile X syndrome, the most frequent form of inherited intellectual disability. Since the identification of the FMRP gene in 1991, many studies have primarily focused on understanding the function/s of this protein. Hundreds of potential FMRP mRNA targets and several interacting proteins have been identified. Here, we report the identification of FMRP mRNA targets in the mammalian brain that support the key role of this protein during brain development and in regulating synaptic plasticity. We compared the genes from databases and genome-wide association studies with the brain FMRP transcriptome, and identified several FMRP mRNA targets associated with autism spectrum disorders, mood disorders and schizophrenia, showing a potential common pathway/s for these apparently different disorders. PMID:24167470
dEMBF: A Comprehensive Database of Enzymes of Microalgal Biofuel Feedstock.

PubMed

Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta

2016-01-01

Microalgae have attracted wide attention as one of the most versatile renewable feedstocks for production of biofuel. To develop genetically engineered high lipid yielding algal strains, a thorough understanding of the lipid biosynthetic pathway and the underpinning enzymes is essential. In this work, we have systematically mined the genomes of fifteen diverse algal species belonging to Chlorophyta, Heterokontophyta, Rhodophyta, and Haptophyta, to identify and annotate the putative enzymes of lipid metabolic pathway. Consequently, we have also developed a database, dEMBF (Database of Enzymes of Microalgal Biofuel Feedstock), which catalogues the complete list of identified enzymes along with their computed annotation details including length, hydrophobicity, amino acid composition, subcellular location, gene ontology, KEGG pathway, orthologous group, Pfam domain, intron-exon organization, transmembrane topology, and secondary/tertiary structural data. Furthermore, to facilitate functional and evolutionary study of these enzymes, a collection of built-in applications for BLAST search, motif identification, sequence and phylogenetic analysis have been seamlessly integrated into the database. dEMBF is the first database that brings together all enzymes responsible for lipid synthesis from available algal genomes, and provides an integrative platform for enzyme inquiry and analysis. This database will be extremely useful for algal biofuel research. It can be accessed at http://bbprof.immt.res.in/embf.
dEMBF: A Comprehensive Database of Enzymes of Microalgal Biofuel Feedstock

PubMed Central

Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta

2016-01-01

Microalgae have attracted wide attention as one of the most versatile renewable feedstocks for production of biofuel. To develop genetically engineered high lipid yielding algal strains, a thorough understanding of the lipid biosynthetic pathway and the underpinning enzymes is essential. In this work, we have systematically mined the genomes of fifteen diverse algal species belonging to Chlorophyta, Heterokontophyta, Rhodophyta, and Haptophyta, to identify and annotate the putative enzymes of lipid metabolic pathway. Consequently, we have also developed a database, dEMBF (Database of Enzymes of Microalgal Biofuel Feedstock), which catalogues the complete list of identified enzymes along with their computed annotation details including length, hydrophobicity, amino acid composition, subcellular location, gene ontology, KEGG pathway, orthologous group, Pfam domain, intron-exon organization, transmembrane topology, and secondary/tertiary structural data. Furthermore, to facilitate functional and evolutionary study of these enzymes, a collection of built-in applications for BLAST search, motif identification, sequence and phylogenetic analysis have been seamlessly integrated into the database. dEMBF is the first database that brings together all enzymes responsible for lipid synthesis from available algal genomes, and provides an integrative platform for enzyme inquiry and analysis. This database will be extremely useful for algal biofuel research. It can be accessed at http://bbprof.immt.res.in/embf. PMID:26727469
DrugBank 3.0: a comprehensive resource for ‘Omics’ research on drugs

PubMed Central

Knox, Craig; Law, Vivian; Jewison, Timothy; Liu, Philip; Ly, Son; Frolkis, Alex; Pon, Allison; Banco, Kelly; Mak, Christine; Neveu, Vanessa; Djoumbou, Yannick; Eisner, Roman; Guo, An Chi; Wishart, David S.

2011-01-01

DrugBank (http://www.drugbank.ca) is a richly annotated database of drug and drug target information. It contains extensive data on the nomenclature, ontology, chemistry, structure, function, action, pharmacology, pharmacokinetics, metabolism and pharmaceutical properties of both small molecule and large molecule (biotech) drugs. It also contains comprehensive information on the target diseases, proteins, genes and organisms on which these drugs act. First released in 2006, DrugBank has become widely used by pharmacists, medicinal chemists, pharmaceutical researchers, clinicians, educators and the general public. Since its last update in 2008, DrugBank has been greatly expanded through the addition of new drugs, new targets and the inclusion of more than 40 new data fields per drug entry (a 40% increase in data ‘depth’). These data field additions include illustrated drug-action pathways, drug transporter data, drug metabolite data, pharmacogenomic data, adverse drug response data, ADMET data, pharmacokinetic data, computed property data and chemical classification data. DrugBank 3.0 also offers expanded database links, improved search tools for drug–drug and food–drug interaction, new resources for querying and viewing drug pathways and hundreds of new drug entries with detailed patent, pricing and manufacturer data. These additions have been complemented by enhancements to the quality and quantity of existing data, particularly with regard to drug target, drug description and drug action data. DrugBank 3.0 represents the result of 2 years of manual annotation work aimed at making the database much more useful for a wide range of ‘omics’ (i.e. pharmacogenomic, pharmacoproteomic, pharmacometabolomic and even pharmacoeconomic) applications. PMID:21059682
funRiceGenes dataset for comprehensive understanding and application of rice functional genes.

PubMed

Yao, Wen; Li, Guangwei; Yu, Yiming; Ouyang, Yidan

2018-01-01

As a main staple food, rice is also a model plant for functional genomic studies of monocots. Decoding of every DNA element of the rice genome is essential for genetic improvement to address increasing food demands. The past 15 years have witnessed extraordinary advances in rice functional genomics. Systematic characterization and proper deposition of every rice gene are vital for both functional studies and crop genetic improvement. We built a comprehensive and accurate dataset of ∼2800 functionally characterized rice genes and ∼5000 members of different gene families by integrating data from available databases and reviewing every publication on rice functional genomic studies. The dataset accounts for 19.2% of the 39 045 annotated protein-coding rice genes, which provides the most exhaustive archive for investigating the functions of rice genes. We also constructed 214 gene interaction networks based on 1841 connections between 1310 genes. The largest network with 762 genes indicated that pleiotropic genes linked different biological pathways. Increasing degree of conservation of the flowering pathway was observed among more closely related plants, implying substantial value of rice genes for future dissection of flowering regulation in other crops. All data are deposited in the funRiceGenes database (https://funricegenes.github.io/). Functionality for advanced search and continuous updating of the database are provided by a Shiny application (http://funricegenes.ncpgr.cn/). The funRiceGenes dataset would enable further exploring of the crosslink between gene functions and natural variations in rice, which can also facilitate breeding design to improve target agronomic traits of rice. © The Authors 2017. Published by Oxford University Press.
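Grouping pairwise gene connections into separate networks, as in the counts reported in this abstract, can be sketched with a union-find pass. This assumes that a "network" corresponds to a connected component of the connection graph, which the abstract does not state explicitly. The gene identifiers are hypothetical.

```python
from collections import defaultdict

def connected_components(edges):
    """Group nodes into connected components; largest component first."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    return sorted(groups.values(), key=len, reverse=True)

# Hypothetical gene-gene connections, for illustration only.
toy_edges = [("g1", "g2"), ("g2", "g3"), ("g4", "g5")]

components = connected_components(toy_edges)
print(len(components), [len(c) for c in components])
```

Applied to the real connection list, the number of components and the size of the largest one would be the quantities to compare with the reported figures.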
Insight into the transcriptome of Arthrobotrys conoides using high throughput sequencing.

PubMed

Ramesh, Pandit; Reena, Patel; Amitbikram, Mohapatra; Chaitanya, Joshi; Anju, Kunjadia

2015-12-01

Arthrobotrys conoides is a nematode-trapping fungus belonging to Orbiliales, Ascomycota group, and traps prey nematodes by means of adhesive network. Fungus has a potential to be used as a biocontrol agent against plant parasitic nematodes. In the present study, we characterized the transcriptome of A. conoides using high-throughput sequencing technology and characterized its virulence unigenes. Total 7,255 cDNA contigs with an average length of 425 bp were generated and 6184 (61.81%) transcripts were functionally annotated and characterized. Majority of unigenes were found analogous to the genes of plant pathogenic fungi. A total of 1749 transcripts were found to be orthologous with eukaryotic proteins of KOG database. Several carbohydrate active enzymes and peptidases were identified. We also analyzed classically and nonclassically secreted proteins and confirmed by BLASTP against fungal secretome database. A total of 916 contigs were analogous to 556 unique proteins of Pathogen Host Interaction (PHI) database. Further, we identified 91 unigenes homologous to the database of fungal virulence factor (DFVF). A total of 104 putative protein kinases coding transcripts were identified by BLASTP against KinBase database, which are major players in signaling pathways. This study provides a comprehensive look at the transcriptome of A. conoides and the identified unigenes might have a role in catching and killing prey nematodes by A. conoides. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
ISAAC - InterSpecies Analysing Application using Containers.

PubMed

Baier, Herbert; Schultz, Jörg

2014-01-15

Information about genes, transcripts and proteins is spread over a wide variety of databases. Different tools have been developed using these databases to identify biological signals in gene lists from large scale analysis. Mostly, they search for enrichments of specific features. But these tools do not allow an explorative walk through different views or changes to the gene lists according to newly emerging stories. To fill this niche, we have developed ISAAC, the InterSpecies Analysing Application using Containers. The central idea of this web based tool is to enable the analysis of sets of genes, transcripts and proteins under different biological viewpoints and to interactively modify these sets at any point of the analysis. Detailed history and snapshot information allows tracing each action. Furthermore, one can easily switch back to previous states and perform new analyses. Currently, sets can be viewed in the context of genomes, protein functions, protein interactions, pathways, regulation, diseases and drugs. Additionally, users can switch between species with an automatic, orthology based translation of existing gene sets. As today's research is usually performed in larger teams and consortia, ISAAC provides group based functionalities. Here, sets as well as results of analyses can be exchanged between members of groups. ISAAC fills the gap between primary databases and tools for the analysis of large gene lists. With its highly modular, JavaEE based design, the implementation of new modules is straightforward. Furthermore, ISAAC comes with an extensive web-based administration interface including tools for the integration of third party data. Thus, a local installation is easily feasible. In summary, ISAAC is tailor made for highly explorative interactive analyses of gene, transcript and protein sets in a collaborative environment.
Bioinformatic perspectives on NRPS/PKS megasynthases: advances and challenges.

PubMed

Jenke-Kodama, Holger; Dittmann, Elke

2009-07-01

The increased understanding of both fundamental principles and mechanistic variations of NRPS/PKS megasynthases along with the unprecedented availability of microbial sequences has inspired a number of in silico studies of both enzyme families. The insights that can be extracted from these analyses go far beyond a rough classification of data and have turned bioinformatics into a frontier field of natural products research. As databases are flooded with NRPS/PKS gene sequences of microbial genomes and metagenomes, increasingly reliable structural prediction methods can help to uncover hidden treasures. Already, phylogenetic analyses have revealed that NRPS/PKS pathways should not simply be regarded as enzyme complexes specifically evolved to produce a selected natural product. Rather, they represent a collection of genetic options, allowing biosynthetic pathways to be shuffled in a process of perpetual chemical innovation. Pathway diversification in nature can give impulses for specificities, protein interactions and genetic engineering of libraries of novel peptides and polyketides. The successful translation of the knowledge obtained from bioinformatic dissection of NRPS/PKS megasynthases into new techniques for drug discovery and design remains a challenge for the future.
Using Bioinformatic Approaches to Identify Pathways Targeted by Human Leukemogens

PubMed Central

Thomas, Reuben; Phuong, Jimmy; McHale, Cliona M.; Zhang, Luoping

2012-01-01

We have applied bioinformatic approaches to identify pathways common to chemical leukemogens and to determine whether leukemogens could be distinguished from non-leukemogenic carcinogens. From all known and probable carcinogens classified by IARC and NTP, we identified 35 carcinogens that were associated with leukemia risk in human studies and 16 non-leukemogenic carcinogens. Using data on gene/protein targets available in the Comparative Toxicogenomics Database (CTD) for 29 of the leukemogens and 11 of the non-leukemogenic carcinogens, we analyzed for enrichment of all 250 human biochemical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The top pathways targeted by the leukemogens included metabolism of xenobiotics by cytochrome P450, glutathione metabolism, neurotrophin signaling pathway, apoptosis, MAPK signaling, Toll-like receptor signaling and various cancer pathways. The 29 leukemogens formed 18 distinct clusters comprising 1 to 3 chemicals that did not correlate with known mechanism of action or with structural similarity as determined by 2D Tanimoto coefficients in the PubChem database. Unsupervised clustering and one-class support vector machines, based on the pathway data, were unable to distinguish the 29 leukemogens from 11 non-leukemogenic known and probable IARC carcinogens. However, using two-class random forests to estimate leukemogen and non-leukemogen patterns, we estimated a 76% chance of distinguishing a random leukemogen/non-leukemogen pair from each other. PMID:22851955
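The 2D Tanimoto coefficient named in this abstract can be sketched for binary fingerprints as the number of shared "on" bits divided by the number of bits set in either fingerprint. The fingerprints below are hypothetical sets of bit positions, not real PubChem fingerprints, and the function name is an illustrative choice.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as collections of 'on' bit positions."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)

# Hypothetical fingerprints, for illustration only.
chem_x = {1, 2, 3, 4}
chem_y = {3, 4, 5, 6}

print(tanimoto(chem_x, chem_y))
```

Values near 1 indicate structurally similar chemicals under the chosen fingerprint, and values near 0 indicate little shared substructure.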
Differences in gene expression profiles and signaling pathways in rhabdomyolysis-induced acute kidney injury

PubMed Central

Geng, Xiaodong; Wang, Yuanda; Hong, Quan; Yang, Jurong; Zheng, Wei; Zhang, Gang; Cai, Guangyan; Chen, Xiangmei; Wu, Di

2015-01-01

Purpose: Rhabdomyolysis is a threatening syndrome because it causes the breakdown of skeletal muscle. Muscle destruction leads to the release of myoglobin, intracellular proteins, and electrolytes into the circulation. The aim of this study was to investigate the differences in gene expression profiles and signaling pathways upon rhabdomyolysis-induced acute kidney injury (AKI). Methods: In this study, we used glycerol-induced renal injury as a model of rhabdomyolysis-induced AKI. We analyzed data and relevant information from the Gene Expression Omnibus database (No: GSE44925). The gene expression data for three untreated mice were compared to data for five mice with rhabdomyolysis-induced AKI. The expression profiling of the three untreated mice and the five rhabdomyolysis-induced AKI mice was performed using microarray analysis. We examined the levels of Cyp3a13, Rela, Aldh7a1, Jun, CD14, and Cdkn1a using RT-PCR to determine the accuracy of the microarray results. Results: The microarray analysis showed that there were 1050 downregulated and 659 upregulated genes in the rhabdomyolysis-induced AKI mice compared to the control group. The interactions of all differentially expressed genes in the Signal-Net were analyzed. Cyp3a13 and Rela had the most interactions with other genes. The data showed that Rela and Aldh7a1 were the key nodes and had important positions in the Signal-Net. The genes Jun, CD14, and Cdkn1a were also significantly upregulated. The pathway analysis classified the differentially expressed genes into 71 downregulated and 48 upregulated pathways including the PI3K/Akt, MAPK, and NF-κB signaling pathways. Conclusion: The results of this study indicate that the NF-κB, MAPK, PI3K/Akt, and apoptotic pathways are regulated in rhabdomyolysis-induced AKI. PMID:26823722
Exploring the cross talk between ER stress and inflammation in age-related macular degeneration.

PubMed

Kheitan, Samira; Minuchehr, Zarrin; Soheili, Zahra-Soheila

2017-01-01

Increasing evidence demonstrates that inflammation and endoplasmic reticulum (ER) stress are implicated in the development and progression of age-related macular degeneration (AMD), a multifactorial neurodegenerative disease. However, the cross talk between these cellular mechanisms has not been clearly and fully understood. The present study investigates a possible intersection between ER stress and inflammation in AMD. In this study, we recruited two collections of involved protein markers to retrieve their interaction information from IMEx-curated databases, which are the most well-known protein-protein interaction collections, allowing us to design an intersection network for AMD that is unprecedented. In order to find expression activated subnetworks, we utilized AMD expression profiles in our network. In addition, we studied topological characteristics of the most expressed active subnetworks to identify the hubs. With regard to topological quantifications and expressional activity, we reported a list of the most pivotal hubs which are potentially applicable as probable therapeutic targets. Furthermore, we introduced MAPK signaling pathway as a significantly involved pathway in the association between ER stress and inflammation, leading to promising new directions in discovering AMD formation mechanisms and possible treatments.
Exploring the cross talk between ER stress and inflammation in age-related macular degeneration

PubMed Central

Kheitan, Samira; Soheili, Zahra-Soheila

2017-01-01

Increasing evidence demonstrates that inflammation and endoplasmic reticulum (ER) stress are implicated in the development and progression of age-related macular degeneration (AMD), a multifactorial neurodegenerative disease. However, the cross talk between these cellular mechanisms has not been clearly and fully understood. The present study investigates a possible intersection between ER stress and inflammation in AMD. In this study, we recruited two collections of involved protein markers to retrieve their interaction information from IMEx-curated databases, which are the most well-known protein-protein interaction collections, allowing us to design an intersection network for AMD that is unprecedented. In order to find expression activated subnetworks, we utilized AMD expression profiles in our network. In addition, we studied topological characteristics of the most expressed active subnetworks to identify the hubs. With regard to topological quantifications and expressional activity, we reported a list of the most pivotal hubs which are potentially applicable as probable therapeutic targets. Furthermore, we introduced MAPK signaling pathway as a significantly involved pathway in the association between ER stress and inflammation, leading to promising new directions in discovering AMD formation mechanisms and possible treatments. PMID:28742151
SuperTarget and Matador: resources for exploring drug-target relationships.

PubMed

Günther, Stefan; Kuhn, Michael; Dunkel, Mathias; Campillos, Monica; Senger, Christian; Petsalaki, Evangelia; Ahmed, Jessica; Urdiales, Eduardo Garcia; Gewiess, Andreas; Jensen, Lars Juhl; Schneider, Reinhard; Skoblo, Roman; Russell, Robert B; Bourne, Philip E; Bork, Peer; Preissner, Robert

2008-01-01

The molecular basis of drug action is often not well understood. This is partly because the very abundant and diverse information generated in the past decades on drugs is hidden in millions of medical articles or textbooks. Therefore, we developed a one-stop data warehouse, SuperTarget, that integrates drug-related information about medical indication areas, adverse drug effects, drug metabolization, pathways and Gene Ontology terms of the target proteins. An easy-to-use query interface enables the user to pose complex queries, for example to find drugs that target a certain pathway, interacting drugs that are metabolized by the same cytochrome P450 or drugs that target the same protein but are metabolized by different enzymes. Furthermore, we provide tools for 2D drug screening and sequence comparison of the targets. The database contains more than 2500 target proteins, which are annotated with about 7300 relations to 1500 drugs; the vast majority of entries have pointers to the respective literature source. A subset of these drugs has been annotated with additional binding information and indirect interactions and is available as a separate resource called Matador. SuperTarget and Matador are available at http://insilico.charite.de/supertarget and http://matador.embl.de.

Efficient exploration of pan-cancer networks by generalized covariance selection and interactive web content

PubMed Central

Kling, Teresia; Johansson, Patrik; Sanchez, José; Marinescu, Voichita D.; Jörnsten, Rebecka; Nelander, Sven

2015-01-01

Statistical network modeling techniques are increasingly important tools to analyze cancer genomics data. However, current tools and resources are not designed to work across multiple diagnoses and technical platforms, thus limiting their applicability to comprehensive pan-cancer datasets such as The Cancer Genome Atlas (TCGA). To address this, we describe a new data driven modeling method, based on generalized Sparse Inverse Covariance Selection (SICS). The method integrates genetic, epigenetic and transcriptional data from multiple cancers, to define links that are present in multiple cancers, a subset of cancers, or a single cancer. It is shown to be statistically robust and effective at detecting direct pathway links in data from TCGA. To facilitate interpretation of the results, we introduce a publicly accessible tool (cancerlandscapes.org), in which the derived networks are explored as interactive web content, linked to several pathway and pharmacological databases. To evaluate the performance of the method, we constructed a model for eight TCGA cancers, using data from 3900 patients. The model rediscovered known mechanisms and contained interesting predictions. Possible applications include prediction of regulatory relationships, comparison of network modules across multiple forms of cancer and identification of drug targets. PMID:25953855
Identification of key target genes and pathways in laryngeal carcinoma

PubMed Central

Liu, Feng; Du, Jintao; Liu, Jun; Wen, Bei

2016-01-01

The purpose of the present study was to screen the key genes associated with laryngeal carcinoma and to investigate the molecular mechanism of laryngeal carcinoma progression. The gene expression profile of GSE10935 [Gene Expression Omnibus (GEO) accession number], including 12 specimens from laryngeal papillomas and 12 specimens from normal laryngeal epithelia controls, was downloaded from the GEO database. Differentially expressed genes (DEGs) were screened in laryngeal papillomas compared with normal controls using Limma package in R language, followed by Gene Ontology (GO) enrichment analysis and pathway enrichment analysis. Furthermore, the protein-protein interaction (PPI) network of DEGs was constructed using Cytoscape software and modules were analyzed using MCODE plugin from the PPI network. Furthermore, significant biological pathway regions (sub-pathway) were identified by using iSubpathwayMiner analysis. A total of 67 DEGs were identified, including 27 up-regulated genes and 40 down-regulated genes and they were involved in different GO terms and pathways. PPI network analysis revealed that Ras association (RalGDS/AF-6) domain family member 1 (RASSF1) was a hub protein. The sub-pathway analysis identified 9 significantly enriched sub-pathways, including glycolysis/gluconeogenesis and nitrogen metabolism. Genes such as phosphoglycerate kinase 1 (PGK1), carbonic anhydrase II (CA2), and carbonic anhydrase XII (CA12) whose node degrees were >10 were identified in the disease risk sub-pathway. Genes in the sub-pathway, such as RASSF1, PGK1, CA2 and CA12 were presumed to serve critical roles in laryngeal carcinoma. The present study identified DEGs and their sub-pathways in the disease, which may serve as potential targets for treatment of laryngeal carcinoma. PMID:27446427
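The hub-identification step in this abstract (nodes whose degree exceeds a cutoff in a protein-protein interaction network) can be sketched as follows. This is a minimal illustration on an undirected edge list, not the Cytoscape/MCODE workflow the authors used; the function names and the `min_degree` parameter are hypothetical.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(partners) for node, partners in neighbors.items()}


def hubs(edges, min_degree):
    """Return nodes with degree > min_degree, highest degree first."""
    degrees = node_degrees(edges)
    selected = [node for node, degree in degrees.items() if degree > min_degree]
    return sorted(selected, key=lambda node: (-degrees[node], node))


# Toy network with placeholder node names (not real interaction data).
toy_edges = [("hub", "n1"), ("hub", "n2"), ("hub", "n3"), ("n1", "n2")]
print(hubs(toy_edges, min_degree=2))
```

With a real network, the abstract's criterion would correspond to `min_degree=10`. Duplicate edges are counted once because partners are stored in sets.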
From 20th century metabolic wall charts to 21st century systems biology: database of mammalian metabolic enzymes

PubMed Central

Corcoran, Callan C.; Grady, Cameron R.; Pisitkun, Trairak; Parulekar, Jaya

2017-01-01

The organization of the mammalian genome into gene subsets corresponding to specific functional classes has provided key tools for systems biology research. Here, we have created a web-accessible resource called the Mammalian Metabolic Enzyme Database (https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/MetabolicEnzymeDatabase.html) keyed to the biochemical reactions represented on iconic metabolic pathway wall charts created in the previous century. Overall, we have mapped 1,647 genes to these pathways, representing ~7 percent of the protein-coding genome. To illustrate the use of the database, we apply it to the area of kidney physiology. In so doing, we have created an additional database (Database of Metabolic Enzymes in Kidney Tubule Segments: https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/), mapping mRNA abundance measurements (mined from RNA-Seq studies) for all metabolic enzymes to each of 14 renal tubule segments. We carry out bioinformatics analysis of the enzyme expression pattern among renal tubule segments and mine various data sources to identify vasopressin-regulated metabolic enzymes in the renal collecting duct. PMID:27974320
LeishCyc: a guide to building a metabolic pathway database and visualization of metabolomic data.

PubMed

Saunders, Eleanor C; MacRae, James I; Naderer, Thomas; Ng, Milica; McConville, Malcolm J; Likić, Vladimir A

2012-01-01

The complexity of the metabolic networks in even the simplest organisms has raised new challenges in organizing metabolic information. To address this, specialized computer frameworks have been developed to capture, manage, and visualize metabolic knowledge. The leading databases of metabolic information are those organized under the umbrella of the BioCyc project, which consists of the reference database MetaCyc, and a number of pathway/genome databases (PGDBs) each focussed on a specific organism. A number of PGDBs have been developed for bacterial, fungal, and protozoan pathogens, greatly facilitating dissection of the metabolic potential of these organisms and the identification of new drug targets. Leishmania are protozoan parasites belonging to the family Trypanosomatidae that cause a broad spectrum of diseases in humans. In this work we use the LeishCyc database, the BioCyc database for Leishmania major, to describe how to build a BioCyc database from genomic sequences and associated annotations. By using metabolomic data generated in our group, we show how such databases can be utilized to elucidate specific changes in parasite metabolism.
HuMiChip: Development of a Functional Gene Array for the Study of Human Microbiomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tu, Q.; Deng, Ye; Lin, Lu

Microbiomes play very important roles in terms of nutrition, health and disease by interacting with their hosts. Based on sequence data currently available in public domains, we have developed a functional gene array to monitor both organismal and functional gene profiles of normal microbiota in human and mouse hosts, and such an array is called human and mouse microbiota array, HMM-Chip. First, seed sequences were identified from KEGG databases, and used to construct a seed database (seedDB) containing 136 gene families in 19 metabolic pathways closely related to human and mouse microbiomes. Second, a mother database (motherDB) was constructed with 81 genomes of bacterial strains with 54 from gut and 27 from oral environments, and 16 metagenomes, and used for selection of genes and probe design. Gene prediction was performed by Glimmer3 for bacterial genomes, and by the Metagene program for metagenomes. In total, 228,240 and 801,599 genes were identified for bacterial genomes and metagenomes, respectively. Then the motherDB was searched against the seedDB using the HMMer program, and gene sequences in the motherDB that were highly homologous with seed sequences in the seedDB were used for probe design by the CommOligo software. Different degrees of specific probes, including gene-specific, inclusive and exclusive group-specific probes were selected. All candidate probes were checked against the motherDB and NCBI databases for specificity. Finally, 7,763 probes covering 91.2 percent (12,601 out of 13,814) HMMer confirmed sequences from 75 bacterial genomes and 16 metagenomes were selected. This developed HMM-Chip is able to detect the diversity and abundance of functional genes, the gene expression of microbial communities, and potentially, the interactions of microorganisms and their hosts.
FragariaCyc: A Metabolic Pathway Database for Woodland Strawberry Fragaria vesca

PubMed Central

Naithani, Sushma; Partipilo, Christina M.; Raja, Rajani; Elser, Justin L.; Jaiswal, Pankaj

2016-01-01

FragariaCyc is a strawberry-specific cellular metabolic network based on the annotated genome sequence of Fragaria vesca L. ssp. vesca, accession Hawaii 4. It was built on the Pathway-Tools platform using MetaCyc as the reference. The experimental evidences from published literature were used for supporting/editing existing entities and for the addition of new pathways, enzymes, reactions, compounds, and small molecules in the database. To date, FragariaCyc comprises 66 super-pathways, 488 unique pathways, 2348 metabolic reactions, 3507 enzymes, and 2134 compounds. In addition to searching and browsing FragariaCyc, researchers can compare pathways across various plant metabolic networks and analyze their data using Omics Viewer tool. We view FragariaCyc as a resource for the community of researchers working with strawberry and related fruit crops. It can help understanding the regulation of overall metabolism of strawberry plant during development and in response to diseases and abiotic stresses. FragariaCyc is available online at http://pathways.cgrb.oregonstate.edu. PMID:26973684
KEGGtranslator: visualizing and converting the KEGG PATHWAY database to various formats.

PubMed

Wrzodek, Clemens; Dräger, Andreas; Zell, Andreas

2011-08-15

The KEGG PATHWAY database provides a widely used service for metabolic and nonmetabolic pathways. It contains manually drawn pathway maps with information about the genes, reactions and relations contained therein. To store these pathways, KEGG uses KGML, a proprietary XML-format. Parsers and translators are needed to process the pathway maps for usage in other applications and algorithms. We have developed KEGGtranslator, an easy-to-use stand-alone application that can visualize and convert KGML formatted XML-files into multiple output formats. Unlike other translators, KEGGtranslator supports a plethora of output formats, is able to augment the information in translated documents (e.g. MIRIAM annotations) beyond the scope of the KGML document, and amends missing components to fragmentary reactions within the pathway to allow simulations on those. KEGGtranslator is freely available as a Java™ Web Start application and for download at http://www.cogsys.cs.uni-tuebingen.de/software/KEGGtranslator/. KGML files can be downloaded from within the application. Contact: clemens.wrzodek@uni-tuebingen.de. Supplementary data are available at Bioinformatics online.
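To make the parsing problem concrete, the sketch below reads a KGML-style document with Python's standard XML parser and turns its relations into an edge list. It is not KEGGtranslator (a Java application) and covers only the `entry`, `relation` and `subtype` elements; the sample document, its `demo:` identifiers and the function name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written toy document in KGML style; identifiers are placeholders.
SAMPLE_KGML = """<pathway name="path:demo00001" org="demo" number="00001" title="Toy pathway">
  <entry id="1" name="demo:geneA" type="gene"/>
  <entry id="2" name="demo:geneB" type="gene"/>
  <relation entry1="1" entry2="2" type="PPrel">
    <subtype name="activation" value="--&gt;"/>
  </relation>
</pathway>"""


def kgml_to_edges(kgml_text):
    """Return (source name, target name, relation type, subtype names) tuples."""
    root = ET.fromstring(kgml_text)
    # Map the numeric entry ids used by relations to the entry names.
    names = {entry.get("id"): entry.get("name") for entry in root.iter("entry")}
    edges = []
    for relation in root.iter("relation"):
        subtypes = [s.get("name") for s in relation.findall("subtype")]
        edges.append(
            (
                names[relation.get("entry1")],
                names[relation.get("entry2")],
                relation.get("type"),
                subtypes,
            )
        )
    return edges


print(kgml_to_edges(SAMPLE_KGML))
```

A full converter would also have to handle reactions, graphics and entries that group several genes, which is part of what the abstract says dedicated translators are needed for.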
Consensus and conflict cards for metabolic pathway databases

PubMed Central

2013-01-01

Background: The metabolic network of H. sapiens and many other organisms is described in multiple pathway databases. The level of agreement between these descriptions, however, has proven to be low. We can use these different descriptions to our advantage by identifying conflicting information and combining their knowledge into a single, more accurate, and more complete description. This task is, however, far from trivial. Results: We introduce the concept of Consensus and Conflict Cards (C2Cards) to provide concise overviews of what the databases do or do not agree on. Each card is centered at a single gene, EC number or reaction. These three complementary perspectives make it possible to distinguish disagreements on the underlying biology of a metabolic process from differences that can be explained by different decisions on how and in what detail to represent knowledge. As a proof-of-concept, we implemented C2CardsHuman, as a web application http://www.molgenis.org/c2cards, covering five human pathway databases. Conclusions: C2Cards can contribute to ongoing reconciliation efforts by simplifying the identification of consensus and conflicts between pathway databases and lowering the threshold for experts to contribute. Several case studies illustrate the potential of the C2Cards in identifying disagreements on the underlying biology of a metabolic process. The overviews may also point out controversial biological knowledge that should be subject of further research. Finally, the examples provided emphasize the importance of manual curation and the need for a broad community involvement. PMID:23803311
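The core idea of a card (what the databases agree on for one gene, and where they differ) can be sketched with set operations. This is an illustration of the concept only, not the C2Cards implementation; the function name, the database labels and the EC numbers in the example are placeholders.

```python
def consensus_conflict(annotations):
    """Split per-database annotations for one gene into consensus and conflict.

    annotations maps a database label to the set of items (for example EC
    numbers) that the database assigns to the gene. Returns the items found
    in every database, plus a mapping from each remaining item to the sorted
    list of databases that do report it.
    """
    all_sets = list(annotations.values())
    consensus = set.intersection(*all_sets) if all_sets else set()
    union = set().union(*all_sets)
    conflict = {
        item: sorted(db for db, items in annotations.items() if item in items)
        for item in sorted(union - consensus)
    }
    return consensus, conflict


# Placeholder annotations for a single hypothetical gene.
example = {
    "db_A": {"EC 1.1.1.1", "EC 2.2.2.2"},
    "db_B": {"EC 1.1.1.1"},
    "db_C": {"EC 1.1.1.1", "EC 3.3.3.3"},
}
consensus, conflict = consensus_conflict(example)
print(consensus, conflict)
```

As the abstract notes, a set difference alone cannot tell a genuine biological disagreement from a difference in representation, so the conflict list is a starting point for manual curation rather than a verdict.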
Consensus and conflict cards for metabolic pathway databases.

PubMed

Stobbe, Miranda D; Swertz, Morris A; Thiele, Ines; Rengaw, Trebor; van Kampen, Antoine H C; Moerland, Perry D

2013-06-26

The metabolic network of H. sapiens and many other organisms is described in multiple pathway databases. The level of agreement between these descriptions, however, has proven to be low. We can use these different descriptions to our advantage by identifying conflicting information and combining their knowledge into a single, more accurate, and more complete description. This task is, however, far from trivial. We introduce the concept of Consensus and Conflict Cards (C₂Cards) to provide concise overviews of what the databases do or do not agree on. Each card is centered at a single gene, EC number or reaction. These three complementary perspectives make it possible to distinguish disagreements on the underlying biology of a metabolic process from differences that can be explained by different decisions on how and in what detail to represent knowledge. As a proof-of-concept, we implemented C₂Cards(Human), as a web application http://www.molgenis.org/c2cards, covering five human pathway databases. C₂Cards can contribute to ongoing reconciliation efforts by simplifying the identification of consensus and conflicts between pathway databases and lowering the threshold for experts to contribute. Several case studies illustrate the potential of the C₂Cards in identifying disagreements on the underlying biology of a metabolic process. The overviews may also point out controversial biological knowledge that should be subject of further research. Finally, the examples provided emphasize the importance of manual curation and the need for a broad community involvement.
Prediction of Binding Energy of Keap1 Interaction Motifs in the Nrf2 Antioxidant Pathway and Design of Potential High-Affinity Peptides.

PubMed

Karttunen, Mikko; Choy, Wing-Yiu; Cino, Elio A

2018-06-07

Nuclear factor erythroid 2-related factor 2 (Nrf2) is a transcription factor and principal regulator of the antioxidant pathway. The Kelch domain of Kelch-like ECH-associated protein 1 (Keap1) binds to motifs in the N-terminal region of Nrf2, promoting its degradation. There is interest in developing ligands that can compete with Nrf2 for binding to Kelch, thereby activating its transcriptional activities and increasing antioxidant levels. Using experimental ΔG_bind values of Kelch-binding motifs determined previously, a revised hydrophobicity-based model was developed for estimating ΔG_bind from amino acid sequence and applied to rank potential uncharacterized Kelch-binding motifs identified from interaction databases and BLAST searches. Model predictions and molecular dynamics (MD) simulations suggested that full-length MAD2A binds Kelch more favorably than a high-affinity 20-mer Nrf2 E78P peptide, but that the motif in isolation is not a particularly strong binder. Endeavoring to develop shorter peptides for activating Nrf2, new designs were created based on the E78P peptide, some of which showed considerable propensity to form binding-competent structures in MD, and were predicted to interact with Kelch more favorably than the E78P peptide. The peptides could be promising new ligands for enhancing the oxidative stress response.
BiKEGG: a COBRA toolbox extension for bridging the BiGG and KEGG databases.

PubMed

Jamialahmadi, Oveis; Motamedian, Ehsan; Hashemi-Najafabadi, Sameereh

2016-10-18

Development of an interface tool between the Biochemical, Genetic and Genomic (BiGG) and KEGG databases is necessary for simultaneous access to the features of both databases. For this purpose, we present the BiKEGG toolbox, an open source COBRA toolbox extension providing a set of functions to infer the reaction correspondences between the KEGG reaction identifiers and those in the BiGG knowledgebase using a combination of manual verification and computational methods. Inferred reaction correspondences using this approach are supported by evidence from the literature, which provides a higher number of reconciled reactions between these two databases compared to the MetaNetX and MetRxn databases. This set of equivalent reactions is then used to automatically superimpose the predicted fluxes using COBRA methods on classical KEGG pathway maps or to create a customized metabolic map based on the KEGG global metabolic pathway, and to find the corresponding reactions in BiGG based on the genome annotation of an organism in the KEGG database. Customized metabolic maps can be created for a set of pathways of interest, for the whole KEGG global map or exclusively for all pathways for which there exists at least one flux carrying reaction. This flexibility in visualization enables BiKEGG to indicate reaction directionality as well as to visualize the reaction fluxes for different static or dynamic conditions in an animated manner. BiKEGG allows the user to export (1) the output visualized metabolic maps to various standard image formats or save them as a video or animated GIF file, and (2) the equivalent reactions for an organism as an Excel spreadsheet.
Analysis of differentially co-expressed genes based on microarray data of hepatocellular carcinoma.

PubMed

Wang, Y; Jiang, T; Li, Z; Lu, L; Zhang, R; Zhang, D; Wang, X; Tan, J

2017-01-01

Hepatocellular carcinoma (HCC) is the third leading cause of cancer related death worldwide. Although great progress in diagnosis and management of HCC has been made, the exact molecular mechanisms remain poorly understood. The study aims to identify potential biomarkers for HCC progression, mainly at transcription level. In this study, chip data GSE29721 was utilized, which contains 10 HCC samples and 10 normal adjacent tissue samples. Differentially expressed genes (DEGs) between two sample types were selected by t-test method. Subsequently, the differentially co-expressed genes (DCGs) and differentially co-expressed links (DCLs) were identified by DCGL package in R with the threshold of q < 0.25. Afterwards, pathway enrichment analysis of the DCGs was carried out by DAVID. Then, DCLs were mapped to TRANSFAC database to reveal associations between relevant transcriptional factors (TFs) and their target genes. Quantitative real-time RT-PCR was performed for TFs or genes of interest. As a result, a total of 388 DCGs and 35,771 DCLs were obtained. The predominant pathways enriched by these genes were Cytokine-cytokine receptor interaction, ECM-receptor interaction and TGF-β signaling pathway. Three TF-target interactions, LEF1-NCAM1, EGR1-FN1 and FOS-MT2A were predicted. Compared with control, expressions of the TF genes EGR1, FOS and ETS2 were all up-regulated in the HCC cell line, HepG2; while LEF1 was down-regulated. Except NCAM1, all the target genes were up-regulated in HepG2. Our findings suggest these TFs and genes might play important roles in the pathogenesis of HCC and may be used as therapeutic targets for HCC management.
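The "q < 0.25" threshold in this abstract refers to multiple-testing-adjusted p-values. The abstract does not say how the q-values were computed; the sketch below assumes the widely used Benjamini-Hochberg adjustment and is not the DCGL package's code. The function name is hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over this rank and all higher ranks (enforces monotonicity).
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues


# Made-up p-values for illustration only.
toy_pvalues = [0.01, 0.04, 0.03, 0.5]
selected = [i for i, q in enumerate(bh_qvalues(toy_pvalues)) if q < 0.25]
print(selected)
```

A cutoff of 0.25 is permissive: under the Benjamini-Hochberg interpretation, up to about a quarter of the selected items may be false discoveries, which is worth keeping in mind when reading the gene and link counts reported above.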
The care pathway: concepts and theories: an introduction.

PubMed

Schrijvers, Guus; van Hoorn, Arjan; Huiskes, Nicolette

2012-01-01

This article addresses first the definition of a (care) pathway, and then follows a description of theories since the 1950s. It ends with a discussion of theoretical advantages and disadvantages of care pathways for patients and professionals. The objective of this paper is to provide a theoretical base for empirical studies on care pathways. The knowledge for this chapter is based on several books on pathways, which we found by searching in the digital encyclopedia Wikipedia. Although this is not usual in scientific publications, this method was used because books are not searchable in databases such as PubMed. From 2005, we performed a literature search on PubMed and other literature databases with the keywords integrated care pathway, clinical pathway, critical pathway, theory, research, and evaluation. One of the inspirational sources was the website of the European Pathway Association (EPA) and its journal International Journal of Care Pathways. The authors visited several sites for this paper. These are mentioned as illustration of a concept or theory. Most of them have English websites with more information. The URLs of these websites are not mentioned in this paper as a reference, because the content of them changes fast, sometimes every day.
The care pathway: concepts and theories: an introduction

PubMed Central

Schrijvers, Guus; van Hoorn, Arjan; Huiskes, Nicolette

2012-01-01

This article addresses first the definition of a (care) pathway, and then follows a description of theories since the 1950s. It ends with a discussion of theoretical advantages and disadvantages of care pathways for patients and professionals. The objective of this paper is to provide a theoretical base for empirical studies on care pathways. The knowledge for this chapter is based on several books on pathways, which we found by searching in the digital encyclopedia Wikipedia. Although this is not usual in scientific publications, this method was used because books are not searchable in databases such as PubMed. From 2005, we performed a literature search on PubMed and other literature databases with the keywords integrated care pathway, clinical pathway, critical pathway, theory, research, and evaluation. One of the inspirational sources was the website of the European Pathway Association (EPA) and its journal International Journal of Care Pathways. The authors visited several sites for this paper. These are mentioned as illustration of a concept or theory. Most of them have English websites with more information. The URLs of these websites are not mentioned in this paper as a reference, because the content of them changes fast, sometimes every day. PMID:23593066
The HUPO PSI's molecular interaction format--a community standard for the representation of protein interaction data.

PubMed

Hermjakob, Henning; Montecchi-Palazzi, Luisa; Bader, Gary; Wojcik, Jérôme; Salwinski, Lukasz; Ceol, Arnaud; Moore, Susan; Orchard, Sandra; Sarkans, Ugis; von Mering, Christian; Roechert, Bernd; Poux, Sylvain; Jung, Eva; Mersch, Henning; Kersey, Paul; Lappe, Michael; Li, Yixue; Zeng, Rong; Rana, Debashis; Nikolski, Macha; Husi, Holger; Brun, Christine; Shanker, K; Grant, Seth G N; Sander, Chris; Bork, Peer; Zhu, Weimin; Pandey, Akhilesh; Brazma, Alvis; Jacq, Bernard; Vidal, Marc; Sherman, David; Legrain, Pierre; Cesareni, Gianni; Xenarios, Ioannis; Eisenberg, David; Steipe, Boris; Hogue, Chris; Apweiler, Rolf

2004-02-01

A major goal of proteomics is the complete description of the protein interaction network underlying cell physiology. A large number of small scale and, more recently, large-scale experiments have contributed to expanding our understanding of the nature of the interaction network. However, the necessary data integration across experiments is currently hampered by the fragmentation of publicly available protein interaction data, which exists in different formats in databases, on authors' websites or sometimes only in print publications. Here, we propose a community standard data model for the representation and exchange of protein interaction data. This data model has been jointly developed by members of the Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organization (HUPO), and is supported by major protein interaction data providers, in particular the Biomolecular Interaction Network Database (BIND), Cellzome (Heidelberg, Germany), the Database of Interacting Proteins (DIP), Dana Farber Cancer Institute (Boston, MA, USA), the Human Protein Reference Database (HPRD), Hybrigenics (Paris, France), the European Bioinformatics Institute's (EMBL-EBI, Hinxton, UK) IntAct, the Molecular Interactions (MINT, Rome, Italy) database, the Protein-Protein Interaction Database (PPID, Edinburgh, UK) and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, EMBL, Heidelberg, Germany).
A geographically-diverse collection of 418 human gut microbiome pathway genome databases

PubMed Central

Hahn, Aria S.; Altman, Tomer; Konwar, Kishori M.; Hanson, Niels W.; Kim, Dongjae; Relman, David A.; Dill, David L.; Hallam, Steven J.

2017-01-01

Advances in high-throughput sequencing are reshaping how we perceive microbial communities inhabiting the human body, with implications for therapeutic interventions. Several large-scale datasets derived from hundreds of human microbiome samples sourced from multiple studies are now publicly available. However, idiosyncratic data processing methods between studies introduce systematic differences that confound comparative analyses. To overcome these challenges, we developed GutCyc, a compendium of environmental pathway genome databases (ePGDBs) constructed from 418 assembled human microbiome datasets using MetaPathways, enabling reproducible functional metagenomic annotation. We also generated metabolic network reconstructions for each metagenome using the Pathway Tools software, empowering researchers and clinicians interested in visualizing and interpreting metabolic pathways encoded by the human gut microbiome. For the first time, GutCyc provides consistent annotations and metabolic pathway predictions, making possible comparative community analyses between health and disease states in inflammatory bowel disease, Crohn’s disease, and type 2 diabetes. GutCyc data products are searchable online, or may be downloaded and explored locally using MetaPathways and Pathway Tools. PMID:28398290
Expression profiling indicating low selenium-sensitive microRNA levels linked to cell cycle and cell stress response pathways in the CaCo-2 cell line.

PubMed

McCann, Mark J; Rotjanapun, Kunjana; Hesketh, John E; Roy, Nicole C

2017-05-01

Se is an essential micronutrient for human health, and fluctuations in Se levels and the potential cellular dysfunction associated with it may increase the risk for disease. Although Se has been shown to influence several biological pathways important in health, little is known about the effect of Se on the expression of microRNA (miRNA) molecules regulating these pathways. To explore the potential role of Se-sensitive miRNA in regulating pathways linked with colon cancer, we profiled the expression of 800 miRNA in the CaCo-2 human adenocarcinoma cell line in response to a low-Se (72 h at <40 nm) environment using nCounter direct quantification. These data were then examined using a range of in silico databases to identify experimentally validated miRNA-mRNA interactions and the biological pathways involved. We identified ten Se-sensitive miRNA (hsa-miR-93-5p, hsa-miR-106a-5p, hsa-miR-205-5p, hsa-miR-200c-3p, hsa-miR-99b-5p, hsa-miR-302d-3p, hsa-miR-373-3p, hsa-miR-483-3p, hsa-miR-512-5p and hsa-miR-4454), which regulate 3588 mRNA in key pathways such as the cell cycle, the cellular response to stress, and the canonical Wnt/β-catenin, p53 and ERK/MAPK signalling pathways. Our data show that the effects of low Se on biological pathways may, in part, be due to these ten Se-sensitive miRNA. Dysregulation of the cell cycle and of the stress response pathways due to low Se may influence key genes involved in carcinogenesis.
Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application

PubMed Central

Cantor, Rita M.; Lange, Kenneth; Sinsheimer, Janet S.

2010-01-01

Genome-wide association studies (GWAS) have rapidly become a standard method for disease gene discovery. A substantial number of recent GWAS indicate that for most disorders, only a few common variants are implicated and the associated SNPs explain only a small fraction of the genetic risk. This review is written from the viewpoint that findings from the GWAS provide preliminary genetic information that is available for additional analysis by statistical procedures that accumulate evidence, and that these secondary analyses are very likely to provide valuable information that will help prioritize the strongest constellations of results. We review and discuss three analytic methods to combine preliminary GWAS statistics to identify genes, alleles, and pathways for deeper investigations. Meta-analysis seeks to pool information from multiple GWAS to increase the chances of finding true positives among the false positives and provides a way to combine associations across GWAS, even when the original data are unavailable. Testing for epistasis within a single GWAS study can identify the stronger results that are revealed when genes interact. Pathway analysis of GWAS results is used to prioritize genes and pathways within a biological context. Following a GWAS, association results can be assigned to pathways and tested in aggregate with computational tools and pathway databases. Reviews of published methods with recommendations for their application are provided within the framework for each approach. PMID:20074509
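As a rough illustration of how summary statistics can be pooled across GWAS when the original data are unavailable, the sketch below applies fixed-effect inverse-variance weighting to one SNP. This is one common meta-analysis scheme and is not necessarily the one the review recommends; the function name and the effect sizes and standard errors are hypothetical.

```python
import math

def fixed_effect_meta(betas, ses):
    """Pool per-study effect estimates using inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value

# Hypothetical log-odds-ratio estimates for a single SNP in three studies.
betas = [0.10, 0.14, 0.08]
ses = [0.05, 0.07, 0.04]
beta, se, z, p = fixed_effect_meta(betas, ses)
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p={p:.3g}")
```

The pooled standard error is smaller than that of any single study, which is the sense in which pooling raises the chance of separating true positives from false positives.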
Cancer-related marketing centrality motifs acting as pivot units in the human signaling network and mediating cross-talk between biological pathways.

PubMed

Li, Wan; Chen, Lina; Li, Xia; Jia, Xu; Feng, Chenchen; Zhang, Liangcai; He, Weiming; Lv, Junjie; He, Yuehan; Li, Weiguo; Qu, Xiaoli; Zhou, Yanyan; Shi, Yuchen

2013-12-01

Network motifs in central positions are considered to not only have more in-coming and out-going connections but are also localized in an area where more paths reach the networks. These central motifs have been extensively investigated to determine their consistent functions or associations with specific function categories. However, their functional potentials in the maintenance of cross-talk between different functional communities are unclear. In this paper, we constructed an integrated human signaling network from the Pathway Interaction Database. We identified 39 essential cancer-related motifs in central roles, which we called cancer-related marketing centrality motifs, using combined centrality indices on the system level. Our results demonstrated that these cancer-related marketing centrality motifs were pivotal units in the signaling network, and could mediate cross-talk between 61 biological pathways (25 could be mediated by one motif on average), most of which were cancer-related pathways. Further analysis showed that molecules of most marketing centrality motifs were in the same or adjacent subcellular localizations, such as the motif containing PI3K, PDK1 and AKT1 in the plasma membrane, to mediate signal transduction between 32 cancer-related pathways. Finally, we analyzed the pivotal roles of cancer genes in these marketing centrality motifs in the pathogenesis of cancers, and found that non-cancer genes were potential cancer-related genes.
Screening key candidate genes and pathways involved in insulinoma by microarray analysis.

PubMed

Zhou, Wuhua; Gong, Li; Li, Xuefeng; Wan, Yunyan; Wang, Xiangfei; Li, Huili; Jiang, Bin

2018-06-01

Insulinoma is a rare type of tumor and its genetic features remain largely unknown. This study aimed to search for potential key genes and relevant enriched pathways of insulinoma. The gene expression data from GSE73338 were downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) were identified between insulinoma tissues and normal pancreas tissues, followed by pathway enrichment analysis, protein-protein interaction (PPI) network construction, and module analysis. The expressions of candidate key genes were validated by quantitative real-time polymerase chain reaction (RT-PCR) in insulinoma tissues. A total of 1632 DEGs were obtained, including 1117 upregulated genes and 514 downregulated genes. Pathway enrichment results showed that upregulated DEGs were significantly implicated in insulin secretion, and downregulated DEGs were mainly enriched in pancreatic secretion. PPI network analysis revealed 7 hub genes with degrees more than 10, including GCG (glucagon), GCGR (glucagon receptor), PLCB1 (phospholipase C, beta 1), CASR (calcium sensing receptor), F2R (coagulation factor II thrombin receptor), GRM1 (glutamate metabotropic receptor 1), and GRM5 (glutamate metabotropic receptor 5). DEGs involved in the significant modules were enriched in calcium signaling pathway, protein ubiquitination, and platelet degranulation. Quantitative RT-PCR data confirmed that the expression trends of these hub genes were similar to the results of bioinformatic analysis. The present study demonstrated that candidate DEGs and enriched pathways were the potential critical molecular events involved in the development of insulinoma, and these findings were useful for better understanding of insulinoma genesis.
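The hub-gene criterion quoted above (degree greater than 10 in the PPI network) can be sketched as a simple degree count over an undirected edge list. The edge list and node names below are hypothetical placeholders, not interactions from the study, and the function name is an assumption for illustration.

```python
from collections import Counter

def hub_genes(edges, min_degree):
    """Return nodes whose degree in an undirected network exceeds min_degree."""
    degree = Counter()
    seen = set()
    for a, b in edges:
        key = frozenset((a, b))
        if a == b or key in seen:
            continue  # ignore self-loops and duplicate (or reversed) edges
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    return {node: d for node, d in degree.items() if d > min_degree}

# Hypothetical toy network: one node linked to 12 partners, plus one
# extra edge that is listed twice in opposite directions.
edges = [("HUB", f"N{i}") for i in range(12)] + [("N0", "N1"), ("N1", "N0")]
hubs = hub_genes(edges, min_degree=10)
print(hubs)
```

Deduplicating reversed pairs matters because interaction exports often list the same undirected edge in both directions, which would otherwise double the apparent degree.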

Defining the Protein–Protein Interaction Network of the Human Hippo Pathway

PubMed Central

Wang, Wenqi; Li, Xu; Huang, Jun; Feng, Lin; Dolinta, Keithlee G.; Chen, Junjie

2014-01-01

The Hippo pathway, which is conserved from Drosophila to mammals, has been recognized as a tumor suppressor signaling pathway governing cell proliferation and apoptosis, two key events involved in organ size control and tumorigenesis. Although several upstream regulators, the conserved kinase cascade and key downstream effectors including nuclear transcriptional factors have been defined, the global organization of this signaling pathway has not been fully understood. Thus, we conducted a proteomic analysis of the human Hippo pathway, which revealed the involvement of an extensive protein–protein interaction network in this pathway. The mass spectrometry data were deposited to ProteomeXchange with identifier PXD000415. Our data suggest that 550 interactions within 343 unique protein components constitute the central protein–protein interaction landscape of the human Hippo pathway. Our study provides a glimpse into the global organization of the Hippo pathway, reveals previously unknown interactions within this pathway, and uncovers new potential components involved in the regulation of this pathway. Understanding these interactions will help us further dissect the Hippo signaling pathway and extend our knowledge of organ size control. PMID:24126142
Visualization and Analysis of MiRNA-Targets Interactions Networks.

PubMed

León, Luis E; Calligaris, Sebastián D

2017-01-01

MicroRNAs are a class of small, noncoding RNA molecules of 21-25 nucleotides in length that regulate the gene expression by base-pairing with the target mRNAs, mainly leading to down-regulation or repression of the target genes. MicroRNAs are involved in diverse regulatory pathways in normal and pathological conditions. In this context, it is highly important to identify the targets of specific microRNA in order to understand the mechanism of its regulation and consequently its involvement in disease. However, the microRNA target identification is experimentally laborious and time-consuming. The in silico prediction of microRNA targets is an extremely useful approach because you can identify potential mRNA targets, reduce the number of possibilities and then, validate a few microRNA-mRNA interactions in an in vitro experimental model. In this chapter, we describe, in a simple way, bioinformatics guidelines to use miRWalk database and Cytoscape software for analyzing microRNA-mRNA interactions through their visualization as a network.
A systematic review of pathways to and processes associated with radicalization and extremism amongst Muslims in Western societies.

PubMed

McGilloway, Angela; Ghosh, Priyo; Bhui, Kamaldeep

2015-02-01

Following the terrorist attacks of 9/11 in the USA and 7/7 in the UK, academic interest in factors involved in radicalization and terrorism has increased dramatically. Many related social and psychological theories have been put forward; however, terrorism literature still lacks empirical research. In particular, little is known about the early processes and pathways to radicalization. Our aim is to investigate original research on pathways and processes associated with radicalization and extremism amongst people of Muslim heritage living in Western societies, that is, the group prioritized by counter-terrorism policy. Studies included in the review were original qualitative or quantitative primary research published in peer-reviewed journals, identified by searching research databases. All disciplines of journals were included. No single cause or pathway was implicated in radicalization and violent extremism. Individuals may demonstrate vulnerabilities that increase exposure to radicalization; however, the only common characteristic determined was that terrorists are generally well-integrated, 'normal' individuals. Engagement in such activity is dependent on a wide range of interacting variables influenced by personal, localized and externalized factors. Further research should examine broader determinants of radicalization in susceptible populations. Future policy should follow this public health approach rather than constructing from perpetrators already committed to engaging in terrorism.
Hybrid drug combination: Combination of ferulic acid and metformin as anti-diabetic therapy.

PubMed

Nankar, Rakesh; Prabhakar, P K; Doble, Mukesh

2017-12-15

Ferulic acid, an anti-oxidant phytochemical present in several dietary components, is known to produce a wide range of pharmacological effects. It is approved for use in the food industry as a preservative and in sports food. Previous reports from our lab have shown synergistic interaction of ferulic acid with metformin in cell lines and diabetic rats. The purpose of this review is to compile information about the anti-diabetic activity of ferulic acid in in vitro and in vivo models, with special emphasis on the activity of ferulic acid when combined with metformin. The mechanism of synergistic interaction between ferulic acid and metformin is also proposed after carefully studying the effects of these compounds on molecules involved in glucose metabolism. Scientific literature for the purpose of this review was collected using online search engines and databases such as ScienceDirect, Scopus, PubMed and Google Scholar. Ferulic acid forms a resonance-stabilized phenoxyl radical which scavenges free radicals and reduces oxidative stress. It improves the glucose and lipid profile in diabetic rats by enhancing the activities of the antioxidant enzymes superoxide dismutase and catalase in the pancreatic tissue. Combining ferulic acid with metformin improves both the in vitro glucose uptake activity and the in vivo hypoglycemic activity of the latter. It is possible to reduce the dose of metformin fourfold (from 50 to 12.5 mg/kg body weight) by combining it with 10 mg of ferulic acid/kg body weight in diabetic rats. Ferulic acid improves glucose uptake through the PI3-K pathway whereas metformin activates the AMPK pathway to improve glucose uptake. The synergistic interaction of ferulic acid and metformin is due to their action on parallel pathways which are involved in glucose uptake. Due to the synergistic nature of their interaction, it is possible to reduce the dose of metformin (by combining with ferulic acid) required to achieve normoglycemia. Since the dose of metformin is reduced, the dose-associated side effects of metformin therapy can be reduced. Copyright © 2017 Elsevier GmbH. All rights reserved.
Identification of key candidate genes and pathways in hepatitis B virus-associated acute liver failure by bioinformatical analysis

PubMed Central

Lin, Huapeng; Zhang, Qian; Li, Xiaocheng; Wu, Yushen; Liu, Ye; Hu, Yingchun

2018-01-01

Hepatitis B virus-associated acute liver failure (HBV-ALF) is a rare but life-threatening syndrome that carries a high morbidity and mortality. Our study aimed to explore the possible molecular mechanisms of HBV-ALF by means of bioinformatics analysis. In this study, gene expression microarray datasets of HBV-ALF from Gene Expression Omnibus were collected, and then we identified differentially expressed genes (DEGs) by the limma package in R. After functional enrichment analysis, we constructed the protein–protein interaction (PPI) network by the Search Tool for the Retrieval of Interacting Genes online database and the weighted gene coexpression network by the WGCNA package in R. Subsequently, we picked out the hub genes among the DEGs. A total of 423 DEGs with 198 upregulated genes and 225 downregulated genes were identified between HBV-ALF and normal samples. The upregulated genes were mainly enriched in immune response, and the downregulated genes were mainly enriched in complement and coagulation cascades. Orosomucoid 1 (ORM1), orosomucoid 2 (ORM2), plasminogen (PLG), and aldehyde oxidase 1 (AOX1) were picked out as the hub genes with a high degree in both the PPI network and the weighted gene coexpression network. The weighted gene coexpression network analysis found that 3 of the 5 modules in which upregulated genes were enriched were closely related to the immune system. The downregulated genes were enriched in only one module, and the genes in this module were mainly enriched in the complement and coagulation cascades pathway. In conclusion, 4 genes (ORM1, ORM2, PLG, and AOX1), together with immune response and the complement and coagulation cascades pathway, may take part in the pathogenesis of HBV-ALF, and these candidate genes and pathways could be therapeutic targets for HBV-ALF. PMID:29384847
Signaling Networks among Stem Cell Precursors, Transit-Amplifying Progenitors, and their Niche in Developing Hair Follicles.

PubMed

Rezza, Amélie; Wang, Zichen; Sennett, Rachel; Qiao, Wenlian; Wang, Dongmei; Heitman, Nicholas; Mok, Ka Wai; Clavel, Carlos; Yi, Rui; Zandstra, Peter; Ma'ayan, Avi; Rendl, Michael

2016-03-29

The hair follicle (HF) is a complex miniorgan that serves as an ideal model system to study stem cell (SC) interactions with the niche during growth and regeneration. Dermal papilla (DP) cells are required for SC activation during the adult hair cycle, but signal exchange between niche and SC precursors/transit-amplifying cell (TAC) progenitors that regulates HF morphogenetic growth is largely unknown. Here we use six transgenic reporters to isolate 14 major skin and HF cell populations. With next-generation RNA sequencing, we characterize their transcriptomes and define unique molecular signatures. SC precursors, TACs, and the DP niche express a plethora of ligands and receptors. Signaling interaction network analysis reveals a bird's-eye view of pathways implicated in epithelial-mesenchymal interactions. Using a systematic tissue-wide approach, this work provides a comprehensive platform, linked to an interactive online database, to identify and further explore the SC/TAC/niche crosstalk regulating HF growth. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Karp, Peter D.

Pathway Tools is a systems-biology software package written by SRI International (SRI) that produces Pathway/Genome Databases (PGDBs) for organisms with a sequenced genome. Pathway Tools also provides a wide range of capabilities for analyzing predicted metabolic networks and user-generated omics data. More than 5,000 academic, industrial, and government groups have licensed Pathway Tools. This user community includes researchers at all three DOE bioenergy centers, as well as academic and industrial metabolic engineering (ME) groups. An integral part of the Pathway Tools software is MetaCyc, a large, multiorganism database of metabolic pathways and enzymes that SRI and its academic collaborators manually curate. This project included two main goals: I. Enhance the MetaCyc content of bioenergy-related enzymes and pathways. II. Develop computational tools for engineering metabolic pathways that satisfy specified design goals, in particular for bioenergy-related pathways. In part I, SRI proposed to significantly expand the coverage of bioenergy-related metabolic information in MetaCyc, followed by the generation of organism-specific PGDBs for all energy-relevant organisms sequenced at the DOE Joint Genome Institute (JGI). Part I objectives included: 1: Expand the content of MetaCyc to include bioenergy-related enzymes and pathways. 2: Enhance the Pathway Tools software to enable display of complex polymer degradation processes. 3: Create new PGDBs for the energy-related organisms sequenced by JGI, update existing PGDBs with new MetaCyc content, and make these data available to JBEI via the BioCyc website. In part II, SRI proposed to develop an efficient computational tool for the engineering of metabolic pathways. Part II objectives included: 4: Develop computational tools for generating metabolic pathways that satisfy specified design goals, enabling users to specify parameters such as starting and ending compounds, and preferred or disallowed intermediate compounds. The pathways were to be generated using metabolic reactions from a reference database (DB). 5: Develop computational tools for ranking the pathways generated in objective (4) according to their optimality. The ranking criteria include stoichiometric yield, the number and cost of additional inputs and the cofactor compounds required by the pathway, pathway length, and pathway energetics. 6: Develop tools for visualizing generated pathways to facilitate the evaluation of a large space of generated pathways.
The ADAMS interactive interpreter

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rietscha, E.R.

1990-12-17

The ADAMS (Advanced DAta Management System) project is exploring next generation database technology. Database management does not follow the usual programming paradigm. Instead, the database dictionary provides an additional name space environment that should be interactively created and tested before writing application code. This document describes the implementation and operation of the ADAMS Interpreter, an interactive interface to the ADAMS data dictionary and runtime system. The Interpreter executes individual statements of the ADAMS Interface Language, providing a fast, interactive mechanism to define and access persistent databases. 5 refs.
Metabolic pathway reconstruction of eugenol to vanillin bioconversion in Aspergillus niger

PubMed Central

Srivastava, Suchita; Luqman, Suaib; Khan, Feroz; Chanotiya, Chandan S; Darokar, Mahendra P

2010-01-01

Identification of missing genes or proteins participating in the metabolic pathways as enzymes are of great interest. One such class of pathway is involved in the eugenol to vanillin bioconversion. Our goal is to develop an integral approach for identifying the topology of a reference or known pathway in other organism. We successfully identify the missing enzymes and then reconstruct the vanillin biosynthetic pathway in Aspergillus niger. The procedure combines enzyme sequence similarity searched through BLAST homology search and orthologs detection through COG & KEGG databases. Conservation of protein domains and motifs was searched through CDD, PFAM & PROSITE databases. Predictions regarding how proteins act in pathway were validated experimentally and also compared with reported data. The bioconversion of vanillin was screened on UV-TLC plates and later confirmed through GC and GC-MS techniques. We applied a procedure for identifying missing enzymes on the basis of conserved functional motifs and later reconstruct the metabolic pathway in target organism. Using the vanillin biosynthetic pathway of Pseudomonas fluorescens as a case study, we indicate how this approach can be used to reconstruct the reference pathway in A. niger and later results were experimentally validated through chromatography and spectroscopy techniques. PMID:20978605
NemaPath: online exploration of KEGG-based metabolic pathways for nematodes

PubMed Central

Wylie, Todd; Martin, John; Abubucker, Sahar; Yin, Yong; Messina, David; Wang, Zhengyuan; McCarter, James P; Mitreva, Makedonka

2008-01-01

Background Nematode.net is a web-accessible resource for investigating gene sequences from parasitic and free-living nematode genomes. Beyond the well-characterized model nematode C. elegans, over 500,000 expressed sequence tags (ESTs) and nearly 600,000 genome survey sequences (GSSs) have been generated from 36 nematode species as part of the Parasitic Nematode Genomics Program undertaken by the Genome Center at Washington University School of Medicine. However, these sequencing data are not present in most publicly available protein databases, which only include sequences in Swiss-Prot. Swiss-Prot, in turn, relies on GenBank/EMBL/DDBJ for predicted proteins from complete genomes or full-length proteins. Description Here we present the NemaPath pathway server, a web-based pathway-level visualization tool for navigating putative metabolic pathways for over 30 nematode species, including 27 parasites. The NemaPath approach consists of two parts: 1) a backend tool to align and evaluate nematode genomic sequences (curated EST contigs) against the annotated Kyoto Encyclopedia of Genes and Genomes (KEGG) protein database; 2) a web viewing application that displays annotated KEGG pathway maps based on desired confidence levels of primary sequence similarity as defined by a user. NemaPath also provides cross-referenced access to nematode genome information provided by other tools available on Nematode.net, including: detailed NemaGene EST cluster information; putative translations; GBrowse EST cluster views; links from nematode data to external databases for corresponding synonymous C. elegans counterparts, subject matches in KEGG's gene database, and also KEGG Ontology (KO) identification. Conclusion The NemaPath server hosts metabolic pathway mappings for 30 nematode species and is available on the World Wide Web at . The nematode source sequences used for the metabolic pathway mappings are available via FTP , as provided by the Genome Center at Washington University School of Medicine. PMID:18983679
From 20th century metabolic wall charts to 21st century systems biology: database of mammalian metabolic enzymes.

PubMed

Corcoran, Callan C; Grady, Cameron R; Pisitkun, Trairak; Parulekar, Jaya; Knepper, Mark A

2017-03-01

The organization of the mammalian genome into gene subsets corresponding to specific functional classes has provided key tools for systems biology research. Here, we have created a web-accessible resource called the Mammalian Metabolic Enzyme Database (https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/MetabolicEnzymeDatabase.html) keyed to the biochemical reactions represented on iconic metabolic pathway wall charts created in the previous century. Overall, we have mapped 1,647 genes to these pathways, representing ~7 percent of the protein-coding genome. To illustrate the use of the database, we apply it to the area of kidney physiology. In so doing, we have created an additional database (Database of Metabolic Enzymes in Kidney Tubule Segments: https://hpcwebapps.cit.nih.gov/ESBL/Database/MetabolicEnzymes/), mapping mRNA abundance measurements (mined from RNA-Seq studies) for all metabolic enzymes to each of 14 renal tubule segments. We carry out bioinformatics analysis of the enzyme expression pattern among renal tubule segments and mine various data sources to identify vasopressin-regulated metabolic enzymes in the renal collecting duct. Copyright © 2017 the American Physiological Society.
YMDB 2.0: a significantly expanded version of the yeast metabolome database.

PubMed

Ramirez-Gaona, Miguel; Marcu, Ana; Pon, Allison; Guo, An Chi; Sajed, Tanvir; Wishart, Noah A; Karu, Naama; Djoumbou Feunang, Yannick; Arndt, David; Wishart, David S

2017-01-04

YMDB or the Yeast Metabolome Database (http://www.ymdb.ca/) is a comprehensive database containing extensive information on the genome and metabolome of Saccharomyces cerevisiae. Initially released in 2012, the YMDB has gone through a significant expansion and a number of improvements over the past 4 years. This manuscript describes the most recent version of YMDB (YMDB 2.0). More specifically, it provides an updated description of the database that was previously described in the 2012 NAR Database Issue and it details many of the additions and improvements made to the YMDB over that time. Some of the most important changes include a 7-fold increase in the number of compounds in the database (from 2007 to 16 042), a 430-fold increase in the number of metabolic and signaling pathway diagrams (from 66 to 28 734), a 16-fold increase in the number of compounds linked to pathways (from 742 to 12 733), a 17-fold increase in the numbers of compounds with nuclear magnetic resonance or MS spectra (from 783 to 13 173) and an increase in both the number of data fields and the number of links to external databases. In addition to these database expansions, a number of improvements to YMDB's web interface and its data visualization tools have been made. These additions and improvements should greatly improve the ease, the speed and the quantity of data that can be extracted, searched or viewed within YMDB. Overall, we believe these improvements should not only improve the understanding of the metabolism of S. cerevisiae, but also allow more in-depth exploration of its extensive metabolic networks, signaling pathways and biochemistry. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
De Novo Transcriptomic Analysis of Peripheral Blood Lymphocytes from the Chinese Goose: Gene Discovery and Immune System Pathway Description

PubMed Central

Tariq, Mansoor; Chen, Rong; Yuan, Hongyu; Liu, Yanjie; Wu, Yanan; Wang, Junya; Xia, Chun

2015-01-01

Background The Chinese goose is one of the most economically important poultry birds and is a natural reservoir for many avian viruses. However, the nature and regulation of the innate and adaptive immune systems of this waterfowl species are not completely understood due to limited information on the goose genome. Recently, transcriptome sequencing technology was applied in the genomic studies focused on novel gene discovery. Thus, this study described the transcriptome of the goose peripheral blood lymphocytes to identify immunity relevant genes. Principal Findings De novo transcriptome assembly of the goose peripheral blood lymphocytes was sequenced by Illumina-Solexa technology. In total, 211,198 unigenes were assembled from the 69.36 million cleaned reads. The average length, N50 size and the maximum length of the assembled unigenes were 687 bp, 1,298 bp and 18,992 bp, respectively. A total of 36,854 unigenes showed similarity by BLAST search against the NCBI non-redundant (Nr) protein database. For functional classification, 163,161 unigenes were comprised of three Gene Ontology (GO) categories and 67 subcategories. A total of 15,334 unigenes were annotated into 25 eukaryotic orthologous groups (KOGs) categories. Kyoto Encyclopedia of Genes and Genomes (KEGG) database annotated 39,585 unigenes into six biological functional groups and 308 pathways. Among the 2,757 unigenes that participated in the 15 immune system KEGG pathways, 125 of the most important immune relevant genes were summarized and analyzed by STRING analysis to identify gene interactions and relationships. Moreover, 10 genes were confirmed by PCR and analyzed. Of these 125 unigenes, 109 unigenes, approximately 87%, were not previously identified in the goose. Conclusion This de novo transcriptome analysis could provide important Chinese goose sequence information and highlights the value of new gene discovery, pathways investigation and immune system gene identification, and comparison with other avian species as useful tools to understand the goose immune system. PMID:25816068
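The N50 size reported above is a standard assembly statistic: the length L such that sequences of length L or longer account for at least half of all assembled bases. A minimal sketch of that definition follows; the function name and the toy length list are hypothetical and unrelated to the goose data.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be a non-empty collection of positive integers")

# Hypothetical unigene lengths in bp (total = 32).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
print(toy_n50, toy_mean)
```

Because long sequences dominate the cumulative sum, N50 typically exceeds the mean length, which is consistent with the 1,298 bp N50 and 687 bp average reported in the abstract.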
Genic and Intergenic SSR Database Generation, SNPs Determination and Pathway Annotations, in Date Palm (Phoenix dactylifera L.).

PubMed

Mokhtar, Morad M; Adawy, Sami S; El-Assal, Salah El-Din S; Hussein, Ebtissam H A

2016-01-01

The present investigation was carried out aiming to use the bioinformatics tools in order to identify and characterize, simple sequence repeats within the third Version of the date palm genome and develop a new SSR primers database. In addition single nucleotide polymorphisms (SNPs) that are located within the SSR flanking regions were recognized. Moreover, the pathways for the sequences assigned by SSR primers, the biological functions and gene interaction were determined. A total of 172,075 SSR motifs was identified on date palm genome sequence with a frequency of 450.97 SSRs per Mb. Out of these, 130,014 SSRs (75.6%) were located within the intergenic regions with a frequency of 499 SSRs per Mb. While, only 42,061 SSRs (24.4%) were located within the genic regions with a frequency of 347.5 SSRs per Mb. A total of 111,403 of SSR primer pairs were designed, that represents 291.9 SSR primers per Mb. Out of the 111,403, only 31,380 SSR primers were in the genic regions, while 80,023 primers were in the intergenic regions. A number of 250,507 SNPs were recognized in 84,172 SSR flanking regions, which represents 75.55% of the total SSR flanking regions. Out of 12,274 genes only 463 genes comprising 896 SSR primers were mapped onto 111 pathways using KEGG data base. The most abundant enzymes were identified in the pathway related to the biosynthesis of antibiotics. We tested 1031 SSR primers using both publicly available date palm genome sequences as templates in the in silico PCR reactions. Concerning in vitro validation, 31 SSR primers among those used in the in silico PCR were synthesized and tested for their ability to detect polymorphism among six Egyptian date palm cultivars. All tested primers have successfully amplified products, but only 18 primers detected polymorphic amplicons among the studied date palm cultivars.
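The densities and percentages quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the abstract; the implied genome length is derived here from the SSR count and density and is an assumption of this check, not a value the abstract reports.

```python
# Figures quoted in the abstract.
total_ssrs = 172_075
ssrs_per_mb = 450.97
intergenic_ssrs = 130_014
genic_ssrs = 42_061
primer_pairs = 111_403

# Derived quantity (assumption): genome length implied by count / density.
implied_genome_mb = total_ssrs / ssrs_per_mb

intergenic_share = 100 * intergenic_ssrs / total_ssrs
genic_share = 100 * genic_ssrs / total_ssrs
primers_per_mb = primer_pairs / implied_genome_mb

print(f"implied genome length: {implied_genome_mb:.1f} Mb")
print(f"intergenic share: {intergenic_share:.1f}%, genic share: {genic_share:.1f}%")
print(f"primer pairs per Mb: {primers_per_mb:.2f}")
```

Under that assumption the shares reproduce the reported 75.6% and 24.4%, and the primer density lands within rounding of the reported 291.9 per Mb.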
ReprOlive: a database with linked data for the olive tree (Olea europaea L.) reproductive transcriptome

PubMed Central

Carmona, Rosario; Zafra, Adoración; Seoane, Pedro; Castro, Antonio J.; Guerrero-Fernández, Darío; Castillo-Castillo, Trinidad; Medina-García, Ana; Cánovas, Francisco M.; Aldana-Montes, José F.; Navas-Delgado, Ismael; Alché, Juan de Dios; Claros, M. Gonzalo

2015-01-01

Plant reproductive transcriptomes have been analyzed in different species due to the agronomical and biotechnological importance of plant reproduction. Here we present an olive tree reproductive transcriptome database with samples from pollen and pistil at different developmental stages, and leaf and root as control vegetative tissues (http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads to 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped, and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts (TTs) were also annotated with the corresponding orthologs in Arabidopsis thaliana from TAIR and RefSeq databases to enable Linked Data integration. The result is a reproductive transcriptome comprising 72,846 contigs with average length of 686 bp, of which 63,965 (87.8%) included at least one functional annotation, and 55,356 (75.9%) had an ortholog. A minimum of 23,568 different TTs was identified and 5,835 of them contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 TTs for further gene expression studies. Partial transcriptomes from pollen, pistil, and vegetative tissues as control were also constructed. ReprOlive provides free access and download capability to these results. Retrieval mechanisms for sequences and transcript annotations are provided. Graphical localization of annotated enzymes into KEGG pathways is also possible. Finally, ReprOlive has included a semantic conceptualisation by means of a Resource Description Framework (RDF) allowing a Linked Data search for extracting the most updated information related to enzymes, interactions, allergens, structures, and reactive oxygen species. PMID:26322066
microRNAs Databases: Developmental Methodologies, Structural and Functional Annotations.

PubMed

Singh, Nagendra Kumar

2017-09-01

microRNAs (miRNAs) are endogenous, evolutionarily conserved non-coding RNAs that act post-transcriptionally as gene repressors and mediate mRNA cleavage through formation of the RNA-induced silencing complex (RISC). In RISC, the miRNA binds its target mRNA by complementary base pairing, together with the Argonaute protein complex, causing gene repression or endonucleolytic cleavage of the mRNA, a process implicated in many diseases and syndromes. After the discovery of the miRNAs lin-4 and let-7, large numbers of miRNAs involved in various biological and metabolic processes were discovered by low-throughput and high-throughput experimental techniques along with computational methods. Because they control gene regulation, miRNAs are important non-coding RNAs for understanding complex biological phenomena in organisms. This paper reviews miRNA databases with structural and functional annotations developed by various researchers. These databases contain structural and functional information on animal, plant and virus miRNAs, including miRNA-associated diseases, stress resistance in plants, the roles of miRNAs in various biological processes, the effects of miRNA interactions on drugs and the environment, the effects of variants on miRNAs, miRNA gene expression analysis, and the sequences and structures of miRNAs. This review focuses on the developmental methodology of miRNA databases, such as the computational tools and methods used to extract miRNA annotations from different resources or through experiment. It also discusses the efficiency of the user interface design of each database, along with the current entries and annotations of miRNAs (pathways, gene ontology, disease ontology, etc.). An integrated schematic diagram of the construction process for the databases is also presented, along with tabular and graphical comparisons of the various types of entries in different databases. The aim of this paper is to present miRNA-related resources in a single place.
Alterations in mRNA profiles of trastuzumab‑resistant Her‑2‑positive breast cancer.

PubMed

Zhao, Bin; Zhao, Yang; Sun, Yan; Niu, Haitao; Sheng, Long; Huang, Dongfang; Li, Li

2018-05-07

Breast cancer is one of the most common malignancies in women. Neoadjuvant trastuzumab therapy improves the prognosis of certain Her‑2‑positive breast cancer patients, however around two‑thirds of patients with Her‑2‑positive breast cancer do not benefit from Her‑2‑targeted therapy. To investigate the key mechanisms in trastuzumab resistance, potential biomarkers for neoadjuvant trastuzumab sensitivity were investigated using the gene expression omnibus (GEO) database for mRNA microarray data of Her‑2‑positive breast cancer patients who received neoadjuvant trastuzumab therapy. GEO profiles of 22 patients with a complete response and 48 patients with a partial response were identified in the GSE22358, GSE62327 and GSE66305 datasets. A total of 2,376, 1,000 and 1,152 differentially expressed genes in GSE22358, GSE62327 and GSE66305 datasets were demonstrated, respectively, utilizing GEO2R software. Furthermore, enriched gene ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways were analyzed using the Database for Annotation, Visualization and Integrated Discovery software. Subsequently, a protein‑protein interaction network was established using STRING software. The results demonstrated that low sex‑determining region Y‑box 11 and high Bcl‑2 expression may be employed as markers for neoadjuvant trastuzumab therapy for Her‑2‑positive breast cancer. More importantly, phosphoinositide 3‑kinase/Akt and angiogenesis pathways, which are known to be the key targets of trastuzumab, were activated at a lower level in the partial response patients, while the Wnt and estrogen receptor signaling pathways were activated in these patients. Therefore, combination therapy of trastuzumab and anti‑Wnt or hormone therapy may be a promising treatment modality and should be tested in further studies.
Constraints on signaling network logic reveal functional subgraphs on Multiple Myeloma OMIC data.

PubMed

Miannay, Bertrand; Minvielle, Stéphane; Magrangeas, Florence; Guziolowski, Carito

2018-03-21

The integration of gene expression profiles (GEPs) and large-scale biological networks derived from pathways databases is a subject which is being widely explored. Existing methods are based on network distance measures among significantly measured species. Only a small number of them include the directionality and underlying logic existing in biological networks. In this study we approach the GEP-networks integration problem by considering the network logic, however our approach does not require a prior species selection according to their gene expression level. We start by modeling the biological network representing its underlying logic using Logic Programming. This model points to reachable network discrete states that maximize a notion of harmony between the molecular species active or inactive possible states and the directionality of the pathways reactions according to their activator or inhibitor control role. Only then, we confront these network states with the GEP. From this confrontation independent graph components are derived, each of them related to a fixed and optimal assignment of active or inactive states. These components allow us to decompose a large-scale network into subgraphs and their molecular species state assignments have different degrees of similarity when compared to the same GEP. We apply our method to study the set of possible states derived from a subgraph from the NCI-PID Pathway Interaction Database. This graph links Multiple Myeloma (MM) genes to known receptors for this blood cancer. We discover that the NCI-PID MM graph had 15 independent components, and when confronted to 611 MM GEPs, we find 1 component as being more specific to represent the difference between cancer and healthy profiles.
Recent advances in proteomics of cereals.

PubMed

Bansal, Monika; Sharma, Madhu; Kanwar, Priyanka; Goyal, Aakash

Cereals contribute a major part of human nutrition and are considered as an integral source of energy for human diets. With genomic databases already available in cereals such as rice, wheat, barley, and maize, the focus has now moved to proteome analysis. Proteomics studies involve the development of appropriate databases based on developing suitable separation and purification protocols, identification of protein functions, and can confirm their functional networks based on already available data from other sources. Tremendous progress has been made in the past decade in generating huge data-sets for covering interactions among proteins, protein composition of various organs and organelles, quantitative and qualitative analysis of proteins, and to characterize their modulation during plant development, biotic, and abiotic stresses. Proteomics platforms have been used to identify and improve our understanding of various metabolic pathways. This article gives a brief review of efforts made by different research groups on comparative descriptive and functional analysis of proteomics applications achieved in the cereal science so far.
Computer applications making rapid advances in high throughput microbial proteomics (HTMP).

PubMed

Anandkumar, Balakrishna; Haga, Steve W; Wu, Hui-Fen

2014-02-01

The last few decades have seen the rise of widely-available proteomics tools. From new data acquisition devices, such as MALDI-MS and 2DE to new database searching softwares, these new products have paved the way for high throughput microbial proteomics (HTMP). These tools are enabling researchers to gain new insights into microbial metabolism, and are opening up new areas of study, such as protein-protein interactions (interactomics) discovery. Computer software is a key part of these emerging fields. This current review considers: 1) software tools for identifying the proteome, such as MASCOT or PDQuest, 2) online databases of proteomes, such as SWISS-PROT, Proteome Web, or the Proteomics Facility of the Pathogen Functional Genomics Resource Center, and 3) software tools for applying proteomic data, such as PSI-BLAST or VESPA. These tools allow for research in network biology, protein identification, functional annotation, target identification/validation, protein expression, protein structural analysis, metabolic pathway engineering and drug discovery.

Whole-exome sequencing in obsessive-compulsive disorder identifies rare mutations in immunological and neurodevelopmental pathways

PubMed Central

Cappi, C; Brentani, H; Lima, L; Sanders, S J; Zai, G; Diniz, B J; Reis, V N S; Hounie, A G; Conceição do Rosário, M; Mariani, D; Requena, G L; Puga, R; Souza-Duran, F L; Shavitt, R G; Pauls, D L; Miguel, E C; Fernandez, T V

2016-01-01

Studies of rare genetic variation have identified molecular pathways conferring risk for developmental neuropsychiatric disorders. To date, no published whole-exome sequencing studies have been reported in obsessive-compulsive disorder (OCD). We sequenced all the genome coding regions in 20 sporadic OCD cases and their unaffected parents to identify rare de novo (DN) single-nucleotide variants (SNVs). The primary aim of this pilot study was to determine whether DN variation contributes to OCD risk. To this aim, we evaluated whether there is an elevated rate of DN mutations in OCD, which would justify this approach toward gene discovery in larger studies of the disorder. Furthermore, to explore functional molecular correlations among genes with nonsynonymous DN SNVs in OCD probands, a protein–protein interaction (PPI) network was generated based on databases of direct molecular interactions. We applied Degree-Aware Disease Gene Prioritization (DADA) to rank the PPI network genes based on their relatedness to a set of OCD candidate genes from two OCD genome-wide association studies (Stewart et al., 2013; Mattheisen et al., 2014). In addition, we performed a pathway analysis with genes from the PPI network. The rate of DN SNVs in OCD was 2.51 × 10⁻⁸ per base per generation, significantly higher than a previous estimated rate in unaffected subjects using the same sequencing platform and analytic pipeline. Several genes harboring DN SNVs in OCD were highly interconnected in the PPI network and ranked high in the DADA analysis. Nearly all the DN SNVs in this study are in genes expressed in the human brain, and a pathway analysis revealed enrichment in immunological and central nervous system functioning and development. The results of this pilot study indicate that further investigation of DN variation in larger OCD cohorts is warranted to identify specific risk genes and to confirm our preliminary finding with regard to PPI network enrichment for particular biological pathways and functions. PMID:27023170
Keys and the crisis in taxonomy: extinction or reinvention?

PubMed

Walter, David Evans; Winterton, Shaun

2007-01-01

Dichotomous keys that follow a single pathway of character state choices to an end point have been the primary tools for the identification of unknown organisms for more than two centuries. However, a revolution in computer diagnostics is now under way that may result in the replacement of traditional keys by matrix-based computer interactive keys that have many paths to a correct identification and make extensive use of hypertext to link to images, glossaries, and other support material. Progress is also being made on replacing keys entirely by optical matching of specimens to digital databases and DNA sequences. These new tools may go some way toward alleviating the taxonomic impediment to biodiversity studies and other ecological and evolutionary research, especially with better coordination between those who produce keys and those who use them and by integrating interactive keys into larger biological Web sites.
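The contrast drawn in this abstract between a single-path dichotomous key and a matrix-based interactive key can be sketched in a few lines of Python. In a matrix key, each taxon is scored for every character, so the user may enter whichever characters are observable, in any order, and the candidate list shrinks accordingly. The taxa, characters, and the function name `identify` below are hypothetical toy data, not taken from any real key.

```python
# Toy character matrix: taxon -> {character: state}. All names are hypothetical.
MATRIX = {
    "taxon_A": {"wings": "present", "legs": 6, "antennae": "long"},
    "taxon_B": {"wings": "absent", "legs": 8, "antennae": "none"},
    "taxon_C": {"wings": "present", "legs": 6, "antennae": "short"},
}

def identify(observations, matrix=MATRIX):
    """Return the taxa consistent with every observed character state.

    Observations can be supplied in any order and any subset, which is
    what gives a matrix key many paths to the same identification.
    """
    return sorted(
        taxon
        for taxon, states in matrix.items()
        if all(states.get(ch) == value for ch, value in observations.items())
    )

if __name__ == "__main__":
    print(identify({"legs": 6}))                       # narrows to two taxa
    print(identify({"legs": 6, "antennae": "short"}))  # a second character resolves it
    print(identify({"antennae": "short"}))             # a different path, same answer
```

A dichotomous key would force the characters to be examined in one fixed sequence; here a damaged specimen with no visible wings can still be identified from legs and antennae alone.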
Approaches for Defining the Hsp90-dependent Proteome

PubMed Central

Hartson, Steven D.; Matts, Robert L.

2011-01-01

Hsp90 is the target of ongoing drug discovery studies seeking new compounds to treat cancer, neurodegenerative diseases, and protein folding disorders. To better understand Hsp90’s roles in cellular pathologies and in normal cells, numerous studies have utilized proteomics assays and related high-throughput tools to characterize its physical and functional protein partnerships. This review surveys these studies, and summarizes the strengths and limitations of the individual attacks. We also include downloadable spreadsheets compiling all of the Hsp90-interacting proteins identified in more than 23 studies. These tools include cross-references among gene aliases, human homologues of yeast Hsp90-interacting proteins, hyperlinks to database entries, summaries of canonical pathways that are enriched in the Hsp90 interactome, and additional bioinformatic annotations. In addition to summarizing Hsp90 proteomics studies performed to date and the insights they have provided, we identify gaps in our current understanding of Hsp90-mediated proteostasis. PMID:21906632
SNAPPI-DB: a database and API of Structures, iNterfaces and Alignments for Protein–Protein Interactions

PubMed Central

Jefferson, Emily R.; Walsh, Thomas P.; Roberts, Timothy J.; Barton, Geoffrey J.

2007-01-01

SNAPPI-DB, a high performance database of Structures, iNterfaces and Alignments of Protein–Protein Interactions, and its associated Java Application Programming Interface (API) is described. SNAPPI-DB contains structural data, down to the level of atom co-ordinates, for each structure in the Protein Data Bank (PDB) together with associated data including SCOP, CATH, Pfam, SWISSPROT, InterPro, GO terms, Protein Quaternary Structures (PQS) and secondary structure information. Domain–domain interactions are stored for multiple domain definitions and are classified by their Superfamily/Family pair and interaction interface. Each set of classified domain–domain interactions has an associated multiple structure alignment for each partner. The API facilitates data access via PDB entries, domains and domain–domain interactions. Rapid development, fast database access and the ability to perform advanced queries without the requirement for complex SQL statements are provided via an object oriented database and the Java Data Objects (JDO) API. SNAPPI-DB contains many features which are not available in other databases of structural protein–protein interactions. It has been applied in three studies on the properties of protein–protein interactions and is currently being employed to train a protein–protein interaction predictor and a functional residue predictor. The database, API and manual are available for download at: . PMID:17202171
From Databases to Modelling of Functional Pathways

PubMed Central

2004-01-01

This short review comments on current informatics resources and methodologies in the study of functional pathways in cell biology. It highlights recent achievements in unveiling the structural design of protein and gene networks and discusses current approaches to model and simulate the dynamics of regulatory pathways in the cell. PMID:18629070
From databases to modelling of functional pathways.

PubMed

Nasi, Sergio

2004-01-01

This short review comments on current informatics resources and methodologies in the study of functional pathways in cell biology. It highlights recent achievements in unveiling the structural design of protein and gene networks and discusses current approaches to model and simulate the dynamics of regulatory pathways in the cell.
AOP-DB: A database resource for the exploration of Adverse Outcome Pathways through integrated association networks.

EPA Science Inventory

The Adverse Outcome Pathway (AOP) framework describes the progression of a toxicity pathway from molecular perturbation to population-level outcome in a series of measurable, mechanistic responses. The controlled, computer-readable vocabulary that defines an AOP has the ability t...
In silico database screening of potential targets and pathways of compounds contained in plants used for psoriasis vulgaris.

PubMed

May, Brian H; Deng, Shiqiang; Zhang, Anthony L; Lu, Chuanjian; Xue, Charlie C L

2015-09-01

Reviews and meta-analyses of clinical trials identified plants used as traditional medicines (TMs) that show promise for psoriasis. These include Rehmannia glutinosa, Camptotheca acuminata, Indigo naturalis and Salvia miltiorrhiza. Compounds contained in these TMs have shown activities of relevance to psoriasis in experimental models. To further investigate the likely mechanisms of action of the multiple compounds in these TMs, we undertook a computer-based in silico investigation of the proteins known to be regulated by these compounds and their associated biological pathways. The proteins reportedly regulated by compounds in these four TMs were identified using the HIT (Herbal Ingredients' Targets) database. The resultant data were entered into the PANTHER (Protein ANnotation THrough Evolutionary Relationship) database to identify the pathways in which the proteins could be involved. The study identified 237 compounds in the TMs and these retrieved 287 proteins from HIT. These proteins identified 59 pathways in PANTHER with most proteins being located in the Apoptosis, Angiogenesis, Inflammation mediated by chemokine and cytokine, Gonadotropin releasing hormone receptor, and/or Interleukin signaling pathways. All four TMs contained compounds that had regulating effects on Apoptosis regulator BAX, Apoptosis regulator Bcl-2, Caspase-3, Tumor necrosis factor (TNF) or Prostaglandin G/H synthase 2 (COX2). The main proteins and pathways are primarily related to inflammation, proliferation and angiogenesis which are all processes involved in psoriasis. Experimental studies have reported that certain compounds from these TMs can regulate the expression of proteins involved in each of these pathways.
Metabolomics analysis: Finding out metabolic building blocks

PubMed Central

2017-01-01

In this paper we propose a new methodology for the analysis of metabolic networks. We use the notion of strongly connected components of a graph, called in this context metabolic building blocks. Every strongly connected component is contracted to a single node in such a way that the resulting graph is a directed acyclic graph, called a metabolic DAG, with a considerably reduced number of nodes. The property of being a directed acyclic graph brings out a background graph topology that reveals the connectivity of the metabolic network, as well as bridges, isolated nodes and cut nodes. Altogether, it becomes a key information for the discovery of functional metabolic relations. Our methodology has been applied to the glycolysis and the purine metabolic pathways for all organisms in the KEGG database, although it is general enough to work on any database. As expected, using the metabolic DAGs formalism, a considerable reduction on the size of the metabolic networks has been obtained, specially in the case of the purine pathway due to its relative larger size. As a proof of concept, from the information captured by a metabolic DAG and its corresponding metabolic building blocks, we obtain the core of the glycolysis pathway and the core of the purine metabolism pathway and detect some essential metabolic building blocks that reveal the key reactions in both pathways. Finally, the application of our methodology to the glycolysis pathway and the purine metabolism pathway reproduce the tree of life for the whole set of the organisms represented in the KEGG database which supports the utility of this research. PMID:28493998
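The contraction step described in this abstract, in which every strongly connected component becomes a single node of a directed acyclic graph, can be illustrated with a short Python sketch. It uses Kosaraju's algorithm, which is one standard way to find strongly connected components; the abstract does not say which algorithm the authors used, and the function names and the toy network below are assumptions for illustration only.

```python
from collections import defaultdict

def strongly_connected_components(graph):
    """Kosaraju's algorithm. graph maps each node to an iterable of successors."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    # Pass 1: iterative depth-first search, recording nodes as they finish.
    visited, order = set(), []
    for root in sorted(nodes):
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, iter(graph.get(root, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                order.append(node)  # all successors explored
                stack.pop()

    # Pass 2: walk the reversed graph in decreasing finish order.
    reverse = defaultdict(list)
    for src, targets in graph.items():
        for dst in targets:
            reverse[dst].append(src)

    component_of, components = {}, []
    for root in reversed(order):
        if root in component_of:
            continue
        index = len(components)
        component_of[root] = index
        members, todo = [], [root]
        while todo:
            node = todo.pop()
            members.append(node)
            for prev in reverse[node]:
                if prev not in component_of:
                    component_of[prev] = index
                    todo.append(prev)
        components.append(sorted(members))
    return components, component_of

def condense(graph):
    """Contract each component to one node; the result is a DAG."""
    components, component_of = strongly_connected_components(graph)
    dag = {i: set() for i in range(len(components))}
    for src, targets in graph.items():
        for dst in targets:
            a, b = component_of[src], component_of[dst]
            if a != b:
                dag[a].add(b)
    return components, dag

if __name__ == "__main__":
    # Hypothetical toy network: B<->C and D<->E are reversible steps.
    toy = {"A": ["B"], "B": ["C"], "C": ["B", "D"], "D": ["E"], "E": ["D"], "F": ["A"]}
    blocks, dag = condense(toy)
    print(blocks)
    print(dag)
```

In this toy case six nodes collapse to four components, which mirrors on a small scale the size reduction the abstract reports for the glycolysis and purine pathways.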
Application of the ToxMiner Database: Network Analysis of ...

EPA Pesticide Factsheets

The US EPA ToxCast program is using in vitro HTS (High-Throughput Screening) methods to profile and model bioactivity of environmental chemicals. The main goals of the ToxCast program are to generate predictive signatures of toxicity, and ultimately provide rapid and cost-effective alternatives to animal testing. The chemicals selected for Phase I are composed largely by a diverse set of pesticide active ingredients, which had sufficient supporting in vivo data included as part of their registration process with the EPA. Other miscellaneous chemicals of environmental concern were also included. Application of HTS to environmental toxicants is a novel approach to predictive toxicology and health risk assessment, and differs from what is required for drug efficacy screening in that biochemical interaction of environmental chemicals are sometimes weaker than that seen with drugs and their intended targets. Additionally, the chemical space covered by environmental chemicals is much broader compared to that of pharmaceuticals. The ToxMiner database has been created and added to the EPA’s ACToR (Aggregated Computational Toxicology Resource) chemical database. One purpose of the ToxMiner database is to link biological, metabolic and cellular pathway data to genes and in vitro assay data for the initial subset of chemicals screened in the ToxCast Phase I HTS assays. Also included in ToxMiner is human disease information, which correlates with ToxCast assays that tar
pseudoMap: an innovative and comprehensive resource for identification of siRNA-mediated mechanisms in human transcribed pseudogenes.

PubMed

Chan, Wen-Ling; Yang, Wen-Kuang; Huang, Hsien-Da; Chang, Jan-Gowth

2013-01-01

RNA interference (RNAi) is a gene silencing process within living cells, which is controlled by the RNA-induced silencing complex in a sequence-specific manner. In flies and mice, pseudogene transcripts can be processed into short interfering RNAs (siRNAs) that regulate protein-coding genes through the RNAi pathway. Following these findings, we constructed an innovative and comprehensive database to elucidate siRNA-mediated mechanisms in human transcribed pseudogenes (TPGs). To investigate TPGs that produce siRNAs regulating protein-coding genes, we mapped the TPGs to small RNAs (sRNAs) that were supported by publicly available deep sequencing data from various sRNA libraries and constructed the TPG-derived siRNA-target interactions. In addition, we show that TPGs can act as targets for miRNAs that actually regulate the parental gene. To enable the systematic compilation and updating of these results and additional information, we have developed a database, pseudoMap, capturing various types of information, including sequence data, TPG and cognate annotation, deep sequencing data, RNA-folding structure, gene expression profiles, miRNA annotation and target prediction. To our knowledge, pseudoMap is the first database to demonstrate two mechanisms of human TPGs: encoding siRNAs and decoying miRNAs that target the parental gene. pseudoMap is freely accessible at http://pseudomap.mbc.nctu.edu.tw/.
Regulatory interactions between long noncoding RNA LINC00968 and miR-9-3p in non-small cell lung cancer: A bioinformatic analysis based on miRNA microarray, GEO and TCGA.

PubMed

Li, Dong-Yao; Chen, Wen-Jie; Shang, Jun; Chen, Gang; Li, Shi-Kang

2018-06-01

Long non-coding RNAs (lncRNAs) have been demonstrated to mediate carcinogenesis in various types of cancer. However, the regulatory role of lncRNA LINC00968 in lung adenocarcinoma remains unclear. The microRNA (miRNA) expression in LINC00968-overexpressing human lung adenocarcinoma A549 cells was detected using miRNA microarray analysis. miR-9-3p was selected for further analysis, and its expression was verified in the Gene Expression Omnibus (GEO) database. In addition, the regulatory axis of LINC00968 was validated using The Cancer Genome Atlas (TCGA) database. Results of the GEO database indicated miR-9-3p expression in lung adenocarcinoma was significantly higher compared with normal tissues. Functional enrichment analyses of the target genes of miR-9-3p indicated protein binding and the AMP-activated protein kinase pathway were the most enriched Gene Ontology and KEGG terms, respectively. Combining target genes with the correlated genes of LINC00968 and miR-9-3p, 120 objective genes were obtained, which were used to construct a protein-protein interaction (PPI) network. Cyclin A2 (CCNA2) was identified to have a vital role in the PPI network. Significant correlations were detected between LINC00968, miR-9-3p and CCNA2 in lung adenocarcinoma. The LINC00968/miR-9-3p/CCNA2 regulatory axis provides a new foundation for further evaluating the regulatory mechanisms of LINC00968 in lung adenocarcinoma.
Active subnetwork recovery with a mechanism-dependent scoring function; with application to angiogenesis and organogenesis studies

PubMed Central

2013-01-01

Background The learning active subnetworks problem involves finding subnetworks of a bio-molecular network that are active in a particular condition. Many approaches integrate observation data (e.g., gene expression) with the network topology to find candidate subnetworks. Increasingly, pathway databases contain additional annotation information that can be mined to improve prediction accuracy, e.g., interaction mechanism (e.g., transcription, microRNA, cleavage) annotations. We introduce a mechanism-based approach to active subnetwork recovery which exploits such annotations. We suggest that neighboring interactions in a network tend to be co-activated in a way that depends on the “correlation” of their mechanism annotations; e.g., neighboring phosphorylation and de-phosphorylation interactions may be more likely to be co-activated than neighboring phosphorylation and covalent bonding interactions. Results Our method iteratively learns the mechanism correlations and finds the most likely active subnetwork. We use a probabilistic graphical model with a Markov Random Field component, which creates dependencies between the states (active or non-active) of neighboring interactions and incorporates a mechanism-based component into the function. We apply a heuristic EM-based algorithm suitable for the problem. We validated our method’s performance using simulated data in networks downloaded from GeneGO against the same approach without the mechanism-based component, and two other existing methods. We validated our method’s performance in correctly recovering (1) the true interaction states, and (2) global network properties of the original network against these other methods. We applied our method to networks generated from time-course gene expression studies in angiogenesis and lung organogenesis and validated the findings from a biological perspective against current literature.
Conclusions The advantage of our mechanism-based approach is best seen in networks composed of connected regions with a large number of interactions annotated with a subset of mechanisms, e.g., a regulatory region of transcription interactions, or a cleavage cascade region. When applied to real datasets, our method recovered novel and biologically meaningful putative interactions, e.g., interactions from an integrin signaling pathway using the angiogenesis dataset, and a group of regulatory microRNA interactions in an organogenesis network. PMID:23432934
An integrated global regulatory network of hematopoietic precursor cell self-renewal and differentiation.

PubMed

You, Yanan; Cuevas-Diaz Duran, Raquel; Jiang, Lihua; Dong, Xiaomin; Zong, Shan; Snyder, Michael; Wu, Jia Qian

2018-06-12

Systematic study of the regulatory mechanisms of Hematopoietic Stem Cell and Progenitor Cell (HSPC) self-renewal is fundamentally important for understanding hematopoiesis and for manipulating HSPCs for therapeutic purposes. Previously, we have characterized gene expression and identified important transcription factors (TFs) regulating the switch between self-renewal and differentiation in a multipotent Hematopoietic Progenitor Cell (HPC) line, EML (Erythroid, Myeloid, and Lymphoid) cells. Herein, we report binding maps for additional TFs (SOX4 and STAT3) by using chromatin immunoprecipitation (ChIP)-Sequencing, to address the underlying mechanisms regulating self-renewal properties of lineage-CD34+ subpopulation (Lin-CD34+ EML cells). Furthermore, we applied the Assay for Transposase Accessible Chromatin (ATAC)-Sequencing to globally identify the open chromatin regions associated with TF binding in the self-renewing Lin-CD34+ EML cells. Mass spectrometry (MS) was also used to quantify protein relative expression levels. Finally, by integrating the protein-protein interaction database, we built an expanded transcriptional regulatory and interaction network. We found that MAPK (Mitogen-activated protein kinase) pathway and TGF-β/SMAD signaling pathway components were highly enriched among the binding targets of these TFs in Lin-CD34+ EML cells. The present study integrates regulatory information at multiple levels to paint a more comprehensive picture of the HSPC self-renewal mechanisms.
How do natural, uncultivated microbes interact with organic matter? Insights from single cell genomics and metagenomics

NASA Astrophysics Data System (ADS)

Lloyd, K. G.; Bird, J.; Schreiber, L.; Petersen, D.; Kjeldsen, K.; Schramm, A.; Stepanauskas, R.; Jørgensen, B. B.

2013-12-01

Since most of the microbes in marine sediments remain uncultured, little is known about the mechanisms by which these natural communities degrade organic matter (OM). Likewise, little is known about the make-up of labile OM in marine sediments beyond general functional classes such as proteins, carbohydrates, and lipids, measured as monomers. However, microbes have complex interactions with specific polymers within these functional classes, which can be indicated by a microbe's enzymatic toolkit. We found that four single cell genomes of archaea have very different peptidase compositions than four single cells of bacteria, suggesting that archaea and bacteria may play different roles in OM degradation. We also found that predicted extracellular cysteine peptidases, which require chemically reducing conditions, were common in IMG database metagenomes from marine sediments, and absent in those from seawater. This suggests that the pathways, and not just the rates, of OM degradation may differ between seawater and sediments. By comparing enzyme classes in different organisms, or in different types of marine environments, we present an emerging view of the microbial potential for specific carbon remineralization pathways in marine sediments. In addition, the methods we present hold promise for characterizing OM degradation in any environment where genomic information is available.
Commensurate distances and similar motifs in genetic congruence and protein interaction networks in yeast

PubMed Central

Ye, Ping; Peyser, Brian D; Spencer, Forrest A; Bader, Joel S

2005-01-01

Background In a genetic interaction, the phenotype of a double mutant differs from the combined phenotypes of the underlying single mutants. When the single mutants have no growth defect, but the double mutant is lethal or exhibits slow growth, the interaction is termed synthetic lethality or synthetic fitness. These genetic interactions reveal gene redundancy and compensating pathways. Recently available large-scale data sets of genetic interactions and protein interactions in Saccharomyces cerevisiae provide a unique opportunity to elucidate the topological structure of biological pathways and how genes function in these pathways. Results We have defined congruent genes as pairs of genes with similar sets of genetic interaction partners and constructed a genetic congruence network by linking congruent genes. By comparing path lengths in three types of networks (genetic interaction, genetic congruence, and protein interaction), we discovered that high genetic congruence not only exhibits correlation with direct protein interaction linkage but also exhibits commensurate distance with the protein interaction network. However, consistent distances were not observed between genetic and protein interaction networks. We also demonstrated that congruence and protein networks are enriched with motifs that indicate network transitivity, while the genetic network has both transitive (triangle) and intransitive (square) types of motifs. These results suggest that robustness of yeast cells to gene deletions is due in part to two complementary pathways (square motif) or three complementary pathways, any two of which are required for viability (triangle motif). Conclusion Genetic congruence is superior to genetic interaction in prediction of protein interactions and function associations. Genetically interacting pairs usually belong to parallel compensatory pathways, which can generate transitive motifs (any two of three pathways needed) or intransitive motifs (either of two pathways needed). PMID:16283923
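The idea of congruent genes described above (pairs of genes with similar sets of genetic interaction partners) can be illustrated with a short sketch. The scoring below uses the Jaccard overlap of partner sets, which is an assumption made for brevity and may differ from the statistic the authors used; the gene labels and the 0.5 threshold are hypothetical.

```python
from itertools import combinations


def partner_sets(interactions):
    """Map each gene to the set of its genetic interaction partners."""
    partners = {}
    for a, b in interactions:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return partners


def congruence_network(interactions, threshold=0.5):
    """Link gene pairs whose partner sets overlap strongly.

    Assumption: overlap is scored with the Jaccard index; the threshold
    is an illustrative choice, not a value taken from the abstract.
    """
    partners = partner_sets(interactions)
    edges = {}
    for g1, g2 in combinations(sorted(partners), 2):
        # Ignore the pair's own interaction when comparing partner sets.
        p1 = partners[g1] - {g2}
        p2 = partners[g2] - {g1}
        union = p1 | p2
        if not union:
            continue
        score = len(p1 & p2) / len(union)
        if score >= threshold:
            edges[(g1, g2)] = score
    return edges


# Hypothetical toy data: A and B share all partners, C shares only one.
toy = [("A", "X"), ("A", "Y"), ("A", "Z"),
       ("B", "X"), ("B", "Y"), ("B", "Z"),
       ("C", "X"), ("C", "W")]
edges = congruence_network(toy)
print(sorted(edges))
```

In this toy network A and B are linked as congruent although they do not interact with each other directly, which is the distinction between the congruence network and the underlying genetic interaction network.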
Representing metabolic pathway information: an object-oriented approach.

PubMed

Ellis, L B; Speedie, S M; McLeish, R

1998-01-01

The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD) is a website providing information and dynamic links for microbial metabolic pathways, enzyme reactions, and their substrates and products. The Compound, Organism, Reaction and Enzyme (CORE) object-oriented database management system was developed to contain and serve this information. CORE was developed using Java, an object-oriented programming language, and PSE persistent object classes from Object Design, Inc. CORE dynamically generates descriptive web pages for reactions, compounds and enzymes, and reconstructs ad hoc pathway maps starting from any UM-BBD reaction. CORE code is available from the authors upon request. CORE is accessible through the UM-BBD at: http://www.labmed.umn.edu/umbbd/index.html.
Co-regulation of pluripotency and genetic integrity at the genomic level.

PubMed

Cooper, Daniel J; Walter, Christi A; McCarrey, John R

2014-11-01

The Disposable Soma Theory holds that genetic integrity will be maintained at more pristine levels in germ cells than in somatic cells because of the unique role germ cells play in perpetuating the species. We tested the hypothesis that the same concept applies to pluripotent cells compared to differentiated cells. Analyses of transcriptome and cistrome databases, along with canonical pathway analysis and chromatin immunoprecipitation confirmed differential expression of DNA repair and cell death genes in embryonic stem cells and induced pluripotent stem cells relative to fibroblasts, and predicted extensive direct and indirect interactions between the pluripotency and genetic integrity gene networks in pluripotent cells. These data suggest that enhanced maintenance of genetic integrity is fundamentally linked to the epigenetic state of pluripotency at the genomic level. In addition, these findings demonstrate how a small number of key pluripotency factors can regulate large numbers of downstream genes in a pathway-specific manner. Copyright © 2014. Published by Elsevier B.V.
Gene Polymorphism Studies in a Teaching Laboratory

NASA Astrophysics Data System (ADS)

Shultz, Jeffry

2009-02-01

I present a laboratory procedure for illustrating transcription, post-transcriptional modification, gene conservation, and comparative genetics for use in undergraduate biology education. Students are individually assigned genes in a targeted biochemical pathway, for which they design and test polymerase chain reaction (PCR) primers. In this example, students used genes annotated for the steroid biosynthesis pathway in soybean. The authoritative Kyoto Encyclopedia of Genes and Genomes (KEGG) interactive database and other online resources were used to design primers based first on soybean expressed sequence tags (ESTs), then on ESTs from an alternate organism if soybean sequence was unavailable. Students designed a total of 50 gene-based primer pairs (37 soybean, 13 alternative) and tested these for polymorphism state and similarity between two soybean and two pea lines. Student assessment was based on acquisition of laboratory skills and successful project completion. This simple procedure illustrates conservation of genes and is not limited to soybean or pea. Cost-per-student estimates are included, along with a detailed protocol and flow diagram of the procedure.

A two-step approach for mining patient treatment pathways in administrative healthcare databases.

PubMed

Najjar, Ahmed; Reinharz, Daniel; Girouard, Catherine; Gagné, Christian

2018-05-01

Clustering electronic medical records allows the discovery of information on healthcare practices. Entries in such medical records are usually composed of a succession of diagnostic or therapeutic steps. The corresponding processes are complex and heterogeneous since they depend on medical knowledge integrating clinical guidelines, the physician's individual experience, and patient data and conditions. To analyze such data, we first propose to cluster medical visits, consultations, and hospital stays into homogeneous groups, and then to construct higher-level patient treatment pathways over these different groups. These pathways are then also clustered to distill typical pathways, enabling interpretation of clusters by experts. This approach is evaluated on a real-world administrative database of elderly people in Québec suffering from heart failure. Copyright © 2018 Elsevier B.V. All rights reserved.
Proteome reference map and regulation network of neonatal rat cardiomyocyte

PubMed Central

Li, Zi-jian; Liu, Ning; Han, Qi-de; Zhang, You-yi

2011-01-01

Aim: To study and establish a proteome reference map and regulation network of neonatal rat cardiomyocytes. Methods: Cultured cardiomyocytes of neonatal rats were used. All proteins expressed in the cardiomyocytes were separated and identified by two-dimensional polyacrylamide gel electrophoresis (2-DE) and matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS). Biological networks and pathways of the neonatal rat cardiomyocytes were analyzed using the Ingenuity Pathway Analysis (IPA) program (www.ingenuity.com). A 2-DE database was made accessible on-line by Make2ddb package on a web server. Results: More than 1000 proteins were separated on 2D gels, and 148 proteins were identified. The identified proteins were used for the construction of an extensible markup language-based database. Biological networks and pathways were constructed to analyze the functions associated with cardiomyocyte proteins in the database. The 2-DE database of rat cardiomyocyte proteins can be accessed at http://2d.bjmu.edu.cn. Conclusion: A proteome reference map and regulation network of the neonatal rat cardiomyocytes have been established, which may serve as an international platform for storage, analysis and visualization of cardiomyocyte proteomic data. PMID:21841810
Characterizing Protein Interactions Employing a Genome-Wide siRNA Cellular Phenotyping Screen

PubMed Central

Suratanee, Apichat; Schaefer, Martin H.; Betts, Matthew J.; Soons, Zita; Mannsperger, Heiko; Harder, Nathalie; Oswald, Marcus; Gipp, Markus; Ramminger, Ellen; Marcus, Guillermo; Männer, Reinhard; Rohr, Karl; Wanker, Erich; Russell, Robert B.; Andrade-Navarro, Miguel A.; Eils, Roland; König, Rainer

2014-01-01

Characterizing the activating and inhibiting effect of protein-protein interactions (PPI) is fundamental to gain insight into the complex signaling system of a human cell. A plethora of methods has been suggested to infer PPI from data on a large scale, but none of them is able to characterize the effect of this interaction. Here, we present a novel computational development that employs mitotic phenotypes of a genome-wide RNAi knockdown screen and enables identifying the activating and inhibiting effects of PPIs. Exemplarily, we applied our technique to a knockdown screen of HeLa cells cultivated at standard conditions. Using a machine learning approach, we obtained high accuracy (82% AUC of the receiver operating characteristics) by cross-validation using 6,870 known activating and inhibiting PPIs as gold standard. We predicted de novo unknown activating and inhibiting effects for 1,954 PPIs in HeLa cells covering the ten major signaling pathways of the Kyoto Encyclopedia of Genes and Genomes, and made these predictions publicly available in a database. We finally demonstrate that the predicted effects can be used to cluster knockdown genes of similar biological processes in coherent subgroups. The characterization of the activating or inhibiting effect of individual PPIs opens up new perspectives for the interpretation of large datasets of PPIs and thus considerably increases the value of PPIs as an integrated resource for studying the detailed function of signaling pathways of the cellular system of interest. PMID:25255318
Critical assessment of human metabolic pathway databases: a stepping stone for future integration

PubMed Central

2011-01-01

Background Multiple pathway databases are available that describe the human metabolic network and have proven their usefulness in many applications, ranging from the analysis and interpretation of high-throughput data to their use as a reference repository. However, so far the various human metabolic networks described by these databases have not been systematically compared and contrasted, nor has the extent to which they differ been quantified. For a researcher using these databases for particular analyses of human metabolism, it is crucial to know the extent of the differences in content and their underlying causes. Moreover, the outcomes of such a comparison are important for ongoing integration efforts. Results We compared the genes, EC numbers and reactions of five frequently used human metabolic pathway databases. The overlap is surprisingly low, especially on reaction level, where the databases agree on 3% of the 6968 reactions they have combined. Even for the well-established tricarboxylic acid cycle the databases agree on only 5 out of the 30 reactions in total. We identified the main causes for the lack of overlap. Importantly, the databases are partly complementary. Other explanations include the number of steps a conversion is described in and the number of possible alternative substrates listed. Missing metabolite identifiers and ambiguous names for metabolites also affect the comparison. Conclusions Our results show that each of the five networks compared provides us with a valuable piece of the puzzle of the complete reconstruction of the human metabolic network. To enable integration of the networks, next to a need for standardizing the metabolite names and identifiers, the conceptual differences between the databases should be resolved. Considerable manual intervention is required to reach the ultimate goal of a unified and biologically accurate model for studying the systems biology of human metabolism. Our comparison provides a stepping stone for such an endeavor. PMID:21999653
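The agreement figures quoted above (reactions shared by all databases relative to the reactions they contain combined) correspond to a simple set computation, sketched below. The database names and reaction identifiers are hypothetical, and the sketch assumes identifiers have already been mapped to a common namespace, which the abstract notes is itself a major source of disagreement.

```python
def consensus_fraction(databases):
    """Return (n_shared, n_combined, fraction) for a dict of reaction sets.

    n_shared   : reactions present in every database
    n_combined : reactions present in at least one database
    """
    sets = [set(reactions) for reactions in databases.values()]
    if not sets:
        return 0, 0, 0.0
    combined = set().union(*sets)
    shared = set.intersection(*sets)
    fraction = len(shared) / len(combined) if combined else 0.0
    return len(shared), len(combined), fraction


# Hypothetical toy content; real databases hold thousands of reactions.
toy_databases = {
    "db_a": {"R1", "R2", "R3"},
    "db_b": {"R1", "R3", "R4"},
    "db_c": {"R1", "R5"},
}
n_shared, n_combined, fraction = consensus_fraction(toy_databases)
print(f"{n_shared} of {n_combined} reactions shared ({fraction:.0%})")
```

Because the denominator is the union, adding a database can only keep the fraction the same or lower it, which is one reason a five-way comparison yields such a small consensus.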
The Candidate Cancer Gene Database: a database of cancer driver genes from forward genetic screens in mice.

PubMed

Abbott, Kenneth L; Nyre, Erik T; Abrahante, Juan; Ho, Yen-Yi; Isaksson Vogel, Rachel; Starr, Timothy K

2015-01-01

Identification of cancer driver gene mutations is crucial for advancing cancer therapeutics. Due to the overwhelming number of passenger mutations in the human tumor genome, it is difficult to pinpoint causative driver genes. Using transposon mutagenesis in mice many laboratories have conducted forward genetic screens and identified thousands of candidate driver genes that are highly relevant to human cancer. Unfortunately, this information is difficult to access and utilize because it is scattered across multiple publications using different mouse genome builds and strength metrics. To improve access to these findings and facilitate meta-analyses, we developed the Candidate Cancer Gene Database (CCGD, http://ccgd-starrlab.oit.umn.edu/). The CCGD is a manually curated database containing a unified description of all identified candidate driver genes and the genomic location of transposon common insertion sites (CISs) from all currently published transposon-based screens. To demonstrate relevance to human cancer, we performed a modified gene set enrichment analysis using KEGG pathways and show that human cancer pathways are highly enriched in the database. We also used hierarchical clustering to identify pathways enriched in blood cancers compared to solid cancers. The CCGD is a novel resource available to scientists interested in the identification of genetic drivers of cancer. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Identification of Key Transcription Factors Associated with Lung Squamous Cell Carcinoma

PubMed Central

Zhang, Feng; Chen, Xia; Wei, Ke; Liu, Daoming; Xu, Xiaodong; Zhang, Xing; Shi, Hong

2017-01-01

Background Lung squamous cell carcinoma (lung SCC) is a common type of lung cancer, but its mechanism of pathogenesis is unclear. The aim of this study was to identify key transcription factors in lung SCC and elucidate its mechanism. Material/Methods Six published microarray datasets of lung SCC were downloaded from Gene Expression Omnibus (GEO) for integrated bioinformatics analysis. Significance analysis of microarrays was used to identify differentially expressed genes (DEGs) between lung SCC and normal controls. The biological functions and signaling pathways of DEGs were mapped in the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, respectively. A transcription factor gene regulatory network was used to obtain insights into the functions of DEGs. Results A total of 1,011 genes, including 539 upregulated genes and 462 downregulated genes, were filtered as DEGs between lung SCC and normal controls. DEGs were significantly enriched in cell cycle, DNA replication, p53 signaling pathway, pathways in cancer, adherens junction, and cell adhesion molecules signaling pathways. There were 57 transcription factors identified, which were used to construct a regulatory network. The network consisted of 736 interactions between 49 transcription factors and 486 DEGs. NFIC, BRCA1, and NFATC2 were the top 3 transcription factors that had the highest connectivity with DEGs and that regulated 83, 82, and 75 DEGs in the network, respectively. Conclusions NFIC, BRCA1, and NFATC2 might be the key transcription factors in the development of lung SCC by regulating the genes involved in cell cycle and DNA replication pathways. PMID:28081052
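Ranking transcription factors by connectivity, as in the abstract above, amounts to counting the distinct target genes each factor regulates in the network. The sketch below shows one way to do this; the edge list uses placeholder names (TF_A, g1, ...) and is not taken from the study.

```python
from collections import defaultdict


def rank_by_connectivity(edges):
    """Rank regulators by the number of distinct target genes.

    edges: iterable of (transcription_factor, target_gene) pairs.
    Returns a list of (transcription_factor, n_targets), highest first;
    ties are broken alphabetically so the output is deterministic.
    """
    targets = defaultdict(set)
    for tf, gene in edges:
        targets[tf].add(gene)
    ranked = [(tf, len(genes)) for tf, genes in targets.items()]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Hypothetical regulatory edges; the repeated pair is counted once.
toy_edges = [
    ("TF_A", "g1"), ("TF_A", "g2"), ("TF_A", "g3"),
    ("TF_B", "g1"), ("TF_B", "g2"),
    ("TF_C", "g4"),
    ("TF_A", "g1"),
]
ranking = rank_by_connectivity(toy_edges)
print(ranking[:3])
```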
Genome sequence analysis of a flocculant-producing bacterium, Paenibacillus shenyangensis.

PubMed

Fu, Lili; Jiang, Binhui; Liu, Jinliang; Zhao, Xin; Liu, Qian; Hu, Xiaomin

2016-03-01

To explore the metabolic process of Paenibacillus shenyangensis that is an efficient bioflocculant-producing bacterium. The biosynthesis mechanism of bioflocculation was used to enrich the genome of Paenibacillus shenyangensis and provide a basis for molecular genetics and functional genomics analyses. According to the analysis of de novo assembly, a total of 5,501,467 bp clean reads were generated, and were assembled into 92 contigs. 4800 unigenes were predicted of which 4393 were annotated showing a specific gene function in the NCBI-Nr database. 3423 genes were found in the database of cluster of orthologous groups. Among the 168 Kyoto Encyclopedia of Genes and Genomes database, cell growth and metabolism were the main biological processes, and a potential metabolic pathway was predicted from glucose to exopolysaccharide within the starch and sucrose metabolism pathway. By using the high-throughput sequencing technology, we provide a genome analysis of Paenibacillus shenyangensis that predicts the main metabolic processes and a potential pathway of exopolysaccharide biosynthesis.
Perspective: Interactive material property databases through aggregation of literature data

NASA Astrophysics Data System (ADS)

Seshadri, Ram; Sparks, Taylor D.

2016-05-01

Searchable, interactive databases of material properties, particularly those relating to functional materials (magnetics, thermoelectrics, photovoltaics, etc.) are curiously missing from discussions of machine-learning and other data-driven methods for advancing new materials discovery. Here we discuss the manual aggregation of experimental data from the published literature for the creation of interactive databases that allow the original experimental data as well as additional metadata to be visualized in an interactive manner. The databases described involve materials for thermoelectric energy conversion, and for the electrodes of Li-ion batteries. The data can be subject to machine-learning, accelerating the discovery of new materials.
Integrated analysis of microRNA and gene expression profiles reveals a functional regulatory module associated with liver fibrosis.

PubMed

Chen, Wei; Zhao, Wenshan; Yang, Aiting; Xu, Anjian; Wang, Huan; Cong, Min; Liu, Tianhui; Wang, Ping; You, Hong

2017-12-15

Liver fibrosis, characterized by the excessive accumulation of extracellular matrix (ECM) proteins, represents the final common pathway of chronic liver inflammation. Ever-increasing evidence indicates that microRNA (miRNA) dysregulation has important implications in the different stages of liver fibrosis. However, the details of miRNA-gene regulation in this disease remain unclear. The publicly available Gene Expression Omnibus (GEO) datasets of patients suffering from cirrhosis were extracted for integrated analysis. Differentially expressed miRNAs (DEMs) and genes (DEGs) were identified using the GEO2R web tool. Putative target gene prediction of DEMs was carried out using the intersection of five major algorithms: DIANA-microT, TargetScan, miRanda, PICTAR5 and miRWalk. A functional miRNA-gene regulatory network (FMGRN) was constructed based on the computational target predictions at the sequence level and the inverse expression relationships between DEMs and DEGs. The DAVID web server was selected to perform KEGG pathway enrichment analysis. A functional miRNA-gene regulatory module was generated based on the biological interpretation. Internal connections among genes in the liver fibrosis-related module were determined using the String database. MiRNA-gene regulatory modules related to liver fibrosis were experimentally verified in LX-2 cells stimulated with recombinant human TGFβ1 and treated with specific miRNA inhibitors. In total, we identified 85 dysregulated miRNAs and 923 dysregulated genes in liver cirrhosis biopsy samples compared to their normal controls. All evident miRNA-gene pairs were identified and assembled into the FMGRN, which consisted of 990 regulations between 51 miRNAs and 275 genes, forming two big sub-networks that were defined as down-network and up-network, respectively. KEGG pathway enrichment analysis revealed that the up-network was prominently involved in several KEGG pathways, among which "Focal adhesion", "PI3K-Akt signaling pathway" and "ECM-receptor interaction" were significantly enriched (adjusted p<0.001). Genes enriched in these pathways coupled with their regulatory miRNAs formed a functional miRNA-gene regulatory module that contains 7 miRNAs, 22 genes and 42 miRNA-gene connections. Gene interaction analysis based on the String database revealed that 8 out of 22 genes were highly clustered. Finally, we experimentally confirmed a functional regulatory module containing 5 miRNAs (miR-130b-3p, miR-148a-3p, miR-345-5p, miR-378a-3p, and miR-422a) and 6 genes (COL6A1, COL6A2, COL6A3, PIK3R3, COL1A1, CCND2) associated with liver fibrosis. Our integrated analysis of miRNA and gene expression profiles highlighted a functional miRNA-gene regulatory module associated with liver fibrosis, which, to some extent, may provide important clues to better understand the underlying pathogenesis of liver fibrosis. Copyright © 2017. Published by Elsevier B.V.
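The two filters used to build the regulatory network above (targets predicted by every algorithm, then an inverse expression relationship between miRNA and gene) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the algorithm labels, miRNA and gene names, and log fold-change values are all hypothetical, and "inverse" is assumed to mean log fold changes of opposite sign.

```python
def consensus_targets(predictions):
    """Keep only the (mirna, gene) pairs predicted by every algorithm.

    predictions: dict mapping algorithm name -> set of (mirna, gene) pairs.
    """
    pair_sets = [set(pairs) for pairs in predictions.values()]
    if not pair_sets:
        return set()
    return set.intersection(*pair_sets)


def inverse_pairs(pairs, mirna_lfc, gene_lfc):
    """Keep pairs whose miRNA and gene change in opposite directions.

    Assumption: direction is read from the sign of the log fold change;
    pairs with a missing or zero value on either side are dropped.
    """
    kept = set()
    for mirna, gene in pairs:
        m = mirna_lfc.get(mirna)
        g = gene_lfc.get(gene)
        if m is None or g is None:
            continue
        if m * g < 0:
            kept.add((mirna, gene))
    return kept


# Hypothetical predictions from three placeholder algorithms.
toy_predictions = {
    "algo_1": {("miR-x", "GENE1"), ("miR-x", "GENE2"), ("miR-y", "GENE3")},
    "algo_2": {("miR-x", "GENE1"), ("miR-x", "GENE2"), ("miR-y", "GENE3"),
               ("miR-y", "GENE4")},
    "algo_3": {("miR-x", "GENE1"), ("miR-x", "GENE2"), ("miR-y", "GENE3")},
}
toy_mirna_lfc = {"miR-x": -1.5, "miR-y": 2.0}
toy_gene_lfc = {"GENE1": 2.1, "GENE2": -0.8, "GENE3": -1.2}

candidates = consensus_targets(toy_predictions)
network = inverse_pairs(candidates, toy_mirna_lfc, toy_gene_lfc)
print(sorted(network))
```

Requiring agreement among all predictors trades sensitivity for specificity, which is why the consensus step is applied before the expression filter rather than after it.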
Investigating dysregulated pathways in Staphylococcus aureus (SA) exposed macrophages based on pathway interaction network.

PubMed

Zhou, Wei; Zhang, Yan; Li, Yue-Hua; Wang, Shuang; Zhang, Jing-Jing; Zhang, Cui-Xia; Zhang, Zhi-Sheng

2017-02-01

This work aimed to identify dysregulated pathways for Staphylococcus aureus (SA) exposed macrophages based on a pathway interaction network (PIN). The inference of dysregulated pathways comprised four steps: preparing gene expression data, protein-protein interaction (PPI) data and pathway data; constructing a PIN based on these data and the Pearson correlation coefficient (PCC); selecting a seed pathway from the PIN by computing an activity score for each pathway using principal component analysis (PCA); and investigating dysregulated pathways in a minimum set of pathways (MSP) using the seed pathway and the area under the receiver operating characteristic curve (AUC) of a support vector machine (SVM) model. A total of 20,545 genes, 449,833 interactions and 1189 pathways were obtained in the gene expression data, PPI data and pathway data, respectively. The PIN consisted of 8388 interactions and 1189 nodes, and "Respiratory electron transport, ATP synthesis by chemiosmotic coupling, and heat production by uncoupling proteins" was identified as the seed pathway. Finally, 15 dysregulated pathways in the MSP (AUC=0.999) were obtained for SA infected samples, such as Respiratory electron transport and DNA Replication. We have identified 15 dysregulated pathways for SA infected macrophages based on the PIN. The findings might provide potential biomarkers for early detection and therapy of SA infection, and give insights into the molecular mechanism underlying SA infections. However, how these dysregulated pathways work together still needs to be studied. Copyright © 2016 Elsevier Ltd. All rights reserved.
The role of drug profiles as similarity metrics: applications to repurposing, adverse effects detection and drug-drug interactions.

PubMed

Vilar, Santiago; Hripcsak, George

2017-07-01

Explosion of the availability of big data sources along with the development in computational methods provides a useful framework to study drugs' actions, such as interactions with pharmacological targets and off-targets. Databases related to protein interactions, adverse effects and genomic profiles are available to be used for the construction of computational models. In this article, we focus on the description of biological profiles for drugs that can be used as a system to compare similarity and create methods to predict and analyze drugs' actions. We highlight profiles constructed with different biological data, such as target-protein interactions, gene expression measurements, adverse effects and disease profiles. We focus on the discovery of new targets or pathways for drugs already in the pharmaceutical market, also called drug repurposing, in the interaction with off-targets responsible for adverse reactions and in drug-drug interaction analysis. The current and future applications, strengths and challenges facing all these methods are also discussed. Biological profiles or signatures are an important source of data generation to deeply analyze biological actions with important implications in drug-related studies. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
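Comparing drugs through their biological profiles, as discussed above, is often done with a set-overlap measure when the profile is binary (for example, the set of targets or adverse effects associated with a drug). The sketch below uses the Tanimoto (Jaccard) coefficient as one common choice; the abstract does not name a specific metric, and the drug and feature labels are hypothetical.

```python
def tanimoto(profile_a, profile_b):
    """Tanimoto (Jaccard) similarity between two binary profiles given as sets."""
    a, b = set(profile_a), set(profile_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(query, profiles):
    """Rank all other drugs by profile similarity to the query drug."""
    scores = [
        (name, tanimoto(profiles[query], features))
        for name, features in profiles.items()
        if name != query
    ]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical target profiles.
toy_profiles = {
    "drug_1": {"T1", "T2", "T3"},
    "drug_2": {"T2", "T3", "T4"},
    "drug_3": {"T5"},
}
neighbours = most_similar("drug_1", toy_profiles)
print(neighbours)
```

A high similarity between a marketed drug and a drug with a known indication or adverse effect is then treated as a hypothesis for repurposing or safety follow-up, not as evidence on its own.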
A Systems Biology Approach to the Coordination of Defensive and Offensive Molecular Mechanisms in the Innate and Adaptive Host–Pathogen Interaction Networks

PubMed Central

Wu, Chia-Chou; Chen, Bor-Sen

2016-01-01

Infected zebrafish coordinates defensive and offensive molecular mechanisms in response to Candida albicans infections, and invasive C. albicans coordinates corresponding molecular mechanisms to interact with the host. However, knowledge of the ensuing infection-activated signaling networks in both host and pathogen and their interspecific crosstalk during the innate and adaptive phases of the infection processes remains incomplete. In the present study, dynamic network modeling, protein interaction databases, and dual transcriptome data from zebrafish and C. albicans during infection were used to infer infection-activated host–pathogen dynamic interaction networks. The consideration of host–pathogen dynamic interaction systems as innate and adaptive loops and subsequent comparisons of inferred innate and adaptive networks indicated previously unrecognized crosstalk between known pathways and suggested roles of immunological memory in the coordination of host defensive and offensive molecular mechanisms to achieve specific and powerful defense against pathogens. Moreover, pathogens enhance intraspecific crosstalk and abrogate host apoptosis to accommodate enhanced host defense mechanisms during the adaptive phase. Accordingly, links between physiological phenomena and changes in the coordination of defensive and offensive molecular mechanisms highlight the importance of host–pathogen molecular interaction networks, and consequent inferences of the host–pathogen relationship could be translated into biomedical applications. PMID:26881892
Analyzing the differentially expressed genes and pathway cross-talk in aggressive breast cancer.

PubMed

Chen, Wen-Yan; Wu, Fang; You, Zhen-Yu; Zhang, Zhan-Min; Guo, Yu-Ling; Zhong, Lu-Xing

2015-01-01

The aim of this study was to explore the genes and pathways involved in aggressive breast cancer cells. The gene expression profiles of GSE40057, including four aggressive breast cell lines and six less aggressive cell lines, were downloaded from the Gene Expression Omnibus (GEO) database. The gene differential expression analysis was carried out with limma software with the method of Bayes for multiple tests. The gene ontology (GO) term enrichment and pathway cross-talk analysis were performed with the online tool of DAVID and Cytoscape software. A total of 401 differentially expressed genes (DEG), such as pentraxin 3 (PTX3), snail family zinc finger 2 (SNAI2), interleukin-8/6 (IL-8/6), osteonectin (SPARC), matrix metallopeptidase-1 (MMP-1) and Ras-related protein Rab-25 (Rab25), were identified between aggressive and less aggressive cell lines. They were mainly enriched in the GO terms of response to wounding, negative regulation of cell proliferation and calcium binding. Pathways in cancer dysfunctionally interacted with glyoxylate and dicarboxylate metabolism (P < 0.0001), basal transcription factors (P < 0.0001), tyrosine metabolism (P < 0.0001), calcium signaling pathway (P = 0.0021), FcγR-mediated phagocytosis (P = 0.0022), metabolism of xenobiotics by cytochrome P450 (P = 0.0097) and phagosome (P = 0.0102). The screened aggressive cancer-associated DEG (PTX3, SNAI2, IL-8/6, SPARC, MMP-1 and Rab25) and significant pathways (calcium signaling pathway, tyrosine metabolism, alanine, aspartate and glutamate metabolism) give us new insights into the mechanism of aggressive breast cancer cells, and these DEG may become promising target genes in the treatment of metastatic breast cancer. © 2014 The Authors. Journal of Obstetrics and Gynaecology Research © 2014 Japan Society of Obstetrics and Gynecology.
MESSI: metabolic engineering target selection and best strain identification tool.

PubMed

Kang, Kang; Li, Jun; Lim, Boon Leong; Panagiotou, Gianni

2015-01-01

Metabolic engineering and synthetic biology are synergistically related fields for manipulating target pathways and designing microorganisms that can act as chemical factories. Saccharomyces cerevisiae's ideal bioprocessing traits make yeast a very attractive chemical factory for production of fuels, pharmaceuticals, nutraceuticals as well as a wide range of chemicals. However, future attempts at engineering S. cerevisiae's metabolism using synthetic biology need to move towards more integrative models that incorporate the high connectivity of metabolic pathways and regulatory processes and the interactions in genetic elements across those pathways and processes. To contribute in this direction, we have developed Metabolic Engineering target Selection and best Strain Identification tool (MESSI), a web server for predicting efficient chassis and regulatory components for yeast bio-based production. The server provides an integrative platform for users to analyse ready-to-use public high-throughput metabolomic data, which are transformed to metabolic pathway activities for identifying the most efficient S. cerevisiae strain for the production of a compound of interest. As input MESSI accepts metabolite KEGG IDs or pathway names. MESSI outputs a ranked list of S. cerevisiae strains based on aggregation algorithms. Furthermore, through a genome-wide association study of the metabolic pathway activities with the strains' natural variation, MESSI prioritizes genes and small variants as potential regulatory points and promising metabolic engineering targets. Users can choose various parameters in the whole process such as (i) weight and expectation of each metabolic pathway activity in the final ranking of the strains, (ii) Weighted AddScore Fuse or Weighted Borda Fuse aggregation algorithm, (iii) type of variants to be included, (iv) variant sets in different biological levels. Database URL: http://sbb.hku.hk/MESSI/. © The Author(s) 2015. Published by Oxford University Press.
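The weighted rank aggregation mentioned in the MESSI abstract can be illustrated with a generic weighted Borda count. The sketch below is not MESSI's implementation: the function name `weighted_borda_fuse`, the pathway and strain labels, and the weights are all hypothetical, and it assumes every per-pathway ranking lists strains from best to worst.

```python
def weighted_borda_fuse(rankings, weights):
    """Fuse several rankings into one with a weighted Borda count.

    rankings: dict mapping a (hypothetical) pathway name to a list of
              strains ordered from best to worst.
    weights:  dict mapping the same pathway names to a numeric weight;
              a missing weight defaults to 1.0.
    """
    scores = {}
    for name, ranked in rankings.items():
        n = len(ranked)
        w = weights.get(name, 1.0)
        for position, strain in enumerate(ranked):
            # The best strain gets n points, the worst gets 1, scaled by w.
            scores[strain] = scores.get(strain, 0.0) + w * (n - position)
    # Highest fused score first; ties are broken alphabetically.
    return sorted(scores, key=lambda s: (-scores[s], s))


# Toy input: two pathway-activity rankings over three made-up strains.
rankings = {
    "pathway_A": ["s1", "s2", "s3"],
    "pathway_B": ["s3", "s1", "s2"],
}
weights = {"pathway_A": 1.0, "pathway_B": 3.0}
fused = weighted_borda_fuse(rankings, weights)
```

Raising the weight of one pathway pulls the fused list toward that pathway's ranking, which is the effect the user-chosen weights in option (i) would be expected to have.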
PSMA redirects cell survival signaling from the MAPK to the PI3K-AKT pathways to promote the progression of prostate cancer

PubMed Central

Caromile, Leslie Ann; Dortche, Kristina; Rahman, M. Mamunur; Grant, Christina L.; Stoddard, Christopher; Ferrer, Fernando A.; Shapiro, Linda H.

2017-01-01

Increased abundance of the prostate-specific membrane antigen (PSMA) on prostate epithelium is a hallmark of advanced metastatic prostate cancer (PCa) and correlates negatively with prognosis. However, direct evidence that PSMA functionally contributes to PCa progression remains elusive. We generated mice bearing PSMA-positive or PSMA-negative PCa by crossing PSMA-deficient mice with transgenic PCa (TRAMP) models, enabling direct assessment of PCa incidence and progression in the presence or absence of PSMA. Compared with PSMA-positive tumors, PSMA-negative tumors were smaller, lower-grade, and more apoptotic with fewer blood vessels, consistent with the recognized proangiogenic function of PSMA. Relative to PSMA-positive tumors, tumors lacking PSMA had less than half the abundance of type 1 insulin-like growth factor receptor (IGF-1R), less activity in the survival pathway mediated by PI3K-AKT signaling, and more activity in the proliferative pathway mediated by MAPK-ERK1/2 signaling. Biochemically, PSMA interacted with the scaffolding protein RACK1, disrupting signaling between the β1 integrin and IGF-1R complex to the MAPK pathway, enabling activation of the AKT pathway instead. Manipulation of PSMA abundance in PCa cell lines recapitulated this signaling pathway switch. Analysis of published databases indicated that IGF-1R abundance, cell proliferation, and expression of transcripts for antiapoptotic markers positively correlated with PSMA abundance in patients, suggesting that this switch may be relevant to human PCa. Our findings suggest that increase in PSMA in prostate tumors contributes to progression by altering normal signal transduction pathways to drive PCa progression and that enhanced signaling through the IGF-1R/β1 integrin axis may occur in other tumors. PMID:28292957
DIMA 3.0: Domain Interaction Map.

PubMed

Luo, Qibin; Pagel, Philipp; Vilne, Baiba; Frishman, Dmitrij

2011-01-01

Domain Interaction MAp (DIMA, available at http://webclu.bio.wzw.tum.de/dima) is a database of predicted and known interactions between protein domains. It integrates 5807 structurally known interactions imported from the iPfam and 3did databases and 46,900 domain interactions predicted by four computational methods: domain phylogenetic profiling, domain pair exclusion algorithm, correlated mutations and domain interaction prediction in a discriminative way. Additionally, predictions are filtered to exclude those domain pairs that are reported as non-interacting by the Negatome database. The DIMA Web site allows users to calculate domain interaction networks either for a domain of interest or for entire organisms, and to explore them interactively using the Flash-based Cytoscape Web software.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system.

PubMed

AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

2015-11-19

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
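For readers unfamiliar with position weight matrices, the following minimal sketch shows one common way to build and apply one. It is not taken from the database described above: the binding sites are made up, and the uniform background of 0.25 and the pseudocount of 1.0 are assumptions.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    pwm = []
    for i in range(length):
        # Start every base at the pseudocount so no probability is zero.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in "ACGT"}
        )
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds scores of a sequence of matching length."""
    return sum(column[base] for column, base in zip(pwm, seq))


# Made-up aligned binding sites, used only to exercise the functions.
sites = ["TTGACA", "TTGACA", "TTGATA", "TTTACA"]
pwm = build_pwm(sites)
consensus_score = score(pwm, "TTGACA")
unrelated_score = score(pwm, "CCCGGG")
```

A sequence close to the consensus of the input sites scores higher than an unrelated one, which is what makes a PWM a compact quantitative summary of binding preference.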
Detection of Significant Pneumococcal Meningitis Biomarkers by Ego Network.

PubMed

Wang, Qian; Lou, Zhifeng; Zhai, Liansuo; Zhao, Haibin

2017-06-01

To identify significant biomarkers for detection of pneumococcal meningitis based on ego network. Based on the gene expression data of pneumococcal meningitis and global protein-protein interactions (PPIs) data recruited from open access databases, the authors constructed a differential co-expression network (DCN) to identify pneumococcal meningitis biomarkers in a network view. Here EgoNet algorithm was employed to screen the significant ego networks that could accurately distinguish pneumococcal meningitis from healthy controls, by sequentially seeking ego genes, searching candidate ego networks, refinement of candidate ego networks and significance analysis to identify ego networks. Finally, the functional inference of the ego networks was performed to identify significant pathways for pneumococcal meningitis. By differential co-expression analysis, the authors constructed the DCN that covered 1809 genes and 3689 interactions. From the DCN, a total of 90 ego genes were identified. Starting from these ego genes, three significant ego networks (Module 19, Module 70 and Module 71) that could predict clinical outcomes for pneumococcal meningitis were identified by EgoNet algorithm, and the corresponding ego genes were GMNN, MAD2L1 and TPX2, respectively. Pathway analysis showed that these three ego networks were related to CDT1 association with the CDC6:ORC:origin complex, inactivation of APC/C via direct inhibition of the APC/C complex pathway, and DNA strand elongation, respectively. The authors successfully screened three significant ego modules which could accurately predict the clinical outcomes for pneumococcal meningitis and might play important roles in host response to pathogen infection in pneumococcal meningitis.
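In graph terms, an ego network is a focal node (the ego), its direct neighbours, and the edges among them. The sketch below extracts that subgraph from an edge list. It illustrates only this basic definition and does not reproduce the EgoNet algorithm's candidate search, refinement, or significance analysis; the node names are placeholders.

```python
def ego_network(edges, ego):
    """Return the nodes and edges of the ego network around `ego`.

    edges: iterable of (node_a, node_b) pairs from an undirected network.
    """
    edges = list(edges)
    neighbours = set()
    for a, b in edges:
        if a == ego:
            neighbours.add(b)
        elif b == ego:
            neighbours.add(a)
    nodes = neighbours | {ego}
    # Keep only edges whose two endpoints both lie inside the ego network.
    sub_edges = [(a, b) for a, b in edges if a in nodes and b in nodes]
    return nodes, sub_edges


# Placeholder network: g1 is the ego, and g4 and g5 lie outside its neighbourhood.
edges = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g3", "g4"), ("g5", "g4")]
nodes, sub_edges = ego_network(edges, "g1")
```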
Phylogenetically informed logic relationships improve detection of biological network organization

PubMed Central

2011-01-01

Background: A "phylogenetic profile" refers to the presence or absence of a gene across a set of organisms, and it has proven valuable for understanding gene functional relationships and network organization. Despite this success, few studies have attempted to search beyond just pairwise relationships among genes. Here we search for logic relationships involving three genes, and explore their potential application in gene network analyses. Results: Taking advantage of a phylogenetic matrix constructed from the large orthologs database Roundup, we invented a method to create balanced profiles for individual triplets of genes that guarantee equal weight on the different phylogenetic scenarios of coevolution between genes. When we applied this idea to LAPP, the method to search for logic triplets of genes, the balanced profiles resulted in significant performance improvement and the discovery of hundreds of thousands more putative triplets than unadjusted profiles. We found that logic triplets detected biological network organization and identified key proteins and their functions, ranging from neighbouring proteins in local pathways, to well separated proteins in the whole pathway, and to the interactions among different pathways at the system level. Finally, our case study suggested that the directionality in a logic relationship and the profile of a triplet could disclose the connectivity between the triplet and surrounding networks. Conclusion: Balanced profiles are superior to the raw profiles employed by traditional methods of phylogenetic profiling in searching for high order gene sets. Gene triplets can provide valuable information in detection of biological network organization and identification of key genes at different levels of cellular interaction. PMID:22172058
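A logic relationship among three genes can be pictured with binary presence/absence profiles: gene C follows a logic function of genes A and B if C's profile matches that function across organisms. The sketch below scores three candidate functions by simple agreement. It is a simplification for illustration, not the LAPP method or the balanced-profile construction described above, and the profiles are invented.

```python
# Candidate two-input logic functions over 0/1 presence values.
LOGIC = {
    "AND": lambda a, b: bool(a and b),
    "OR": lambda a, b: bool(a or b),
    "XOR": lambda a, b: bool(a) != bool(b),
}


def best_logic(profile_a, profile_b, profile_c):
    """Return the logic function whose output best matches profile_c.

    Each profile is a list of 0/1 values, one per organism, in the same order.
    """
    agreement = {}
    for name, func in LOGIC.items():
        hits = sum(
            1
            for a, b, c in zip(profile_a, profile_b, profile_c)
            if func(a, b) == bool(c)
        )
        agreement[name] = hits / len(profile_c)
    best = max(agreement, key=agreement.get)
    return best, agreement


# Invented profiles over six organisms; C is present only where both A and B are.
profile_a = [1, 1, 0, 0, 1, 0]
profile_b = [1, 0, 1, 0, 1, 1]
profile_c = [1, 0, 0, 0, 1, 0]
best, agreement = best_logic(profile_a, profile_b, profile_c)
```

Raw agreement counts every organism equally, so closely related organisms can dominate the score; that bias is the kind of problem the balanced profiles in the abstract are meant to address.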

BiologicalNetworks 2.0 - an integrative view of genome biology data

PubMed Central

2010-01-01

Background: A significant problem in the study of mechanisms of an organism's development is the elucidation of interrelated factors that affect the different levels of the organism, such as genes, biological molecules, cells, and cell systems. Numerous sources of heterogeneous data that exist for these subsystems are still not sufficiently integrated to give researchers a straightforward opportunity to analyze them together in the same frame of study. Systematic application of data integration methods is also hampered by a multitude of factors, such as the orthogonal nature of the integrated data and naming problems. Results: Here we report on a new version of BiologicalNetworks, a research environment for the integral visualization and analysis of heterogeneous biological data. BiologicalNetworks can be queried for properties of thousands of different types of biological entities (genes/proteins, promoters, COGs, pathways, binding sites, and others) and their relations (interactions, co-expression, co-citations, and others). The system includes the build-pathways infrastructure for molecular interactions/relations and module discovery in high-throughput experiments. Also implemented in BiologicalNetworks are the Integrated Genome Viewer and Comparative Genomics Browser applications, which allow for the search and analysis of gene regulatory regions and their conservation in multiple species in conjunction with molecular pathways/networks, experimental data and functional annotations. Conclusions: The new release of BiologicalNetworks together with its back-end database introduces extensive functionality for a more efficient integrated multi-level analysis of microarray, sequence, regulatory, and other data. BiologicalNetworks is freely available at http://www.biologicalnetworks.org. PMID:21190573
A gene network bioinformatics analysis for pemphigoid autoimmune blistering diseases.

PubMed

Barone, Antonio; Toti, Paolo; Giuca, Maria Rita; Derchi, Giacomo; Covani, Ugo

2015-07-01

In this theoretical study, a text mining search and clustering analysis of data related to genes potentially involved in human pemphigoid autoimmune blistering diseases (PAIBD) was performed using web tools to create a gene/protein interaction network. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was employed to identify a final set of PAIBD-involved genes and to calculate the overall significant interactions among genes: for each gene, the weighted number of links, or WNL, was registered and a clustering procedure was performed using the WNL analysis. Genes were ranked in class (leader, B, C, D and so on, up to orphans). An ontological analysis was performed for the set of 'leader' genes. Using the above-mentioned data network, 115 genes represented the final set; leader genes numbered 7 (intercellular adhesion molecule 1 (ICAM-1), interferon gamma (IFNG), interleukin (IL)-2, IL-4, IL-6, IL-8 and tumour necrosis factor (TNF)), class B genes were 13, whereas the orphans were 24. The ontological analysis attested that the molecular action was focused on extracellular space and cell surface, whereas the activation and regulation of the immunity system was widely involved. Despite the limited knowledge of the present pathologic phenomenon, attested by the presence of 24 genes revealing no protein-protein direct or indirect interactions, the network showed significant pathways gathered in several subgroups: cellular components, molecular functions, biological processes and the pathologic phenomenon obtained from the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The molecular basis for PAIBD was summarised and expanded, which will perhaps give researchers promising directions for the identification of new therapeutic targets.
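The weighted number of links (WNL) can be sketched as a per-gene sum of interaction scores, with genes that have no scored link reported as orphans. The code below assumes that definition (the abstract does not spell out the formula), uses placeholder gene names and scores, and omits the clustering into classes.

```python
def weighted_number_of_links(genes, scored_edges):
    """Compute a WNL per gene, a ranking of linked genes, and the orphans.

    genes:        list of gene identifiers in the final set.
    scored_edges: iterable of (gene_a, gene_b, score) tuples; both genes are
                  assumed to be members of `genes`.
    """
    wnl = {gene: 0.0 for gene in genes}
    for a, b, s in scored_edges:
        # Each interaction contributes its score to both partners.
        wnl[a] += s
        wnl[b] += s
    orphans = sorted(gene for gene, value in wnl.items() if value == 0.0)
    ranked = sorted((g for g in genes if wnl[g] > 0.0), key=lambda g: (-wnl[g], g))
    return wnl, ranked, orphans


# Placeholder genes and scores, not data from the study.
genes = ["g1", "g2", "g3", "g4"]
scored_edges = [("g1", "g2", 0.9), ("g1", "g3", 0.7)]
wnl, ranked, orphans = weighted_number_of_links(genes, scored_edges)
```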
Drug Repositioning by Kernel-Based Integration of Molecular Structure, Molecular Activity, and Phenotype Data

PubMed Central

Wang, Yongcui; Chen, Shilong; Deng, Naiyang; Wang, Yong

2013-01-01

Computational inference of novel therapeutic values for existing drugs, i.e., drug repositioning, offers great prospects for faster and lower-risk drug development. Previous research has indicated that chemical structures, target proteins, and side-effects could provide rich information in drug similarity assessment and further disease similarity. However, each single data source is important in its own way and data integration holds great promise for repositioning drugs more accurately. Here, we propose a new method for drug repositioning, PreDR (Predict Drug Repositioning), to integrate molecular structure, molecular activity, and phenotype data. Specifically, we characterize each drug by its profiles in chemical structure, target protein, and side-effect space, and define a kernel function to correlate drugs with diseases. Then we train a support vector machine (SVM) to computationally predict novel drug-disease interactions. PreDR is validated on a well-established drug-disease network with 1,933 interactions among 593 drugs and 313 diseases. By cross-validation, we find that chemical structure, drug target, and side-effects information are all predictive for drug-disease relationships. More experimentally observed drug-disease interactions can be revealed by integrating these three data sources. Comparison with existing methods demonstrates that PreDR is competitive both in accuracy and coverage. Follow-up database search and pathway analysis indicate that our new predictions are worthy of further experimental validation. In particular, several novel predictions are supported by clinical trials databases, which shows the significant prospects of PreDR in future drug treatment. In conclusion, our new method, PreDR, can serve as a useful tool in drug discovery to efficiently identify novel drug-disease interactions. In addition, our heterogeneous data integration framework can be applied to other problems. PMID:24244318
DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhu, Yitan; Xu, Yanxun; Helseth, Donald L.

Background: Genetic interactions play a critical role in cancer development. Existing knowledge about cancer genetic interactions is incomplete, especially lacking evidences derived from large-scale cancer genomics data. The Cancer Genome Atlas (TCGA) produces multimodal measurements across genomics and features of thousands of tumors, which provide an unprecedented opportunity to investigate the interplays of genes in cancer. Methods: We introduce Zodiac, a computational tool and resource to integrate existing knowledge about cancer genetic interactions with new information contained in TCGA data. It is an evolution of existing knowledge by treating it as a prior graph, integrating it with a likelihood model derived by Bayesian graphical model based on TCGA data, and producing a posterior graph as updated and data-enhanced knowledge. In short, Zodiac realizes “Prior interaction map + TCGA data → Posterior interaction map.” Results: Zodiac provides molecular interactions for about 200 million pairs of genes. All the results are generated from a big-data analysis and organized into a comprehensive database allowing customized search. In addition, Zodiac provides data processing and analysis tools that allow users to customize the prior networks and update the genetic pathways of their interest. Zodiac is publicly available at www.compgenome.org/ZODIAC. Conclusions: Zodiac recapitulates and extends existing knowledge of molecular interactions in cancer. It can be used to explore novel gene-gene interactions, transcriptional regulation, and other types of molecular interplays in cancer.
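The "prior + data → posterior" idea can be shown for a single candidate edge with Bayes' rule. This is a one-edge illustration under simplifying assumptions, not Zodiac's Bayesian graphical model; the function name and all numbers are hypothetical.

```python
def posterior_edge_probability(prior, lik_edge, lik_no_edge):
    """Update the probability that an interaction exists, given observed data.

    prior:       prior probability of the edge (from existing knowledge).
    lik_edge:    likelihood of the data if the edge exists.
    lik_no_edge: likelihood of the data if the edge does not exist.
    """
    numerator = prior * lik_edge
    denominator = numerator + (1.0 - prior) * lik_no_edge
    if denominator == 0.0:
        raise ValueError("data have zero likelihood under both hypotheses")
    return numerator / denominator


# Hypothetical numbers: a neutral prior, and data four times likelier under "edge".
updated = posterior_edge_probability(prior=0.5, lik_edge=0.8, lik_no_edge=0.2)
# Data that do not discriminate between the hypotheses leave the prior unchanged.
unchanged = posterior_edge_probability(prior=0.1, lik_edge=0.5, lik_no_edge=0.5)
```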
Computational Reconstruction of NFκB Pathway Interaction Mechanisms during Prostate Cancer

PubMed Central

Börnigen, Daniela; Tyekucheva, Svitlana; Wang, Xiaodong; Rider, Jennifer R.; Lee, Gwo-Shu; Mucci, Lorelei A.; Sweeney, Christopher; Huttenhower, Curtis

2016-01-01

Molecular research in cancer is one of the largest areas of bioinformatic investigation, but it remains a challenge to understand biomolecular mechanisms in cancer-related pathways from high-throughput genomic data. This includes the Nuclear-factor-kappa-B (NFκB) pathway, which is central to the inflammatory response and cell proliferation in prostate cancer development and progression. Despite close scrutiny and a deep understanding of many of its members’ biomolecular activities, the current list of pathway members and a systems-level understanding of their interactions remains incomplete. Here, we provide the first steps toward computational reconstruction of interaction mechanisms of the NFκB pathway in prostate cancer. We identified novel roles for ATF3, CXCL2, DUSP5, JUNB, NEDD9, SELE, TRIB1, and ZFP36 in this pathway, in addition to new mechanistic interactions between these genes and 10 known NFκB pathway members. A newly predicted interaction between NEDD9 and ZFP36 in particular was validated by co-immunoprecipitation, as was NEDD9's potential biological role in prostate cancer cell growth regulation. We combined 651 gene expression datasets with 1.4M gene product interactions to predict the inclusion of 40 additional genes in the pathway. Molecular mechanisms of interaction among pathway members were inferred using recent advances in Bayesian data integration to simultaneously provide information specific to biological contexts and individual biomolecular activities, resulting in a total of 112 interactions in the fully reconstructed NFκB pathway: 13 (11%) previously known, 29 (26%) supported by existing literature, and 70 (63%) novel. This method is generalizable to other tissue types, cancers, and organisms, and this new information about the NFκB pathway will allow us to further understand prostate cancer and to develop more effective prevention and treatment strategies. PMID:27078000
Transcriptome Sequencing in a Tibetan Barley Landrace with High Resistance to Powdery Mildew

PubMed Central

Zeng, Xing-Quan; Luo, Xiao-Mei; Wang, Yu-Lin; Xu, Qi-Jun; Bai, Li-Jun; Yuan, Hong-Jun; Tashi, Nyima

2014-01-01

Hulless barley is an important cereal crop worldwide, especially in Tibet of China. However, this crop is usually susceptible to powdery mildew caused by Blumeria graminis f. sp. hordei. In this study, we aimed to understand the functions and pathways of genes involved in the disease resistance by transcriptome sequencing of a Tibetan barley landrace with high resistance to powdery mildew. A total of 831 significant differentially expressed genes were found in the infected seedlings, covering 19 functions. The terms “cell,” “cell part,” and “extracellular region” in the cellular component category, “binding” and “catalytic” in the molecular function category, and “metabolic process” and “cellular process” in the biological process category together indicated that these functions may be involved in the resistance to powdery mildew of the hulless barley. In addition, 330 KEGG pathways were found using BLASTx with an E-value cut-off of <10^-5. Among them, three pathways, namely, “photosynthesis,” “plant-pathogen interaction,” and “photosynthesis-antenna proteins” had significant matches in the database. Significant expressions of the three pathways were detected at 24 h, 48 h, and 96 h after infection, respectively. These results indicated a complex process of barley response to powdery mildew infection. PMID:25587568
Simulation of a Petri net-based model of the terpenoid biosynthesis pathway.

PubMed

Hawari, Aliah Hazmah; Mohamed-Hussein, Zeti-Azura

2010-02-09

The development and simulation of dynamic models of terpenoid biosynthesis has yielded a systems perspective that provides new insights into how the structure of this biochemical pathway affects compound synthesis. These insights may eventually help identify reactions that could be experimentally manipulated to amplify terpenoid production. In this study, a dynamic model of the terpenoid biosynthesis pathway was constructed based on the Hybrid Functional Petri Net (HFPN) technique. This technique is a fusion of three other extended Petri net techniques, namely Hybrid Petri Net (HPN), Dynamic Petri Net (HDN) and Functional Petri Net (FPN). The biological data needed to construct the terpenoid metabolic model were gathered from the literature and from biological databases. These data were used as building blocks to create an HFPNe model and to generate parameters that govern the global behaviour of the model. The dynamic model was simulated and validated against known experimental data obtained from extensive literature searches. The model successfully simulated metabolite concentration changes over time (pt) and the observations correlated with known data. Interactions between the intermediates that affect the production of terpenes could be observed through the introduction of inhibitors that established feedback loops within and crosstalk between the pathways. Although this metabolic model is only preliminary, it will provide a platform for analysing various high-throughput data, and it should lead to a more holistic understanding of terpenoid biosynthesis.
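The firing rule at the heart of any Petri net model can be sketched with a plain discrete net: a transition is enabled when every input place holds enough tokens, and firing moves tokens from inputs to outputs. This is far simpler than the hybrid functional Petri net used in the study (no continuous places, rates, or time), and the token counts are arbitrary. The example transition encodes the condensation of IPP and DMAPP into GPP with 1:1:1 stoichiometry.

```python
def enabled(marking, transition):
    """A transition is enabled if each input place has enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in transition["inputs"].items())


def fire(marking, transition):
    """Return the new marking after firing an enabled transition."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, n in transition["inputs"].items():
        new_marking[place] = new_marking.get(place, 0) - n  # consume tokens
    for place, n in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + n  # produce tokens
    return new_marking


# Toy marking with arbitrary token counts.
marking = {"IPP": 2, "DMAPP": 1, "GPP": 0}
gpp_synthesis = {"inputs": {"IPP": 1, "DMAPP": 1}, "outputs": {"GPP": 1}}
after_one_step = fire(marking, gpp_synthesis)
can_fire_again = enabled(after_one_step, gpp_synthesis)
```

After one firing DMAPP is exhausted, so the transition is no longer enabled. A limiting intermediate constraining downstream production is the kind of behaviour a dynamic pathway model is built to expose.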
MicroRNA expression, target genes, and signaling pathways in infants with a ventricular septal defect.

PubMed

Chai, Hui; Yan, Zhaoyuan; Huang, Ke; Jiang, Yuanqing; Zhang, Lin

2018-02-01

This study aimed to systematically investigate the relationship between miRNA expression and the occurrence of ventricular septal defect (VSD), and characterize the miRNA target genes and pathways that can lead to VSD. The miRNAs that were differentially expressed in blood samples from VSD and normal infants were screened and validated by implementing miRNA microarrays and qRT-PCR. The target genes regulated by differentially expressed miRNAs were predicted using three target gene databases. The functions and signaling pathways of the target genes were enriched using the GO database and KEGG database, respectively. The transcription and protein expression of specific target genes in critical pathways were compared in the VSD and normal control groups using qRT-PCR and western blotting, respectively. Compared with the normal control group, the VSD group had 22 differentially expressed miRNAs; 19 were downregulated and three were upregulated. The 10,677 predicted target genes participated in many biological functions related to cardiac development and morphogenesis. Four target genes (mGLUR, Gq, PLC, and PKC) were involved in the PKC pathway and four (ECM, FAK, PI3K, and PDK1) were involved in the PI3K-Akt pathway. The transcription and protein expression of these eight target genes were significantly upregulated in the VSD group. The 22 miRNAs that were dysregulated in the VSD group were mainly downregulated, which may result in the dysregulation of several key genes and biological functions related to cardiac development. These effects could also be exerted via the upregulation of eight specific target genes, the subsequent over-activation of the PKC and PI3K-Akt pathways, and the eventual abnormal cardiac development and VSD.
iRefWeb: interactive analysis of consolidated protein interaction data and their supporting evidence

PubMed Central

Turner, Brian; Razick, Sabry; Turinsky, Andrei L.; Vlasblom, James; Crowdy, Edgard K.; Cho, Emerson; Morrison, Kyle; Wodak, Shoshana J.

2010-01-01

We present iRefWeb, a web interface to protein interaction data consolidated from 10 public databases: BIND, BioGRID, CORUM, DIP, IntAct, HPRD, MINT, MPact, MPPI and OPHID. iRefWeb enables users to examine aggregated interactions for a protein of interest, and presents various statistical summaries of the data across databases, such as the number of organism-specific interactions, proteins and cited publications. Through links to source databases and supporting evidence, researchers may gauge the reliability of an interaction using simple criteria, such as the detection methods, the scale of the study (high- or low-throughput) or the number of cited publications. Furthermore, iRefWeb compares the information extracted from the same publication by different databases, and offers means to follow-up possible inconsistencies. We provide an overview of the consolidated protein–protein interaction landscape and show how it can be automatically cropped to aid the generation of meaningful organism-specific interactomes. iRefWeb can be accessed at: http://wodaklab.org/iRefWeb. Database URL: http://wodaklab.org/iRefWeb/ PMID:20940177
SynechoNET: integrated protein-protein interaction database of a model cyanobacterium Synechocystis sp. PCC 6803.

PubMed

Kim, Woo-Yeon; Kang, Sungsoo; Kim, Byoung-Chul; Oh, Jeehyun; Cho, Seongwoong; Bhak, Jong; Choi, Jong-Soon

2008-01-01

Cyanobacteria are model organisms for studying photosynthesis, carbon and nitrogen assimilation, evolution of plant plastids, and adaptability to environmental stresses. Despite many studies on cyanobacteria, there is no web-based database of their regulatory and signaling protein-protein interaction networks to date. We report a database and website SynechoNET that provides predicted protein-protein interactions. SynechoNET shows cyanobacterial domain-domain interactions as well as their protein-level interactions using the model cyanobacterium, Synechocystis sp. PCC 6803. It predicts the protein-protein interactions using public interaction databases that contain mutually complementary and redundant data. Furthermore, SynechoNET provides information on transmembrane topology, signal peptide, and domain structure in order to support the analysis of regulatory membrane proteins. Such biological information can be queried and visualized in user-friendly web interfaces that include the interactive network viewer and search pages by keyword and functional category. SynechoNET is an integrated protein-protein interaction database designed to analyze regulatory membrane proteins in cyanobacteria. It provides a platform for biologists to extend the genomic data of cyanobacteria by predicting interaction partners, membrane association, and membrane topology of Synechocystis proteins. SynechoNET is freely available at http://synechocystis.org/ or directly at http://bioportal.kobic.kr/SynechoNET/.
A novel approach to select differential pathways associated with hypertrophic cardiomyopathy based on gene co‑expression analysis.

PubMed

Chen, Xiao-Min; Feng, Ming-Jun; Shen, Cai-Jie; He, Bin; Du, Xian-Feng; Yu, Yi-Bo; Liu, Jing; Chu, Hui-Min

2017-07-01

The present study was designed to develop a novel method for identifying significant pathways associated with human hypertrophic cardiomyopathy (HCM), based on gene co‑expression analysis. The microarray dataset associated with HCM (E‑GEOD‑36961) was obtained from the European Molecular Biology Laboratory‑European Bioinformatics Institute database. Informative pathways were selected based on the Reactome pathway database and screening treatments. An empirical Bayes method was utilized to construct co‑expression networks for informative pathways, and a weight value was assigned to each pathway. Differential pathways were extracted based on weight threshold, which was calculated using a random model. In order to assess whether the co‑expression method was feasible, it was compared with traditional pathway enrichment analysis of differentially expressed genes, which were identified using the significance analysis of microarrays package. A total of 1,074 informative pathways were screened out for subsequent investigations and their weight values were also obtained. According to the threshold of weight value of 0.01057, 447 differential pathways, including folding of actin by chaperonin containing T‑complex protein 1 (CCT)/T‑complex protein 1 ring complex (TRiC), purine ribonucleoside monophosphate biosynthesis and ubiquinol biosynthesis, were obtained. Compared with traditional pathway enrichment analysis, the number of pathways obtained from the co‑expression approach was increased. The results of the present study demonstrated that this method may be useful to predict marker pathways for HCM. The pathways of folding of actin by CCT/TRiC and purine ribonucleoside monophosphate biosynthesis may provide evidence of the underlying molecular mechanisms of HCM, and offer novel therapeutic directions for HCM.
Pathway-based discovery of genetic interactions in breast cancer

PubMed Central

Xu, Zack Z.; Boone, Charles; Lange, Carol A.

2017-01-01

Breast cancer is the second largest cause of cancer death among U.S. women and the leading cause of cancer death among women worldwide. Genome-wide association studies (GWAS) have identified several genetic variants associated with susceptibility to breast cancer, but these still explain less than half of the estimated genetic contribution to the disease. Combinations of variants (i.e. genetic interactions) may play an important role in breast cancer susceptibility. However, due to a lack of statistical power, the current tests for genetic interactions from GWAS data mainly leverage prior knowledge to focus on small sets of genes or SNPs that are known to have an association with breast cancer. Thus, many genetic interactions, particularly among novel variants, remain understudied. Reverse-genetic interaction screens in model organisms have shown that genetic interactions frequently cluster into highly structured motifs, where members of the same pathway share similar patterns of genetic interactions. Based on this key observation, we recently developed a method called BridGE to search for such structured motifs in genetic networks derived from GWAS studies and identify pathway-level genetic interactions in human populations. We applied BridGE to six independent breast cancer cohorts and identified significant pathway-level interactions in five cohorts. Joint analysis across all five cohorts revealed a high confidence consensus set of genetic interactions with support in multiple cohorts. The discovered interactions implicated the glutathione conjugation, vitamin D receptor, purine metabolism, mitotic prometaphase, and steroid hormone biosynthesis pathways as major modifiers of breast cancer risk. 
Notably, while many of the pathways identified by BridGE show clear relevance to breast cancer, variants in these pathways had not been previously discovered by traditional single variant association tests, or single pathway enrichment analysis that does not consider SNP-SNP interactions. PMID:28957314
Detection of characteristic sub pathway network for angiogenesis based on the comprehensive pathway network.

PubMed

Huang, Yezhou; Li, Shao

2010-01-18

Pathways in biological systems often cooperate with each other to function. Changes in interactions among pathways are tightly associated with alterations in the properties and functions of the cell and hence alterations in the phenotype. The pathway interactions, and especially their changes over time corresponding to a specific phenotype, are therefore critical to understanding cell functions and phenotypic plasticity. With prior-defined pathways and incorporated protein-protein interaction (PPI) data, we counted PPIs between corresponding gene sets of each pair of distinct pathways to construct a comprehensive pathway network. Then we proposed a novel concept, characteristic sub pathway network (CSPN), to realize the phenotype-specific pathway interactions. By adding gene expression data regarding a given phenotype, angiogenesis, active PPIs corresponding to stimulation of interleukin-1 (IL-1) and tumor necrosis factor alpha (TNF-alpha) on human umbilical vein endothelial cells (HUVECs) respectively were derived. Two kinds of CSPN, namely the static and the dynamic CSPN, were detected by counting active PPIs. A comprehensive pathway network containing 37 signalling pathways as nodes and 263 pathway interactions was obtained. Two phenotype-specific CSPNs for angiogenesis, corresponding to stimulation of IL-1 and TNF-alpha on HUVEC respectively, were addressed. From phenotype-specific CSPNs, a static CSPN involving interactions among B cell receptor, T cell receptor, Toll-like receptor, MAPK, VEGF, and ErbB signalling pathways, and a dynamic CSPN involving interactions among TGF-beta, Wnt, p53 signalling pathways and cell cycle pathway, were detected for angiogenesis on HUVEC after stimulation of IL-1 and TNF-alpha respectively. We inferred that, in certain cases, the static CSPN maintains related basic functions of the cells, whereas the dynamic CSPN manifests the cells' plastic responses to stimulus and therefore reflects the cells' phenotypic plasticity. 
The comprehensive pathway network helps us realize the cooperative behaviours among pathways. Moreover, two kinds of potential CSPNs found in this work, the static CSPN and the dynamic CSPN, are helpful to deeply understand the specific function of HUVEC and its phenotypic plasticity in regard to angiogenesis.
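The construction described in this abstract (counting PPIs between the gene sets of each pair of distinct pathways, then restricting the count to "active" PPIs) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the toy gene-to-pathway assignments, and the rule that a PPI is active only when both partners are in an `active_genes` set are all assumptions made for the example.

```python
from itertools import combinations

def pathway_network(pathways, ppis, active_genes=None):
    """Count PPIs that link the gene sets of each pair of distinct pathways.

    pathways: dict mapping pathway name -> set of gene symbols
    ppis: iterable of (gene_u, gene_v) interaction pairs
    active_genes: optional set; if given, a PPI is counted only when both
        partners are in it (an assumed definition of an "active" PPI)
    """
    edges = {}
    for (name_a, genes_a), (name_b, genes_b) in combinations(sorted(pathways.items()), 2):
        count = 0
        for u, v in ppis:
            if active_genes is not None and not (u in active_genes and v in active_genes):
                continue
            # The PPI bridges the two pathways if one partner lies in each gene set
            if (u in genes_a and v in genes_b) or (u in genes_b and v in genes_a):
                count += 1
        if count:
            edges[(name_a, name_b)] = count
    return edges

# Toy data (hypothetical, for illustration only)
pathways = {
    "MAPK": {"MAPK1", "RAF1", "HRAS"},
    "VEGF": {"VEGFA", "KDR"},
    "Wnt": {"CTNNB1", "GSK3B"},
}
ppis = [
    ("RAF1", "KDR"),
    ("MAPK1", "VEGFA"),
    ("GSK3B", "CTNNB1"),   # within Wnt, so it links no pathway pair
    ("HRAS", "RAF1"),      # within MAPK
    ("KDR", "CTNNB1"),
]

comprehensive = pathway_network(pathways, ppis)
active_only = pathway_network(pathways, ppis, active_genes={"RAF1", "KDR", "CTNNB1"})
print(comprehensive)
print(active_only)
```

Comparing `comprehensive` with `active_only` across conditions or time points is one way to see which pathway links persist and which change; how the abstract's static and dynamic CSPNs are actually separated is not specified there.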
Detection of characteristic sub pathway network for angiogenesis based on the comprehensive pathway network

PubMed Central

2010-01-01

Background Pathways in biological systems often cooperate with each other to function. Changes in interactions among pathways are tightly associated with alterations in the properties and functions of the cell and hence alterations in the phenotype. The pathway interactions, and especially their changes over time corresponding to a specific phenotype, are therefore critical to understanding cell functions and phenotypic plasticity. Methods With prior-defined pathways and incorporated protein-protein interaction (PPI) data, we counted PPIs between corresponding gene sets of each pair of distinct pathways to construct a comprehensive pathway network. Then we proposed a novel concept, characteristic sub pathway network (CSPN), to realize the phenotype-specific pathway interactions. By adding gene expression data regarding a given phenotype, angiogenesis, active PPIs corresponding to stimulation of interleukin-1 (IL-1) and tumor necrosis factor α (TNF-α) on human umbilical vein endothelial cells (HUVECs) respectively were derived. Two kinds of CSPN, namely the static and the dynamic CSPN, were detected by counting active PPIs. Results A comprehensive pathway network containing 37 signalling pathways as nodes and 263 pathway interactions was obtained. Two phenotype-specific CSPNs for angiogenesis, corresponding to stimulation of IL-1 and TNF-α on HUVEC respectively, were addressed. From phenotype-specific CSPNs, a static CSPN involving interactions among B cell receptor, T cell receptor, Toll-like receptor, MAPK, VEGF, and ErbB signalling pathways, and a dynamic CSPN involving interactions among TGF-β, Wnt, p53 signalling pathways and cell cycle pathway, were detected for angiogenesis on HUVEC after stimulation of IL-1 and TNF-α respectively. We inferred that, in certain cases, the static CSPN maintains related basic functions of the cells, whereas the dynamic CSPN manifests the cells' plastic responses to stimulus and therefore reflects the cells' phenotypic plasticity. 
Conclusion The comprehensive pathway network helps us realize the cooperative behaviours among pathways. Moreover, two kinds of potential CSPNs found in this work, the static CSPN and the dynamic CSPN, are helpful to deeply understand the specific function of HUVEC and its phenotypic plasticity in regard to angiogenesis. PMID:20122205
Global map of physical interactions among differentially expressed genes in multiple sclerosis relapses and remissions.

PubMed

Tuller, Tamir; Atar, Shimshi; Ruppin, Eytan; Gurevich, Michael; Achiron, Anat

2011-09-15

Multiple sclerosis (MS) is a central nervous system autoimmune inflammatory T-cell-mediated disease with a relapsing-remitting course in the majority of patients. In this study, we performed a high-resolution systems biology analysis of gene expression and physical interactions in MS relapse and remission. To this end, we integrated 164 large-scale measurements of gene expression in peripheral blood mononuclear cells of MS patients in relapse or remission and healthy subjects, with large-scale information about the physical interactions between these genes obtained from public databases. These data were analyzed with a variety of computational methods. We find that there is a clear and significant global network-level signal that is related to the changes in gene expression of MS patients in comparison to healthy subjects. However, despite the clear differences in the clinical symptoms of MS patients in relapse versus remission, the network-level signal is weaker when comparing patients in these two stages of the disease. This result suggests that most of the genes have relatively similar expression levels in the two stages of the disease. In accordance with previous studies, we found that the pathways related to regulation of cell death, chemotaxis and inflammatory response are differentially expressed in the disease in comparison to healthy subjects, while pathways related to cell adhesion, cell migration and cell-cell signaling are activated in relapse in comparison to remission. Beyond these findings, the current study includes a detailed report of the exact set of genes involved in these pathways and the interactions between them. For example, we found that the genes TP53 and IL1 are 'network hubs' that interact with many of the differentially expressed genes in MS patients versus healthy subjects, and the epidermal growth factor receptor is a 'network hub' in the case of MS patients with relapse versus remission. 
The statistical approaches employed in this study enabled us to report new sets of genes that according to their gene expression and physical interactions are predicted to be differentially expressed in MS versus healthy subjects, and in MS patients in relapse versus remission. Some of these genes may be useful biomarkers for diagnosing MS and predicting relapses in MS patients.
Bioinformatics approach reveals systematic mechanism underlying lung adenocarcinoma.

PubMed

Wu, Xiya; Zhang, Wei; Hu, Yunhua; Yi, Xianghua

2015-01-01

The purpose of this work was to explore the systematic molecular mechanism of lung adenocarcinoma and gain a deeper insight into it. Comprehensive bioinformatics methods were applied. Initially, significant differentially expressed genes (DEGs) were analyzed from the Affymetrix microarray data (GSE27262) deposited in the Gene Expression Omnibus (GEO). Subsequently, gene ontology (GO) analysis was performed using the online Database for Annotation, Visualization and Integrated Discovery (DAVID) software. Finally, significant pathway crosstalk was investigated based on the information derived from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. According to our results, the N-terminal globular domain of the type X collagen (COL10A1) gene and transmembrane protein 100 (TMEM100) gene were identified to be the most significant DEGs in tumor tissue compared with the adjacent normal tissues. The main GO categories were biological process, cellular component and molecular function. In addition, the crosstalk was significantly different between non-small cell lung cancer pathways and inositol phosphate metabolism pathway, focal adhesion signal pathway, vascular smooth muscle contraction signal pathway, peroxisome proliferator-activated receptor (PPAR) signaling pathway and calcium signaling pathway in tumor. Dysfunctional genes and pathways may play key roles in the progression and development of lung adenocarcinoma. Our data provide a systematic perspective for understanding this mechanism and may be helpful in discovering an effective treatment for lung adenocarcinoma.
Contextualization of drug-mediator relations using evidence networks.

PubMed

Tran, Hai Joey; Speyer, Gil; Kiefer, Jeff; Kim, Seungchan

2017-05-31

Genomic analysis of drug response can provide unique insights into therapies that can be used to match the "right drug to the right patient." However, the process of discovering such therapeutic insights using genomic data is not straightforward and represents an area of active investigation. EDDY (Evaluation of Differential DependencY), a statistical test to detect differential statistical dependencies, is one method that leverages genomic data to identify differential genetic dependencies. EDDY has been used in conjunction with the Cancer Therapeutics Response Portal (CTRP), a dataset with drug-response measurements for more than 400 small molecules, and RNAseq data of cell lines in the Cancer Cell Line Encyclopedia (CCLE) to find potential drug-mediator pairs. Mediators were identified as genes that showed significant change in genetic statistical dependencies within annotated pathways between drug-sensitive and drug-non-sensitive cell lines, and the results are presented as a public web portal (EDDY-CTRP). However, the limited interpretability of drug-mediator pairs currently hinders further exploration of these potentially valuable results. In this study, we address this challenge by constructing evidence networks built with protein and drug interactions from the STITCH and STRING interaction databases. STITCH and STRING are sister databases that catalog known and predicted drug-protein interactions and protein-protein interactions, respectively. Using these two databases, we have developed a method to construct evidence networks to "explain" the relation between a drug and a mediator. We applied this approach to drug-mediator relations discovered in EDDY-CTRP analysis and identified evidence networks for ~70% of drug-mediator pairs where most mediators were not known direct targets for the drug. Constructed evidence networks enable researchers to contextualize the drug-mediator pair with current research and knowledge. 
Using evidence networks, we were able to improve the interpretability of the EDDY-CTRP results by linking the drugs and mediators with genes associated with both the drug and the mediator. We anticipate that these evidence networks will help inform EDDY-CTRP results and enhance the generation of important insights to drug sensitivity that will lead to improved precision medicine applications.
The BioGRID interaction database: 2017 update

PubMed Central

Chatr-aryamontri, Andrew; Oughtred, Rose; Boucher, Lorrie; Rust, Jennifer; Chang, Christie; Kolas, Nadine K.; O'Donnell, Lara; Oster, Sara; Theesfeld, Chandra; Sellam, Adnane; Stark, Chris; Breitkreutz, Bobby-Joe; Dolinski, Kara; Tyers, Mike

2017-01-01

The Biological General Repository for Interaction Datasets (BioGRID: https://thebiogrid.org) is an open access database dedicated to the annotation and archival of protein, genetic and chemical interactions for all major model organism species and humans. As of September 2016 (build 3.4.140), the BioGRID contains 1 072 173 genetic and protein interactions, and 38 559 post-translational modifications, as manually annotated from 48 114 publications. This dataset represents interaction records for 66 model organisms and represents a 30% increase compared to the previous 2015 BioGRID update. BioGRID curates the biomedical literature for major model organism species, including humans, with a recent emphasis on central biological processes and specific human diseases. To facilitate network-based approaches to drug discovery, BioGRID now incorporates 27 501 chemical–protein interactions for human drug targets, as drawn from the DrugBank database. A new dynamic interaction network viewer allows the easy navigation and filtering of all genetic and protein interaction data, as well as for bioactive compounds and their established targets. BioGRID data are directly downloadable without restriction in a variety of standardized formats and are freely distributed through partner model organism databases and meta-databases. PMID:27980099
Predicting the points of interaction of small molecules in the NF-κB pathway

PubMed Central

2011-01-01

Background The similarity property principle has been used extensively in drug discovery to identify small compounds that interact with specific drug targets. Here we show it can be applied to identify the interactions of small molecules within the NF-κB signalling pathway. Results Clusters that contain compounds with a predominant interaction within the pathway were created, which were then used to predict the interaction of compounds not included in the clustering analysis. Conclusions The technique successfully predicted the points of interactions of compounds that are known to interact with the NF-κB pathway. The method was also shown to be successful when compounds for which the interaction points were unknown were included in the clustering analysis. PMID:21342508
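The idea in this abstract (assigning a compound the interaction point that predominates in the cluster of compounds it most resembles) can be illustrated with a small sketch. The abstract does not state the descriptors, similarity measure, or clustering method used, so everything here is an assumption for illustration: compounds are represented as sets of fingerprint bit indices, similarity is the Tanimoto coefficient, clusters are given in advance, and the interaction-point labels ("IKK", "proteasome") are placeholders.

```python
from collections import Counter

def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def predict_interaction_point(query, clusters):
    """Return the predominant label of the cluster most similar to the query.

    clusters: list of clusters; each cluster is a non-empty list of
        (fingerprint_bits, interaction_label) pairs.
    Cluster similarity is taken as the mean Tanimoto coefficient between the
    query and the cluster members (an assumed choice).
    """
    best_label, best_score = None, -1.0
    for members in clusters:
        mean_sim = sum(tanimoto(query, fp) for fp, _ in members) / len(members)
        if mean_sim > best_score:
            # Predominant interaction point among the cluster's members
            label = Counter(lbl for _, lbl in members).most_common(1)[0][0]
            best_label, best_score = label, mean_sim
    return best_label

# Toy clusters with hypothetical fingerprints and labels
clusters = [
    [({1, 2, 3}, "IKK"), ({1, 2, 4}, "IKK"), ({2, 3, 4}, "proteasome")],
    [({7, 8, 9}, "proteasome"), ({7, 8, 10}, "proteasome")],
]

first = predict_interaction_point({1, 2, 5}, clusters)
second = predict_interaction_point({7, 8, 11}, clusters)
print(first, second)
```

In practice the fingerprints would come from a cheminformatics toolkit and the clusters from a clustering run over known pathway-active compounds; those steps are outside this sketch.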
Computational prediction of secretion systems and secretomes of Brucella: identification of novel type IV effectors and their interaction with the host.

PubMed

Sankarasubramanian, Jagadesan; Vishnu, Udayakumar S; Dinakaran, Vasudevan; Sridhar, Jayavel; Gunasekaran, Paramasamy; Rajendhran, Jeyaprakash

2016-01-01

Brucella spp. are facultative intracellular pathogens that cause brucellosis in various mammals including humans. Brucella survive inside the host cells by forming vacuoles and subverting host defence systems. This study aimed to predict the secretion systems and the secretomes of Brucella spp. from 39 complete genome sequences available in the databases. Furthermore, an attempt was made to identify the type IV secretion effectors and their interactions with host proteins. We predicted the secretion systems of Brucella using the KEGG pathway database and SecReT4. Brucella secretomes and type IV effectors (T4SEs) were predicted through genome-wide screening using JVirGel and S4TE, respectively. Protein-protein interactions of Brucella T4SEs with their hosts were analyzed by HPIDB 2.0. Genes coding for Sec and Tat pathways of secretion and type I (T1SS), type IV (T4SS) and type V (T5SS) secretion systems were identified and they are conserved in all the species of Brucella. In addition to the well-known VirB operon coding for the type IV secretion system (T4SS), we have identified the presence of additional genes showing homology with T4SS of other organisms. On the whole, 10.26 to 14.94% of total proteomes were found to be either secreted (secretome) or membrane associated (membrane proteome). Approximately 1.7 to 3.0% of total proteomes were identified as type IV secretion effectors (T4SEs). Prediction of protein-protein interactions showed 29 and 36 host-pathogen specific interactions between Bos taurus (cattle)-B. abortus and Ovis aries (sheep)-B. melitensis, respectively. Functional characterization of the predicted T4SEs and their interactions with their respective hosts may reveal the secrets of host specificity of Brucella.

Microarray‑based screening of differentially expressed genes in glucocorticoid‑induced avascular necrosis.

PubMed

Huang, Gangyong; Wei, Yibing; Zhao, Guanglei; Xia, Jun; Wang, Siqun; Wu, Jianguo; Chen, Feiyan; Chen, Jie; Shi, Jingshen

2017-06-01

The underlying mechanisms of glucocorticoid (GC)‑induced avascular necrosis of the femoral head (ANFH) have yet to be fully understood, in particular the mechanisms associated with the change of gene expression pattern. The present study aimed to identify key genes with a differential expression pattern in GC‑induced ANFH. E‑MEXP‑2751 microarray data were downloaded from the ArrayExpress database. Differentially expressed genes (DEGs) were identified in 5 femoral head samples of steroid‑induced ANFH rats compared with 5 placebo‑treated rat samples. Gene Ontology (GO) and pathway enrichment analyses were performed on these DEGs. A total of 93 DEGs (46 upregulated and 47 downregulated genes) were identified in GC‑induced ANFH samples. These DEGs were enriched in different GO terms and pathways, including chondrocyte differentiation and detection of chemical stimuli. The enrichment map revealed that skeletal system development was interconnected with several other GO terms by gene overlap. The literature‑mined network analysis revealed that 5 upregulated genes were associated with femoral necrosis, including parathyroid hormone receptor 1 (PTHR1), vitamin D (1,25‑Dihydroxyvitamin D3) receptor (VDR), collagen, type II, α1, proprotein convertase subtilisin/kexin type 6 and zinc finger protein 354C (ZFP354C). In addition, ZFP354C and VDR were identified as transcription factors. Furthermore, PTHR1 was revealed to interact with VDR, and α‑2‑macroglobulin (A2M) interacted with fibronectin 1 (FN1) in the PPI network. PTHR1 may be involved in GC‑induced ANFH via interacting with VDR. A2M may also be involved in the development of GC‑induced ANFH through interacting with FN1. An improved understanding of the molecular mechanisms underlying GC‑induced ANFH may provide novel targets for diagnostics and therapeutic treatment.
Microarray-based screening of differentially expressed genes in glucocorticoid-induced avascular necrosis

PubMed Central

Huang, Gangyong; Wei, Yibing; Zhao, Guanglei; Xia, Jun; Wang, Siqun; Wu, Jianguo; Chen, Feiyan; Chen, Jie; Shi, Jingshen

2017-01-01

The underlying mechanisms of glucocorticoid (GC)-induced avascular necrosis of the femoral head (ANFH) have yet to be fully understood, in particular the mechanisms associated with the change of gene expression pattern. The present study aimed to identify key genes with a differential expression pattern in GC-induced ANFH. E-MEXP-2751 microarray data were downloaded from the ArrayExpress database. Differentially expressed genes (DEGs) were identified in 5 femoral head samples of steroid-induced ANFH rats compared with 5 placebo-treated rat samples. Gene Ontology (GO) and pathway enrichment analyses were performed on these DEGs. A total of 93 DEGs (46 upregulated and 47 downregulated genes) were identified in GC-induced ANFH samples. These DEGs were enriched in different GO terms and pathways, including chondrocyte differentiation and detection of chemical stimuli. The enrichment map revealed that skeletal system development was interconnected with several other GO terms by gene overlap. The literature-mined network analysis revealed that 5 upregulated genes were associated with femoral necrosis, including parathyroid hormone receptor 1 (PTHR1), vitamin D (1,25-Dihydroxyvitamin D3) receptor (VDR), collagen, type II, α1, proprotein convertase subtilisin/kexin type 6 and zinc finger protein 354C (ZFP354C). In addition, ZFP354C and VDR were identified as transcription factors. Furthermore, PTHR1 was revealed to interact with VDR, and α-2-macroglobulin (A2M) interacted with fibronectin 1 (FN1) in the PPI network. PTHR1 may be involved in GC-induced ANFH via interacting with VDR. A2M may also be involved in the development of GC-induced ANFH through interacting with FN1. An improved understanding of the molecular mechanisms underlying GC-induced ANFH may provide novel targets for diagnostics and therapeutic treatment. PMID:28393228
CellNetVis: a web tool for visualization of biological networks using force-directed layout constrained by cellular components.

PubMed

Heberle, Henry; Carazzolle, Marcelo Falsarella; Telles, Guilherme P; Meirelles, Gabriela Vaz; Minghim, Rosane

2017-09-13

The advent of "omics" science has brought new perspectives in contemporary biology through the high-throughput analyses of molecular interactions, providing new clues in protein/gene function and in the organization of biological pathways. Biomolecular interaction networks, or graphs, are simple abstract representations where the components of a cell (e.g. proteins, metabolites etc.) are represented by nodes and their interactions are represented by edges. An appropriate visualization of data is crucial for understanding such networks, since pathways are related to functions that occur in specific regions of the cell. The force-directed layout is an important and widely used technique to draw networks according to their topologies. Placing the networks into cellular compartments helps to quickly identify where network elements are located and, more specifically, concentrated. Currently, only a few tools provide the capability of visually organizing networks by cellular compartments. Most of them cannot handle large and dense networks. Even for small networks with hundreds of nodes, the available tools are not able to reposition the network while the user is interacting, limiting the visual exploration capability. Here we propose CellNetVis, a web tool to easily display biological networks in a cell diagram employing a constrained force-directed layout algorithm. The tool is freely available and open-source. It was originally designed for networks generated by the Integrated Interactome System and can be used with networks from other databases, like InnateDB. CellNetVis has been shown to be applicable for dynamic investigation of complex networks over a consistent representation of a cell on the Web, with capabilities not matched elsewhere.
ORENZA: a web resource for studying ORphan ENZyme activities

PubMed Central

Lespinet, Olivier; Labedan, Bernard

2006-01-01

Background Despite the current availability of several hundreds of thousands of amino acid sequences, more than 36% of the enzyme activities (EC numbers) defined by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) are not associated with any amino acid sequence in major public databases. This wide gap separating knowledge of biochemical function and sequence information is found for nearly all classes of enzymes. Thus, there is an urgent need to explore these sequence-less EC numbers, in order to progressively close this gap. Description We designed ORENZA, a PostgreSQL database of ORphan ENZyme Activities, to collate information about the EC numbers defined by the NC-IUBMB with specific emphasis on orphan enzyme activities. Complete lists of all EC numbers and of orphan EC numbers are available and will be periodically updated. ORENZA allows one to browse the complete list of EC numbers or the subset associated with orphan enzymes or to query a specific EC number, an enzyme name or a species name for those interested in particular organisms. It is possible to search ORENZA for the different biochemical properties of the defined enzymes, the metabolic pathways in which they participate, the taxonomic data of the organisms whose genomes encode them, and many other features. The association of an enzyme activity with an amino acid sequence is clearly underlined, making it easy to identify at once the orphan enzyme activities. Interactive publishing of suggestions by the community would provide expert evidence for re-annotation of orphan EC numbers in public databases. Conclusion ORENZA is a Web resource designed to progressively bridge the unwanted gap between function (enzyme activities) and sequence (dataset present in public databases). ORENZA should increase interactions between communities of biochemists and of genomicists. 
This is expected to reduce the number of orphan enzyme activities by allocating gene sequences to the relevant enzymes. PMID:17026747
Protein-protein interaction network of gene expression in the hydrocortisone-treated keloid.

PubMed

Chen, Rui; Zhang, Zhiliang; Xue, Zhujia; Wang, Lin; Fu, Mingang; Lu, Yi; Bai, Ling; Zhang, Ping; Fan, Zhihong

2015-01-01

In order to explore the molecular mechanism of hydrocortisone in keloid tissue, the gene expression profiles of keloid samples treated with hydrocortisone were subjected to bioinformatics analysis. Firstly, the gene expression profiles (GSE7890) of five samples of keloid treated with hydrocortisone and five untreated keloid samples were downloaded from the Gene Expression Omnibus (GEO) database. Secondly, data were preprocessed using packages in R language and differentially expressed genes (DEGs) were screened using a significance analysis of microarrays (SAM) protocol. Thirdly, the DEGs were subjected to gene ontology (GO) function and KEGG pathway enrichment analysis. Finally, the interactions of DEGs in samples of keloid treated with hydrocortisone were explored in a human protein-protein interaction (PPI) network, and sub-modules of the DEGs interaction network were analyzed using Cytoscape software. Based on the analysis, 572 DEGs in the hydrocortisone-treated samples were screened; most of these were involved in the signal transduction and cell cycle. Furthermore, three critical genes in the module, including COL1A1, NID1, and PRELP, were screened in the PPI network analysis. These findings enhance understanding of the pathogenesis of the keloid and provide references for keloid therapy. © 2015 The International Society of Dermatology.
The online Tabloid Proteome: an annotated database of protein associations

PubMed Central

Turan, Demet; Tavernier, Jan

2018-01-01

A complete knowledge of the proteome can only be attained by determining the associations between proteins, along with the nature of these associations (e.g. physical contact in protein–protein interactions, participation in complex formation or different roles in the same pathway). Despite extensive efforts in elucidating direct protein interactions, our knowledge of the complete spectrum of protein associations remains limited. We therefore developed a new approach that detects protein associations from identifications obtained after re-processing of large-scale, public mass spectrometry-based proteomics data. Our approach infers protein association based on the co-occurrence of proteins across many different proteomics experiments, and provides information that is almost completely complementary to traditional direct protein interaction studies. We here present a web interface to query and explore the associations derived from this method, called the online Tabloid Proteome. The online Tabloid Proteome also integrates biological knowledge from several existing resources to annotate our derived protein associations. The online Tabloid Proteome is freely available through a user-friendly web interface, which provides intuitive navigation and data exploration options for the user at http://iomics.ugent.be/tabloidproteome. PMID:29040688
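Inferring associations from the co-occurrence of proteins across experiments can be sketched as below. The abstract does not state which co-occurrence statistic the resource uses; this example assumes the Jaccard index over the sets of experiments in which each protein was identified, with a hypothetical `min_shared` cutoff to suppress pairs seen together only rarely. Protein identifiers are placeholders.

```python
from itertools import combinations

def cooccurrence_scores(experiments, min_shared=2):
    """Score protein pairs by the Jaccard index of the experiments they appear in.

    experiments: list of sets, each holding the proteins identified in one experiment
    min_shared: minimum number of experiments a pair must share to be scored
        (an assumed filter, not taken from the abstract)
    """
    # Map each protein to the indices of the experiments in which it was identified
    seen_in = {}
    for idx, proteins in enumerate(experiments):
        for protein in proteins:
            seen_in.setdefault(protein, set()).add(idx)

    scores = {}
    for a, b in combinations(sorted(seen_in), 2):
        shared = seen_in[a] & seen_in[b]
        if len(shared) >= min_shared:
            scores[(a, b)] = len(shared) / len(seen_in[a] | seen_in[b])
    return scores

# Toy identification lists (hypothetical)
experiments = [
    {"P1", "P2", "P3"},
    {"P1", "P2"},
    {"P1", "P2", "P4"},
    {"P3", "P4"},
]

strict = cooccurrence_scores(experiments)
loose = cooccurrence_scores(experiments, min_shared=1)
print(strict)
print(loose[("P1", "P3")])
```

A score near 1 means two proteins are almost always identified together. Because it needs no evidence of physical contact, a measure of this kind can surface associations that direct interaction assays do not capture.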
Molecular modeling of the AhR structure and interactions can shed light on ligand-dependent activation and transformation mechanisms.

PubMed

Bonati, Laura; Corrada, Dario; Tagliabue, Sara Giani; Motta, Stefano

2017-02-01

Molecular modeling has made important contributions to the elucidation of the main stages in the AhR signal transduction pathway. Despite the lack of experimentally determined structures of the AhR functional domains, information derived from homologous systems has been exploited for modeling their structure and interactions. Homology models of the AhR PASB domain have provided information on the binding cavity and contributed to elucidating species-specific differences in ligand binding. Molecular docking simulations of the ligand binding process have given insights into differences in binding of diverse agonists, antagonists, and selective AhR modulators, and their application to virtual screening of large databases of compounds has allowed identification of novel AhR ligands. Recently available structural information on protein-protein and protein-DNA complexes of other bHLH-PAS systems has opened the way for modeling the AhR:ARNT dimer structure and investigating the mechanisms of AhR transformation and DNA binding. Future research directions should include simulation of the protein dynamics to obtain a more reliable description of intermolecular interactions involved in signal transmission.
Contact Between Police and People With Mental Disorders: A Review of Rates.

PubMed

Livingston, James D

2016-08-01

There is widespread belief that people with mental disorders are overrepresented in police encounters. The prevalence of such interactions is used as evidence of extensive problems in our health care and social support systems. The goal of this study was to estimate the rates of police arrests among people with mental disorders, police involvement in pathways to mental health care, and police calls for service involving persons with mental disorders. A systematic review was performed with seven multidisciplinary databases. Additional studies were identified by reviewing the reference lists of all included records and by using the "related articles" and "cited articles" tools in the Web of Science database. Studies were included if they were published in peer-reviewed journals, reported primary research findings, and were written in English. Eighty-five unique studies covering 329,461 cases met inclusion criteria. Data reported in 21 studies indicated that one in four people with mental disorders have histories of police arrest. Data from 48 studies indicated that about one in ten individuals have police involved in their pathway to mental health care. Data reported in 13 studies indicated that one in 100 police dispatches and encounters involve people with mental disorders. These estimates illuminate the magnitude of the issue and supply an empirically based reference point to scholars and practitioners in this area. The findings are useful for understanding how local trends regarding police involvement in the lives of people with mental disorders compare with rates in the broader research literature.
Genome-wide identification, classification, and functional analysis of the basic helix-loop-helix transcription factors in the cattle, Bos taurus.

PubMed

Li, Fengmei; Liu, Wuyi

2017-06-01

The basic helix-loop-helix (bHLH) transcription factors (TFs) form a huge superfamily and play crucial roles in many essential developmental, genetic, and physiological-biochemical processes of eukaryotes. In total, 109 putative bHLH TFs were identified and categorized in the genomic databases of cattle, Bos taurus, after removing redundant sequences and merging genetic isoforms. Through phylogenetic analyses, 105 of these bHLH TFs were classified into 44 families with 46, 25, 14, 3, 13, and 4 members in the high-order groups A, B, C, D, E, and F, respectively. The remaining 4 bHLH proteins were classed as 'orphans.' Next, the 109 putative bHLH proteins were found to be significantly enriched in 524 Gene Ontology (GO) annotations (corrected P value ≤ 0.05) and 21 pathways (corrected P value ≤ 0.05), as mapped by the web server KOBAS 2.0. Furthermore, 95 bHLH proteins were screened and analyzed together with two uncharacterized proteins in the STRING online database to reconstruct the protein-protein interaction network of cattle bHLH TFs. Ultimately, 89 bHLH proteins were fully mapped in a network with 67 biological processes, 13 molecular functions, 5 KEGG pathways, 12 PFAM protein domains, and 25 INTERPRO classified protein domains and features. These results provide useful information and a good reference for further functional investigations and updated research on cattle bHLH TFs.
PhosphoregDB: The tissue and sub-cellular distribution of mammalian protein kinases and phosphatases

PubMed Central

Forrest, Alistair RR; Taylor, Darrin F; Fink, J Lynn; Gongora, M Milena; Flegg, Cameron; Teasdale, Rohan D; Suzuki, Harukazu; Kanamori, Mutsumi; Kai, Chikatoshi; Hayashizaki, Yoshihide; Grimmond, Sean M

2006-01-01

Background: Protein kinases and protein phosphatases are the fundamental components of phosphorylation dependent protein regulatory systems. We have created a database for the protein kinase-like and phosphatase-like loci of mouse that integrates protein sequence, interaction, classification and pathway information with the results of a systematic screen of their sub-cellular localization and tissue specific expression data mined from the GNF tissue atlas of mouse. Results: The database lets users query where a specific kinase or phosphatase is expressed at both the tissue and sub-cellular levels. Similarly the interface allows the user to query by tissue, pathway or sub-cellular localization, to reveal which components are co-expressed or co-localized. A review of their expression reveals 30% of these components are detected in all tissues tested while 70% show some level of tissue restriction. Hierarchical clustering of the expression data reveals that expression of these genes can be used to separate the samples into tissues of related lineage, including 3 larger clusters of nervous tissue, developing embryo and cells of the immune system. By overlaying the expression, sub-cellular localization and classification data we examine correlations between class, specificity and tissue restriction and show that tyrosine kinases are generally expressed in fewer tissues than serine/threonine kinases. Conclusion: Together these data demonstrate that cell type specific systems exist to regulate protein phosphorylation and that for accurate modelling and for determination of enzyme substrate relationships the co-location of components needs to be considered. PMID:16504016
The functional cancer map: a systems-level synopsis of genetic deregulation in cancer.

PubMed

Krupp, Markus; Maass, Thorsten; Marquardt, Jens U; Staib, Frank; Bauer, Tobias; König, Rainer; Biesterfeld, Stefan; Galle, Peter R; Tresch, Achim; Teufel, Andreas

2011-06-30

Cancer cells are characterized by massive dysregulation of physiological cell functions with considerable disruption of transcriptional regulation. Genome-wide transcriptome profiling can be utilized for early detection and molecular classification of cancers. Accurate discrimination of functionally different tumor types may help to guide selection of targeted therapy in translational research. Concise grouping of tumor types in cancer maps according to their molecular profile may further be helpful for the development of new therapeutic modalities or open new avenues for already established therapies. All available human tumor data in the Stanford Microarray Database were downloaded and filtered for relevance, adequacy and reliability. A total of 649 tumor samples from more than 1400 experiments and 58 different tissues were analyzed. Next, a method to score deregulation of KEGG pathway maps in different tumor entities was established, which was then used to convert hundreds of gene expression profiles into corresponding tumor-specific pathway activity profiles. Based on the latter, we defined a measure for functional similarity between tumor entities, which yielded a phylogeny of tumors. We provide a comprehensive, easy-to-interpret functional cancer map that characterizes tumor types with respect to their biological and functional behavior. Consistently, multiple pathways commonly associated with tumor progression were revealed as common features in the majority of the tumors. However, several pathways previously not linked to carcinogenesis were identified in multiple cancers, suggesting an essential role of these pathways in cancer biology. Among these pathways were 'ECM-receptor interaction', 'Complement and Coagulation cascades', and 'PPAR signaling pathway'. The functional cancer map provides a systematic view on molecular similarities across different cancers by comparing tumors on the level of pathway activity. This work resulted in identification of novel superimposed functional pathways potentially linked to cancer biology. Therefore, our work may serve as a starting point for rationalizing combination of tumor therapeutics as well as for expanding the application of well-established targeted tumor therapies.
Knowledge representation in metabolic pathway databases.

PubMed

Stobbe, Miranda D; Jansen, Gerbert A; Moerland, Perry D; van Kampen, Antoine H C

2014-05-01

The accurate representation of all aspects of a metabolic network in a structured format, such that it can be used for a wide variety of computational analyses, is a challenge faced by a growing number of researchers. Analysis of five major metabolic pathway databases reveals that each database has made widely different choices to address this challenge, including how to deal with knowledge that is uncertain or missing. In concise overviews, we show how concepts such as compartments, enzymatic complexes and the direction of reactions are represented in each database. Importantly, also concepts which a database does not represent are described. Which aspects of the metabolic network need to be available in a structured format and to what detail differs per application. For example, for in silico phenotype prediction, a detailed representation of gene-protein-reaction relations and the compartmentalization of the network is essential. Our analysis also shows that current databases are still limited in capturing all details of the biology of the metabolic network, further illustrated with a detailed analysis of three metabolic processes. Finally, we conclude that the conceptual differences between the databases, which make knowledge exchange and integration a challenge, have not been resolved, so far, by the exchange formats in which knowledge representation is standardized.
Entamoeba histolytica: construction and applications of subgenomic databases.

PubMed

Hofer, Margit; Duchêne, Michael

2005-07-01

Knowledge about the influence of environmental stress such as the action of chemotherapeutic agents on gene expression in Entamoeba histolytica is limited. We plan to use oligonucleotide microarray hybridization to approach these questions. As the basis for our array, sequence data from the genome project carried out by the Institute for Genomic Research (TIGR) and the Sanger Institute were used to annotate parts of the parasite genome. Three subgenomic databases containing enzymes, cytoskeleton genes, and stress genes were compiled with the help of the ExPASy proteomics website and the BLAST servers at the two genome project sites. The known sequences from reference species, mostly human and Escherichia coli, were searched against TIGR and Sanger E. histolytica sequence contigs and the homologs were copied into a Microsoft Access database. In a similar way, two additional databases of cytoskeletal genes and stress genes were generated. Metabolic pathways could be assembled from our enzyme database, but sometimes they were incomplete as is the case for the sterol biosynthesis pathway. The raw databases contained a significant number of duplicate entries which were merged to obtain curated non-redundant databases. This procedure revealed that some E. histolytica genes may have several putative functions. Representative examples such as the case of the delta-aminolevulinate synthase/serine palmitoyltransferase are discussed.
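The duplicate-merging step described in this record can be illustrated with a minimal sketch. It is not the authors' procedure (they worked in a Microsoft Access database); the contig identifiers and the `merge_duplicates` helper are hypothetical, and the only point shown is that collapsing repeated rows can leave one sequence with several putative functions.

```python
from collections import defaultdict


def merge_duplicates(raw_entries):
    """Collapse rows that share a sequence ID, keeping each distinct putative function once."""
    merged = defaultdict(set)
    for contig_id, putative_function in raw_entries:
        merged[contig_id].add(putative_function)
    # Sort the functions so the curated table is deterministic.
    return {cid: sorted(funcs) for cid, funcs in merged.items()}


# Hypothetical raw hits: (contig ID, function of the reference sequence that matched it).
raw = [
    ("contig_001", "delta-aminolevulinate synthase"),
    ("contig_001", "serine palmitoyltransferase"),
    ("contig_001", "delta-aminolevulinate synthase"),  # verbatim duplicate row
    ("contig_002", "actin"),
]

curated = merge_duplicates(raw)

# Sequences that keep more than one putative function after merging.
multi_function = [cid for cid, funcs in curated.items() if len(funcs) > 1]
print(curated)
print(multi_function)
```

Keeping the functions as a set per sequence, rather than discarding all but the first hit, is what lets ambiguous cases such as the one named in the abstract surface during curation.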
BIOZON: a system for unification, management and analysis of heterogeneous biological data.

PubMed

Birkland, Aaron; Yona, Golan

2006-02-15

Integration of heterogeneous data types is a challenging problem, especially in biology, where the number of databases and data types increases rapidly. Amongst the problems that one has to face are integrity, consistency, redundancy, connectivity, expressiveness and updatability. Here we present a system (Biozon) that addresses these problems, and offers biologists a new knowledge resource to navigate through and explore. Biozon unifies multiple biological databases consisting of a variety of data types (such as DNA sequences, proteins, interactions and cellular pathways). It is fundamentally different from previous efforts as it uses a single extensive and tightly connected graph schema wrapped with a hierarchical ontology of documents and relations. Beyond warehousing existing data, Biozon computes and stores novel derived data, such as similarity relationships and functional predictions. The integration of similarity data allows propagation of knowledge through inference and fuzzy searches. Sophisticated methods of query that span multiple data types were implemented and first-of-a-kind biological ranking systems were explored and integrated. The Biozon system is an extensive knowledge resource of heterogeneous biological data. Currently, it holds more than 100 million biological documents and 6.5 billion relations between them. The database is accessible through an advanced web interface that supports complex queries, "fuzzy" searches, data materialization and more, online at http://biozon.org.
EADB: An Estrogenic Activity Database for Assessing ...

EPA Pesticide Factsheets

Endocrine-active chemicals can potentially have adverse effects on both humans and wildlife. They can interfere with the body’s endocrine system through direct or indirect interactions with many protein targets. Estrogen receptors (ERs) are one of the major targets, and many endocrine disruptors are estrogenic and affect the normal estrogen signaling pathways. However, ERs can also serve as therapeutic targets for various medical conditions, such as menopausal symptoms, osteoporosis, and ER-positive breast cancer. Because of the decades-long interest in the safety and therapeutic utility of estrogenic chemicals, a large number of chemicals have been assayed for estrogenic activity, but these data exist in various sources and different formats that restrict the ability of regulatory and industry scientists to utilize them fully for assessing risk-benefit. To address this issue, we have developed an Estrogenic Activity Database (EADB; http://www.fda.gov/ScienceResearch/BioinformaticsTools/EstrogenicActivityDatabaseEADB/default.htm) and made it freely available to the public. EADB contains 18,114 estrogenic activity data points collected for 8212 chemicals tested in 1284 binding, reporter gene, cell proliferation, and in vivo assays in 11 different species. The chemicals cover a broad chemical structure space and the data span a wide range of activities. A set of tools allow users to access EADB and evaluate potential endocrine activity of
Role for protein–protein interaction databases in human genetics

PubMed Central

Pattin, Kristine A; Moore, Jason H

2010-01-01

Proteomics and the study of protein–protein interactions are becoming increasingly important in our effort to understand human diseases on a system-wide level. Thanks to the development and curation of protein-interaction databases, up-to-date information on these interaction networks is accessible and publicly available to the scientific community. As our knowledge of protein–protein interactions increases, it is important to give thought to the different ways that these resources can impact biomedical research. In this article, we highlight the importance of protein–protein interactions in human genetics and genetic epidemiology. Since protein–protein interactions demonstrate one of the strongest functional relationships between genes, combining genomic data with available proteomic data may provide us with a more in-depth understanding of common human diseases. In this review, we will discuss some of the fundamentals of protein interactions, the databases that are publicly available and how information from these databases can be used to facilitate genome-wide genetic studies. PMID:19929610
Systematization of the protein sequence diversity in enzymes related to secondary metabolic pathways in plants, in the context of big data biology inspired by the KNApSAcK motorcycle database.

PubMed

Ikeda, Shun; Abe, Takashi; Nakamura, Yukiko; Kibinge, Nelson; Hirai Morita, Aki; Nakatani, Atsushi; Ono, Naoaki; Ikemura, Toshimichi; Nakamura, Kensuke; Altaf-Ul-Amin, Md; Kanaya, Shigehiko

2013-05-01

Biology is increasingly becoming a data-intensive science with the recent progress of the omics fields, e.g. genomics, transcriptomics, proteomics and metabolomics. The species-metabolite relationship database, KNApSAcK Core, has been widely utilized and cited in metabolomics research, and chronological analysis of that research work has helped to reveal recent trends in metabolomics research. To meet the needs of these trends, the KNApSAcK database has been extended by incorporating a secondary metabolic pathway database called Motorcycle DB. We examined the enzyme sequence diversity related to secondary metabolism by means of batch-learning self-organizing maps (BL-SOMs). Initially, we constructed a map by using a big data matrix consisting of the frequencies of all possible dipeptides in the protein sequence segments of plants and bacteria. The enzyme sequence diversity of the secondary metabolic pathways was examined by identifying clusters of segments associated with certain enzyme groups in the resulting map. The extent of diversity of 15 secondary metabolic enzyme groups is discussed. Data-intensive approaches such as BL-SOM applied to big data matrices are needed for systematizing protein sequences. Handling big data has become an inevitable part of biology.
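The input to the BL-SOM in this record is described as a matrix of dipeptide frequencies per protein sequence segment. A minimal sketch of one row of such a matrix follows. It assumes the 20 standard amino acids, overlapping two-residue windows, and normalization by the number of counted windows; the abstract does not state these details, and the self-organizing map training itself is not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumption)
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]  # 400 features


def dipeptide_frequencies(segment):
    """Return a 400-element vector of dipeptide frequencies for one sequence segment."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    # Slide an overlapping two-residue window along the segment.
    for i in range(len(segment) - 1):
        pair = segment[i:i + 2]
        if pair in counts:  # skip windows containing non-standard symbols such as X
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Hypothetical segment; a real data matrix would stack one such row per segment.
vector = dipeptide_frequencies("MKVLAAGIVALLLAAG")
print(len(vector), round(sum(vector), 6))
```

Every segment maps to a vector of the same length regardless of its own length, which is what allows segments from different proteins and species to be placed on a single map.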
Comprehensive coverage of cardiovascular disease data in the disease portals at the Rat Genome Database.

PubMed

Wang, Shur-Jen; Laulederkind, Stanley J F; Hayman, G Thomas; Petri, Victoria; Smith, Jennifer R; Tutaj, Marek; Nigam, Rajni; Dwinell, Melinda R; Shimoyama, Mary

2016-08-01

Cardiovascular diseases are complex diseases caused by a combination of genetic and environmental factors. To facilitate progress in complex disease research, the Rat Genome Database (RGD) provides the community with a disease portal where genome objects and biological data related to cardiovascular diseases are systematically organized. The purpose of this study is to present biocuration at RGD, including disease, genetic, and pathway data. The RGD curation team uses controlled vocabularies/ontologies to organize data curated from the published literature or imported from disease and pathway databases. These organized annotations are associated with genes, strains, and quantitative trait loci (QTLs), thus linking functional annotations to genome objects. Screen shots from the web pages are used to demonstrate the organization of annotations at RGD. The human cardiovascular disease genes identified by annotations were grouped according to data sources and their annotation profiles were compared by in-house tools and other enrichment tools available to the public. The analysis results show that the imported cardiovascular disease genes from ClinVar and OMIM are functionally different from the RGD manually curated genes in terms of pathway and Gene Ontology annotations. The inclusion of disease genes from other databases enriches the collection of disease genes not only in quantity but also in quality. Copyright © 2016 the American Physiological Society.
The University of Minnesota Biocatalysis/Biodegradation Database: specialized metabolism for functional genomics.

PubMed Central

Ellis, L B; Hershberger, C D; Wackett, L P

1999-01-01

The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD, http://www.labmed.umn.edu/umbbd/index.html) first became available on the web in 1995 to provide information on microbial biocatalytic reactions of, and biodegradation pathways for, organic chemical compounds, especially those produced by man. Its goal is to become a representative database of biodegradation, spanning the diversity of known microbial metabolic routes, organic functional groups, and environmental conditions under which biodegradation occurs. The database can be used to enhance understanding of basic biochemistry, biocatalysis leading to speciality chemical manufacture, and biodegradation of environmental pollutants. It is also a resource for functional genomics, since it contains information on enzymes and genes involved in specialized metabolism not found in intermediary metabolism databases, and thus can assist in assigning functions to genes homologous to such less common genes. With information on >400 reactions and compounds, it is poised to become a resource for prediction of microbial biodegradation pathways for compounds it does not contain, a process complementary to predicting the functions of new classes of microbial genes. PMID:9847233
DockScreen: A database of in silico biomolecular interactions to support computational toxicology

EPA Science Inventory

We have developed DockScreen, a database of in silico biomolecular interactions designed to enable rational molecular toxicological insight within a computational toxicology framework. This database is composed of chemical/target (receptor and enzyme) binding scores calculated by...

An "EAR" on environmental surveillance and monitoring: A ...

EPA Pesticide Factsheets

Current environmental monitoring approaches focus primarily on chemical occurrence. However, based on chemical concentration alone, it can be difficult to identify which compounds may be of toxicological concern for prioritization for further monitoring or management. This can be problematic because toxicological characterization is lacking for many emerging contaminants. New sources of high throughput screening data like the ToxCast™ database, which contains data for over 9,000 compounds screened through up to 1,100 assays, are now available. Integrated analysis of chemical occurrence data with HTS data offers new opportunities to prioritize chemicals, sites, or biological effects for further investigation based on concentrations detected in the environment linked to relative potencies in pathway-based bioassays. As a case study, chemical occurrence data from a 2012 study in the Great Lakes Basin along with the ToxCast™ effects database were used to calculate exposure-activity ratios (EARs) as a prioritization tool. Technical considerations of data processing and use of the ToxCast™ database are presented and discussed. EAR prioritization identified multiple sites, biological pathways, and chemicals that warrant further investigation. Biological pathways were then linked to adverse outcome pathways to identify potential adverse outcomes and biomarkers for use in subsequent monitoring efforts. Anthropogenic contaminants are frequently reported in environm
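An exposure-activity ratio of the kind this record describes can be read as a detected environmental concentration divided by a concentration at which the chemical is active in a bioassay. The sketch below assumes both values are already in the same units (micromolar here) and ranks sites by their largest ratio. The site, chemical and assay names and all numbers are hypothetical, and the choice of potency endpoint is an assumption rather than something taken from the study.

```python
def exposure_activity_ratio(env_conc_um, activity_conc_um):
    """EAR = detected environmental concentration / bioassay activity concentration."""
    if activity_conc_um <= 0:
        raise ValueError("activity concentration must be positive")
    return env_conc_um / activity_conc_um


# Hypothetical detections: (site, chemical, concentration in micromolar).
detections = [
    ("site_A", "chem_1", 0.5),
    ("site_A", "chem_2", 0.02),
    ("site_B", "chem_1", 0.25),
]

# Hypothetical potencies: (chemical, assay) -> activity concentration in micromolar.
potency = {
    ("chem_1", "assay_X"): 5.0,
    ("chem_1", "assay_Y"): 50.0,
    ("chem_2", "assay_X"): 0.4,
}


def max_ear_by_site(detections, potency):
    """For each site, keep the (EAR, chemical, assay) triple with the largest EAR."""
    best = {}
    for site, chem, conc in detections:
        for (p_chem, assay), activity_conc in potency.items():
            if p_chem != chem:
                continue
            ear = exposure_activity_ratio(conc, activity_conc)
            if site not in best or ear > best[site][0]:
                best[site] = (ear, chem, assay)
    return best


ranking = max_ear_by_site(detections, potency)
for site, (ear, chem, assay) in sorted(ranking.items(), key=lambda kv: -kv[1][0]):
    print(site, chem, assay, round(ear, 3))
```

A ratio approaching or exceeding 1 means the detected concentration is close to a level that was active in the assay, which is why the ratio can serve to prioritize sites, chemicals or assay-linked pathways.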
SPV: a JavaScript Signaling Pathway Visualizer.

PubMed

Calderone, Alberto; Cesareni, Gianni

2018-03-24

The visualization of molecular interactions annotated in web resources is useful for presenting such information to users in a clear, intuitive layout. These interactions are frequently represented as binary interactions that are laid out in free space, where different entities, cellular compartments and interaction types are hardly distinguishable. SPV (Signaling Pathway Visualizer) is a free open source JavaScript library which offers a series of pre-defined elements, compartments and interaction types meant to facilitate the representation of signaling pathways consisting of causal interactions without neglecting simple protein-protein interaction networks. Availability: freely available under the Apache version 2 license. Source code: https://github.com/Sinnefa/SPV_Signaling_Pathway_Visualizer_v1.0. Language: JavaScript; Web technology: Scalable Vector Graphics; Libraries: D3.js. Contact: sinnefa@gmail.com.
NPIDB: Nucleic acid-Protein Interaction DataBase.

PubMed

Kirsanov, Dmitry D; Zanegina, Olga N; Aksianov, Evgeniy A; Spirin, Sergei A; Karyagina, Anna S; Alexeevski, Andrei V

2013-01-01

The Nucleic acid-Protein Interaction DataBase (http://npidb.belozersky.msu.ru/) contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the Protein Data Bank (3846 complexes in October 2012). It provides a web interface and a set of tools for extracting biologically meaningful characteristics of nucleoprotein complexes. The content of the database is updated weekly. The current version of the Nucleic acid-Protein Interaction DataBase is an upgrade of the version published in 2007. The improvements include a new web interface, new tools for calculation of intermolecular interactions, a classification of SCOP families that contain DNA-binding protein domains and data on conserved water molecules on the DNA-protein interface.
Application of Ferulic Acid for Alzheimer’s Disease: Combination of Text Mining and Experimental Validation

PubMed Central

Meng, Guilin; Meng, Xiulin; Ma, Xiaoye; Zhang, Gengping; Hu, Xiaolin; Jin, Aiping; Liu, Xueyuan

2018-01-01

Alzheimer’s disease (AD) is an increasing concern in human health. Despite significant research, highly effective drugs to treat AD are lacking. The present study describes the text mining process to identify drug candidates from a traditional Chinese medicine (TCM) database, along with associated protein target mechanisms. We carried out text mining to identify literature that referenced both AD and TCM and focused on identifying compounds and protein targets of interest. After targeting one potential TCM candidate, corresponding protein-protein interaction (PPI) networks were assembled in STRING to decipher the most probable mechanism of action. This was followed by validation using Western blot and co-immunoprecipitation in an AD cell model. The text mining strategy using a vast amount of AD-related literature and the TCM database identified curcumin, whose major component was ferulic acid (FA). This was used as a key candidate compound for further study. Using the top calculated interaction score in STRING, BACE1 and MMP2 were implicated in the activity of FA in AD. Exposure of SHSY5Y-APP cells to FA resulted in a decrease in expression levels of BACE-1 and APP, while the expression of MMP-2 and MMP-9 increased in a dose-dependent manner. This suggests that FA-induced BACE1 and MMP2 pathways may be novel potential mechanisms involved in AD. The text mining of literature and the TCM database related to AD suggested FA as a promising TCM ingredient for the treatment of AD. Potential mechanisms interconnected and integrated with Aβ aggregation inhibition and extracellular matrix remodeling underlying the activity of FA were identified using in vitro studies. PMID:29896095
Analysis of gene expression profile microarray data in complex regional pain syndrome.

PubMed

Tan, Wulin; Song, Yiyan; Mo, Chengqiang; Jiang, Shuangjian; Wang, Zhongxing

2017-09-01

The aim of the present study was to predict key genes and proteins associated with complex regional pain syndrome (CRPS) using bioinformatics analysis. The gene expression profiling microarray data, GSE47603, which included peripheral blood samples from 4 patients with CRPS and 5 healthy controls, was obtained from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) in CRPS patients compared with healthy controls were identified using the GEO2R online tool. Functional enrichment analysis was then performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) online tool. Protein‑protein interaction (PPI) network analysis was subsequently performed using the Search Tool for the Retrieval of Interacting Genes (STRING) database and analyzed with Cytoscape software. A total of 257 DEGs were identified, including 243 upregulated genes and 14 downregulated ones. Genes in the human leukocyte antigen (HLA) family were most significantly differentially expressed. Enrichment analysis demonstrated that signaling pathways, including immune response, cell motion, adhesion and angiogenesis, were associated with CRPS. PPI network analysis revealed that key genes, including early region 1A binding protein p300 (EP300), CREB‑binding protein (CREBBP), signal transducer and activator of transcription (STAT)3, STAT5A and integrin α M, were associated with CRPS. The results suggest that the immune response may play an important role in CRPS development. In addition, genes in the HLA family, such as HLA‑DQB1 and HLA‑DRB1, may be potential biomarkers for the diagnosis of CRPS. Furthermore, EP300, its paralog CREBBP, and the STAT family genes, STAT3 and STAT5, may be important in the development of CRPS.
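One common way to nominate key genes from a PPI network such as the one in this record is to rank nodes by degree after discarding low-confidence edges. The sketch below illustrates that idea only. It is not the authors' Cytoscape workflow; the gene labels, edge scores and the 0.4 cut-off are hypothetical placeholders, and `hub_nodes` is an illustrative helper name.

```python
from collections import Counter


def hub_nodes(edges, min_score=0.4, top_n=3):
    """Rank nodes by the number of interaction partners among sufficiently confident edges."""
    degree = Counter()
    for node_a, node_b, score in edges:
        if score >= min_score and node_a != node_b:
            degree[node_a] += 1
            degree[node_b] += 1
    # Highest degree first; ties broken alphabetically so the output is stable.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical scored edges: (protein, protein, confidence between 0 and 1).
edges = [
    ("gene_A", "gene_B", 0.9),
    ("gene_A", "gene_C", 0.8),
    ("gene_A", "gene_D", 0.7),
    ("gene_B", "gene_C", 0.6),
    ("gene_D", "gene_E", 0.2),  # below the cut-off, so it is ignored
]

hubs = hub_nodes(edges)
print(hubs)
```

Degree is only one possible centrality measure; with 4 patients and 5 controls behind the underlying expression data, any such ranking is best treated as a list of candidates.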
Application of Ferulic Acid for Alzheimer's Disease: Combination of Text Mining and Experimental Validation.

PubMed

Meng, Guilin; Meng, Xiulin; Ma, Xiaoye; Zhang, Gengping; Hu, Xiaolin; Jin, Aiping; Zhao, Yanxin; Liu, Xueyuan

2018-01-01

Alzheimer's disease (AD) is an increasing concern in human health. Despite significant research, highly effective drugs to treat AD are lacking. The present study describes the text mining process to identify drug candidates from a traditional Chinese medicine (TCM) database, along with associated protein target mechanisms. We carried out text mining to identify literature that referenced both AD and TCM and focused on identifying compounds and protein targets of interest. After targeting one potential TCM candidate, corresponding protein-protein interaction (PPI) networks were assembled in STRING to decipher the most probable mechanism of action. This was followed by validation using Western blot and co-immunoprecipitation in an AD cell model. The text mining strategy using a vast amount of AD-related literature and the TCM database identified curcumin, whose major component was ferulic acid (FA). This was used as a key candidate compound for further study. Using the top calculated interaction score in STRING, BACE1 and MMP2 were implicated in the activity of FA in AD. Exposure of SHSY5Y-APP cells to FA resulted in a decrease in expression levels of BACE-1 and APP, while the expression of MMP-2 and MMP-9 increased in a dose-dependent manner. This suggests that FA-induced BACE1 and MMP2 pathways may be novel potential mechanisms involved in AD. The text mining of literature and the TCM database related to AD suggested FA as a promising TCM ingredient for the treatment of AD. Potential mechanisms interconnected and integrated with Aβ aggregation inhibition and extracellular matrix remodeling underlying the activity of FA were identified using in vitro studies.
Phthalic acid chemical probes synthesized for protein-protein interaction analysis.

PubMed

Liang, Shih-Shin; Liao, Wei-Ting; Kuo, Chao-Jen; Chou, Chi-Hsien; Wu, Chin-Jen; Wang, Hui-Min

2013-06-24

Plasticizers are additives that are used to increase the flexibility of plastic during manufacturing. However, in injection molding processes, plasticizers cannot be generated with monomers because they can peel off from the plastics into the surrounding environment, water, or food, or become attached to skin. Among the various plasticizers that are used, 1,2-benzenedicarboxylic acid (phthalic acid) is a typical precursor to generate phthalates. In addition, phthalic acid is a metabolite of diethylhexyl phthalate (DEHP). According to the Gene Ontology gene/protein database, phthalates can cause genital diseases, cardiotoxicity, hepatotoxicity, nephrotoxicity, etc. In this study, a silanized linker (3-aminopropyl triethoxysilane, APTES) was deposited on silicon dioxide (SiO2) particles and phthalate chemical probes were manufactured from phthalic acid and APTES-SiO2. These probes could be used for detecting proteins that targeted phthalic acid and for protein-protein interactions. The phthalic acid chemical probes we produced were incubated with epithelioid cell lysates of normal rat kidney (NRK-52E cells) to detect the interactions between phthalic acid and NRK-52E extracted proteins. These chemical probes interacted with a number of chaperones such as protein disulfide-isomerase A6, heat shock proteins, and Serpin H1. Ingenuity Pathways Analysis (IPA) software showed that these chemical probes were a practical technique for protein-protein interaction analysis.
Evaluating Land-Atmosphere Interactions with the North American Soil Moisture Database

NASA Astrophysics Data System (ADS)

Giles, S. M.; Quiring, S. M.; Ford, T.; Chavez, N.; Galvan, J.

2015-12-01

The North American Soil Moisture Database (NASMD) is a high-quality observational soil moisture database that was developed to study land-atmosphere interactions. It includes over 1,800 monitoring stations in the United States, Canada and Mexico. Soil moisture data are collected from multiple sources, quality controlled and integrated into an online database (soilmoisture.tamu.edu). The period of record varies substantially and only a few of these stations have an observation record extending back into the 1990s. Daily soil moisture observations have been quality controlled using the North American Soil Moisture Database QAQC algorithm. The database is designed to facilitate observationally-driven investigations of land-atmosphere interactions, validation of the accuracy of soil moisture simulations in global land surface models, satellite calibration/validation for SMOS and SMAP, and an improved understanding of how soil moisture influences climate on seasonal to interannual timescales. This paper provides some examples of how the NASMD has been utilized to enhance understanding of land-atmosphere interactions in the U.S. Great Plains.
Structuring osteosarcoma knowledge: an osteosarcoma-gene association database based on literature mining and manual annotation.

PubMed

Poos, Kathrin; Smida, Jan; Nathrath, Michaela; Maugg, Doris; Baumhoer, Daniel; Neumann, Anna; Korsching, Eberhard

2014-01-01

Osteosarcoma (OS) is the most common primary bone cancer exhibiting high genomic instability. This genomic instability affects multiple genes and microRNAs to a varying extent depending on patient and tumor subtype. Massive research is ongoing to identify genes including their gene products and microRNAs that correlate with disease progression and might be used as biomarkers for OS. However, the genomic complexity hampers the identification of reliable biomarkers. Up to now, clinico-pathological factors are the key determinants to guide prognosis and therapeutic treatments. Each day, new studies about OS are published and complicate the acquisition of information to support biomarker discovery and therapeutic improvements. Thus, it is necessary to provide a structured and annotated view on the current OS knowledge that is quickly and easily accessible to researchers of the field. Therefore, we developed a publicly available database and Web interface that serves as a resource for OS-associated genes and microRNAs. Genes and microRNAs were collected using an automated dictionary-based gene recognition procedure followed by manual review and annotation by experts of the field. In total, 911 genes and 81 microRNAs related to 1331 PubMed abstracts were collected (last update: 29 October 2013). Users can evaluate genes and microRNAs according to their potential prognostic and therapeutic impact, the experimental procedures, the sample types, the biological contexts and microRNA target gene interactions. Additionally, a pathway enrichment analysis of the collected genes highlights different aspects of OS progression. OS requires pathways commonly deregulated in cancer but also features OS-specific alterations like deregulated osteoclast differentiation. To our knowledge, this is the first effort of an OS database containing manually reviewed and annotated up-to-date OS knowledge. It might be a useful resource especially for the bone tumor research community, as specific information about genes or microRNAs is quickly and easily accessible. Hence, this platform can support the ongoing OS research and biomarker discovery. Database URL: http://osteosarcoma-db.uni-muenster.de. © The Author(s) 2014. Published by Oxford University Press.
Structuring osteosarcoma knowledge: an osteosarcoma-gene association database based on literature mining and manual annotation

PubMed Central

Poos, Kathrin; Smida, Jan; Nathrath, Michaela; Maugg, Doris; Baumhoer, Daniel; Neumann, Anna; Korsching, Eberhard

2014-01-01

Osteosarcoma (OS) is the most common primary bone cancer exhibiting high genomic instability. This genomic instability affects multiple genes and microRNAs to a varying extent depending on patient and tumor subtype. Extensive research is ongoing to identify genes including their gene products and microRNAs that correlate with disease progression and might be used as biomarkers for OS. However, the genomic complexity hampers the identification of reliable biomarkers. Up to now, clinico-pathological factors are the key determinants to guide prognosis and therapeutic treatments. Each day, new studies about OS are published and complicate the acquisition of information to support biomarker discovery and therapeutic improvements. Thus, it is necessary to provide a structured and annotated view on the current OS knowledge that is quickly and easily accessible to researchers in the field. Therefore, we developed a publicly available database and Web interface that serves as a resource for OS-associated genes and microRNAs. Genes and microRNAs were collected using an automated dictionary-based gene recognition procedure followed by manual review and annotation by experts in the field. In total, 911 genes and 81 microRNAs related to 1331 PubMed abstracts were collected (last update: 29 October 2013). Users can evaluate genes and microRNAs according to their potential prognostic and therapeutic impact, the experimental procedures, the sample types, the biological contexts and microRNA target gene interactions. Additionally, a pathway enrichment analysis of the collected genes highlights different aspects of OS progression. OS requires pathways commonly deregulated in cancer but also features OS-specific alterations like deregulated osteoclast differentiation. To our knowledge, this is the first OS database containing manually reviewed and annotated up-to-date OS knowledge. 
It might be a useful resource especially for the bone tumor research community, as specific information about genes or microRNAs is quickly and easily accessible. Hence, this platform can support the ongoing OS research and biomarker discovery. Database URL: http://osteosarcoma-db.uni-muenster.de PMID:24865352
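The abstract above mentions an automated dictionary-based gene recognition step but gives no implementation details. The following is only a minimal sketch of how such a step could look, assuming a small synonym dictionary and whole-token matching. The dictionary entries and the function name `find_gene_mentions` are hypothetical and are not taken from the database.

```python
import re

# Hypothetical mini-dictionary (symbol -> synonyms); not taken from the OS database.
GENE_DICTIONARY = {
    "TP53": ["TP53", "p53"],
    "RB1": ["RB1", "retinoblastoma 1"],
    "MIR21": ["miR-21", "microRNA-21"],
}

def find_gene_mentions(text, dictionary=GENE_DICTIONARY):
    """Return the sorted list of dictionary symbols mentioned in text."""
    found = set()
    for symbol, synonyms in dictionary.items():
        for name in synonyms:
            # Require non-alphanumeric boundaries so "miR-21" does not match "miR-210".
            pattern = r"(?<![A-Za-z0-9])" + re.escape(name) + r"(?![A-Za-z0-9])"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.add(symbol)
                break
    return sorted(found)

if __name__ == "__main__":
    sentence = "Loss of p53 and elevated miR-21 were observed in osteosarcoma."
    print(find_gene_mentions(sentence))
```

Hits from a matcher like this would still need the manual review and annotation that the abstract describes, since dictionary matching alone cannot resolve ambiguous symbols.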
ReNE: A Cytoscape Plugin for Regulatory Network Enhancement

PubMed Central

Politano, Gianfranco; Benso, Alfredo; Savino, Alessandro; Di Carlo, Stefano

2014-01-01

One of the biggest challenges in the study of biological regulatory mechanisms is the integration, modeling, and analysis of the complex interactions which take place in biological networks. Although post-transcriptional regulatory elements (i.e., miRNAs) are widely investigated in current research, their usage and visualization in biological networks are very limited. Regulatory networks are commonly limited to gene entities. To integrate networks with post-transcriptional regulatory data, researchers are therefore forced to manually resort to specific third-party databases. In this context, we introduce ReNE, a Cytoscape 3.x plugin designed to automatically enrich a standard gene-based regulatory network with more detailed transcriptional, post-transcriptional, and translational data, resulting in an enhanced network that more precisely models the actual biological regulatory mechanisms. ReNE can automatically import a network layout from the Reactome or KEGG repositories, or work with custom pathways described using a standard OWL/XML data format that the Cytoscape import procedure accepts. Moreover, ReNE allows researchers to merge multiple pathways coming from different sources. The merged network structure is normalized to guarantee a consistent and uniform description of the network nodes and edges and to enrich all integrated data with additional annotations retrieved from genome-wide databases like NCBI, thus producing a pathway fully manageable through the Cytoscape environment. The normalized network is then analyzed to include missing transcription factors, miRNAs, and proteins. The resulting enhanced network is still a fully functional Cytoscape network where each regulatory element (transcription factor, miRNA, gene, protein) and regulatory mechanism (up-regulation/down-regulation) is clearly visually identifiable, thus enabling a better visual understanding of its role and the effect in the network behavior. 
The enhanced network produced by ReNE is exportable in multiple formats for further analysis via third-party applications. ReNE can be freely installed from the Cytoscape App Store (http://apps.cytoscape.org/apps/rene) and the full source code is freely available for download through an SVN repository accessible at http://www.sysbio.polito.it/tools_svn/BioInformatics/Rene/releases/. ReNE enhances a network by only integrating data from public repositories, without any inference or prediction. The reliability of the introduced interactions depends only on the reliability of the source data, which is outside the control of the ReNE developers. PMID:25541727
Glioblastoma of the optic pathways: An Atypical case

PubMed Central

Brar, Rahat; Prasad, Abhishek; Brar, Manpreet

2009-01-01

We present a case of glioblastoma multiforme of the optic pathways in a 68-year-old woman. Glioblastomas of the optic pathways are rare tumors; the predominant non-enhancing component and the vast extent of involvement make this a unique case. This case report adds to the knowledge available on the MRI characteristics of malignant optic glioma of adulthood. PMID:22470685
Glioblastoma of the optic pathways: An Atypical case.

PubMed

Brar, Rahat; Prasad, Abhishek; Brar, Manpreet

2009-01-01

We present a case of glioblastoma multiforme of the optic pathways in a 68-year-old woman. Glioblastomas of the optic pathways are rare tumors; the predominant non-enhancing component and the vast extent of involvement make this a unique case. This case report adds to the knowledge available on the MRI characteristics of malignant optic glioma of adulthood.
DEOP: a database on osmoprotectants and associated pathways

PubMed Central

Bougouffa, Salim; Radovanovic, Aleksandar; Essack, Magbubah; Bajic, Vladimir B.

2014-01-01

Microorganisms are known to counteract salt stress through salt influx or by the accumulation of osmoprotectants (also called compatible solutes). Understanding the pathways that synthesize and/or break down these osmoprotectants is of interest to studies of crop halotolerance and to biotechnology applications that use microbes as cell factories for production of biomass or commercial chemicals. To facilitate the exploration of osmoprotectants, we have developed the first online resource, ‘Dragon Explorer of Osmoprotection associated Pathways’ (DEOP), that gathers and presents curated information about osmoprotectants, complemented by information about reactions and pathways that use or affect them. A combined total of 141 compounds were confirmed as osmoprotectants, which were matched to 1883 reactions and 834 pathways. DEOP can also be used to map genes or microbial genomes to potential osmoprotection-associated pathways, and thus link genes and genomes to other associated osmoprotection information. Moreover, DEOP provides a text-mining utility to search deeper into the scientific literature for supporting evidence or for new associations of osmoprotectants to pathways, reactions, enzymes, genes or organisms. Two case studies are provided to demonstrate the usefulness of DEOP. Database URL: http://www.cbrc.kaust.edu.sa/deop/ PMID:25326239
Axonal guidance signaling pathway interacting with smoking in modifying the risk of pancreatic cancer: a gene- and pathway-based interaction analysis of GWAS data.

PubMed

Tang, Hongwei; Wei, Peng; Duell, Eric J; Risch, Harvey A; Olson, Sara H; Bueno-de-Mesquita, H Bas; Gallinger, Steven; Holly, Elizabeth A; Petersen, Gloria; Bracci, Paige M; McWilliams, Robert R; Jenab, Mazda; Riboli, Elio; Tjønneland, Anne; Boutron-Ruault, Marie Christine; Kaaks, Rudolph; Trichopoulos, Dimitrios; Panico, Salvatore; Sund, Malin; Peeters, Petra H M; Khaw, Kay-Tee; Amos, Christopher I; Li, Donghui

2014-05-01

Cigarette smoking is the best established modifiable risk factor for pancreatic cancer. Genetic factors that underlie smoking-related pancreatic cancer have previously not been examined at the genome-wide level. Taking advantage of the existing genome-wide association study (GWAS) genotype and risk factor data from the Pancreatic Cancer Case Control Consortium, we conducted a discovery study in 2028 cases and 2109 controls to examine gene-smoking interactions at the pathway/gene/single nucleotide polymorphism (SNP) level. Using the likelihood ratio test nested in logistic regression models and Ingenuity Pathway Analysis (IPA), we examined 172 KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, 3 manually curated gene sets, 3 nicotine dependency gene ontology pathways, 17 912 genes and 468 114 SNPs. None of the individual pathways/genes/SNPs showed significant interaction with smoking after adjusting for multiple comparisons. Six KEGG pathways showed nominal interactions (P < 0.05) with smoking, and the top two are the pancreatic secretion and salivary secretion pathways (major contributing genes: RAB8A, PLCB and CTRB1). Nine genes, i.e. ZBED2, EXO1, PSG2, SLC36A1, CLSTN1, MTHFSD, FAT2, IL10RB and ATXN2, had P interaction < 0.0005. Five intergenic region SNPs and two SNPs of the EVC and KCNIP4 genes had P interaction < 0.00003. In IPA analysis of genes with nominal interactions with smoking, axonal guidance signaling ($P = 2.12 \times 10^{-7}$) and α-adrenergic signaling ($P = 2.52 \times 10^{-5}$) genes were significantly overrepresented canonical pathways. Genes contributing to the axon guidance signaling pathway included the SLIT/ROBO signaling genes that were frequently altered in pancreatic cancer. These observations need to be confirmed in additional data sets. Once confirmed, they will open a new avenue to unveiling the etiology of smoking-associated pancreatic cancer.
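The likelihood ratio test named in the abstract above compares two nested logistic regression models, one with and one without the gene-smoking interaction term. The sketch below shows only the final step, assuming a single extra interaction parameter (1 degree of freedom) and assuming the two maximized log-likelihoods have already been obtained from a model-fitting routine. The function name and the example values are hypothetical and do not come from the study.

```python
import math

def lrt_pvalue_df1(loglik_reduced, loglik_full):
    """Likelihood ratio test for one extra parameter.

    loglik_reduced: maximized log-likelihood without the interaction term.
    loglik_full:    maximized log-likelihood with the interaction term.
    Returns (test statistic, p-value) using the chi-square distribution with 1 df.
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # For 1 df, the chi-square survival function equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

if __name__ == "__main__":
    # Hypothetical log-likelihoods, chosen only to illustrate the call.
    stat, p = lrt_pvalue_df1(loglik_reduced=-100.0, loglik_full=-98.07925)
    print(round(stat, 4), p)
```

With several interaction parameters the degrees of freedom would be larger than 1 and a general chi-square survival function would be needed instead of the closed form used here.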
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

DOE Office of Scientific and Technical Information (OSTI.GOV)

AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

DOE PAGES

AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

2015-11-19

Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
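The abstract above summarizes binding affinities as position weight matrices (PWMs) without describing how they are built. As a rough illustration of the general idea only, the sketch below derives a log2-odds PWM from a handful of aligned binding sites and scores a candidate sequence. The example sites, the pseudocount, the uniform background of 0.25, and the function names are assumptions made for this sketch and do not reflect the database's own procedure.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds position weight matrix from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        # Smoothed base frequency divided by the background frequency, in log2.
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm

def score(pwm, sequence):
    """Sum the per-position log-odds of a sequence with the same length as the PWM."""
    return sum(column[base] for column, base in zip(pwm, sequence))

if __name__ == "__main__":
    sites = ["ACGT", "ACGT", "ACGA"]  # hypothetical aligned binding sites
    pwm = build_pwm(sites)
    print(score(pwm, "ACGT"), score(pwm, "TTTT"))
```

A sequence that resembles the aligned sites receives a higher score than one that does not, which is what makes a PWM usable as a compact affinity summary.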
sc-PDB-Frag: a database of protein-ligand interaction patterns for Bioisosteric replacements.

PubMed

Desaphy, Jérémy; Rognan, Didier

2014-07-28

Bioisosteric replacement plays an important role in medicinal chemistry by keeping the biological activity of a molecule while changing either its core scaffold or substituents, thereby facilitating lead optimization and patenting. Bioisosteres are classically chosen in order to keep the main pharmacophoric moieties of the substructure to be replaced. However, notably when changing a scaffold, no attention is usually paid as to whether all atoms of the reference scaffold are equally important for binding to the desired target. We herewith propose a novel database for bioisosteric replacement (scPDBFrag), capitalizing on our recently published structure-based approach to scaffold hopping, focusing on interaction pattern graphs. Protein-bound ligands are first fragmented and the interaction of the corresponding fragments with their protein environment computed on the fly. Using an in-house developed graph alignment tool, interaction pattern graphs can be compared, aligned, and sorted by decreasing similarity to any reference. In the herein presented sc-PDB-Frag database (http://bioinfo-pharma.u-strasbg.fr/scPDBFrag), fragments, interaction patterns, alignments, and pairwise similarity scores have been extracted from the sc-PDB database of 8077 druggable protein-ligand complexes and further stored in a relational database. We herewith present the database, its Web implementation, and procedures for identifying true bioisosteric replacements based on conserved interaction patterns.
Angiosperms Are Unique among Land Plant Lineages in the Occurrence of Key Genes in the RNA-Directed DNA Methylation (RdDM) Pathway

PubMed Central

Ma, Lu; Hatlen, Andrea; Kelly, Laura J.; Becher, Hannes; Wang, Wencai; Kovarik, Ales; Leitch, Ilia J.; Leitch, Andrew R.

2015-01-01

The RNA-directed DNA methylation (RdDM) pathway can be divided into three phases: 1) small interfering RNA biogenesis, 2) de novo methylation, and 3) chromatin modification. To determine the degree of conservation of this pathway we searched for key genes among land plants. We used OrthoMCL and the OrthoMCL Viridiplantae database to analyze proteomes of species in bryophytes, lycophytes, monilophytes, gymnosperms, and angiosperms. We also analyzed small RNA size categories and, in two gymnosperms, cytosine methylation in ribosomal DNA. Six proteins were restricted to angiosperms, these being NRPD4/NRPE4, RDM1, DMS3 (defective in meristem silencing 3), SHH1 (SAWADEE homeodomain homolog 1), KTF1, and SUVR2, although we failed to find the latter three proteins in Fritillaria persica, a species with a giant genome. Small RNAs of 24 nt in length were abundant only in angiosperms. Phylogenetic analyses of Dicer-like (DCL) proteins showed that DCL2 was restricted to seed plants, although it was absent in Gnetum gnemon and Welwitschia mirabilis. The data suggest that phases (1) and (2) of the RdDM pathway, described for model angiosperms, evolved with angiosperms. The absence of some features of RdDM in F. persica may be associated with its large genome. Phase (3) is probably the most conserved part of the pathway across land plants. DCL2, involved in virus defense and interaction with the canonical RdDM pathway to facilitate methylation of CHH, is absent outside seed plants. Its absence in G. gnemon and W. mirabilis, coupled with distinctive patterns of CHH methylation, suggests a secondary loss of DCL2 following the divergence of Gnetales. PMID:26338185
Pathway Interaction Network Analysis Identifies Dysregulated Pathways in Human Monocytes Infected by Listeria monocytogenes.

PubMed

Fan, Wufeng; Zhou, Yuhan; Li, Hao

2017-01-01

In our study, we aimed to extract dysregulated pathways in human monocytes infected by Listeria monocytogenes (LM) based on a pathway interaction network (PIN), which represents the functional dependency between pathways. After genes were aligned to the pathways, principal component analysis (PCA) was used to calculate the pathway activity for each pathway, followed by detection of the seed pathway. A PIN was constructed based on gene expression profiles, protein-protein interactions (PPIs), and cellular pathways. Dysregulated pathways were identified from the PIN based on the seed pathway and classification accuracy. To evaluate whether the PIN method was feasible, we compared it with standard network centrality measures. The pathway of RNA polymerase II pretranscription events was selected as the seed pathway. Starting from this seed pathway, one pathway set (9 dysregulated pathways) with an AUC score of 1.00 was identified. Among the 5 hub pathways obtained using standard network centrality measures, 4 pathways were common to the two methods. RNA polymerase II transcription and DNA replication contained a higher number of pathway genes and differentially expressed genes (DEGs). These dysregulated pathways work together to influence the progression of LM infection, and they can serve as biomarkers to diagnose LM infection.
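The abstract above states that PCA was used to calculate a pathway activity but does not say how. One common reading, assumed here, is that the activity of a pathway in each sample is the sample's score on the first principal component of the pathway's gene expression sub-matrix. The sketch below follows that assumption using only the standard library, with power iteration standing in for a full eigendecomposition. The function name and the toy data are hypothetical.

```python
import math

def pathway_activity(expression, n_iter=200):
    """First principal component score per sample for one pathway.

    expression: list of rows, one per pathway gene, each a list of per-sample values.
    Returns one score per sample. The sign of a principal component is arbitrary.
    """
    n_genes = len(expression)
    n_samples = len(expression[0])
    # Center each gene across samples.
    centered = []
    for row in expression:
        mean = sum(row) / n_samples
        centered.append([value - mean for value in row])
    # Gene-by-gene sample covariance matrix.
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n_samples - 1)
            for j in range(n_genes)] for i in range(n_genes)]
    # Power iteration for the leading eigenvector; the uneven start vector lowers
    # the chance of starting orthogonal to it, but does not rule that out.
    vec = [1.0 / (i + 1) for i in range(n_genes)]
    for _ in range(n_iter):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n_genes)) for i in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # Project every sample onto the leading eigenvector.
    return [sum(vec[g] * centered[g][s] for g in range(n_genes))
            for s in range(n_samples)]

if __name__ == "__main__":
    toy = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]  # two genes, four samples
    print(pathway_activity(toy))
```

In practice a numerical library would normally be used for the decomposition; the pure-Python version is kept here only so the example has no dependencies.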

Gramene 2016: comparative plant genomics and pathway resources

PubMed Central

Tello-Ruiz, Marcela K.; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M.; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A.; Huerta, Laura; Keays, Maria; Tang, Y. Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J.; Jaiswal, Pankaj; Ware, Doreen

2016-01-01

Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803
Integrated Bio-Entity Network: A System for Biological Knowledge Discovery

PubMed Central

Bell, Lindsey; Chowdhary, Rajesh; Liu, Jun S.; Niu, Xufeng; Zhang, Jinfeng

2011-01-01

A significant part of our biological knowledge is centered on relationships between biological entities (bio-entities) such as proteins, genes, small molecules, pathways, gene ontology (GO) terms and diseases. Accumulated at an increasing speed, the information on bio-entity relationships is archived in different forms at scattered places. Most of such information is buried in scientific literature as unstructured text. Organizing heterogeneous information in a structured form not only facilitates study of biological systems using integrative approaches, but also allows discovery of new knowledge in an automatic and systematic way. In this study, we performed a large scale integration of bio-entity relationship information from both databases containing manually annotated, structured information and automatic information extraction of unstructured text in scientific literature. The relationship information we integrated in this study includes protein–protein interactions, protein/gene regulations, protein–small molecule interactions, protein–GO relationships, protein–pathway relationships, and pathway–disease relationships. The relationship information is organized in a graph data structure, named integrated bio-entity network (IBN), where the vertices are the bio-entities and edges represent their relationships. Under this framework, graph theoretic algorithms can be designed to perform various knowledge discovery tasks. We designed breadth-first search with pruning (BFSP) and most probable path (MPP) algorithms to automatically generate hypotheses—the indirect relationships with high probabilities in the network. We show that IBN can be used to generate plausible hypotheses, which not only help to better understand the complex interactions in biological systems, but also provide guidance for experimental designs. PMID:21738677
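The abstract above names a most probable path (MPP) algorithm over the bio-entity graph but does not describe it. The sketch below is not the authors' algorithm. It shows one standard way to find the path with the highest product of edge probabilities, by running Dijkstra's algorithm on the negative logarithms of those probabilities. The graph layout (a dictionary of dictionaries), the entity names, and the probabilities are all hypothetical.

```python
import heapq
import math

def most_probable_path(graph, source, target):
    """Return (path, probability) maximizing the product of edge probabilities.

    graph: dict mapping node -> dict of neighbor -> probability in (0, 1].
    Returns (None, 0.0) when the target cannot be reached.
    """
    best = {source: 0.0}      # lowest accumulated -log(probability) per node
    previous = {}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, prob in graph.get(node, {}).items():
            if prob <= 0.0:
                continue  # an impossible edge can never lie on the best path
            new_cost = cost - math.log(prob)
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, math.exp(-best[target])

if __name__ == "__main__":
    # Hypothetical relationships with made-up confidence values.
    network = {
        "compoundX": {"proteinA": 0.9, "proteinB": 0.5},
        "proteinA": {"diseaseY": 0.8},
        "proteinB": {"diseaseY": 0.95},
    }
    print(most_probable_path(network, "compoundX", "diseaseY"))
```

Taking negative logarithms turns the product of probabilities into a sum of non-negative weights, which is the form Dijkstra's algorithm requires.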
The protein interactome of collapsin response mediator protein-2 (CRMP2/DPYSL2) reveals novel partner proteins in brain tissue.

PubMed

Martins-de-Souza, Daniel; Cassoli, Juliana S; Nascimento, Juliana M; Hensley, Kenneth; Guest, Paul C; Pinzon-Velasco, Andres M; Turck, Christoph W

2015-10-01

Collapsin response mediator protein-2 (CRMP2) is a CNS protein involved in neuronal development, axonal and neuronal growth, cell migration, and protein trafficking. Recent studies have linked perturbations in CRMP2 function to neurodegenerative disorders such as Alzheimer's disease, neuropathic pain, and Batten disease, and to psychiatric disorders such as schizophrenia. Like most proteins, CRMP2 functions through interactions with a molecular network of proteins and other molecules. Here, we have attempted to identify additional proteins of the CRMP2 interactome to provide further leads about its roles in neurological functions. We used a combined co-immunoprecipitation and shotgun proteomic approach in order to identify CRMP2 protein partners. We identified 78 CRMP2 protein partners not previously reported in public protein interaction databases. These were involved in seven biological processes, which included cell signaling, growth, metabolism, trafficking, and immune function, according to Gene Ontology classifications. Furthermore, 32 different molecular functions were found to be associated with these proteins, such as RNA binding, ribosomal functions, transporter activity, receptor activity, serine/threonine phosphatase activity, cell adhesion, cytoskeletal protein binding and catalytic activity. In silico pathway interactome construction revealed a highly connected network with the most overrepresented functions corresponding to semaphorin interactions, along with axon guidance and WNT5A signaling. Taken together, these findings suggest that the CRMP2 pathway is critical for regulating neuronal and synaptic architecture. Further studies along these lines might uncover novel biomarkers and drug targets for use in drug discovery. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
TGMI: an efficient algorithm for identifying pathway regulators through evaluation of triple-gene mutual interaction

PubMed Central

Gunasekara, Chathura; Zhang, Kui; Deng, Wenping; Brown, Laura

2018-01-01

Despite their important roles, the regulators for most metabolic pathways and biological processes remain elusive. Methods for identifying metabolic pathway and biological process regulators are currently intensively sought after. We developed a novel algorithm called triple-gene mutual interaction (TGMI) for identifying these regulators using high-throughput gene expression data. It first calculates the regulatory interactions among triple gene blocks (two pathway genes and one transcription factor (TF)) using conditional mutual information, and then identifies significantly interacting triple genes using a novel mutual interaction measure (MIM), which was shown to reflect the strengths of regulatory interactions within each triple gene block. TGMI calculates the MIM for each triple gene block and then examines its statistical significance using the bootstrap. Finally, the frequencies of all TFs present in all significantly interacting triple gene blocks are calculated and ranked. We showed that the TFs with higher frequencies were usually genuine pathway regulators upon evaluating multiple pathways in plants, animals and yeast. Comparison of TGMI with several other algorithms demonstrated its higher accuracy. Therefore, TGMI will be a valuable tool that can help biologists to identify regulators of metabolic pathways and biological processes from the rapidly growing high-throughput gene expression data in public repositories. PMID:29579312
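The abstract above builds on conditional mutual information between the three genes of a block, but it does not define the mutual interaction measure (MIM) or the discretization used. The sketch below therefore covers only the underlying quantity, $I(X;Y\mid Z)=\sum_{x,y,z} p(x,y,z)\log_2\frac{p(z)\,p(x,y,z)}{p(x,z)\,p(y,z)}$, estimated from counts of already discretized expression levels. The choice of which gene is conditioned on, the discretization, and the function name are assumptions made for illustration.

```python
from collections import Counter
import math

def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits for equal-length discrete sequences."""
    if not (len(x) == len(y) == len(z)):
        raise ValueError("x, y and z must have the same length")
    n = len(x)
    count_xyz = Counter(zip(x, y, z))
    count_xz = Counter(zip(x, z))
    count_yz = Counter(zip(y, z))
    count_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), count in count_xyz.items():
        # p(z) p(x,y,z) / (p(x,z) p(y,z)) reduces to a ratio of raw counts.
        ratio = (count_z[zi] * count) / (count_xz[(xi, zi)] * count_yz[(yi, zi)])
        cmi += (count / n) * math.log2(ratio)
    return cmi

if __name__ == "__main__":
    # Hypothetical discretized expression levels (0 = low, 1 = high) across samples.
    gene_a = [0, 0, 1, 1]
    gene_b = [0, 0, 1, 1]
    tf = [0, 1, 0, 1]
    print(conditional_mutual_information(gene_a, gene_b, tf))
```

A plug-in estimate like this is biased for small sample sizes, which is one reason a resampling step such as the bootstrap mentioned in the abstract is useful for judging significance.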
[Study on intersection and regulation mechanism of "efficacy-toxicity network" of aconite in combination environment of Sini decoction].

PubMed

Li, Zhi-yong; Bao, Hong-juan; Zhang, Shuo-feng; Ye, Tian-yuan; Yang, Ce; Li, Yan-wen

2015-02-01

To explore the intersection and regulation mechanism of "efficacy-toxicity network" of Glycyrrhizae Radix et Rhizoma, Zingiberis Rhizoma and Aconiti Lateralis Radix Praeparata's action gene in the combination environment of Sini decoction with the network pharmacological method. The gene interaction network of Aconiti Lateralis Radix Praeparata, Glycyrrhizae Radix et Rhizoma, Zingiberis Rhizoma were mined and established with Cytoscape software and Agilent literature search plug-in. The "efficiency-toxicity network" intersection of Aconiti Lateralis Radix Praeparata was formed according to its effects in anti-heart failure, neurotoxicity and cardiotoxicity. The target genes were clustered with Clusterviz plug-in. And the possible pathways of the "efficacy-toxicity network" intersection of Glycyrrhizae Radix et Rhizoma, Zingiberis Rhizoma and Aconiti Lateralis Radix Praeparata were forecasted in DAVID database. There were five genes related to neurotoxicity, cardiotoxicity and anti-heart failure function of Aconiti Lateralis Radix Praeparata, namely AKT1, BAX, HCC, IL6 and IL8, which formed 47 nodes genes in the "efficiency-toxicity network" intersection of Aconiti Lateralis Radix Praeparata. There were 29 and 27 coincident genes in the "efficiency-toxicity network" of Glycyrrhizae Radix et Rhizoma, Zingiberis Rhizoma and Aconiti Lateralis Radix Praeparata. There were 23 and 17 possible regulatory pathways. In the combination environment of Sini decoction, Glycyrrhizae Radix et Rhizoma and Zingiberis Rhizoma may regulate the efficiency-toxicity network of Aconiti Lateralis Radix Praeparata by influencing immune-inflammatory signaling pathway, apoptosis-autophagy signaling pathway, nerve cell and myocardial ischemia and hypoxia protection signaling pathways.
Electronic coupling through natural amino acids.

PubMed

Berstis, Laura; Beckham, Gregg T; Crowley, Michael F

2015-12-14

Myriad scientific domains concern themselves with biological electron transfer (ET) events that span across vast scales of rate and efficiency through a remarkably fine-tuned integration of amino acid (AA) sequences, electronic structure, dynamics, and environment interactions. Within this intricate scheme, many questions persist as to how proteins modulate electron-tunneling properties. To help elucidate these principles, we develop a model set of peptides representing the common α-helix and β-strand motifs including all natural AAs within implicit protein-environment solvation. Using an effective Hamiltonian strategy with density functional theory, we characterize the electronic coupling through these peptides, furthermore considering side-chain dynamics. For both motifs, predictions consistently show that backbone-mediated electronic coupling is distinctly sensitive to AA type (aliphatic, polar, aromatic, negatively charged and positively charged), and to side-chain orientation. The unique properties of these residues may be employed to design activated, deactivated, or switch-like superexchange pathways. Electronic structure calculations and Green's function analyses indicate that localized shifts in the electron density along the peptide play a role in modulating these pathways, and further substantiate the experimentally observed behavior of proline residues as superbridges. The distinct sensitivities of tunneling pathways to sequence and conformation revealed in this electronic coupling database help improve our fundamental understanding of the broad diversity of ET reactivity and provide guiding principles for peptide design.
IntegromeDB: an integrated system and biological search engine.

PubMed

Baitaluk, Michael; Kozhenkov, Sergey; Dubinina, Yulia; Ponomarenko, Julia

2012-01-19

With the growth of biological data in volume and heterogeneity, web search engines have become key tools for researchers. However, general-purpose search engines are not specialized for searching biological data. Here, we present an approach to developing a biological web search engine based on Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback.
DDRprot: a database of DNA damage response-related proteins.

PubMed

Andrés-León, Eduardo; Cases, Ildefonso; Arcas, Aida; Rojas, Ana M

2016-01-01

The DNA Damage Response (DDR) signalling network is an essential system that protects the genome's integrity. The DDRprot database presented here is a resource that integrates manually curated information on the human DDR network and its sub-pathways. For each particular DDR protein, we present detailed information about its function. If DDR proteins are involved in post-translational modifications (PTMs) with each other, we depict the position of the modified residue(s) in the three-dimensional structures, when resolved structures are available for the proteins. All this information is linked to the original publication from which it was obtained. Phylogenetic information is also shown, including time of emergence and conservation across 47 selected species, family trees and sequence alignments of homologues. The DDRprot database can be queried by different criteria: pathways, species, evolutionary age or involvement in PTMs. Sequence searches using hidden Markov models can also be used. Database URL: http://ddr.cbbio.es. © The Author(s) 2016. Published by Oxford University Press.
The DrugAge database of aging-related drugs.

PubMed

Barardo, Diogo; Thornton, Daniel; Thoppil, Harikrishnan; Walsh, Michael; Sharifi, Samim; Ferreira, Susana; Anžič, Andreja; Fernandes, Maria; Monteiro, Patrick; Grum, Tjaša; Cordeiro, Rui; De-Souza, Evandro Araújo; Budovsky, Arie; Araujo, Natali; Gruber, Jan; Petrascheck, Michael; Fraifeld, Vadim E; Zhavoronkov, Alexander; Moskalev, Alexey; de Magalhães, João Pedro

2017-06-01

Aging is a major worldwide medical challenge. Not surprisingly, identifying drugs and compounds that extend lifespan in model organisms is a growing research area. Here, we present DrugAge (http://genomics.senescence.info/drugs/), a curated database of lifespan-extending drugs and compounds. At the time of writing, DrugAge contains 1316 entries featuring 418 different compounds from studies across 27 model organisms, including worms, flies, yeast and mice. Data were manually curated from 324 publications. Using drug-gene interaction data, we also performed a functional enrichment analysis of targets of lifespan-extending drugs. Enriched terms include various functional categories related to glutathione and antioxidant activity, ion transport and metabolic processes. In addition, we found a modest but significant overlap between targets of lifespan-extending drugs and known aging-related genes, suggesting that some but not most aging-related pathways have been targeted pharmacologically in longevity studies. DrugAge is freely available online for the scientific community and will be an important resource for biogerontologists. © 2017 The Authors. Aging Cell published by the Anatomical Society and John Wiley & Sons Ltd.
KEGGParser: parsing and editing KEGG pathway maps in Matlab.

PubMed

Arakelyan, Arsen; Nersisyan, Lilit

2013-02-15

The KEGG pathway database is a collection of manually drawn pathway maps accompanied by KGML format files intended for use in automatic analysis. KGML files, however, do not contain the required information for complete reproduction of all the events indicated in the static image of a pathway map. Several parsers and editors of KEGG pathways exist for processing KGML files. We introduce KEGGParser, a MATLAB-based tool for KEGG pathway parsing, semiautomatic fixing, editing, visualization and analysis in the MATLAB environment. It also works with Scilab. The source code is available at http://www.mathworks.com/matlabcentral/fileexchange/37561.
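As a rough illustration of what parsing a KGML file involves, the sketch below reads entries and relations from a small hand-written KGML-style fragment using Python's standard library. It is not KEGGParser (which is a MATLAB tool); the sample XML, the identifiers in it, and the function name `parse_kgml` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written KGML-style fragment (illustrative only, not a real KEGG map).
SAMPLE_KGML = """<pathway name="path:hsa00000" org="hsa" title="Toy pathway">
  <entry id="1" name="hsa:1111" type="gene"/>
  <entry id="2" name="hsa:2222" type="gene"/>
  <entry id="3" name="cpd:C00001" type="compound"/>
  <relation entry1="1" entry2="2" type="PPrel">
    <subtype name="activation" value="--&gt;"/>
  </relation>
</pathway>"""


def parse_kgml(xml_text):
    """Return (entries, relations) extracted from a KGML document string."""
    root = ET.fromstring(xml_text)
    # Map entry id -> (name, type) so relations can be resolved to names.
    entries = {
        entry.get("id"): (entry.get("name"), entry.get("type"))
        for entry in root.findall("entry")
    }
    relations = []
    for rel in root.findall("relation"):
        # A relation may carry several subtypes (activation, inhibition, ...).
        subtypes = [sub.get("name") for sub in rel.findall("subtype")]
        source = entries[rel.get("entry1")][0]
        target = entries[rel.get("entry2")][0]
        relations.append((source, target, rel.get("type"), subtypes))
    return entries, relations


entries, relations = parse_kgml(SAMPLE_KGML)
for source, target, rel_type, subtypes in relations:
    print(source, "->", target, rel_type, subtypes)
```

The sketch only recovers what the XML states explicitly; anything that is drawn in the static map image but absent from the KGML file would still need the kind of semiautomatic fixing the abstract describes.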
Microarray and network-based identification of functional modules and pathways of active tuberculosis.

PubMed

Bian, Zhong-Rui; Yin, Juan; Sun, Wen; Lin, Dian-Jie

2017-04-01

Diagnosis of active tuberculosis (TB) is challenging, and treatment response is also difficult to monitor efficiently. The aim of this study was to apply an integrated microarray and network-based analysis to samples from publicly available datasets to obtain a diagnostic module set and pathways in active TB. Towards this goal, a background protein-protein interaction (PPI) network was generated based on global PPI information and gene expression data, followed by identification of a differential expression network (DEN) from the background PPI network. Then, ego genes were extracted according to the degree features in the DEN. Next, module collection was conducted by ego gene expansion based on the EgoNet algorithm. After that, differential expression of modules between active TB and controls was evaluated using a random permutation test. Finally, the biological significance of differential modules was assessed by pathway enrichment analysis based on the Reactome database, and Fisher's exact test was implemented to extract differential pathways for active TB. In total, 47 ego genes and 47 candidate modules were identified from the DEN. By setting the cutoff criteria of gene size >5 and classification accuracy ≥0.9, 7 ego modules (Module 4, Module 7, Module 9, Module 19, Module 25, Module 38 and Module 43) were extracted, and all of them differed significantly between active TB and controls. Then, Fisher's exact test was conducted to capture differential pathways for active TB. Interestingly, genes in Module 4, Module 25, Module 38, and Module 43 were enriched in the same pathway, formation of a pool of free 40S subunits. The significant pathway for Module 7 and Module 9 was eukaryotic translation termination, and for Module 19 it was nonsense mediated decay enhanced by the exon junction complex (EJC).
Accordingly, differential modules and pathways might be potential biomarkers for treating active TB, and provide valuable clues for better understanding of molecular mechanism of active TB. Copyright © 2017 Elsevier Ltd. All rights reserved.
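The pathway enrichment step in the abstract above relies on Fisher's exact test. A minimal sketch of the one-sided version (equivalent to the upper tail of the hypergeometric distribution) is given here, assuming a simple setup of one module, one pathway and a fixed background gene set. The function name and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def enrichment_p_value(overlap, module_size, pathway_size, background_size):
    """One-sided Fisher's exact test for over-representation.

    Returns the probability of seeing at least `overlap` pathway genes
    in a module of `module_size` genes drawn without replacement from a
    background of `background_size` genes, `pathway_size` of which
    belong to the pathway.
    """
    max_overlap = min(module_size, pathway_size)
    total = comb(background_size, module_size)
    tail = sum(
        comb(pathway_size, i)
        * comb(background_size - pathway_size, module_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, a 3-gene pathway, and a
# 3-gene module that contains all 3 pathway genes.
p = enrichment_p_value(overlap=3, module_size=3, pathway_size=3,
                       background_size=10)
print(p)
```

In a real analysis the resulting p-values would normally be corrected for the number of pathways tested; the abstract does not say which correction, if any, was applied.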
Environmental Impact on Vascular Development Predicted by High-Throughput Screening

PubMed Central

Judson, Richard S.; Reif, David M.; Sipes, Nisha S.; Singh, Amar V.; Chandler, Kelly J.; DeWoskin, Rob; Dix, David J.; Kavlock, Robert J.; Knudsen, Thomas B.

2011-01-01

Background: Understanding health risks to embryonic development from exposure to environmental chemicals is a significant challenge given the diverse chemical landscape and paucity of data for most of these compounds. High-throughput screening (HTS) in the U.S. Environmental Protection Agency (EPA) ToxCast™ project provides vast data on an expanding chemical library currently consisting of > 1,000 unique compounds across > 500 in vitro assays in phase I (complete) and Phase II (under way). This public data set can be used to evaluate concentration-dependent effects on many diverse biological targets and build predictive models of prototypical toxicity pathways that can aid decision making for assessments of human developmental health and disease. Objective: We mined the ToxCast phase I data set to identify signatures for potential chemical disruption of blood vessel formation and remodeling. Methods: ToxCast phase I screened 309 chemicals using 467 HTS assays across nine assay technology platforms. The assays measured direct interactions between chemicals and molecular targets (receptors, enzymes), as well as downstream effects on reporter gene activity or cellular consequences. We ranked the chemicals according to individual vascular bioactivity score and visualized the ranking using ToxPi (Toxicological Priority Index) profiles. Results: Targets in inflammatory chemokine signaling, the vascular endothelial growth factor pathway, and the plasminogen-activating system were strongly perturbed by some chemicals, and we found positive correlations with developmental effects from the U.S. EPA ToxRefDB (Toxicological Reference Database) in vivo database containing prenatal rat and rabbit guideline studies. 
We observed distinctly different correlative patterns for chemicals with effects in rabbits versus rats, despite derivation of in vitro signatures based on human cells and cell-free biochemical targets, implying conservation but potentially differential contributions of developmental pathways among species. Follow-up analysis with antiangiogenic thalidomide analogs and additional in vitro vascular targets showed in vitro activity consistent with the most active environmental chemicals tested here. Conclusions: We predicted that blood vessel development is a target for environmental chemicals acting as putative vascular disruptor compounds (pVDCs) and identified potential species differences in sensitive vascular developmental pathways. PMID:21788198
Inverse relationship between Alzheimer's disease and cancer, and other factors contributing to Alzheimer's disease: a systematic review.

PubMed

Shafi, Ovais

2016-11-22

The etiology of Alzheimer's disease (AD) is not yet well understood. Interactions among environmental factors, multiple susceptibility genes and aging contribute to AD. This study investigates the factors that play a role in causing AD and how changes in cellular pathways contribute to AD. The PubMed and MEDLINE databases and Google Scholar were searched with no date restrictions for published articles involving cellular pathways with roles in cancers, cell survival, growth, proliferation, development and aging that also contribute to Alzheimer's disease. This research explores the inverse relationship between AD and cancer and investigates other factors behind AD, using the published research literature to find the etiology of AD. Cancer and Alzheimer's disease have an inverse relationship in many aspects, such as P53, estrogen, neurotrophins and growth factors, growth and proliferation, cAMP, EGFR, Bcl-2, apoptosis pathways, IGF-1, HSV, TDP-43, APOE variants, notch signals and presenilins, NCAM, TNF alpha, the PI3K/AKT/MTOR pathway, telomerase, ROS and ACE levels. AD occurs when brain neurons have weakened growth, cell survival responses and maintenance mechanisms, and weakened anti-stress responses such as Vimentin, carbonic anhydrases, HSPs and SAPK. In cancer, these responses are upregulated and maintained. Evolutionarily conserved responses and maintenance mechanisms such as FOXO are impaired in AD. Countermeasures or compensatory mechanisms by AD-affected neurons, such as Tau, Beta Amyloid and S100, are last attempts at survival that may be protective for a certain time, or can speed up AD in the Alzheimer's microenvironment via C-ABL activation, GSK3 and neuro-inflammation. Alzheimer's disease and cancer have an inverse relationship; many factors that are upregulated in any cancer to sustain growth and survival are downregulated in Alzheimer's disease, contributing to neurodegeneration.
When aged or genetically susceptible neurons have weakened growth, cell survival and anti-stress responses, age-related gene expression changes, and altered regulation of cell death and maintenance mechanisms, they contribute to Alzheimer's disease. Countermeasures by AD neurons, such as Beta Amyloid plaques, NFTs and S100, are last attempts at survival; they provide neuroprotection for a certain time but may ultimately become pathological and speed up AD. This study may contribute to developing new potential diagnostic tests, interventions and treatments.
Publications - DDS 8 | Alaska Division of Geological & Geophysical Surveys

Science.gov Websites

DGGS DDS 8 Publication Details Title: Alaska Volcano Observatory geochemical database Authors: Cameron ., Snedigar, S.F., and Nye, C.J., 2014, Alaska Volcano Observatory geochemical database: Alaska Division of ://doi.org/10.14509/29120 Publication Products Interactive Interactive Database Alaska Volcano Observatory
Minimal metabolic pathway structure is consistent with associated biomolecular interactions

PubMed Central

Bordbar, Aarash; Nagarajan, Harish; Lewis, Nathan E; Latif, Haythem; Ebrahim, Ali; Federowicz, Stephen; Schellenberger, Jan; Palsson, Bernhard O

2014-01-01

Pathways are a universal paradigm for functionally describing cellular processes. Even though advances in high-throughput data generation have transformed biology, the core of our biological understanding, and hence data interpretation, is still predicated on human-defined pathways. Here, we introduce an unbiased, pathway structure for genome-scale metabolic networks defined based on principles of parsimony that do not mimic canonical human-defined textbook pathways. Instead, these minimal pathways better describe multiple independent pathway-associated biomolecular interaction datasets suggesting a functional organization for metabolism based on parsimonious use of cellular components. We use the inherent predictive capability of these pathways to experimentally discover novel transcriptional regulatory interactions in Escherichia coli metabolism for three transcription factors, effectively doubling the known regulatory roles for Nac and MntR. This study suggests an underlying and fundamental principle in the evolutionary selection of pathway structures; namely, that pathways may be minimal, independent, and segregated. PMID:24987116
HoPaCI-DB: host-Pseudomonas and Coxiella interaction database

PubMed Central

Bleves, Sophie; Dunger, Irmtraud; Walter, Mathias C.; Frangoulidis, Dimitrios; Kastenmüller, Gabi; Voulhoux, Romé; Ruepp, Andreas

2014-01-01

Bacterial infectious diseases are the result of multifactorial processes affected by the interplay between virulence factors and host targets. The host-Pseudomonas and Coxiella interaction database (HoPaCI-DB) is a publicly available manually curated integrative database (http://mips.helmholtz-muenchen.de/HoPaCI/) of host–pathogen interaction data from Pseudomonas aeruginosa and Coxiella burnetii. The resource provides structured information on 3585 experimentally validated interactions between molecules, bioprocesses and cellular structures extracted from the scientific literature. Systematic annotation and interactive graphical representation of disease networks make HoPaCI-DB a versatile knowledge base for biologists and network biology approaches. PMID:24137008
Differential protein-coding gene and long noncoding RNA expression in smoking-related lung squamous cell carcinoma.

PubMed

Li, Shicheng; Sun, Xiao; Miao, Shuncheng; Liu, Jia; Jiao, Wenjie

2017-11-01

Cigarette smoking is one of the greatest preventable risk factors for developing cancer, and most cases of lung squamous cell carcinoma (lung SCC) are associated with smoking. The pathogenesis mechanism of tumor progress is unclear. This study aimed to identify biomarkers in smoking-related lung cancer, including protein-coding gene, long noncoding RNA, and transcription factors. We selected and obtained messenger RNA microarray datasets and clinical data from the Gene Expression Omnibus database to identify gene expression altered by cigarette smoking. Integrated bioinformatic analysis was used to clarify biological functions of the identified genes, including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, the construction of a protein-protein interaction network, transcription factor, and statistical analyses. Subsequent quantitative real-time PCR was utilized to verify these bioinformatic analyses. Five hundred and ninety-eight differentially expressed genes and 21 long noncoding RNA were identified in smoking-related lung SCC. GO and KEGG pathway analysis showed that identified genes were enriched in the cancer-related functions and pathways. The protein-protein interaction network revealed seven hub genes identified in lung SCC. Several transcription factors and their binding sites were predicted. The results of real-time quantitative PCR revealed that AURKA and BIRC5 were significantly upregulated and LINC00094 was downregulated in the tumor tissues of smoking patients. Further statistical analysis indicated that dysregulation of AURKA, BIRC5, and LINC00094 indicated poor prognosis in lung SCC. Protein-coding genes AURKA, BIRC5, and LINC00094 could be biomarkers or therapeutic targets for smoking-related lung SCC. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.
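Hub genes in a protein-protein interaction network are commonly identified by node degree. The abstract above does not state which criterion was used, so the following is only a sketch of the degree-based approach, with placeholder gene labels (`G1` to `G5`) and a hypothetical `hub_genes` helper rather than data from the study.

```python
from collections import Counter


def hub_genes(edges, top_n):
    """Rank genes by degree in an undirected interaction network.

    `edges` is an iterable of (gene_a, gene_b) pairs. Duplicate edges
    and self-loops are ignored. Ties are broken alphabetically so the
    ranking is deterministic.
    """
    unique_edges = {frozenset(pair) for pair in edges if pair[0] != pair[1]}
    degree = Counter()
    for edge in unique_edges:
        for gene in edge:
            degree[gene] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Placeholder network: G1 interacts with three partners.
toy_edges = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G3"), ("G4", "G5"),
    ("G2", "G1"),  # duplicate of G1-G2 in reverse order, counted once
]
print(hub_genes(toy_edges, top_n=2))
```

Degree is the simplest centrality measure; betweenness or other scores could be substituted by changing how `degree` is filled in.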
The Innate Immune Database (IIDB)

PubMed Central

Korb, Martin; Rust, Aistair G; Thorsson, Vesteinn; Battail, Christophe; Li, Bin; Hwang, Daehee; Kennedy, Kathleen A; Roach, Jared C; Rosenberger, Carrie M; Gilchrist, Mark; Zak, Daniel; Johnson, Carrie; Marzolf, Bruz; Aderem, Alan; Shmulevich, Ilya; Bolouri, Hamid

2008-01-01

Background As part of a National Institute of Allergy and Infectious Diseases funded collaborative project, we have performed over 150 microarray experiments measuring the response of C57/BL6 mouse bone marrow macrophages to toll-like receptor stimuli. These microarray expression profiles are available freely from our project web site. Here, we report the development of a database of computationally predicted transcription factor binding sites and related genomic features for a set of over 2000 murine immune genes of interest. Our database, which includes microarray co-expression clusters and a host of web-based query, analysis and visualization facilities, is available freely via the internet. It provides a broad resource to the research community, and a stepping stone towards the delineation of the network of transcriptional regulatory interactions underlying the integrated response of macrophages to pathogens. Description We constructed a database indexed on genes and annotations of the immediate surrounding genomic regions. To facilitate both gene-specific and systems biology oriented research, our database provides the means to analyze individual genes or an entire genomic locus. Although our focus to date has been on mammalian toll-like receptor signaling pathways, our database structure is not limited to this subject, and is intended to be broadly applicable to immunology. By focusing on selected immune-active genes, we were able to perform computationally intensive expression and sequence analyses that would currently be prohibitive if applied to the entire genome. Using six complementary computational algorithms and methodologies, we identified transcription factor binding sites based on the Position Weight Matrices available in TRANSFAC. For one example transcription factor (ATF3) for which experimental data are available, over 50% of our predicted binding sites coincide with genome-wide chromatin immunoprecipitation (ChIP-chip) results.
Our database can be interrogated via a web interface. Genomic annotations and binding site predictions can be automatically viewed with a customized version of the Argo genome browser. Conclusion We present the Innate Immune Database (IIDB) as a community resource for immunologists interested in gene regulatory systems underlying innate responses to pathogens. The database website can be freely accessed at . PMID:18321385
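Binding-site prediction from position weight matrices, as mentioned in the abstract above, amounts to sliding a matrix along a sequence and scoring each window. The sketch below shows a basic log-odds scan under simple assumptions: a uniform background, uppercase A/C/G/T input, and a made-up three-column matrix. It is not a TRANSFAC matrix and not one of the six algorithms used for IIDB.

```python
from math import log2

# Made-up per-position base probabilities (each column sums to 1).
TOY_PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform base frequency


def scan_sequence(sequence, pwm, threshold):
    """Return (start, window, score) for windows scoring >= threshold.

    The score is the sum over positions of log2(p_base / background),
    so probabilities are converted to log-odds weights on the fly.
    Assumes `sequence` contains only uppercase A, C, G and T.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(
            log2(column[base] / BACKGROUND)
            for column, base in zip(pwm, window)
        )
        if score >= threshold:
            hits.append((start, window, score))
    return hits


hits = scan_sequence("TTAGCTT", TOY_PWM, threshold=3.0)
for start, window, score in hits:
    print(start, window, round(score, 2))
```

A fuller scan would also score the reverse complement strand and choose the threshold from a background score distribution.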
A proposed model for the flowering signaling pathway of sugarcane under photoperiodic control.

PubMed

Coelho, C P; Costa Netto, A P; Colasanti, J; Chalfun-Júnior, A

2013-04-25

Molecular analysis of floral induction in Arabidopsis has identified several flowering time genes related to 4 response networks defined by the autonomous, gibberellin, photoperiod, and vernalization pathways. Although grass flowering processes include ancestral functions shared by both mono- and dicots, they have developed their own mechanisms to transmit floral induction signals. Despite its high production capacity and its important role in biofuel production, almost no information is available about the flowering process in sugarcane. We searched the Sugarcane Expressed Sequence Tags database to look for elements of the flowering signaling pathway under photoperiodic control. Sequences showing significant similarity to flowering time genes of other species were clustered, annotated, and analyzed for conserved domains. Multiple alignments comparing the sequences found in the sugarcane database and those from other species were performed and their phylogenetic relationship assessed using the MEGA 4.0 software. Electronic Northerns were run with Cluster and TreeView programs, allowing us to identify putative members of the photoperiod-controlled flowering pathway of sugarcane.
Merging in-silico and in vitro salivary protein complex partners using the STRING database: A tutorial.

PubMed

Crosara, Karla Tonelli Bicalho; Moffa, Eduardo Buozi; Xiao, Yizhi; Siqueira, Walter Luiz

2018-01-16

Protein-protein interaction is a common physiological mechanism for protection and actions of proteins in an organism. The identification and characterization of protein-protein interactions in different organisms is necessary to better understand their physiology and to determine their efficacy. In a previous in vitro study using mass spectrometry, we identified 43 proteins that interact with histatin 1. Six previously documented interactors were confirmed and 37 novel partners were identified. In this tutorial, we aimed to demonstrate the usefulness of the STRING database for studying protein-protein interactions. We used an in-silico approach along with the STRING database (http://string-db.org/) and successfully performed a fast simulation of a novel constructed histatin 1 protein-protein network, including both the previously known and the predicted interactors, along with our newly identified interactors. Our study highlights the advantages and importance of applying bioinformatics tools to merge in-silico tactics with experimental in vitro findings for rapid advancement of our knowledge about protein-protein interactions. Our findings also indicate that bioinformatics tools such as the STRING protein network database can help predict potential interactions between proteins and thus serve as a guide for future steps in our exploration of the Human Interactome. Our study highlights the usefulness of the STRING protein database for studying protein-protein interactions. The STRING database can collect and integrate data about known and predicted protein-protein associations from many organisms, including both direct (physical) and indirect (functional) interactions, in an easy-to-use interface. Copyright © 2017 Elsevier B.V. All rights reserved.
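STRING can also be queried programmatically. The sketch below builds a request URL for the STRING REST `network` method and parses a tab-separated response, assuming the URL layout and column names (`preferredName_A`, `preferredName_B`, `score`) given in the STRING API documentation; these should be checked against the current documentation before use. No request is sent here. The sample response, its scores and the partner names `PARTNER_A` and `PARTNER_B` are placeholders, not results from the tutorial.

```python
import csv
import io
from urllib.parse import urlencode

STRING_API = "https://string-db.org/api"


def build_network_url(identifiers, species, output_format="tsv"):
    """Build a STRING 'network' request URL (assumed API layout)."""
    params = urlencode({
        # STRING expects multiple identifiers separated by carriage returns.
        "identifiers": "\r".join(identifiers),
        "species": species,  # NCBI taxonomy id, e.g. 9606 for human
    })
    return f"{STRING_API}/{output_format}/network?{params}"


def parse_network_tsv(text, min_score=0.0):
    """Parse a TSV response into (protein_a, protein_b, score) tuples."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        (row["preferredName_A"], row["preferredName_B"], float(row["score"]))
        for row in reader
        if float(row["score"]) >= min_score
    ]


# Placeholder response text standing in for a real API reply.
SAMPLE_TSV = (
    "preferredName_A\tpreferredName_B\tscore\n"
    "HTN1\tPARTNER_A\t0.912\n"
    "HTN1\tPARTNER_B\t0.405\n"
)

url = build_network_url(["HTN1", "HTN3"], species=9606)
interactions = parse_network_tsv(SAMPLE_TSV, min_score=0.7)
print(url)
print(interactions)
```

To run a live query, the URL could be fetched with `urllib.request.urlopen(url)` and the decoded body passed to `parse_network_tsv`.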

Developmental continuity and change in physical, verbal, and relational aggression and peer victimization from childhood to adolescence.

PubMed

Ettekal, Idean; Ladd, Gary W

2017-09-01

To investigate the developmental course of aggression and peer victimization in childhood and adolescence, distinct subgroups of children were identified based on similarities and differences in their physical, verbal and relational aggression, and victimization. Developmental continuity and change were assessed by examining transitions within and between subgroups from Grades 1 to 11. This longitudinal study consisted of 482 children (50% females) and was based on peer report data on multiple forms of aggression and peer victimization. Using person-centered methods including latent profile and latent transition analyses, most of the identified subgroups were distinguishable by their frequencies (i.e., levels) of aggression and victimization, rather than forms (physical, verbal, and relational), with the exception of 1 group that appeared to be more form-specific. Across subgroups, multiple developmental patterns emerged characterized as early and late-onset, social interactional continuity, desistance, and heterotypic pathways. Collectively, these pathways support the perspective that the development of aggression and peer victimization in childhood and adolescence is characterized by heterogeneity. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
CycADS: an annotation database system to ease the development and update of BioCyc databases

PubMed Central

Vellozo, Augusto F.; Véron, Amélie S.; Baa-Puyoulet, Patrice; Huerta-Cepas, Jaime; Cottret, Ludovic; Febvay, Gérard; Calevro, Federica; Rahbé, Yvan; Douglas, Angela E.; Gabaldón, Toni; Sagot, Marie-France; Charles, Hubert; Colella, Stefano

2011-01-01

In recent years, genomes from an increasing number of organisms have been sequenced, but their annotation remains a time-consuming process. The BioCyc databases offer a framework for the integrated analysis of metabolic networks. The Pathway tool software suite allows the automated construction of a database starting from an annotated genome, but it requires prior integration of all annotations into a specific summary file or into a GenBank file. To allow the easy creation and update of a BioCyc database starting from the multiple genome annotation resources available over time, we have developed an ad hoc data management system that we called Cyc Annotation Database System (CycADS). CycADS is centred on a specific database model and on a set of Java programs to import, filter and export relevant information. Data from GenBank and other annotation sources (including for example: KAAS, PRIAM, Blast2GO and PhylomeDB) are collected into a database to be subsequently filtered and extracted to generate a complete annotation file. This file is then used to build an enriched BioCyc database using the PathoLogic program of Pathway Tools. The CycADS pipeline for annotation management was used to build the AcypiCyc database for the pea aphid (Acyrthosiphon pisum) whose genome was recently sequenced. The AcypiCyc database webpage includes also, for comparative analyses, two other metabolic reconstruction BioCyc databases generated using CycADS: TricaCyc for Tribolium castaneum and DromeCyc for Drosophila melanogaster. Linked to its flexible design, CycADS offers a powerful software tool for the generation and regular updating of enriched BioCyc databases. The CycADS system is particularly suited for metabolic gene annotation and network reconstruction in newly sequenced genomes. Because of the uniform annotation used for metabolic network reconstruction, CycADS is particularly useful for comparative analysis of the metabolism of different organisms. 
Database URL: http://www.cycadsys.org PMID:21474551
dbPTM 2016: 10-year anniversary of a resource for post-translational modification of proteins.

PubMed

Huang, Kai-Yao; Su, Min-Gang; Kao, Hui-Ju; Hsieh, Yun-Chung; Jhong, Jhih-Hua; Cheng, Kuang-Hao; Huang, Hsien-Da; Lee, Tzong-Yi

2016-01-04

Owing to the importance of the post-translational modifications (PTMs) of proteins in regulating biological processes, the dbPTM (http://dbPTM.mbc.nctu.edu.tw/) was developed as a comprehensive database of experimentally verified PTMs from several databases with annotations of potential PTMs for all UniProtKB protein entries. For this 10th anniversary of dbPTM, the updated resource provides not only a comprehensive dataset of experimentally verified PTMs, supported by the literature, but also an integrative interface for accessing all available databases and tools that are associated with PTM analysis. As well as collecting experimental PTM data from 14 public databases, this update manually curates over 12 000 modified peptides, including the emerging S-nitrosylation, S-glutathionylation and succinylation, from approximately 500 research articles, which were retrieved by text mining. As the number of available PTM prediction methods increases, this work compiles a non-homologous benchmark dataset to evaluate the predictive power of online PTM prediction tools. An increasing interest in the structural investigation of PTM substrate sites motivated the mapping of all experimental PTM peptides to protein entries of Protein Data Bank (PDB) based on database identifier and sequence identity, which enables users to examine spatially neighboring amino acids, solvent-accessible surface area and side-chain orientations for PTM substrate sites on tertiary structures. Since drug binding in PDB is annotated, this update identified over 1100 PTM sites that are associated with drug binding. The update also integrates metabolic pathways and protein-protein interactions to support the PTM network analysis for a group of proteins. Finally, the web interface is redesigned and enhanced to facilitate access to this resource. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genomic Target Database (GTD): A database of potential targets in human pathogenic bacteria

PubMed Central

Barh, Debmalya; Kumar, Anil; Misra, Amarendra Narayana

2009-01-01

A Genomic Target Database (GTD) has been developed that contains putative genomic drug targets for human bacterial pathogens. The selected pathogens are either drug resistant or are ones against which vaccines are yet to be developed. The drug targets have been identified using subtractive genomics approaches and are subsequently classified into drug targets in pathogen-specific unique metabolic pathways, drug targets in host-pathogen common metabolic pathways, and membrane-localized drug targets. HTML code is used to link each target to its various properties and other available public resources. Essential resources and tools for subtractive genomic analysis, sub-cellular localization, and vaccine and drug design are also mentioned. To the best of the authors' knowledge, no such database (DB) is presently available that lists metabolic pathway and membrane-specific genomic drug targets based on subtractive genomics. The targets listed in GTD are a readily available resource for developing drugs and vaccines against the respective pathogen, its subtypes, and other family members. Currently GTD contains 58 drug targets for four pathogens. Shortly, drug targets for six more pathogens will be listed. Availability: GTD is available at the IIOAB website http://www.iioab.webs.com/GTD.htm. It can also be accessed at http://www.iioabdgd.webs.com. GTD is free for academic research and non-commercial use only. Commercial use is strictly prohibited without prior permission from IIOAB. PMID:20011153
Recent advances in prostate development and links to prostatic diseases

PubMed Central

Powers, Ginny L.

2013-01-01

The prostate is a branched ductal-acinar gland that is part of the male reproductive tract. Prostate development depends upon the integration of steroid hormone signals, paracrine interactions between the stromal and epithelial tissue layers, and the actions of cell autonomous factors. Several genes and signalling pathways are known to be required for one or more steps of prostate development including epithelial budding, duct elongation, branching morphogenesis, and/or cellular differentiation. Recent progress in the field of prostate development has included the application of genome-wide technologies including serial analysis of gene expression (SAGE), expression profiling microarrays, and other large scale approaches to identify new genes and pathways that are essential for prostate development. The aggregation of experimental results into online databases by organized multi-lab projects including the Genitourinary Developmental Molecular Atlas Project (GUDMAP) has also accelerated the understanding of molecular pathways that function during prostate development and identified links between prostate anatomy and molecular signaling. Rapid progress has also recently been made in understanding the nature and role of candidate stem cells in the developing and adult prostate. This has included the identification of putative prostate stem cell markers, lineage tracing, and organ reconstitution studies. However, several issues regarding their origin, precise nature, and possible role(s) in disease remain unresolved. Nevertheless, several links between prostatic developmental mechanisms and the pathogenesis of prostatic diseases including benign prostatic hyperplasia and prostate cancer have led to recent progress on targeting developmental pathways as therapeutic strategies for these diseases. PMID:23335485
DGIdb 3.0: a redesign and expansion of the drug-gene interaction database.

PubMed

Cotto, Kelsy C; Wagner, Alex H; Feng, Yang-Yang; Kiwala, Susanna; Coffman, Adam C; Spies, Gregory; Wollam, Alex; Spies, Nicholas C; Griffith, Obi L; Griffith, Malachi

2018-01-04

The drug-gene interaction database (DGIdb, www.dgidb.org) consolidates, organizes and presents drug-gene interactions and gene druggability information from papers, databases and web resources. DGIdb normalizes content from 30 disparate sources and allows for user-friendly advanced browsing, searching and filtering for ease of access through an intuitive web user interface, application programming interface (API) and public cloud-based server image. DGIdb v3.0 represents a major update of the database. Nine of the previously included 24 sources were updated. Six new resources were added, bringing the total number of sources to 30. These updates and additions of sources have cumulatively resulted in 56 309 interaction claims. This has also substantially expanded the comprehensive catalogue of druggable genes and anti-neoplastic drug-gene interactions included in the DGIdb. Along with these content updates, v3.0 has received a major overhaul of its codebase, including an updated user interface, preset interaction search filters, consolidation of interaction information into interaction groups, greatly improved search response times and upgrading the underlying web application framework. In addition, the expanded API features new endpoints which allow users to extract more detailed information about queried drugs, genes and drug-gene interactions, including listings of PubMed IDs, interaction type and other interaction metadata.
Gramene 2016: comparative plant genomics and pathway resources

USDA-ARS's Scientific Manuscript database

Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the data...
Design and Performance of a Xenobiotic Metabolism Database Manager for Building Metabolic Pathway Databases

EPA Science Inventory

A major challenge for scientists and regulators is accounting for the metabolic activation of chemicals that may lead to increased toxicity. Reliable forecasting of chemical metabolism is a critical factor in estimating a chemical’s toxic potential. Research is underway to develo...
ChlamyCyc: an integrative systems biology database and web-portal for Chlamydomonas reinhardtii.

PubMed

May, Patrick; Christian, Jan-Ole; Kempa, Stefan; Walther, Dirk

2009-05-04

The unicellular green alga Chlamydomonas reinhardtii is an important eukaryotic model organism for the study of photosynthesis and plant growth. In the era of modern high-throughput technologies there is an imperative need to integrate large-scale data sets from high-throughput experimental techniques using computational methods and database resources to provide comprehensive information about the molecular and cellular organization of a single organism. In the framework of the German Systems Biology initiative GoFORSYS, a pathway database and web-portal for Chlamydomonas (ChlamyCyc) was established, which currently features about 250 metabolic pathways with associated genes, enzymes, and compound information. ChlamyCyc was assembled using an integrative approach combining the recently published genome sequence, bioinformatics methods, and experimental data from metabolomics and proteomics experiments. We analyzed and integrated a combination of primary and secondary database resources, such as existing genome annotations from JGI, EST collections, orthology information, and MapMan classification. ChlamyCyc provides a curated and integrated systems biology repository that will enable and assist in systematic studies of fundamental cellular processes in Chlamydomonas. The ChlamyCyc database and web-portal is freely available under http://chlamycyc.mpimp-golm.mpg.de.
Pathway redundancy and protein essentiality revealed in the Saccharomyces cerevisiae interaction networks

PubMed Central

Ulitsky, Igor; Shamir, Ron

2007-01-01

The biological interpretation of genetic interactions is a major challenge. Recently, Kelley and Ideker proposed a method to analyze together genetic and physical networks, which explains many of the known genetic interactions as linking different pathways in the physical network. Here, we extend this method and devise novel analytic tools for interpreting genetic interactions in a physical context. Applying these tools on a large-scale Saccharomyces cerevisiae data set, our analysis reveals 140 between-pathway models that explain 3765 genetic interactions, roughly doubling those that were previously explained. Model genes tend to have short mRNA half-lives and many phosphorylation sites, suggesting that their stringent regulation is linked to pathway redundancy. We also identify ‘pivot’ proteins that have many physical interactions with both pathways in our models, and show that pivots tend to be essential and highly conserved. Our analysis of models and pivots sheds light on the organization of the cellular machinery as well as on the roles of individual proteins. PMID:17437029
Insight into bacterial virulence mechanisms against host immune response via the Yersinia pestis-human protein-protein interaction network.

PubMed

Yang, Huiying; Ke, Yuehua; Wang, Jian; Tan, Yafang; Myeni, Sebenzile K; Li, Dong; Shi, Qinghai; Yan, Yanfeng; Chen, Hui; Guo, Zhaobiao; Yuan, Yanzhi; Yang, Xiaoming; Yang, Ruifu; Du, Zongmin

2011-11-01

A Yersinia pestis-human protein interaction network is reported here to improve our understanding of its pathogenesis. Up to 204 interactions between 66 Y. pestis bait proteins and 109 human proteins were identified by yeast two-hybrid assay and then combined with 23 previously published interactions to construct a protein-protein interaction network. Topological analysis of the interaction network revealed that human proteins targeted by Y. pestis were significantly enriched in the proteins that are central in the human protein-protein interaction network. Analysis of this network showed that signaling pathways important for host immune responses were preferentially targeted by Y. pestis, including the pathways involved in focal adhesion, regulation of cytoskeleton, leukocyte transendothelial migration, and Toll-like receptor (TLR) and mitogen-activated protein kinase (MAPK) signaling. Cellular pathways targeted by Y. pestis are highly relevant to its pathogenesis. Interactions with host proteins involved in focal adhesion and cytoskeleton regulation pathways could account for resistance of Y. pestis to phagocytosis. Interference with TLR and MAPK signaling pathways by Y. pestis reflects common characteristics of pathogen-host interaction that bacterial pathogens have evolved to evade host innate immune response by interacting with proteins in those signaling pathways. Interestingly, a large portion of human proteins interacting with Y. pestis (16/109) also interacted with viral proteins (Epstein-Barr virus [EBV] and hepatitis C virus [HCV]), suggesting that viral and bacterial pathogens attack common cellular functions to facilitate infections. In addition, we identified vasodilator-stimulated phosphoprotein (VASP) as a novel interaction partner of YpkA and showed that YpkA could inhibit in vitro actin assembly mediated by VASP.
Metaproteomics of cellulose methanisation under thermophilic conditions reveals a surprisingly high proteolytic activity

PubMed Central

Lü, Fan; Bize, Ariane; Guillot, Alain; Monnet, Véronique; Madigou, Céline; Chapleur, Olivier; Mazéas, Laurent; He, Pinjing; Bouchez, Théodore

2014-01-01

Cellulose is the most abundant biopolymer on Earth. Optimising energy recovery from this renewable but recalcitrant material is a key issue. The metaproteome expressed by thermophilic communities during cellulose anaerobic digestion was investigated in microcosms. By multiplying the analytical replicates (65 protein fractions analysed by MS/MS) and relying solely on public protein databases, more than 500 non-redundant protein functions were identified. The taxonomic community structure as inferred from the metaproteomic data set was in good overall agreement with 16S rRNA gene tag pyrosequencing and fluorescent in situ hybridisation analyses. Numerous functions related to cellulose and hemicellulose hydrolysis and fermentation catalysed by bacteria related to Caldicellulosiruptor spp. and Clostridium thermocellum were retrieved, indicating their key role in the cellulose-degradation process and also suggesting their complementary action. Despite the abundance of acetate as a major fermentation product, key methanogenesis enzymes from the acetoclastic pathway were not detected. In contrast, enzymes from the hydrogenotrophic pathway affiliated to Methanothermobacter were almost exclusively identified for methanogenesis, suggesting a syntrophic acetate oxidation process coupled to hydrogenotrophic methanogenesis. Isotopic analyses confirmed the high dominance of the hydrogenotrophic methanogenesis. Very surprising was the identification of an abundant proteolytic activity from Coprothermobacter proteolyticus strains, probably acting as scavenger and/or predator performing proteolysis and fermentation. Metaproteomics thus appeared as an efficient tool to unravel and characterise metabolic networks as well as ecological interactions during methanisation bioprocesses. More generally, metaproteomics provides direct functional insights at a limited cost, and its attractiveness should increase in the future as sequence databases are growing exponentially. PMID:23949661
CADDIS Volume 5. Causal Databases: Interactive Conceptual Diagrams (ICDs)

EPA Pesticide Factsheets

The Interactive Conceptual Diagram (ICD) section of CADDIS allows users to create conceptual model diagrams, search a literature-based evidence database, and then attach that evidence to their diagrams.
InterAction Database (IADB)

Cancer.gov

The InterAction Database includes demographic and prescription information for more than 500,000 patients in the northern and middle Netherlands and has been integrated with other systems to enhance data collection and analysis.
dbAMEPNI: a database of alanine mutagenic effects for protein–nucleic acid interactions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Ling; Xiong, Yi; Gao, Hongyun

Protein–nucleic acid interactions play essential roles in various biological activities such as gene regulation, transcription, DNA repair and DNA packaging. Understanding the effects of amino acid substitutions on protein–nucleic acid binding affinities can help elucidate the molecular mechanism of protein–nucleic acid recognition. Until now, no comprehensive and updated database of quantitative binding data on alanine mutagenic effects for protein–nucleic acid interactions is publicly accessible. Thus, we developed a new database of Alanine Mutagenic Effects for Protein-Nucleic Acid Interactions (dbAMEPNI). dbAMEPNI is a manually curated, literature-derived database, comprising over 577 alanine mutagenic data with experimentally determined binding affinities for protein–nucleic acid complexes. Here, it contains several important parameters, such as dissociation constant (Kd), Gibbs free energy change (ΔΔG), experimental conditions and structural parameters of mutant residues. In addition, the database provides an extended dataset of 282 single alanine mutations with only qualitative data (or descriptive effects) of thermodynamic information.
dbAMEPNI: a database of alanine mutagenic effects for protein–nucleic acid interactions

DOE PAGES

Liu, Ling; Xiong, Yi; Gao, Hongyun; ...

2018-04-02

Protein–nucleic acid interactions play essential roles in various biological activities such as gene regulation, transcription, DNA repair and DNA packaging. Understanding the effects of amino acid substitutions on protein–nucleic acid binding affinities can help elucidate the molecular mechanism of protein–nucleic acid recognition. Until now, no comprehensive and updated database of quantitative binding data on alanine mutagenic effects for protein–nucleic acid interactions is publicly accessible. Thus, we developed a new database of Alanine Mutagenic Effects for Protein-Nucleic Acid Interactions (dbAMEPNI). dbAMEPNI is a manually curated, literature-derived database, comprising over 577 alanine mutagenic data with experimentally determined binding affinities for protein–nucleic acid complexes. Here, it contains several important parameters, such as dissociation constant (Kd), Gibbs free energy change (ΔΔG), experimental conditions and structural parameters of mutant residues. In addition, the database provides an extended dataset of 282 single alanine mutations with only qualitative data (or descriptive effects) of thermodynamic information.
Mapping biological process relationships and disease perturbations within a pathway network.

PubMed

Stoney, Ruth; Robertson, David L; Nenadic, Goran; Schwartz, Jean-Marc

2018-01-01

Molecular interaction networks are routinely used to map the organization of cellular function. Edges represent interactions between genes, proteins, or metabolites. However, in living cells, molecular interactions are dynamic, necessitating context-dependent models. Contextual information can be integrated into molecular interaction networks through the inclusion of additional molecular data, but there are concerns about completeness and relevance of this data. We developed an approach for representing the organization of human cellular processes using pathways as the nodes in a network. Pathways represent spatial and temporal sets of context-dependent interactions, generating a high-level network when linked together, which incorporates contextual information without the need for molecular interaction data. Analysis of the pathway network revealed linked communities representing functional relationships, comparable to those found in molecular networks, including metabolism, signaling, immunity, and the cell cycle. We mapped a range of diseases onto this network and find that pathways associated with diseases tend to be functionally connected, highlighting the perturbed functions that result in disease phenotypes. We demonstrated that disease pathways cluster within the network. We then examined the distribution of cancer pathways and showed that cancer pathways tend to localize within the signaling, DNA processes and immune modules, although some cancer-associated nodes are found in other network regions. Altogether, we generated a high-confidence functional network, which avoids some of the shortcomings faced by conventional molecular models. Our representation provides an intuitive functional interpretation of cellular organization, which relies only on high-quality pathway and Gene Ontology data. The network is available at https://data.mendeley.com/datasets/3pbwkxjxg9/1.
WholePathwayScope: a comprehensive pathway-based analysis tool for high-throughput data

PubMed Central

Yi, Ming; Horton, Jay D; Cohen, Jonathan C; Hobbs, Helen H; Stephens, Robert M

2006-01-01

Background: Analysis of High Throughput (HTP) Data such as microarray and proteomics data has provided a powerful methodology to study patterns of gene regulation at genome scale. A major unresolved problem in the post-genomic era is to assemble the large amounts of data generated into a meaningful biological context. We have developed a comprehensive software tool, WholePathwayScope (WPS), for deriving biological insights from analysis of HTP data. Results: WPS extracts gene lists with shared biological themes through color cue templates. WPS statistically evaluates global functional category enrichment of gene lists and pathway-level pattern enrichment of data. WPS incorporates well-known biological pathways from KEGG (Kyoto Encyclopedia of Genes and Genomes) and Biocarta, GO (Gene Ontology) terms as well as user-defined pathways or relevant gene clusters or groups, and explores gene-term relationships within the derived gene-term association networks (GTANs). WPS simultaneously compares multiple datasets within biological contexts either as pathways or as association networks. WPS also integrates Genetic Association Database and Partial MedGene Database for disease-association information. We have used this program to analyze and compare microarray and proteomics datasets derived from a variety of biological systems. Application examples demonstrated the capacity of WPS to significantly facilitate the analysis of HTP data for integrative discovery. Conclusion: This tool represents a pathway-based platform for discovery integration to maximize analysis power. The tool is freely available at . PMID:16423281
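The abstract above says that WPS statistically evaluates functional category enrichment of gene lists but does not name the test. As a rough illustration only, the sketch below uses a one-sided hypergeometric test, a common choice for this kind of over-representation analysis; the function name, parameter names and toy numbers are assumptions made for this example and are not taken from WPS.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes that belong to the category
    sample:     size of the gene list being tested
    hits:       genes in the list that belong to the category
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of seeing `hits` or more category genes by chance.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 20 background genes, 5 in the category,
# a list of 5 genes of which 3 fall in the category.
p = enrichment_p_value(population=20, annotated=5, sample=5, hits=3)
print(round(p, 4))
```

A small p-value suggests the category is over-represented in the list; with many categories tested at once, a multiple-testing correction would normally follow.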
The Co-regulation Data Harvester: Automating gene annotation starting from a transcriptome database

NASA Astrophysics Data System (ADS)

Tsypin, Lev M.; Turkewitz, Aaron P.

Identifying co-regulated genes provides a useful approach for defining pathway-specific machinery in an organism. To be efficient, this approach relies on thorough genome annotation, a process much slower than genome sequencing per se. Tetrahymena thermophila, a unicellular eukaryote, has been a useful model organism and has a fully sequenced but sparsely annotated genome. One important resource for studying this organism has been an online transcriptomic database. We have developed an automated approach to gene annotation in the context of transcriptome data in T. thermophila, called the Co-regulation Data Harvester (CDH). Beginning with a gene of interest, the CDH identifies co-regulated genes by accessing the Tetrahymena transcriptome database. It then identifies their closely related genes (orthologs) in other organisms by using reciprocal BLAST searches. Finally, it collates the annotations of those orthologs' functions, which provides the user with information to help predict the cellular role of the initial query. The CDH, which is freely available, represents a powerful new tool for analyzing cell biological pathways in Tetrahymena. Moreover, to the extent that genes and pathways are conserved between organisms, the inferences obtained via the CDH should be relevant, and can be explored, in many other systems.
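The reciprocal-search step described above can be sketched as follows. This is a minimal illustration, assuming the best hit of each search has already been obtained (for example from BLAST output); it does not reproduce the CDH code, and all gene identifiers are made up for the example.

```python
def reciprocal_best_hits(forward, reverse):
    """Return (query, subject) pairs that are each other's best hit.

    forward: maps each query gene to its best hit in the other organism
    reverse: maps each gene of the other organism to its best hit back
    """
    pairs = []
    for query, subject in forward.items():
        # Keep the pair only if searching back from the subject
        # recovers the original query gene.
        if reverse.get(subject) == query:
            pairs.append((query, subject))
    return sorted(pairs)


# Hypothetical identifiers used only for illustration.
forward_hits = {"TtGeneA": "HsGene1", "TtGeneB": "HsGene2", "TtGeneC": "HsGene2"}
reverse_hits = {"HsGene1": "TtGeneA", "HsGene2": "TtGeneB"}

orthologs = reciprocal_best_hits(forward_hits, reverse_hits)
print(orthologs)
```

Here TtGeneC is dropped because its best hit, HsGene2, points back to TtGeneB rather than to TtGeneC.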

DESHARKY: automatic design of metabolic pathways for optimal cell growth.

PubMed

Rodrigo, Guillermo; Carrera, Javier; Prather, Kristala Jones; Jaramillo, Alfonso

2008-11-01

The biological solution for synthesis or remediation of organic compounds using living organisms, particularly bacteria and yeast, has been promoted because of the cost reduction with respect to the non-living chemical approach. In that way, computational frameworks can profit from the previous knowledge stored in large databases of compounds, enzymes and reactions. In addition, the cell behavior can be studied by modeling the cellular context. We have implemented a Monte Carlo algorithm (DESHARKY) that finds a metabolic pathway from a target compound by exploring a database of enzymatic reactions. DESHARKY outputs a biochemical route to the host metabolism together with its impact in the cellular context by using mathematical models of the cell resources and metabolism. Furthermore, we provide the sequence of amino acids for the enzymes involved in the route closest phylogenetically to the considered organism. We provide examples of designed metabolic pathways with their genetic load characterizations. Here, we have used Escherichia coli as host organism. In addition, our bioinformatic tool can be applied for biodegradation or biosynthesis and its performance scales with the database size. Software, a tutorial and examples are freely available and open source at http://soft.synth-bio.org/desharky.html
Systems biology of cancer biomarker detection.

PubMed

Mitra, Sanga; Das, Smarajit; Chakrabarti, Jayprokas

2013-01-01

Cancer systems biology is an ever-growing area of research owing to an explosion of data; the problem is how to mine these data and extract useful information. To gain insight into carcinogenesis, one needs to systematically mine several resources, such as databases, microarrays and next-generation sequences. This review encompasses management and analysis of cancer data, database construction and data deposition, whole transcriptome and genome comparison, analysis of results from high-throughput experiments to uncover cellular pathways and molecular interactions, and the design of effective algorithms to identify potential biomarkers. Recent technical advances such as ChIP-on-chip, ChIP-seq and RNA-seq can be applied to get epigenetic information transformed into a high-throughput endeavour to which systems biology and bioinformatics are making significant inroads. The data from the ENCODE and GENCODE projects available through the UCSC genome browser can be considered a benchmark for comparison and meta-analysis. A pipeline for integrating next-generation sequencing data and microarray data, and combining them with existing databases, is discussed. The understanding of cancer genomics is changing the way we approach cancer diagnosis and treatment. To give a better understanding of how to utilize available resources, we have chosen oral cancer to show how and what kind of analysis can be done. This review is a computational genomics primer that provides a bird's eye view of the computational and bioinformatics tools currently available to perform integrated genomic and systems biology analyses of several carcinomas.
Genomics Portals: integrative web-platform for mining genomics data.

PubMed

Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario

2010-01-13

A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.
Genomics Portals: integrative web-platform for mining genomics data

PubMed Central

2010-01-01

Background A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org. PMID:20070909
FMM: a web server for metabolic pathway reconstruction and comparative analysis.

PubMed

Chou, Chih-Hung; Chang, Wen-Chi; Chiu, Chih-Min; Huang, Chih-Chang; Huang, Hsien-Da

2009-07-01

Synthetic Biology, a multidisciplinary field, is growing rapidly. Improving the understanding of biological systems through mimicry and producing bio-orthogonal systems with new functions are two complementary pursuits in this field. A web server called FMM (From Metabolite to Metabolite) was developed for this purpose. FMM can reconstruct metabolic pathways from one metabolite to another metabolite among different species, based mainly on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and other integrated biological databases. Novel presentation for connecting different KEGG maps is newly provided. Both local and global graphical views of the metabolic pathways are designed. FMM has many applications in Synthetic Biology and Metabolic Engineering. For example, the reconstruction of metabolic pathways to produce valuable metabolites or secondary metabolites in bacteria or yeast is a promising strategy for drug production. FMM provides a highly effective way to elucidate the genes from which species should be cloned into those microorganisms based on FMM pathway comparative analysis. Consequently, FMM is an effective tool for applications in synthetic biology to produce both drugs and biofuels. This novel and innovative resource is now freely available at http://FMM.mbc.nctu.edu.tw/.
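The idea of reconstructing a route from one metabolite to another can be illustrated with a breadth-first search over a directed reaction graph. The abstract does not describe FMM's actual algorithm, so this is only a sketch under that assumption; the function name and the small hand-written reaction list (a few glycolysis and pentose phosphate steps) are chosen for the example and are not drawn from FMM or KEGG files.

```python
from collections import deque


def find_route(reactions, source, target):
    """Return a shortest list of metabolites from source to target, or None."""
    # Build an adjacency list: substrate -> products reachable in one step.
    graph = {}
    for substrate, product in reactions:
        graph.setdefault(substrate, []).append(product)

    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Toy reaction list for illustration.
toy_reactions = [
    ("glucose", "glucose-6-phosphate"),
    ("glucose-6-phosphate", "fructose-6-phosphate"),
    ("fructose-6-phosphate", "fructose-1,6-bisphosphate"),
    ("glucose-6-phosphate", "6-phosphogluconolactone"),
]

route = find_route(toy_reactions, "glucose", "fructose-1,6-bisphosphate")
print(" -> ".join(route))
```

Breadth-first search returns a route with the fewest reaction steps; a real reconstruction would also need to track enzymes, species and cofactors, which this sketch ignores.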
Cyclone: java-based querying and computing with Pathway/Genome databases.

PubMed

Le Fèvre, François; Smidtas, Serge; Schächter, Vincent

2007-05-15

Cyclone aims at facilitating the use of BioCyc, a collection of Pathway/Genome Databases (PGDBs). Cyclone provides a fully extensible Java Object API to analyze and visualize these data. Cyclone can read and write PGDBs, and can write its own data in the CycloneML format. This format is automatically generated from the BioCyc ontology by Cyclone itself, ensuring continued compatibility. Cyclone objects can also be stored in a relational database CycloneDB. Queries can be written in SQL, and in an intuitive and concise object-oriented query language, Hibernate Query Language (HQL). In addition, Cyclone interfaces easily with Java software including the Eclipse IDE for HQL editing, the Jung API for graph algorithms or Cytoscape for graph visualization. Cyclone is freely available under an open source license at: http://sourceforge.net/projects/nemo-cyclone. For download and installation instructions, tutorials, use cases and examples, see http://nemo-cyclone.sourceforge.net.
HypoxiaDB: a database of hypoxia-regulated proteins

PubMed Central

Khurana, Pankaj; Sugadev, Ragumani; Jain, Jaspreet; Singh, Shashi Bala

2013-01-01

There has been intense interest in the cellular response to hypoxia, and a large number of differentially expressed proteins have been identified through various high-throughput experiments. These valuable data are scattered, and there have been no systematic attempts to document the various proteins regulated by hypoxia. Compilation, curation and annotation of these data are important in deciphering their role in hypoxia and hypoxia-related disorders. Therefore, we have compiled HypoxiaDB, a database of hypoxia-regulated proteins. It is a comprehensive, manually-curated, non-redundant catalog of proteins whose expressions are shown experimentally to be altered at different levels and durations of hypoxia. The database currently contains 72 000 manually curated entries taken on 3500 proteins extracted from 73 peer-reviewed publications selected from PubMed. HypoxiaDB is distinctive from other generalized databases: (i) it compiles tissue-specific protein expression changes under different levels and duration of hypoxia. Also, it provides manually curated literature references to support the inclusion of the protein in the database and establish its association with hypoxia. (ii) For each protein, HypoxiaDB integrates data on gene ontology, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway, protein–protein interactions, protein family (Pfam), OMIM (Online Mendelian Inheritance in Man), PDB (Protein Data Bank) structures and homology to other sequenced genomes. (iii) It also provides pre-compiled information on hypoxia-proteins, which otherwise requires tedious computational analysis. This includes information like chromosomal location, identifiers like Entrez, HGNC, Unigene, Uniprot, Ensembl, Vega, GI numbers and Genbank accession numbers associated with the protein. These are further cross-linked to respective public databases augmenting HypoxiaDB to the external repositories. 
(iv) In addition, HypoxiaDB provides an online sequence-similarity search tool for users to compare their protein sequences with the HypoxiaDB protein database. We hope that HypoxiaDB will enrich our knowledge about hypoxia-related biology and eventually will lead to the development of novel hypotheses and advancements in diagnostic and therapeutic activities. HypoxiaDB is freely accessible for academic and non-profit users via http://www.hypoxiadb.com. Database URL: http://www.hypoxiadb.com PMID:24178989
A bioinformatic and mechanistic study elicits the antifibrotic effect of ursolic acid through the attenuation of oxidative stress with the involvement of ERK, PI3K/Akt, and p38 MAPK signaling pathways in human hepatic stellate cells and rat liver

PubMed Central

He, Wenhua; Shi, Feng; Zhou, Zhi-Wei; Li, Bimin; Zhang, Kunhe; Zhang, Xinhua; Ouyang, Canhui; Zhou, Shu-Feng; Zhu, Xuan

2015-01-01

NADPH oxidases (NOXs) are a predominant mediator of redox homeostasis in hepatic stellate cells (HSCs), and oxidative stress plays an important role in the pathogenesis of liver fibrosis. Ursolic acid (UA) is a pentacyclic triterpenoid with various pharmacological activities, but the molecular targets and underlying mechanisms for its antifibrotic effect in the liver remain elusive. This study aimed to computationally predict the molecular interactome and mechanistically investigate the antifibrotic effect of UA on oxidative stress, with a focus on NOX4 activity and cross-linked signaling pathways in human HSCs and rat liver. Drug–drug interaction via chemical–protein interactome tool, a server that can predict drug–drug interaction via chemical–protein interactome, was used to predict the molecular targets of UA, and Database for Annotation, Visualization, and Integrated Discovery was employed to analyze the signaling pathways of the predicted targets of UA. The bioinformatic data showed that there were 611 molecular proteins possibly interacting with UA and that there were over 49 functional clusters responding to UA. The subsequential benchmarking data showed that UA significantly reduced the accumulation of type I collagen in HSCs in rat liver, increased the expression level of MMP-1, but decreased the expression level of TIMP-1 in HSC-T6 cells. UA also remarkably reduced the gene expression level of type I collagen in HSC-T6 cells. Furthermore, UA remarkably attenuated oxidative stress via negative regulation of NOX4 activity and expression in HSC-T6 cells. The employment of specific chemical inhibitors, SB203580, LY294002, PD98059, and AG490, demonstrated the involvement of ERK, PI3K/Akt, and p38 MAPK signaling pathways in the regulatory effect of UA on NOX4 activity and expression. Collectively, the antifibrotic effect of UA is partially due to the oxidative stress attenuating effect through manipulating NOX4 activity and expression. 
The results suggest that UA may act as a promising antifibrotic agent. More studies are warranted to evaluate the safety and efficacy of UA in the treatment of liver fibrosis. PMID:26347199
ProCarDB: a database of bacterial carotenoids.

PubMed

Nupur, L N U; Vats, Asheema; Dhanda, Sandeep Kumar; Raghava, Gajendra P S; Pinnaka, Anil Kumar; Kumar, Ashwani

2016-05-26

Carotenoids have important functions in bacteria, ranging from harvesting light energy to neutralizing oxidants and acting as virulence factors. However, information pertaining to the carotenoids is scattered throughout the literature. Furthermore, information about the genes/proteins involved in the biosynthesis of carotenoids has tremendously increased in the post-genomic era. A web server providing the information about microbial carotenoids in a structured manner is required and will be a valuable resource for the scientific community working with microbial carotenoids. Here, we have created a manually curated, open access, comprehensive compilation of bacterial carotenoids named ProCarDB (Prokaryotic Carotenoid Database). ProCarDB includes 304 unique carotenoids arising from 50 biosynthetic pathways distributed among 611 prokaryotes. ProCarDB provides important information on carotenoids, such as 2D and 3D structures, molecular weight, molecular formula, SMILES, InChI, InChIKey, IUPAC name, KEGG Id, PubChem Id, and ChEBI Id. The database also provides NMR data, UV-vis absorption data, IR data, MS data and HPLC data that play key roles in the identification of carotenoids. An important feature of this database is the extension of biosynthetic pathways from the literature and through the presence of the genes/enzymes in different organisms. The information contained in the database was mined from published literature and databases such as KEGG, PubChem, ChEBI, LipidBank, LPSN, and Uniprot. The database integrates user-friendly browsing and searching with carotenoid analysis tools to help the user. We believe that this database will serve as a major information centre for researchers working on bacterial carotenoids.
TranscriptomeBrowser 3.0: introducing a new compendium of molecular interactions and a new visualization tool for the study of gene regulatory networks.

PubMed

Lepoivre, Cyrille; Bergon, Aurélie; Lopez, Fabrice; Perumal, Narayanan B; Nguyen, Catherine; Imbert, Jean; Puthier, Denis

2012-01-31

Deciphering gene regulatory networks by in silico approaches is a crucial step in the study of the molecular perturbations that occur in diseases. The development of regulatory maps is a tedious process requiring the comprehensive integration of various evidences scattered over biological databases. Thus, the research community would greatly benefit from having a unified database storing known and predicted molecular interactions. Furthermore, given the intrinsic complexity of the data, the development of new tools offering integrated and meaningful visualizations of molecular interactions is necessary to help users draw new hypotheses without being overwhelmed by the density of the subsequent graph. We extend the previously developed TranscriptomeBrowser database with a set of tables containing 1,594,978 human and mouse molecular interactions. The database includes: (i) predicted regulatory interactions (computed by scanning vertebrate alignments with a set of 1,213 position weight matrices), (ii) potential regulatory interactions inferred from systematic analysis of ChIP-seq experiments, (iii) regulatory interactions curated from the literature, (iv) predicted post-transcriptional regulation by micro-RNA, (v) protein kinase-substrate interactions and (vi) physical protein-protein interactions. In order to easily retrieve and efficiently analyze these interactions, we developed InteractomeBrowser, a graph-based knowledge browser that comes as a plug-in for TranscriptomeBrowser. The first objective of InteractomeBrowser is to provide a user-friendly tool to get new insight into any gene list by providing a context-specific display of putative regulatory and physical interactions. To achieve this, InteractomeBrowser relies on a "cell compartments-based layout" that makes use of a subset of the Gene Ontology to map gene products onto relevant cell compartments. 
This layout is particularly powerful for visual integration of heterogeneous biological information and is a productive avenue in generating new hypotheses. The second objective of InteractomeBrowser is to fill the gap between interaction databases and dynamic modeling. It is thus compatible with the network analysis software Cytoscape and with the Gene Interaction Network simulation software (GINsim). We provide examples underlying the benefits of this visualization tool for large gene set analysis related to thymocyte differentiation. The InteractomeBrowser plugin is a powerful tool to get quick access to a knowledge database that includes both predicted and validated molecular interactions. InteractomeBrowser is available through the TranscriptomeBrowser framework and can be found at: http://tagc.univ-mrs.fr/tbrowser/. Our database is updated on a regular basis.
Target gene screening and evaluation of prognostic values in non-small cell lung cancers by bioinformatics analysis.

PubMed

Piao, Junjie; Sun, Jie; Yang, Yang; Jin, Tiefeng; Chen, Liyan; Lin, Zhenhua

2018-03-20

Non-small cell lung cancer (NSCLC) is the leading cause of cancer-related deaths worldwide. This study aims to explore the molecular mechanisms of NSCLC. A microarray dataset was obtained from the Gene Expression Omnibus (GEO) database and analyzed using GEO2R. Functional and pathway enrichment analyses were performed based on the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Then, STRING, Cytoscape and MCODE were applied to construct the protein-protein interaction (PPI) network and screen hub genes. Subsequently, overall survival (OS) analysis of hub genes was performed using the Kaplan-Meier plotter online tool. Moreover, miRecords was applied to predict the targets of the differentially expressed microRNAs (DEMs). A total of 228 DEGs were identified, and they were mainly enriched in the terms of cell adhesion molecules, leukocyte transendothelial migration and ECM-receptor interaction. A PPI network was constructed, and 16 hub genes were identified, including TEK, ANGPT1, MMP9, VWF, CDH5, EDN1, ESAM, CCNE1, CDC45, PRC1, CCNB2, AURKA, MELK, CDC20, TOP2A and PTTG1. Among these genes, the expression of 14 hub genes was associated with the prognosis of NSCLC patients. Additionally, a total of 11 DEMs were also identified. Our results provide some potential underlying biomarkers for NSCLC. Further studies are required to elucidate the pathogenesis of NSCLC. Copyright © 2018 Elsevier B.V. All rights reserved.
CREDO: a structural interactomics database for drug discovery

PubMed Central

Schreyer, Adrian M.; Blundell, Tom L.

2013-01-01

CREDO is a unique relational database storing all pairwise atomic interactions of inter- as well as intra-molecular contacts between small molecules and macromolecules found in experimentally determined structures from the Protein Data Bank. These interactions are integrated with further chemical and biological data. The database implements useful data structures and algorithms such as cheminformatics routines to create a comprehensive analysis platform for drug discovery. The database can be accessed through a web-based interface, downloads of data sets and web services at http://www-cryst.bioc.cam.ac.uk/credo. Database URL: http://www-cryst.bioc.cam.ac.uk/credo PMID:23868908
IntegromeDB: an integrated system and biological search engine

PubMed Central

2012-01-01

Background With the growth of biological data in volume and heterogeneity, web search engines become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Description Here, we present an approach at developing a biological web search engine based on the Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. Conclusions The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback. PMID:22260095
Digital gene expression profiling analysis and its application in the identification of genes associated with improved response to neoadjuvant chemotherapy in breast cancer.

PubMed

Liu, Xiaozhen; Jin, Gan; Qian, Jiacheng; Yang, Hongjian; Tang, Hongchao; Meng, Xuli; Li, Yongfeng

2018-04-23

This study aimed to screen sensitive biomarkers for the efficacy evaluation of neoadjuvant chemotherapy in breast cancer. In this study, Illumina digital gene expression sequencing technology was applied and differentially expressed genes (DEGs) between patients presenting pathological complete response (pCR) and non-pathological complete response (NpCR) were identified. Further, gene ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were then performed. The genes in significant enriched pathways were finally quantified by quantitative real-time PCR (qRT-PCR) to confirm that they were differentially expressed. Additionally, GSE23988 from Gene Expression Omnibus database was used as the validation dataset to confirm the DEGs. After removing the low-quality reads, 715 DEGs were finally detected. After mapping to KEGG pathways, 10 DEGs belonging to the ubiquitin proteasome pathway (HECTD3, PSMB10, UBD, UBE2C, and UBE2S) and cytokine-cytokine receptor interactions (CCL2, CCR1, CXCL10, CXCL11, and IL2RG) were selected for further analysis. These 10 genes were finally quantified by qRT-PCR to confirm that they were differentially expressed (the log2 fold changes of selected genes were -5.34, 7.81, 6.88, 5.74, 3.11, 19.58, 8.73, 8.88, 7.42, and 34.61 for HECTD3, PSMB10, UBD, UBE2C, UBE2S, CCL2, CCR1, CXCL10, CXCL11, and IL2RG, respectively). Moreover, 53 common genes were confirmed by the validation dataset, including downregulated UBE2C and UBE2S. Our results suggested that these 10 genes belonging to these two pathways might be useful as sensitive biomarkers for the efficacy evaluation of neoadjuvant chemotherapy in breast cancer.
Human Mesenchymal Stem Cell Treatment Normalizes Cortical Gene Expression after Traumatic Brain Injury.

PubMed

Darkazalli, Ali; Vied, Cynthia; Badger, Crystal-Dawn; Levenson, Cathy W

2017-01-01

Traumatic brain injury (TBI) results in a progressive disease state with many adverse and long-term neurological consequences. Mesenchymal stem cells (MSCs) have emerged as a promising cytotherapy and have been previously shown to reduce secondary apoptosis and cognitive deficits associated with TBI. Consistent with the established literature, we observed that systemically administered human MSCs (hMSCs) accumulate with high specificity at the TBI lesion boundary zone known as the penumbra. Substantial work has been done to illuminate the mechanisms by which MSCs, and the bioactive molecules they secrete, exert their therapeutic effect. However, no such work has been published to examine the effect of MSC treatment on gene expression in the brain post-TBI. In the present study, we use high-throughput RNA sequencing (RNAseq) of cortical tissue from the TBI penumbra to assess the molecular effects of both TBI and subsequent treatment with intravenously delivered hMSCs. RNAseq revealed that expression of almost 7000 cortical genes in the penumbra were differentially regulated by TBI. Pathway analysis using the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway database revealed that TBI regulated a large number of genes belonging to pathways involved in metabolism, receptor-mediated cell signaling, neuronal plasticity, immune cell recruitment and infiltration, and neurodegenerative disease. Remarkably, hMSC treatment was found to normalize 49% of all genes disrupted by TBI, with notably robust normalization of specific pathways within the categories mentioned above, including neuroactive receptor-ligand interactions (57%), glycolysis and gluconeogenesis (81%), and Parkinson's disease (100%). These data provide evidence in support of the multi-mechanistic nature of stem cell therapy and suggest that hMSC treatment is capable of simultaneously normalizing a wide variety of important molecular pathways that are disrupted by brain injury.
The Dendritic Cell Major Histocompatibility Complex II (MHC II) Peptidome Derives from a Variety of Processing Pathways and Includes Peptides with a Broad Spectrum of HLA-DM Sensitivity

PubMed Central

Clement, Cristina C.; Becerra, Aniuska; Yin, Liusong; Zolla, Valerio; Huang, Liling; Merlin, Simone; Follenzi, Antonia; Shaffer, Scott A.; Stern, Lawrence J.; Santambrogio, Laura

2016-01-01

The repertoire of peptides displayed in vivo by MHC II molecules derives from a wide spectrum of proteins produced by different cell types. Although intracellular endosomal processing in dendritic cells and B cells has been characterized for a few antigens, the overall range of processing pathways responsible for generating the MHC II peptidome is currently unclear. To determine the contribution of non-endosomal processing pathways, we eluted and sequenced over 3000 HLA-DR1-bound peptides presented in vivo by dendritic cells. The processing enzymes were identified by reference to a database of experimentally determined cleavage sites and experimentally validated for four epitopes derived from complement 3, collagen II, thymosin β4, and gelsolin. We determined that self-antigens processed by tissue-specific proteases, including complement, matrix metalloproteases, caspases, and granzymes, and carried by lymph, contribute significantly to the MHC II self-peptidome presented by conventional dendritic cells in vivo. Additionally, the presented peptides exhibited a wide spectrum of binding affinity and HLA-DM susceptibility. The results indicate that the HLA-DR1-restricted self-peptidome presented under physiological conditions derives from a variety of processing pathways. Non-endosomal processing enzymes add to the number of epitopes cleaved by cathepsins, altogether generating a wider peptide repertoire. Taken together with HLA-DM-dependent and -independent loading pathways, this ensures that a broad self-peptidome is presented by dendritic cells. This work brings attention to the role of “self-recognition” as a dynamic interaction between dendritic cells and the metabolic/catabolic activities ongoing in every parenchymal organ as part of tissue growth, remodeling, and physiological apoptosis. PMID:26740625
MelanomaDB: A Web Tool for Integrative Analysis of Melanoma Genomic Information to Identify Disease-Associated Molecular Pathways

PubMed Central

Trevarton, Alexander J.; Mann, Michael B.; Knapp, Christoph; Araki, Hiromitsu; Wren, Jonathan D.; Stones-Havas, Steven; Black, Michael A.; Print, Cristin G.

2013-01-01

Despite on-going research, metastatic melanoma survival rates remain low and treatment options are limited. Researchers can now access a rapidly growing amount of molecular and clinical information about melanoma. This information is becoming difficult to assemble and interpret due to its dispersed nature, yet as it grows it becomes increasingly valuable for understanding melanoma. Integration of this information into a comprehensive resource to aid rational experimental design and patient stratification is needed. As an initial step in this direction, we have assembled a web-accessible melanoma database, MelanomaDB, which incorporates clinical and molecular data from publically available sources, which will be regularly updated as new information becomes available. This database allows complex links to be drawn between many different aspects of melanoma biology: genetic changes (e.g., mutations) in individual melanomas revealed by DNA sequencing, associations between gene expression and patient survival, data concerning drug targets, biomarkers, druggability, and clinical trials, as well as our own statistical analysis of relationships between molecular pathways and clinical parameters that have been produced using these data sets. The database is freely available at http://genesetdb.auckland.ac.nz/melanomadb/about.html. A subset of the information in the database can also be accessed through a freely available web application in the Illumina genomic cloud computing platform BaseSpace at http://www.biomatters.com/apps/melanoma-profiler-for-research. The MelanomaDB database illustrates dysregulation of specific signaling pathways across 310 exome-sequenced melanomas and in individual tumors and identifies the distribution of somatic variants in melanoma. We suggest that MelanomaDB can provide a context in which to interpret the tumor molecular profiles of individual melanoma patients relative to biological information and available drug therapies. PMID:23875173
Asian Citrus Psyllid Expression Profiles Suggest Candidatus Liberibacter Asiaticus-Mediated Alteration of Adult Nutrition and Metabolism, and of Nymphal Development and Immunity

PubMed Central

He, Ruifeng; Nelson, William; Yin, Guohua; Cicero, Joseph M.; Willer, Mark; Kim, Ryan; Kramer, Robin; May, Greg A.; Crow, John A.; Soderlund, Carol A.; Gang, David R.; Brown, Judith K.

2015-01-01

The Asian citrus psyllid (ACP) Diaphorina citri Kuwayama (Hemiptera: Psyllidae) is the insect vector of the fastidious bacterium Candidatus Liberibacter asiaticus (CLas), the causal agent of citrus greening disease, or Huanglongbing (HLB). The widespread invasiveness of the psyllid vector and HLB in citrus trees worldwide has underscored the need for non-traditional approaches to manage the disease. One tenable solution is through the deployment of RNA interference technology to silence protein-protein interactions essential for ACP-mediated CLas invasion and transmission. To identify psyllid interactor-bacterial effector combinations associated with psyllid-CLas interactions, cDNA libraries were constructed from CLas-infected and CLas-free ACP adults and nymphs, and analyzed for differential expression. Library assemblies comprised 24,039,255 reads and yielded 45,976 consensus contigs. They were annotated (UniProt), classified using Gene Ontology, and subjected to in silico expression analyses using the Transcriptome Computational Workbench (TCW) (http://www.sohomoptera.org/ACPPoP/). Functional-biological pathway interpretations were carried out using the Kyoto Encyclopedia of Genes and Genomes databases. Differentially expressed contigs in adults and/or nymphs represented genes and/or metabolic/pathogenesis pathways involved in adhesion, biofilm formation, development-related, immunity, nutrition, stress, and virulence. Notably, contigs involved in gene silencing and transposon-related responses were documented in a psyllid for the first time. This is the first comparative transcriptomic analysis of ACP adults and nymphs infected and uninfected with CLas. The results provide key initial insights into host-parasite interactions involving CLas effectors that contribute to invasion-virulence, and to host nutritional exploitation and immune-related responses that appear to be essential for successful ACP-mediated circulative, propagative CLas transmission. 
PMID:26091106
Validated MicroRNA Target Databases: An Evaluation.

PubMed

Lee, Yun Ji Diana; Kim, Veronica; Muth, Dillon C; Witwer, Kenneth W

2015-11-01

Preclinical Research: Positive findings from preclinical and clinical studies involving depletion or supplementation of microRNA (miRNA) engender optimism about miRNA-based therapeutics. However, off-target effects must be considered. Predicting these effects is complicated. Each miRNA may target many gene transcripts, and the rules governing imperfectly complementary miRNA:target interactions are incompletely understood. Several databases provide lists of the relatively small number of experimentally confirmed miRNA:target pairs. Although incomplete, this information might allow assessment of at least some of the off-target effects. We evaluated the performance of four databases of experimentally validated miRNA:target interactions (miRWalk 2.0, miRTarBase, miRecords, and TarBase 7.0) using a list of 50 alphabetically consecutive genes. We examined the provided citations to determine the degree to which each interaction was experimentally supported. To assess stability, we tested at the beginning and end of a five-month period. Results varied widely by database. Two of the databases changed significantly over the course of 5 months. Most reported evidence for miRNA:target interactions was indirect or otherwise weak, and relatively few interactions were supported by more than one publication. Some returned results appear to arise from simplistic text searches that offer no insight into the relationship of the search terms, may not even include the reported gene or miRNA, and may thus be invalid. We conclude that validation databases provide important information, but not all information in all extant databases is up-to-date or accurate. Nevertheless, the more comprehensive validation databases may provide useful starting points for investigation of off-target effects of proposed small RNA therapies. © 2015 Wiley Periodicals, Inc.
RAIN: RNA–protein Association and Interaction Networks

PubMed Central

Junge, Alexander; Refsgaard, Jan C.; Garde, Christian; Pan, Xiaoyong; Santos, Alberto; Alkan, Ferhat; Anthon, Christian; von Mering, Christian; Workman, Christopher T.; Jensen, Lars Juhl; Gorodkin, Jan

2017-01-01

Protein association networks can be inferred from a range of resources including experimental data, literature mining and computational predictions. These types of evidence are emerging for non-coding RNAs (ncRNAs) as well. However, integration of ncRNAs into protein association networks is challenging due to data heterogeneity. Here, we present a database of ncRNA–RNA and ncRNA–protein interactions and its integration with the STRING database of protein–protein interactions. These ncRNA associations cover four organisms and have been established from curated examples, experimental data, interaction predictions and automatic literature mining. RAIN uses an integrative scoring scheme to assign a confidence score to each interaction. We demonstrate that RAIN outperforms the underlying microRNA-target predictions in inferring ncRNA interactions. RAIN can be operated through an easily accessible web interface and all interaction data can be downloaded. Database URL: http://rth.dk/resources/rain PMID:28077569

The new interactive CESAR

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fox, P.B.; Yatabe, M.

1987-01-01

In this report the Nuclear Criticality Safety Analytical Methods Resource Center describes a new interactive version of CESAR, a critical experiments storage and retrieval program available on the Nuclear Criticality Information System (NCIS) database at Lawrence Livermore National Laboratory. The original version of CESAR did not include interactive search capabilities. The CESAR database was developed to provide a convenient, readily accessible means of storing and retrieving code input data for the SCALE Criticality Safety Analytical Sequences and the codes comprising those sequences. The database includes data for both cross section preparation and criticality safety calculations. 3 refs., 1 tab.
Trait- and density-mediated indirect interactions initiated by an exotic invasive plant autogenic ecosystem engineer

Treesearch

Dean E. Pearson

2010-01-01

Indirect interactions are important for structuring ecological systems. However, research on indirect effects has been heavily biased toward top-down trophic interactions, and less is known about other indirect-interaction pathways. As autogenic ecosystem engineers, plants can serve as initiators of nontrophic indirect interactions that, like top-down pathways, can...
Mapping the patent landscape of synthetic biology for fine chemical production pathways.

PubMed

Carbonell, Pablo; Gök, Abdullah; Shapira, Philip; Faulon, Jean-Loup

2016-09-01

A goal of synthetic biology bio-foundries is to innovate through an iterative design/build/test/learn pipeline. In assessing the value of new chemical production routes, the intellectual property (IP) novelty of the pathway is important. Exploratory studies can be carried out using knowledge of the patent/IP landscape for synthetic biology and metabolic engineering. In this paper, we perform an assessment of pathways as potential targets for chemical production across the full catalogue of reachable chemicals in the extended metabolic space of chassis organisms, as computed by the retrosynthesis-based algorithm RetroPath. Our database for reactions processed by sequences in heterologous pathways was screened against the PatSeq database, a comprehensive collection of more than 150M sequences present in patent grants and applications. We also examine related patent families using Derwent Innovations. This large-scale computational study provides useful insights into the IP landscape of synthetic biology for fine and specialty chemicals production. © 2016 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.
Caveat emptor: limitations of the automated reconstruction of metabolic pathways in Plasmodium.

PubMed

Ginsburg, Hagai

2009-01-01

The functional reconstruction of metabolic pathways from an annotated genome is a tedious and demanding enterprise. Automation of this endeavor using bioinformatics algorithms could cope with the ever-increasing number of sequenced genomes and accelerate the process. Here, the manual reconstruction of metabolic pathways in the functional genomic database of Plasmodium falciparum--Malaria Parasite Metabolic Pathways--is described and compared with pathways generated automatically as they appear in PlasmoCyc, metaSHARK and the Kyoto Encyclopedia of Genes and Genomes. A critical evaluation of this comparison discloses that the automatic reconstruction of pathways generates manifold paths that need an expert manual verification to accept some and reject most others based on manually curated gene annotation.
Ossification of the posterior longitudinal ligament related genes identification using microarray gene expression profiling and bioinformatics analysis.

PubMed

He, Hailong; Mao, Lingzhou; Xu, Peng; Xi, Yanhai; Xu, Ning; Xue, Mingtao; Yu, Jiangming; Ye, Xiaojian

2014-01-10

Ossification of the posterior longitudinal ligament (OPLL) is a disease characterized by physical barriers and neurological disorders. The objective of this study was to explore the differentially expressed genes (DEGs) in OPLL patient ligament cells and identify target sites for the clinical prevention and treatment of OPLL. Gene expression data GSE5464 was downloaded from Gene Expression Omnibus; then DEGs were screened with the limma package in R, and altered functions and pathways of OPLL cells compared to normal cells were identified by DAVID (The Database for Annotation, Visualization and Integrated Discovery); finally, an interaction network of DEGs was constructed with STRING. A total of 1536 DEGs were screened, with 31 down-regulated and 1505 up-regulated genes. The response to wounding function and the Toll-like receptor signaling pathway may be involved in the development of OPLL. Genes such as PDGFB and PRDX2 may be involved in OPLL through the response to wounding function. Genes enriched in the Toll-like receptor signaling pathway, such as TLR1, TLR5, and TLR7, may be involved in spinal cord injury in OPLL. PIK3R1 was the hub gene in the network of DEGs with the highest degree; INSR was one of the genes most closely related to it. OPLL-related genes screened by microarray gene expression profiling and bioinformatics analysis may be helpful for elucidating the mechanism of OPLL. © 2013.
Electronic coupling through natural amino acids

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berstis, Laura; Beckham, Gregg T. (gregg.beckham@nrel.gov); Crowley, Michael F. (michael.crowley@nrel.gov)

2015-12-14

Myriad scientific domains concern themselves with biological electron transfer (ET) events that span across vast scales of rate and efficiency through a remarkably fine-tuned integration of amino acid (AA) sequences, electronic structure, dynamics, and environment interactions. Within this intricate scheme, many questions persist as to how proteins modulate electron-tunneling properties. To help elucidate these principles, we develop a model set of peptides representing the common α-helix and β-strand motifs including all natural AAs within implicit protein-environment solvation. Using an effective Hamiltonian strategy with density functional theory, we characterize the electronic coupling through these peptides, furthermore considering side-chain dynamics. For both motifs, predictions consistently show that backbone-mediated electronic coupling is distinctly sensitive to AA type (aliphatic, polar, aromatic, negatively charged and positively charged), and to side-chain orientation. The unique properties of these residues may be employed to design activated, deactivated, or switch-like superexchange pathways. Electronic structure calculations and Green’s function analyses indicate that localized shifts in the electron density along the peptide play a role in modulating these pathways, and further substantiate the experimentally observed behavior of proline residues as superbridges. The distinct sensitivities of tunneling pathways to sequence and conformation revealed in this electronic coupling database help improve our fundamental understanding of the broad diversity of ET reactivity and provide guiding principles for peptide design.
Analysis of expressed sequence tags for Frankliniella occidentalis, the western flower thrips.

PubMed

Rotenberg, D; Whitfield, A E

2010-08-01

Thrips are members of the insect order Thysanoptera and Frankliniella occidentalis (the western flower thrips) is the most economically important pest within this order. F. occidentalis is both a direct pest of crops and an efficient vector of plant viruses, including Tomato spotted wilt virus (TSWV). Despite the world-wide importance of thrips in agriculture, there is little knowledge of the F. occidentalis genome or gene functions at this time. A normalized cDNA library was constructed from first instar thrips and 13 839 expressed sequence tags (ESTs) were obtained. Our EST data assembled into 894 contigs and 11 806 singletons (12 700 nonredundant sequences). We found that 31% of these sequences had significant similarity (E ≤ 10^-10) to protein sequences in the National Center for Biotechnology Information nonredundant (nr) protein database, and 25% were functionally annotated using Blast2GO. We identified 74 sequences with putative homology to proteins associated with insect innate immunity. Sixteen sequences had significant similarity to proteins associated with small RNA-mediated gene silencing pathways (RNA interference; RNAi), including the antiviral pathway (short interfering RNA-mediated pathway). Our EST collection provides new sequence resources for characterizing gene functions in F. occidentalis and other thrips species with regard to vital biological processes, studying the mechanism of interactions with the viruses harboured and transmitted by the vector, and identifying new insect gene-centred targets for plant disease and insect control.
DIANA-microT web server: elucidating microRNA functions through target prediction.

PubMed

Maragkakis, M; Reczko, M; Simossis, V A; Alexiou, P; Papadopoulos, G L; Dalamagas, T; Giannopoulos, G; Goumas, G; Koukis, E; Kourtis, K; Vergoulis, T; Koziris, N; Sellis, T; Tsanakas, P; Hatzigeorgiou, A G

2009-07-01

Computational microRNA (miRNA) target prediction is one of the key means for deciphering the role of miRNAs in development and disease. Here, we present the DIANA-microT web server as the user interface to the DIANA-microT 3.0 miRNA target prediction algorithm. The web server provides extensive information for predicted miRNA:target gene interactions through a user-friendly interface with extensive connectivity to online biological resources. Target gene and miRNA functions may be elucidated through automated bibliographic searches, and functional information is accessible through Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The web server offers links to nomenclature, sequence and protein databases, and users can search for targeted genes using different nomenclatures or functional features, such as a gene's possible involvement in biological pathways. The target prediction algorithm supports parameters calculated individually for each miRNA:target gene interaction and provides a signal-to-noise ratio and a precision score that help in evaluating the significance of the predicted results. Using a set of miRNA targets recently identified through the pSILAC method, the performance of several computational target prediction programs was assessed. DIANA-microT 3.0 achieved the highest ratio of correctly predicted targets over all predicted targets, at 66%. The DIANA-microT web server is freely available at www.microrna.gr/microT.
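The performance measure quoted in this abstract, the ratio of correctly predicted targets over all predicted targets, is the usual definition of precision. A minimal sketch follows; the gene identifiers are placeholders and the function name is illustrative, not part of the DIANA-microT server.

```python
def precision(predicted, validated):
    """Fraction of predicted targets that also appear in the validated set."""
    predicted = set(predicted)
    if not predicted:
        raise ValueError("no predictions to evaluate")
    return len(predicted & set(validated)) / len(predicted)

# Placeholder identifiers for illustration only.
predicted_targets = {"GENE_A", "GENE_B", "GENE_C"}
validated_targets = {"GENE_A", "GENE_C", "GENE_D"}

# Two of the three predictions are validated.
print(round(precision(predicted_targets, validated_targets), 2))
```

Note that precision says nothing about validated targets the program missed (GENE_D here); that is measured separately by sensitivity.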
Amyloid precursor protein interaction network in human testis: sentinel proteins for male reproduction.

PubMed

Silva, Joana Vieira; Yoon, Sooyeon; Domingues, Sara; Guimarães, Sofia; Goltsev, Alexander V; da Cruz E Silva, Edgar Figueiredo; Mendes, José Fernando F; da Cruz E Silva, Odete Abreu Beirão; Fardilha, Margarida

2015-01-16

Amyloid precursor protein (APP) is widely recognized for playing a central role in Alzheimer's disease pathogenesis. Although APP is expressed in several tissues outside the human central nervous system, the functions of APP and its family members in other tissues are still poorly understood. APP is involved in several biological functions which might be important for male fertility, such as cell adhesion, cell motility, signaling, and apoptosis. Furthermore, APP superfamily members are known to be associated with fertility. Knowledge of the protein networks of APP in human testis and spermatozoa will shed light on the function of APP in the male reproductive system. We performed a Yeast Two-Hybrid screen and a database search to study the interaction network of APP in human testis and sperm. To gain insights into the role of APP superfamily members in fertility, the study was extended to APP-like protein 2 (APLP2). We analyzed several topological properties of the APP interaction network, and the biological and physiological properties of the proteins in the network were also specified by gene ontology and pathway analyses. We classified significant features related to human male reproduction for the APP interacting proteins and identified modules of proteins with similar functional roles which may show cooperative behavior for male fertility. The present work provides the first report on the APP interactome in human testis. Our approach allowed the identification of novel interactions and recognition of key APP interacting proteins for male reproduction, particularly in sperm-oocyte interaction.
Interactions between traditional Chinese medicine and western drugs in Taiwan: A population-based study.

PubMed

Chen, Kuan Chen; Lu, Richard; Iqbal, Usman; Hsu, Ko-Ching; Chen, Bi-Li; Nguyen, Phung-Anh; Yang, Hsuan-Chia; Huang, Chih-Wei; Li, Yu-Chuan Jack; Jian, Wen-Shan; Tsai, Shin-Han

2015-12-01

Drug-drug interactions have long been an active research area in clinical medicine. In Taiwan, however, the widespread use of traditional Chinese medicines (TCM) adds complexity to the topic. Therefore, it is important to examine the interactions between traditional Chinese and western medicine. The objectives were (1) to create a comprehensive database of multi-herb/western drug interactions indexed according to the ways in which physicians actually practice and (2) to measure this database's impact on the detection of adverse effects between traditional Chinese medicine compounds and western medicines. First, a multi-herb/western medicine drug interactions database was created by separating each TCM compound into its constituent herbs. Each individual herb was then checked against an existing single-herb/western drug interactions database. The data came from the National Health Insurance research database, which spans the years 1998-2011. This study estimated the interaction prevalence rate and further separated the rates according to patient characteristics, distribution by county, and hospital accreditation levels. Finally, this new database was integrated into a computer order entry module of the electronic medical records system of a regional teaching hospital. The effects it had were measured for two months. The most commonly interacting Chinese herbs were Ephedrae Herba and Angelicae Sinensis Radix/Angelicae Dahuricae Radix. Ephedrae Herba contains active ingredients similar to ephedrine. Fifteen kinds of traditional Chinese medicine compounds contain Ephedrae Herba. Angelicae Sinensis Radix and Angelicae Dahuricae Radix contain ingredients similar to coumarin, a blood thinner. Nine kinds of traditional Chinese medicine compounds contained Angelicae Sinensis Radix/Angelicae Dahuricae Radix. In the period from 1998 to 2011, the prevalence of herb-drug interactions related to Ephedrae Herba was 0.18%. The most commonly prescribed traditional Chinese compounds were MA SHING GAN SHYR TANG (23.1%), followed by SHEAU CHING LONG TANG (15.5%) and DINQ CHUAN TANG (13.2%). The prevalence of herb-drug interactions related to Angelicae Sinensis Radix/Angelicae Dahuricae Radix was 4.59%. The most common traditional Chinese compound formulas were TSANG EEL SAAN (32%), followed by HUOH SHIANG JENQ CHIH SAAN (31.4%) and SHY WUH TANG (10.7%). Once the multi-herb drug interaction database was deployed in a hospital system, there were 480 prescriptions that indicated a TCM-western drug interaction. Physicians were alerted 24 times during two months. These alerts resulted in a prescription change four times (16.7%). Due to the unique cultural factors that have resulted in widespread acceptance of both western and traditional Chinese medicine, Taiwan is well positioned to report on the prevalence of interactions between western drugs and traditional Chinese medicine and to devise ways to reduce their incidence. This study built a multi-herb/western drug interactions database, embedded it inside a hospital clinical information system, and then examined the effects that drug interaction alerts had on clinician prescribing behaviour. The results demonstrated that western drug/traditional Chinese medicine interactions are prevalent and that western-trained physicians tend to change their prescribing behaviour more than traditional Chinese medicine physicians in response to medication interaction alerts. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
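The lookup described in this abstract (split each TCM compound into its constituent herbs, then check every herb against a single-herb/western drug interaction table) can be sketched as below. All formula names, drug names, compositions and interaction entries are hypothetical placeholders; only the herb names appear in the abstract, and the function name is illustrative rather than the hospital system's actual interface.

```python
def find_interactions(formula, drugs, formula_to_herbs, herb_drug_interactions):
    """Return sorted (herb, drug) pairs flagged for one formula and a drug list."""
    prescribed = set(drugs)
    hits = set()
    for herb in formula_to_herbs.get(formula, ()):
        # Check each constituent herb against the single-herb table.
        for drug in herb_drug_interactions.get(herb, ()):
            if drug in prescribed:
                hits.add((herb, drug))
    return sorted(hits)

# Hypothetical compositions: which herbs each compound formula contains.
formula_to_herbs = {
    "FORMULA_X": ["Ephedrae Herba", "HERB_P"],
    "FORMULA_Y": ["Angelicae Sinensis Radix"],
}

# Hypothetical single-herb/western drug interaction table.
herb_drug_interactions = {
    "Ephedrae Herba": {"DRUG_1", "DRUG_2"},
    "Angelicae Sinensis Radix": {"DRUG_3"},
}

print(find_interactions("FORMULA_X", ["DRUG_1", "DRUG_3"],
                        formula_to_herbs, herb_drug_interactions))
```

Indexing by formula rather than by herb mirrors how prescriptions are written, which is the point the abstract makes about matching the database to actual practice.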
Beyond mitochondria, what would be the energy source of the cell?

PubMed

Herrera, Arturo S; Del C A Esparza, Maria; Md Ashraf, Ghulam; Zamyatnin, Andrey A; Aliev, Gjumrakch

2015-01-01

Currently, cell biology is based on glucose as the main source of energy. Cellular bioenergetic pathways have become unnecessarily complex in the effort to explain how the cell is able to generate and use energy from the oxidation of glucose, where mitochondria play an important role through oxidative phosphorylation. During a descriptive study of the three leading causes of blindness in the world, the ability of melanin to transform light energy into chemical energy through the dissociation of the water molecule was unraveled. Initially, for 2 or 3 years, we tried to link our findings with the widely accepted metabolic pathways already described in metabolic pathway databases, which have been developed to collect and organize the current knowledge on metabolism scattered across a multitude of scientific articles. However, firstly, the literature on metabolism is extensive but conclusive evidence is rarely available, and secondly, one would expect these databases to contain largely the same information, but the contrary is true. For the apparently well-studied Krebs cycle, which was described as early as 1937 and is found in nearly every biology and chemistry curriculum, there is considerable disagreement between at least five databases. Of the nearly 7000 reactions contained jointly by these five databases, only 199 are described in the same way in all five. Thus, integrating chemical energy from melanin with the supposedly well-known bioenergetic pathways is easier said than done, and the lack of consensus about the metabolic network constitutes an insurmountable barrier. After years of unsuccessful results, we finally realized that the chemical energy released through the dissociation of the water molecule by melanin represents over 90% of cell energy requirements. These findings reveal a new aspect of cell biology, as glucose and ATP have biological functions related mainly to biomass and not so much to energy. Our finding about the unexpected intrinsic property of melanin to transform photon energy into chemical energy through the dissociation of the water molecule, a role supposedly performed only by chlorophyll in plants, seriously questions the sacrosanct role of glucose, and thereby mitochondria, as the primary source of energy and power for cells.
Correcting ligands, metabolites, and pathways

PubMed Central

Ott, Martin A; Vriend, Gert

2006-01-01

Background: A wide range of research areas in bioinformatics, molecular biology and medicinal chemistry require precise chemical structure information about molecules and reactions, e.g. drug design, ligand docking, metabolic network reconstruction, and systems biology. Most available databases, however, treat chemical structures more as illustrations than as a datafield in its own right. Lack of chemical accuracy impedes progress in the areas mentioned above. We present a database of metabolites called BioMeta that augments the existing pathway databases by explicitly assessing the validity, correctness, and completeness of chemical structure and reaction information. Description: The main bulk of the data in BioMeta were obtained from the KEGG Ligand database. We developed a tool for chemical structure validation which assesses the chemical validity and stereochemical completeness of a molecule description. The validation tool was used to examine the compounds in BioMeta, showing that a relatively small number of compounds had an incorrect constitution (connectivity only, not considering stereochemistry) and that a considerable number (about one third) had incomplete or even incorrect stereochemistry. We made a large effort to correct the errors and to complete the structural descriptions. A total of 1468 structures were corrected and/or completed. We also established the reaction balance of the reactions in BioMeta and corrected 55% of the unbalanced (stoichiometrically incorrect) reactions in an automatic procedure. The BioMeta database was implemented in PostgreSQL and provided with a web-based interface. Conclusion: We demonstrate that the validation of metabolite structures and reactions is a feasible and worthwhile undertaking, and that the validation results can be used to trigger corrections and improvements to BioMeta, our metabolite database. BioMeta provides some tools for rational drug design, reaction searches, and visualization. 
It is freely available at provided that the copyright notice of all original data is cited. The database will be useful for querying and browsing biochemical pathways, and to obtain reference information for identifying compounds. However, these applications require that the underlying data be correct, and that is the focus of BioMeta. PMID:17132165
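The reaction-balance check mentioned in this abstract amounts to comparing element counts on both sides of a reaction. The sketch below is a simplified illustration under stated assumptions: it handles only flat molecular formulas (no parentheses, charges, or generic R groups), and the function names are hypothetical; it is not BioMeta's validation tool.

```python
import re
from collections import Counter

def atom_counts(formula):
    """Count atoms in a flat formula such as 'C6H12O6' (no parentheses or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def side_counts(side):
    """Sum element counts over (coefficient, formula) pairs on one side of a reaction."""
    total = Counter()
    for coefficient, formula in side:
        for element, n in atom_counts(formula).items():
            total[element] += coefficient * n
    return total

def is_balanced(reactants, products):
    """True when every element occurs equally often on both sides."""
    return side_counts(reactants) == side_counts(products)

# Glucose oxidation with and without stoichiometric coefficients.
balanced = is_balanced([(1, "C6H12O6"), (6, "O2")], [(6, "CO2"), (6, "H2O")])
unbalanced = is_balanced([(1, "C6H12O6"), (1, "O2")], [(1, "CO2"), (1, "H2O")])
print(balanced, unbalanced)
```

A real validator would also have to balance charge and cope with polymers and unspecified substituents, which is part of why automatic correction succeeds for only a fraction of reactions.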
An Assessment of Database-Validated microRNA Target Genes in Normal Colonic Mucosa: Implications for Pathway Analysis.

PubMed

Slattery, Martha L; Herrick, Jennifer S; Stevens, John R; Wolff, Roger K; Mullany, Lila E

2017-01-01

Determination of functional pathways regulated by microRNAs (miRNAs), while an essential step in developing therapeutics, is challenging. Some miRNAs have been studied extensively; others have limited information. In this study, we focus on 254 miRNAs previously identified as being associated with colorectal cancer and their database-identified validated target genes. We use RNA-Seq data to evaluate messenger RNA (mRNA) expression for 157 subjects who also had miRNA expression data. In the replication phase of the study, we replicated associations between 254 miRNAs associated with colorectal cancer and mRNA expression of database-identified target genes in normal colonic mucosa. In the discovery phase of the study, we evaluated expression of 18 miRNAs (those with 20 or fewer database-identified target genes along with miR-21-5p, miR-215-5p, and miR-124-3p which have more than 500 database-identified target genes) with expression of 17 434 mRNAs to identify new targets in colon tissue. Seed region matches between miRNA and newly identified targeted mRNA were used to help determine direct miRNA-mRNA associations. From the replication of the 121 miRNAs that had at least 1 database-identified target gene using mRNA expression methods, 97.9% were expressed in normal colonic mucosa. Of the 8622 target miRNA-mRNA associations identified in the database, 2658 (30.2%) were associated with gene expression in normal colonic mucosa after adjusting for multiple comparisons. Of the 133 miRNAs with database-identified target genes by non-mRNA expression methods, 97.2% were expressed in normal colonic mucosa. After adjustment for multiple comparisons, 2416 miRNA-mRNA associations remained significant (19.8%). Results from the discovery phase based on detailed examination of 18 miRNAs identified more than 80 000 miRNA-mRNA associations that had not previously been linked to the miRNA. Of these miRNA-mRNA associations, 15.6% and 14.8% had seed matches for GRCh38 and GRCh37, respectively. Our data suggest that miRNA target gene databases are incomplete; pathways derived from these databases have similar deficiencies. Although we know a lot about several miRNAs, little is known about other miRNAs in terms of their targeted genes. We encourage others to use their data to continue to identify and validate miRNA-targeted genes.
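The seed-match step used in this abstract to support direct miRNA-mRNA associations can be sketched as a simple string search. The abstract does not state which seed definition was applied; the sketch assumes the common convention of miRNA nucleotides 2-8, and both sequences below are made-up examples, not real miRNA or transcript sequences.

```python
# RNA complement table used to build the site expected in the target mRNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def seed_site(mirna, start=2, end=8):
    """Reverse complement of the miRNA seed (1-based positions start..end)."""
    seed = mirna[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]

def seed_match_positions(mirna, utr):
    """0-based positions in the mRNA sequence where the seed site occurs."""
    site = seed_site(mirna)
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]

# Hypothetical sequences for illustration only.
example_mirna = "UACGGAUCCAGUCUGAAGCUAA"
example_utr = "AAGCGAUCCGUUUCA"

print(seed_site(example_mirna))
print(seed_match_positions(example_mirna, example_utr))
```

A seed match alone does not establish regulation; in the abstract it is used only as supporting evidence alongside the expression associations.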
Identification and Analysis of Jasmonate Pathway Genes in Coffea canephora (Robusta Coffee) by In Silico Approach.

PubMed

Bharathi, Kosaraju; Sreenath, H L

2017-07-01

Coffea canephora, along with Coffea arabica, is one of the commonly cultivated coffee species in the world. Different pests and pathogens affect the production and quality of coffee. Jasmonic acid (JA) is a plant hormone which plays an important role in plant growth, development, and defense mechanisms, particularly against insect pests. The key enzymes involved in the production of JA are lipoxygenase, allene oxide synthase, allene oxide cyclase, and 12-oxo-phytodienoic reductase. There is no report on the genes involved in the JA pathway in coffee plants. We made an attempt to identify and analyze the genes coding for these enzymes in C. canephora. First, protein sequences of jasmonate pathway genes from the model plant Arabidopsis thaliana were identified in the National Center for Biotechnology Information (NCBI) database. These protein sequences were used to search the web-based database Coffee Genome Hub to identify homologous protein sequences in the C. canephora genome using the Basic Local Alignment Search Tool (BLAST). Homologous protein sequences for key genes were identified in the C. canephora genome database. Protein sequences of the top matches were in turn used to search the NCBI database using BLAST to confirm the identity of the selected proteins and to identify closely related genes in species. The protein sequences from the C. canephora database and the top matches in NCBI were aligned, phylogenetic trees were constructed using MEGA6 software, and the genetic distances of the respective genes were identified. The study identified the four key genes of the JA pathway in C. canephora, confirming the conserved nature of the pathway in coffee. The study is expected to be useful for further exploring the defense mechanisms of coffee plants. JA is a plant hormone that plays an important role in plant defense against insect pests. Genes coding for the 4 key enzymes involved in the production of JA, viz., LOX, AOS, AOC, and OPR, are identified in C. canephora (robusta coffee) by bioinformatic approaches, confirming the conserved nature of the pathway in coffee. The findings are useful for understanding the defense mechanisms of C. canephora and for coffee breeding in the long run. JA is a plant hormone that plays an important role in plant defense against insect pests. Genes coding for the 4 key enzymes involved in the production of JA, viz., LOX, AOS, AOC, and OPR, were identified and analyzed in C. canephora (robusta coffee) by an in silico approach. The study has confirmed the conserved nature of the JA pathway in coffee; the findings are useful for further exploring the defense mechanisms of coffee plants. Abbreviations used: C. canephora: Coffea canephora; C. arabica: Coffea arabica; JA: Jasmonic acid; CGH: Coffee Genome Hub; NCBI: National Centre for Biotechnology Information; BLAST: Basic Local Alignment Search Tool; A. thaliana: Arabidopsis thaliana; LOX: Lipoxygenase; AOS: Allene oxide synthase; AOC: Allene oxide cyclase; OPR: 12-oxo-phytodienoic reductase.
Geometric database maintenance using CCTV cameras and overlay graphics

NASA Astrophysics Data System (ADS)

Oxenberg, Sheldon C.; Landell, B. Patrick; Kan, Edwin

1988-01-01

An interactive graphics system using closed circuit television (CCTV) cameras for remote verification and maintenance of a geometric world model database has been demonstrated in GE's telerobotics testbed. The database provides geometric models and locations of objects viewed by CCTV cameras and manipulated by telerobots. To update the database, an operator uses the interactive graphics system to superimpose a wireframe line drawing of an object with known dimensions on a live video scene containing that object. The methodology used is multipoint positioning to easily superimpose a wireframe graphic on the CCTV image of an object in the work scene. An enhanced version of GE's interactive graphics system will provide the object designation function for the operator control station of the Jet Propulsion Laboratory's telerobot demonstration system.
Bio-crude transcriptomics: gene discovery and metabolic network reconstruction for the biosynthesis of the terpenome of the hydrocarbon oil-producing green alga, Botryococcus braunii race B (Showa).

PubMed

Molnár, István; Lopez, David; Wisecaver, Jennifer H; Devarenne, Timothy P; Weiss, Taylor L; Pellegrini, Matteo; Hackett, Jeremiah D

2012-10-30

Microalgae hold promise for yielding a biofuel feedstock that is sustainable, carbon-neutral, distributed, and only minimally disruptive for the production of food and feed by traditional agriculture. Amongst oleaginous eukaryotic algae, the B race of Botryococcus braunii is unique in that it produces large amounts of liquid hydrocarbons of terpenoid origin. These are comparable to fossil crude oil, and are sequestered outside the cells in a communal extracellular polymeric matrix material. Biosynthetic engineering of terpenoid bio-crude production requires identification of genes and reconstruction of metabolic pathways responsible for production of both hydrocarbons and other metabolites of the alga that compete for photosynthetic carbon and energy. A de novo assembly of 1,334,609 next-generation pyrosequencing reads from the Showa strain of the B race of B. braunii yielded a transcriptomic database of 46,422 contigs with an average length of 756 bp. Contigs were annotated with pathway, ontology, and protein domain identifiers. Manual curation allowed the reconstruction of pathways that produce terpenoid liquid hydrocarbons from primary metabolites, and pathways that divert photosynthetic carbon into tetraterpenoid carotenoids, diterpenoids, and the prenyl chains of meroterpenoid quinones and chlorophyll. Inventories of machine-assembled contigs are also presented for reconstructed pathways for the biosynthesis of competing storage compounds including triacylglycerol and starch. Regeneration of S-adenosylmethionine, and the extracellular localization of the hydrocarbon oils by active transport and possibly autophagy are also investigated. The construction of an annotated transcriptomic database, publicly available in a web-based data depository and annotation tool, provides a foundation for metabolic pathway and network reconstruction, and facilitates further omics studies in the absence of a genome sequence for the Showa strain of B. braunii, race B. 
Further, the transcriptome database empowers future biosynthetic engineering approaches for strain improvement and the transfer of desirable traits to heterologous hosts.
Transcriptomic analysis of flower development in wintersweet (Chimonanthus praecox).

PubMed

Liu, Daofeng; Sui, Shunzhao; Ma, Jing; Li, Zhineng; Guo, Yulong; Luo, Dengpan; Yang, Jianfeng; Li, Mingyang

2014-01-01

Wintersweet (Chimonanthus praecox) is familiar as a garden plant and woody ornamental flower. On account of its unique flowering time and strong fragrance, it has a high ornamental and economic value. Despite a long history of human cultivation, our understanding of wintersweet genetics and molecular biology remains scant, reflecting a lack of basic genomic and transcriptomic data. In this study, we assembled three cDNA libraries, from three successive stages in flower development, designated as the flower bud with displayed petal, open flower and senescing flower stages. Using the Illumina RNA-Seq method, we obtained 21,412,928, 26,950,404, 24,912,954 qualified Illumina reads, respectively, for the three successive stages. The pooled reads from all three libraries were then assembled into 106,995 transcripts, 51,793 of which were annotated in the NCBI non-redundant protein database. Of these annotated sequences, 32,649 and 21,893 transcripts were assigned to gene ontology categories and clusters of orthologous groups, respectively. We could map 15,587 transcripts onto 312 pathways using the Kyoto Encyclopedia of Genes and Genomes pathway database. Based on these transcriptomic data, we obtained a large number of candidate genes that were differentially expressed at the open flower and senescing flower stages. An analysis of differentially expressed genes involved in plant hormone signal transduction pathways indicated that although flower opening and senescence may be independent of the ethylene signaling pathway in wintersweet, salicylic acid may be involved in the regulation of flower senescence. We also succeeded in isolating key genes of floral scent biosynthesis and proposed a biosynthetic pathway for monoterpenes and sesquiterpenes in wintersweet flowers, based on the annotated sequences. This comprehensive transcriptomic analysis presents fundamental information on the genes and pathways which are involved in flower development in wintersweet. 
And our data provided a useful database for further research of wintersweet and other Calycanthaceae family plants.
Bio-crude transcriptomics: Gene discovery and metabolic network reconstruction for the biosynthesis of the terpenome of the hydrocarbon oil-producing green alga, Botryococcus braunii race B (Showa)

PubMed Central

2012-01-01

Background: Microalgae hold promise for yielding a biofuel feedstock that is sustainable, carbon-neutral, distributed, and only minimally disruptive for the production of food and feed by traditional agriculture. Amongst oleaginous eukaryotic algae, the B race of Botryococcus braunii is unique in that it produces large amounts of liquid hydrocarbons of terpenoid origin. These are comparable to fossil crude oil, and are sequestered outside the cells in a communal extracellular polymeric matrix material. Biosynthetic engineering of terpenoid bio-crude production requires identification of genes and reconstruction of metabolic pathways responsible for production of both hydrocarbons and other metabolites of the alga that compete for photosynthetic carbon and energy. Results: A de novo assembly of 1,334,609 next-generation pyrosequencing reads from the Showa strain of the B race of B. braunii yielded a transcriptomic database of 46,422 contigs with an average length of 756 bp. Contigs were annotated with pathway, ontology, and protein domain identifiers. Manual curation allowed the reconstruction of pathways that produce terpenoid liquid hydrocarbons from primary metabolites, and pathways that divert photosynthetic carbon into tetraterpenoid carotenoids, diterpenoids, and the prenyl chains of meroterpenoid quinones and chlorophyll. Inventories of machine-assembled contigs are also presented for reconstructed pathways for the biosynthesis of competing storage compounds including triacylglycerol and starch. Regeneration of S-adenosylmethionine, and the extracellular localization of the hydrocarbon oils by active transport and possibly autophagy are also investigated. Conclusions: The construction of an annotated transcriptomic database, publicly available in a web-based data depository and annotation tool, provides a foundation for metabolic pathway and network reconstruction, and facilitates further omics studies in the absence of a genome sequence for the Showa strain of B. braunii, race B. Further, the transcriptome database empowers future biosynthetic engineering approaches for strain improvement and the transfer of desirable traits to heterologous hosts. PMID:23110428
Transcriptomic Analysis of Flower Development in Wintersweet (Chimonanthus praecox)

PubMed Central

Liu, Daofeng; Sui, Shunzhao; Ma, Jing; Li, Zhineng; Guo, Yulong; Luo, Dengpan; Yang, Jianfeng; Li, Mingyang

2014-01-01

Wintersweet (Chimonanthus praecox) is familiar as a garden plant and woody ornamental flower. On account of its unique flowering time and strong fragrance, it has a high ornamental and economic value. Despite a long history of human cultivation, our understanding of wintersweet genetics and molecular biology remains scant, reflecting a lack of basic genomic and transcriptomic data. In this study, we assembled three cDNA libraries, from three successive stages in flower development, designated as the flower bud with displayed petal, open flower and senescing flower stages. Using the Illumina RNA-Seq method, we obtained 21,412,928, 26,950,404, 24,912,954 qualified Illumina reads, respectively, for the three successive stages. The pooled reads from all three libraries were then assembled into 106,995 transcripts, 51,793 of which were annotated in the NCBI non-redundant protein database. Of these annotated sequences, 32,649 and 21,893 transcripts were assigned to gene ontology categories and clusters of orthologous groups, respectively. We could map 15,587 transcripts onto 312 pathways using the Kyoto Encyclopedia of Genes and Genomes pathway database. Based on these transcriptomic data, we obtained a large number of candidate genes that were differentially expressed at the open flower and senescing flower stages. An analysis of differentially expressed genes involved in plant hormone signal transduction pathways indicated that although flower opening and senescence may be independent of the ethylene signaling pathway in wintersweet, salicylic acid may be involved in the regulation of flower senescence. We also succeeded in isolating key genes of floral scent biosynthesis and proposed a biosynthetic pathway for monoterpenes and sesquiterpenes in wintersweet flowers, based on the annotated sequences. This comprehensive transcriptomic analysis presents fundamental information on the genes and pathways which are involved in flower development in wintersweet. 
And our data provided a useful database for further research of wintersweet and other Calycanthaceae family plants. PMID:24489818
Gramene 2016: comparative plant genomics and pathway resources.

PubMed

Tello-Ruiz, Marcela K; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A; Huerta, Laura; Keays, Maria; Tang, Y Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J; Jaiswal, Pankaj; Ware, Doreen

2016-01-04

Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ~200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.
