Liu, Li-Zhi; Wu, Fang-Xiang; Zhang, Wen-Jun
2014-01-01
As an abstract mapping of the gene regulations in the cell, a gene regulatory network is important to both biological research and practical applications. The reverse engineering of gene regulatory networks from microarray gene expression data is a challenging research problem in systems biology. With the development of biological technologies, multiple time-course gene expression datasets might be collected for a specific gene network under different circumstances. The inference of a gene regulatory network can be improved by integrating these multiple datasets. It is also known that gene expression data may be contaminated with large errors or outliers, which may affect the inference results. A novel method, Huber group LASSO, is proposed to infer the same underlying network topology from multiple time-course gene expression datasets as well as to take the robustness to large errors or outliers into account. To solve the optimization problem involved in the proposed method, an efficient algorithm which combines the ideas of auxiliary function minimization and block descent is developed. A stability selection method is adapted to our method to find a network topology consisting of edges with scores. The proposed method is applied to both simulation datasets and real experimental datasets. The results show that Huber group LASSO outperforms the group LASSO in terms of both areas under receiver operating characteristic curves and areas under the precision-recall curves. The convergence analysis of the algorithm theoretically shows that the sequence generated from the algorithm converges to the optimal solution of the problem. The simulation and real data examples demonstrate the effectiveness of the Huber group LASSO in integrating multiple time-course gene expression datasets and improving the resistance to large errors or outliers.
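The two ingredients named in the abstract above, a Huber loss and a group LASSO penalty, can be illustrated with a minimal sketch. The function names, the threshold `delta` and the penalty weight `lam` are assumptions chosen for illustration; the abstract does not give the authors' exact objective or their auxiliary-function/block-descent updates, so this shows only the standard textbook building blocks.

```python
import math

def huber_loss(residual, delta):
    """Standard Huber loss: quadratic near zero, linear in the tails.

    Large residuals (possible outliers) are penalized linearly, so they
    influence the fit less than under a squared-error loss.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)

def group_soft_threshold(group, lam):
    """Proximal step for a group LASSO penalty lam * ||group||_2.

    The whole coefficient group is shrunk toward zero together and is set
    exactly to zero when its Euclidean norm does not exceed lam. Treating
    one edge's coefficients across all datasets as a group (an assumption
    here) makes the edge appear in all datasets or in none.
    """
    norm = math.sqrt(sum(v * v for v in group))
    if norm <= lam:
        return [0.0] * len(group)
    scale = 1.0 - lam / norm
    return [scale * v for v in group]

# Hypothetical coefficients of one candidate edge in two datasets.
print(huber_loss(3.0, 1.0))                 # linear regime
print(group_soft_threshold([3.0, 4.0], 1.0))  # shrunk but kept
print(group_soft_threshold([0.3, 0.4], 1.0))  # removed as a group
```

The grouping is what ties the datasets to a single topology: an edge is either retained in every dataset or dropped from all of them.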
Wu, Mengmeng; Zeng, Wanwen; Liu, Wenqiang; Lv, Hairong; Chen, Ting; Jiang, Rui
2018-06-03
Genome-wide association studies (GWAS) have successfully discovered a number of disease-associated genetic variants in the past decade, providing an unprecedented opportunity for deciphering the genetic basis of human inherited diseases. However, it is still a challenging task to extract biological knowledge from the GWAS data, due to such issues as missing heritability and weak interpretability. Indeed, the fact that the majority of discovered loci fall into noncoding regions without clear links to genes has been preventing the characterization of their functions and calls for a sophisticated approach to bridge genetic and genomic studies. Towards this problem, network-based prioritization of candidate genes, which performs integrated analysis of gene networks with GWAS data, has emerged as a promising direction and attracted much attention. However, most existing methods overlook the sparse and noisy properties of gene networks and thus may lead to suboptimal performance. Motivated by this understanding, we proposed a novel method called REGENT for integrating multiple gene networks with GWAS data to prioritize candidate genes for complex diseases. We leveraged a technique called network representation learning to embed a gene network into a compact and robust feature space, and then designed a hierarchical statistical model to integrate features of multiple gene networks with GWAS data for the effective inference of genes associated with a disease of interest. We applied our method to six complex diseases and demonstrated the superior performance of REGENT over existing approaches in recovering known disease-associated genes. We further conducted a pathway analysis and showed the ability of REGENT to discover disease-associated pathways. We expect to see applications of our method to a broad spectrum of diseases for post-GWAS analysis. REGENT is freely available at https://github.com/wmmthu/REGENT. Copyright © 2018 Elsevier Inc. All rights reserved.
"Gene expression network" is the term used to describe the interplay, simple or complex, between two or more gene products in performing a specific cellular function. Although the delineation of such networks is complicated by the existence of multiple and subtle types of intera...
Hadley, Dexter; Wu, Zhi-liang; Kao, Charlly; Kini, Akshata; Mohamed-Hadley, Alisha; Thomas, Kelly; Vazquez, Lyam; Qiu, Haijun; Mentch, Frank; Pellegrino, Renata; Kim, Cecilia; Connolly, John; Pinto, Dalila; Merikangas, Alison; Klei, Lambertus; Vorstman, Jacob A.S.; Thompson, Ann; Regan, Regina; Pagnamenta, Alistair T.; Oliveira, Bárbara; Magalhaes, Tiago R.; Gilbert, John; Duketis, Eftichia; De Jonge, Maretha V.; Cuccaro, Michael; Correia, Catarina T.; Conroy, Judith; Conceição, Inês C.; Chiocchetti, Andreas G.; Casey, Jillian P.; Bolshakova, Nadia; Bacchelli, Elena; Anney, Richard; Zwaigenbaum, Lonnie; Wittemeyer, Kerstin; Wallace, Simon; Engeland, Herman van; Soorya, Latha; Rogé, Bernadette; Roberts, Wendy; Poustka, Fritz; Mouga, Susana; Minshew, Nancy; McGrew, Susan G.; Lord, Catherine; Leboyer, Marion; Le Couteur, Ann S.; Kolevzon, Alexander; Jacob, Suma; Guter, Stephen; Green, Jonathan; Green, Andrew; Gillberg, Christopher; Fernandez, Bridget A.; Duque, Frederico; Delorme, Richard; Dawson, Geraldine; Café, Cátia; Brennan, Sean; Bourgeron, Thomas; Bolton, Patrick F.; Bölte, Sven; Bernier, Raphael; Baird, Gillian; Bailey, Anthony J.; Anagnostou, Evdokia; Almeida, Joana; Wijsman, Ellen M.; Vieland, Veronica J.; Vicente, Astrid M.; Schellenberg, Gerard D.; Pericak-Vance, Margaret; Paterson, Andrew D.; Parr, Jeremy R.; Oliveira, Guiomar; Almeida, Joana; Café, Cátia; Mouga, Susana; Correia, Catarina; Nurnberger, John I.; Monaco, Anthony P.; Maestrini, Elena; Klauck, Sabine M.; Hakonarson, Hakon; Haines, Jonathan L.; Geschwind, Daniel H.; Freitag, Christine M.; Folstein, Susan E.; Ennis, Sean; Coon, Hilary; Battaglia, Agatino; Szatmari, Peter; Sutcliffe, James S.; Hallmayer, Joachim; Gill, Michael; Cook, Edwin H.; Buxbaum, Joseph D.; Devlin, Bernie; Gallagher, Louise; Betancur, Catalina; Scherer, Stephen W.; Glessner, Joseph; Hakonarson, Hakon
2014-01-01
Although multiple reports show that defective genetic networks underlie the aetiology of autism, few have translated into pharmacotherapeutic opportunities. Since drugs compete with endogenous small molecules for protein binding, many successful drugs target large gene families with multiple drug binding sites. Here we search for defective gene family interaction networks (GFINs) in 6,742 patients with the ASDs relative to 12,544 neurologically normal controls, to find potentially druggable genetic targets. We find significant enrichment of structural defects (P≤2.40E−09, 1.8-fold enrichment) in the metabotropic glutamate receptor (GRM) GFIN, previously observed to impact attention deficit hyperactivity disorder (ADHD) and schizophrenia. Also, the MXD-MYC-MAX network of genes, previously implicated in cancer, is significantly enriched (P≤3.83E−23, 2.5-fold enrichment), as is the calmodulin 1 (CALM1) gene interaction network (P≤4.16E−04, 14.4-fold enrichment), which regulates voltage-independent calcium-activated action potentials at the neuronal synapse. We find that multiple defective gene family interactions underlie autism, presenting new translational opportunities to explore for therapeutic interventions. PMID:24927284
Network Analysis of Rodent Transcriptomes in Spaceflight
NASA Technical Reports Server (NTRS)
Ramachandran, Maya; Fogle, Homer; Costes, Sylvain
2017-01-01
Network analysis methods leverage prior knowledge of cellular systems and the statistical and conceptual relationships between analyte measurements to determine gene connectivity. Correlation and conditional metrics are used to infer a network topology and provide a systems-level context for cellular responses. Integration across multiple experimental conditions and omics domains can reveal the regulatory mechanisms that underlie gene expression. GeneLab has assembled rich multi-omic (transcriptomics, proteomics, epigenomics, and epitranscriptomics) datasets for multiple murine tissues from the Rodent Research 1 (RR-1) experiment. RR-1 assesses the impact of 37 days of spaceflight on gene expression across a variety of tissue types, such as adrenal glands, quadriceps, gastrocnemius, tibialis anterior, extensor digitorum longus, soleus, eye, and kidney. Network analysis is particularly useful for RR-1 -omics datasets because it reinforces subtle relationships that may be overlooked in isolated analyses and subdues confounding factors. Our objective is to use network analysis to determine potential target nodes for therapeutic intervention and identify similarities with existing disease models. Multiple network algorithms are used for a higher confidence consensus.
Trainable Gene Regulation Networks with Applications to Drosophila Pattern Formation
NASA Technical Reports Server (NTRS)
Mjolsness, Eric
2000-01-01
This chapter will very briefly introduce and review some computational experiments in using trainable gene regulation network models to simulate and understand selected episodes in the development of the fruit fly, Drosophila melanogaster. For details the reader is referred to the papers introduced below. It will then introduce a new gene regulation network model which can describe promoter-level substructure in gene regulation. As described in chapter 2, gene regulation may be thought of as a combination of cis-acting regulation by the extended promoter of a gene (including all regulatory sequences) by way of the transcription complex, and of trans-acting regulation by the transcription factor products of other genes. If we simplify the cis-action by using a phenomenological model which can be tuned to data, such as a unit or other small portion of an artificial neural network, then the full trans-acting interaction between multiple genes during development can be modelled as a larger network which can again be tuned or trained to data. The larger network will in general need to have recurrent (feedback) connections since at least some real gene regulation networks do. This is the basic modeling approach taken, which describes how a set of recurrent neural networks can be used as a modeling language for multiple developmental processes including gene regulation within a single cell, cell-cell communication, and cell division. Such network models have been called "gene circuits", "gene regulation networks", or "genetic regulatory networks", sometimes without distinguishing the models from the actual modeled systems.
Safari-Alighiarloo, Nahid; Taghizadeh, Mohammad; Tabatabaei, Seyyed Mohammad; Namaki, Saeed
2016-01-01
Background: The involvement of multiple genes and missing heritability, which are dominant in complex diseases such as multiple sclerosis (MS), entail using network biology to better elucidate their molecular basis and genetic factors. We therefore aimed to integrate interactome (protein–protein interaction (PPI)) and transcriptome data to construct and analyze PPI networks for MS disease. Methods: Gene expression profiles in paired cerebrospinal fluid (CSF) and peripheral blood mononuclear cells (PBMCs) samples from MS patients, sampled in relapse or remission, and controls were analyzed. Differentially expressed genes that were determined only in CSF (MS vs. control) and PBMCs (relapse vs. remission) were separately integrated with PPI data to construct the Query-Query PPI (QQPPI) networks. The networks were further analyzed to investigate more central genes, functional modules and complexes involved in MS progression. Results: The networks were analyzed and high centrality genes were identified. Exploration of functional modules and complexes showed that the majority of high centrality genes were incorporated in biological pathways driving MS pathogenesis. Proteasome and spliceosome were also noticeable in enriched pathways in PBMCs (relapse vs. remission), which were identified by both modularity and clique analyses. Finally, STK4, RB1, CDKN1A, CDK1, RAC1, EZH2, SDCBP genes in CSF (MS vs. control) and CDC37, MAP3K3, MYC genes in PBMCs (relapse vs. remission) were identified as potential candidate genes for MS, which were the more central genes involved in biological pathways. Discussion: This study showed that network-based analysis could explicate the complex interplay between biological processes underlying MS. Furthermore, an experimental validation of candidate genes can lead to identification of potential therapeutic targets. PMID:28028462
Integration of multi-omics data for integrative gene regulatory network inference.
Zarayeneh, Neda; Ko, Euiseong; Oh, Jung Hun; Suh, Sang; Liu, Chunyu; Gao, Jean; Kim, Donghyun; Kang, Mingon
2017-01-01
Gene regulatory networks provide comprehensive insights and in-depth understanding of complex biological processes. In most research, the molecular interactions of gene regulatory networks are inferred from a single type of genomic data, e.g., gene expression data. However, gene expression is a product of sequential interactions of multiple biological processes, such as DNA sequence variations, copy number variations, histone modifications, transcription factors, and DNA methylations. The recent rapid advances of high-throughput omics technologies enable one to measure multiple types of omics data, called 'multi-omics data', that represent the various biological processes. In this paper, we propose an Integrative Gene Regulatory Network inference method (iGRN) that incorporates multi-omics data and their interactions in gene regulatory networks. In addition to gene expressions, copy number variations and DNA methylations were considered for multi-omics data in this paper. Intensive experiments were carried out with simulation data, in which iGRN's capability to infer the integrative gene regulatory network was assessed. Through the experiments, iGRN shows better performance on model representation and interpretation than other integrative methods in gene regulatory network inference. iGRN was also applied to a human brain dataset of psychiatric disorders, and the biological network of psychiatric disorders was analysed.
Integration of multi-omics data for integrative gene regulatory network inference
Zarayeneh, Neda; Ko, Euiseong; Oh, Jung Hun; Suh, Sang; Liu, Chunyu; Gao, Jean; Kim, Donghyun
2017-01-01
Gene regulatory networks provide comprehensive insights and in-depth understanding of complex biological processes. In most research, the molecular interactions of gene regulatory networks are inferred from a single type of genomic data, e.g., gene expression data. However, gene expression is a product of sequential interactions of multiple biological processes, such as DNA sequence variations, copy number variations, histone modifications, transcription factors, and DNA methylations. The recent rapid advances of high-throughput omics technologies enable one to measure multiple types of omics data, called ‘multi-omics data’, that represent the various biological processes. In this paper, we propose an Integrative Gene Regulatory Network inference method (iGRN) that incorporates multi-omics data and their interactions in gene regulatory networks. In addition to gene expressions, copy number variations and DNA methylations were considered for multi-omics data in this paper. Intensive experiments were carried out with simulation data, in which iGRN’s capability to infer the integrative gene regulatory network was assessed. Through the experiments, iGRN shows better performance on model representation and interpretation than other integrative methods in gene regulatory network inference. iGRN was also applied to a human brain dataset of psychiatric disorders, and the biological network of psychiatric disorders was analysed. PMID:29354189
Constructing an integrated gene similarity network for the identification of disease genes.
Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin
2017-09-20
Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and cover less than half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We proposed a novel method, named RWRB, to infer causal genes of diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, the IGSN and the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation methods in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB benefits from the IGSN, which has a wider coverage and higher reliability compared with current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that are supported by literature. RWRB is an effective and reliable algorithm in prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/ .
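The random walk with restart step mentioned in the abstract above can be sketched on a toy graph. This is a generic version on a single unweighted network, not the RWRB bilayer formulation; the node names, the restart probability of 0.7 and the adjacency-list input are all assumptions made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adj   : dict mapping each node to a non-empty list of neighbours
            (undirected, so every edge is listed in both directions)
    seeds : set of nodes where the walker restarts (e.g. known disease genes)
    Returns a dict of steady-state visiting probabilities.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # Each node passes its probability mass equally to its neighbours.
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p

# Hypothetical toy network: A is the seed; B and C touch it directly, D does not.
toy = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
scores = random_walk_with_restart(toy, {"A"})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

Candidates are then ranked by their steady-state probability, so nodes closer to the seeds in the network receive higher priority.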
Yu, Bowen; Doraiswamy, Harish; Chen, Xi; Miraldi, Emily; Arrieta-Ortiz, Mario Luis; Hafemeister, Christoph; Madar, Aviv; Bonneau, Richard; Silva, Cláudio T
2014-12-01
Elucidation of transcriptional regulatory networks (TRNs) is a fundamental goal in biology, and one of the most important components of TRNs are transcription factors (TFs), proteins that specifically bind to gene promoter and enhancer regions to alter target gene expression patterns. Advances in genomic technologies as well as advances in computational biology have led to multiple large regulatory network models (directed networks) each with a large corpus of supporting data and gene-annotation. There are multiple possible biological motivations for exploring large regulatory network models, including: validating TF-target gene relationships, figuring out co-regulation patterns, and exploring the coordination of cell processes in response to changes in cell state or environment. Here we focus on queries aimed at validating regulatory network models, and on coordinating visualization of primary data and directed weighted gene regulatory networks. The large size of both the network models and the primary data can make such coordinated queries cumbersome with existing tools and, in particular, inhibits the sharing of results between collaborators. In this work, we develop and demonstrate a web-based framework for coordinating visualization and exploration of expression data (RNA-seq, microarray), network models and gene-binding data (ChIP-seq). Using specialized data structures and multiple coordinated views, we design an efficient querying model to support interactive analysis of the data. Finally, we show the effectiveness of our framework through case studies for the mouse immune system (a dataset focused on a subset of key cellular functions) and a model bacterium (a small genome with high data-completeness).
Construct and Compare Gene Coexpression Networks with DAPfinder and DAPview.
Skinner, Jeff; Kotliarov, Yuri; Varma, Sudhir; Mine, Karina L; Yambartsev, Anatoly; Simon, Richard; Huyen, Yentram; Morgun, Andrey
2011-07-14
DAPfinder and DAPview are novel BRB-ArrayTools plug-ins to construct gene coexpression networks and identify significant differences in pairwise gene-gene coexpression between two phenotypes. Each significant difference in gene-gene association represents a Differentially Associated Pair (DAP). Our tools include several choices of filtering methods, gene-gene association metrics, statistical testing methods and multiple comparison adjustments. Network results are easily displayed in Cytoscape. Analyses of glioma experiments and microarray simulations demonstrate the utility of these tools. DAPfinder is a new user-friendly tool for reconstruction and comparison of biological networks.
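The idea of a Differentially Associated Pair can be illustrated with one possible choice of metric and test: Pearson correlation compared between phenotypes through Fisher's z-transform. The abstract lists several metrics and tests without naming them, so this particular pairing, the function names and the expression values below are assumptions, not the plug-in's implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def differential_association(x1, y1, x2, y2):
    """Compare the gene-gene correlation in phenotype 1 against phenotype 2.

    Uses Fisher's z-transform; requires more than 3 samples per phenotype
    and correlations strictly between -1 and 1.
    Returns (r1, r2, two-sided p-value under a normal approximation).
    """
    r1, r2 = pearson(x1, y1), pearson(x2, y2)
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (len(x1) - 3) + 1.0 / (len(x2) - 3))
    stat = (z1 - z2) / se
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return r1, r2, p_value

# Hypothetical expression of two genes: co-regulated in phenotype 1,
# anti-correlated in phenotype 2.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b_pheno1 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
gene_b_pheno2 = [6.0, 5.2, 3.9, 3.1, 2.0, 0.8]
print(differential_association(gene_a, gene_b_pheno1, gene_a, gene_b_pheno2))
```

In a genome-wide setting this test would be run for every gene pair, which is why the multiple comparison adjustments mentioned in the abstract are needed before calling a pair significant.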
Deng, Wenping; Zhang, Kui; Liu, Sanzhen; Zhao, Patrick; Xu, Shizhong; Wei, Hairong
2018-04-30
Joint reconstruction of multiple gene regulatory networks (GRNs) using gene expression data from multiple tissues/conditions is very important for understanding common and tissue/condition-specific regulation. However, there are currently no computational models and methods available for directly constructing such multiple GRNs that not only share some common hub genes but also possess tissue/condition-specific regulatory edges. In this paper, we proposed a new graphical Gaussian model for joint reconstruction of multiple gene regulatory networks (JRmGRN), which highlighted hub genes, using gene expression data from several tissues/conditions. Under the framework of the Gaussian graphical model, the JRmGRN method constructs the GRNs through maximizing a penalized log likelihood function. We formulated it as a convex optimization problem, and then solved it with an alternating direction method of multipliers (ADMM) algorithm. The performance of JRmGRN was first evaluated with synthetic data and the results showed that JRmGRN outperformed several other methods for reconstruction of GRNs. We also applied our method to real Arabidopsis thaliana RNA-seq data from two light regime conditions in comparison with other methods, and both common hub genes and some condition-specific hub genes were identified with higher accuracy and precision. JRmGRN is available as an R program from: https://github.com/wenpingd. Contact: hairong@mtu.edu. Proof of theorem, derivation of algorithm and supplementary data are available at Bioinformatics online.
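For orientation, a penalized log-likelihood for jointly estimating K Gaussian graphical models usually takes the generic form below. The symbols are assumptions introduced here: \Theta_k is the precision (inverse covariance) matrix for condition k, S_k its sample covariance matrix, n_k its sample size, and P_\lambda a convex penalty. The abstract does not state the specific penalty JRmGRN uses to separate hub genes from condition-specific edges, so P_\lambda is left unspecified.

```latex
\max_{\Theta_1,\dots,\Theta_K \succ 0}\;
\sum_{k=1}^{K} n_k \left[ \log\det\Theta_k - \operatorname{tr}\!\left(S_k \Theta_k\right) \right]
\;-\; P_{\lambda}\!\left(\Theta_1,\dots,\Theta_K\right)
```

A nonzero off-diagonal entry of \Theta_k is read as an edge in the network for condition k. Because the log-determinant term is concave and P_\lambda is convex, the problem is a convex optimization problem, which is what makes an ADMM solver applicable.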
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.
Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan
2014-01-01
One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available.
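The rankwise averaging of predictions described in the abstract above can be sketched as follows. The edge scores, gene names and tie-breaking rule are assumptions for illustration; the abstract does not specify how NIMEFI handles ties or missing edges, and the scores here stand in for feature importances produced by any two hypothetical methods.

```python
def rank_edges(scores):
    """Turn edge scores into ranks (1 = strongest putative regulatory link).

    Ties are broken by edge name so the result is deterministic.
    """
    ordered = sorted(scores, key=lambda edge: (-scores[edge], edge))
    return {edge: rank for rank, edge in enumerate(ordered, start=1)}

def rankwise_average(score_tables):
    """Average the per-method ranks of every edge.

    score_tables : list of dicts, one per inference method, all scoring
                   the same set of (regulator, target) edges.
    A lower averaged rank means stronger combined support for the edge.
    """
    rank_tables = [rank_edges(table) for table in score_tables]
    edges = rank_tables[0].keys()
    return {
        edge: sum(table[edge] for table in rank_tables) / len(rank_tables)
        for edge in edges
    }

# Hypothetical importance scores from two different methods.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.4, ("g1", "g3"): 0.1, ("g2", "g3"): 0.8}
print(rankwise_average([method_a, method_b]))
```

Working on ranks rather than raw scores sidesteps the fact that importance values from different algorithms are on incomparable scales.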
2012-01-01
Visualization and analysis of molecular networks are both central to systems biology. However, there still exists a large technological gap between them, especially when assessing multiple network levels or hierarchies. Here we present RedeR, an R/Bioconductor package combined with a Java core engine for representing modular networks. The functionality of RedeR is demonstrated in two different scenarios: hierarchical and modular organization in gene co-expression networks and nested structures in time-course gene expression subnetworks. Our results demonstrate RedeR as a new framework to deal with the multiple network levels that are inherent to complex biological systems. RedeR is available from http://bioconductor.org/packages/release/bioc/html/RedeR.html. PMID:22531049
Gregoretti, Francesco; Belcastro, Vincenzo; di Bernardo, Diego; Oliva, Gennaro
2010-04-21
The reverse engineering of gene regulatory networks using gene expression profile data has become crucial to gain novel biological knowledge. Large amounts of data that need to be analyzed are currently being produced due to advances in microarray technologies. Using current reverse engineering algorithms to analyze large data sets can be very computationally intensive. These emerging computational requirements can be met using parallel computing techniques. It has been shown that the Network Identification by multiple Regression (NIR) algorithm performs better than the other ready-to-use reverse engineering software. However, it cannot be used with large networks with thousands of nodes--as is the case in biological networks--due to the high time and space complexity. In this work we overcome this limitation by designing and developing a parallel version of the NIR algorithm. The new implementation of the algorithm reaches a very good accuracy even for large gene networks, improving our understanding of the gene regulatory networks that is crucial for a wide range of biomedical applications.
Ji, S C; Pan, Y T; Lu, Q Y; Sun, Z Y; Liu, Y Z
2014-03-17
The purpose of this study was to identify critical genes associated with septic multiple trauma by comparing peripheral whole blood samples from multiple trauma patients with and without sepsis. A microarray data set was downloaded from the Gene Expression Omnibus (GEO) database. This data set included 70 samples, 36 from multiple trauma patients with sepsis and 34 from multiple trauma patients without sepsis (as a control set). The data were preprocessed, and differentially expressed genes (DEGs) were then screened for using packages of the R language. Functional analysis of DEGs was performed with DAVID. Interaction networks were then established for the most up- and down-regulated genes using HitPredict. Pathway-enrichment analysis was conducted for genes in the networks using WebGestalt. Fifty-eight DEGs were identified. The expression levels of PLAU (down-regulated) and MMP8 (up-regulated) presented the largest fold-changes, and interaction networks were established for these genes. Further analysis revealed that PLAT (plasminogen activator, tissue) and SERPINF2 (serpin peptidase inhibitor, clade F, member 2), which interact with PLAU, play important roles in the pathway of the complement and coagulation cascade. We hypothesize that PLAU is a major regulator of the complement and coagulation cascade, and down-regulation of PLAU results in dysfunction of the pathway, causing sepsis.
Cooperation and coexpression: How coexpression networks shift in response to multiple mutualists.
Palakurty, Sathvik X; Stinchcombe, John R; Afkhami, Michelle E
2018-04-01
A mechanistic understanding of community ecology requires tackling the nonadditive effects of multispecies interactions, a challenge that necessitates integration of ecological and molecular complexity, namely moving beyond pairwise ecological interaction studies and the "gene at a time" approach to mechanism. Here, we investigate the consequences of multispecies mutualisms for the structure and function of genomewide differential coexpression networks for the first time, using the tractable and ecologically important interaction between legume Medicago truncatula, rhizobia and mycorrhizal fungi. First, we found that genes whose expression is affected nonadditively by multiple mutualists are more highly connected in gene networks than expected by chance and had 94% greater network centrality than genes showing additive effects, suggesting that nonadditive genes may be key players in the widespread transcriptomic responses to multispecies symbioses. Second, multispecies mutualisms substantially changed coexpression network structure of 18 modules of host plant genes and 22 modules of the fungal symbionts' genes, indicating that third-party mutualists can cause significant rewiring of plant and fungal molecular networks. Third, we found that 60% of the coexpressed gene sets that explained variation in plant performance had coexpression structures that were altered by interactive effects of rhizobia and fungi. Finally, an "across-symbiosis" approach identified sets of plant and mycorrhizal genes whose coexpression structure was unique to the multiple mutualist context and suggested coupled responses across the plant-mycorrhizal interaction to rhizobial mutualists. Taken together, these results show multispecies mutualisms have substantial effects on the molecular interactions in host plants, microbes and across symbiotic boundaries. © 2018 John Wiley & Sons Ltd.
Blatti, Charles; Sinha, Saurabh
2016-07-15
Analysis of co-expressed gene sets typically involves testing for enrichment of different annotations or 'properties' such as biological processes, pathways, transcription factor binding sites, etc., one property at a time. This common approach ignores any known relationships among the properties or the genes themselves. It is believed that known biological relationships among genes and their many properties may be exploited to more accurately reveal commonalities of a gene set. Previous work has sought to achieve this by building biological networks that combine multiple types of gene-gene or gene-property relationships, and performing network analysis to identify other genes and properties most relevant to a given gene set. Most existing network-based approaches for recognizing genes or annotations relevant to a given gene set collapse information about different properties to simplify (homogenize) the networks. We present a network-based method for ranking genes or properties related to a given gene set. Such related genes or properties are identified from among the nodes of a large, heterogeneous network of biological information. Our method involves a random walk with restarts, performed on an initial network with multiple node and edge types that preserve more of the original, specific property information than current methods that operate on homogeneous networks. In this first stage of our algorithm, we find the properties that are the most relevant to the given gene set and extract a subnetwork of the original network, comprising only these relevant properties. We then re-rank genes by their similarity to the given gene set, based on a second random walk with restarts, performed on the above subnetwork. We demonstrate the effectiveness of this algorithm for ranking genes related to Drosophila embryonic development and aggressive responses in the brains of social animals. DRaWR was implemented as an R package available at veda.cs.illinois.edu/DRaWR. 
Contact: blatti@illinois.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
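A random walk with restarts, as named in the abstract above, can be illustrated on a toy heterogeneous network. This is only a minimal sketch and not the DRaWR implementation: the node names (`g1`, `pA`, ...), the unit edge weights, the restart probability of 0.3 and the function `random_walk_with_restart` are all assumptions made for illustration, and only a single walk is shown rather than the two-stage procedure.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adj maps node -> {neighbour: weight}; every node is assumed to have at
    least one edge, otherwise probability mass would leak at that node.
    """
    nodes = sorted(adj)
    # Restart distribution: uniform over the query gene set.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    strength = {n: sum(adj[n].values()) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            for v, w in adj[u].items():
                # Each node spreads its mass in proportion to edge weight.
                nxt[v] += (1.0 - restart) * p[u] * w / strength[u]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network: genes g1..g3 linked to properties pA and pB.
edges = [("g1", "pA"), ("g2", "pA"), ("g2", "pB"), ("g3", "pB")]
adj = {}
for a, b in edges:
    adj.setdefault(a, {})[b] = 1.0
    adj.setdefault(b, {})[a] = 1.0

scores = random_walk_with_restart(adj, seeds={"g1"})
# Rank property nodes by their stationary probability.
ranked_properties = sorted(
    (n for n in scores if n.startswith("p")), key=scores.get, reverse=True
)
```

In this toy case the property directly attached to the query gene ranks first. A second walk restricted to the top-ranked properties would correspond loosely to the second stage described in the abstract.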
Convergent evolution of gene networks by single-gene duplications in higher eukaryotes.
Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich
2004-03-01
By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix-loop-helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks emerging through single-gene duplications, the dominant importance of molecular modularity in the bottom-up construction of complex biological entities, and the convergent evolution of networks.
Bagot, Rosemary C; Cates, Hannah M; Purushothaman, Immanuel; Lorsch, Zachary S; Walker, Deena M; Wang, Junshi; Huang, Xiaojie; Schlüter, Oliver M; Maze, Ian; Peña, Catherine J; Heller, Elizabeth A; Issler, Orna; Wang, Minghui; Song, Won-Min; Stein, Jason L; Liu, Xiaochuan; Doyle, Marie A; Scobie, Kimberly N; Sun, Hao Sheng; Neve, Rachael L; Geschwind, Daniel; Dong, Yan; Shen, Li; Zhang, Bin; Nestler, Eric J
2016-06-01
Depression is a complex, heterogeneous disorder and a leading contributor to the global burden of disease. Most previous research has focused on individual brain regions and genes contributing to depression. However, emerging evidence in humans and animal models suggests that dysregulated circuit function and gene expression across multiple brain regions drive depressive phenotypes. Here, we performed RNA sequencing on four brain regions from control animals and those susceptible or resilient to chronic social defeat stress at multiple time points. We employed an integrative network biology approach to identify transcriptional networks and key driver genes that regulate susceptibility to depressive-like symptoms. Further, we validated in vivo several key drivers and their associated transcriptional networks that regulate depression susceptibility and confirmed their functional significance at the levels of gene transcription, synaptic regulation, and behavior. Our study reveals novel transcriptional networks that control stress susceptibility and offers fundamentally new leads for antidepressant drug discovery. Copyright © 2016 Elsevier Inc. All rights reserved.
Fast Construction of Near Parsimonious Hybridization Networks for Multiple Phylogenetic Trees.
Mirzaei, Sajad; Wu, Yufeng
2016-01-01
Hybridization networks represent plausible evolutionary histories of species that are affected by reticulate evolutionary processes. An established computational problem on hybridization networks is constructing the most parsimonious hybridization network such that each of the given phylogenetic trees (called gene trees) is "displayed" in the network. There have been several previous approaches, including an exact method and several heuristics, for this NP-hard problem. However, the exact method is only applicable to a limited range of data, and heuristic methods can be less accurate and also slow sometimes. In this paper, we develop a new algorithm for constructing near parsimonious networks for multiple binary gene trees. This method is more efficient for large numbers of gene trees than previous heuristics. This new method also produces more parsimonious results on many simulated datasets as well as a real biological dataset than a previous method. We also show that our method produces topologically more accurate networks for many datasets.
Hormonal response to bidirectional selection on social behavior
USDA-ARS's Scientific Manuscript database
Behavior is a quantitative trait determined through the actions of multiple genes. These genes form pleiotropic networks that are sensitive to environmental variation and genetic background. One aspect of behavioral gene networks that is of special interest includes effects during early development....
NIMEFI: Gene Regulatory Network Inference using Multiple Ensemble Feature Importance Algorithms
Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan
2014-01-01
One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available. 
PMID:24667482
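The rank-wise averaging of predictions from several methods, mentioned in the NIMEFI abstract above, might look like the following. This is a minimal sketch under assumptions: the edge scores, the method names and the helpers `rank_scores` and `average_ranks` are hypothetical, and the published implementation may handle ties and normalisation differently.

```python
def rank_scores(scores):
    """Map each edge to its rank (1 = highest score); ties share the mean rank."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over a run of edges with identical scores.
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[ordered[k]] = mean_rank
        i = j + 1
    return ranks


def average_ranks(score_tables):
    """Average the per-method ranks of every edge (lower = stronger consensus)."""
    per_method = [rank_scores(table) for table in score_tables]
    return {
        edge: sum(r[edge] for r in per_method) / len(per_method)
        for edge in score_tables[0]
    }


# Hypothetical importance scores for (regulator, target) pairs from two methods.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.4, ("g1", "g3"): 0.1, ("g2", "g3"): 0.8}

consensus = average_ranks([method_a, method_b])
```

Averaging ranks rather than raw scores avoids having to put importance values from different algorithms on a common scale.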
Nigam, Deepti; Sawant, Samir V
2013-01-01
Technological development has led to increased interest in systems biology approaches in plants to characterize developmental mechanisms and candidate genes relevant to specific tissue or cell morphology. AUX-IAA proteins are important plant-specific putative transcription factors. There are several reports on the physiological response of this family in Arabidopsis, but in cotton fiber the transcriptional network through which AUX-IAA regulates its target genes is still unknown. In silico modelling of cotton fiber development-specific gene expression data (108 microarrays and 22,737 genes) using the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) reveals 3690 putative AUX-IAA target genes, of which 139 genes were known to be AUX-IAA co-regulated within Arabidopsis. Further, the AUX-IAA targeted gene regulatory network (GRN) had substantial impact on the transcriptional dynamics of cotton fiber, as shown by altered TF networks, Gene Ontology (GO) biological processes and metabolic pathways associated with its target genes. Analysis of the AUX-IAA-correlated gene network reveals multiple functions for AUX-IAA target genes such as unidimensional cell growth, cellular nitrogen compound metabolic process, nucleosome organization, DNA-protein complex and processes related to the cell wall. These candidate networks/pathways have a variety of profound impacts on such cellular functions as stress response, cell proliferation, and cell differentiation. While these functions are fairly broad, their underlying TF networks may provide a global view of AUX-IAA regulated gene expression and a GRN that guides future studies in understanding the role of AUX-IAA box protein and its targets in regulating fiber development. PMID:24497725
2012-01-01
Background: Understanding gene interactions is a fundamental question in systems biology. Currently, modeling of gene regulations using the Bayesian Network (BN) formalism assumes that genes interact either instantaneously or with a certain amount of time delay. However, in reality, biological regulations, both instantaneous and time-delayed, occur simultaneously. A framework that can detect and model both types of interactions simultaneously would represent gene regulatory networks more accurately. Results: In this paper, we introduce a framework based on the Bayesian Network (BN) formalism that can represent both instantaneous and time-delayed interactions between genes simultaneously. A novel scoring metric having firm mathematical underpinnings is also proposed that, unlike other recent methods, can score both interactions concurrently and takes into account the reality that multiple regulators can regulate a gene jointly, rather than in an isolated pair-wise manner. Further, a gene regulatory network (GRN) inference method employing an evolutionary search that makes use of the framework and the scoring metric is also presented. Conclusion: By taking into consideration the biological fact that both instantaneous and time-delayed regulations can occur among genes, our approach models gene interactions with greater accuracy. The proposed framework is efficient and can be used to infer gene networks having multiple orders of instantaneous and time-delayed regulations simultaneously. Experiments are carried out using three different synthetic networks (with three different mechanisms for generating synthetic data) as well as real life networks of Saccharomyces cerevisiae, E. coli and cyanobacteria gene expression data. The results show the effectiveness of our approach. PMID:22691450
Analysis of bHLH coding genes using gene co-expression network approach.
Srivastava, Swati; Sanchita; Singh, Garima; Singh, Noopur; Srivastava, Gaurava; Sharma, Ashok
2016-07-01
Network analysis provides a powerful framework for the interpretation of data. It uses novel reference network-based metrics for module evolution. These can be used to identify modules of highly connected genes showing variation in a co-expression network. In this study, a co-expression network-based approach was used for analyzing genes from microarray data. Our approach consists of a simple but robust rank-based network construction. Publicly available gene expression data of Solanum tuberosum under cold and heat stresses were used to create and analyze a gene co-expression network. The analysis provides a highly co-expressed module of bHLH coding genes based on correlation values. Our approach was to analyze the variation in gene expression over the time period of stress through the co-expression network. As a result, seed genes showing multiple connections with other genes in the same cluster were identified. The seed genes were found to vary across different time periods of stress. These seed genes may be used further as marker genes for developing stress-tolerant plant species.
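The abstract above does not specify how its rank-based network construction works, so the following is only one plausible reading: compute the Spearman rank correlation between gene profiles and keep pairs whose absolute correlation passes a threshold. The gene names, expression values, threshold of 0.9 and all function names are assumptions for illustration.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks of the values; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def coexpression_edges(expr, threshold=0.9):
    """Keep gene pairs whose absolute Spearman correlation reaches the threshold."""
    edges = {}
    for a, b in combinations(sorted(expr), 2):
        rho = spearman(expr[a], expr[b])
        if abs(rho) >= threshold:
            edges[(a, b)] = rho
    return edges


# Hypothetical expression profiles over five stress time points.
expr = {
    "bHLH1": [1, 2, 3, 4, 5],
    "bHLH2": [2, 4, 5, 9, 20],
    "geneX": [5, 3, 4, 1, 2],
}
edges = coexpression_edges(expr)
```

Working on ranks makes the edge set insensitive to monotone transformations of the expression values, which is one sense in which such a construction can be called robust.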
Genes uniquely expressed in human growth plate chondrocytes uncover a distinct regulatory network.
Li, Bing; Balasubramanian, Karthika; Krakow, Deborah; Cohn, Daniel H
2017-12-20
Chondrogenesis is the earliest stage of skeletal development and is a highly dynamic process, integrating the activities and functions of transcription factors, cell signaling molecules and extracellular matrix proteins. The molecular mechanisms underlying chondrogenesis have been extensively studied and multiple key regulators of this process have been identified. However, a genome-wide overview of the gene regulatory network in chondrogenesis has not been achieved. In this study, employing RNA sequencing, we identified 332 protein coding genes and 34 long non-coding RNA (lncRNA) genes that are highly selectively expressed in human fetal growth plate chondrocytes. Among the protein coding genes, 32 genes were associated with 62 distinct human skeletal disorders and 153 genes were associated with skeletal defects in knockout mice, confirming their essential roles in skeletal formation. These gene products formed a comprehensive physical interaction network and participated in multiple cellular processes regulating skeletal development. The data also revealed 34 transcription factors and 11,334 distal enhancers that were uniquely active in chondrocytes, functioning as transcriptional regulators for the cartilage-selective genes. Our findings revealed a complex gene regulatory network controlling skeletal development whereby transcription factors, enhancers and lncRNAs participate in chondrogenesis by transcriptional regulation of key genes. Additionally, the cartilage-selective genes represent candidate genes for unsolved human skeletal disorders.
Lee, A Yeong; Park, Won; Kang, Tae-Wook; Cha, Min Ho; Chun, Jin Mi
2018-07-15
Yijin-Tang (YJT) is a traditional prescription for the treatment of hyperlipidaemia, atherosclerosis and other ailments related to dampness phlegm, a typical pathological symptom of abnormal body fluid metabolism in Traditional Korean Medicine. However, a holistic network pharmacology approach to understanding the therapeutic mechanisms underlying hyperlipidaemia and atherosclerosis has not been pursued. To examine the network pharmacological potential effects of YJT on hyperlipidaemia and atherosclerosis, we analysed components, performed target prediction and network analysis, and investigated interacting pathways using a network pharmacology approach. Information on compounds in herbal medicines was obtained from public databases, and oral bioavailability and drug-likeness was screened using absorption, distribution, metabolism, and excretion (ADME) criteria. Correlations between compounds and genes were linked using the STITCH database, and genes related to hyperlipidaemia and atherosclerosis were gathered using the GeneCards database. Human genes were identified and subjected to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Network analysis identified 447 compounds in five herbal medicines that were subjected to ADME screening, and 21 compounds and 57 genes formed the main pathways linked to hyperlipidaemia and atherosclerosis. Among them, 10 compounds (naringenin, nobiletin, hesperidin, galangin, glycyrrhizin, homogentisic acid, stigmasterol, 6-gingerol, quercetin and glabridin) were linked to more than four genes, and are bioactive compounds and key chemicals. Core genes in this network were CASP3, CYP1A1, CYP1A2, MMP2 and MMP9. The compound-target gene network revealed close interactions between multiple components and multiple targets, and facilitates a better understanding of the potential therapeutic effects of YJT. 
Pharmacological network analysis can help to explain the potential effects of YJT for treating dampness phlegm-related diseases such as hyperlipidaemia and atherosclerosis. Copyright © 2018 Elsevier B.V. All rights reserved.
TimeXNet Web: Identifying cellular response networks from diverse omics time-course data.
Tan, Phit Ling; López, Yosvany; Nakai, Kenta; Patil, Ashwini
2018-05-14
Condition-specific time-course omics profiles are frequently used to study cellular response to stimuli and identify associated signaling pathways. However, few online tools allow users to analyze multiple types of high-throughput time-course data. TimeXNet Web is a web server that extracts a time-dependent gene/protein response network from time-course transcriptomic, proteomic or phospho-proteomic data, and an input interaction network. It classifies the given genes/proteins into time-dependent groups based on the time of their highest activity and identifies the most probable paths connecting genes/proteins in consecutive groups. The response sub-network is enriched in activated genes/proteins and contains novel regulators that do not show any observable change in the input data. Users can view the resultant response network and analyze it for functional enrichment. TimeXNet Web supports the analysis of high-throughput data from multiple species by providing high quality, weighted protein-protein interaction networks for 12 model organisms. Availability: http://txnet.hgc.jp/. Contact: ashwini@hgc.jp. Supplementary data are available at Bioinformatics online.
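The grouping step described above, which assigns each gene or protein to a group by the time of its highest activity, can be sketched as follows. This is an illustrative reading rather than the TimeXNet code: the profiles, the time points and the function `group_by_peak_time` are hypothetical, and the actual tool may define activity and the number of groups differently.

```python
from collections import defaultdict


def group_by_peak_time(profiles, time_points):
    """Assign each gene to the time point at which its activity is highest."""
    groups = defaultdict(list)
    for gene, values in profiles.items():
        # Index of the maximum value; the earliest index wins on ties.
        peak_index = max(range(len(values)), key=lambda i: values[i])
        groups[time_points[peak_index]].append(gene)
    # Return the groups in chronological order with sorted members.
    return {t: sorted(groups[t]) for t in time_points if t in groups}


# Hypothetical activity profiles measured at 0, 1 and 4 hours.
time_points = [0, 1, 4]
profiles = {
    "geneA": [0.1, 2.0, 0.5],
    "geneB": [1.5, 0.2, 0.1],
    "geneC": [0.0, 0.3, 1.8],
    "geneD": [0.2, 1.1, 0.9],
}
peak_groups = group_by_peak_time(profiles, time_points)
```

Paths through an interaction network would then be sought between genes in consecutive groups, a step that is not reproduced here.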
Shanley, Thomas P; Cvijanovich, Natalie; Lin, Richard; Allen, Geoffrey L; Thomas, Neal J; Doctor, Allan; Kalyanaraman, Meena; Tofil, Nancy M; Penfil, Scott; Monaco, Marie; Odoms, Kelli; Barnes, Michael; Sakthivel, Bhuvaneswari; Aronow, Bruce J; Wong, Hector R
2007-01-01
We have conducted longitudinal studies focused on the expression profiles of signaling pathways and gene networks in children with septic shock. Genome-level expression profiles were generated from whole blood-derived RNA of children with septic shock (n = 30) corresponding to day one and day three of septic shock, respectively. Based on sequential statistical and expression filters, day one and day three of septic shock were characterized by differential regulation of 2,142 and 2,504 gene probes, respectively, relative to controls (n = 15). Venn analysis demonstrated 239 unique genes in the day one dataset, 598 unique genes in the day three dataset, and 1,906 genes common to both datasets. Functional analyses demonstrated time-dependent, differential regulation of genes involved in multiple signaling pathways and gene networks primarily related to immunity and inflammation. Notably, multiple and distinct gene networks involving T cell- and MHC antigen-related biology were persistently downregulated on both day one and day three. Further analyses demonstrated large scale, persistent downregulation of genes corresponding to functional annotations related to zinc homeostasis. These data represent the largest reported cohort of patients with septic shock subjected to longitudinal genome-level expression profiling. The data further advance our genome-level understanding of pediatric septic shock and support novel hypotheses. PMID:17932561
Grewal, Nivit; Singh, Shailendra; Chand, Trilok
2017-01-01
Owing to the innate noise in biological data sources, a single source or a single measure does not suffice for effective disease gene prioritization. The integration of multiple data sources or the aggregation of multiple measures is therefore needed. Aggregation operators combine multiple related data values into a single value such that the combined value reflects all the individual values. In this paper, fuzzy aggregation is applied to network-based disease gene prioritization and its effect under noise conditions is investigated. This study has been conducted for a set of 15 blood disorders by fusing four different network measures, computed from the protein interaction network, using a selected set of aggregation operators and ranking the genes on the basis of the aggregated value. The aggregation operator-based rankings have been compared with the "Random walk with restart" gene prioritization method. The impact of noise has also been investigated by adding varying proportions of noise to the seed set. The results reveal that for all the selected blood disorders, the Mean of Maximal operator outperformed the other aggregation operators for noisy as well as non-noisy data.
An approach for reduction of false predictions in reverse engineering of gene regulatory networks.
Khan, Abhinandan; Saha, Goutam; Pal, Rajat Kumar
2018-05-14
A gene regulatory network discloses the regulatory interactions amongst genes, at a particular condition of the human body. The accurate reconstruction of such networks from time-series genetic expression data using computational tools offers a stiff challenge for contemporary computer scientists. This is crucial to facilitate the understanding of the proper functioning of a living organism. Unfortunately, the computational methods produce many false predictions along with the correct predictions, which is unwanted. Investigations in the domain focus on the identification of as many correct regulations as possible in the reverse engineering of gene regulatory networks to make it more reliable and biologically relevant. One way to achieve this is to reduce the number of incorrect predictions in the reconstructed networks. In the present investigation, we have proposed a novel scheme to decrease the number of false predictions by suitably combining several metaheuristic techniques. We have implemented the same using a dataset ensemble approach (i.e. combining multiple datasets) also. We have employed the proposed methodology on real-world experimental datasets of the SOS DNA Repair network of Escherichia coli and the IMRA network of Saccharomyces cerevisiae. Subsequently, we have experimented upon somewhat larger, in silico networks, namely, DREAM3 and DREAM4 Challenge networks, and 15-gene and 20-gene networks extracted from the GeneNetWeaver database. To study the effect of multiple datasets on the quality of the inferred networks, we have used four datasets in each experiment. The obtained results are encouraging enough as the proposed methodology can reduce the number of false predictions significantly, without using any supplementary prior biological information for larger gene regulatory networks. It is also observed that if a small amount of prior biological information is incorporated here, the results improve further w.r.t. the prediction of true positives. 
Copyright © 2018 Elsevier Ltd. All rights reserved.
The transfer and transformation of collective network information in gene-matched networks.
Kitsukawa, Takashi; Yagi, Takeshi
2015-10-09
Networks, such as the human society network, social and professional networks, and biological system networks, contain vast amounts of information. Information signals in networks are distributed over nodes and transmitted through intricately wired links, making the transfer and transformation of such information difficult to follow. Here we introduce a novel method for describing network information and its transfer using a model network, the Gene-matched network (GMN), in which nodes (neurons) possess attributes (genes). In the GMN, nodes are connected according to their expression of common genes. Because neurons have multiple genes, the GMN is cluster-rich. We show that, in the GMN, information transfer and transformation were controlled systematically, according to the activity level of the network. Furthermore, information transfer and transformation could be traced numerically with a vector using genes expressed in the activated neurons, the active-gene array, which was used to assess the relative activity among overlapping neuronal groups. Interestingly, this coding style closely resembles the cell-assembly neural coding theory. The method introduced here could be applied to many real-world networks, since many systems, including human society and various biological systems, can be represented as a network of this type.
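The wiring rule of the Gene-matched network, in which nodes are connected according to their expression of common genes, can be sketched with sets. This is a minimal illustration, not the authors' model: the neuron and gene labels, the `min_shared` parameter and the reading of the active-gene array as a per-gene count over activated neurons are all assumptions.

```python
from collections import Counter
from itertools import combinations


def build_gmn(neuron_genes, min_shared=1):
    """Connect two neurons when they express at least min_shared common genes."""
    edges = set()
    for a, b in combinations(sorted(neuron_genes), 2):
        if len(neuron_genes[a] & neuron_genes[b]) >= min_shared:
            edges.add((a, b))
    return edges


def active_gene_array(neuron_genes, active_neurons, gene_order):
    """Count, for each gene, how many activated neurons express it (assumed definition)."""
    counts = Counter(g for n in active_neurons for g in neuron_genes[n])
    return [counts[g] for g in gene_order]


# Hypothetical neurons, each expressing a small set of genes.
neuron_genes = {
    "n1": {"A", "B"},
    "n2": {"B", "C"},
    "n3": {"C", "D"},
    "n4": {"E"},
}
gmn_edges = build_gmn(neuron_genes)
gene_vector = active_gene_array(neuron_genes, ["n1", "n2"], ["A", "B", "C", "D", "E"])
```

Because each neuron carries several genes, neurons sharing any one of them form overlapping groups, which is consistent with the abstract's remark that the network is cluster-rich.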
Bottom-up GGM algorithm for constructing multiple layered hierarchical gene regulatory networks
USDA-ARS's Scientific Manuscript database
Multilayered hierarchical gene regulatory networks (ML-hGRNs) are very important for understanding genetics regulation of biological pathways. However, there are currently no computational algorithms available for directly building ML-hGRNs that regulate biological pathways. A bottom-up graphic Gaus...
The relationship between gene transcription and combinations of histone modifications
NASA Astrophysics Data System (ADS)
Cui, Xiangjun; Li, Hong; Luo, Liaofu
2012-09-01
Histone modification is an important subject of epigenetics which plays an intrinsic role in transcriptional regulation. It is known that multiple histone modifications act in a combinatorial fashion. In this study, we demonstrated that the pathways within constructed Bayesian networks can give an indication for the combinations among 12 histone modifications which have been studied in the TSS+1kb region in S. cerevisiae. After Bayesian networks for the genes with high transcript levels (H-network) and low transcript levels (L-network) were constructed, the combinations of modifications within the two networks were analyzed from the view of transcript level. The results showed that different combinations played dissimilar roles in the regulation of gene transcription when there exist differences for gene expression at transcription level.
Jia, Peilin; Wang, Lily; Fanous, Ayman H.; Pato, Carlos N.; Edwards, Todd L.; Zhao, Zhongming
2012-01-01
With the recent success of genome-wide association studies (GWAS), a wealth of association data has been accumulated for more than 200 complex diseases/traits, creating a strong demand for data integration and interpretation. A combinatory analysis of multiple GWAS datasets, or an integrative analysis of GWAS data and other high-throughput data, has been particularly promising. In this study, we proposed an integrative analysis framework of multiple GWAS datasets by overlaying association signals onto the protein-protein interaction network, and demonstrated it using schizophrenia datasets. Building on a dense module search algorithm, we first searched for significantly enriched subnetworks for schizophrenia in each single GWAS dataset and then implemented a discovery-evaluation strategy to identify module genes with consistent association signals. We validated the module genes in an independent dataset, and also examined them through meta-analysis of the related SNPs using multiple GWAS datasets. As a result, we identified 205 module genes with a joint effect significantly associated with schizophrenia; these module genes included a number of well-studied candidate genes such as DISC1, GNA12, GNA13, GNAI1, GPR17, and GRIN2B. Further functional analysis suggested these genes are involved in neuronal related processes. Additionally, meta-analysis found that 18 SNPs in 9 module genes had P_meta < 1×10^-4, including the gene HLA-DQA1 located in the MHC region on chromosome 6, which was reported in previous studies using the largest cohort of schizophrenia patients to date. These results demonstrated that our bi-directional network-based strategy is efficient for identifying disease-associated genes with modest signals in GWAS datasets. This approach can be applied to any other complex diseases/traits where multiple GWAS datasets are available. PMID:22792057
Discovering time-lagged rules from microarray data using gene profile classifiers
2011-01-01
Background: Gene regulatory networks have an essential role in every process of life. In this regard, the amount of genome-wide time series data is becoming increasingly available, providing the opportunity to discover the time-delayed gene regulatory networks that govern the majority of these molecular processes. Results: This paper aims at reconstructing gene regulatory networks from multiple genome-wide microarray time series datasets. In this sense, a new model-free algorithm called GRNCOP2 (Gene Regulatory Network inference by Combinatorial OPtimization 2), which is a significant evolution of the GRNCOP algorithm, was developed using combinatorial optimization of gene profile classifiers. The method is capable of inferring potential time-delay relationships with any span of time between genes from various time series datasets given as input. The proposed algorithm was applied to time series data composed of twenty yeast genes that are highly relevant for the cell-cycle study, and the results were compared against several related approaches. The outcomes have shown that GRNCOP2 outperforms the contrasted methods in terms of the proposed metrics, and that the results are consistent with previous biological knowledge. Additionally, a genome-wide study on multiple publicly available time series data was performed. In this case, the experimentation has exhibited the soundness and scalability of the new method, which inferred highly-related statistically-significant gene associations. Conclusions: A novel method for inferring time-delayed gene regulatory networks from genome-wide time series datasets is proposed in this paper. The method was carefully validated with several publicly available data sets. The results have demonstrated that the algorithm constitutes a usable model-free approach capable of predicting meaningful relationships between genes, revealing the time-trends of gene regulation. PMID:21524308
Pathania, Shivalika; Bagler, Ganesh; Ahuja, Paramvir S.
2016-01-01
Comparative co-expression analysis of multiple species using high-throughput data is an integrative approach to determine the uniformity as well as diversification in biological processes. Rauvolfia serpentina and Catharanthus roseus, both members of the Apocynaceae family, are reported to have remedial properties against multiple diseases. Despite sharing the upstream part of the terpenoid indole alkaloid pathway, there is significant diversity in tissue-specific synthesis and accumulation of specialized metabolites in these plants. This led us to implement comparative co-expression network analysis to investigate the modules and genes responsible for differential tissue-specific expression as well as species-specific synthesis of metabolites. Toward these goals, differential network analysis was implemented to identify candidate genes responsible for diversification of the metabolite profile. Three genes were identified with significant differences in connectivity leading to differential regulatory behavior between these plants. These genes may be responsible for diversification of secondary metabolism, and thereby for species-specific metabolite synthesis. The network robustness of R. serpentina, determined based on topological properties, was also complemented by comparison of gene-metabolite networks of both plants, and may have evolved to have complex metabolic mechanisms as compared to C. roseus under the influence of various stimuli. This study reveals evolution of complexity in secondary metabolism of R. serpentina, and key genes that contribute toward diversification of specific metabolites. PMID:27588023
Functional Module Analysis for Gene Coexpression Networks with Network Integration.
Zhang, Shuqin; Zhao, Hongyu; Ng, Michael K
2015-01-01
Networks are a general tool for studying the complex interactions between different genes, proteins, and other small molecules. Modularity, a fundamental property of many biological networks, has been widely studied, and many computational methods have been proposed to identify the modules in an individual network. However, in many cases, a single network is insufficient for module analysis due to the noise in the data or the tuning of parameters when building the biological network. The availability of a large number of biological networks makes network integration studies possible. By integrating such networks, more informative modules for a specific disease can be derived from the networks constructed from different tissues, and consistent factors for different diseases can be inferred. In this paper, we have developed an effective method for module identification from multiple networks under different conditions. The problem is formulated as an optimization model, which combines the module identification in each individual network and alignment of the modules from different networks together. An approximation algorithm based on eigenvector computation is proposed. In simulation studies, our method outperforms the existing methods, especially when the underlying modules in the multiple networks differ. We also applied our method to two groups of gene coexpression networks for humans, which include one for three different cancers, and one for three tissues from morbidly obese patients. We identified 13 modules with three complete subgraphs, and 11 modules with two complete subgraphs, respectively. The modules were validated through Gene Ontology enrichment and KEGG pathway enrichment analysis. We also showed that the main functions of most modules for the corresponding disease have been addressed by other researchers, which may provide the theoretical basis for further studying the modules experimentally.
When is hub gene selection better than standard meta-analysis?
Langfelder, Peter; Mischel, Paul S; Horvath, Steve
2013-01-01
Since hub nodes have been found to play important roles in many networks, highly connected hub genes are expected to play an important role in biology as well. However, the empirical evidence remains ambiguous. An open question is whether (or when) hub gene selection leads to more meaningful gene lists than a standard statistical analysis based on significance testing when analyzing genomic data sets (e.g., gene expression or DNA methylation data). Here we address this question for the special case when multiple genomic data sets are available. This is of great practical importance since for many research questions multiple data sets are publicly available. In this case, the data analyst can decide between a standard statistical approach (e.g., based on meta-analysis) and a co-expression network analysis approach that selects intramodular hubs in consensus modules. We assess the performance of these two types of approaches according to two criteria. The first criterion evaluates the biological insights gained and is relevant in basic research. The second criterion evaluates the validation success (reproducibility) in independent data sets and often applies in clinical diagnostic or prognostic applications. We compare meta-analysis with consensus network analysis based on weighted correlation network analysis (WGCNA) in three comprehensive and unbiased empirical studies: (1) finding genes predictive of lung cancer survival, (2) finding methylation markers related to age, and (3) finding mouse genes related to total cholesterol. The results demonstrate that intramodular hub gene status with respect to consensus modules is more useful than a meta-analysis p-value when identifying biologically meaningful gene lists (reflecting criterion 1). However, standard meta-analysis methods perform as well as (if not better than) a consensus network approach in terms of validation success (criterion 2).
The article also reports a comparison of meta-analysis techniques applied to gene expression data and presents novel R functions for carrying out consensus network analysis, network based screening, and meta analysis.
Hou, Chunyu; Wang, Fei; Liu, Xuewen; Chang, Guangming; Wang, Feng; Geng, Xin
2017-08-01
Telomerase reverse transcriptase (TERT) is the protein component of telomerase complex. Evidence has accumulated showing that the nontelomeric functions of TERT are independent of telomere elongation. However, the mechanisms governing the interaction between TERT and its target genes are not clearly revealed. The biological functions of TERT are not fully elucidated and have thus far been underestimated. To further explore these functions, we investigated TERT interaction networks using multiple bioinformatic databases, including BioGRID, STRING, DAVID, GeneCards, GeneMANIA, PANTHER, miRWalk, mirTarBase, miRNet, miRDB, and TargetScan. In addition, network diagrams were built using Cytoscape software. As competing endogenous RNAs (ceRNAs) are endogenous transcripts that compete for the binding of microRNAs (miRNAs) by using shared miRNA recognition elements, they are involved in creating widespread regulatory networks. Therefore, the ceRNA regulatory networks of TERT were also investigated in this study. Interestingly, we found that the three genes PABPC1, SLC7A11, and TP53 were present in both TERT interaction networks and ceRNAs target genes. It was predicted that TERT might play nontelomeric roles in the generation or development of some rare diseases, such as Rift Valley fever and dyscalculia. Thus, our data will help to decipher the interaction networks of TERT and reveal the unknown functions of telomerase in cancer and aging-related diseases.
Ficklin, Stephen P.; Luo, Feng; Feltus, F. Alex
2010-01-01
Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes. PMID:20668062
Ficklin, Stephen P; Luo, Feng; Feltus, F Alex
2010-09-01
Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes.
Musungu, Bryan M; Bhatnagar, Deepak; Brown, Robert L; Payne, Gary A; OBrian, Greg; Fakhoury, Ahmad M; Geisler, Matt
2016-01-01
A gene co-expression network (GEN) was generated using a dual RNA-seq study with the fungal pathogen Aspergillus flavus and its plant host Zea mays during the initial 3 days of infection. The analysis deciphered novel pathways and mapped genes of interest in both organisms during the infection. This network revealed a high degree of connectivity in many of the previously recognized pathways in Z. mays such as jasmonic acid, ethylene, and reactive oxygen species (ROS). For the pathogen A. flavus, a link between aflatoxin production and vesicular transport was identified within the network. There was significant interspecies correlation of expression between Z. mays and A. flavus for a subset of 104 Z. mays, and 1942 A. flavus genes. This resulted in an interspecies subnetwork enriched in multiple Z. mays genes involved in the production of ROS. In addition to the ROS from Z. mays, there was enrichment in the vesicular transport pathways and the aflatoxin pathway for A. flavus. Included in these genes, a key aflatoxin cluster regulator, AflS, was found to be co-regulated with multiple Z. mays ROS producing genes within the network, suggesting AflS may be monitoring host ROS levels. The entire GEN for both host and pathogen, and the subset of interspecies correlations, is presented as a tool for hypothesis generation and discovery for events in the early stages of fungal infection of Z. mays by A. flavus.
Musungu, Bryan M.; Bhatnagar, Deepak; Brown, Robert L.; Payne, Gary A.; OBrian, Greg; Fakhoury, Ahmad M.; Geisler, Matt
2016-01-01
A gene co-expression network (GEN) was generated using a dual RNA-seq study with the fungal pathogen Aspergillus flavus and its plant host Zea mays during the initial 3 days of infection. The analysis deciphered novel pathways and mapped genes of interest in both organisms during the infection. This network revealed a high degree of connectivity in many of the previously recognized pathways in Z. mays such as jasmonic acid, ethylene, and reactive oxygen species (ROS). For the pathogen A. flavus, a link between aflatoxin production and vesicular transport was identified within the network. There was significant interspecies correlation of expression between Z. mays and A. flavus for a subset of 104 Z. mays, and 1942 A. flavus genes. This resulted in an interspecies subnetwork enriched in multiple Z. mays genes involved in the production of ROS. In addition to the ROS from Z. mays, there was enrichment in the vesicular transport pathways and the aflatoxin pathway for A. flavus. Included in these genes, a key aflatoxin cluster regulator, AflS, was found to be co-regulated with multiple Z. mays ROS producing genes within the network, suggesting AflS may be monitoring host ROS levels. The entire GEN for both host and pathogen, and the subset of interspecies correlations, is presented as a tool for hypothesis generation and discovery for events in the early stages of fungal infection of Z. mays by A. flavus. PMID:27917194
2012-01-01
Background Fever is one of the most common adverse events of vaccines. The detailed mechanisms of fever and vaccine-associated gene interaction networks are not fully understood. In the present study, we employed a genome-wide, Centrality and Ontology-based Network Discovery using Literature data (CONDL) approach to analyse the genes and gene interaction networks associated with fever or vaccine-related fever responses. Results Over 170,000 fever-related articles from PubMed abstracts and titles were retrieved and analysed at the sentence level using natural language processing techniques to identify genes and vaccines (including 186 Vaccine Ontology terms) as well as their interactions. This resulted in a generic fever network consisting of 403 genes and 577 gene interactions. A vaccine-specific fever sub-network consisting of 29 genes and 28 gene interactions was extracted from articles that are related to both fever and vaccines. In addition, gene-vaccine interactions were identified. Vaccines (including 4 specific vaccine names) were found to directly interact with 26 genes. Gene set enrichment analysis was performed using the genes in the generated interaction networks. Moreover, the genes in these networks were prioritized using network centrality metrics. Making scientific discoveries and generating new hypotheses were possible by using network centrality and gene set enrichment analyses. For example, our study found that the genes in the generic fever network were more enriched in cell death and responses to wounding, and the vaccine sub-network had more gene enrichment in leukocyte activation and phosphorylation regulation. The most central genes in the vaccine-specific fever network are predicted to be highly relevant to vaccine-induced fever, whereas genes that are central only in the generic fever network are likely to be highly relevant to generic fever responses. 
Interestingly, no Toll-like receptors (TLRs) were found in the gene-vaccine interaction network. Since multiple TLRs were found in the generic fever network, it is reasonable to hypothesize that vaccine-TLR interactions may play an important role in inducing fever response, which deserves a further investigation. Conclusions This study demonstrated that ontology-based literature mining is a powerful method for analyzing gene interaction networks and generating new scientific hypotheses. PMID:23256563
Markov State Models of gene regulatory networks.
Chu, Brian K; Tse, Margaret J; Sato, Royce R; Read, Elizabeth L
2017-02-06
Gene regulatory networks with dynamics characterized by multiple stable states underlie cell fate-decisions. Quantitative models that can link molecular-level knowledge of gene regulation to a global understanding of network dynamics have the potential to guide cell-reprogramming strategies. Networks are often modeled by the stochastic Chemical Master Equation, but methods for systematic identification of key properties of the global dynamics are currently lacking. The method identifies the number, phenotypes, and lifetimes of long-lived states for a set of common gene regulatory network models. Application of transition path theory to the constructed Markov State Model decomposes global dynamics into a set of dominant transition paths and associated relative probabilities for stochastic state-switching. In this proof-of-concept study, we found that the Markov State Model provides a general framework for analyzing and visualizing stochastic multistability and state-transitions in gene networks. Our results suggest that this framework, adopted from the field of atomistic Molecular Dynamics, can be a useful tool for quantitative Systems Biology at the network scale.
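As a rough illustration of how populations and lifetimes of long-lived states can be read off a Markov State Model, the sketch below assumes a hypothetical two-state transition matrix P (one row per coarse-grained state, entries are per-lag-time transition probabilities). The numbers, the function names, and the geometric dwell-time estimate 1 / (1 - P[i][i]) are illustrative assumptions, not values or procedures taken from the study.

```python
def stationary_distribution(P, iterations=2000):
    """Approximate the stationary distribution of a row-stochastic matrix by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        # One step of pi <- pi P.
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def mean_dwell_steps(P):
    """Mean number of lag-time steps spent in each state before leaving it.

    Uses the geometric estimate 1 / (1 - P[i][i]), which only looks at the
    self-transition probability (a simplification assumed here).
    """
    return [1.0 / (1.0 - P[i][i]) for i in range(len(P))]


# Hypothetical transition matrix for two long-lived expression states.
P = [
    [0.98, 0.02],
    [0.01, 0.99],
]

populations = stationary_distribution(P)
lifetimes = mean_dwell_steps(P)
print("populations:", populations)
print("lifetimes (in lag times):", lifetimes)
```

In this toy matrix the second state is both more populated and longer lived, which is the kind of summary the abstract describes for long-lived network states.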
A Risk Stratification Model for Lung Cancer Based on Gene Coexpression Network and Deep Learning
2018-01-01
Risk stratification models for lung cancer based on gene expression profiles are of great interest. Instead of previous models based on individual prognostic genes, we aimed to develop a novel system-level risk stratification model for lung adenocarcinoma based on gene coexpression networks. Using multiple microarray datasets, gene coexpression network analysis was performed to identify survival-related networks. A deep learning based risk stratification model was constructed with representative genes of these networks. The model was validated in two test sets. Survival analysis was performed using the output of the model to evaluate whether it could predict patients' survival independent of clinicopathological variables. Five networks were significantly associated with patients' survival. Considering prognostic significance and representativeness, genes of the two survival-related networks were selected for input of the model. The output of the model was significantly associated with patients' survival in the training set and the two test sets (p < 0.00001, p < 0.0001 and p = 0.02 for training and test sets 1 and 2, respectively). In multivariate analyses, the model was associated with patients' prognosis independent of other clinicopathological features. Our study presents a new perspective on incorporating gene coexpression networks into the gene expression signature and clinical application of deep learning in genomic data science for prognosis prediction. PMID:29581968
Chen, Yuefeng; Wei, Tao; Yan, Lei; Lawrence, Frank; Qian, Hui-Rong; Burkholder, Timothy P; Starling, James J; Yingling, Jonathan M; Shou, Jianyong
2008-01-01
Background Tumor angiogenesis is a highly regulated process involving intercellular communication as well as the interactions of multiple downstream signal transduction pathways. Disrupting one or even a few angiogenesis pathways is often insufficient to achieve sustained therapeutic benefits due to the complexity of angiogenesis. Targeting multiple angiogenic pathways has been increasingly recognized as a viable strategy. However, translation of the polypharmacology of a given compound to its antiangiogenic efficacy remains a major technical challenge. Developing a global functional association network among angiogenesis-related genes is much needed to facilitate holistic understanding of angiogenesis and to aid the development of more effective anti-angiogenesis therapeutics. Results We constructed a comprehensive gene functional association network or interactome by transcript profiling an in vitro angiogenesis model, in which human umbilical vein endothelial cells (HUVECs) formed capillary structures when co-cultured with normal human dermal fibroblasts (NHDFs). HUVEC competence and NHDF supportiveness of cord formation were found to be highly cell-passage dependent. An enrichment test of Biological Processes (BP) of differentially expressed genes (DEG) revealed that angiogenesis related BP categories significantly changed with cell passages. Built upon 2012 DEGs identified from two microarray studies, the resulting interactome captured 17226 functional gene associations and displayed characteristics of a scale-free network. The interactome includes the involvement of oncogenes and tumor suppressor genes in angiogenesis. We developed a network walking algorithm to extract connectivity information from the interactome and applied it to simulate the level of network perturbation by three multi-targeted anti-angiogenic kinase inhibitors. Simulated network perturbation correlated with observed anti-angiogenesis activity in a cord formation bioassay. 
Conclusion We established a comprehensive gene functional association network to model in vitro angiogenesis regulation. The present study provided a proof-of-concept pilot of applying network perturbation analysis to drug phenotypic activity assessment. PMID:18518970
Discovery and validation of a glioblastoma co-expressed gene module
Dunwoodie, Leland J.; Poehlman, William L.; Ficklin, Stephen P.; Feltus, Frank Alexander
2018-01-01
Tumors exhibit complex patterns of aberrant gene expression. Using a knowledge-independent, noise-reducing gene co-expression network construction software called KINC, we created multiple RNAseq-based gene co-expression networks relevant to brain and glioblastoma biology. In this report, we describe the discovery and validation of a glioblastoma-specific gene module that contains 22 co-expressed genes. The genes are upregulated in glioblastoma relative to normal brain and lower grade glioma samples; they are also hypo-methylated in glioblastoma relative to lower grade glioma tumors. Among the proneural, neural, mesenchymal, and classical glioblastoma subtypes, these genes are most-highly expressed in the mesenchymal subtype. Furthermore, high expression of these genes is associated with decreased survival across each glioblastoma subtype. These genes are of interest to glioblastoma biology and our gene interaction discovery and validation workflow can be used to discover and validate co-expressed gene modules derived from any co-expression network. PMID:29541392
Discovery and validation of a glioblastoma co-expressed gene module.
Dunwoodie, Leland J; Poehlman, William L; Ficklin, Stephen P; Feltus, Frank Alexander
2018-02-16
Tumors exhibit complex patterns of aberrant gene expression. Using a knowledge-independent, noise-reducing gene co-expression network construction software called KINC, we created multiple RNAseq-based gene co-expression networks relevant to brain and glioblastoma biology. In this report, we describe the discovery and validation of a glioblastoma-specific gene module that contains 22 co-expressed genes. The genes are upregulated in glioblastoma relative to normal brain and lower grade glioma samples; they are also hypo-methylated in glioblastoma relative to lower grade glioma tumors. Among the proneural, neural, mesenchymal, and classical glioblastoma subtypes, these genes are most-highly expressed in the mesenchymal subtype. Furthermore, high expression of these genes is associated with decreased survival across each glioblastoma subtype. These genes are of interest to glioblastoma biology and our gene interaction discovery and validation workflow can be used to discover and validate co-expressed gene modules derived from any co-expression network.
Network neighborhood analysis with the multi-node topological overlap measure.
Li, Ai; Horvath, Steve
2007-01-15
The goal of neighborhood analysis is to find a set of genes (the neighborhood) that is similar to an initial 'seed' set of genes. Neighborhood analysis methods for network data are important in systems biology. If individual network connections are susceptible to noise, it can be advantageous to define neighborhoods on the basis of a robust interconnectedness measure, e.g. the topological overlap measure. Since the use of multiple nodes in the seed set may lead to more informative neighborhoods, it can be advantageous to define multi-node similarity measures. The pairwise topological overlap measure is generalized to multiple network nodes and subsequently used in a recursive neighborhood construction method. A local permutation scheme is used to determine the neighborhood size. Using four network applications and a simulated example, we provide empirical evidence that the resulting neighborhoods are biologically meaningful, e.g. we use neighborhood analysis to identify brain cancer related genes. An executable Windows program and tutorial for multi-node topological overlap measure (MTOM) based analysis can be downloaded from the webpage (http://www.genetics.ucla.edu/labs/horvath/MTOM/).
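For orientation, the pairwise topological overlap measure that the abstract says is generalized can be sketched as follows. This uses the commonly cited pairwise definition, TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where l_ij sums a_iu * a_uj over shared neighbours u and k_i is the connectivity of node i. The function name and the toy network are assumptions for illustration; the multi-node generalization (MTOM) itself is not reproduced here.

```python
def topological_overlap(adj):
    """Pairwise topological overlap for a symmetric adjacency matrix.

    adj: list of lists with entries in [0, 1] and a zero diagonal.
    """
    n = len(adj)
    # Connectivity k_i = sum of adjacencies to all other nodes.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[0.0] * n for _ in range(n)]
    for i in range(n):
        tom[i][i] = 1.0
        for j in range(i + 1, n):
            # Shared-neighbour term l_ij = sum over u != i, j of a_iu * a_uj.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom


# Toy unweighted network: nodes 0, 1, 2 form a triangle; node 3 hangs off node 2.
toy = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
tom = topological_overlap(toy)
for row in tom:
    print(row)
```

Nodes 0 and 3 are not directly connected, yet they receive a non-zero overlap because they share neighbour 2, which is why the measure is less sensitive to a single noisy connection than the raw adjacency.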
A fast and high performance multiple data integration algorithm for identifying human disease genes
2015-01-01
Background Integrating multiple data sources is indispensable in improving disease gene identification. It is not only due to the fact that disease genes associated with similar genetic diseases tend to lie close to each other in various biological networks, but also due to the fact that gene-disease associations are complex. Although various algorithms have been proposed to identify disease genes, their prediction performance and computational time still need to be improved. Results In this study, we propose a fast and high performance multiple data integration algorithm for identifying human disease genes. A posterior probability of each candidate gene associated with individual diseases is calculated by using a Bayesian analysis method and a binary logistic regression model. Two prior probability estimation strategies and two feature vector construction methods are developed to test the performance of the proposed algorithm. Conclusions The proposed algorithm not only generates predictions with high AUC scores, but also runs very fast. When only a single PPI network is employed, the AUC score is 0.769 by using F2 as feature vectors. The average running time for each leave-one-out experiment is only around 1.5 seconds. When three biological networks are integrated, the AUC score using F3 as feature vectors increases to 0.830, and the average running time for each leave-one-out experiment takes only about 12.54 seconds. It is better than many existing algorithms. PMID:26399620
Limit cycles in piecewise-affine gene network models with multiple interaction loops
NASA Astrophysics Data System (ADS)
Farcot, Etienne; Gouzé, Jean-Luc
2010-01-01
In this article, we consider piecewise affine differential equations modelling gene networks. We work with arbitrary decay rates, and under a local hypothesis expressed as an alignment condition of successive focal points. The interaction graph of the system may be rather complex (multiple intricate loops of any sign, multiple thresholds, etc.). Our main result is an alternative theorem showing that if a sequence of regions is periodically visited by trajectories, then under our hypotheses, there exists either a unique stable periodic solution, or the origin attracts all trajectories in this sequence of regions. This result greatly extends our previous work on a single negative feedback loop. We give several examples and simulations illustrating different cases.
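To make the model class concrete, here is a minimal sketch of a piecewise-affine gene network, assuming the usual form dx_i/dt = kappa_i(x) - gamma_i * x_i, in which production kappa_i is constant within each threshold-delimited region and trajectories in a region head toward its focal point kappa_i / gamma_i. The two-gene negative loop, the parameter values, and the function names are hypothetical choices for illustration, not an example taken from the article.

```python
def focal_point(x, kappa, gamma, theta):
    """Focal point of the region containing x for a two-gene negative loop."""
    # Gene 0 is produced only while its repressor (gene 1) is below threshold.
    prod0 = kappa[0] if x[1] < theta[1] else 0.0
    # Gene 1 is produced only while its activator (gene 0) is above threshold.
    prod1 = kappa[1] if x[0] > theta[0] else 0.0
    return (prod0 / gamma[0], prod1 / gamma[1])


def simulate(x0, kappa, gamma, theta, dt=0.01, steps=5000):
    """Forward-Euler integration of dx_i/dt = gamma_i * (phi_i(x) - x_i)."""
    x = tuple(x0)
    path = [x]
    for _ in range(steps):
        phi = focal_point(x, kappa, gamma, theta)
        x = tuple(xi + dt * g * (p - xi) for xi, g, p in zip(x, gamma, phi))
        path.append(x)
    return path


# Hypothetical parameters: unequal decay rates, one threshold per gene.
kappa = (2.0, 2.0)
gamma = (1.0, 0.5)
theta = (1.0, 1.0)

path = simulate((0.5, 0.5), kappa, gamma, theta)
print("start region focal point:", focal_point((0.5, 0.5), kappa, gamma, theta))
print("final state:", path[-1])
```

The sketch only shows how regions, focal points, and decay rates enter the dynamics; it does not check the alignment condition or the periodic-orbit result stated in the abstract.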
Disentangling the multigenic and pleiotropic nature of molecular function
2015-01-01
Background Biological processes at the molecular level are usually represented by molecular interaction networks. Function is organised and modularity identified based on network topology; however, this approach often fails to account for the dynamic and multifunctional nature of molecular components. For example, a molecule engaging in spatially or temporally independent functions may be inappropriately clustered into a single functional module. To capture biologically meaningful sets of interacting molecules, we use experimentally defined pathways as spatial/temporal units of molecular activity. Results We defined functional profiles of Saccharomyces cerevisiae based on a minimal set of Gene Ontology terms sufficient to represent each pathway's genes. The Gene Ontology terms were used to annotate 271 pathways, accounting for pathway multi-functionality and gene pleiotropy. Pathways were then arranged into a network, linked by shared functionality. Of the genes in our data set, 44% appeared in multiple pathways performing a diverse set of functions. Linking pathways by overlapping functionality revealed a modular network with energy metabolism forming a sparse centre, surrounded by several denser clusters comprised of regulatory and metabolic pathways. Signalling pathways formed a relatively discrete cluster connected to the centre of the network. Genetic interactions were enriched within the clusters of pathways by a factor of 5.5, confirming the organisation of our pathway network is biologically significant. Conclusions Our representation of molecular function according to pathway relationships enables analysis of gene/protein activity in the context of specific functional roles, as an alternative to typical molecule-centric graph-based methods. The pathway network demonstrates the cooperation of multiple pathways to perform biological processes and organises pathways into functionally related clusters with interdependent outcomes. PMID:26678917
NASA Astrophysics Data System (ADS)
Chen, Ye; Wolanyk, Nathaniel; Ilker, Tunc; Gao, Shouguo; Wang, Xujing
Methods developed based on bifurcation theory have demonstrated their potential in driving network identification for complex human diseases, including the work by Chen et al. Recently, bifurcation theory has been successfully applied to model cellular differentiation. However, one often faces a technical challenge in driving network prediction: time-course cellular differentiation studies often contain only one sample at each time point, while driving network prediction typically requires multiple samples at each time point to infer the variation and interaction structures of candidate genes for the driving network. In this study, we investigate several methods to identify both the critical time point and the driving network through examination of how each time point affects the autocorrelation and phase locking. We apply these methods to a high-throughput sequencing (RNA-Seq) dataset of 42 subsets of thymocytes and mature peripheral T cells at multiple time points during their differentiation (GSE48138 from GEO). We compare the predicted driving genes with known transcription regulators of cellular differentiation. We will discuss the advantages and limitations of our proposed methods, as well as potential further improvements of our methods.
MINER: exploratory analysis of gene interaction networks by machine learning from expression data.
Kadupitige, Sidath Randeni; Leung, Kin Chun; Sellmeier, Julia; Sivieng, Jane; Catchpoole, Daniel R; Bain, Michael E; Gaëta, Bruno A
2009-12-03
The reconstruction of gene regulatory networks from high-throughput "omics" data has become a major goal in the modelling of living systems. Numerous approaches have been proposed, most of which attempt only "one-shot" reconstruction of the whole network with no intervention from the user, or offer only simple correlation analysis to infer gene dependencies. We have developed MINER (Microarray Interactive Network Exploration and Representation), an application that combines multivariate non-linear tree learning of individual gene regulatory dependencies, visualisation of these dependencies as both trees and networks, and representation of known biological relationships based on common Gene Ontology annotations. MINER allows biologists to explore the dependencies influencing the expression of individual genes in a gene expression data set in the form of decision, model or regression trees, using their domain knowledge to guide the exploration and formulate hypotheses. Multiple trees can then be summarised in the form of a gene network diagram. MINER is being adopted by several of our collaborators and has already led to the discovery of a new significant regulatory relationship with subsequent experimental validation. Unlike most gene regulatory network inference methods, MINER allows the user to start from genes of interest and build the network gene-by-gene, incorporating domain expertise in the process. This approach has been used successfully with RNA microarray data but is applicable to other quantitative data produced by high-throughput technologies such as proteomics and "next generation" DNA sequencing.
Evolutionary rewiring of bacterial regulatory networks
Taylor, Tiffany B.; Mulley, Geraldine; McGuffin, Liam J.; Johnson, Louise J.; Brockhurst, Michael A.; Arseneault, Tanya; Silby, Mark W.; Jackson, Robert W.
2015-01-01
Bacteria have evolved complex regulatory networks that enable integration of multiple intracellular and extracellular signals to coordinate responses to environmental changes. However, our knowledge of how regulatory systems function and evolve is still relatively limited. There is often extensive homology between components of different networks, due to past cycles of gene duplication, divergence, and horizontal gene transfer, raising the possibility of cross-talk or redundancy. Consequently, evolutionary resilience is built into gene networks - homology between regulators can potentially allow rapid rescue of lost regulatory function across distant regions of the genome. In our recent study [Taylor, et al. Science (2015), 347(6225)] we find that mutations that facilitate cross-talk between pathways can contribute to gene network evolution, but that such mutations come with severe pleiotropic costs. Arising from this work are a number of questions surrounding how this phenomenon occurs. PMID:28357301
Gene regulatory network inference using fused LASSO on multiple data sets
Omranian, Nooshin; Eloundou-Mbebi, Jeanne M. O.; Mueller-Roeber, Bernd; Nikoloski, Zoran
2016-01-01
Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed into a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions. PMID:26864687
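To show how constraints (1) and (2) can appear as penalty terms, the sketch below evaluates a generic fused LASSO objective for one target gene across several data sets: a squared-error fit term, an L1 sparsity penalty on each coefficient vector, and an L1 fusion penalty on differences between the coefficient vectors of different data sets. This is a textbook-style form assumed for illustration; the exact objective, weighting, and treatment of constraint (3) in the paper may differ, and all names and numbers are hypothetical.

```python
def fused_lasso_objective(X_list, y_list, betas, lam_sparse, lam_fuse):
    """Generic fused LASSO objective for one target gene over several data sets.

    X_list[k]: rows of regulator expression values for data set k.
    y_list[k]: target gene expression values for data set k.
    betas[k]:  regression coefficients (one per regulator) for data set k.
    """
    # Fit term: squared residuals summed over all data sets.
    fit = 0.0
    for X, y, b in zip(X_list, y_list, betas):
        for row, target in zip(X, y):
            pred = sum(r * c for r, c in zip(row, b))
            fit += (target - pred) ** 2
    # Sparsity term: few regulators per gene.
    sparsity = sum(abs(c) for b in betas for c in b)
    # Fusion term: coefficient vectors of different data sets should be similar.
    fusion = 0.0
    for a in range(len(betas)):
        for c in range(a + 1, len(betas)):
            fusion += sum(abs(u - v) for u, v in zip(betas[a], betas[c]))
    return fit + lam_sparse * sparsity + lam_fuse * fusion


# Hypothetical toy data: two data sets, two candidate regulators.
X_list = [[[1, 0], [0, 1]], [[1, 0], [0, 1]]]
y_list = [[1, 2], [1, 0]]
betas = [[1, 2], [1, 0]]

value = fused_lasso_objective(X_list, y_list, betas, lam_sparse=0.5, lam_fuse=0.25)
print(value)
```

Raising lam_fuse pushes the per-data-set networks toward a shared structure, while raising lam_sparse removes weak regulators; in practice both would be tuned, for example by cross-validation.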
Chan, Kei Hang K; Huang, Yen-Tsung; Meng, Qingying; Wu, Chunyuan; Reiner, Alexander; Sobel, Eric M; Tinker, Lesley; Lusis, Aldons J; Yang, Xia; Liu, Simin
2014-12-01
Although cardiovascular disease (CVD) and type 2 diabetes mellitus (T2D) share many common risk factors, potential molecular mechanisms that may also be shared by these 2 disorders remain unknown. Using an integrative pathway and network analysis, we performed genome-wide association studies in 8155 Black, 3494 Hispanic American, and 3697 Caucasian American women who participated in the national Women's Health Initiative single-nucleotide polymorphism (SNP) Health Association Resource and the Genomics and Randomized Trials Network. Eight top pathways and gene networks related to cardiomyopathy, calcium signaling, axon guidance, cell adhesion, and extracellular matrix seemed to be commonly shared between CVD and T2D across all 3 ethnic groups. We also identified ethnicity-specific pathways, such as cell cycle (specific for Hispanic American and Caucasian American) and tight junction (CVD and combined CVD and T2D in Hispanic American). In network analysis of gene-gene or protein-protein interactions, we identified key drivers that included COL1A1, COL3A1, and ELN in the shared pathways for both CVD and T2D. These key driver genes were cross-validated in multiple mouse models of diabetes mellitus and atherosclerosis. Our integrative analysis of American women of 3 ethnicities identified multiple shared biological pathways and key regulatory genes for the development of CVD and T2D. These prospective findings also support the notion that ethnicity-specific susceptibility genes and processes are involved in the pathogenesis of CVD and T2D. © 2014 American Heart Association, Inc.
Origins of extrinsic variability in eukaryotic gene expression
NASA Astrophysics Data System (ADS)
Volfson, Dmitri; Marciniak, Jennifer; Blake, William J.; Ostroff, Natalie; Tsimring, Lev S.; Hasty, Jeff
2006-02-01
Variable gene expression within a clonal population of cells has been implicated in a number of important processes including mutation and evolution, determination of cell fates and the development of genetic disease. Recent studies have demonstrated that a significant component of expression variability arises from extrinsic factors thought to influence multiple genes simultaneously, yet the biological origins of this extrinsic variability have received little attention. Here we combine computational modelling with fluorescence data generated from multiple promoter-gene inserts in Saccharomyces cerevisiae to identify two major sources of extrinsic variability. One unavoidable source arising from the coupling of gene expression with population dynamics leads to a ubiquitous lower limit for expression variability. A second source, which is modelled as originating from a common upstream transcription factor, exemplifies how regulatory networks can convert noise in upstream regulator expression into extrinsic noise at the output of a target gene. Our results highlight the importance of the interplay of gene regulatory networks with population heterogeneity for understanding the origins of cellular diversity.
Origins of extrinsic variability in eukaryotic gene expression
NASA Astrophysics Data System (ADS)
Volfson, Dmitri; Marciniak, Jennifer; Blake, William J.; Ostroff, Natalie; Tsimring, Lev S.; Hasty, Jeff
2006-03-01
Variable gene expression within a clonal population of cells has been implicated in a number of important processes including mutation and evolution, determination of cell fates and the development of genetic disease. Recent studies have demonstrated that a significant component of expression variability arises from extrinsic factors thought to influence multiple genes in concert, yet the biological origins of this extrinsic variability have received little attention. Here we combine computational modeling with fluorescence data generated from multiple promoter-gene inserts in Saccharomyces cerevisiae to identify two major sources of extrinsic variability. One unavoidable source arising from the coupling of gene expression with population dynamics leads to a ubiquitous noise floor in expression variability. A second source which is modeled as originating from a common upstream transcription factor exemplifies how regulatory networks can convert noise in upstream regulator expression into extrinsic noise at the output of a target gene. Our results highlight the importance of the interplay of gene regulatory networks with population heterogeneity for understanding the origins of cellular diversity.
Zhou, Haibo; Liu, Junlai; Zhou, Changyang; Gao, Ni; Rao, Zhiping; Li, He; Hu, Xinde; Li, Changlin; Yao, Xuan; Shen, Xiaowen; Sun, Yidi; Wei, Yu; Liu, Fei; Ying, Wenqin; Zhang, Junming; Tang, Cheng; Zhang, Xu; Xu, Huatai; Shi, Linyu; Cheng, Leping; Huang, Pengyu; Yang, Hui
2018-03-01
Despite rapid progress in the genome-editing field, in vivo simultaneous overexpression of multiple genes remains challenging. We generated a transgenic mouse using an improved dCas9 system that enables simultaneous and precise in vivo transcriptional activation of multiple genes and long noncoding RNAs in the nervous system. As proof of concept, we were able to use targeted activation of endogenous neurogenic genes in these transgenic mice to directly and efficiently convert astrocytes into functional neurons in vivo. This system provides a flexible and rapid screening platform for studying complex gene networks and gain-of-function phenotypes in the mammalian brain.
Li, Cheng-Wei; Chen, Bor-Sen
2016-01-01
Epigenetic and microRNA (miRNA) regulation are associated with carcinogenesis and the development of cancer. By using the available omics data, including those from next-generation sequencing (NGS), genome-wide methylation profiling, candidate integrated genetic and epigenetic network (IGEN) analysis, and drug response genome-wide microarray analysis, we constructed an IGEN system based on three coupling regression models that characterize protein-protein interaction networks (PPINs), gene regulatory networks (GRNs), miRNA regulatory networks (MRNs), and epigenetic regulatory networks (ERNs). By applying system identification method and principal genome-wide network projection (PGNP) to IGEN analysis, we identified the core network biomarkers to investigate bladder carcinogenic mechanisms and design multiple drug combinations for treating bladder cancer with minimal side-effects. The progression of DNA repair and cell proliferation in stage 1 bladder cancer ultimately results not only in the derepression of miR-200a and miR-200b but also in the regulation of the TNF pathway to metastasis-related genes or proteins, cell proliferation, and DNA repair in stage 4 bladder cancer. We designed a multiple drug combination comprising gefitinib, estradiol, yohimbine, and fulvestrant for treating stage 1 bladder cancer with minimal side-effects, and another multiple drug combination comprising gefitinib, estradiol, chlorpromazine, and LY294002 for treating stage 4 bladder cancer with minimal side-effects.
Evolving phenotypic networks in silico.
François, Paul
2014-11-01
Evolved gene networks are constrained by natural selection. Their structures and functions are consequently far from being random, as exemplified by the multiple instances of parallel/convergent evolution. One can thus ask if features of actual gene networks can be recovered from evolutionary first principles. I review a method for in silico evolution of small models of gene networks aiming at performing predefined biological functions. I summarize the current implementation of the algorithm, insisting on the construction of a proper "fitness" function. I illustrate the approach on three examples: biochemical adaptation, ligand discrimination and vertebrate segmentation (somitogenesis). While the structure of the evolved networks is variable, dynamics of our evolved networks are usually constrained and present many similar features to actual gene networks, including properties that were not explicitly selected for. In silico evolution can thus be used to predict biological behaviours without a detailed knowledge of the mapping between genotype and phenotype. Copyright © 2014 The Author. Published by Elsevier Ltd.. All rights reserved.
Peng, Jiajie; Zhang, Xuanshuo; Hui, Weiwei; Lu, Junya; Li, Qianqian; Liu, Shuhui; Shang, Xuequn
2018-03-19
Gene Ontology (GO) is one of the most popular bioinformatics resources. In the past decade, Gene Ontology-based gene semantic similarity has been effectively used to model gene-to-gene interactions in multiple research areas. However, most existing semantic similarity approaches rely only on GO annotations and structure, or incorporate only local interactions in the co-functional network. This may lead to inaccurate GO-based similarity resulting from the incomplete GO topology structure and gene annotations. We present NETSIM2, a new network-based method that allows researchers to measure GO-based gene functional similarities by considering the global structure of the co-functional network with a random walk with restart (RWR)-based method, and by selecting the significant term pairs to reduce noise. Based on the Enzyme Commission (EC) number-based groups of yeast and Arabidopsis, evaluation tests show that NETSIM2 can enhance the accuracy of Gene Ontology-based gene functional similarity. Using NETSIM2 as an example, we found that the accuracy of semantic similarities can be significantly improved after effectively incorporating the global gene-to-gene interactions in the co-functional network, especially for species whose gene annotations in GO are far from complete.
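The random walk with restart (RWR) mentioned above can be illustrated with a minimal, generic sketch. This is not the NETSIM2 implementation; the function name, the dictionary-based network layout, and the `restart` value are assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Generic RWR on a weighted, undirected network (illustrative only).

    adj     : dict mapping node -> {neighbor: weight} (hypothetical layout)
    seeds   : set of nodes the walker restarts from
    restart : probability of jumping back to a seed at each step
    Returns a dict node -> steady-state visiting probability.
    """
    nodes = sorted(adj)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    strength = {n: sum(adj[n].values()) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if strength[u] == 0:
                # Isolated node: send its probability mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            for v, w in adj[u].items():
                # Move along each edge in proportion to its weight.
                nxt[v] += (1 - restart) * p[u] * w / strength[u]
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Toy chain network A - B - C - D with unit weights (made-up data).
toy = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
scores = random_walk_with_restart(toy, seeds={"A"})
```

Because the walker keeps returning to the seed, nodes closer to the seed in the network receive higher scores, which is how RWR captures global rather than purely local network structure.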
Pan- and core- network analysis of co-expression genes in a model plant
He, Fei; Maslov, Sergei
2016-12-16
Genome-wide gene expression experiments have been performed using the model plant Arabidopsis during the last decade. Some studies involved construction of coexpression networks, a popular technique used to identify groups of co-regulated genes, to infer unknown gene functions. One approach is to construct a single coexpression network by combining multiple expression datasets generated in different labs. We advocate a complementary approach in which we construct a large collection of 134 coexpression networks based on expression datasets reported in individual publications. To this end we reanalyzed public expression data. To describe this collection of networks we introduced concepts of ‘pan-network’ and ‘core-network’ representing union and intersection between a sizeable fraction of individual networks, respectively. Here, we showed that these two types of networks are different both in terms of their topology and biological function of interacting genes. For example, the modules of the pan-network are enriched in regulatory and signaling functions, while the modules of the core-network tend to include components of large macromolecular complexes such as ribosomes and photosynthetic machinery. Our analysis is aimed to help the plant research community to better explore the information contained within the existing vast collection of gene expression data in Arabidopsis.
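The pan-network and core-network idea described above can be sketched as edge counting across a collection of networks. This is a minimal illustration, not the authors' pipeline: the function name and both fraction thresholds are assumptions, since the abstract only says the two networks represent union and intersection over "a sizeable fraction" of the individual networks.

```python
from collections import Counter


def pan_and_core(networks, pan_frac, core_frac):
    """Split edges by how many individual networks contain them.

    networks  : list of edge lists, each edge a (gene_a, gene_b) pair
    pan_frac  : hypothetical minimum fraction of networks for the pan-network
    core_frac : hypothetical (higher) minimum fraction for the core-network
    Edges are treated as undirected; duplicates within a network count once.
    """
    counts = Counter()
    for edges in networks:
        counts.update({frozenset(e) for e in edges})
    n = len(networks)
    pan = {e for e, c in counts.items() if c / n >= pan_frac}
    core = {e for e, c in counts.items() if c / n >= core_frac}
    return pan, core


# Three made-up coexpression networks over genes a-d.
example = [
    [("a", "b"), ("b", "c")],
    [("b", "a"), ("c", "d")],
    [("a", "b")],
]
pan, core = pan_and_core(example, pan_frac=0.3, core_frac=1.0)
```

With a low `pan_frac` the pan-network approaches the plain union of all edges, and with `core_frac=1.0` the core-network is the strict intersection.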
Pan- and core- network analysis of co-expression genes in a model plant
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Fei; Maslov, Sergei
Genome-wide gene expression experiments have been performed using the model plant Arabidopsis during the last decade. Some studies involved construction of coexpression networks, a popular technique used to identify groups of co-regulated genes, to infer unknown gene functions. One approach is to construct a single coexpression network by combining multiple expression datasets generated in different labs. We advocate a complementary approach in which we construct a large collection of 134 coexpression networks based on expression datasets reported in individual publications. To this end we reanalyzed public expression data. To describe this collection of networks we introduced concepts of ‘pan-network’ and ‘core-network’ representing union and intersection between a sizeable fraction of individual networks, respectively. Here, we showed that these two types of networks are different both in terms of their topology and biological function of interacting genes. For example, the modules of the pan-network are enriched in regulatory and signaling functions, while the modules of the core-network tend to include components of large macromolecular complexes such as ribosomes and photosynthetic machinery. Our analysis is aimed to help the plant research community to better explore the information contained within the existing vast collection of gene expression data in Arabidopsis.
van Dam, Jesse C J; Schaap, Peter J; Martins dos Santos, Vitor A P; Suárez-Diez, María
2014-09-26
Different methods have been developed to infer regulatory networks from heterogeneous omics datasets and to construct co-expression networks. Each algorithm produces different networks and efforts have been devoted to automatically integrate them into consensus sets. However, each separate set has an intrinsic value that is diluted and partly lost when building a consensus network. Here we present a methodology to generate co-expression networks and, instead of a consensus network, we propose an integration framework where the different networks are kept and analysed with additional tools to efficiently combine the information extracted from each network. We developed a workflow to efficiently analyse information generated by different inference and prediction methods. Our methodology relies on providing the user the means to simultaneously visualise and analyse the coexisting networks generated by different algorithms, heterogeneous datasets, and a suite of analysis tools. As a showcase, we have analysed the gene co-expression networks of Mycobacterium tuberculosis generated using over 600 expression experiments. Regarding DNA damage repair, we identified SigC as a key control element, 12 new targets for LexA, an updated LexA binding motif, and a potential mismatch repair system. We expanded the DevR regulon with 27 genes while identifying 9 targets wrongly assigned to this regulon. We discovered 10 new genes linked to zinc uptake and a new regulatory mechanism for ZuR. The use of co-expression networks to perform system level analysis allows the development of custom made methodologies. As showcases we implemented a pipeline to integrate ChIP-seq data and another method to uncover multiple regulatory layers.
Our workflow is based on representing the multiple types of information as network representations and presenting these networks in a synchronous framework that allows their simultaneous visualization while keeping specific associations from the different networks. By simultaneously exploring these networks and metadata, we gained insights into regulatory mechanisms in M. tuberculosis that could not be obtained through the separate analysis of each data type.
A pathway-based network analysis of hypertension-related genes
NASA Astrophysics Data System (ADS)
Wang, Huan; Hu, Jing-Bo; Xu, Chuan-Yun; Zhang, De-Hai; Yan, Qian; Xu, Ming; Cao, Ke-Fei; Zhang, Xu-Sheng
2016-02-01
The complex network approach has become an effective way to describe interrelationships among large amounts of biological data, which is especially useful in finding core functions and global behavior of biological systems. Hypertension is a complex disease with many causes, including genetic, physiological, psychological and even social factors. In this paper, based on the information of biological pathways, we construct a network model of hypertension-related genes of the salt-sensitive rat to explore the interrelationship between genes. Statistical and topological characteristics show that the network has the small-world but not scale-free property, and exhibits a modular structure, revealing compact and complex connections among these genes. Using a threshold of integrated centrality larger than 0.71, seven key hub genes are found: Jun, Rps6kb1, Cycs, Creb312, Cdk4, Actg1 and RT1-Da. These genes should play an important role in hypertension, suggesting that the treatment of hypertension should focus on the combination of drugs on multiple genes.
Use of Network Inference to Elucidate Common and Chemical-specific Effects on Steroidogenesis
Microarray data is a key source for modeling gene regulatory interactions. Regulatory network models based on multiple datasets are potentially more robust and can provide greater confidence. In this study, we used network modeling on microarray data generated by exposing the fat...
Displayed Trees Do Not Determine Distinguishability Under the Network Multispecies Coalescent
Zhu, Sha; Degnan, James H.
2017-01-01
Recent work in estimating species relationships from gene trees has included inferring networks assuming that past hybridization has occurred between species. Probabilistic models using the multispecies coalescent can be used in this framework for likelihood-based inference of both network topologies and parameters, including branch lengths and hybridization parameters. A difficulty for such methods is that it is not always clear whether, or to what extent, networks are identifiable, that is, whether there could be two distinct networks that lead to the same distribution of gene trees. For cases in which incomplete lineage sorting occurs in addition to hybridization, we demonstrate a new representation of the species network likelihood that expresses the probability distribution of the gene tree topologies as a linear combination of gene tree distributions given a set of species trees. This representation makes it clear that in some cases in which two distinct networks give the same distribution of gene trees when sampling one allele per species, the two networks can be distinguished theoretically when multiple individuals are sampled per species. This result means that network identifiability is not only a function of the trees displayed by the networks but also depends on allele sampling within species. We additionally give an example in which two networks that display exactly the same trees can be distinguished from their gene trees even when there is only one lineage sampled per species. PMID:27780899
Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo
2014-06-01
In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic studies focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performances of gene prioritization methods. 
Moreover the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both local and global learning strategies, able to exploit the overall topology of the network. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
Shi, Xingjie; Zhao, Qing; Huang, Jian; Xie, Yang; Ma, Shuangge
2015-01-01
Motivation: Both gene expression levels (GEs) and copy number alterations (CNAs) have important biological implications. GEs are partly regulated by CNAs, and much effort has been devoted to understanding their relations. The regulation analysis is challenging with one gene expression possibly regulated by multiple CNAs and one CNA potentially regulating the expressions of multiple genes. The correlations among GEs and among CNAs make the analysis even more complicated. The existing methods have limitations and cannot comprehensively describe the regulation. Results: A sparse double Laplacian shrinkage method is developed. It jointly models the effects of multiple CNAs on multiple GEs. Penalization is adopted to achieve sparsity and identify the regulation relationships. Network adjacency is computed to describe the interconnections among GEs and among CNAs. Two Laplacian shrinkage penalties are imposed to accommodate the network adjacency measures. Simulation shows that the proposed method outperforms the competing alternatives with more accurate marker identification. The Cancer Genome Atlas data are analysed to further demonstrate advantages of the proposed method. Availability and implementation: R code is available at http://works.bepress.com/shuangge/49/ Contact: shuangge.ma@yale.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26342102
Fyn-Dependent Gene Networks in Acute Ethanol Sensitivity
Farris, Sean P.; Miles, Michael F.
2013-01-01
Studies in humans and animal models document that acute behavioral responses to ethanol are a predisposing factor for the risk of long-term drinking behavior. Prior microarray data from our laboratory document strain- and brain region-specific variation in gene expression profile responses to acute ethanol that may be underlying regulators of ethanol behavioral phenotypes. The non-receptor tyrosine kinase Fyn has previously been mechanistically implicated in the sedative-hypnotic response to acute ethanol. To further understand how Fyn may modulate ethanol behaviors, we used whole-genome expression profiling. We characterized basal and acute ethanol-evoked (3 g/kg) gene expression patterns in nucleus accumbens (NAC), prefrontal cortex (PFC), and ventral midbrain (VMB) of control and Fyn knockout mice. Bioinformatics analysis identified a set of Fyn-related gene networks differently regulated by acute ethanol across the three brain regions. In particular, our analysis suggested a coordinate basal decrease in myelin-associated gene expression within NAC and PFC as an underlying factor in sensitivity of Fyn null animals to ethanol sedation. An in silico analysis across the BXD recombinant inbred (RI) strains of mice identified a significant correlation between Fyn expression and a previously published ethanol loss-of-righting-reflex (LORR) phenotype. By combining PFC gene expression correlates to Fyn and LORR across multiple genomic datasets, we identified robust Fyn-centric gene networks related to LORR. Our results thus suggest that multiple system-wide changes exist within specific brain regions of Fyn knockout mice, and that distinct Fyn-dependent expression networks within PFC may be important determinants of the LORR due to acute ethanol. These results add to the interpretation of acute ethanol behavioral sensitivity in Fyn kinase null animals, and identify Fyn-centric gene networks influencing variance in ethanol LORR.
Such networks may also inform future design of pharmacotherapies for the treatment and prevention of alcohol use disorders. PMID:24312422
Selection Shapes Transcriptional Logic and Regulatory Specialization in Genetic Networks.
Fogelmark, Karl; Peterson, Carsten; Troein, Carl
2016-01-01
Living organisms need to regulate their gene expression in response to environmental signals and internal cues. This is a computational task where genes act as logic gates that connect to form transcriptional networks, which are shaped at all scales by evolution. Large-scale mutations such as gene duplications and deletions add and remove network components, whereas smaller mutations alter the connections between them. Selection determines what mutations are accepted, but its importance for shaping the resulting networks has been debated. To investigate the effects of selection in the shaping of transcriptional networks, we derive transcriptional logic from a combinatorially powerful yet tractable model of the binding between DNA and transcription factors. By evolving the resulting networks based on their ability to function as either a simple decision system or a circadian clock, we obtain information on the regulation and logic rules encoded in functional transcriptional networks. Comparisons are made between networks evolved for different functions, as well as with structurally equivalent but non-functional (neutrally evolved) networks, and predictions are validated against the transcriptional network of E. coli. We find that the logic rules governing gene expression depend on the function performed by the network. Unlike the decision systems, the circadian clocks show strong cooperative binding and negative regulation, which achieves tight temporal control of gene expression. Furthermore, we find that transcription factors act preferentially as either activators or repressors, both when binding multiple sites for a single target gene and globally in the transcriptional networks. This separation into positive and negative regulators requires gene duplications, which highlights the interplay between mutation and selection in shaping the transcriptional networks.
Özgür, Arzucan; Hur, Junguk; He, Yongqun
2016-01-01
The Interaction Network Ontology (INO) logically represents biological interactions, pathways, and networks. INO has been demonstrated to be valuable in providing a set of structured ontological terms and associated keywords to support literature mining of gene-gene interactions from biomedical literature. However, previous work using INO focused on single keyword matching, while many interactions are represented with two or more interaction keywords used in combination. This paper reports our extension of INO to include combinatory patterns of two or more literature mining keywords co-existing in one sentence to represent specific INO interaction classes. Such keyword combinations and related INO interaction type information could be automatically obtained via SPARQL queries, formatted in Excel format, and used in an INO-supported SciMiner, an in-house literature mining program. We studied the gene interaction sentences from the commonly used benchmark Learning Logic in Language (LLL) dataset and one internally generated vaccine-related dataset to identify and analyze interaction types containing multiple keywords. Patterns obtained from the dependency parse trees of the sentences were used to identify the interaction keywords that are related to each other and collectively represent an interaction type. The INO ontology currently has 575 terms including 202 terms under the interaction branch. The relations between the INO interaction types and associated keywords are represented using the INO annotation relations: 'has literature mining keywords' and 'has keyword dependency pattern'. The keyword dependency patterns were generated by running the Stanford Parser to obtain dependency relation types. Out of the 107 interactions in the LLL dataset represented with two-keyword interaction types, 86 were identified by using the direct dependency relations. The LLL dataset contained 34 gene regulation interaction types, each of which was associated with multiple keywords.
A hierarchical display of these 34 interaction types and their ancestor terms in INO resulted in the identification of specific gene-gene interaction patterns from the LLL dataset. The phenomenon of having multi-keyword interaction types was also frequently observed in the vaccine dataset. By modeling and representing multiple textual keywords for interaction types, the extended INO enabled the identification of complex biological gene-gene interactions represented with multiple keywords.
Kim, Yongsoo; Kim, Taek-Kyun; Kim, Yungu; Yoo, Jiho; You, Sungyong; Lee, Inyoul; Carlson, George; Hood, Leroy; Choi, Seungjin; Hwang, Daehee
2011-01-01
Motivation: Systems biology attempts to describe complex systems behaviors in terms of dynamic operations of biological networks. However, there is a lack of tools that can effectively decode complex network dynamics over multiple conditions. Results: We present principal network analysis (PNA) that can automatically capture major dynamic activation patterns over multiple conditions and then generate protein and metabolic subnetworks for the captured patterns. We first demonstrated the utility of this method by applying it to a synthetic dataset. The results showed that PNA correctly captured the subnetworks representing dynamics in the data. We further applied PNA to two time-course gene expression profiles collected from (i) MCF7 cells after treatments of HRG at multiple doses and (ii) brain samples of four strains of mice infected with two prion strains. The resulting subnetworks and their interactions revealed network dynamics associated with HRG dose-dependent regulation of cell proliferation and differentiation and early PrPSc accumulation during prion infection. Availability: The web-based software is available at: http://sbm.postech.ac.kr/pna. Contact: dhhwang@postech.ac.kr; seungjin@postech.ac.kr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21193522
PyPanda: a Python package for gene regulatory network reconstruction
van IJzendoorn, David G.P.; Glass, Kimberly; Quackenbush, John; Kuijjer, Marieke L.
2016-01-01
Summary: PANDA (Passing Attributes between Networks for Data Assimilation) is a gene regulatory network inference method that uses message-passing to integrate multiple sources of ‘omics data. PANDA was originally coded in C++. In this application note we describe PyPanda, the Python version of PANDA. PyPanda runs considerably faster than the C++ version and includes additional features for network analysis. Availability and implementation: The open source PyPanda Python package is freely available at http://github.com/davidvi/pypanda. Contact: mkuijjer@jimmy.harvard.edu or d.g.p.van_ijzendoorn@lumc.nl PMID:27402905
PyPanda: a Python package for gene regulatory network reconstruction.
van IJzendoorn, David G P; Glass, Kimberly; Quackenbush, John; Kuijjer, Marieke L
2016-11-01
PANDA (Passing Attributes between Networks for Data Assimilation) is a gene regulatory network inference method that uses message-passing to integrate multiple sources of 'omics data. PANDA was originally coded in C++. In this application note we describe PyPanda, the Python version of PANDA. PyPanda runs considerably faster than the C++ version and includes additional features for network analysis. The open source PyPanda Python package is freely available at http://github.com/davidvi/pypanda. Contact: mkuijjer@jimmy.harvard.edu or d.g.p.van_ijzendoorn@lumc.nl. © The Author 2016. Published by Oxford University Press.
A Scalable Approach for Discovering Conserved Active Subnetworks across Species
Verfaillie, Catherine M.; Hu, Wei-Shou; Myers, Chad L.
2010-01-01
Overlaying differential changes in gene expression on protein interaction networks has proven to be a useful approach to interpreting the cell's dynamic response to a changing environment. Despite successes in finding active subnetworks in the context of a single species, the idea of overlaying lists of differentially expressed genes on networks has not yet been extended to support the analysis of multiple species' interaction networks. To address this problem, we designed a scalable, cross-species network search algorithm, neXus (Network - cross(X)-species - Search), that discovers conserved, active subnetworks based on parallel differential expression studies in multiple species. Our approach leverages functional linkage networks, which provide more comprehensive coverage of functional relationships than physical interaction networks by combining heterogeneous types of genomic data. We applied our cross-species approach to identify conserved modules that are differentially active in stem cells relative to differentiated cells based on parallel gene expression studies and functional linkage networks from mouse and human. We find hundreds of conserved active subnetworks enriched for stem cell-associated functions such as cell cycle, DNA repair, and chromatin modification processes. Using a variation of this approach, we also find a number of species-specific networks, which likely reflect mechanisms of stem cell function that have diverged between mouse and human. We assess the statistical significance of the subnetworks by comparing them with subnetworks discovered on random permutations of the differential expression data. We also describe several case examples that illustrate the utility of comparative analysis of active subnetworks. PMID:21170309
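The permutation-based significance assessment mentioned in the abstract above can be illustrated with a minimal sketch. This is not the neXus implementation: the abstract describes rerunning the subnetwork search on permuted data, whereas the simplified version below only rescores one fixed gene set after shuffling the differential expression scores across genes. The function names, the mean-score statistic, and the default of 1000 permutations are all assumptions made for illustration.

```python
import random


def mean_score(genes, scores):
    """Average differential expression score of a gene set (assumed statistic)."""
    return sum(scores[g] for g in genes) / len(genes)


def permutation_p_value(subnetwork, scores, n_permutations=1000, seed=0):
    """Empirical p-value for a subnetwork's mean score.

    Scores are shuffled across gene labels; the p-value is the share of
    permutations in which the same gene set scores at least as high as
    it does on the real data (with the usual +1 correction).
    """
    rng = random.Random(seed)
    gene_names = list(scores)
    values = [scores[g] for g in gene_names]
    observed = mean_score(subnetwork, scores)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(values)
        permuted = dict(zip(gene_names, values))
        if mean_score(subnetwork, permuted) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)
```

A small p-value indicates that the gene set scores higher than most label-shuffled versions of the data; a full analysis along the lines of the abstract would repeat the search itself on every permuted dataset.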
Kreula, Sanna M; Kaewphan, Suwisa; Ginter, Filip; Jones, Patrik R
2018-01-01
The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from 'reading the literature'. The utility of text-mining for well-studied species is obvious though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already 'known', and (2) demanding multiple independent evidences for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to (i) discover novel candidate associations between different genes or proteins in the network, and (ii) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource.
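The two filter rules described in the abstract above can be sketched as follows. This is an illustrative reading, not the authors' algorithm: the function name, the tuple layout of the evidence records, the source labels, and the threshold of two independent sources are all hypothetical.

```python
from collections import defaultdict


def filter_novel_candidates(evidence, known_pairs, min_sources=2):
    """Keep gene pairs that are not already known and have enough support.

    evidence:    iterable of (gene_a, gene_b, source) tuples (assumed layout)
    known_pairs: iterable of (gene_a, gene_b) pairs already in the literature
    min_sources: number of distinct evidence sources required (assumed: 2)
    """
    known = {frozenset(pair) for pair in known_pairs}
    sources = defaultdict(set)
    for gene_a, gene_b, source in evidence:
        pair = frozenset((gene_a, gene_b))  # undirected association
        if pair in known:
            continue  # rule 1: ignore established relationships
        sources[pair].add(source)
    # rule 2: demand multiple independent evidences
    return {pair: found for pair, found in sources.items()
            if len(found) >= min_sources}
```

Representing each pair as a frozenset makes the association undirected, so the same pair reported in either order by two sources counts as two independent evidences.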
Network motifs – recurring circuitry components in biological systems
Environmental perturbations, elicited by chemicals, dietary supplements, and drugs, can alter the dynamics of the molecular circuits and networks operating in cells, leading to multiple disease endpoints. Multi-component signal transduction pathways and gene regulatory circuits u...
Cellular and synaptic network defects in autism
Peça, João; Feng, Guoping
2012-01-01
Many candidate genes are now thought to confer susceptibility to autism spectrum disorder (ASD). Here we review four interrelated complexes, each composed of multiple families of genes that functionally coalesce on common cellular pathways. We illustrate a common thread in the organization of glutamatergic synapses and suggest a link between genes involved in Tuberous Sclerosis Complex, Fragile X syndrome, Angelman syndrome and several synaptic ASD candidate genes. When viewed in this context, progress in deciphering the molecular architecture of cellular protein-protein interactions together with the unraveling of synaptic dysfunction in neural networks may prove pivotal to advancing our understanding of ASDs. PMID:22440525
Using genetic markers to orient the edges in quantitative trait networks: the NEO software.
Aten, Jason E; Fuller, Tova F; Lusis, Aldons J; Horvath, Steve
2008-04-15
Systems genetic studies have been used to identify genetic loci that affect transcript abundances and clinical traits such as body weight. The pairwise correlations between gene expression traits and/or clinical traits can be used to define undirected trait networks. Several authors have argued that genetic markers (e.g., expression quantitative trait loci, eQTLs) can serve as causal anchors for orienting the edges of a trait network. The availability of hundreds of thousands of genetic markers poses new challenges: how to relate (anchor) traits to multiple genetic markers, how to score the genetic evidence in favor of an edge orientation, and how to weigh the information from multiple markers. We develop and implement Network Edge Orienting (NEO) methods and software that address the challenges of inferring unconfounded and directed gene networks from microarray-derived gene expression data by integrating mRNA levels with genetic marker data and Structural Equation Model (SEM) comparisons. The NEO software implements several manual and automatic methods for incorporating genetic information to anchor traits. The networks are oriented by considering each edge separately, thus reducing error propagation. To summarize the genetic evidence in favor of a given edge orientation, we propose Local SEM-based Edge Orienting (LEO) scores that compare the fit of several competing causal graphs. SEM fitting indices allow the user to assess local and overall model fit. The NEO software allows the user to carry out a robustness analysis with regard to genetic marker selection. We demonstrate the utility of NEO by recovering known causal relationships in the sterol homeostasis pathway using liver gene expression data from an F2 mouse cross. Further, we use NEO to study the relationship between a disease gene and a biologically important gene co-expression module in liver tissue.
The NEO software can be used to orient the edges of gene co-expression networks or quantitative trait networks if the edges can be anchored to genetic marker data. R software tutorials, data, and supplementary material can be downloaded from: http://www.genetics.ucla.edu/labs/horvath/aten/NEO.
Mei, Suyu
2018-05-04
Bacterial protein-protein interaction (PPI) networks are important for revealing the machinery of signal transduction and drug resistance within bacterial cells. The database STRING has collected a large number of bacterial pathogen PPI networks, but most of the data are of low quality without being experimentally or computationally validated, thus restricting its further biomedical applications. We exploit the experimental data via four solutions to enhance the quality of M. tuberculosis H37Rv (MTB) PPI networks in STRING. Computational results show that the experimental data derived jointly by two-hybrid and copurification approaches are the most reliable to train an L2-regularized logistic regression model for MTB PPI network validation. On the basis of the validated MTB PPI networks, we further study three problems via a breadth-first graph search algorithm: (1) discovery of MTB drug-resistance pathways through searching for the paths between known drug-target genes and drug-resistance genes, (2) choosing potential cotarget genes via searching for the critical genes located on multiple pathways, and (3) choosing essential drug-target genes via analysis of network degree distribution. In addition, we further combine the validated MTB PPI networks with human PPI networks to analyze the potential pharmacological risks of known and candidate drug-target genes from the point of view of system pharmacology. The evidence from protein structure alignment demonstrates that the drugs that act on MTB target genes could also adversely act on human signaling pathways.
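Problem (1) in the abstract above, searching for paths between a drug-target gene and drug-resistance genes, can be illustrated with a standard breadth-first search. This is a generic sketch rather than the author's code: the adjacency-dictionary representation and the function name are assumptions, and the node names in any usage would be placeholders.

```python
from collections import deque


def shortest_path(graph, source, targets):
    """Breadth-first search from `source` to the nearest node in `targets`.

    graph:   dict mapping each gene to an iterable of interacting genes
    targets: collection of genes to reach (e.g. drug-resistance genes)
    Returns the path as a list of genes, or None if no target is reachable.
    """
    targets = set(targets)
    parents = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node in targets:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None
```

Because breadth-first search explores the network level by level, the first target reached lies on a path with the fewest interactions. Genes that recur on many such paths would be natural candidates for the cotarget analysis in problem (2).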
Microbial genotype-phenotype mapping by class association rule mining.
Tamura, Makio; D'haeseleer, Patrik
2008-07-01
Microbial phenotypes are typically due to the concerted action of multiple gene functions, yet the presence of each gene may have only a weak correlation with the observed phenotype. Hence, it may be more appropriate to examine co-occurrence between sets of genes and a phenotype (multiple-to-one) instead of pairwise relations between a single gene and the phenotype. Here, we propose an efficient class association rule mining algorithm, netCAR, in order to extract sets of COGs (clusters of orthologous groups of proteins) associated with a phenotype from COG phylogenetic profiles and a phenotype profile. netCAR takes into account the phylogenetic co-occurrence graph between COGs to restrict hypothesis space, and uses mutual information to evaluate the biconditional relation. We examined the mining capability of pairwise and multiple-to-one association by using netCAR to extract COGs relevant to six microbial phenotypes (aerobic, anaerobic, facultative, endospore, motility and Gram negative) from 11,969 unique COG profiles across 155 prokaryotic organisms. With the same level of false discovery rate, multiple-to-one association can extract about 10 times more relevant COGs than one-to-one association. We also reveal various topologies of association networks among COGs (modules) from extracted multiple-to-one correlation rules relevant to the six phenotypes, including a well-connected network for motility, a star-shaped network for aerobic and intermediate topologies for the other phenotypes. netCAR outperforms a standard CAR mining algorithm, CARapriori, while requiring several orders of magnitude less computational time for extracting 3-COG sets. Source code of the Java implementation is available as Supplementary Material at the Bioinformatics online website, or upon request to the author. Supplementary data are available at Bioinformatics online.
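The idea of scoring a multiple-to-one association with mutual information can be shown with a small sketch. It is not the netCAR algorithm (which is implemented in Java and also uses a co-occurrence graph to restrict the search); it only computes the mutual information between a phenotype profile and the joint presence of a set of COGs. The function names and the rule that a set counts as present only when every member is present are assumptions for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete profiles."""
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi


def set_presence(profiles, cogs):
    """Per-organism indicator: 1 if every COG in `cogs` is present, else 0."""
    n_organisms = len(profiles[cogs[0]])
    return [int(all(profiles[c][i] for c in cogs)) for i in range(n_organisms)]
```

With hypothetical profiles such as COG1 = [1, 1, 1, 0], COG2 = [1, 1, 0, 1] and phenotype = [1, 1, 0, 0] across four organisms, neither COG alone matches the phenotype, but their joint presence does, so the set scores higher than either member.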
Selection Shapes Transcriptional Logic and Regulatory Specialization in Genetic Networks
Fogelmark, Karl; Peterson, Carsten; Troein, Carl
2016-01-01
Background Living organisms need to regulate their gene expression in response to environmental signals and internal cues. This is a computational task where genes act as logic gates that connect to form transcriptional networks, which are shaped at all scales by evolution. Large-scale mutations such as gene duplications and deletions add and remove network components, whereas smaller mutations alter the connections between them. Selection determines what mutations are accepted, but its importance for shaping the resulting networks has been debated. Methodology To investigate the effects of selection in the shaping of transcriptional networks, we derive transcriptional logic from a combinatorially powerful yet tractable model of the binding between DNA and transcription factors. By evolving the resulting networks based on their ability to function as either a simple decision system or a circadian clock, we obtain information on the regulation and logic rules encoded in functional transcriptional networks. Comparisons are made between networks evolved for different functions, as well as with structurally equivalent but non-functional (neutrally evolved) networks, and predictions are validated against the transcriptional network of E. coli. Principal Findings We find that the logic rules governing gene expression depend on the function performed by the network. Unlike the decision systems, the circadian clocks show strong cooperative binding and negative regulation, which achieves tight temporal control of gene expression. Furthermore, we find that transcription factors act preferentially as either activators or repressors, both when binding multiple sites for a single target gene and globally in the transcriptional networks. This separation into positive and negative regulators requires gene duplications, which highlights the interplay between mutation and selection in shaping the transcriptional networks. PMID:26927540
Cell cycle gene expression networks discovered using systems biology: Significance in carcinogenesis
Scott, RE; Ghule, PN; Stein, JL; Stein, GS
2015-01-01
The early stages of carcinogenesis are linked to defects in the cell cycle. A series of cell cycle checkpoints are involved in this process. The G1/S checkpoint that serves to integrate the control of cell proliferation and differentiation is linked to carcinogenesis and the mitotic spindle checkpoint with the development of chromosomal instability. This paper presents the outcome of systems biology studies designed to evaluate if networks of covariate cell cycle gene transcripts exist in proliferative mammalian tissues including mice, rats and humans. The GeneNetwork website that contains numerous gene expression datasets from different species, sexes and tissues represents the foundational resource for these studies (www.genenetwork.org). In addition, WebGestalt, a gene ontology tool, facilitated the identification of expression networks of genes that co-vary with key cell cycle targets, especially Cdc20 and Plk1 (www.bioinfo.vanderbilt.edu/webgestalt). Cell cycle expression networks of such covariate mRNAs exist in multiple proliferative tissues including liver, lung, pituitary, adipose and lymphoid tissues among others but not in brain or retina that have low proliferative potential. Sixty-three covariate cell cycle gene transcripts (mRNAs) compose the average cell cycle network with p = e−13 to e−36. Cell cycle expression networks show species, sex and tissue variability and they are enriched in mRNA transcripts associated with mitosis many of which are associated with chromosomal instability. PMID:25808367
2014-01-01
Background Network inference of gene expression data is an important challenge in systems biology. Novel algorithms may provide more detailed gene regulatory networks (GRN) for complex, chronic inflammatory diseases such as rheumatoid arthritis (RA), in which activated synovial fibroblasts (SFBs) play a major role. Since the detailed mechanisms underlying this activation are still unclear, simultaneous investigation of multi-stimuli activation of SFBs offers the possibility to elucidate the regulatory effects of multiple mediators and to gain new insights into disease pathogenesis. Methods A GRN was therefore inferred from RA-SFBs treated with 4 different stimuli (IL-1β, TNF-α, TGF-β, and PDGF-D). Data from time series microarray experiments (0, 1, 2, 4, 12 h; Affymetrix HG-U133 Plus 2.0) were batch-corrected applying ‘ComBat’, analyzed for differentially expressed genes over time with ‘Limma’, and used for the inference of a robust GRN with NetGenerator V2.0, a heuristic ordinary differential equation-based method with soft integration of prior knowledge. Results Using all genes differentially expressed over time in RA-SFBs for any stimulus, and selecting the genes belonging to the most significant gene ontology (GO) term, i.e., ‘cartilage development’, a dynamic, robust, moderately complex multi-stimuli GRN was generated with 24 genes and 57 edges in total, 31 of which were gene-to-gene edges. Prior literature-based knowledge derived from Pathway Studio or manual searches was reflected in the final network by 25/57 confirmed edges (44%). The model contained known network motifs crucial for dynamic cellular behavior, e.g., cross-talk among pathways, positive feed-back loops, and positive feed-forward motifs (including suppression of the transcriptional repressor OSR2 by all 4 stimuli). Conclusion A multi-stimuli GRN highly concordant with literature data was successfully generated by network inference from the gene expression of stimulated RA-SFBs.
The GRN showed high reliability, since 10 predicted edges were independently validated by literature findings post network inference. The selected GO term ‘cartilage development’ contained a number of differentiation markers, growth factors, and transcription factors with potential relevance for RA. Finally, the model provided new insight into the response of RA-SFBs to multiple stimuli implicated in the pathogenesis of RA, in particular to the ‘novel’ potent growth factor PDGF-D. PMID:24989895
Modena, Brian D; Bleecker, Eugene R; Busse, William W; Erzurum, Serpil C; Gaston, Benjamin M; Jarjour, Nizar N; Meyers, Deborah A; Milosevic, Jadranka; Tedrow, John R; Wu, Wei; Kaminski, Naftali; Wenzel, Sally E
2017-06-01
Severe asthma (SA) is a heterogeneous disease with multiple molecular mechanisms. Gene expression studies of bronchial epithelial cells in individuals with asthma have provided biological insight and underscored possible mechanistic differences between individuals. Identify networks of genes reflective of underlying biological processes that define SA. Airway epithelial cell gene expression from 155 subjects with asthma and healthy control subjects in the Severe Asthma Research Program was analyzed by weighted gene coexpression network analysis to identify gene networks and profiles associated with SA and its specific characteristics (i.e., pulmonary function tests, quality of life scores, urgent healthcare use, and steroid use), which potentially identified underlying biological processes. A linear model analysis confirmed these findings while adjusting for potential confounders. Weighted gene coexpression network analysis constructed 64 gene network modules, including modules corresponding to T1 and T2 inflammation, neuronal function, cilia, epithelial growth, and repair mechanisms. Although no network selectively identified SA, genes in modules linked to epithelial growth and repair and neuronal function were markedly decreased in SA. Several hub genes of the epithelial growth and repair module were found located at the 17q12-21 locus, near a well-known asthma susceptibility locus. T2 genes increased with severity in those treated with corticosteroids but were also elevated in untreated, mild-to-moderate disease compared with healthy control subjects. T1 inflammation, especially when associated with increased T2 gene expression, was elevated in a subgroup of younger patients with SA. In this hypothesis-generating analysis, gene expression networks in relation to asthma severity provided potentially new insight into biological mechanisms associated with the development of SA and its phenotypes.
Modena, Brian D.; Bleecker, Eugene R.; Busse, William W.; Erzurum, Serpil C.; Gaston, Benjamin M.; Jarjour, Nizar N.; Meyers, Deborah A.; Milosevic, Jadranka; Tedrow, John R.; Wu, Wei; Kaminski, Naftali
2017-01-01
Rationale: Severe asthma (SA) is a heterogeneous disease with multiple molecular mechanisms. Gene expression studies of bronchial epithelial cells in individuals with asthma have provided biological insight and underscored possible mechanistic differences between individuals. Objectives: Identify networks of genes reflective of underlying biological processes that define SA. Methods: Airway epithelial cell gene expression from 155 subjects with asthma and healthy control subjects in the Severe Asthma Research Program was analyzed by weighted gene coexpression network analysis to identify gene networks and profiles associated with SA and its specific characteristics (i.e., pulmonary function tests, quality of life scores, urgent healthcare use, and steroid use), which potentially identified underlying biological processes. A linear model analysis confirmed these findings while adjusting for potential confounders. Measurements and Main Results: Weighted gene coexpression network analysis constructed 64 gene network modules, including modules corresponding to T1 and T2 inflammation, neuronal function, cilia, epithelial growth, and repair mechanisms. Although no network selectively identified SA, genes in modules linked to epithelial growth and repair and neuronal function were markedly decreased in SA. Several hub genes of the epithelial growth and repair module were found located at the 17q12–21 locus, near a well-known asthma susceptibility locus. T2 genes increased with severity in those treated with corticosteroids but were also elevated in untreated, mild-to-moderate disease compared with healthy control subjects. T1 inflammation, especially when associated with increased T2 gene expression, was elevated in a subgroup of younger patients with SA. 
Conclusions: In this hypothesis-generating analysis, gene expression networks in relation to asthma severity provided potentially new insight into biological mechanisms associated with the development of SA and its phenotypes. PMID:27984699
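The first step of weighted gene coexpression network analysis, as it is commonly described, turns pairwise correlations into a weighted adjacency by raising the absolute correlation to a soft-thresholding power. The sketch below shows only that step, in plain Python. It is not the pipeline used in the study above: the unsigned form |r|^beta, the default power of 6, and the function names are assumptions, and module detection is omitted entirely.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors.

    Assumes neither vector is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta.

    expression: dict mapping gene name to its expression across samples
    beta:       soft-thresholding power (assumed default, normally tuned)
    """
    genes = list(expression)
    return {
        (g, h): abs(pearson(expression[g], expression[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }
```

Raising correlations to a power keeps strong coexpression close to 1 while pushing weak correlations toward 0, which is what lets modules be defined without a hard cutoff.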
Analysis of the dynamic co-expression network of heart regeneration in the zebrafish
Rodius, Sophie; Androsova, Ganna; Götz, Lou; Liechti, Robin; Crespo, Isaac; Merz, Susanne; Nazarov, Petr V.; de Klein, Niek; Jeanty, Céline; González-Rosa, Juan M.; Muller, Arnaud; Bernardin, Francois; Niclou, Simone P.; Vallar, Laurent; Mercader, Nadia; Ibberson, Mark; Xenarios, Ioannis; Azuaje, Francisco
2016-01-01
The zebrafish has the capacity to regenerate its heart after severe injury. While the function of a few genes during this process has been studied, we are far from fully understanding how genes interact to coordinate heart regeneration. To enable systematic insights into this phenomenon, we generated and integrated a dynamic co-expression network of heart regeneration in the zebrafish and linked systems-level properties to the underlying molecular events. Across multiple post-injury time points, the network displays topological attributes of biological relevance. We show that regeneration steps are mediated by modules of transcriptionally coordinated genes, and by genes acting as network hubs. We also established direct associations between hubs and validated drivers of heart regeneration with murine and human orthologs. The resulting models and interactive analysis tools are available at http://infused.vital-it.ch. Using a worked example, we demonstrate the usefulness of this unique open resource for hypothesis generation and in silico screening for genes involved in heart regeneration. PMID:27241320
Analysis of the dynamic co-expression network of heart regeneration in the zebrafish
NASA Astrophysics Data System (ADS)
Rodius, Sophie; Androsova, Ganna; Götz, Lou; Liechti, Robin; Crespo, Isaac; Merz, Susanne; Nazarov, Petr V.; de Klein, Niek; Jeanty, Céline; González-Rosa, Juan M.; Muller, Arnaud; Bernardin, Francois; Niclou, Simone P.; Vallar, Laurent; Mercader, Nadia; Ibberson, Mark; Xenarios, Ioannis; Azuaje, Francisco
2016-05-01
The zebrafish has the capacity to regenerate its heart after severe injury. While the function of a few genes during this process has been studied, we are far from fully understanding how genes interact to coordinate heart regeneration. To enable systematic insights into this phenomenon, we generated and integrated a dynamic co-expression network of heart regeneration in the zebrafish and linked systems-level properties to the underlying molecular events. Across multiple post-injury time points, the network displays topological attributes of biological relevance. We show that regeneration steps are mediated by modules of transcriptionally coordinated genes, and by genes acting as network hubs. We also established direct associations between hubs and validated drivers of heart regeneration with murine and human orthologs. The resulting models and interactive analysis tools are available at http://infused.vital-it.ch. Using a worked example, we demonstrate the usefulness of this unique open resource for hypothesis generation and in silico screening for genes involved in heart regeneration.
Single and multiple phenotype QTL analyses of downy mildew resistance in interspecific grapevines.
Divilov, Konstantin; Barba, Paola; Cadle-Davidson, Lance; Reisch, Bruce I
2018-05-01
Downy mildew resistance across days post-inoculation, experiments, and years in two interspecific grapevine F1 families was investigated using linear mixed models and Bayesian networks, and five new QTL were identified. Breeding grapevines for downy mildew disease resistance has traditionally relied on qualitative gene resistance, which can be overcome by pathogen evolution. Analyzing two interspecific F1 families, both having ancestry derived from Vitis vinifera and wild North American Vitis species, across 2 years and multiple experiments, we found multiple loci associated with downy mildew sporulation and hypersensitive response in both families using a single phenotype model. The loci explained between 7 and 17% of the variance for either phenotype, suggesting a complex genetic architecture for these traits in the two families studied. For two loci, we used RNA-Seq to detect differentially transcribed genes and found that the candidate genes at these loci were likely not NBS-LRR genes. Additionally, using a multiple phenotype Bayesian network analysis, we found effects between the leaf trichome density, hypersensitive response, and sporulation phenotypes. Moderate-high heritabilities were found for all three phenotypes, suggesting that selection for downy mildew resistance is an achievable goal by breeding for either physical- or non-physical-based resistance mechanisms, with the combination of the two possibly providing durable resistance.
Kreula, Sanna M.; Kaewphan, Suwisa; Ginter, Filip
2018-01-01
The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from ‘reading the literature’. The utility of text-mining for well-studied species is obvious though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already ‘known’, and (2) demanding multiple independent evidences for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to (i) discover novel candidate associations between different genes or proteins in the network, and (ii) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource. PMID:29844966
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chaiboonchoe, Amphun; Ghamsari, Lila; Dohai, Bushra
Metabolic networks, which are mathematical representations of organismal metabolism, are reconstructed to provide computational platforms to guide metabolic engineering experiments and explore fundamental questions on metabolism. Systems level analyses, such as interrogation of phylogenetic relationships within the network, can provide further guidance on the modification of metabolic circuitries. Chlamydomonas reinhardtii, a biofuel relevant green alga that has retained key genes with plant, animal, and protist affinities, serves as an ideal model organism to investigate the interplay between gene function and phylogenetic affinities at multiple organizational levels. Here, using detailed topological and functional analyses, coupled with transcriptomics studies on a metabolic network that we have reconstructed for C. reinhardtii, we show that network connectivity has a significant concordance with the co-conservation of genes; however, a distinction between topological and functional relationships is observable within the network. Dynamic and static modes of co-conservation were defined and observed in a subset of gene-pairs across the network topologically. In contrast, genes with predicted synthetic interactions, or genes involved in coupled reactions, show significant enrichment for both shorter and longer phylogenetic distances. Based on our results, we propose that the metabolic network of C. reinhardtii is assembled with an architecture to minimize phylogenetic profile distances topologically, while it includes an expansion of such distances for functionally interacting genes. This arrangement may increase the robustness of C. reinhardtii's network in dealing with varied environmental challenges that the species may face. As a result, the defined evolutionary constraints within the network, which identify important pairings of genes in metabolism, may offer guidance on synthetic biology approaches to optimize the production of desirable metabolites.
Chaiboonchoe, Amphun; Ghamsari, Lila; Dohai, Bushra; Ng, Patrick; Khraiwesh, Basel; Jaiswal, Ashish; Jijakli, Kenan; Koussa, Joseph; Nelson, David R; Cai, Hong; Yang, Xinping; Chang, Roger L; Papin, Jason; Yu, Haiyuan; Balaji, Santhanam; Salehi-Ashtiani, Kourosh
2016-07-19
Metabolic networks, which are mathematical representations of organismal metabolism, are reconstructed to provide computational platforms to guide metabolic engineering experiments and explore fundamental questions on metabolism. Systems level analyses, such as interrogation of phylogenetic relationships within the network, can provide further guidance on the modification of metabolic circuitries. Chlamydomonas reinhardtii, a biofuel relevant green alga that has retained key genes with plant, animal, and protist affinities, serves as an ideal model organism to investigate the interplay between gene function and phylogenetic affinities at multiple organizational levels. Here, using detailed topological and functional analyses, coupled with transcriptomics studies on a metabolic network that we have reconstructed for C. reinhardtii, we show that network connectivity has a significant concordance with the co-conservation of genes; however, a distinction between topological and functional relationships is observable within the network. Dynamic and static modes of co-conservation were defined and observed in a subset of gene-pairs across the network topologically. In contrast, genes with predicted synthetic interactions, or genes involved in coupled reactions, show significant enrichment for both shorter and longer phylogenetic distances. Based on our results, we propose that the metabolic network of C. reinhardtii is assembled with an architecture to minimize phylogenetic profile distances topologically, while it includes an expansion of such distances for functionally interacting genes. This arrangement may increase the robustness of C. reinhardtii's network in dealing with varied environmental challenges that the species may face. The defined evolutionary constraints within the network, which identify important pairings of genes in metabolism, may offer guidance on synthetic biology approaches to optimize the production of desirable metabolites.
GeneNetFinder2: Improved Inference of Dynamic Gene Regulatory Relations with Multiple Regulators.
Han, Kyungsook; Lee, Jeonghoon
2016-01-01
A gene involved in complex regulatory interactions may have multiple regulators since gene expression in such interactions is often controlled by more than one gene. Another thing that makes gene regulatory interactions complicated is that regulatory interactions are not static, but change over time during the cell cycle. Most research so far has focused on identifying gene regulatory relations between individual genes in a particular stage of the cell cycle. In this study we developed a method for identifying dynamic gene regulations of several types from the time-series gene expression data. The method can find gene regulations with multiple regulators that work in combination or individually as well as those with single regulators. The method has been implemented as the second version of GeneNetFinder (hereafter called GeneNetFinder2) and tested on several gene expression datasets. Experimental results with gene expression data revealed the existence of genes that are not regulated by individual genes but rather by a combination of several genes. Such gene regulatory relations cannot be found by conventional methods. Our method finds such regulatory relations as well as those with multiple, independent regulators or single regulators, and represents gene regulatory relations as a dynamic network in which different gene regulatory relations are shown in different stages of the cell cycle. GeneNetFinder2 is available at http://bclab.inha.ac.kr/GeneNetFinder and will be useful for modeling dynamic gene regulations with multiple regulators.
Computing all hybridization networks for multiple binary phylogenetic input trees.
Albrecht, Benjamin
2015-07-30
The computation of phylogenetic trees on the same set of species that are based on different orthologous genes can lead to incongruent trees. One possible explanation for this behavior is interspecific hybridization events recombining genes of different species. An important approach to analyze such events is the computation of hybridization networks. This work presents the first algorithm computing the hybridization number as well as a set of representative hybridization networks for multiple binary phylogenetic input trees on the same set of taxa. To improve its practical runtime, we show how this algorithm can be parallelized. Moreover, we demonstrate the efficiency of the software Hybroscale, containing an implementation of our algorithm, by comparing it to PIRNv2.0, which is so far the best available software computing the exact hybridization number for multiple binary phylogenetic trees on the same set of taxa. The algorithm is part of the software Hybroscale, which was developed specifically for the investigation of hybridization networks including their computation and visualization. Hybroscale is freely available and runs on all three major operating systems. Our simulation study indicates that our approach is on average 100 times faster than PIRNv2.0. Moreover, we show how Hybroscale improves the interpretation of the reported hybridization networks by adding certain features to its graphical representation.
Liu, Yonghong; Liu, Yuanyuan; Wu, Jiaming; Roizman, Bernard; Zhou, Grace Guoying
2018-04-03
Analyses of the levels of mRNAs encoding IFIT1, IFI16, RIG-1, MDA5, CXCL10, LGP2, PUM1, LSD1, STING, and IFNβ in cell lines from which the gene encoding LGP2, LSD1, PML, HDAC4, IFI16, PUM1, STING, MDA5, IRF3, or HDAC1 had been knocked out, as well as the ability of these cell lines to support the replication of HSV-1, revealed the following: (i) Cell lines lacking the gene encoding LGP2, PML, or HDAC4 (cluster 1) exhibited increased levels of expression of partially overlapping gene networks. Concurrently, these cell lines produced from 5-fold to 12-fold lower yields of HSV-1 than the parental cells. (ii) Cell lines lacking the genes encoding STING, LSD1, MDA5, IRF3, or HDAC1 (cluster 2) exhibited decreased levels of mRNAs of partially overlapping gene networks. Concurrently, these cell lines produced virus yields that did not differ from those produced by the parental cell line. The genes up-regulated in cell lines forming cluster 1 overlapped in part with genes down-regulated in cluster 2. The key conclusions are that gene knockouts and subsequent selection for growth cause changes in expression of multiple genes, and hence the phenotype of the cell lines cannot be ascribed to a single gene; the patterns of gene expression may be shared by multiple knockouts; and the enhanced immunity to viral replication by cluster 1 knockout cell lines but not by cluster 2 cell lines suggests that in parental cells, the expression of innate resistance to infection is specifically repressed.
Criticality Is an Emergent Property of Genetic Networks that Exhibit Evolvability
Torres-Sosa, Christian; Huang, Sui; Aldana, Maximino
2012-01-01
Accumulating experimental evidence suggests that the gene regulatory networks of living organisms operate in the critical phase, namely, at the transition between ordered and chaotic dynamics. Such critical dynamics of the network permits the coexistence of robustness and flexibility which are necessary to ensure homeostatic stability (of a given phenotype) while allowing for switching between multiple phenotypes (network states) as occurs in development and in response to environmental change. However, the mechanisms through which genetic networks evolve such critical behavior have remained elusive. Here we present an evolutionary model in which criticality naturally emerges from the need to balance between the two essential components of evolvability: phenotype conservation and phenotype innovation under mutations. We simulated the Darwinian evolution of random Boolean networks that mutate gene regulatory interactions and grow by gene duplication. The mutating networks were subjected to selection for networks that both (i) preserve all the already acquired phenotypes (dynamical attractor states) and (ii) generate new ones. Our results show that this interplay between extending the phenotypic landscape (innovation) while conserving the existing phenotypes (conservation) suffices to cause the evolution of all the networks in a population towards criticality. Furthermore, the networks produced by this evolutionary process exhibit structures with hubs (global regulators) similar to the observed topology of real gene regulatory networks. Thus, dynamical criticality and certain elementary topological properties of gene regulatory networks can emerge as a byproduct of the evolvability of the phenotypic landscape. PMID:22969419
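The random Boolean network model named in this abstract can be illustrated with a minimal sketch. It is not the authors' simulation code: the function names, the synchronous update rule, and the choice of `k` regulators per gene are assumptions made for illustration, and the mutation, duplication, and selection steps are omitted. The sketch only shows how the attractors (the "phenotypes" of the abstract) of one small network could be enumerated.

```python
import random
from itertools import product


def make_rbn(n, k, rng):
    """Build a random Boolean network: each of n genes gets k regulators
    and a random truth table over the 2**k regulator states."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronously update every gene from its regulators' current values."""
    nxt = []
    for regs, table in zip(inputs, tables):
        idx = 0
        for r in regs:
            idx = (idx << 1) | state[r]  # pack regulator bits into a table index
        nxt.append(table[idx])
    return tuple(nxt)


def attractor(state, inputs, tables):
    """Iterate from `state` until a state repeats; return the cycle reached,
    rotated so that its smallest state comes first (a canonical form)."""
    seen = {}
    path = []
    while state not in seen:
        seen[state] = len(path)
        path.append(state)
        state = step(state, inputs, tables)
    cycle = path[seen[state]:]
    i = cycle.index(min(cycle))
    return tuple(cycle[i:] + cycle[:i])


def all_attractors(n, inputs, tables):
    """Enumerate attractors by starting from every one of the 2**n states
    (feasible only for small n)."""
    return {attractor(s, inputs, tables) for s in product((0, 1), repeat=n)}


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed, purely for reproducibility
    inputs, tables = make_rbn(6, 2, rng)
    found = all_attractors(6, inputs, tables)
    print(len(found), "attractor(s) found")
```

Exhaustive enumeration is used here only because the toy network is tiny; larger networks would need sampling of initial states instead.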
Ni, Jingchao; Koyuturk, Mehmet; Tong, Hanghang; Haines, Jonathan; Xu, Rong; Zhang, Xiang
2016-11-10
Accurately prioritizing candidate disease genes is an important and challenging problem. Various network-based methods have been developed to predict potential disease genes by utilizing the disease similarity network and molecular networks such as protein interaction or gene co-expression networks. Although successful, a common limitation of the existing methods is that they assume all diseases share the same molecular network and a single generic molecular network is used to predict candidate genes for all diseases. However, different diseases tend to manifest in different tissues, and the molecular networks in different tissues are usually different. An ideal method should be able to incorporate tissue-specific molecular networks for different diseases. In this paper, we develop a robust and flexible method to integrate tissue-specific molecular networks for disease gene prioritization. Our method allows each disease to have its own tissue-specific network(s). We formulate the problem of candidate gene prioritization as an optimization problem based on network propagation. When there are multiple tissue-specific networks available for a disease, our method can automatically infer the relative importance of each tissue-specific network. Thus it is robust to the noisy and incomplete network data. To solve the optimization problem, we develop fast algorithms which have linear time complexities in the number of nodes in the molecular networks. We also provide rigorous theoretical foundations for our algorithms in terms of their optimality and convergence properties. Extensive experimental results show that our method can significantly improve the accuracy of candidate gene prioritization compared with the state-of-the-art methods. In our experiments, we compare our methods with 7 popular network-based disease gene prioritization algorithms on diseases from Online Mendelian Inheritance in Man (OMIM) database. 
The experimental results demonstrate that our methods recover true associations more accurately than other methods in terms of AUC values, and the performance differences are significant (with paired t-test p-values less than 0.05). This validates the importance of integrating tissue-specific molecular networks for studying disease gene prioritization and shows the superiority of our network models and ranking algorithms toward this purpose. The source code and datasets are available at http://nijingchao.github.io/CRstar/ .
Integrating Genetic and Functional Genomic Data to Elucidate Common Disease Tra
NASA Astrophysics Data System (ADS)
Schadt, Eric
2005-03-01
The reconstruction of genetic networks in mammalian systems is one of the primary goals in biological research, especially as such reconstructions relate to elucidating not only common, polygenic human diseases, but living systems more generally. Here I present a statistical procedure for inferring causal relationships between gene expression traits and more classic clinical traits, including complex disease traits. This procedure has been generalized to the gene network reconstruction problem, where naturally occurring genetic variations in segregating mouse populations are used as a source of perturbations to elucidate tissue-specific gene networks. Differences in the extent of genetic control between genders and among four different tissues are highlighted. I also demonstrate that the networks derived from expression data in segregating mouse populations using the novel network reconstruction algorithm are able to capture causal associations between genes that result in increased predictive power, compared to more classically reconstructed networks derived from the same data. This approach to causal inference in large segregating mouse populations over multiple tissues not only elucidates fundamental aspects of transcriptional control, it also allows for the objective identification of key drivers of common human diseases.
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks.
Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
2017-01-01
The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS-Bcl-2 and RAS-AKT, and finds significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
Wisdom of crowds for robust gene network inference
Marbach, Daniel; Costello, James C.; Küffner, Robert; Vega, Nicci; Prill, Robert J.; Camacho, Diogo M.; Allison, Kyle R.; Kellis, Manolis; Collins, James J.; Stolovitzky, Gustavo
2012-01-01
Reconstructing gene regulatory networks from high-throughput data is a long-standing problem. Through the DREAM project (Dialogue on Reverse Engineering Assessment and Methods), we performed a comprehensive blind assessment of over thirty network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae, and in silico microarray data. We characterize performance, data requirements, and inherent biases of different inference approaches offering guidelines for both algorithm application and development. We observe that no single inference method performs optimally across all datasets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse datasets. Thereby, we construct high-confidence networks for E. coli and S. aureus, each comprising ~1700 transcriptional interactions at an estimated precision of 50%. We experimentally test 53 novel interactions in E. coli, of which 23 were supported (43%). Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks. PMID:22796662
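The abstract does not state how predictions from multiple inference methods were integrated. One simple way to combine ranked edge lists is to average each edge's rank across methods, sketched below. The function name, the toy scores, and the handling of ties and missing edges are assumptions for illustration only.

```python
def average_rank_consensus(score_tables):
    """Combine edge scores from several methods by averaging per-method ranks.

    score_tables: list of dicts mapping an edge (regulator, target) to a
    confidence score, higher meaning more confident.
    Returns a list of (mean_rank, edge) sorted from best to worst.
    """
    edges = sorted(set().union(*score_tables))
    total_rank = {e: 0.0 for e in edges}
    for scores in score_tables:
        # An edge a method did not score is treated as its least confident
        # prediction; ties keep the alphabetical edge order (stable sort).
        ordered = sorted(edges,
                         key=lambda e: scores.get(e, float("-inf")),
                         reverse=True)
        for rank, e in enumerate(ordered, start=1):
            total_rank[e] += rank
    m = len(score_tables)
    return sorted((total_rank[e] / m, e) for e in edges)


if __name__ == "__main__":
    # Hypothetical scores from three hypothetical inference methods.
    method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
    method_b = {("g1", "g2"): 0.7, ("g1", "g3"): 0.6, ("g2", "g3"): 0.1}
    method_c = {("g1", "g2"): 0.3, ("g1", "g3"): 0.1, ("g2", "g3"): 0.8}
    for mean_rank, edge in average_rank_consensus([method_a, method_b, method_c]):
        print(edge, round(mean_rank, 3))
```

Working on ranks rather than raw scores avoids having to put the methods' scores on a common scale, which is the main reason this kind of aggregation is attractive when the underlying methods differ widely.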
Gong, Bin-Sheng; Zhang, Qing-Pu; Zhang, Guang-Mei; Zhang, Shao-Jun; Zhang, Wei; Lv, Hong-Chao; Zhang, Fan; Lv, Sa-Li; Li, Chuan-Xing; Rao, Shao-Qi; Li, Xia
2007-01-01
Gene expression profiles and single-nucleotide polymorphism (SNP) profiles are modern data for genetic analysis. It is possible to use the two types of information to analyze the relationships among genes by some genetical genomics approaches. In this study, gene expression profiles were used as expression traits, and relationships among genes that were co-linked to a common SNP(s) were identified by integrating the two types of information. Further research on the co-expressions among the co-linked genes was carried out after the gene-SNP relationships were established using the Haseman-Elston sib-pair regression. The results showed that the co-expressions among the co-linked genes were significantly higher if the number of connections between the genes and a SNP(s) was more than six. Then, the genes were interconnected via one or more SNP co-linkers to construct a gene-SNP intermixed network. The genes sharing more SNPs tended to have a stronger correlation. Finally, a gene-gene network was constructed with their intensities of relationships (the number of SNP co-linkers shared) as the weights for the edges. PMID:18466544
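The final step described above, weighting each gene-gene edge by the number of SNP co-linkers the two genes share, can be sketched as follows. The input format (a mapping from gene to its linked SNPs), the function name, and the toy identifiers are assumptions; the linkage step itself (Haseman-Elston regression) is not reproduced.

```python
from collections import defaultdict
from itertools import combinations


def gene_gene_network(gene_to_snps):
    """Return {(gene_a, gene_b): number of SNPs linked to both genes}.

    gene_to_snps: dict mapping a gene to the set of SNPs it is linked to.
    Only pairs sharing at least one SNP appear in the result.
    """
    # Invert the mapping so each SNP lists the genes it co-links.
    snp_to_genes = defaultdict(set)
    for gene, snps in gene_to_snps.items():
        for snp in snps:
            snp_to_genes[snp].add(gene)

    # Every SNP contributes one unit of weight to each gene pair it co-links.
    weights = defaultdict(int)
    for genes in snp_to_genes.values():
        for a, b in combinations(sorted(genes), 2):
            weights[(a, b)] += 1
    return dict(weights)


if __name__ == "__main__":
    # Hypothetical gene-SNP links.
    links = {
        "geneA": {"snp1", "snp2"},
        "geneB": {"snp1", "snp2", "snp3"},
        "geneC": {"snp3"},
    }
    print(gene_gene_network(links))
```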
Analysis of Gene Regulatory Networks of Maize in Response to Nitrogen.
Jiang, Lu; Ball, Graham; Hodgman, Charlie; Coules, Anne; Zhao, Han; Lu, Chungui
2018-03-08
Nitrogen (N) fertilizer has a major influence on crop yield and quality. Understanding and optimising the response of crop plants to nitrogen fertilizer usage is of central importance in enhancing food security and agricultural sustainability. In this study, the analysis of gene regulatory networks reveals multiple genes and biological processes in response to N. Two microarray studies have been used to infer components of the nitrogen-response network. Since they used different array technologies, a map linking the two probe sets to the maize B73 reference genome has been generated to allow comparison. Putative Arabidopsis homologues of maize genes were used to query the Biological General Repository for Interaction Datasets (BioGRID) network, which yielded the potential involvement of three transcription factors (TFs) (GLK5, MADS64 and bZIP108) and a Calcium-dependent protein kinase. An Artificial Neural Network was used to identify influential genes and retrieved bZIP108 and WRKY36 as significant TFs in both microarray studies, along with genes for Asparagine Synthetase, a dual-specific protein kinase and a protein phosphatase. The output from one study also suggested roles for microRNA (miRNA) 399b and Nin-like Protein 15 (NLP15). Co-expression-network analysis of TFs with closely related profiles to known Nitrate-responsive genes identified GLK5, GLK8 and NLP15 as candidate regulators of genes repressed under low Nitrogen conditions, while bZIP108 might play a role in gene activation.
Teren, A; Kirsten, H; Beutner, F; Scholz, M; Holdt, L M; Teupser, D; Gutberlet, M; Thiery, J; Schuler, G; Eitel, I
2017-02-03
Prognostic relevant pathways of leukocyte involvement in human myocardial ischemic-reperfusion injury are largely unknown. We enrolled 136 patients with ST-elevation myocardial infarction (STEMI) after primary angioplasty within 12 h after onset of symptoms. Following reperfusion, whole blood was collected within a median time interval of 20 h (interquartile range: 15-25 h) for genome-wide gene expression analysis. Subsequent CMR scans were performed using a standard protocol to determine infarct size (IS), area at risk (AAR), myocardial salvage index (MSI) and the extent of late microvascular obstruction (lateMO). We found 398 genes associated with lateMO and two genes with IS. Neither AAR nor MSI showed significant correlations with gene expression. Genes correlating with lateMO were strongly related to several canonical pathways, including positive regulation of T-cell activation (p = 3.44 × 10⁻⁵), and regulation of inflammatory response (p = 1.86 × 10⁻³). Network analysis of multiple gene expression alterations associated with larger lateMO identified the following functional consequences: facilitated utilisation and decreased concentration of free fatty acid, repressed cell differentiation, enhanced phagocyte movement, increased cell death, vascular disease and compensatory vasculogenesis. In conclusion, the extent of lateMO after acute, reperfused STEMI correlated with altered activation of multiple genes related to fatty acid utilisation, lymphocyte differentiation, phagocyte mobilisation, cell survival, and vascular dysfunction.
Mahoney, J. Matthew; Taroni, Jaclyn; Martyanov, Viktor; Wood, Tammara A.; Greene, Casey S.; Pioli, Patricia A.; Hinchcliff, Monique E.; Whitfield, Michael L.
2015-01-01
Systemic sclerosis (SSc) is a rare systemic autoimmune disease characterized by skin and organ fibrosis. The pathogenesis of SSc and its progression are poorly understood. The SSc intrinsic gene expression subsets (inflammatory, fibroproliferative, normal-like, and limited) are observed in multiple clinical cohorts of patients with SSc. Analysis of longitudinal skin biopsies suggests that a patient's subset assignment is stable over 6–12 months. Genetically, SSc is multi-factorial with many genetic risk loci for SSc generally and for specific clinical manifestations. Here we identify the genes consistently associated with the intrinsic subsets across three independent cohorts, show the relationship between these genes using a gene-gene interaction network, and place the genetic risk loci in the context of the intrinsic subsets. To identify gene expression modules common to three independent datasets from three different clinical centers, we developed a consensus clustering procedure based on mutual information of partitions, an information theory concept, and performed a meta-analysis of these genome-wide gene expression datasets. We created a gene-gene interaction network of the conserved molecular features across the intrinsic subsets and analyzed their connections with SSc-associated genetic polymorphisms. The network is composed of distinct, but interconnected, components related to interferon activation, M2 macrophages, adaptive immunity, extracellular matrix remodeling, and cell proliferation. The network shows extensive connections between the inflammatory- and fibroproliferative-specific genes. The network also shows connections between these subset-specific genes and 30 SSc-associated polymorphic genes including STAT4, BLK, IRF7, NOTCH4, PLAUR, CSK, IRAK1, and several human leukocyte antigen (HLA) genes. 
Our analyses suggest that the gene expression changes underlying the SSc subsets may be long-lived, but mechanistically interconnected and related to a patient's underlying genetic risk. PMID:25569146
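The consensus clustering procedure above is described as based on the mutual information of partitions. The quantity itself can be sketched as below for two clusterings of the same genes. This shows only the standard information-theoretic definition (in nats), not the authors' consensus procedure; the function name and the toy labels are assumptions.

```python
from collections import Counter
from math import log


def partition_mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two partitions of the same items.

    labels_a[i] and labels_b[i] are the cluster labels that the two
    partitions assign to item i.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    mi = 0.0
    for (a, b), n_ab in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (n_ab / n) * log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


if __name__ == "__main__":
    # Hypothetical module assignments of four genes in two datasets.
    dataset_1 = ["m1", "m1", "m2", "m2"]
    dataset_2 = ["x", "x", "y", "y"]   # same grouping, different label names
    print(partition_mutual_information(dataset_1, dataset_2))
```

Because the measure depends only on how items are grouped, not on the label names, it can compare module assignments produced independently in different cohorts.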
Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola
2014-12-01
We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named "fight-club hubs" characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named "switch genes" was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. © 2014 American Society of Plant Biologists. All rights reserved.
VTCdb: a gene co-expression database for the crop species Vitis vinifera (grapevine).
Wong, Darren C J; Sweetman, Crystal; Drew, Damian P; Ford, Christopher M
2013-12-16
Gene expression datasets in model plants such as Arabidopsis have contributed to our understanding of gene function and how a single underlying biological process can be governed by a diverse network of genes. The accumulation of publicly available microarray data encompassing a wide range of biological and environmental conditions has enabled the development of additional capabilities including gene co-expression analysis (GCA). GCA is based on the understanding that genes encoding proteins involved in similar and/or related biological processes may exhibit comparable expression patterns over a range of experimental conditions, developmental stages and tissues. We present an open access database for the investigation of gene co-expression networks within the cultivated grapevine, Vitis vinifera. The new gene co-expression database, VTCdb (http://vtcdb.adelaide.edu.au/Home.aspx), offers an online platform for transcriptional regulatory inference in the cultivated grapevine. Using condition-independent and condition-dependent approaches, grapevine co-expression networks were constructed using the latest publicly available microarray datasets from diverse experimental series, utilising the Affymetrix Vitis vinifera GeneChip (16 K) and the NimbleGen Grape Whole-genome microarray chip (29 K), thus making it possible to profile approximately 29,000 genes (95% of the predicted grapevine transcriptome). Applications available with the online platform include the use of gene names, probesets, modules or biological processes to query the co-expression networks, with the option to choose between Affymetrix or Nimblegen datasets and between multiple co-expression measures. Alternatively, the user can browse existing network modules using interactive network visualisation and analysis via CytoscapeWeb. 
To demonstrate the utility of the database, we present examples from three fundamental biological processes (berry development, photosynthesis and flavonoid biosynthesis) whereby the recovered sub-networks reconfirm established plant gene functions and also identify novel associations. Together, we present valuable insights into grapevine transcriptional regulation by developing network models applicable to researchers in their prioritisation of gene candidates, for on-going study of biological processes related to grapevine development, metabolism and stress responses.
Yeast Phenomics: An Experimental Approach for Modeling Gene Interaction Networks that Buffer Disease
Hartman, John L.; Stisher, Chandler; Outlaw, Darryl A.; Guo, Jingyu; Shah, Najaf A.; Tian, Dehua; Santos, Sean M.; Rodgers, John W.; White, Richard A.
2015-01-01
The genome project increased appreciation of genetic complexity underlying disease phenotypes: many genes contribute to each phenotype and each gene contributes to multiple phenotypes. The aspiration of predicting common disease in individuals has evolved from seeking primary loci to marginal risk assignments based on many genes. Genetic interaction, defined as contributions to a phenotype that are dependent upon particular digenic allele combinations, could improve prediction of phenotype from complex genotype, but it is difficult to study in human populations. High throughput, systematic analysis of S. cerevisiae gene knockouts or knockdowns in the context of disease-relevant phenotypic perturbations provides a tractable experimental approach to derive gene interaction networks, in order to deduce by cross-species gene homology how phenotype is buffered against disease-risk genotypes. Yeast gene interaction network analysis to date has revealed biology more complex than previously imagined. This has motivated the development of more powerful yeast cell array phenotyping methods to globally model the role of gene interaction networks in modulating phenotypes (which we call yeast phenomic analysis). The article illustrates yeast phenomic technology, which is applied here to quantify gene X media interaction at higher resolution and supports the use of human-like media for future applications of yeast phenomics for modeling human disease. PMID:25668739
Hafemeister, Christoph; Nicotra, Adrienne B.; Jagadish, S.V. Krishna; Bonneau, Richard; Purugganan, Michael
2016-01-01
Environmental gene regulatory influence networks (EGRINs) coordinate the timing and rate of gene expression in response to environmental signals. EGRINs encompass many layers of regulation, which culminate in changes in accumulated transcript levels. Here, we inferred EGRINs for the response of five tropical Asian rice (Oryza sativa) cultivars to high temperatures, water deficit, and agricultural field conditions by systematically integrating time-series transcriptome data, patterns of nucleosome-free chromatin, and the occurrence of known cis-regulatory elements. First, we identified 5447 putative target genes for 445 transcription factors (TFs) by connecting TFs with genes harboring known cis-regulatory motifs in nucleosome-free regions proximal to their transcriptional start sites. We then used network component analysis to estimate the regulatory activity for each TF based on the expression of its putative target genes. Finally, we inferred an EGRIN using the estimated transcription factor activity (TFA) as the regulator. The EGRINs include regulatory interactions between 4052 target genes regulated by 113 TFs. We resolved distinct regulatory roles for members of the heat shock factor family, including a putative regulatory connection between abiotic stress and the circadian clock. TFA estimation using network component analysis is an effective way of incorporating multiple genome-scale measurements into network inference. PMID:27655842
Curtis, Ross E; Kim, Seyoung; Woolford, John L; Xu, Wenjie; Xing, Eric P
2013-03-21
Association analysis using genome-wide expression quantitative trait locus (eQTL) data investigates the effect that genetic variation has on cellular pathways and leads to the discovery of candidate regulators. Traditional analysis of eQTL data via pairwise statistical significance tests or linear regression does not leverage the availability of the structural information of the transcriptome, such as presence of gene networks that reveal correlation and potentially regulatory relationships among the study genes. We employ a new eQTL mapping algorithm, GFlasso, which we have previously developed for sparse structured regression, to reanalyze a genome-wide yeast dataset. GFlasso fully takes into account the dependencies among expression traits to suppress false positives and to enhance the signal/noise ratio. Thus, GFlasso leverages the gene-interaction network to discover the pleiotropic effects of genetic loci that perturb the expression level of multiple (rather than individual) genes, which enables us to gain more power in detecting previously neglected signals that are marginally weak but pleiotropically significant. While eQTL hotspots in yeast have been reported previously as genomic regions controlling multiple genes, our analysis reveals additional novel eQTL hotspots and, more interestingly, uncovers groups of multiple contributing eQTL hotspots that affect the expression level of functional gene modules. To our knowledge, our study is the first to report this type of gene regulation stemming from multiple eQTL hotspots. Additionally, we report the results from in-depth bioinformatics analysis for three groups of these eQTL hotspots: ribosome biogenesis, telomere silencing, and retrotransposon biology. We suggest candidate regulators for the functional gene modules that map to each group of hotspots. 
Not only do we find that many of these candidate regulators contain mutations in the promoter and coding regions of the genes, but in the case of the Ribi group we also provide experimental evidence suggesting that the identified candidates do regulate the target genes predicted by GFlasso. Thus, this structured association analysis of a yeast eQTL dataset via GFlasso, coupled with extensive bioinformatics analysis, discovers a novel regulation pattern between multiple eQTL hotspots and functional gene modules. Furthermore, this analysis demonstrates the potential of GFlasso as a powerful computational tool for eQTL studies that exploit the rich structural information among expression traits due to correlation, regulation, or other forms of biological dependencies.
Genomic approaches for the elucidation of genes and gene networks underlying cardiovascular traits.
Adriaens, M E; Bezzina, C R
2018-06-22
Genome-wide association studies have shed light on the association between natural genetic variation and cardiovascular traits. However, linking a cardiovascular trait-associated locus to a candidate gene or set of candidate genes for prioritization for follow-up mechanistic studies is anything but straightforward. Genomic technologies based on next-generation sequencing now offer multiple opportunities to dissect gene regulatory networks underlying genetic cardiovascular trait associations, thereby aiding in the identification of candidate genes at unprecedented scale. RNA sequencing in particular becomes a powerful tool when combined with genotyping to identify loci that modulate transcript abundance, known as expression quantitative trait loci (eQTL), or loci modulating transcript splicing, known as splicing quantitative trait loci (sQTL). Additionally, the allele-specific resolution of RNA-sequencing technology enables estimation of allelic imbalance, a state where the two alleles of a gene are expressed at a ratio differing from the expected 1:1 ratio. When multiple high-throughput approaches are combined with deep phenotyping in a single study, a comprehensive elucidation of the relationship between genotype and phenotype comes into view, an approach known as systems genetics. In this review, we cover key applications of systems genetics in the broad cardiovascular field.
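The notion of allelic imbalance described above (allele expression departing from the expected 1:1 ratio) can be illustrated with a minimal sketch. It assumes that allele-specific read counts for a single heterozygous site are already available and applies a plain two-sided exact binomial test; the function name and the choice of test are illustrative assumptions, not the procedure of any study covered by the review.

```python
from math import comb


def allelic_imbalance_pvalue(ref_reads: int, alt_reads: int) -> float:
    """Two-sided exact binomial test of ref/alt read counts against a 1:1 ratio.

    Hypothetical helper for illustration; real pipelines usually also correct
    for mapping bias and overdispersion, which this sketch ignores.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no allele-specific reads at this site")
    observed = comb(n, ref_reads)
    # Under p = 0.5 every outcome k has probability comb(n, k) / 2**n, so the
    # p-value sums all outcomes that are at most as likely as the observed one.
    extreme = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return extreme / 2 ** n
```

A site with 8 reference and 2 alternative reads gives a p-value of about 0.11 under this test, so ten reads alone would not be strong evidence of imbalance.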
Constraints on signaling network logic reveal functional subgraphs on Multiple Myeloma OMIC data.
Miannay, Bertrand; Minvielle, Stéphane; Magrangeas, Florence; Guziolowski, Carito
2018-03-21
The integration of gene expression profiles (GEPs) and large-scale biological networks derived from pathway databases is a subject which is being widely explored. Existing methods are based on network distance measures among significantly measured species. Only a small number of them include the directionality and underlying logic existing in biological networks. In this study we approach the GEP-network integration problem by considering the network logic; however, our approach does not require a prior selection of species according to their gene expression level. We start by modeling the biological network, representing its underlying logic using Logic Programming. This model points to reachable discrete network states that maximize a notion of harmony between the possible active or inactive states of the molecular species and the directionality of the pathway reactions according to their activator or inhibitor control role. Only then do we confront these network states with the GEP. From this confrontation independent graph components are derived, each of them related to a fixed and optimal assignment of active or inactive states. These components allow us to decompose a large-scale network into subgraphs, and their molecular species state assignments have different degrees of similarity when compared to the same GEP. We apply our method to study the set of possible states derived from a subgraph from the NCI-PID Pathway Interaction Database. This graph links Multiple Myeloma (MM) genes to known receptors for this blood cancer. We discover that the NCI-PID MM graph has 15 independent components, and when confronted with 611 MM GEPs, we find 1 component as being more specific to represent the difference between cancer and healthy profiles.
INfORM: Inference of NetwOrk Response Modules.
Marwah, Veer Singh; Kinaret, Pia Anneli Sofia; Serra, Angela; Scala, Giovanni; Lauerma, Antti; Fortino, Vittorio; Greco, Dario
2018-06-15
Detecting and interpreting responsive modules from gene expression data by using network-based approaches is a common but laborious task. It often requires the application of several computational methods implemented in different software packages, forcing biologists to compile complex analytical pipelines. Here we introduce INfORM (Inference of NetwOrk Response Modules), an R shiny application that enables non-expert users to detect, evaluate and select gene modules with high statistical and biological significance. INfORM is a comprehensive tool for the identification of biologically meaningful response modules from consensus gene networks inferred by using multiple algorithms. It is accessible through an intuitive graphical user interface allowing for a level of abstraction from the computational steps. INfORM is freely available for academic use at https://github.com/Greco-Lab/INfORM. Supplementary data are available at Bioinformatics online.
Angelici, Bartolomeo; Mailand, Erik; Haefliger, Benjamin; Benenson, Yaakov
2016-08-30
One of the goals of synthetic biology is to develop programmable artificial gene networks that can transduce multiple endogenous molecular cues to precisely control cell behavior. Realizing this vision requires interfacing natural molecular inputs with synthetic components that generate functional molecular outputs. Interfacing synthetic circuits with endogenous mammalian transcription factors has been particularly difficult. Here, we describe a systematic approach that enables integration and transduction of multiple mammalian transcription factor inputs by a synthetic network. The approach is facilitated by a proportional amplifier sensor based on synergistic positive autoregulation. The circuits efficiently transduce endogenous transcription factor levels into RNAi, transcriptional transactivation, and site-specific recombination. They also enable AND logic between pairs of arbitrary transcription factors. The results establish a framework for developing synthetic gene networks that interface with cellular processes through transcriptional regulators. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
Functional Interaction Network Construction and Analysis for Disease Discovery.
Wu, Guanming; Haw, Robin
2017-01-01
Network-based approaches project seemingly unrelated genes or proteins onto a large-scale network context, thereby providing a holistic visualization and analysis platform for genomic data generated from high-throughput experiments, reducing the dimensionality of data by using network modules, and increasing the statistical power of the analysis. Based on the Reactome database, the most popular and comprehensive open-source biological pathway knowledgebase, we have developed a highly reliable protein functional interaction network covering around 60% of total human genes and an app called ReactomeFIViz for Cytoscape, the most popular biological network visualization and analysis platform. In this chapter, we describe the detailed procedures on how this functional interaction network is constructed by integrating multiple external data sources, extracting functional interactions from human curated pathway databases, building a machine learning classifier called a Naïve Bayesian Classifier, predicting interactions based on the trained Naïve Bayesian Classifier, and finally constructing the functional interaction database. We also provide an example on how to use ReactomeFIViz for performing network-based data analysis for a list of genes.
MyGeneFriends: A Social Network Linking Genes, Genetic Diseases, and Researchers
Allot, Alexis; Chennen, Kirsley; Nevers, Yannis; Poidevin, Laetitia; Kress, Arnaud; Ripp, Raymond; Thompson, Julie Dawn; Poch, Olivier
2017-01-01
Background: The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.0 and particularly social networks, which are at the forefront of big data exploration and human-data interaction. Objective: MyGeneFriends is a Web platform inspired by social networks, devoted to genetic disease analysis, and organized around three types of proactive agents: genes, humans, and genetic diseases. The aim of this study was to improve exploration and exploitation of biological, postgenomic era big data. Methods: MyGeneFriends leverages conventions popularized by top social networks (Facebook, LinkedIn, etc), such as networks of friends, profile pages, friendship recommendations, affinity scores, news feeds, content recommendation, and data visualization. Results: MyGeneFriends provides simple and intuitive interactions with data through evaluation and visualization of connections (friendships) between genes, humans, and diseases. The platform suggests new friends and publications and allows agents to follow the activity of their friends. It dynamically personalizes information depending on the user’s specific interests and provides an efficient way to share information with collaborators. Furthermore, the user’s behavior itself generates new information that constitutes an added value integrated in the network, which can be used to discover new connections between biological agents. Conclusions: We have developed MyGeneFriends, a Web platform leveraging conventions from popular social networks to redefine the relationship between humans and biological big data and improve human processing of biomedical data. MyGeneFriends is available at lbgi.fr/mygenefriends. PMID:28623182
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xia, Jing; Rocke, David M.; Perry, George; ...
2014-01-01
In late-onset Alzheimer’s disease (AD), multiple brain regions are not affected simultaneously. Comparing the gene expression of the affected regions to identify the differences in the biological processes perturbed can lead to greater insight into AD pathogenesis and early characteristics. We identified differentially expressed (DE) genes from single cell microarray data of four AD affected brain regions: entorhinal cortex (EC), hippocampus (HIP), posterior cingulate cortex (PCC), and middle temporal gyrus (MTG). We organized the DE genes in the four brain regions into region-specific gene coexpression networks. Differential neighborhood analyses in the coexpression networks were performed to identify genes with low topological overlap (TO) of their direct neighbors. The low TO genes were used to characterize the biological differences between two regions. Our analyses show that increased oxidative stress, along with alterations in lipid metabolism in neurons, may be some of the very early events occurring in AD pathology. Cellular defense mechanisms try to intervene but fail, finally resulting in AD pathology as the disease progresses. Furthermore, disease annotation of the low TO genes in two independent protein interaction networks has resulted in association between cancer, diabetes, renal diseases, and cardiovascular diseases.
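The differential neighborhood idea in this abstract can be sketched as follows. The abstract does not give its exact topological overlap formula, so this sketch swaps in a plain Jaccard index between a gene's direct neighbors in two region-specific networks. The function names, the threshold, and the gene labels in the usage note are all hypothetical.

```python
def neighbor_overlap(net_a, net_b):
    """Jaccard overlap of each shared gene's direct neighbors in two networks.

    Each network is a dict mapping a gene to the set of its direct neighbors.
    This is an assumed stand-in for the topological overlap used in the study.
    """
    scores = {}
    for gene in net_a.keys() & net_b.keys():
        a, b = net_a[gene], net_b[gene]
        union = a | b
        # Two empty neighborhoods are treated as identical (arbitrary choice).
        scores[gene] = len(a & b) / len(union) if union else 1.0
    return scores


def low_overlap_genes(net_a, net_b, threshold=0.2):
    """Genes whose neighborhoods differ most between the two networks."""
    scores = neighbor_overlap(net_a, net_b)
    return sorted(g for g, s in scores.items() if s < threshold)
```

With toy networks ec = {"G1": {"G2", "G3"}, "G2": {"G1"}} and hip = {"G1": {"G4"}, "G2": {"G1"}}, only G1 is flagged, because its neighbors share nothing between the two regions while G2 keeps the same neighbor.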
Xu, Yungang; Guo, Maozu; Zou, Quan; Liu, Xiaoyan; Wang, Chunyu; Liu, Yang
2014-01-01
The cellular interactome, in which genes and/or their products interact on several levels, forming transcriptional regulatory, protein interaction, metabolic, and signal transduction networks, etc., has attracted decades of research attention. However, any one specific type of network alone can hardly explain the various interactive activities among genes. These networks characterize different interaction relationships, implying their unique intrinsic properties and defects, and covering different slices of biological information. The functional gene network (FGN), a consolidated interaction network that models a fuzzy and more generalized notion of gene-gene relations, has been proposed to combine heterogeneous networks with the goal of identifying functional modules supported by multiple interaction types. There are as yet no successful precedents of FGNs for sparsely studied non-model organisms, such as soybean (Glycine max), due to the absence of sufficient heterogeneous interaction data. We present an alternative solution for inferring the FGNs of soybean (SoyFGNs), in a pioneering study on the soybean interactome, which is also applicable to other organisms. SoyFGNs exhibit the typical characteristics of biological networks: scale-free, small-world architecture and modularization. Verified by co-expression and KEGG pathways, SoyFGNs are more extensive and accurate than an orthology network derived from Arabidopsis. As a case study, network-guided disease-resistance gene discovery indicates that SoyFGNs can provide system-level studies on gene functions and interactions. This work suggests that inferring and modelling the interactome of a non-model plant are feasible. It will speed up the discovery and definition of the functions and interactions of other genes that control important functions, such as nitrogen fixation and protein or lipid synthesis. The efforts of the study are the basis of our further comprehensive studies on the soybean functional interactome at the genome and microRNome levels. Additionally, a web tool for information retrieval and analysis of SoyFGNs can be accessed at SoyFN: http://nclab.hit.edu.cn/SoyFN. PMID:25423109
Gene co-expression networks shed light into diseases of brain iron accumulation
Bettencourt, Conceição; Forabosco, Paola; Wiethoff, Sarah; Heidari, Moones; Johnstone, Daniel M.; Botía, Juan A.; Collingwood, Joanna F.; Hardy, John; Milward, Elizabeth A.; Ryten, Mina; Houlden, Henry
2016-01-01
Aberrant brain iron deposition is observed in both common and rare neurodegenerative disorders, including those categorized as Neurodegeneration with Brain Iron Accumulation (NBIA), which are characterized by focal iron accumulation in the basal ganglia. Two NBIA genes are directly involved in iron metabolism, but whether other NBIA-related genes also regulate iron homeostasis in the human brain, and whether aberrant iron deposition contributes to neurodegenerative processes remains largely unknown. This study aims to expand our understanding of these iron overload diseases and identify relationships between known NBIA genes and their main interacting partners by using a systems biology approach. We used whole-transcriptome gene expression data from human brain samples originating from 101 neuropathologically normal individuals (10 brain regions) to generate weighted gene co-expression networks and cluster the 10 known NBIA genes in an unsupervised manner. We investigated NBIA-enriched networks for relevant cell types and pathways, and whether they are disrupted by iron loading in NBIA diseased tissue and in an in vivo mouse model. We identified two basal ganglia gene co-expression modules significantly enriched for NBIA genes, which resemble neuronal and oligodendrocytic signatures. These NBIA gene networks are enriched for iron-related genes, and implicate synapse and lipid metabolism related pathways. Our data also indicates that these networks are disrupted by excessive brain iron loading. We identified multiple cell types in the origin of NBIA disorders. We also found unforeseen links between NBIA networks and iron-related processes, and demonstrate convergent pathways connecting NBIAs and phenotypically overlapping diseases. Our results are of further relevance for these diseases by providing candidates for new causative genes and possible points for therapeutic intervention. PMID:26707700
Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola
2014-01-01
We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named “fight-club hubs” characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named “switch genes” was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. PMID:25490918
Zhu, Sha; Degnan, James H; Goldstien, Sharyn J; Eldon, Bjarki
2015-09-15
There has been increasing interest in coalescent models that admit multiple mergers of ancestral lineages, and in modeling hybridization and coalescence simultaneously. Hybrid-Lambda is a software package that simulates gene genealogies under multiple merger and Kingman's coalescent processes within species networks or species trees. Hybrid-Lambda allows different coalescent processes to be specified for different populations, and allows for time to be converted between generations and coalescent units, by specifying a population size for each population. In addition, Hybrid-Lambda can generate simulated datasets, assuming the infinitely many sites mutation model, and compute the F_ST statistic. As an illustration, we apply Hybrid-Lambda to infer the time of subdivision of certain marine invertebrates under different coalescent processes. Hybrid-Lambda makes it possible to investigate biogeographic concordance among high fecundity species exhibiting skewed offspring distribution.
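For readers unfamiliar with the F_ST statistic mentioned above, the following is a minimal sketch of the textbook heterozygosity-based definition, F_ST = (H_T - H_S) / H_T, for one biallelic locus in equally weighted subpopulations. It is not taken from Hybrid-Lambda, whose estimator the abstract does not describe; the function name and input format are assumptions.

```python
def fst_biallelic(freqs):
    """F_ST = (H_T - H_S) / H_T for one biallelic locus.

    freqs: allele frequency of the same allele in each subpopulation,
    with all subpopulations weighted equally (an assumption of this sketch).
    """
    if not freqs:
        raise ValueError("at least one subpopulation is required")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity of the pooled population.
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # Locus is fixed everywhere; F_ST is undefined, reported here as 0.0.
        return 0.0
    # Mean expected heterozygosity within subpopulations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t
```

Identical frequencies in every subpopulation give 0, and subpopulations fixed for different alleles give 1, which are the two limits of the statistic.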
Computational gene network study on antibiotic resistance genes of Acinetobacter baumannii.
Anitha, P; Anbarasu, Anand; Ramaiah, Sudha
2014-05-01
Multi Drug Resistance (MDR) in Acinetobacter baumannii is one of the major threats for emerging nosocomial infections in hospital environment. Multidrug-resistance in A. baumannii may be due to the implementation of multi-combination resistance mechanisms such as β-lactamase synthesis, Penicillin-Binding Proteins (PBPs) changes, alteration in porin proteins and in efflux pumps against various existing classes of antibiotics. Multiple antibiotic resistance genes are involved in MDR. These resistance genes are transferred through plasmids, which are responsible for the dissemination of antibiotic resistance among Acinetobacter spp. In addition, these resistance genes may also have a tendency to interact with each other or with their gene products. Therefore, it becomes necessary to understand the impact of these interactions in antibiotic resistance mechanism. Hence, our study focuses on protein and gene network analysis on various resistance genes, to elucidate the role of the interacting proteins and to study their functional contribution towards antibiotic resistance. From the search tool for the retrieval of interacting gene/protein (STRING), a total of 168 functional partners for 15 resistance genes were extracted based on the confidence scoring system. The network study was then followed up with functional clustering of associated partners using molecular complex detection (MCODE). Later, we selected eight efficient clusters based on score. Interestingly, the associated protein we identified from the network possessed greater functional similarity with known resistance genes. This network-based approach on resistance genes of A. baumannii could help in identifying new genes/proteins and provide clues on their association in antibiotic resistance. Copyright © 2014 Elsevier Ltd. All rights reserved.
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks
Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
2017-01-01
The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show that our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS-Bcl-2 and RAS-AKT, and finds significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways. PMID:29049295
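One common way to build a consensus network from many inferred networks is to keep the edges that recur in a minimum fraction of them. The sketch below shows that generic edge-frequency approach. It is an assumption for illustration only: the abstract does not specify how its consensus networks are formed or how "network resolution" is defined, and the min_fraction parameter should not be read as that concept.

```python
from collections import Counter


def consensus_edges(networks, min_fraction):
    """Keep directed edges found in at least min_fraction of the networks.

    networks: list of edge lists, each edge a (parent, child) tuple.
    Returns a dict mapping each retained edge to its observed frequency.
    """
    # set() guards against an edge being listed twice within one network.
    counts = Counter(edge for net in networks for edge in set(net))
    cutoff = min_fraction * len(networks)
    return {
        edge: count / len(networks)
        for edge, count in counts.items()
        if count >= cutoff
    }
```

Given three toy networks in which the hypothetical edge ("gene_a", "gene_b") appears every time and two other edges appear once each, a min_fraction of 0.6 retains only the recurring edge, while 0.3 retains all three.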
Xu, Peng; Wang, Junhua; Sun, Bo; Xiao, Zhongdang
2018-06-15
Investigating the potential biological function of differentially changed genes by integrating multiple omics data, including miRNA and mRNA expression profiles, remains an active topic. However, how to evaluate the repression effect on target genes by integrating miRNA and mRNA expression profiles has not been fully solved. In this study, we provide an analysis method that integrates miRNA and mRNA expression data simultaneously. Difference analysis was performed based on the repression score, and significantly repressed mRNAs were then screened out by DEGseq. Pathway analysis of the significantly repressed mRNAs shows that multiple pathways, such as the MAPK signaling pathway and the TGF-beta signaling pathway, may be correlated with colorectal cancer (CRC). Focusing on the MAPK signaling pathway, a miRNA-mRNA network centered on the cell fate genes was constructed. Finally, the miRNA-mRNA pairs that are potentially important in CRC carcinogenesis were screened out and scored by impact index. Copyright © 2018 Elsevier B.V. All rights reserved.
Modrák, Martin; Vohradský, Jiří
2018-04-13
Identifying regulons of sigma factors is a vital subtask of gene network inference. Integrating multiple sources of data is essential for correct identification of regulons and complete gene regulatory networks. Time series of expression data measured with microarrays or RNA-seq combined with static binding experiments (e.g., ChIP-seq) or literature mining may be used for inference of sigma factor regulatory networks. We introduce Genexpi: a tool to identify sigma factors by combining candidates obtained from ChIP experiments or literature mining with time-course gene expression data. While Genexpi can be used to infer other types of regulatory interactions, it was designed and validated on real biological data from bacterial regulons. In this paper, we put primary focus on CyGenexpi: a plugin integrating Genexpi with the Cytoscape software for ease of use. As a part of this effort, a plugin for handling time series data in Cytoscape called CyDataseries has been developed and made available. Genexpi is also available as a standalone command line tool and an R package. Genexpi is a useful part of gene network inference toolbox. It provides meaningful information about the composition of regulons and delivers biologically interpretable results.
Huang, Sui
2012-02-01
The Neo-Darwinian concept of natural selection is plausible when one assumes a straightforward causation of phenotype by genotype. However, such simple 1:1 mapping must now give place to the modern concepts of gene regulatory networks and gene expression noise. Both can, in the absence of genetic mutations, jointly generate a diversity of inheritable randomly occupied phenotypic states that could also serve as a substrate for natural selection. This form of epigenetic dynamics challenges Neo-Darwinism. It needs to incorporate the non-linear, stochastic dynamics of gene networks. A first step is to consider the mathematical correspondence between gene regulatory networks and Waddington's metaphoric 'epigenetic landscape', which actually represents the quasi-potential function of global network dynamics. It explains the coexistence of multiple stable phenotypes within one genotype. The landscape's topography with its attractors is shaped by evolution through mutational re-wiring of regulatory interactions - offering a link between genetic mutation and sudden, broad evolutionary changes. Copyright © 2012 WILEY Periodicals, Inc.
Zhang, Guanglin; Codoni, Veronica; Yang, Jun; Wilson, James G.; Levy, Daniel; Lusis, Aldons J.; Liu, Simin; Yang, Xia
2017-01-01
Cardiovascular diseases (CVD) and type 2 diabetes (T2D) are closely interrelated complex diseases likely sharing overlapping pathogenesis driven by aberrant activities in gene networks. However, the molecular circuitries underlying the pathogenic commonalities remain poorly understood. We sought to identify the shared gene networks and their key intervening drivers for both CVD and T2D by conducting a comprehensive integrative analysis driven by five multi-ethnic genome-wide association studies (GWAS) for CVD and T2D, expression quantitative trait loci (eQTLs), ENCODE, and tissue-specific gene network models (both co-expression and graphical models) from CVD and T2D relevant tissues. We identified pathways regulating the metabolism of lipids, glucose, and branched-chain amino acids, along with those governing oxidation, extracellular matrix, immune response, and neuronal system as shared pathogenic processes for both diseases. Further, we uncovered 15 key drivers including HMGCR, CAV1, IGF1 and PCOLCE, whose network neighbors collectively account for approximately 35% of known GWAS hits for CVD and 22% for T2D. Finally, we cross-validated the regulatory role of the top key drivers using in vitro siRNA knockdown, in vivo gene knockout, and two Hybrid Mouse Diversity Panels each comprised of >100 strains. Findings from this in-depth assessment of genetic and functional data from multiple human cohorts provide strong support that common sets of tissue-specific molecular networks drive the pathogenesis of both CVD and T2D across ethnicities and help prioritize new therapeutic avenues for both CVD and T2D. PMID:28957322
Dynamic regulation of genetic pathways and targets during aging in Caenorhabditis elegans.
He, Kan; Zhou, Tao; Shao, Jiaofang; Ren, Xiaoliang; Zhao, Zhongying; Liu, Dahai
2014-03-01
Numerous genetic targets and some individual pathways associated with aging have been identified using the worm model. However, less is known about the genome-wide genetic mechanisms of aging, particularly at the level of multiple pathways and the regulatory networks active during aging. Here, we used gene expression datasets from three time points during aging in Caenorhabditis elegans (C. elegans) and performed gene set enrichment analysis (GSEA) on each dataset between adjacent stages. As a result, multiple genetic pathways and targets were identified as significantly down- or up-regulated. Among them, 5 truly aging-dependent signaling pathways, including the MAPK, mTOR, Wnt, TGF-beta and ErbB signaling pathways, as well as 12 significantly associated genes were identified with dynamic expression patterns during aging. On the other hand, the continued declines in the regulation of several metabolic pathways have been demonstrated to display age-related changes. Furthermore, the regulatory networks reconstructed from three aging-related chromatin immunoprecipitation followed by sequencing (ChIP-seq) datasets and the expression matrices of the 154 genes involved in the above signaling pathways provide new insights into aging at the multiple-pathway level. The combination of multiple genetic pathways and targets needs to be taken into consideration in future studies of aging, in which the dynamic regulation would be uncovered.
Network Medicine for Alzheimer's Disease and Traditional Chinese Medicine.
Jarrell, Juliet T; Gao, Li; Cohen, David S; Huang, Xudong
2018-05-11
Alzheimer’s Disease (AD) is a neurodegenerative condition that currently has no known cure. The principles of the expanding field of network medicine (NM) have recently been applied to AD research. The main principle of NM proposes that diseases are much more complicated than one mutation in one gene, and incorporate different genes, connections between genes, and pathways that may include multiple diseases to create full scale disease networks. AD research findings as a result of the application of NM principles have suggested that functional network connectivity, myelination, myeloid cells, and genes and pathways may play an integral role in AD progression, and may be integral to the search for a cure. Different aspects of the AD pathology could be potential targets for drug therapy to slow down or stop the disease from advancing, but more research is needed to reach definitive conclusions. Additionally, the holistic approaches of network pharmacology in traditional Chinese medicine (TCM) research may be viable options for the AD treatment, and may lead to an effective cure for AD in the future.
Wuttke, Daniel; Connor, Richard; Vora, Chintan; Craig, Thomas; Li, Yang; Wood, Shona; Vasieva, Olga; Shmookler Reis, Robert; Tang, Fusheng; de Magalhães, João Pedro
2012-01-01
Dietary restriction (DR), limiting nutrient intake from diet without causing malnutrition, delays the aging process and extends lifespan in multiple organisms. The conserved life-extending effect of DR suggests the involvement of fundamental mechanisms, although these remain a subject of debate. To help decipher the life-extending mechanisms of DR, we first compiled a list of genes that if genetically altered disrupt or prevent the life-extending effects of DR. We called these DR–essential genes and identified more than 100 in model organisms such as yeast, worms, flies, and mice. In order for other researchers to benefit from this first curated list of genes essential for DR, we established an online database called GenDR (http://genomics.senescence.info/diet/). To dissect the interactions of DR–essential genes and discover the underlying lifespan-extending mechanisms, we then used a variety of network and systems biology approaches to analyze the gene network of DR. We show that DR–essential genes are more conserved at the molecular level and have more molecular interactions than expected by chance. Furthermore, we employed a guilt-by-association method to predict novel DR–essential genes. In budding yeast, we predicted nine genes related to vacuolar functions; we show experimentally that mutations deleting eight of those genes prevent the life-extending effects of DR. Three of these mutants (OPT2, FRE6, and RCR2) had extended lifespan under ad libitum, indicating that the lack of further longevity under DR is not caused by a general compromise of fitness. These results demonstrate how network analyses of DR using GenDR can be used to make phenotypically relevant predictions. Moreover, gene-regulatory circuits reveal that the DR–induced transcriptional signature in yeast involves nutrient-sensing, stress responses and meiotic transcription factors. 
Finally, comparing the influence of gene expression changes during DR on the interactomes of multiple organisms led us to suggest that DR commonly suppresses translation, while stimulating an ancient reproduction-related process. PMID:22912585
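The guilt-by-association step mentioned in the abstract above can be illustrated with a minimal neighbor-voting sketch. The abstract does not state which scoring rule was used, so the rule here (fraction of a gene's interaction partners that are already known DR-essential genes) is an assumption, and the gene labels and the function name `guilt_by_association` are hypothetical placeholders rather than entries from GenDR.

```python
def guilt_by_association(network, seeds):
    """Rank non-seed genes by the fraction of their neighbors that are seeds.

    network: dict mapping each gene to the set of genes it interacts with.
    seeds:   set of genes already known to have the property of interest
             (here, hypothetically, DR-essential genes).
    Returns a list of (gene, score) pairs, highest score first.
    """
    scores = {}
    for gene, neighbors in network.items():
        # Skip genes that are already annotated or have no interactions.
        if gene in seeds or not neighbors:
            continue
        scores[gene] = len(neighbors & seeds) / len(neighbors)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy undirected network with placeholder gene labels (assumed data).
toy_network = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}
toy_seeds = {"a", "b"}

for gene, score in guilt_by_association(toy_network, toy_seeds):
    print(gene, round(score, 2))
```

Candidates whose neighborhoods are dominated by known genes rank first; in practice a score like this would only prioritize genes for experimental follow-up, as the abstract describes for the yeast predictions.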
Wei, Pi-Jing; Zhang, Di; Xia, Junfeng; Zheng, Chun-Hou
2016-12-23
Cancer is a complex disease characterized by the accumulation of genetic alterations during the patient's lifetime. With the development of next-generation sequencing technology, multiple types of omics data, such as cancer genomic, epigenomic and transcriptomic data, can be measured for each individual. Correspondingly, one of the key challenges is to pinpoint functional driver mutations or pathways, which contribute to tumorigenesis, among millions of functionally neutral passenger mutations. In this paper, in order to identify driver genes effectively, we applied a generalized additive model to mutation profiles to filter genes with long length and constructed a new gene-gene interaction network. Then we integrated the mutation data and expression data into the gene-gene interaction network. Lastly, a greedy algorithm was used to prioritize candidate driver genes from the integrated data. We named the proposed method Length-Net-Driver (LNDriver). Experiments on three TCGA datasets, i.e., head and neck squamous cell carcinoma, kidney renal clear cell carcinoma and thyroid carcinoma, demonstrated that the proposed method was effective. It can identify not only frequently mutated drivers but also rare candidate driver genes.
Diversified Control Paths: A Significant Way Disease Genes Perturb the Human Regulatory Network
Wang, Bingbo; Gao, Lin; Zhang, Qingfang; Li, Aimin; Deng, Yue; Guo, Xingli
2015-01-01
Background The complexity of biological systems motivates us to use the underlying networks to provide a deep understanding of disease etiology, and human diseases are viewed as perturbations of the dynamic properties of networks. Control theory, which deals with dynamic systems, has been successfully used to capture systems-level knowledge from large numbers of quantitative biological interactions. But from the perspective of system control, the ways by which multiple genetic factors jointly perturb a disease phenotype still remain unclear. Results In this work, we combine tools from control theory and network science to address the diversified control paths in complex networks. Then the ways by which the disease genes perturb biological systems are identified and quantified by the control paths in a human regulatory network. Furthermore, as an application, prioritization of candidate genes is presented, using control path analysis and gene ontology annotation to define similarities. We use leave-one-out cross-validation to evaluate the ability to find gene-disease relationships. Results show performance comparable to previous sophisticated works, especially in directed systems. Conclusions Our results inspire a deeper understanding of the molecular mechanisms that drive pathological processes. Diversified control paths offer a basis for integrated intervention techniques which will ultimately lead to the development of novel therapeutic strategies. PMID:26284649
FastGCN: A GPU Accelerated Tool for Fast Gene Co-Expression Networks
Liang, Meimei; Zhang, Futao; Jin, Gulei; Zhu, Jun
2015-01-01
Gene co-expression networks are one valuable type of biological network. Many methods and tools have been published to construct gene co-expression networks; however, most of these tools and methods are inconvenient and time consuming for large datasets. We have developed a user-friendly, accelerated and optimized tool for constructing gene co-expression networks that can fully harness the parallel nature of GPU (Graphics Processing Unit) architectures. Genetic entropies were exploited to filter out genes with no or small expression changes in the raw data preprocessing step. Pearson correlation coefficients were then calculated. After that, we normalized these coefficients and employed the False Discovery Rate to control for multiple testing. Finally, module identification was conducted to construct the co-expression networks. All of these calculations were implemented on a GPU. We also compressed the coefficient matrix to save space. We compared the performance of the GPU implementation with those of a multi-core CPU implementation with 16 CPU threads, a single-thread C/C++ implementation and a single-thread R implementation. Our results show that the GPU implementation largely outperforms the single-thread C/C++ and single-thread R implementations, and that it outperforms the multi-core CPU implementation as the number of genes increases. With the test dataset containing 16,000 genes and 590 individuals, the GPU implementation was more than 63 times faster than the single-thread R implementation when 50 percent of genes were filtered out, and about 80 times faster when no genes were filtered out. PMID:25602758
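The correlation and multiple-testing steps of the pipeline described above can be sketched on the CPU in a few lines. This is not the FastGCN code: it assumes that the "normalization" of the coefficients is a Fisher z-transform with a normal approximation and that the False Discovery Rate is controlled with the Benjamini-Hochberg procedure, neither of which the abstract states. The entropy filter and module identification are omitted, and all function names and toy data are hypothetical.

```python
import math
from itertools import combinations
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p(r, n):
    """Two-sided p-value for r via the Fisher z approximation (assumed choice)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def benjamini_hochberg(pvals, alpha=0.05):
    """Return the set of indices rejected at FDR level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank  # largest rank that passes its threshold
    return {order[j] for j in range(k)}


def coexpression_edges(expr, alpha=0.05):
    """Gene pairs whose correlation is significant after FDR control."""
    genes = sorted(expr)
    pairs = list(combinations(genes, 2))
    n = len(next(iter(expr.values())))
    pvals = [fisher_p(pearson(expr[a], expr[b]), n) for a, b in pairs]
    keep = benjamini_hochberg(pvals, alpha)
    return [pairs[i] for i in sorted(keep)]


# Toy expression data (assumed): g2 is a linear function of g1, g3 alternates.
toy_expr = {
    "g1": [float(i) for i in range(20)],
    "g2": [2.0 * i + 1.0 for i in range(20)],
    "g3": [1.0 if i % 2 == 0 else -1.0 for i in range(20)],
}
print(coexpression_edges(toy_expr))
```

Every gene pair is processed independently, which is why this stage maps well onto a GPU; the sketch simply loops over pairs sequentially.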
Abdeltawab, Nourtan F.; Aziz, Ramy K.; Kansal, Rita; Rowe, Sarah L.; Su, Yin; Gardner, Lidia; Brannen, Charity; Nooh, Mohammed M.; Attia, Ramy R.; Abdelsamed, Hossam A.; Taylor, William L.; Lu, Lu; Williams, Robert W.; Kotb, Malak
2008-01-01
Striking individual differences in severity of group A streptococcal (GAS) sepsis have been noted, even among patients infected with the same bacterial strain. We had provided evidence that HLA class II allelic variation contributes significantly to differences in systemic disease severity by modulating host responses to streptococcal superantigens. Inasmuch as the bacteria produce additional virulence factors that participate in the pathogenesis of this complex disease, we sought to identify additional gene networks modulating GAS sepsis. Accordingly, we applied a systems genetics approach using a panel of advanced recombinant inbred mice. By analyzing disease phenotypes in the context of mice genotypes we identified a highly significant quantitative trait locus (QTL) on Chromosome 2 between 22 and 34 Mb that strongly predicts disease severity, accounting for 25%–30% of variance. This QTL harbors several polymorphic genes known to regulate immune responses to bacterial infections. We evaluated candidate genes within this QTL using multiple parameters that included linkage, gene ontology, variation in gene expression, cocitation networks, and biological relevance, and identified interleukin1 alpha and prostaglandin E synthases pathways as key networks involved in modulating GAS sepsis severity. The association of GAS sepsis with multiple pathways underscores the complexity of traits modulating GAS sepsis and provides a powerful approach for analyzing interactive traits affecting outcomes of other infectious diseases. PMID:18421376
Stojanova, Daniela; Ceci, Michelangelo; Malerba, Donato; Dzeroski, Saso
2013-09-26
Ontologies and catalogs of gene functions, such as the Gene Ontology (GO) and MIPS-FUN, assume that functional classes are organized hierarchically, that is, general functions include more specific ones. This has recently motivated the development of several machine learning algorithms for gene function prediction that leverage this hierarchical organization, in which instances may belong to multiple classes. In addition, it is possible to exploit relationships among examples, since it is plausible that related genes tend to share functional annotations. Although these relationships have been identified and extensively studied in the area of protein-protein interaction (PPI) networks, they have not received much attention in hierarchical and multi-class gene function prediction. Relations between genes introduce autocorrelation in functional annotations and violate the assumption that instances are independently and identically distributed (i.i.d.), which underlies most machine learning algorithms. Although the explicit consideration of these relations brings additional complexity to the learning process, we expect substantial benefits in the predictive accuracy of the learned classifiers. This article demonstrates the benefits (in terms of predictive accuracy) of considering autocorrelation in multi-class gene function prediction. We develop a tree-based algorithm for considering network autocorrelation in the setting of Hierarchical Multi-label Classification (HMC). We empirically evaluate the proposed algorithm, called NHMC (Network Hierarchical Multi-label Classification), on 12 yeast datasets using each of the MIPS-FUN and GO annotation schemes and exploiting 2 different PPI networks. The results clearly show that taking autocorrelation into account improves the predictive performance of the learned models for predicting gene function. 
Our newly developed method for HMC takes into account network information in the learning phase: When used for gene function prediction in the context of PPI networks, the explicit consideration of network autocorrelation increases the predictive performance of the learned models. Overall, we found that this holds for different gene features/ descriptions, functional annotation schemes, and PPI networks: Best results are achieved when the PPI network is dense and contains a large proportion of function-relevant interactions.
Multiconstrained gene clustering based on generalized projections
2010-01-01
Background Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple pieces of constraints for an optimal clustering solution still remains an unsolved problem. Results We propose a novel multiconstrained gene clustering (MGC) method within the generalized projection onto convex sets (POCS) framework used widely in image reconstruction. Each constraint is formulated as a corresponding set. The generalized projector iteratively projects the clustering solution onto these sets in order to find a consistent solution included in the intersection set that satisfies all constraints. Compared with previous MGC methods, POCS can integrate multiple constraints from different nature without distorting the original constraints. To evaluate the clustering solution, we also propose a new performance measure referred to as Gene Log Likelihood (GLL) that considers genes having more than one function and hence in more than one cluster. Comparative experimental results show that our POCS-based gene clustering method outperforms current state-of-the-art MGC methods. Conclusions The POCS-based MGC method can successfully combine multiple constraints from different nature for gene clustering. Also, the proposed GLL is an effective performance measure for the soft clustering solutions. PMID:20356386
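The idea of iteratively projecting a solution onto constraint sets can be illustrated with the classical alternating-projection form of POCS. This is a minimal sketch, not the generalized projector of the abstract above: the two convex sets used here (a box constraint and a sum-to-one hyperplane, as one might impose on a gene's soft cluster-membership vector) and all function names are assumptions chosen for illustration.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box lo <= x_i <= hi."""
    return [min(max(v, lo), hi) for v in x]


def project_sum_to_one(x):
    """Euclidean projection onto the hyperplane sum(x) == 1."""
    shift = (sum(x) - 1.0) / len(x)
    return [v - shift for v in x]


def pocs(x, projections, tol=1e-10, max_iter=1000):
    """Cycle through the projections until the iterate stops moving."""
    for _ in range(max_iter):
        previous = x
        for project in projections:
            x = project(x)
        if max(abs(a - b) for a, b in zip(x, previous)) < tol:
            break
    return x


# Hypothetical starting membership vector that violates both constraints.
start = [0.9, 0.8, -0.3]
solution = pocs(start, [project_sum_to_one, project_box])
print([round(v, 4) for v in solution])
```

Each constraint only has to supply its own projection, so further constraints can be appended to the list without modifying the others, which reflects the abstract's point that constraints are combined without distorting one another.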
Characterizing mutation-expression network relationships in multiple cancers.
Ghazanfar, Shila; Yang, Jean Yee Hwa
2016-08-01
Data made available through large cancer consortia like The Cancer Genome Atlas make for a rich source of information to be studied across and between cancers. In recent years, network approaches have been applied to such data in uncovering the complex interrelationships between mutational and expression profiles, but lack direct testing for expression changes via mutation. In this pan-cancer study we analyze mutation and gene expression information in an integrative manner by considering the networks generated by testing for differences in expression in direct association with specific mutations. We relate our findings among the 19 cancers examined to identify commonalities and differences as well as their characteristics. Using somatic mutation and gene expression information across 19 cancers, we generated mutation-expression networks per cancer. On evaluation we found that our generated networks were significantly enriched for known cancer-related genes, such as skin cutaneous melanoma (p<0.01 using Network of Cancer Genes 4.0). Our framework identified that while different cancers contained commonly mutated genes, there was little concordance between associated gene expression changes among cancers. Comparison between cancers showed a greater overlap of network nodes for cancers with higher overall non-silent mutation load, compared to those with a lower overall non-silent mutation load. This study offers a framework that explores network information through co-analysis of somatic mutations and gene expression profiles. Our pan-cancer application of this approach suggests that while mutations are frequently common among cancer types, the impact they have on the surrounding networks via gene expression changes varies. Despite this finding, there are some cancers for which mutation-associated network behaviour appears to be similar: suggesting a potential framework for uncovering related cancers for which similar therapeutic strategies may be applicable. 
Our framework for understanding relationships among cancers has been integrated into an interactive R Shiny application, PAn Cancer Mutation Expression Networks (PACMEN), containing dynamic and static network visualization of the mutation-expression networks. PACMEN also features tools for further examination of network topology characteristics among cancers. Copyright © 2016 Elsevier Ltd. All rights reserved.
Core Promoter Functions in the Regulation of Gene Expression of Drosophila Dorsal Target Genes*
Zehavi, Yonathan; Kuznetsov, Olga; Ovadia-Shochat, Avital; Juven-Gershon, Tamar
2014-01-01
Developmental processes are highly dependent on transcriptional regulation by RNA polymerase II. The RNA polymerase II core promoter is the ultimate target of a multitude of transcription factors that control transcription initiation. Core promoters consist of core promoter motifs, e.g. the initiator, TATA box, and the downstream core promoter element (DPE), which confer specific properties to the core promoter. Here, we explored the importance of core promoter functions in the dorsal-ventral developmental gene regulatory network. This network includes multiple genes that are activated by different nuclear concentrations of Dorsal, an NFκB homolog transcription factor, along the dorsal-ventral axis. We show that over two-thirds of Dorsal target genes contain DPE sequence motifs, which is significantly higher than the proportion of DPE-containing promoters in Drosophila genes. We demonstrate that multiple Dorsal target genes are evolutionarily conserved and functionally dependent on the DPE. Furthermore, we have analyzed the activation of key Dorsal target genes by Dorsal, as well as by another Rel family transcription factor, Relish, and the dependence of their activation on the DPE motif. Using hybrid enhancer-promoter constructs in Drosophila cells and embryo extracts, we have demonstrated that the core promoter composition is an important determinant of transcriptional activity of Dorsal target genes. Taken together, our results provide evidence for the importance of core promoter composition in the regulation of Dorsal target genes. PMID:24634215
Xiao, Xiaolin; Moreno-Moral, Aida; Rotival, Maxime; Bottolo, Leonardo; Petretto, Enrico
2014-01-01
Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data in multiple conditions (e.g., cell-types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet, genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally-efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network-modules simultaneously across multiple conditions. Our method can accommodate weighted (and unweighted) networks of any size and can similarly use co-expression or raw gene expression input data, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, we demonstrated distinctive advantages of our method over existing methods, which was able to recover accurately both common and condition-specific network-modules without entailing ad-hoc input parameters as required by other approaches. We applied our method to genome-scale and multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance, which were not detected by other methods. In humans we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interactions processes in ventricular zones during neocortex expansion and further, we uncovered pathways related to development of later cognitive functions in the cortical plate of the developing brain which were previously unappreciated. 
Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed heat shock protein (Hsp) and cardiomyopathy genes (Bag3, Cryab, Kras, Emd, Plec), which was significantly replicated using separate failing heart and liver gene expression datasets in humans, thus revealing a conserved functional role for Hsp genes in cardiovascular disease.
Waliszewski, P; Molski, M; Konarski, J
1998-06-01
A keystone of the molecular reductionist approach to cellular biology is a specific deductive strategy relating genotype to phenotype, two distinct categories. This relationship is based on the assumption that the intermediary cellular network of actively transcribed genes and their regulatory elements is deterministic (i.e., a link between expression of a gene and a phenotypic trait can always be identified, and evolution of the network in time is predetermined). However, experimental data suggest that the relationship between genotype and phenotype is nonbijective (i.e., a gene can contribute to the emergence of more than just one phenotypic trait or a phenotypic trait can be determined by expression of several genes). This implies nonlinearity (i.e., lack of a proportional relationship between input and outcome), complexity (i.e., emergence of a hierarchical network of multiple cross-interacting elements that is sensitive to initial conditions, possesses multiple equilibria, organizes spontaneously into different morphological patterns, and is controlled in a dispersed rather than centralized manner), and quasi-determinism (i.e., coexistence of deterministic and nondeterministic events) of the network. Nonlinearity within the space of the cellular molecular events underlies the existence of a fractal structure within a number of metabolic processes and patterns of tissue growth, which is measured experimentally as a fractal dimension. Because of the network's complexity, the same phenotype can be associated with a number of alternative sequences of cellular events. Moreover, the primary cause initiating phenotypic evolution of cells such as malignant transformation can be favored probabilistically, but not identified unequivocally. Thermodynamic fluctuations of energy rather than gene mutations, the material traits of the fluctuations, alter both the molecular and informational structure of the network. 
Then, the interplay between deterministic chaos, complexity, self-organization, and natural selection drives formation of malignant phenotype. This concept offers a novel perspective for investigation of tumorigenesis without invalidating current molecular findings. The essay integrates the ideas of the sciences of complexity in a biological context.
Identification of five novel modifier loci of ApcMin harbored in the BXH14 recombinant inbred strain
Siracusa, Linda D.
2012-01-01
Every year thousands of people in the USA are diagnosed with small intestine and colorectal cancers (CRC). Although environmental factors affect disease etiology, uncovering underlying genetic factors is imperative for risk assessment and developing preventative therapies. Familial adenomatous polyposis is a heritable genetic disorder in which individuals carry germ-line mutations in the adenomatous polyposis coli (APC) gene that predisposes them to CRC. The ApcMin mouse model carries a point mutation in the Apc gene and develops polyps along the intestinal tract. Inbred strain background influences polyp phenotypes in ApcMin mice. Several Modifier of Min (Mom) loci that alter tumor phenotypes associated with the ApcMin mutation have been identified to date. We screened BXH recombinant inbred (RI) strains by crossing BXH RI females with C57BL/6J (B6) ApcMin males and quantitating tumor phenotypes in backcross progeny. We found that the BXH14 RI strain harbors five modifier loci that decrease polyp multiplicity. Furthermore, we show that resistance is determined by varying combinations of these modifier loci. Gene interaction network analysis shows that there are multiple networks with proven gene–gene interactions, which contain genes from all five modifier loci. We discuss the implications of this result for studies that define susceptibility loci, namely that multiple networks may be acting concurrently to alter tumor phenotypes. Thus, the significance of this work resides not only with the modifier loci we identified but also with the combinations of loci needed to get maximal protection against polyposis and the impact of this finding on human disease studies. Abbreviations: APC, adenomatous polyposis coli; GWAS, genome-wide association studies; QTL, quantitative trait loci; SNP, single-nucleotide polymorphism. PMID:22637734
Hu, Wei; Xia, Zhiqiang; Yan, Yan; Ding, Zehong; Tie, Weiwei; Wang, Lianzhe; Zou, Meiling; Wei, Yunxie; Lu, Cheng; Hou, Xiaowan; Wang, Wenquan; Peng, Ming
2015-01-01
Cassava is an important food and potential biofuel crop that is tolerant to multiple abiotic stressors. The mechanisms underlying these tolerances are currently poorly understood. CBL-interacting protein kinases (CIPKs) have been shown to play crucial roles in plant developmental processes, hormone signal transduction, and the response to abiotic stress. However, no data are currently available about the CIPK family in cassava. In this study, a total of 25 CIPK genes were identified from the cassava genome based on our previous genome sequencing data. Phylogenetic analysis suggested that the 25 MeCIPKs could be classified into four subfamilies, which was supported by exon-intron organizations and the architectures of conserved protein motifs. Transcriptomic analysis of a wild subspecies and two cultivated varieties showed that most MeCIPKs had different expression patterns between the wild subspecies and the cultivars in different tissues or in response to drought stress. Some orthologous genes involved in CIPK interaction networks were identified between Arabidopsis and cassava. The interaction networks and co-expression patterns of these orthologous genes revealed that the crucial pathways controlled by CIPK networks may be involved in the differential response to drought stress in different accessions of cassava. Nine MeCIPK genes were selected to investigate their transcriptional response to various stimuli, and the results showed a comprehensive response of the tested MeCIPK genes to osmotic, salt, cold, and oxidative stressors and to ABA signaling. The identification and expression analysis of the CIPK family suggested that CIPK genes are important components of development and multiple signal transduction pathways in cassava. The findings of this study will help lay a foundation for the functional characterization of the CIPK gene family and provide an improved understanding of abiotic stress responses and signal transduction in cassava. PMID:26579161
Hansen, Bjoern Oest; Meyer, Etienne H; Ferrari, Camilla; Vaid, Neha; Movahedi, Sara; Vandepoele, Klaas; Nikoloski, Zoran; Mutwil, Marek
2018-03-01
Recent advances in gene function prediction rely on ensemble approaches that integrate results from multiple inference methods to produce superior predictions. Yet, these developments remain largely unexplored in plants. We have explored and compared two methods to integrate 10 gene co-function networks for Arabidopsis thaliana and demonstrate how the integration of these networks produces more accurate gene function predictions for a larger fraction of genes with unknown function. These predictions were used to identify genes involved in mitochondrial complex I formation, and for five of them, we confirmed the predictions experimentally. The ensemble predictions are provided as a user-friendly online database, EnsembleNet. The methods presented here demonstrate that ensemble gene function prediction is a powerful method to boost prediction performance, whereas the EnsembleNet database provides a cutting-edge community tool to guide experimentalists. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Shadows of complexity: what biological networks reveal about epistasis and pleiotropy
Tyler, Anna L.; Asselbergs, Folkert W.; Williams, Scott M.; Moore, Jason H.
2011-01-01
Pleiotropy, in which one mutation causes multiple phenotypes, has traditionally been seen as a deviation from the conventional observation in which one gene affects one phenotype. Epistasis, or gene-gene interaction, has also been treated as an exception to the Mendelian one gene-one phenotype paradigm. This simplified perspective belies the pervasive complexity of biology and hinders progress toward a deeper understanding of biological systems. We assert that epistasis and pleiotropy are not isolated occurrences, but ubiquitous and inherent properties of biomolecular networks. These phenomena should not be treated as exceptions, but rather as fundamental components of genetic analyses. A systems level understanding of epistasis and pleiotropy is, therefore, critical to furthering our understanding of human genetics and its contribution to common human disease. Finally, graph theory offers an intuitive and powerful set of tools with which to study the network bases of these important genetic phenomena. PMID:19204994
NASA Astrophysics Data System (ADS)
Onoyama, Takashi; Maekawa, Takuya; Kubota, Sen; Tsuruta, Setuso; Komoda, Norihisa
To build a cooperative logistics network covering multiple enterprises, a planning method that can build a long-distance transportation network is required. Many strict constraints are imposed on this type of problem. To solve these strict-constraint problems, a selfish constraint satisfaction genetic algorithm (GA) is proposed. In this GA, each gene of an individual selfishly satisfies only its own constraint, disregarding the constraints of other genes in the same individual. Moreover, a constraint pre-checking method is applied to improve the GA convergence speed. Experimental results show that the proposed method can obtain an accurate solution in a practical response time.
Modularity and evolutionary constraints in a baculovirus gene regulatory network
2013-01-01
Background The structure of regulatory networks remains an open question in our understanding of complex biological systems. Interactions during complete viral life cycles present unique opportunities to understand how host-parasite networks take shape and behave. The Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is a large double-stranded DNA virus, whose genome may encode 152 open reading frames (ORFs). Here we present the analysis of the ordered cascade of the AgMNPV gene expression. Results We observed an earlier onset of the expression than previously reported for other baculoviruses, especially for genes involved in DNA replication. Most ORFs were expressed at higher levels in a more permissive host cell line. Genes with more than one copy in the genome had distinct expression profiles, which could indicate the acquisition of new functionalities. The transcription gene regulatory network (GRN) for 149 ORFs had a modular topology comprising five communities of highly interconnected nodes that separated key functionally related genes into different communities, possibly maximizing redundancy and GRN robustness by compartmentalization of important functions. Core conserved functions showed expression synchronicity, distinct GRN features and significantly less genetic diversity, consistent with evolutionary constraints imposed on key elements of biological systems. This reduced genetic diversity also had a positive correlation with the importance of the gene in our estimated GRN, supporting a relationship between phylogenetic data of baculovirus genes and network features inferred from expression data. We also observed that gene arrangement in overlapping transcripts was conserved among related baculoviruses, suggesting a principle of genome organization.
Conclusions Albeit with a reduced number of nodes (149), the AgMNPV GRN had a topology and key characteristics similar to those observed in complex cellular organisms, which indicates that modularity may be a general feature of biological gene regulatory networks. PMID:24006890
Deep conservation of cis-regulatory elements in metazoans
Maeso, Ignacio; Irimia, Manuel; Tena, Juan J.; Casares, Fernando; Gómez-Skarmeta, José Luis
2013-01-01
Despite the vast morphological variation observed across phyla, animals share multiple basic developmental processes orchestrated by a common ancestral gene toolkit. These genes interact with each other building complex gene regulatory networks (GRNs), which are encoded in the genome by cis-regulatory elements (CREs) that serve as computational units of the network. Although GRN subcircuits involved in ancient developmental processes are expected to be at least partially conserved, identification of CREs that are conserved across phyla has remained elusive. Here, we review recent studies that revealed such deeply conserved CREs do exist, discuss the difficulties associated with their identification and describe new approaches that will facilitate this search. PMID:24218633
Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui
2017-01-01
Research on gene regulatory networks (GRNs) reveals complex life phenomena from the perspective of gene interaction and is an important field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. To make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on the dynamic Bayesian network (DBN) to construct multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm to learn network structure profiles, that is, to construct the search space. The redundant regulations are then removed using the recursive optimization (RO) algorithm, thereby reducing the false positive rate. Next, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and to search for the optimal network structure. The performance of the DBNCS algorithm is evaluated on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs. PMID:29113310
Identifying metabolic enzymes with multiple types of association evidence
Kharchenko, Peter; Chen, Lifeng; Freund, Yoav; Vitkup, Dennis; Church, George M
2006-01-01
Background Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions which cannot be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes. Results We present a novel method for identifying genes encoding a specific metabolic function based on the local structure of the metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events and others. Using E. coli and S. cerevisiae metabolic networks, we illustrate the predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained based on the combination of all data. In this way our method is able to predict 60% of enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and as the top candidate in 43% of the cases. Conclusion We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities. PMID:16571130
Krienen, Fenna M.; Yeo, B. T. Thomas; Ge, Tian; Buckner, Randy L.; Sherwood, Chet C.
2016-01-01
The human brain is patterned with disproportionately large, distributed cerebral networks that connect multiple association zones in the frontal, temporal, and parietal lobes. The expansion of the cortical surface, along with the emergence of long-range connectivity networks, may be reflected in changes to the underlying molecular architecture. Using the Allen Institute’s human brain transcriptional atlas, we demonstrate that genes particularly enriched in supragranular layers of the human cerebral cortex relative to mouse distinguish major cortical classes. The topography of transcriptional expression reflects large-scale brain network organization consistent with estimates from functional connectivity MRI and anatomical tracing in nonhuman primates. Microarray expression data for genes preferentially expressed in human upper layers (II/III), but enriched only in lower layers (V/VI) of mouse, were cross-correlated to identify molecular profiles across the cerebral cortex of postmortem human brains (n = 6). Unimodal sensory and motor zones have similar molecular profiles, despite being distributed across the cortical mantle. Sensory/motor profiles were anticorrelated with paralimbic and certain distributed association network profiles. Tests of alternative gene sets did not consistently distinguish sensory and motor regions from paralimbic and association regions: (i) genes enriched in supragranular layers in both humans and mice, (ii) genes cortically enriched in humans relative to nonhuman primates, (iii) genes related to connectivity in rodents, (iv) genes associated with human and mouse connectivity, and (v) 1,454 gene sets curated from known gene ontologies. Molecular innovations of upper cortical layers may be an important component in the evolution of long-range corticocortical projections. PMID:26739559
Synchronous versus asynchronous modeling of gene regulatory networks.
Garg, Abhishek; Di Cara, Alessandro; Xenarios, Ioannis; Mendoza, Luis; De Micheli, Giovanni
2008-09-01
In silico modeling of gene regulatory networks has gained some momentum recently due to increased interest in analyzing the dynamics of biological systems. This has been further facilitated by the increasing availability of experimental data on gene-gene, protein-protein and gene-protein interactions. The two dynamical properties that are often experimentally testable are perturbations and stable steady states. Although a lot of work has been done on the identification of steady states, not much work has been reported on in silico modeling of cellular differentiation processes. In this manuscript, we provide algorithms based on reduced ordered binary decision diagrams (ROBDDs) for Boolean modeling of gene regulatory networks. Algorithms for synchronous and asynchronous transition models have been proposed and their corresponding computational properties have been analyzed. These algorithms allow users to compute cyclic attractors of large networks that are currently not feasible using existing software. Hereby we provide a framework to analyze the effect of multiple gene perturbation protocols, and their effect on cell differentiation processes. These algorithms were validated on the T-helper model showing the correct steady state identification and Th1-Th2 cellular differentiation process. The software binaries for Windows and Linux platforms can be downloaded from http://si2.epfl.ch/~garg/genysis.html.
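The difference between the synchronous and asynchronous transition models can be illustrated on a very small example. The sketch below is not the ROBDD-based algorithm of the abstract above; it enumerates states explicitly, which only works for tiny networks. The two-gene toggle switch and all function names are hypothetical and chosen for illustration only.

```python
from itertools import product

# Hypothetical two-gene toggle switch: each gene represses the other.
# A state is a tuple of 0/1 values, one entry per gene.
RULES = (
    lambda s: 1 - s[1],  # next value of gene 0 (repressed by gene 1)
    lambda s: 1 - s[0],  # next value of gene 1 (repressed by gene 0)
)


def sync_step(state):
    """Synchronous model: every gene is updated at the same time."""
    return tuple(rule(state) for rule in RULES)


def async_successors(state):
    """Asynchronous model: one gene is updated per step.

    Returns the set of distinct states reachable in one step
    (updates that leave the state unchanged are ignored).
    """
    successors = set()
    for i, rule in enumerate(RULES):
        nxt = list(state)
        nxt[i] = rule(state)
        if tuple(nxt) != state:
            successors.add(tuple(nxt))
    return successors


def all_states():
    return list(product((0, 1), repeat=len(RULES)))


def steady_states():
    """States that map to themselves; these are shared by both models."""
    return [s for s in all_states() if sync_step(s) == s]


def sync_attractors():
    """Follow each synchronous trajectory until a state repeats."""
    attractors = set()
    for start in all_states():
        seen = []
        state = start
        while state not in seen:
            seen.append(state)
            state = sync_step(state)
        # The attractor is the part of the trajectory from the repeat onward.
        attractors.add(frozenset(seen[seen.index(state):]))
    return attractors
```

In this toy network the synchronous model has a two-state cycle between (0, 0) and (1, 1) in addition to the two steady states, whereas under asynchronous updating both of those states can move directly to a steady state, so the cycle is no longer an attractor.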
Empirical Bayes conditional independence graphs for regulatory network recovery.
Mahdi, Rami; Madduri, Abishek S; Wang, Guoqing; Strulovici-Barel, Yael; Salit, Jacqueline; Hackett, Neil R; Crystal, Ronald G; Mezey, Jason G
2012-08-01
Computational inference methods that make use of graphical models to extract regulatory networks from gene expression data can have difficulty reconstructing dense regions of a network, a consequence of both computational complexity and unreliable parameter estimation when sample size is small. As a result, identification of hub genes is of special difficulty for these methods. We present a new algorithm, Empirical Light Mutual Min (ELMM), for large network reconstruction that has properties well suited for recovery of graphs with high-degree nodes. ELMM reconstructs the undirected graph of a regulatory network using empirical Bayes conditional independence testing with a heuristic relaxation of independence constraints in dense areas of the graph. This relaxation allows only one gene of a pair with a putative relation to be aware of the network connection, an approach that is aimed at easing multiple testing problems associated with recovering densely connected structures. Using in silico data, we show that ELMM has better performance than commonly used network inference algorithms including GeneNet, ARACNE, FOCI, GENIE3 and GLASSO. We also apply ELMM to reconstruct a network among 5492 genes expressed in human lung airway epithelium of healthy non-smokers, healthy smokers and individuals with chronic obstructive pulmonary disease assayed using microarrays. The analysis identifies dense sub-networks that are consistent with known regulatory relationships in the lung airway and also suggests novel hub regulatory relationships among a number of genes that play roles in oxidative stress and secretion. Software for running ELMM is made available at http://mezeylab.cb.bscb.cornell.edu/Software.aspx. Contact: ramimahdi@yahoo.com or jgm45@cornell.edu. Supplementary data are available at Bioinformatics online.
Acerbi, Enzo; Viganò, Elena; Poidinger, Michael; Mortellaro, Alessandra; Zelante, Teresa; Stella, Fabio
2016-01-01
T helper 17 (TH17) cells represent a pivotal adaptive cell subset involved in multiple immune disorders in mammalian species. Deciphering the molecular interactions regulating TH17 cell differentiation is particularly critical for novel drug target discovery designed to control maladaptive inflammatory conditions. Using continuous time Bayesian networks over a time-course gene expression dataset, we inferred the global regulatory network controlling TH17 differentiation. From the network, we identified the Prdm1 gene encoding the B lymphocyte-induced maturation protein 1 as a crucial negative regulator of human TH17 cell differentiation. The results have been validated by perturbing Prdm1 expression on freshly isolated CD4+ naïve T cells: reduction of Prdm1 expression leads to augmentation of IL-17 release. These data unravel a possible novel target to control TH17 polarization in inflammatory disorders. Furthermore, this study represents the first in vitro validation of continuous time Bayesian networks as a gene network reconstruction method and as a hypothesis generation tool for wet-lab biological experiments. PMID:26976045
Xing, Li-Bo; Zhang, Dong; Li, You-Mei; Shen, Ya-Wen; Zhao, Cai-Ping; Ma, Juan-Juan; An, Na; Han, Ming-Yu
2015-10-01
Flower induction in apple (Malus domestica Borkh.) is regulated by complex gene networks that involve multiple signal pathways to ensure flower bud formation in the next year, but the molecular determinants of apple flower induction are still unknown. In this research, transcriptomic profiles from differentiating buds allowed us to identify genes potentially involved in signaling pathways that mediate the regulatory mechanisms of flower induction. A hypothetical model for this regulatory mechanism was obtained by analysis of the available transcriptomic data, suggesting that sugar-, hormone- and flowering-related genes, as well as those involved in cell-cycle induction, participated in the apple flower induction process. Sugar levels and metabolism-related gene expression profiles revealed that sucrose is the initiation signal in flower induction. Complex hormone regulatory networks involved in cytokinin (CK), abscisic acid (ABA) and gibberellic acid pathways also induce apple flower formation. CK plays a key role in the regulation of cell formation and differentiation, and in affecting flowering-related gene expression levels during these processes. Meanwhile, ABA levels and ABA-related gene expression levels gradually increased, as did those of sugar metabolism-related genes, in developing buds, indicating that ABA signals regulate apple flower induction by participating in the sugar-mediated flowering pathway. Furthermore, changes in sugar and starch deposition levels in buds can be affected by ABA content and the expression of the genes involved in the ABA signaling pathway. Thus, multiple pathways, which are mainly mediated by crosstalk between sugar and hormone signals, regulate the molecular network involved in bud growth and flower induction in apple trees. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
2013-12-18
include interactive gene and methylation profiles, interactive heatmaps, Cytoscape network views, Integrative Genomics Viewer (IGV), and protein-protein...single chart. The website also provides an option to include multiple genes. Integrative Genomics Viewer (IGV) is a high-performance desktop tool for
Quantifying the underlying landscape and paths of cancer
Li, Chunhe; Wang, Jin
2014-01-01
Cancer is a disease regulated by the underlying gene networks. The emergence of normal and cancer states as well as the transformation between them can be thought of as a result of the gene network interactions and associated changes. We developed a global potential landscape and path framework to quantify cancer and associated processes. We constructed a cancer gene regulatory network based on experimental evidence and uncovered the underlying landscape. The resulting tristable landscape characterizes important biological states: normal, cancer and apoptosis. The landscape topography in terms of barrier heights between stable state attractors quantifies the global stability of the cancer network system. We propose two mechanisms of cancerization: one is by changes of landscape topography through changes in the regulation strengths of the gene networks; the other is by fluctuations that help the system to go over the critical barrier at fixed landscape topography. The kinetic paths from the least action principle quantify the transition processes among the normal, cancer and apoptosis states. The kinetic rates quantify the transition speeds among the normal, cancer and apoptosis attractors. By global sensitivity analysis of the gene network parameters on the landscape topography, we uncovered some key gene regulations determining the transitions between cancer and normal states. This can be used to guide the design of new anti-cancer tactics, through a cocktail strategy of targeting multiple key regulation links simultaneously, for preventing cancer occurrence or transforming the early cancer state back to the normal state. PMID:25232051
Cross-platform method for identifying candidate network biomarkers for prostate cancer.
Jin, G; Zhou, X; Cui, K; Zhang, X-S; Chen, L; Wong, S T C
2009-11-01
Discovering biomarkers using mass spectrometry (MS) and microarray expression profiles is a promising strategy in molecular diagnosis. Here, the authors proposed a new pipeline for biomarker discovery that integrates disease information for proteins and genes, expression profiles at both the genomic and proteomic levels, and protein-protein interactions (PPIs) to discover high-confidence network biomarkers. Using this pipeline, a total of 474 molecules (genes and proteins) related to prostate cancer were identified and a prostate-cancer-related network (PCRN) was derived from the integrative information. Thus, a set of candidate network biomarkers was identified from multiple expression profiles comprising eight microarray datasets and one proteomics dataset. The network biomarkers with PPIs can accurately distinguish prostate cancer patients from normal individuals, and potentially provide more reliable biomarker candidates than conventional biomarker discovery methods.
Semi-Supervised Multi-View Learning for Gene Network Reconstruction
Ceci, Michelangelo; Pio, Gianvito; Kuzmanovski, Vladimir; Džeroski, Sašo
2015-01-01
The task of gene regulatory network reconstruction from high-throughput data is receiving increasing attention in recent years. As a consequence, many inference methods for solving this task have been proposed in the literature. It has been recently observed, however, that no single inference method performs optimally across all datasets. It has also been shown that the integration of predictions from multiple inference methods is more robust and shows high performance across diverse datasets. Inspired by this research, in this paper, we propose a machine learning solution which learns to combine predictions from multiple inference methods. While this approach adds additional complexity to the inference process, we expect it would also carry substantial benefits. These would come from the automatic adaptation to patterns on the outputs of individual inference methods, so that it is possible to identify regulatory interactions more reliably when these patterns occur. This article demonstrates the benefits (in terms of accuracy of the reconstructed networks) of the proposed method, which exploits an iterative, semi-supervised ensemble-based algorithm. The algorithm learns to combine the interactions predicted by many different inference methods in the multi-view learning setting. The empirical evaluation of the proposed algorithm on a prokaryotic model organism (E. coli) and on a eukaryotic model organism (S. cerevisiae) clearly shows improved performance over the state of the art methods. The results indicate that gene regulatory network reconstruction for the real datasets is more difficult for S. cerevisiae than for E. coli. The software, all the datasets used in the experiments and all the results are available for download at the following link: http://figshare.com/articles/Semi_supervised_Multi_View_Learning_for_Gene_Network_Reconstruction/1604827. PMID:26641091
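The idea of integrating predictions from multiple inference methods can be illustrated with a simple baseline: averaging the rank each method assigns to every candidate edge. This is not the semi-supervised multi-view algorithm proposed in the abstract above, only a minimal sketch of ensemble combination; the function name, the edge names, the scores, and the convention of giving missing edges the worst rank are all assumptions made for illustration.

```python
def average_rank_ensemble(score_tables):
    """Combine edge predictions from several methods by mean rank.

    score_tables: list of dicts mapping an edge (regulator, target)
    to a confidence score, where higher means more confident.
    Returns a list of (mean_rank, edge) sorted from best to worst.
    Assumption: an edge missing from a method gets the worst possible rank.
    """
    edges = sorted(set().union(*score_tables))
    worst = len(edges)
    totals = {edge: 0.0 for edge in edges}
    for table in score_tables:
        # Rank 1 is the most confident edge; ties are broken by edge name.
        ranked = sorted(table, key=lambda e: (-table[e], e))
        rank = {edge: i + 1 for i, edge in enumerate(ranked)}
        for edge in edges:
            totals[edge] += rank.get(edge, worst)
    n_methods = len(score_tables)
    return sorted((totals[edge] / n_methods, edge) for edge in edges)


# Hypothetical outputs of three inference methods on three candidate edges.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.7, ("g2", "g3"): 0.8}
method_c = {("g1", "g2"): 0.6, ("g1", "g3"): 0.1, ("g2", "g3"): 0.4}

ranking = average_rank_ensemble([method_a, method_b, method_c])
for mean_rank, edge in ranking:
    print(f"{edge[0]} -> {edge[1]}: mean rank {mean_rank:.2f}")
```

Rank averaging treats every method as equally reliable. A learned combiner such as the one described in the abstract would instead adapt the weighting to patterns in the individual methods' outputs.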
MyGeneFriends: A Social Network Linking Genes, Genetic Diseases, and Researchers.
Allot, Alexis; Chennen, Kirsley; Nevers, Yannis; Poidevin, Laetitia; Kress, Arnaud; Ripp, Raymond; Thompson, Julie Dawn; Poch, Olivier; Lecompte, Odile
2017-06-16
The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.0 and particularly social networks, which are at the forefront of big data exploration and human-data interaction. MyGeneFriends is a Web platform inspired by social networks, devoted to genetic disease analysis, and organized around three types of proactive agents: genes, humans, and genetic diseases. The aim of this study was to improve exploration and exploitation of biological, postgenomic era big data. MyGeneFriends leverages conventions popularized by top social networks (Facebook, LinkedIn, etc), such as networks of friends, profile pages, friendship recommendations, affinity scores, news feeds, content recommendation, and data visualization. MyGeneFriends provides simple and intuitive interactions with data through evaluation and visualization of connections (friendships) between genes, humans, and diseases. The platform suggests new friends and publications and allows agents to follow the activity of their friends. It dynamically personalizes information depending on the user's specific interests and provides an efficient way to share information with collaborators. Furthermore, the user's behavior itself generates new information that constitutes an added value integrated in the network, which can be used to discover new connections between biological agents. We have developed MyGeneFriends, a Web platform leveraging conventions from popular social networks to redefine the relationship between humans and biological big data and improve human processing of biomedical data. MyGeneFriends is available at lbgi.fr/mygenefriends. 
©Alexis Allot, Kirsley Chennen, Yannis Nevers, Laetitia Poidevin, Arnaud Kress, Raymond Ripp, Julie Dawn Thompson, Olivier Poch, Odile Lecompte. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 16.06.2017.
Node-Based Learning of Multiple Gaussian Graphical Models
Mohan, Karthik; London, Palma; Fazel, Maryam; Witten, Daniela; Lee, Su-In
2014-01-01
We consider the problem of estimating high-dimensional Gaussian graphical models corresponding to a single set of variables under several distinct conditions. This problem is motivated by the task of recovering transcriptional regulatory networks on the basis of gene expression data containing heterogeneous samples, such as different disease states, multiple species, or different developmental stages. We assume that most aspects of the conditional dependence networks are shared, but that there are some structured differences between them. Rather than assuming that similarities and differences between networks are driven by individual edges, we take a node-based approach, which in many cases provides a more intuitive interpretation of the network differences. We consider estimation under two distinct assumptions: (1) differences between the K networks are due to individual nodes that are perturbed across conditions, or (2) similarities among the K networks are due to the presence of common hub nodes that are shared across all K networks. Using a row-column overlap norm penalty function, we formulate two convex optimization problems that correspond to these two assumptions. We solve these problems using an alternating direction method of multipliers algorithm, and we derive a set of necessary and sufficient conditions that allows us to decompose the problem into independent subproblems so that our algorithm can be scaled to high-dimensional settings. Our proposal is illustrated on synthetic data, a webpage data set, and a brain cancer gene expression data set. PMID:25309137
dbCPG: A web resource for cancer predisposition genes.
Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng
2016-06-21
Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identification of these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human (724 protein-coding, 23 non-coding, and 80 unknown type genes), 637 rat, and 658 mouse CPGs. Furthermore, data mining was performed to gain insights into the CPG data, including functional annotation, gene prioritization, network analysis of prioritized genes and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes.
Elia, Josephine; Glessner, Joseph T; Wang, Kai; Takahashi, Nagahide; Shtir, Corina J; Hadley, Dexter; Sleiman, Patrick M A; Zhang, Haitao; Kim, Cecilia E; Robison, Reid; Lyon, Gholson J; Flory, James H; Bradfield, Jonathan P; Imielinski, Marcin; Hou, Cuiping; Frackelton, Edward C; Chiavacci, Rosetta M; Sakurai, Takeshi; Rabin, Cara; Middleton, Frank A; Thomas, Kelly A; Garris, Maria; Mentch, Frank; Freitag, Christine M; Steinhausen, Hans-Christoph; Todorov, Alexandre A; Reif, Andreas; Rothenberger, Aribert; Franke, Barbara; Mick, Eric O; Roeyers, Herbert; Buitelaar, Jan; Lesch, Klaus-Peter; Banaschewski, Tobias; Ebstein, Richard P; Mulas, Fernando; Oades, Robert D; Sergeant, Joseph; Sonuga-Barke, Edmund; Renner, Tobias J; Romanos, Marcel; Romanos, Jasmin; Warnke, Andreas; Walitza, Susanne; Meyer, Jobst; Pálmason, Haukur; Seitz, Christiane; Loo, Sandra K; Smalley, Susan L; Biederman, Joseph; Kent, Lindsey; Asherson, Philip; Anney, Richard J L; Gaynor, J William; Shaw, Philip; Devoto, Marcella; White, Peter S; Grant, Struan F A; Buxbaum, Joseph D; Rapoport, Judith L; Williams, Nigel M; Nelson, Stanley F; Faraone, Stephen V; Hakonarson, Hakon
2014-01-01
Attention deficit hyperactivity disorder (ADHD) is a common, heritable neuropsychiatric disorder of unknown etiology. We performed a whole-genome copy number variation (CNV) study on 1,013 cases with ADHD and 4,105 healthy children of European ancestry using 550,000 SNPs. We evaluated statistically significant findings in multiple independent cohorts, with a total of 2,493 cases with ADHD and 9,222 controls of European ancestry, using matched platforms. CNVs affecting metabotropic glutamate receptor genes were enriched across all cohorts (P = 2.1 × 10^-9). We saw GRM5 (encoding glutamate receptor, metabotropic 5) deletions in ten cases and one control (P = 1.36 × 10^-6). We saw GRM7 deletions in six cases, and we saw GRM8 deletions in eight cases and no controls. GRM1 was duplicated in eight cases. We experimentally validated the observed variants using quantitative RT-PCR. A gene network analysis showed that genes interacting with the genes in the GRM family are enriched for CNVs in ~10% of the cases (P = 4.38 × 10^-10) after correction for occurrence in the controls. We identified rare recurrent CNVs affecting glutamatergic neurotransmission genes that were overrepresented in multiple ADHD cohorts. PMID:22138692
DOE Office of Scientific and Technical Information (OSTI.GOV)
Faria, Jose P.; Overbeek, Ross; Taylor, Ronald C.
2016-03-18
Here, we introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of B. subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches and small regulatory RNAs. Overall, regulatory information is included in the model for approximately 2500 of the ~4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same “ON” and “OFF” gene expression profiles across multiple samples of experimental data. We show how atomic regulons for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how atomic regulons can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking.
During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata relating to experimental conditions, gaining insights into novel biology.
The CTD2 Dashboard hosts analyzed data and other evidence generated by the CTD2 Network. It is a web interface for the research community to browse and search CTD2 Network data related to genes, proteins, and compounds from individual CTD2 Centers, or explore observations across multiple Centers.
Identification of a neuronal transcription factor network involved in medulloblastoma development.
Lastowska, Maria; Al-Afghani, Hani; Al-Balool, Haya H; Sheth, Harsh; Mercer, Emma; Coxhead, Jonathan M; Redfern, Chris P F; Peters, Heiko; Burt, Alastair D; Santibanez-Koref, Mauro; Bacon, Chris M; Chesler, Louis; Rust, Alistair G; Adams, David J; Williamson, Daniel; Clifford, Steven C; Jackson, Michael S
2013-07-11
Medulloblastomas, the most frequent malignant brain tumours affecting children, comprise at least 4 distinct clinicogenetic subgroups. Aberrant sonic hedgehog (SHH) signalling is observed in approximately 25% of tumours and defines one subgroup. Although alterations in SHH pathway genes (e.g. PTCH1, SUFU) are observed in many of these tumours, high throughput genomic analyses have identified few other recurring mutations. Here, we have mutagenised the Ptch+/- murine tumour model using the Sleeping Beauty transposon system to identify additional genes and pathways involved in SHH subgroup medulloblastoma development. Mutagenesis significantly increased medulloblastoma frequency and identified 17 candidate cancer genes, including orthologs of genes somatically mutated (PTEN, CREBBP) or associated with poor outcome (PTEN, MYT1L) in the human disease. Strikingly, these candidate genes were enriched for transcription factors (p = 2 × 10^-5), the majority of which (6/7; Crebbp, Myt1L, Nfia, Nfib, Tead1 and Tgif2) were linked within a single regulatory network enriched for genes associated with a differentiated neuronal phenotype. Furthermore, activity of this network varied significantly between the human subgroups, was associated with metastatic disease, and predicted poor survival specifically within the SHH subgroup of tumours. Igf2, previously implicated in medulloblastoma, was the most differentially expressed gene in murine tumours with network perturbation, and network activity in both mouse and human tumours was characterised by enrichment for multiple gene-sets indicating increased cell proliferation, IGF signalling, MYC target upregulation, and decreased neuronal differentiation. Collectively, our data support a model of medulloblastoma development in SB-mutagenised Ptch+/- mice which involves disruption of a novel transcription factor network leading to Igf2 upregulation, proliferation of GNPs, and tumour formation.
Moreover, our results identify rational therapeutic targets for SHH subgroup tumours, alongside prognostic biomarkers for the identification of poor-risk SHH patients.
In Silico Gene Prioritization by Integrating Multiple Data Sources
Zhou, Yingyao; Shields, Robert; Chanda, Sumit K.; Elston, Robert C.; Li, Jing
2011-01-01
Identifying disease genes is crucial to the understanding of disease pathogenesis, and to the improvement of disease diagnosis and treatment. In recent years, many researchers have proposed approaches to prioritize candidate genes by considering the relationship of candidate genes and existing known disease genes, reflected in other data sources. In this paper, we propose an expandable framework for gene prioritization that can integrate multiple heterogeneous data sources by taking advantage of a unified graphic representation. Gene-gene relationships and gene-disease relationships are then defined based on the overall topology of each network using a diffusion kernel measure. These relationship measures are in turn normalized to derive an overall measure across all networks, which is utilized to rank all candidate genes. Based on the informativeness of available data sources with respect to each specific disease, we also propose an adaptive threshold score to select a small subset of candidate genes for further validation studies. We performed large scale cross-validation analysis on 110 disease families using three data sources. Results have shown that our approach consistently outperforms two other state-of-the-art programs. A case study using Parkinson disease (PD) has identified four candidate genes (UBB, SEPT5, GPR37 and TH) that ranked higher than our adaptive threshold, all of which are involved in the PD pathway. In particular, a very recent study has observed a deletion of TH in a patient with PD, which supports the importance of the TH gene in PD pathogenesis. A web tool has been implemented to assist scientists in their genetic studies. PMID:21731658
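The diffusion kernel measure mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the five-gene toy network, the gene names, the value of beta, the truncated Taylor series, and the rule of scoring a candidate by its summed kernel value to the known disease genes are all assumptions chosen for illustration.

```python
# Illustrative sketch: diffusion-kernel ranking on a toy gene network.
# The network, gene names, beta value, and scoring rule are assumptions
# made for this example; they are not taken from the paper.

def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diffusion_kernel(adj, beta=0.5, terms=40):
    """Approximate K = exp(-beta * L) with a truncated Taylor series,
    where L = D - A is the graph Laplacian of the adjacency matrix."""
    n = len(adj)
    neg_beta_lap = [[-beta * ((sum(adj[i]) if i == j else 0.0) - adj[i][j])
                     for j in range(n)] for i in range(n)]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]
    for k in range(1, terms + 1):
        # After this update, term holds (-beta * L)^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, neg_beta_lap)]
        kernel = [[kernel[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return kernel

def rank_candidates(genes, adj, known, beta=0.5):
    """Rank genes outside `known` by summed kernel value to the known genes."""
    kernel = diffusion_kernel(adj, beta)
    idx = {g: i for i, g in enumerate(genes)}
    scores = {g: sum(kernel[idx[g]][idx[k]] for k in known)
              for g in genes if g not in known}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical chain-shaped network G1 - G2 - G3 - G4 - G5.
GENES = ["G1", "G2", "G3", "G4", "G5"]
ADJ = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
RANKING = rank_candidates(GENES, ADJ, known={"G1"})
print(RANKING)
```

Unlike a shortest-path distance, the kernel accounts for all paths between two genes, which is one reason a global topology measure may be preferred when several networks are combined.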
Chatterjee, Sumantra; Sivakamasundari, V; Yap, Sook Peng; Kraus, Petra; Kumar, Vibhor; Xing, Xing; Lim, Siew Lan; Sng, Joel; Prabhakar, Shyam; Lufkin, Thomas
2014-12-05
Vertebrate organogenesis is a highly complex process involving sequential cascades of transcription factor activation or repression. Interestingly a single developmental control gene can occasionally be essential for the morphogenesis and differentiation of tissues and organs arising from vastly disparate embryological lineages. Here we elucidated the role of the mammalian homeobox gene Bapx1 during the embryogenesis of five distinct organs at E12.5 - vertebral column, spleen, gut, forelimb and hindlimb - using expression profiling of sorted wildtype and mutant cells combined with genome wide binding site analysis. Furthermore we analyzed the development of the vertebral column at the molecular level by combining transcriptional profiling and genome wide binding data for Bapx1 with similarly generated data sets for Sox9 to assemble a detailed gene regulatory network revealing genes previously not reported to be controlled by either of these two transcription factors. The gene regulatory network appears to control cell fate decisions and morphogenesis in the vertebral column along with the prevention of premature chondrocyte differentiation thus providing a detailed molecular view of vertebral column development.
Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering
NASA Technical Reports Server (NTRS)
Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland
2000-01-01
Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review data mining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.
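As a rough illustration of the discrete Boolean network models named in the abstract above, the following sketch simulates a three-gene network with synchronous updates and reports the attractor reached from a given start state. The genes, the wiring (a ring in which C represses A), and the update scheme are hypothetical and are not taken from the review.

```python
# Illustrative sketch: a hypothetical three-gene Boolean network.
# State is a tuple (A, B, C) of 0/1 values; all genes update synchronously.

RULES = (
    lambda s: int(not s[2]),  # A is ON unless C is ON (C represses A)
    lambda s: s[0],           # B copies A (A activates B)
    lambda s: s[1],           # C copies B (B activates C)
)

def step(state, rules=RULES):
    """Apply every rule to the current state to get the next state."""
    return tuple(rule(state) for rule in rules)

def attractor(state, rules=RULES):
    """Iterate until a state repeats and return the cycle that was entered."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, rules)
    return trajectory[seen[state]:]

CYCLE = attractor((1, 0, 0))
print(len(CYCLE), CYCLE)
```

Reverse engineering in this setting amounts to searching for the rule set that reproduces observed state transitions; the sketch only covers the forward simulation.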
A Comprehensive Analysis of Nuclear-Encoded Mitochondrial Genes in Schizophrenia.
Gonçalves, Vanessa F; Cappi, Carolina; Hagen, Christian M; Sequeira, Adolfo; Vawter, Marquis P; Derkach, Andriy; Zai, Clement C; Hedley, Paula L; Bybjerg-Grauholm, Jonas; Pouget, Jennie G; Cuperfain, Ari B; Sullivan, Patrick F; Christiansen, Michael; Kennedy, James L; Sun, Lei
2018-05-01
The genetic risk factors of schizophrenia (SCZ), a severe psychiatric disorder, are not yet fully understood. Multiple lines of evidence suggest that mitochondrial dysfunction may play a role in SCZ, but comprehensive association studies are lacking. We hypothesized that variants in nuclear-encoded mitochondrial genes influence susceptibility to SCZ. We conducted gene-based and gene-set analyses using summary association results from the Psychiatric Genomics Consortium Schizophrenia Phase 2 (PGC-SCZ2) genome-wide association study comprising 35,476 cases and 46,839 control subjects. We applied the MAGMA method to three sets of nuclear-encoded mitochondrial genes: oxidative phosphorylation genes, other nuclear-encoded mitochondrial genes, and genes involved in nucleus-mitochondria crosstalk. Furthermore, we conducted a replication study using the iPSYCH SCZ sample of 2290 cases and 21,621 control subjects. In the PGC-SCZ2 sample, 1186 mitochondrial genes were analyzed, among which 159 had p values < .05 and 19 remained significant after multiple testing correction. A meta-analysis of 818 genes combining the PGC-SCZ2 and iPSYCH samples resulted in 104 nominally significant and nine significant genes, suggesting a polygenic model for the nuclear-encoded mitochondrial genes. Gene-set analysis, however, did not show significant results. In an in silico protein-protein interaction network analysis, 14 mitochondrial genes interacted directly with 158 SCZ risk genes identified in PGC-SCZ2 (permutation p = .02), and aldosterone signaling in epithelial cells and mitochondrial dysfunction pathways appeared to be overrepresented in this network of mitochondrial and SCZ risk genes. This study provides evidence that specific aspects of mitochondrial function may play a role in SCZ, but we did not observe its broad involvement even using a large sample. Copyright © 2018 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
Identification of Causal Genes, Networks, and Transcriptional Regulators of REM Sleep and Wake
Millstein, Joshua; Winrow, Christopher J.; Kasarskis, Andrew; Owens, Joseph R.; Zhou, Lili; Summa, Keith C.; Fitzpatrick, Karrie; Zhang, Bin; Vitaterna, Martha H.; Schadt, Eric E.; Renger, John J.; Turek, Fred W.
2011-01-01
Study Objective: Sleep-wake traits are well-known to be under substantial genetic control, but the specific genes and gene networks underlying primary sleep-wake traits have largely eluded identification using conventional approaches, especially in mammals. Thus, the aim of this study was to use systems genetics and statistical approaches to uncover the genetic networks underlying 2 primary sleep traits in the mouse: 24-h duration of REM sleep and wake. Design: Genome-wide RNA expression data from 3 tissues (anterior cortex, hypothalamus, thalamus/midbrain) were used in conjunction with high-density genotyping to identify candidate causal genes and networks mediating the effects of 2 QTL regulating the 24-h duration of REM sleep and one regulating the 24-h duration of wake. Setting: Basic sleep research laboratory. Patients or Participants: Male [C57BL/6J × (BALB/cByJ × C57BL/6J*) F1] N2 mice (n = 283). Interventions: None. Measurements and Results: The genetic variation of a mouse N2 mapping cross was leveraged against sleep-state phenotypic variation as well as quantitative gene expression measurement in key brain regions using integrative genomics approaches to uncover multiple causal sleep-state regulatory genes, including several surprising novel candidates, which interact as components of networks that modulate REM sleep and wake. In particular, it was discovered that a core network module, consisting of 20 genes, involved in the regulation of REM sleep duration is conserved across the cortex, hypothalamus, and thalamus. A novel application of a formal causal inference test was also used to identify those genes directly regulating sleep via control of expression. Conclusion: Systems genetics approaches reveal novel candidate genes, complex networks and specific transcriptional regulators of REM sleep and wake duration in mammals. 
Citation: Millstein J; Winrow CJ; Kasarskis A; Owens JR; Zhou L; Summa KC; Fitzpatrick K; Zhang B; Vitaterna MH; Schadt EE; Renger JJ; Turek FW. Identification of causal genes, networks, and transcriptional regulators of REM sleep and wake. SLEEP 2011;34(11):1469-1477. PMID:22043117
A Systems' Biology Approach to Study MicroRNA-Mediated Gene Regulatory Networks
Kunz, Manfred; Vera, Julio; Wolkenhauer, Olaf
2013-01-01
MicroRNAs (miRNAs) are potent effectors in gene regulatory networks where aberrant miRNA expression can contribute to human diseases such as cancer. For a better understanding of the regulatory role of miRNAs in coordinating gene expression, we here present a systems biology approach combining data-driven modeling and model-driven experiments. Such an approach is characterized by an iterative process, including biological data acquisition and integration, network construction, mathematical modeling and experimental validation. To demonstrate the application of this approach, we adopt it to investigate mechanisms of collective repression on p21 by multiple miRNAs. We first construct a p21 regulatory network based on data from the literature and further expand it using algorithms that predict molecular interactions. Based on the network structure, a detailed mechanistic model is established and its parameter values are determined using data. Finally, the calibrated model is used to study the effect of different miRNA expression profiles and cooperative target regulation on p21 expression levels in different biological contexts. PMID:24350286
Tuller, Tamir; Atar, Shimshi; Ruppin, Eytan; Gurevich, Michael; Achiron, Anat
2011-09-15
Multiple sclerosis (MS) is a central nervous system autoimmune inflammatory T-cell-mediated disease with a relapsing-remitting course in the majority of patients. In this study, we performed a high-resolution systems biology analysis of gene expression and physical interactions in MS relapse and remission. To this end, we integrated 164 large-scale measurements of gene expression in peripheral blood mononuclear cells of MS patients in relapse or remission and healthy subjects, with large-scale information about the physical interactions between these genes obtained from public databases. These data were analyzed with a variety of computational methods. We find that there is a clear and significant global network-level signal that is related to the changes in gene expression of MS patients in comparison to healthy subjects. However, despite the clear differences in the clinical symptoms of MS patients in relapse versus remission, the network-level signal is weaker when comparing patients in these two stages of the disease. This result suggests that most of the genes have relatively similar expression levels in the two stages of the disease. In accordance with previous studies, we found that the pathways related to regulation of cell death, chemotaxis and inflammatory response are differentially expressed in the disease in comparison to healthy subjects, while pathways related to cell adhesion, cell migration and cell-cell signaling are activated in relapse in comparison to remission. However, the current study includes a detailed report of the exact set of genes involved in these pathways and the interactions between them. For example, we found that the genes TP53 and IL1 are 'network hubs' that interact with many of the differentially expressed genes in MS patients versus healthy subjects, and the epidermal growth factor receptor is a 'network hub' in the case of MS patients with relapse versus remission.
The statistical approaches employed in this study enabled us to report new sets of genes that according to their gene expression and physical interactions are predicted to be differentially expressed in MS versus healthy subjects, and in MS patients in relapse versus remission. Some of these genes may be useful biomarkers for diagnosing MS and predicting relapses in MS patients.
Rong, Junkang; Feltus, F. Alex; Waghmare, Vijay N.; Pierce, Gary J.; Chee, Peng W.; Draye, Xavier; Saranga, Yehoshua; Wright, Robert J.; Wilkins, Thea A.; May, O. Lloyd; Smith, C. Wayne; Gannaway, John R.; Wendel, Jonathan F.; Paterson, Andrew H.
2007-01-01
QTL mapping experiments yield heterogeneous results due to the use of different genotypes, environments, and sampling variation. Compilation of QTL mapping results yields a more complete picture of the genetic control of a trait and reveals patterns in organization of trait variation. A total of 432 QTL mapped in one diploid and 10 tetraploid interspecific cotton populations were aligned using a reference map and depicted in a CMap resource. Early demonstrations that genes from the non-fiber-producing diploid ancestor contribute to tetraploid lint fiber genetics gain further support from multiple populations and environments and advanced-generation studies detecting QTL of small phenotypic effect. Both tetraploid subgenomes contribute QTL at largely non-homeologous locations, suggesting divergent selection acting on many corresponding genes before and/or after polyploid formation. QTL correspondence across studies was only modest, suggesting that additional QTL for the target traits remain to be discovered. Crosses between closely-related genotypes differing by single-gene mutants yield profoundly different QTL landscapes, suggesting that fiber variation involves a complex network of interacting genes. Members of the lint fiber development network appear clustered, with cluster members showing heterogeneous phenotypic effects. Meta-analysis linked to synteny-based and expression-based information provides clues about specific genes and families involved in QTL networks. PMID:17565937
CoNekT: an open-source framework for comparative genomic and transcriptomic network analyses.
Proost, Sebastian; Mutwil, Marek
2018-05-01
The recent accumulation of gene expression data in the form of RNA sequencing creates unprecedented opportunities to study gene regulation and function. Furthermore, comparative analysis of the expression data from multiple species can elucidate which functional gene modules are conserved across species, allowing the study of the evolution of these modules. However, performing such comparative analyses on raw data is not feasible for many biologists. Here, we present CoNekT (Co-expression Network Toolkit), an open source web server, that contains user-friendly tools and interactive visualizations for comparative analyses of gene expression data and co-expression networks. These tools allow analysis and cross-species comparison of (i) gene expression profiles; (ii) co-expression networks; (iii) co-expressed clusters involved in specific biological processes; (iv) tissue-specific gene expression; and (v) expression profiles of gene families. To demonstrate these features, we constructed CoNekT-Plants for green alga, seed plants and flowering plants (Picea abies, Chlamydomonas reinhardtii, Vitis vinifera, Arabidopsis thaliana, Oryza sativa, Zea mays and Solanum lycopersicum) and thus provide a web-tool with the broadest available collection of plant phyla. CoNekT-Plants is freely available from http://conekt.plant.tools, while the CoNekT source code and documentation can be found at https://github.molgen.mpg.de/proost/CoNekT/.
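For readers unfamiliar with co-expression networks, the sketch below shows one common way such a network can be built: connect two genes when the Pearson correlation of their expression profiles exceeds a cutoff. The abstract does not state which similarity measure CoNekT uses, so the measure, the 0.9 threshold, and the gene names and expression values here are assumptions for illustration only.

```python
# Illustrative sketch: thresholded Pearson co-expression network.
# Gene names, expression values, and the 0.9 cutoff are hypothetical.
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def coexpression_edges(profiles, threshold=0.9):
    """Return (gene, gene, r) for every pair with correlation >= threshold."""
    genes = sorted(profiles)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(profiles[g], profiles[h])
            if r >= threshold:
                edges.append((g, h, round(r, 3)))
    return edges

# Expression of three hypothetical genes across four samples.
PROFILES = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
EDGES = coexpression_edges(PROFILES)
print(EDGES)
```

Here only geneA and geneB are linked; geneC is anti-correlated with both and falls below the cutoff.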
Drug Target Prediction and Repositioning Using an Integrated Network-Based Approach
Emig, Dorothea; Ivliev, Alexander; Pustovalova, Olga; Lancashire, Lee; Bureeva, Svetlana; Nikolsky, Yuri; Bessarabova, Marina
2013-01-01
The discovery of novel drug targets is a significant challenge in drug development. Although the human genome comprises approximately 30,000 genes, proteins encoded by fewer than 400 are used as drug targets in the treatment of diseases. Therefore, novel drug targets are extremely valuable as the source for first in class drugs. On the other hand, many of the currently known drug targets are functionally pleiotropic and involved in multiple pathologies. Several of them are exploited for treating multiple diseases, which highlights the need for methods to reliably reposition drug targets to new indications. Network-based methods have been successfully applied to prioritize novel disease-associated genes. In recent years, several such algorithms have been developed, some focusing on local network properties only, and others taking the complete network topology into account. Common to all approaches is the understanding that novel disease-associated candidates are in close overall proximity to known disease genes. However, the relevance of these methods to the prediction of novel drug targets has not yet been assessed. Here, we present a network-based approach for the prediction of drug targets for a given disease. The method allows both repositioning drug targets known for other diseases to the given disease and the prediction of unexploited drug targets which are not used for treatment of any disease. Our approach takes as input a disease gene expression signature and a high-quality interaction network and outputs a prioritized list of drug targets. We demonstrate the high performance of our method and highlight the usefulness of the predictions in three case studies. We present novel drug targets for scleroderma and different types of cancer with their underlying biological processes. Furthermore, we demonstrate the ability of our method to identify non-suspected repositioning candidates using diabetes type 1 as an example. PMID:23593264
Waters, Katrina M.; Liu, Tao; Quesenberry, Ryan D.; Willse, Alan R.; Bandyopadhyay, Somnath; Kathmann, Loel E.; Weber, Thomas J.; Smith, Richard D.; Wiley, H. Steven; Thrall, Brian D.
2012-01-01
To understand how integration of multiple data types can help decipher cellular responses at the systems level, we analyzed the mitogenic response of human mammary epithelial cells to epidermal growth factor (EGF) using whole genome microarrays, mass spectrometry-based proteomics and large-scale western blots with over 1000 antibodies. A time course analysis revealed significant differences in the expression of 3172 genes and 596 proteins, including protein phosphorylation changes measured by western blot. Integration of these disparate data types showed that each contributed qualitatively different components to the observed cell response to EGF and that varying degrees of concordance in gene expression and protein abundance measurements could be linked to specific biological processes. Networks inferred from individual data types were relatively limited, whereas networks derived from the integrated data recapitulated the known major cellular responses to EGF and exhibited more highly connected signaling nodes than networks derived from any individual dataset. While cell cycle regulatory pathways were altered as anticipated, we found the most robust response to mitogenic concentrations of EGF was induction of matrix metalloprotease cascades, highlighting the importance of the EGFR system as a regulator of the extracellular environment. These results demonstrate the value of integrating multiple levels of biological information to more accurately reconstruct networks of cellular response. PMID:22479638
Unraveling transcriptional control and cis-regulatory codes using the software suite GeneACT
Cheung, Tom Hiu; Kwan, Yin Lam; Hamady, Micah; Liu, Xuedong
2006-01-01
Deciphering gene regulatory networks requires the systematic identification of functional cis-acting regulatory elements. We present a suite of web-based bioinformatics tools, called GeneACT, that can rapidly detect evolutionarily conserved transcription factor binding sites or microRNA target sites that are either unique or over-represented in differentially expressed genes from DNA microarray data. GeneACT provides graphic visualization and extraction of common regulatory sequence elements in the promoters and 3'-untranslated regions that are conserved across multiple mammalian species. PMID:17064417
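Over-representation of a binding site among differentially expressed genes is often assessed with a hypergeometric tail probability. The abstract does not say which statistic GeneACT applies, so the sketch below is an assumption about one reasonable choice, and the gene counts in the usage line are invented for illustration.

```python
# Illustrative sketch: hypergeometric over-representation test.
# The choice of test and all counts below are assumptions for this example.
from math import comb

def hypergeom_upper_tail(hits, site_genes, selected, total):
    """P(X >= hits) when `selected` genes are drawn without replacement
    from `total` genes, of which `site_genes` carry the binding site."""
    upper = min(site_genes, selected)
    favourable = sum(
        comb(site_genes, i) * comb(total - site_genes, selected - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(total, selected)

# Hypothetical numbers: 20,000 genes on the array, 400 carry the site,
# 150 are differentially expressed, and 12 of those carry the site.
P_VALUE = hypergeom_upper_tail(12, 400, 150, 20000)
print(P_VALUE)
```

With these invented counts about 3 site-carrying genes would be expected by chance (150 × 400 / 20,000), so observing 12 gives a small tail probability.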
O'Brien, M.A.; Costin, B.N.; Miles, M.F.
2014-01-01
Postgenomic studies of the function of genes and their role in disease have now become an area of intense study since efforts to define the raw sequence material of the genome have largely been completed. The use of whole-genome approaches such as microarray expression profiling and, more recently, RNA-sequence analysis of transcript abundance has allowed an unprecedented look at the workings of the genome. However, the accurate derivation of such high-throughput data and their analysis in terms of biological function has been critical to truly leveraging the postgenomic revolution. This chapter will describe an approach that focuses on the use of gene networks to both organize and interpret genomic expression data. Such networks, derived from statistical analysis of large genomic datasets and the application of multiple bioinformatics data resources, potentially allow the identification of key control elements for networks associated with human disease, and thus may lead to derivation of novel therapeutic approaches. However, as discussed in this chapter, the leveraging of such networks cannot occur without a thorough understanding of the technical and statistical factors influencing the derivation of genomic expression data. Thus, while the catch phrase may be “it's the network … stupid,” the understanding of factors extending from RNA isolation to genomic profiling technique, multivariate statistics, and bioinformatics are all critical to defining fully useful gene networks for study of complex biology. PMID:23195313
Gene panel testing for hereditary breast cancer.
Winship, Ingrid; Southey, Melissa C
2016-03-21
Inherited predisposition to breast cancer is explained only in part by mutations in the BRCA1 and BRCA2 genes. Most families with an apparent familial clustering of breast cancer who are investigated through Australia's network of genetic services and familial cancer centres do not have mutations in either of these genes. More recently, additional breast cancer predisposition genes, such as PALB2, have been identified. New genetic technology allows a panel of multiple genes to be tested for mutations in a single test. This enables more women and their families to have risk assessment and risk management, in a preventive approach to predictable breast cancer. Predictive testing for a known family-specific mutation in a breast cancer predisposition gene provides personalised risk assessment and evidence-based risk management. Breast cancer predisposition gene panel tests have a greater diagnostic yield than conventional testing of only the BRCA1 and BRCA2 genes. The clinical validity and utility of some of the putative breast cancer predisposition genes is not yet clear. Ethical issues warrant consideration, as multiple gene panel testing has the potential to identify secondary findings not originally sought by the test requested. Multiple gene panel tests may provide an affordable and effective way to investigate the heritability of breast cancer.
The genetic basis of alcoholism: multiple phenotypes, many genes, complex networks.
Morozova, Tatiana V; Goldman, David; Mackay, Trudy F C; Anholt, Robert R H
2012-02-20
Alcoholism is a significant public health problem. A picture of the genetic architecture underlying alcohol-related phenotypes is emerging from genome-wide association studies and work on genetically tractable model organisms.
Analysis of the SOS response of Vibrio and other bacteria with multiple chromosomes.
Sanchez-Alberola, Neus; Campoy, Susana; Barbé, Jordi; Erill, Ivan
2012-02-03
The SOS response is a well-known regulatory network present in most bacteria and aimed at addressing DNA damage. It has also been linked extensively to stress-induced mutagenesis, virulence and the emergence and dissemination of antibiotic resistance determinants. Recently, the SOS response has been shown to regulate the activity of integrases in the chromosomal superintegrons of the Vibrionaceae, which encompasses a wide range of pathogenic species harboring multiple chromosomes. Here we combine in silico and in vitro techniques to perform a comparative genomics analysis of the SOS regulon in the Vibrionaceae, and we extend the methodology to map this transcriptional network in other bacterial species harboring multiple chromosomes. Our analysis provides the first comprehensive description of the SOS response in a family (Vibrionaceae) that includes major human pathogens. It also identifies several previously unreported members of the SOS transcriptional network, including two proteins of unknown function. The analysis of the SOS response in other bacterial species with multiple chromosomes uncovers additional regulon members and reveals that there is a conserved core of SOS genes, and that specialized additions to this basic network take place in different phylogenetic groups. Our results also indicate that across all groups the main elements of the SOS response are always found in the large chromosome, whereas specialized additions are found in the smaller chromosomes and plasmids. Our findings confirm that the SOS response of the Vibrionaceae is strongly linked with pathogenicity and dissemination of antibiotic resistance, and suggest that the characterization of the newly identified members of this regulon could provide key insights into the pathogenesis of Vibrio. 
The persistent location of key SOS genes in the large chromosome across several bacterial groups confirms that the SOS response plays an essential role in these organisms and sheds light on the mechanisms of evolution of global transcriptional networks involved in adaptability and rapid response to environmental changes, suggesting that small chromosomes may act as evolutionary test beds for the rewiring of transcriptional networks.
Multiple hot-deck imputation for network inference from RNA sequencing data.
Imbert, Alyssa; Valsesia, Armand; Le Gall, Caroline; Armenise, Claudia; Lefebvre, Gregory; Gourraud, Pierre-Antoine; Viguerie, Nathalie; Villa-Vialaneix, Nathalie
2018-05-15
Network inference provides a global view of the relations existing between gene expression in a given transcriptomic experiment (often only for a restricted list of chosen genes). However, it is still a challenging problem: even if the cost of sequencing techniques has decreased over the last years, the number of samples in a given experiment is still (very) small compared to the number of genes. We propose a method to increase the reliability of the inference when RNA-seq expression data have been measured together with an auxiliary dataset that can provide external information on gene expression similarity between samples. Our statistical approach, hd-MI, is based on imputation for samples without available RNA-seq data that are considered as missing data but are observed on the secondary dataset. hd-MI can improve the reliability of the inference for missing rates up to 30% and provides more stable networks with a smaller number of false positive edges. From a biological point of view, hd-MI was also found relevant to infer networks from RNA-seq data acquired in adipose tissue during a nutritional intervention in obese individuals. In these networks, novel links between genes were highlighted, as well as an improved comparability between the two steps of the nutritional intervention. Software and sample data are available as an R package, RNAseqNet, that can be downloaded from the Comprehensive R Archive Network (CRAN).
Prior knowledge based mining functional modules from Yeast PPI networks with gene ontology
2010-01-01
Background: Many algorithmic approaches have been proposed for identifying functional modules in protein-protein interaction (PPI) networks. Because large-scale interaction data are accumulating for multiple organisms while existing PPI databases still lack many interactions, there remains an urgent need for novel computational techniques that can analyze interaction data sets correctly and scalably. Indeed, a number of large-scale biological data sets provide indirect evidence for protein-protein interaction relationships. Results: The main aim of this paper is to present a prior-knowledge-based mining strategy to identify functional modules from PPI networks with the aid of Gene Ontology. A higher similarity value in Gene Ontology means that two gene products are more functionally related to each other, so it is better to group such gene products into one functional module. We study (i) encoding the functional pairs into the existing PPI networks, and (ii) using these functional pairs as pairwise constraints to supervise existing functional module identification algorithms. A topology-based modularity metric and the complex annotations in MIPS are used to evaluate the functional modules identified by these two approaches. Conclusions: The experimental results on yeast PPI networks and GO show that the prior-knowledge-based learning methods perform better than the existing algorithms. PMID:21172053
Predicting effects of structural stress in a genome-reduced model bacterial metabolism
NASA Astrophysics Data System (ADS)
Güell, Oriol; Sagués, Francesc; Serrano, M. Ángeles
2012-08-01
Mycoplasma pneumoniae is a human pathogen recently proposed as a genome-reduced model for bacterial systems biology. Here, we study the response of its metabolic network to different forms of structural stress, including removal of individual and pairs of reactions and knockout of genes and clusters of co-expressed genes. Our results reveal a network architecture as robust as that of other model bacteria regarding multiple failures, although less robust against individual reaction inactivation. Interestingly, metabolite motifs associated to reactions can predict the propagation of inactivation cascades and damage amplification effects arising in double knockouts. We also detect a significant correlation between gene essentiality and damages produced by single gene knockouts, and find that genes controlling high-damage reactions tend to be expressed independently of each other, a functional switch mechanism that, simultaneously, acts as a genetic firewall to protect metabolism. Prediction of failure propagation is crucial for metabolic engineering or disease treatment.
Regulation of epidermal cell fate in Arabidopsis roots: the importance of multiple feedback loops
Schiefelbein, John; Huang, Ling; Zheng, Xiaohua
2014-01-01
The specification of distinct cell types in multicellular organisms is accomplished via establishment of differential gene expression. A major question is the nature of the mechanisms that establish this differential expression in time and space. In plants, the formation of the hair and non-hair cell types in the root epidermis has been used as a model to understand regulation of cell specification. Recent findings show surprising complexity in the number and the types of regulatory interactions between the multiple transcription factor genes/proteins influencing root epidermis cell fate. Here, we describe this regulatory network and the importance of the multiple feedback loops for its establishment and maintenance. PMID:24596575
Systematic analysis of molecular mechanisms for HCC metastasis via text mining approach.
Zhen, Cheng; Zhu, Caizhong; Chen, Haoyang; Xiong, Yiru; Tan, Junyuan; Chen, Dong; Li, Jin
2017-02-21
To systematically explore the molecular mechanism for hepatocellular carcinoma (HCC) metastasis and identify regulatory genes with text mining methods. Genes with highest frequencies and significant pathways related to HCC metastasis were listed. A handful of proteins, such as EGFR, MDM2, TP53 and APP, were identified as hub nodes in the PPI (protein-protein interaction) network. Compared with the genes unique to HBV-HCCs, the genes particular to HCV-HCCs were fewer, but may participate in more extensive signaling processes. VEGFA, PI3KCA, MAPK1, MMP9 and other genes may play important roles in multiple phenotypes of metastasis. Genes in abstracts of the HCC-metastasis literature were identified. Word frequency analysis, KEGG pathway and PPI network analysis were performed. Then co-occurrence analysis between genes and metastasis-related phenotypes was carried out. Text mining is effective for revealing potential regulators or pathways, but its purpose should be specific, and a combination of various methods will be more useful.
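The co-occurrence step mentioned in this abstract can be illustrated with a minimal sketch. The function name, the toy abstracts, and the gene and phenotype vocabularies below are hypothetical, and case-insensitive whole-token matching is an assumption; the abstract does not describe the study's actual text-mining pipeline.

```python
import re
from collections import Counter
from itertools import product


def cooccurrence(abstracts, genes, phenotypes):
    """Count abstracts in which a gene symbol and a phenotype term both appear."""
    counts = Counter()
    for text in abstracts:
        # Upper-case whole tokens, so "VEGFA" matches but "VEGFAX" would not.
        tokens = set(re.findall(r"[A-Za-z0-9]+", text.upper()))
        found_genes = [g for g in genes if g.upper() in tokens]
        found_phenotypes = [p for p in phenotypes if p.upper() in tokens]
        for pair in product(found_genes, found_phenotypes):
            counts[pair] += 1
    return counts


# Hypothetical toy input, for illustration only.
abstracts = [
    "VEGFA promotes angiogenesis and invasion in HCC.",
    "MMP9 expression correlates with invasion.",
    "VEGFA and MMP9 are linked to angiogenesis.",
]
counts = cooccurrence(abstracts, ["VEGFA", "MMP9"], ["angiogenesis", "invasion"])
```

Each gene-phenotype pair is counted once per abstract, so the counts can be ranked to suggest which genes recur across several metastasis-related phenotypes.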
Detecting gene subnetworks under selection in biological pathways.
Gouy, Alexandre; Daub, Joséphine T; Excoffier, Laurent
2017-09-19
Advances in high throughput sequencing technologies have created a gap between data production and functional data analysis. Indeed, phenotypes result from interactions between numerous genes, but traditional methods treat loci independently, missing important knowledge brought by network-level emerging properties. Therefore, detecting selection acting on multiple genes affecting the evolution of complex traits remains challenging. In this context, gene network analysis provides a powerful framework to study the evolution of adaptive traits and facilitates the interpretation of genome-wide data. We developed a method to analyse gene networks that is suited to detecting polygenic selection. The general idea is to search biological pathways for subnetworks of genes that directly interact with each other and that present unusual evolutionary features. Subnetwork search is a typical combinatorial optimization problem that we solve using a simulated annealing approach. We have applied our methodology to find signals of adaptation to high altitude in human populations. We show that this adaptation has a clear polygenic basis and is influenced by many genetic components. Our approach, implemented in the R package signet, improves on gene-level classical tests for selection by identifying both new candidate genes and new biological processes involved in adaptation to altitude.
A gene regulatory network armature for T-lymphocyte specification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fung, Elizabeth-sharon
Choice of a T-lymphoid fate by hematopoietic progenitor cells depends on sustained Notch-Delta signaling combined with tightly-regulated activities of multiple transcription factors. To dissect the regulatory network connections that mediate this process, we have used high-resolution analysis of regulatory gene expression trajectories from the beginning to the end of specification; tests of the short-term Notch-dependence of these gene expression changes; and perturbation analyses of the effects of overexpression of two essential transcription factors, namely PU.1 and GATA-3. Quantitative expression measurements of >50 transcription factor and marker genes have been used to derive the principal components of regulatory change through which T-cell precursors progress from primitive multipotency to T-lineage commitment. Distinct parts of the path reveal separate contributions of Notch signaling, GATA-3 activity, and downregulation of PU.1. Using BioTapestry, the results have been assembled into a draft gene regulatory network for the specification of T-cell precursors and the choice of T as opposed to myeloid dendritic or mast-cell fates. This network also accommodates effects of E proteins and mutual repression circuits of Gfi1 against Egr-2 and of TCF-1 against PU.1 as proposed elsewhere, but requires additional functions that remain unidentified. Distinctive features of this network structure include the intense dose-dependence of GATA-3 effects; the gene-specific modulation of PU.1 activity based on Notch activity; the lack of direct opposition between PU.1 and GATA-3; and the need for a distinct, late-acting repressive function or functions to extinguish stem and progenitor-derived regulatory gene expression.
A link prediction method for heterogeneous networks based on BP neural network
NASA Astrophysics Data System (ADS)
Li, Ji-chao; Zhao, Dan-ling; Ge, Bing-Feng; Yang, Ke-Wei; Chen, Ying-Wu
2018-04-01
Most real-world systems, composed of different types of objects connected via many interconnections, can be abstracted as various complex heterogeneous networks. Link prediction for heterogeneous networks is of great significance for mining missing links and reconfiguring networks according to observed information, with considerable applications in, for example, friend and location recommendations and disease-gene candidate detection. In this paper, we put forward a novel integrated framework, called MPBP (Meta-Path feature-based BP neural network model), to predict multiple types of links for heterogeneous networks. More specifically, the concept of meta-path is introduced, followed by the extraction of meta-path features for heterogeneous networks. Next, based on the extracted meta-path features, a supervised link prediction model is built with a three-layer BP neural network. Then, the solution algorithm of the proposed link prediction model is put forward to obtain predicted results by iteratively training the network. Last, numerical experiments on example datasets of a gene-disease network and a combat network are conducted to verify the effectiveness and feasibility of the proposed MPBP. The results show that MPBP performs very well and outperforms the baseline methods.
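A common way to turn a meta-path into a numeric feature is to count its instances between a source node and a target node. The sketch below does this for a hypothetical gene-pathway-disease meta-path; the node names, the `path_count` helper, and the choice of a plain path count are assumptions, since the abstract does not specify which meta-path features MPBP extracts.

```python
def path_count(first_hop, second_hop, source, target):
    """Number of two-step meta-path instances source -> intermediate -> target."""
    intermediates = first_hop.get(source, set())
    return sum(1 for mid in intermediates if target in second_hop.get(mid, set()))


# Hypothetical toy heterogeneous network: gene -> pathway and pathway -> disease.
gene_pathway = {"g1": {"p1", "p2"}, "g2": {"p2"}}
pathway_disease = {"p1": {"d1"}, "p2": {"d1", "d2"}}

# One feature per candidate (gene, disease) link.
features = {
    (gene, disease): path_count(gene_pathway, pathway_disease, gene, disease)
    for gene in sorted(gene_pathway)
    for disease in ("d1", "d2")
}
```

Feature vectors built from several such meta-paths could then be fed to any supervised classifier, of which a three-layer neural network is one option.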
Ethanol modulation of gene networks: implications for alcoholism.
Farris, Sean P; Miles, Michael F
2012-01-01
Alcoholism is a complex disease caused by a confluence of environmental and genetic factors influencing multiple brain pathways to produce a variety of behavioral sequelae, including addiction. Genetic factors contribute to over 50% of the risk for alcoholism and recent evidence points to a large number of genes with small effect sizes as the likely molecular basis for this disease. Recent progress in genomics (microarrays or RNA-Seq) and genetics has led to the identification of a large number of potential candidate genes influencing ethanol behaviors or alcoholism itself. To organize this complex information, investigators have begun to focus on the contribution of gene networks, rather than individual genes, for various ethanol-induced behaviors in animal models or behavioral endophenotypes comprising alcoholism. This chapter reviews some of the methods used for constructing gene networks from genomic data and some of the recent progress made in applying such approaches to the study of the neurobiology of ethanol. We show that rapid technology development in gathering genomic data, together with sophisticated experimental design and a growing collection of analysis tools are producing novel insights for understanding the molecular basis of alcoholism and that such approaches promise new opportunities for therapeutic development.
dbCPG: A web resource for cancer predisposition genes
Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng
2016-01-01
Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identification of these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human (724 protein-coding, 23 non-coding, and 80 unknown type genes), 637 rat, and 658 mouse CPGs. Furthermore, data mining was performed to gain insights into the CPG data, including functional annotation, gene prioritization, network analysis of prioritized genes and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes. PMID:27192119
Begum, Tina; Ghosh, Tapash Chandra
2014-10-05
To date, numerous studies have attempted to determine the extent of variation in evolutionary rates between human disease and nondisease (ND) genes. In our present study, we have considered human autosomal monogenic (Mendelian) disease genes, which were classified into two groups according to the number of phenotypic defects, that is, specific disease (SPD) gene (one gene: one defect) and shared disease (SHD) gene (one gene: multiple defects). Here, we have compared the evolutionary rates of these two groups of genes, that is, SPD genes and SHD genes with respect to ND genes. We observed that the average evolutionary rates are slow in SHD group, intermediate in SPD group, and fast in ND group. Group-to-group evolutionary rate differences remain statistically significant regardless of their gene expression levels and number of defects. We demonstrated that disease genes are under strong selective constraint if they emerge through edgetic perturbation or drug-induced perturbation of the interactome network, show tissue-restricted expression, and are involved in transmembrane transport. Among all the factors, our regression analyses interestingly suggest the independent effects of 1) drug-induced perturbation and 2) the interaction term of expression breadth and transmembrane transport on protein evolutionary rates. We reasoned that the drug-induced network disruption is a combination of several edgetic perturbations and, thus, has more severe effect on gene phenotypes.
Harnessing Diversity towards the Reconstructing of Large Scale Gene Regulatory Networks
Yamanaka, Ryota; Kitano, Hiroaki
2013-01-01
Elucidating gene regulatory networks (GRNs) from large scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus driven approaches combining different algorithms, have become a potentially promising strategy to infer accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) a simple strategy to combine many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii) TopkNet integrating only high-performance algorithms provides significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is a key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can be useful to determine potential optimal algorithms for reconstruction of unknown regulatory networks, i.e., if expression data associated with a known regulatory network are similar to those associated with an unknown regulatory network, optimal algorithms determined for the known regulatory network can be repurposed to infer the unknown regulatory network. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks. PMID:24278007
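The idea of integrating only the best-performing algorithms can be sketched with a generic rank-averaging consensus. This is not TopkNet itself: the abstract does not give its integration rule, so the mean-rank rule, the worst-rank penalty for missing edges, and the assumption that the input lists are already ordered from best to worst algorithm are all illustrative choices.

```python
def consensus_ranking(rankings, top_k):
    """Average edge ranks over the first top_k algorithms (assumed ordered best-first)."""
    selected = rankings[:top_k]
    edges = set().union(*map(set, selected))

    def mean_rank(edge):
        # An edge missing from a list receives the worst rank, len(list) + 1.
        ranks = [r.index(edge) + 1 if edge in r else len(r) + 1 for r in selected]
        return sum(ranks) / len(ranks)

    # Ties in mean rank are broken alphabetically so the output is deterministic.
    return sorted(edges, key=lambda e: (mean_rank(e), e))


# Hypothetical edge rankings from three inference algorithms, best algorithm first.
rankings = [
    [("A", "B"), ("B", "C"), ("A", "C")],
    [("B", "C"), ("A", "B"), ("C", "D")],
    [("C", "D"), ("A", "C"), ("A", "B")],
]
consensus = consensus_ranking(rankings, top_k=2)
```

Setting `top_k` to the number of algorithms recovers a plain community average, which makes it easy to compare the two strategies on a benchmark.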
Deciphering Signaling Pathway Networks to Understand the Molecular Mechanisms of Metformin Action
Sun, Jingchun; Zhao, Min; Jia, Peilin; Wang, Lily; Wu, Yonghui; Iverson, Carissa; Zhou, Yubo; Bowton, Erica; Roden, Dan M.; Denny, Joshua C.; Aldrich, Melinda C.; Xu, Hua; Zhao, Zhongming
2015-01-01
A drug exerts its effects typically through a signal transduction cascade, which is non-linear and involves intertwined networks of multiple signaling pathways. Construction of such a signaling pathway network (SPNetwork) can enable identification of novel drug targets and deep understanding of drug action. However, it is challenging to synopsize critical components of these interwoven pathways into one network. To tackle this issue, we developed a novel computational framework, the Drug-specific Signaling Pathway Network (DSPathNet). The DSPathNet amalgamates the prior drug knowledge and drug-induced gene expression via random walk algorithms. Using the drug metformin, we illustrated this framework and obtained one metformin-specific SPNetwork containing 477 nodes and 1,366 edges. To evaluate this network, we performed the gene set enrichment analysis using the disease genes of type 2 diabetes (T2D) and cancer, one T2D genome-wide association study (GWAS) dataset, three cancer GWAS datasets, and one GWAS dataset of cancer patients with T2D on metformin. The results showed that the metformin network was significantly enriched with disease genes for both T2D and cancer, and that the network also included genes that may be associated with metformin-associated cancer survival. Furthermore, from the metformin SPNetwork and common genes to T2D and cancer, we generated a subnetwork to highlight the molecule crosstalk between T2D and cancer. The follow-up network analyses and literature mining revealed that seven genes (CDKN1A, ESR1, MAX, MYC, PPARGC1A, SP1, and STK11) and one novel MYC-centered pathway with CDKN1A, SP1, and STK11 might play important roles in metformin’s antidiabetic and anticancer effects. Some results are supported by previous studies. 
In summary, our study 1) develops a novel framework to construct drug-specific signal transduction networks; 2) provides insights into the molecular mode of metformin; 3) serves as a model for exploring signaling pathways to facilitate understanding of drug action, disease pathogenesis, and identification of drug targets. PMID:26083494
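Random walk algorithms of the kind this abstract mentions are often implemented as a random walk with restart, which scores every node by its proximity to a set of seed nodes. The sketch below is a generic version on an undirected toy graph; the graph, the seed, the restart probability of 0.3, and the function name are assumptions and do not reproduce DSPathNet.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until the scores stabilise.

    adj maps every node to a non-empty list of its neighbours.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            # Each node spreads its remaining mass evenly over its neighbours.
            share = (1.0 - restart) * p[n] / len(adj[n])
            for neighbour in adj[n]:
                nxt[neighbour] += share
        if max(abs(nxt[n] - p[n]) for n in nodes) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical path graph A - B - C - D with a single seed node A.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(adj, seeds={"A"})
```

In a drug-specific setting the seeds would be nodes tied to prior drug knowledge, and the highest-scoring nodes would be candidates for inclusion in the network.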
ReNE: A Cytoscape Plugin for Regulatory Network Enhancement
Politano, Gianfranco; Benso, Alfredo; Savino, Alessandro; Di Carlo, Stefano
2014-01-01
One of the biggest challenges in the study of biological regulatory mechanisms is the integration, modeling, and analysis of the complex interactions which take place in biological networks. Although post-transcriptional regulatory elements (e.g., miRNAs) are widely investigated in current research, their usage and visualization in biological networks is very limited. Regulatory networks are commonly limited to gene entities. To integrate networks with post-transcriptional regulatory data, researchers are therefore forced to manually resort to specific third party databases. In this context, we introduce ReNE, a Cytoscape 3.x plugin designed to automatically enrich a standard gene-based regulatory network with more detailed transcriptional, post-transcriptional, and translational data, resulting in an enhanced network that more precisely models the actual biological regulatory mechanisms. ReNE can automatically import a network layout from the Reactome or KEGG repositories, or work with custom pathways described using a standard OWL/XML data format that the Cytoscape import procedure accepts. Moreover, ReNE allows researchers to merge multiple pathways coming from different sources. The merged network structure is normalized to guarantee a consistent and uniform description of the network nodes and edges and to enrich all integrated data with additional annotations retrieved from genome-wide databases like NCBI, thus producing a pathway fully manageable through the Cytoscape environment. The normalized network is then analyzed to include missing transcription factors, miRNAs, and proteins. The resulting enhanced network is still a fully functional Cytoscape network where each regulatory element (transcription factor, miRNA, gene, protein) and regulatory mechanism (up-regulation/down-regulation) is clearly visually identifiable, thus enabling a better visual understanding of its role and effect in the network behavior. 
The enhanced network produced by ReNE is exportable in multiple formats for further analysis via third party applications. ReNE can be freely installed from the Cytoscape App Store (http://apps.cytoscape.org/apps/rene) and the full source code is freely available for download through an SVN repository accessible at http://www.sysbio.polito.it/tools_svn/BioInformatics/Rene/releases/. ReNE enhances a network by only integrating data from public repositories, without any inference or prediction. The reliability of the introduced interactions depends only on the reliability of the source data, which is outside the control of the ReNE developers. PMID:25541727
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Jing; Ma, Zihao; Carr, Steven A.
Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies. 
Molecular & Cellular Proteomics 16: 10.1074/mcp.M116.060301, 121–134, 2017.
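A coexpression network of the kind compared in this abstract can be sketched by linking genes whose expression profiles are strongly correlated across samples. The profiles, the 0.8 cutoff, and the use of absolute Pearson correlation below are assumptions for illustration; the abstract does not state how the study built its networks.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpression_edges(profiles, threshold=0.8):
    """Gene pairs whose absolute correlation reaches the threshold."""
    return {
        (g1, g2)
        for g1, g2 in combinations(sorted(profiles), 2)
        if abs(pearson(profiles[g1], profiles[g2])) >= threshold
    }


# Hypothetical expression profiles measured over four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [3.0, 1.0, 4.0, 2.0],
}
edges = coexpression_edges(profiles)
```

Running the same function on matched mRNA and protein profiles would give two edge sets whose overlap can be compared directly.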
The genetic basis of alcoholism: multiple phenotypes, many genes, complex networks
2012-01-01
Alcoholism is a significant public health problem. A picture of the genetic architecture underlying alcohol-related phenotypes is emerging from genome-wide association studies and work on genetically tractable model organisms. PMID:22348705
Genes from scratch--the evolutionary fate of de novo genes.
Schlötterer, Christian
2015-04-01
Although considered an extremely unlikely event, many genes emerge from previously noncoding genomic regions. This review covers the entire life cycle of such de novo genes. Two competing hypotheses about the process of de novo gene birth are discussed as well as the high death rate of de novo genes. Despite the high death rate, some de novo genes are retained and remain functional, even in distantly related species, through their integration into gene networks. Further studies combining gene expression with ribosome profiling in multiple populations across different species will be instrumental for an improved understanding of the evolutionary processes operating on de novo genes. Copyright © 2015 The Author. Published by Elsevier Ltd. All rights reserved.
Ando, Tatsuya; Suguro, Miyuki; Kobayashi, Takeshi; Seto, Masao; Honda, Hiroyuki
2003-10-01
A fuzzy neural network (FNN) using gene expression profile data can select combinations of genes from thousands of genes, and is applicable to predict outcome for cancer patients after chemotherapy. However, wide clinical heterogeneity reduces the accuracy of prediction. To overcome this problem, we have proposed an FNN system based on majoritarian decision using multiple noninferior models. We used transcriptional profiling data, which were obtained from "Lymphochip" DNA microarrays (http://llmpp.nih.gov/DLBCL), reported by Rosenwald (N Engl J Med 2002; 346: 1937-47). When the data were analyzed by our FNN system, accuracy (73.4%) of outcome prediction using only 1 FNN model with 4 genes was higher than that (68.5%) of the Cox model using 17 genes. Higher accuracy (91%) was obtained when an FNN system with 9 noninferior models, consisting of 35 independent genes, was used. The genes selected by the system included genes that are informative in the prognosis of Diffuse large B-cell lymphoma (DLBCL), such as genes showing an expression pattern similar to that of CD10 and BCL-6 or similar to that of IRF-4 and BCL-4. We classified 220 DLBCL patients into 5 groups using the prediction results of 9 FNN models. These groups may correspond to DLBCL subtypes. In group A containing half of the 220 patients, patients with poor outcome were found to satisfy 2 rules, i.e., high expression of MAX dimerization with high expression of unknown A (LC_26146), or high expression of MAX dimerization with low expression of unknown B (LC_33144). The present paper is the first to describe the multiple noninferior FNN modeling system. This system is a powerful tool for predicting outcome and classifying patients, and is applicable to other heterogeneous diseases.
Fabi, João Paulo; Broetto, Sabrina Garcia; da Silva, Sarah Lígia Garcia Leme; Zhong, Silin; Lajolo, Franco Maria; do Nascimento, João Roberto Oliveira
2014-01-01
Papaya (Carica papaya L.) is a climacteric fleshy fruit that undergoes dramatic changes during ripening, most noticeably a severe pulp softening. However, little is known regarding the genetics of the cell wall metabolism in papayas. The present work describes the identification and characterization of genes related to pulp softening. We used gene expression profiling to analyze the correlations and co-expression networks of cell wall-related genes, and the results suggest that papaya pulp softening is accomplished by the interactions of multiple glycoside hydrolases. The polygalacturonase cpPG1 appeared to play a central role in the network and was further studied. The transient expression of cpPG1 in papaya results in pulp softening and leaf necrosis in the absence of ethylene action and confirms its role in papaya fruit ripening.
Bayesian state space models for dynamic genetic network construction across multiple tissues.
Liang, Yulan; Kelemen, Arpad
2016-08-01
Construction of gene-gene interaction networks and potential pathways is a challenging and important problem in genomic research for complex diseases while estimating the dynamic changes of the temporal correlations and non-stationarity are the keys in this process. In this paper, we develop dynamic state space models with hierarchical Bayesian settings to tackle this challenge for inferring the dynamic profiles and genetic networks associated with disease treatments. We treat both the stochastic transition matrix and the observation matrix time-variant and include temporal correlation structures in the covariance matrix estimations in the multivariate Bayesian state space models. The unevenly spaced short time courses with unseen time points are treated as hidden state variables. Hierarchical Bayesian approaches with various prior and hyper-prior models with Monte Carlo Markov Chain and Gibbs sampling algorithms are used to estimate the model parameters and the hidden state variables. We apply the proposed Hierarchical Bayesian state space models to multiple tissues (liver, skeletal muscle, and kidney) Affymetrix time course data sets following corticosteroid (CS) drug administration. Both simulation and real data analysis results show that the genomic changes over time and gene-gene interaction in response to CS treatment can be well captured by the proposed models. The proposed dynamic Hierarchical Bayesian state space modeling approaches could be expanded and applied to other large scale genomic data, such as next generation sequence (NGS) combined with real time and time varying electronic health record (EHR) for more comprehensive and robust systematic and network based analysis in order to transform big biomedical data into predictions and diagnostics for precision medicine and personalized healthcare with better decision making and patient outcomes.
Chauhan, Rinki; Ravi, Janani; Datta, Pratik; Chen, Tianlong; Schnappinger, Dirk; Bassler, Kevin E.; Balázsi, Gábor; Gennaro, Maria Laura
2016-01-01
Accessory sigma factors, which reprogram RNA polymerase to transcribe specific gene sets, activate bacterial adaptive responses to noxious environments. Here we reconstruct the complete sigma factor regulatory network of the human pathogen Mycobacterium tuberculosis by an integrated approach. The approach combines identification of direct regulatory interactions between M. tuberculosis sigma factors in an E. coli model system, validation of selected links in M. tuberculosis, and extensive literature review. The resulting network comprises 41 direct interactions among all 13 sigma factors. Analysis of network topology reveals (i) a three-tiered hierarchy initiating at master regulators, (ii) high connectivity and (iii) distinct communities containing multiple sigma factors. These topological features are likely associated with multi-layer signal processing and specialized stress responses involving multiple sigma factors. Moreover, the identification of overrepresented network motifs, such as autoregulation and coregulation of sigma and anti-sigma factor pairs, provides structural information that is relevant for studies of network dynamics. PMID:27029515
Passing messages between biological networks to refine predicted interactions.
Glass, Kimberly; Huttenhower, Curtis; Quackenbush, John; Yuan, Guo-Cheng
2013-01-01
Regulatory network reconstruction is a fundamental problem in computational biology. There are significant limitations to such reconstruction using individual datasets, and increasingly people attempt to construct networks using multiple, independent datasets obtained from complementary sources, but methods for this integration are lacking. We developed PANDA (Passing Attributes between Networks for Data Assimilation), a message-passing model using multiple sources of information to predict regulatory relationships, and used it to integrate protein-protein interaction, gene expression, and sequence motif data to reconstruct genome-wide, condition-specific regulatory networks in yeast as a model. The resulting networks were not only more accurate than those produced using individual data sets and other existing methods, but they also captured information regarding specific biological mechanisms and pathways that were missed using other methodologies. PANDA is scalable to higher eukaryotes, applicable to specific tissue or cell type data and conceptually generalizable to include a variety of regulatory, interaction, expression, and other genome-scale data. An implementation of the PANDA algorithm is available at www.sourceforge.net/projects/panda-net.
Simple method for assembly of CRISPR synergistic activation mediator gRNA expression array.
Vad-Nielsen, Johan; Nielsen, Anders Lade; Luo, Yonglun
2018-05-20
When studying complex interconnected regulatory networks, effective methods for simultaneously manipulating the expression of multiple genes are paramount. Previously, we developed a simple method for generation of an all-in-one CRISPR gRNA expression array. Here we present a Golden Gate Assembly-based system of synergistic activation mediator (SAM) compatible CRISPR/dCas9 gRNA expression array for the simultaneous activation of multiple genes. Using this system, we demonstrated the simultaneous activation of the transcription factors TWIST, SNAIL, SLUG, and ZEB1 in a human breast cancer cell line. Copyright © 2018 Elsevier B.V. All rights reserved.
Fernandez-Valverde, Selene L; Aguilera, Felipe; Ramos-Díaz, René Alexander
2018-06-18
The advent of high-throughput sequencing technologies has revolutionized the way we understand the transformation of genetic information into morphological traits. Elucidating the network of interactions between genes that govern cell differentiation through development is one of the core challenges in genome research. These networks are known as developmental gene regulatory networks (dGRNs) and consist largely of the functional linkage between developmental control genes, cis-regulatory modules and differentiation genes, which generate spatially and temporally refined patterns of gene expression. Over the last 20 years, great advances have been made in determining these gene interactions mainly in classical model systems, including human, mouse, sea urchin, fruit fly, and worm. This has brought about a radical transformation in the fields of developmental biology and evolutionary biology, allowing the generation of high-resolution gene regulatory maps to analyse cell differentiation during animal development. Such maps have enabled the identification of gene regulatory circuits and have led to the development of network inference methods that can recapitulate the differentiation of specific cell-types or developmental stages. In contrast, dGRN research in non-classical model systems has been limited to the identification of developmental control genes via the candidate gene approach and the characterization of their spatiotemporal expression patterns, as well as to the discovery of cis-regulatory modules via patterns of sequence conservation and/or predicted transcription-factor binding sites. However, thanks to the continuous advances in high-throughput sequencing technologies, this scenario is rapidly changing. Here, we give a historical overview on the architecture and elucidation of the dGRNs. 
Subsequently, we summarize the approaches available to unravel these regulatory networks, highlighting the vast range of possibilities of integrating multiple technical advances and theoretical approaches to expand our understanding of global gene regulation during animal development in non-classical model systems. Such new knowledge will not only lead to greater insights into the evolution of molecular mechanisms underlying cell identity and animal body plans, but also into the evolution of key morphological innovations in animals.
Hovel-Miner, Galadriel; Pampou, Sergey; Faucher, Sebastien P; Clarke, Margaret; Morozova, Irina; Morozov, Pavel; Russo, James J; Shuman, Howard A; Kalachikov, Sergey
2009-04-01
Legionella pneumophila is the causative agent of the severe and potentially fatal pneumonia Legionnaires' disease. L. pneumophila is able to replicate within macrophages and protozoa by establishing a replicative compartment in a process that requires the Icm/Dot type IVB secretion system. The signals and regulatory pathways required for Legionella infection and intracellular replication are poorly understood. Mutation of the rpoS gene, which encodes sigma(S), does not affect growth in rich medium but severely decreases L. pneumophila intracellular multiplication within protozoan hosts. To gain insight into the intracellular multiplication defect of an rpoS mutant, we examined its pattern of gene expression during exponential and postexponential growth. We found that sigma(S) affects distinct groups of genes that contribute to Legionella intracellular multiplication. We demonstrate that rpoS mutants have a functional Icm/Dot system yet are defective for the expression of many genes encoding Icm/Dot-translocated substrates. We also show that sigma(S) affects the transcription of the cpxR and pmrA genes, which encode two-component response regulators that directly affect the transcription of Icm/Dot substrates. Our characterization of the L. pneumophila small RNA csrB homologs, rsmY and rsmZ, introduces a link between sigma(S) and the posttranscriptional regulator CsrA. We analyzed the network of sigma(S)-controlled genes by mutational analysis of transcriptional regulators affected by sigma(S). One of these, encoding the L. pneumophila arginine repressor homolog gene, argR, is required for maximal intracellular growth in amoebae. These data show that sigma(S) is a key regulator of multiple pathways required for L. pneumophila intracellular multiplication.
Gong, Cuihua; Sun, Shangtong; Liu, Bing; Wang, Jing; Chen, Xiaodong
2017-06-01
The study aimed to identify the potential target genes and key miRNAs as well as to explore the underlying mechanisms in the pathogenesis of oral lichen planus (OLP) by bioinformatics analysis. The microarray data of GSE38617 were downloaded from the Gene Expression Omnibus (GEO) database. A total of 7 OLP and 7 normal samples were used to identify the differentially expressed genes (DEGs) and miRNAs. Functional enrichment analyses were then performed on the DEGs. Furthermore, a DEG-miRNA network and a miRNA-function network were constructed with Cytoscape software. A total of 1758 DEGs (598 up- and 1160 down-regulated genes) and 40 miRNAs (17 up- and 23 down-regulated miRNAs) were selected. The up-regulated genes were related to the nuclear factor-Kappa B (NF-κB) signaling pathway, while down-regulated genes were mainly enriched in the function of the ribosome. Tumor necrosis factor (TNF), caspase recruitment domain family, member 11 (CARD11) and mitochondrial ribosomal protein (MRP) genes were identified in these functions. In addition, miR-302 was a hub node in the DEG-miRNA network and regulated cyclin D1 (CCND1). MiR-548a-2 was the key miRNA in the miRNA-function network, regulating multiple functions including ribosomal function. The NF-κB signaling pathway and ribosome function may be the pathogenic mechanisms of OLP. Genes such as TNF, CARD11, MRP genes and CCND1 may be potential therapeutic target genes in OLP. MiR-548a-2 and miR-302 may play important roles in OLP development. Copyright © 2017 Elsevier Ltd. All rights reserved.
Rajamani, Deepa; Bhasin, Manoj K
2016-05-03
Pancreatic cancer is an aggressive cancer with dismal prognosis, urgently necessitating better biomarkers to improve therapeutic options and early diagnosis. Traditional approaches of biomarker detection that consider only one aspect of the biological continuum like gene expression alone are limited in their scope and lack robustness in identifying the key regulators of the disease. We have adopted a multidimensional approach involving the cross-talk between the omics spaces to identify key regulators of disease progression. Multidimensional domain-specific disease signatures were obtained using rank-based meta-analysis of individual omics profiles (mRNA, miRNA, DNA methylation) related to pancreatic ductal adenocarcinoma (PDAC). These domain-specific PDAC signatures were integrated to identify genes that were affected across multiple dimensions of omics space in PDAC (genes under multiple regulatory controls, GMCs). To further pin down the regulators of PDAC pathophysiology, a systems-level network was generated from knowledge-based interaction information applied to the above identified GMCs. Key regulators were identified from the GMC network based on network statistics and their functional importance was validated using gene set enrichment analysis and survival analysis. Rank-based meta-analysis identified 5391 genes, 109 miRNAs and 2081 methylation-sites significantly differentially expressed in PDAC (false discovery rate ≤ 0.05). Bimodal integration of meta-analysis signatures revealed 1150 and 715 genes regulated by miRNAs and methylation, respectively. Further analysis identified 189 altered genes that are commonly regulated by miRNA and methylation, hence considered GMCs. Systems-level analysis of the scale-free GMCs network identified eight potential key regulator hubs, namely E2F3, HMGA2, RASA1, IRS1, NUAK1, ACTN1, SKI and DLL1, associated with important pathways driving cancer progression. 
Survival analysis on individual key regulators revealed that higher expression of IRS1 and DLL1 and lower expression of HMGA2, ACTN1 and SKI were associated with better survival probabilities. It is evident from the results that our hierarchical systems-level multidimensional analysis approach has been successful in isolating the converging regulatory modules and associated key regulatory molecules that are potential biomarkers for pancreatic cancer progression.
Willsey, A. Jeremy; Sanders, Stephan J.; Li, Mingfeng; Dong, Shan; Tebbenkamp, Andrew T.; Muhle, Rebecca A.; Reilly, Steven K.; Lin, Leon; Fertuzinhos, Sofia; Miller, Jeremy A.; Murtha, Michael T.; Bichsel, Candace; Niu, Wei; Cotney, Justin; Ercan-Sencicek, A. Gulhan; Gockley, Jake; Gupta, Abha; Han, Wenqi; He, Xin; Hoffman, Ellen; Klei, Lambertus; Lei, Jing; Liu, Wenzhong; Liu, Li; Lu, Cong; Xu, Xuming; Zhu, Ying; Mane, Shrikant M.; Lein, Edward S.; Wei, Liping; Noonan, James P.; Roeder, Kathryn; Devlin, Bernie; Šestan, Nenad; State, Matthew W.
2013-01-01
SUMMARY Autism spectrum disorder (ASD) is a complex developmental syndrome of unknown etiology. Recent studies employing exome- and genome-wide sequencing have identified nine high-confidence ASD (hcASD) genes. Working from the hypothesis that ASD-associated mutations in these biologically pleiotropic genes will disrupt intersecting developmental processes to contribute to a common phenotype, we have attempted to identify time periods, brain regions, and cell types in which these genes converge. We have constructed coexpression networks based on the hcASD “seed” genes, leveraging a rich expression data set encompassing multiple human brain regions across human development and into adulthood. By assessing enrichment of an independent set of probable ASD (pASD) genes, derived from the same sequencing studies, we demonstrate a key point of convergence in midfetal layer 5/6 cortical projection neurons. This approach informs when, where, and in what cell types mutations in these specific genes may be productively studied to clarify ASD pathophysiology. PMID:24267886
Wang, Nani; Zhao, Guizhi; Zhang, Yang; Wang, Xuping; Zhao, Lisha; Xu, Pingcui; Shou, Dan
2017-10-27
BACKGROUND Osteoporosis is a complex bone disorder with a genetic predisposition, and is a cause of health problems worldwide. In China, Curculigo orchioides (CO) has been widely used as a herbal medicine in the prevention and treatment of osteoporosis. However, research on the mechanism of action of CO is still lacking. The aim of this study was to identify the absorbable components, potential targets, and associated treatment pathways of CO using a network pharmacology approach. MATERIAL AND METHODS We explored the chemical components of CO and used the five main principles of drug absorption to identify absorbable components. Targets for the therapeutic actions of CO were obtained from the PharmMapper server database. Pathway enrichment analysis was performed using the Comparative Toxicogenomics Database (CTD). Cytoscape was used to visualize the multiple components-multiple target-multiple pathways-multiple disease network for CO. RESULTS We identified 77 chemical components of CO, of which 32 components could be absorbed in the blood. These potential active components of CO regulated 83 targets and affected 58 pathways. Data analysis showed that the genes for estrogen receptor alpha (ESR1) and beta (ESR2), and the gene for 11 beta-hydroxysteroid dehydrogenase type 1, or cortisone reductase (HSD11B1) were the main targets of CO. Endocrine regulatory factors and factors regulating calcium reabsorption, steroid hormone biosynthesis, and metabolic pathways were related to these main targets and to ten corresponding compounds. CONCLUSIONS The network pharmacology approach used in our study has attempted to explain the mechanisms for the effects of CO in the prevention and treatment of osteoporosis, and provides an alternative approach to the investigation of the effects of this complex compound.
Combining lipophilic dye, in situ hybridization, immunohistochemistry, and histology.
Duncan, Jeremy; Kersigo, Jennifer; Gray, Brian; Fritzsch, Bernd
2011-03-17
Going beyond single gene function to cut deeper into gene regulatory networks requires multiple mutations combined in a single animal. Such analysis of two or more genes needs to be complemented with in situ hybridization of other genes, or immunohistochemistry of their proteins, both in whole mounted developing organs or sections for detailed resolution of the cellular and tissue expression alterations. Combining multiple gene alterations requires the use of cre or flipase to conditionally delete genes and avoid embryonic lethality. Required breeding schemes dramatically enhance effort and cost proportional to the number of genes mutated, with an outcome of very few animals with the full repertoire of genetic modifications desired. Amortizing the vast amount of effort and time to obtain these few precious specimens that are carrying multiple mutations necessitates tissue optimization. Moreover, investigating a single animal with multiple techniques makes it easier to correlate gene deletion defects with expression profiles. We have developed a technique to obtain a more thorough analysis of a given animal; with the ability to analyze several different histologically recognizable structures as well as gene and protein expression all from the same specimen in both whole mounted organs and sections. Although mice have been utilized to demonstrate the effectiveness of this technique it can be applied to a wide array of animals. To do this we combine lipophilic dye tracing, whole mount in situ hybridization, immunohistochemistry, and histology to extract the maximal possible amount of data. PMID:21445047
Inferring Phylogenetic Networks Using PhyloNet.
Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay
2018-07-01
PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.
Impact of environmental inputs on reverse-engineering approach to network structures.
Wu, Jianhua; Sinfield, James L; Buchanan-Wollaston, Vicky; Feng, Jianfeng
2009-12-04
Uncovering complex network structures from a biological system is one of the main topics in systems biology. The network structures can be inferred by the dynamical Bayesian network or Granger causality, but neither technique has seriously taken into account the impact of environmental inputs. With considerations of natural rhythmic dynamics of biological data, we propose a systems biology approach to reveal the impact of environmental inputs on network structures. We first represent the environmental inputs by a harmonic oscillator and combine them with Granger causality to identify environmental inputs and then uncover the causal network structures. We also generalize it to multiple harmonic oscillators to represent various exogenous influences. This systems approach is extensively tested with toy models and successfully applied to a real biological network of microarray data of the flowering genes of the model plant Arabidopsis thaliana. The aim is to identify those genes that are directly affected by the presence of the sunlight and uncover the interactive network structures associated with flowering metabolism. We demonstrate that environmental inputs are crucial for correctly inferring network structures. The harmonic causal method proves to be a powerful technique to detect environmental inputs and uncover network structures, especially when the biological data exhibit periodic oscillations.
Tools and Models for Integrating Multiple Cellular Networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gerstein, Mark
2015-11-06
In this grant, we have systematically investigated the integrated networks, which are responsible for the coordination of activity between metabolic pathways in prokaryotes. We have developed several computational tools to analyze the topology of the integrated networks consisting of metabolic, regulatory, and physical interaction networks. The tools are all open-source, and they are available to download from Github, and can be incorporated in the Knowledgebase. Here, we summarize our work as follows. Understanding the topology of the integrated networks is the first step toward understanding its dynamics and evolution. For Aim 1 of this grant, we have developed a novel algorithm to determine and measure the hierarchical structure of transcriptional regulatory networks [1]. The hierarchy captures the direction of information flow in the network. The algorithm is generally applicable to regulatory networks in prokaryotes, yeast and higher organisms. Integrated datasets are extremely beneficial in understanding the biology of a system in a compact manner due to the conflation of multiple layers of information. Therefore for Aim 2 of this grant, we have developed several tools and carried out analysis for integrating system-wide genomic information. To make use of the structural data, we have developed DynaSIN for protein-protein interaction networks with various dynamical interfaces [2]. We then examined the association between network topology and phenotypic effects such as gene essentiality. In particular, we have organized E. coli and S. cerevisiae transcriptional regulatory networks into hierarchies. We then correlated gene phenotypic effects by tinkering with different layers to elucidate which layers were more tolerant to perturbations [3].
In the context of evolution, we also developed a workflow to guide the comparison between different types of biological networks across various species using the concept of rewiring [4]. Furthermore, we have developed CRIT for correlation analysis in systems biology [5]. For Aim 3, we have further investigated the scaling relationship that the number of Transcription Factors (TFs) in a genome is proportional to the square of the total number of genes. We have extended the analysis from transcription factors to various classes of functional categories, and from individual categories to joint distribution [6]. By introducing a new analytical framework, we have generalized the original toolbox model to take into account metabolic networks with arbitrary network topology [7].
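The scaling relationship described under Aim 3 can be written compactly as below. The symbols $N_{\mathrm{TF}}$ (number of transcription factors), $N_{G}$ (total number of genes), and the proportionality constant $c$ are notation introduced here for illustration and do not come from the report.

```latex
N_{\mathrm{TF}} \approx c \, N_{G}^{2}
\quad\Longleftrightarrow\quad
\log N_{\mathrm{TF}} \approx 2 \log N_{G} + \log c
```

On a log-log plot of TF count against gene count across genomes, such a relationship would appear as a straight line with slope 2.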
A prior-based integrative framework for functional transcriptional regulatory network inference
Siahpirani, Alireza F.
2017-01-01
Transcriptional regulatory networks specify regulatory proteins controlling the context-specific expression levels of genes. Inference of genome-wide regulatory networks is central to understanding gene regulation, but remains an open challenge. Expression-based network inference is among the most popular methods to infer regulatory networks; however, networks inferred from such methods have low overlap with experimentally derived (e.g. ChIP-chip and transcription factor (TF) knockouts) networks. Currently we have a limited understanding of this discrepancy. To address this gap, we first develop a regulatory network inference algorithm, based on probabilistic graphical models, to integrate expression with auxiliary datasets supporting a regulatory edge. Second, we comprehensively analyze our and other state-of-the-art methods on different expression perturbation datasets. Networks inferred by integrating sequence-specific motifs with expression have substantially greater agreement with experimentally derived networks, while remaining more predictive of expression than motif-based networks. Our analysis suggests natural genetic variation as the most informative perturbation for network inference, and identifies core TFs whose targets are predictable from expression. Multiple reasons make the identification of targets of other TFs difficult, including network architecture and insufficient variation of TF mRNA level. Finally, we demonstrate the utility of our inference algorithm to infer stress-specific regulatory networks and for regulator prioritization. PMID:27794550
Liu, Jie; Xie, Yaxiong; Ducharme, Danica M K; Shen, Jun; Diwan, Bhalchandra A; Merrick, B Alex; Grissom, Sherry F; Tucker, Charles J; Paules, Richard S; Tennant, Raymond; Waalkes, Michael P
2006-03-01
Our previous work has shown that exposure to inorganic arsenic in utero produces hepatocellular carcinoma (HCC) in adult male mice. To explore further the molecular mechanisms of transplacental arsenic hepatocarcinogenesis, we conducted a second arsenic transplacental carcinogenesis study and used a genomewide microarray to profile arsenic-induced aberrant gene expression more extensively. Briefly, pregnant C3H mice were given drinking water containing 85 ppm arsenic as sodium arsenite or unaltered water from days 8 to 18 of gestation. The incidence of HCC in adult male offspring was increased 4-fold and tumor multiplicity 3-fold after transplacental arsenic exposure. Samples of normal liver and liver tumors were taken at autopsy for genomic analysis. Arsenic exposure in utero resulted in significant alterations (p < 0.001) in the expression of 2,010 genes in arsenic-exposed liver samples and in the expression of 2,540 genes in arsenic-induced HCC. Ingenuity Pathway Analysis revealed that significant alterations in gene expression occurred in a number of biological networks, and Myc plays a critical role in one of the primary networks. Real-time reverse transcriptase-polymerase chain reaction and Western blot analysis of selected genes/proteins showed > 90% concordance. Arsenic-altered gene expression included activation of oncogenes and HCC biomarkers, and increased expression of cell proliferation-related genes, stress proteins, and insulin-like growth factors and genes involved in cell-cell communications. Liver feminization was evidenced by increased expression of estrogen-linked genes and altered expression of genes that encode gender-related metabolic enzymes. These novel findings are in agreement with the biology and histology of arsenic-induced HCC, thereby indicating that multiple genetic events are associated with transplacental arsenic hepatocarcinogenesis.
Identification of a neuronal transcription factor network involved in medulloblastoma development
2013-01-01
Background Medulloblastomas, the most frequent malignant brain tumours affecting children, comprise at least 4 distinct clinicogenetic subgroups. Aberrant sonic hedgehog (SHH) signalling is observed in approximately 25% of tumours and defines one subgroup. Although alterations in SHH pathway genes (e.g. PTCH1, SUFU) are observed in many of these tumours, high throughput genomic analyses have identified few other recurring mutations. Here, we have mutagenised the Ptch+/- murine tumour model using the Sleeping Beauty transposon system to identify additional genes and pathways involved in SHH subgroup medulloblastoma development. Results Mutagenesis significantly increased medulloblastoma frequency and identified 17 candidate cancer genes, including orthologs of genes somatically mutated (PTEN, CREBBP) or associated with poor outcome (PTEN, MYT1L) in the human disease. Strikingly, these candidate genes were enriched for transcription factors (p = 2 × 10^-5), the majority of which (6/7; Crebbp, Myt1L, Nfia, Nfib, Tead1 and Tgif2) were linked within a single regulatory network enriched for genes associated with a differentiated neuronal phenotype. Furthermore, activity of this network varied significantly between the human subgroups, was associated with metastatic disease, and predicted poor survival specifically within the SHH subgroup of tumours. Igf2, previously implicated in medulloblastoma, was the most differentially expressed gene in murine tumours with network perturbation, and network activity in both mouse and human tumours was characterised by enrichment for multiple gene-sets indicating increased cell proliferation, IGF signalling, MYC target upregulation, and decreased neuronal differentiation. Conclusions Collectively, our data support a model of medulloblastoma development in SB-mutagenised Ptch+/- mice which involves disruption of a novel transcription factor network leading to Igf2 upregulation, proliferation of GNPs, and tumour formation. 
Moreover, our results identify rational therapeutic targets for SHH subgroup tumours, alongside prognostic biomarkers for the identification of poor-risk SHH patients. PMID:24252690
Genome-wide association and network analysis of lung function in the Framingham Heart Study.
Liao, Shu-Yi; Lin, Xihong; Christiani, David C
2014-09-01
Single nucleotide polymorphisms have been found to be associated with pulmonary function using genome-wide association studies. However, lung function is a complex trait that is likely to be influenced by multiple gene-gene interactions besides individual genes. Our goal is to build a cellular network to explore the relationship between pulmonary function and genotypes by combining SNP level and network analyses using longitudinal lung function data from the Framingham Heart Study. We analyzed 2,698 genotyped participants from the Offspring cohort that had an average of 3.35 spirometry measurements per person for a mean length of 13 years. Repeated forced expiratory volume in one second (FEV1) and the ratio of FEV1 to forced vital capacity (FVC) were used as outcomes. Data were analyzed using linear-mixed models for the association between lung function and alleles by accounting for the correlation among repeated measures over time within the same subject and within-family correlation. Network analyses were performed using dmGWAS and validated with data from the Third Generation cohort. Analyses identified SMAD3, TGFBR2, CD44, CTGF, VCAN, CTNNB1, SCGB1A1, PDE4D, NRG1, EPHB1, and LYN as contributors to pulmonary function. Most of these genes were novel and had not been found previously using solely SNP-level analysis. These novel genes are involved in the transforming growth factor beta (TGFB)-SMAD pathway and the Wnt/beta-catenin pathway, among others. Therefore, combining SNP-level and network analyses using longitudinal lung function data is a useful alternative strategy to identify risk genes. © 2014 WILEY PERIODICALS, INC.
Customizing cell signaling using engineered genetic logic circuits.
Wang, Baojun; Buck, Martin
2012-08-01
Cells live in an ever-changing environment and continuously sense, process and react to environmental signals using their inherent signaling and gene regulatory networks. Recently, there have been great advances in rewiring the native cell signaling and gene networks to program cells to sense multiple noncognate signals and integrate them in a logical manner before initiating a desired response. Here, we summarize the current state of the art in engineering synthetic genetic logic circuits to customize cellular signaling behaviors, and discuss their promising applications in biocomputing, environmental, biotechnological and biomedical areas as well as the remaining challenges in this growing field. Copyright © 2012 Elsevier Ltd. All rights reserved.
Scarpa, Joseph R; Jiang, Peng; Losic, Bojan; Readhead, Ben; Gao, Vance D; Dudley, Joel T; Vitaterna, Martha H; Turek, Fred W; Kasarskis, Andrew
2016-07-01
Recent systems-based analyses have demonstrated that sleep and stress traits emerge from shared genetic and transcriptional networks, and clinical work has elucidated the emergence of sleep dysfunction and stress susceptibility as early symptoms of Huntington's disease. Understanding the biological bases of these early non-motor symptoms may reveal therapeutic targets that prevent disease onset or slow disease progression, but the molecular mechanisms underlying this complex clinical presentation remain largely unknown. In the present work, we specifically examine the relationship between these psychiatric traits and Huntington's disease (HD) by identifying striatal transcriptional networks shared by HD, stress, and sleep phenotypes. First, we utilize a systems-based approach to examine a large publicly available human transcriptomic dataset for HD (GSE3790 from GEO) in a novel way. We use weighted gene coexpression network analysis and differential connectivity analyses to identify transcriptional networks dysregulated in HD, and we use an unbiased ranking scheme that leverages both gene- and network-level information to identify a novel astrocyte-specific network as most relevant to HD caudate. We validate this result in an independent HD cohort. Next, we computationally predict FOXO3 as a regulator of this network, and use multiple publicly available in vitro and in vivo experimental datasets to validate that this astrocyte HD network is downstream of a signaling pathway important in adult neurogenesis (TGFβ-FOXO3). We also map this HD-relevant caudate subnetwork to striatal transcriptional networks in a large (n = 100) chronically stressed (B6xA/J)F2 mouse population that has been extensively phenotyped (328 stress- and sleep-related measurements), and we show that this striatal astrocyte network is correlated to sleep and stress traits, many of which are known to be altered in HD cohorts. 
We identify causal regulators of this network through Bayesian network analysis, and we highlight their relevance to motor, mood, and sleep traits through multiple in silico approaches, including an examination of their protein binding partners. Finally, we show that these causal regulators may be therapeutically viable for HD because their downstream network was partially modulated by deep brain stimulation of the subthalamic nucleus, a medical intervention thought to confer some therapeutic benefit to HD patients. In conclusion, we show that an astrocyte transcriptional network is primarily associated with HD in the caudate and provide evidence for its relationship to molecular mechanisms of neural stem cell homeostasis. Furthermore, we present a unified systems-based framework for identifying gene networks that are associated with complex non-motor traits that manifest in the earliest phases of HD. By analyzing and integrating multiple independent datasets, we identify a point of molecular convergence between sleep, stress, and HD that reflects their phenotypic comorbidity and reveals a molecular pathway involved in HD progression.
Reyes-Gibby, Cielito C; Yuan, Christine; Wang, Jian; Yeung, Sai-Ching J; Shete, Sanjay
2015-06-05
Addictions to alcohol and tobacco, known risk factors for cancer, are complex heritable disorders. Addictive behaviors have a bidirectional relationship with pain. We hypothesize that the associations between alcohol, smoking, and opioid addiction observed in cancer patients have a genetic basis. Therefore, using bioinformatics tools, we explored the underlying genetic basis and identified new candidate genes and common biological pathways for smoking, alcohol, and opioid addiction. Literature search showed 56 genes associated with alcohol, smoking and opioid addiction. Using Core Analysis function in Ingenuity Pathway Analysis software, we found that ERK1/2 was strongly interconnected across all three addiction networks. Genes involved in immune signaling pathways were shown across all three networks. Connect function from IPA My Pathway toolbox showed that DRD2 is the gene common to both the list of genetic variations associated with all three addiction phenotypes and the components of the brain neuronal signaling network involved in substance addiction. The top canonical pathways associated with the 56 genes were: 1) calcium signaling, 2) GPCR signaling, 3) cAMP-mediated signaling, 4) GABA receptor signaling, and 5) G-alpha i signaling. Cancer patients are often prescribed opioids for cancer pain thus increasing their risk for opioid abuse and addiction. Our findings provide candidate genes and biological pathways underlying addiction phenotypes, which may be future targets for treatment of addiction. Further study of the variations of the candidate genes could allow physicians to make more informed decisions when treating cancer pain with opioid analgesics.
Evidence for Transcript Networks Composed of Chimeric RNAs in Human Cells
Borel, Christelle; Mudge, Jonathan M.; Howald, Cédric; Foissac, Sylvain; Ucla, Catherine; Chrast, Jacqueline; Ribeca, Paolo; Martin, David; Murray, Ryan R.; Yang, Xinping; Ghamsari, Lila; Lin, Chenwei; Bell, Ian; Dumais, Erica; Drenkow, Jorg; Tress, Michael L.; Gelpí, Josep Lluís; Orozco, Modesto; Valencia, Alfonso; van Berkum, Nynke L.; Lajoie, Bryan R.; Vidal, Marc; Stamatoyannopoulos, John; Batut, Philippe; Dobin, Alex; Harrow, Jennifer; Hubbard, Tim; Dekker, Job; Frankish, Adam; Salehi-Ashtiani, Kourosh; Reymond, Alexandre; Antonarakis, Stylianos E.; Guigó, Roderic; Gingeras, Thomas R.
2012-01-01
The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network. PMID:22238572
Zinati, Zahra; Shamloo-Dashtpagerdi, Roohollah; Behpouri, Ali
2016-01-01
As an aromatic and colorful plant of substantive taste, saffron (Crocus sativus L.) owes these properties to a growing class of secondary metabolites derived from the carotenoids, the apocarotenoids. Given the critical role of microRNAs in secondary metabolite synthesis and the limited number of identified miRNAs in C. sativus, characterizing miRNAs along with their corresponding target genes in C. sativus might expand our perspective on the roles of miRNAs in the carotenoid/apocarotenoid biosynthetic pathway. A computational analysis was used to identify miRNAs and their targets using an EST (Expressed Sequence Tag) library from mature saffron stigmas. Then, a gene co-expression network was constructed to identify genes which are potentially involved in carotenoid/apocarotenoid biosynthetic pathways. EST analysis led to the identification of two putative miRNAs (miR414 and miR837-5p) along with the corresponding stem-looped precursors. To our knowledge, this is the first report on miR414 and miR837-5p in C. sativus. Co-expression network analysis indicated that miR414 and miR837-5p may play roles in C. sativus metabolic pathways and led to the identification of candidate genes, including six transcription factors and one protein kinase, probably involved in the carotenoid/apocarotenoid biosynthetic pathway. The presence of transcription factors, miRNAs and a protein kinase in the network indicated multiple layers of regulation in the saffron stigma. The candidate genes from this study may help in unraveling the regulatory networks underlying carotenoid/apocarotenoid biosynthesis in saffron and in designing metabolic engineering for enhanced secondary metabolites. PMID:28261627
Network-Based Identification and Prioritization of Key Regulators of Coronary Artery Disease Loci
Zhao, Yuqi; Chen, Jing; Freudenberg, Johannes M.; Meng, Qingying; Rajpal, Deepak K.; Yang, Xia
2017-01-01
Objective Recent genome-wide association studies of coronary artery disease (CAD) have revealed 58 genome-wide significant and 148 suggestive genetic loci. However, the molecular mechanisms through which they contribute to CAD and the clinical implications of these findings remain largely unknown. We aim to retrieve gene subnetworks of the 206 CAD loci and identify and prioritize candidate regulators to better understand the biological mechanisms underlying the genetic associations. Approach and Results We devised a new integrative genomics approach that incorporated (1) candidate genes from the top CAD loci, (2) the complete genetic association results from the 1000 genomes-based CAD genome-wide association studies from the Coronary Artery Disease Genome Wide Replication and Meta-Analysis Plus the Coronary Artery Disease consortium, (3) tissue-specific gene regulatory networks that depict the potential relationship and interactions between genes, and (4) tissue-specific gene expression patterns between CAD patients and controls. The networks and top-ranked regulators according to these data-driven criteria were further queried against literature, experimental evidence, and drug information to evaluate their disease relevance and potential as drug targets. Our analysis uncovered several potential novel regulators of CAD such as LUM and STAT3, which possess properties suitable as drug targets. We also revealed molecular relations and potential mechanisms through which the top CAD loci operate. Furthermore, we found that multiple CAD-relevant biological processes such as extracellular matrix, inflammatory and immune pathways, complement and coagulation cascades, and lipid metabolism interact in the CAD networks. 
Conclusions Our data-driven integrative genomics framework unraveled tissue-specific relations among the candidate genes of the CAD genome-wide association studies loci and prioritized novel network regulatory genes orchestrating biological processes relevant to CAD. PMID:26966275
Zheng, Xiaoyan; Cai, Danying; Potter, Daniel; Postman, Joseph; Liu, Jing; Teng, Yuanwen
2014-11-01
Reconstructing the phylogeny of Pyrus has been difficult due to the wide distribution of the genus and lack of informative data. In this study, we collected 110 accessions representing 25 Pyrus species and constructed both phylogenetic trees and phylogenetic networks based on multiple DNA sequence datasets. Phylogenetic trees based on both cpDNA and nuclear LFY2int2-N (LN) data resulted in poor resolution; in particular, only five primary species were monophyletic in the LN tree. A phylogenetic network of LN suggested that reticulation caused by hybridization is one of the major evolutionary processes for Pyrus species. Polytomies of the gene trees and the star-like structure of cpDNA networks suggested that rapid radiation is another major evolutionary process, especially for the occidental species. Pyrus calleryana and P. regelii were the earliest diverged Pyrus species. Two North African species, P. cordata, P. spinosa and P. betulaefolia were descendants of primitive stock Pyrus species and still share some common molecular characters. Southwestern China, where a large number of P. pashia populations are found, is probably the most important diversification center of Pyrus. More accessions and nuclear genes are needed for further understanding the evolutionary histories of Pyrus. Copyright © 2014 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Hwang, Sohyun; Kim, Chan Yeong; Ji, Sun-Gou; Go, Junhyeok; Kim, Hanhae; Yang, Sunmo; Kim, Hye Jin; Cho, Ara; Yoon, Sang Sun; Lee, Insuk
2016-05-01
Pseudomonas aeruginosa is a Gram-negative bacterium of clinical significance. Although the genome of PAO1, a prototype strain of P. aeruginosa, has been extensively studied, approximately one-third of the functional genome remains unknown. With the emergence of antibiotic-resistant strains of P. aeruginosa, there is an urgent need to develop novel antibiotic and anti-virulence strategies, which may be facilitated by an approach that explores P. aeruginosa gene function in systems-level models. Here, we present a genome-wide functional network of P. aeruginosa genes, PseudomonasNet, which covers 98% of the coding genome, and a companion web server to generate functional hypotheses using various network-search algorithms. We demonstrate that PseudomonasNet-assisted predictions can effectively identify novel genes involved in virulence and antibiotic resistance. Moreover, an antibiotic-resistance network based on PseudomonasNet reveals that P. aeruginosa has common modular genetic organisations that confer increased or decreased resistance to diverse antibiotics, which accounts for the pervasiveness of cross-resistance across multiple drugs. The same network also suggests that P. aeruginosa has developed a mechanism of trade-off in resistance across drugs by altering genetic interactions. Taken together, these results clearly demonstrate the usefulness of a genome-scale functional network to investigate pathogenic systems in P. aeruginosa.
Fabi, João Paulo; Broetto, Sabrina Garcia; da Silva, Sarah Lígia Garcia Leme; Zhong, Silin; Lajolo, Franco Maria; do Nascimento, João Roberto Oliveira
2014-01-01
Papaya (Carica papaya L.) is a climacteric fleshy fruit that undergoes dramatic changes during ripening, most noticeably a severe pulp softening. However, little is known regarding the genetics of the cell wall metabolism in papayas. The present work describes the identification and characterization of genes related to pulp softening. We used gene expression profiling to analyze the correlations and co-expression networks of cell wall-related genes, and the results suggest that papaya pulp softening is accomplished by the interactions of multiple glycoside hydrolases. The polygalacturonase cpPG1 appeared to play a central role in the network and was further studied. The transient expression of cpPG1 in papaya results in pulp softening and leaf necrosis in the absence of ethylene action and confirms its role in papaya fruit ripening. PMID:25162506
Chasman, Deborah; Walters, Kevin B.; Lopes, Tiago J. S.; Eisfeld, Amie J.; Kawaoka, Yoshihiro; Roy, Sushmita
2016-01-01
Mammalian host response to pathogenic infections is controlled by a complex regulatory network connecting regulatory proteins such as transcription factors and signaling proteins to target genes. An important challenge in infectious disease research is to understand molecular similarities and differences in mammalian host response to diverse sets of pathogens. Recently, systems biology studies have produced rich collections of omic profiles measuring host response to infectious agents such as influenza viruses at multiple levels. To gain a comprehensive understanding of the regulatory network driving host response to multiple infectious agents, we integrated host transcriptomes and proteomes using a network-based approach. Our approach combines expression-based regulatory network inference, structured-sparsity based regression, and network information flow to infer putative physical regulatory programs for expression modules. We applied our approach to identify regulatory networks, modules and subnetworks that drive host response to multiple influenza infections. The inferred regulatory network and modules are significantly enriched for known pathways of immune response and implicate apoptosis, splicing, and interferon signaling processes in the differential response of viral infections of different pathogenicities. We used the learned network to prioritize regulators and study virus and time-point specific networks. RNAi-based knockdown of predicted regulators had significant impact on viral replication and include several previously unknown regulators. Taken together, our integrated analysis identified novel module level patterns that capture strain and pathogenicity-specific patterns of expression and helped identify important regulators of host response to influenza infection. PMID:27403523
Foo, Mathias; Gherman, Iulia; Zhang, Peijun; Bates, Declan G; Denby, Katherine J
2018-05-23
Crop disease leads to significant waste worldwide, both pre- and postharvest, with subsequent economic and sustainability consequences. Disease outcome is determined both by the plants' response to the pathogen and by the ability of the pathogen to suppress defense responses and manipulate the plant to enhance colonization. The defense response of a plant is characterized by significant transcriptional reprogramming mediated by underlying gene regulatory networks, and components of these networks are often targeted by attacking pathogens. Here, using gene expression data from Botrytis cinerea-infected Arabidopsis plants, we develop a systematic approach for mitigating the effects of pathogen-induced network perturbations, using the tools of synthetic biology. We employ network inference and system identification techniques to build an accurate model of an Arabidopsis defense subnetwork that contains key genes determining susceptibility of the plant to the pathogen attack. Once validated against time-series data, we use this model to design and test perturbation mitigation strategies based on the use of genetic feedback control. We show how a synthetic feedback controller can be designed to attenuate the effect of external perturbations on the transcription factor CHE in our subnetwork. We investigate and compare two approaches for implementing such a controller biologically (direct implementation of the genetic feedback controller, and rewiring the regulatory regions of multiple genes) to achieve the network motif required to implement the controller. Our results highlight the potential of combining feedback control theory with synthetic biology for engineering plants with enhanced resilience to environmental stress.
MINE: Module Identification in Networks
2011-01-01
Background Graphical models of network associations are useful for both visualizing and integrating multiple types of association data. Identifying modules, or groups of functionally related gene products, is an important challenge in analyzing biological networks. However, existing tools to identify modules are insufficient when applied to dense networks of experimentally derived interaction data. To address this problem, we have developed an agglomerative clustering method that is able to identify highly modular sets of gene products within highly interconnected molecular interaction networks. Results MINE outperforms MCODE, CFinder, NEMO, SPICi, and MCL in identifying non-exclusive, high modularity clusters when applied to the C. elegans protein-protein interaction network. The algorithm generally achieves superior geometric accuracy and modularity for annotated functional categories. In comparison with the most closely related algorithm, MCODE, the top clusters identified by MINE are consistently of higher density and MINE is less likely to designate overlapping modules as a single unit. MINE offers a high level of granularity with a small number of adjustable parameters, enabling users to fine-tune cluster results for input networks with differing topological properties. Conclusions MINE was created in response to the challenge of discovering high quality modules of gene products within highly interconnected biological networks. The algorithm allows a high degree of flexibility and user-customisation of results with few adjustable parameters. MINE outperforms several popular clustering algorithms in identifying modules with high modularity and obtains good overall recall and precision of functional annotations in protein-protein interaction networks from both S. cerevisiae and C. elegans. PMID:21605434
Semantic integration of data on transcriptional regulation
Baitaluk, Michael; Ponomarenko, Julia
2010-01-01
Motivation: Experimental and predicted data concerning gene transcriptional regulation are distributed among many heterogeneous sources. However, there are no resources to integrate these data automatically or to provide a ‘one-stop shop’ experience for users seeking information essential for deciphering and modeling gene regulatory networks. Results: IntegromeDB, a semantic graph-based ‘deep-web’ data integration system that automatically captures, integrates and manages publicly available data concerning transcriptional regulation, as well as other relevant biological information, is proposed in this article. The problems associated with data integration are addressed by ontology-driven data mapping, multiple data annotation and heterogeneous data querying, also enabling integration of the user's data. IntegromeDB integrates over 100 experimental and computational data sources relating to genomics, transcriptomics, genetics, and functional and interaction data concerning gene transcriptional regulation in eukaryotes and prokaryotes. Availability: IntegromeDB is accessible through the integrated research environment BiologicalNetworks at http://www.BiologicalNetworks.org Contact: baitaluk@sdsc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20427517
Kutejova, Eva; Sasai, Noriaki; Shah, Ankita; Gouti, Mina; Briscoe, James
2016-03-21
In the vertebrate neural tube, a morphogen-induced transcriptional network produces multiple molecularly distinct progenitor domains, each generating different neuronal subtypes. Using an in vitro differentiation system, we defined gene expression signatures of distinct progenitor populations and identified direct gene-regulatory inputs corresponding to locations of specific transcription factor binding. Combined with targeted perturbations of the network, this revealed a mechanism in which a progenitor identity is installed by active repression of the entire transcriptional programs of other neural progenitor fates. In the ventral neural tube, sonic hedgehog (Shh) signaling, together with broadly expressed transcriptional activators, concurrently activates the gene expression programs of several domains. The specific outcome is selected by repressive input provided by Shh-induced transcription factors that act as the key nodes in the network, enabling progenitors to adopt a single definitive identity from several initially permitted options. Together, the data suggest design principles relevant to many developing tissues. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Semantic integration of data on transcriptional regulation.
Baitaluk, Michael; Ponomarenko, Julia
2010-07-01
Experimental and predicted data concerning gene transcriptional regulation are distributed among many heterogeneous sources. However, there are no resources to integrate these data automatically or to provide a 'one-stop shop' experience for users seeking information essential for deciphering and modeling gene regulatory networks. IntegromeDB, a semantic graph-based 'deep-web' data integration system that automatically captures, integrates and manages publicly available data concerning transcriptional regulation, as well as other relevant biological information, is proposed in this article. The problems associated with data integration are addressed by ontology-driven data mapping, multiple data annotation and heterogeneous data querying, also enabling integration of the user's data. IntegromeDB integrates over 100 experimental and computational data sources relating to genomics, transcriptomics, genetics, and functional and interaction data concerning gene transcriptional regulation in eukaryotes and prokaryotes. IntegromeDB is accessible through the integrated research environment BiologicalNetworks at http://www.BiologicalNetworks.org. Contact: baitaluk@sdsc.edu. Supplementary data are available at Bioinformatics online.
Inhibition-Based Biomarkers for Autism Spectrum Disorder.
Levin, April R; Nelson, Charles A
2015-07-01
Autism spectrum disorder (ASD) is a behaviorally defined and heterogeneous disorder. Biomarkers for ASD offer the opportunity to improve prediction, diagnosis, stratification by severity and subtype, monitoring over time and in response to interventions, and overall understanding of the underlying biology of this disorder. A variety of potential biomarkers, from the level of genes and proteins to network-level interactions, is currently being examined. Many of these biomarkers relate to inhibition, which is of particular interest because in many cases ASD is thought to be a disorder of imbalance between excitation and inhibition. Abnormalities in inhibition at the cellular level lead to emergent properties in networks of neurons. These properties take into account a more complete genetic and cellular background than findings at the level of individual genes or cells, and are able to be measured in live humans, offering additional potential as diagnostic biomarkers and predictors of behaviors. In this review we provide examples of how altered inhibition may inform the search for ASD biomarkers at multiple levels, from genes to cells to networks.
Kiefer, Christiane; Koch, Marcus A.
2012-01-01
74 of the currently accepted 111 taxa of the North American genus Boechera (Brassicaceae) were subjected to phylogenetic reconstruction and network analysis. The dataset comprised 911 accessions for which ITS sequences were analyzed. Phylogenetic analyses yielded largely unresolved trees. Together with the network analysis confirming this result, this can be interpreted as an indication of multiple, independent, and rapid diversification events. Network analyses were superimposed with datasets describing i) geographical distribution, ii) taxonomy, iii) reproductive mode, and iv) distribution history based on phylogeographic evidence. Our results provide first direct evidence for enormous reticulate evolution in the entire genus and give further insights into the evolutionary history of this complex genus on a continental scale. In addition, two novel single-copy gene markers, orthologues of the Arabidopsis thaliana genes At2g25920 and At3g18900, were analyzed for subsets of taxa and confirmed the findings obtained through the ITS data. PMID:22606266
Benitez, Cecil M.; Qu, Kun; Sugiyama, Takuya; Pauerstein, Philip T.; Liu, Yinghua; Tsai, Jennifer; Gu, Xueying; Ghodasara, Amar; Arda, H. Efsun; Zhang, Jiajing; Dekker, Joseph D.; Tucker, Haley O.; Chang, Howard Y.; Kim, Seung K.
2014-01-01
The regulatory logic underlying global transcriptional programs controlling development of visceral organs like the pancreas remains undiscovered. Here, we profiled gene expression in 12 purified populations of fetal and adult pancreatic epithelial cells representing crucial progenitor cell subsets, and their endocrine or exocrine progeny. Using probabilistic models to decode the general programs organizing gene expression, we identified co-expressed gene sets in cell subsets that revealed patterns and processes governing progenitor cell development, lineage specification, and endocrine cell maturation. Purification of Neurog3 mutant cells and module network analysis linked established regulators such as Neurog3 to unrecognized gene targets and roles in pancreas development. Iterative module network analysis nominated and prioritized transcriptional regulators, including diabetes risk genes. Functional validation of a subset of candidate regulators with corresponding mutant mice revealed that the transcription factors Etv1, Prdm16, Runx1t1 and Bcl11a are essential for pancreas development. Our integrated approach provides a unique framework for identifying regulatory genes and functional gene sets underlying pancreas development and associated diseases such as diabetes mellitus. PMID:25330008
Insights into TREM2 biology by network analysis of human brain gene expression data
Forabosco, Paola; Ramasamy, Adaikalavan; Trabzuni, Daniah; Walker, Robert; Smith, Colin; Bras, Jose; Levine, Adam P.; Hardy, John; Pocock, Jennifer M.; Guerreiro, Rita; Weale, Michael E.; Ryten, Mina
2013-01-01
Rare variants in TREM2 cause susceptibility to late-onset Alzheimer's disease. Here we use microarray-based expression data generated from 101 neuropathologically normal individuals and covering 10 brain regions, including the hippocampus, to understand TREM2 biology in human brain. Using network analysis, we detect a highly preserved TREM2-containing module in human brain, show that it relates to microglia, and demonstrate that TREM2 is a hub gene in 5 brain regions, including the hippocampus, suggesting that it can drive module function. Using enrichment analysis we show significant overrepresentation of genes implicated in the adaptive and innate immune system. Inspection of genes with the highest connectivity to TREM2 suggests that it plays a key role in mediating changes in the microglial cytoskeleton necessary not only for phagocytosis, but also migration. Most importantly, we show that the TREM2-containing module is significantly enriched for genes genetically implicated in Alzheimer's disease, multiple sclerosis, and motor neuron disease, implying that these diseases share common pathways centered on microglia and that among the genes identified are possible new disease-relevant genes. PMID:23855984
Li, Cheng-Wei; Wang, Wen-Hsin; Chen, Bor-Sen
2016-01-01
Aging is an inevitable part of life for humans, and slowing down the aging process has become a main focus of human endeavor. Here, we applied a systems biology approach to construct protein-protein interaction networks, gene regulatory networks, and epigenetic networks, i.e. genetic and epigenetic networks (GENs), of elderly individuals and young controls. We then compared these GENs to extract aging mechanisms using microarray data in peripheral blood mononuclear cells, microRNA (miRNA) data, and database mining. The core GENs of elderly individuals and young controls were obtained by applying principal network projection to GENs based on Principal Component Analysis. By comparing the core networks, we identified that, to overcome the accumulated mutation of genes in the aging process, the transcription factor JUN can be activated by stress signals, including the MAPK signaling, T-cell receptor signaling, and neurotrophin signaling pathways, through DNA methylation of BTG3, G0S2, and AP2B1 and the regulations of mir-223, let-7d, and mir-130a. We also address the aging mechanisms in old men and women. Furthermore, we proposed that drugs designed to target these DNA methylated genes or miRNAs may delay aging. A multiple drug combination comprising phenylalanine, cholesterol, and palbociclib was finally designed for delaying the aging process. PMID:26895224
P³DB 3.0: From plant phosphorylation sites to protein networks.
Yao, Qiuming; Ge, Huangyi; Wu, Shangquan; Zhang, Ning; Chen, Wei; Xu, Chunhui; Gao, Jianjiong; Thelen, Jay J; Xu, Dong
2014-01-01
In the past few years, the Plant Protein Phosphorylation Database (P³DB, http://p3db.org) has become one of the most significant in vivo data resources for studying plant phosphoproteomics. We have substantially updated P³DB with respect to format, new datasets and analytic tools. In P³DB 3.0, there are altogether 47 923 phosphosites in 16 477 phosphoproteins curated across nine plant organisms from 32 studies, which have met our multiple quality standards for acquisition of in vivo phosphorylation site data. Centered on these phosphorylation data, multiple related data and annotations are provided, including protein-protein interaction (PPI), gene ontology, protein tertiary structures, orthologous sequences, kinase/phosphatase classification and Kinase Client Assay (KiC Assay) data--all of which provide context for the phosphorylation event. In addition, P³DB 3.0 incorporates multiple network viewers for the above features, such as PPI network, kinase-substrate network, phosphatase-substrate network, and domain co-occurrence network to help study phosphorylation from a systems point of view. Furthermore, the new P³DB reflects a community-based design through which users can share datasets and automate data depository processes for publication purposes. Each of these new features supports the goal of making P³DB a comprehensive, systematic and interactive platform for phosphoproteomics research.
Raelson, John V; Little, Randall D; Ruether, Andreas; Fournier, Hélène; Paquin, Bruno; Van Eerdewegh, Paul; Bradley, W E C; Croteau, Pascal; Nguyen-Huu, Quynh; Segal, Jonathan; Debrus, Sophie; Allard, René; Rosenstiel, Philip; Franke, Andre; Jacobs, Gunnar; Nikolaus, Susanna; Vidal, Jean-Michel; Szego, Peter; Laplante, Nathalie; Clark, Hilary F; Paulussen, René J; Hooper, John W; Keith, Tim P; Belouchi, Abdelmajid; Schreiber, Stefan
2007-09-11
Genome-wide association (GWA) studies offer a powerful unbiased method for the identification of multiple susceptibility genes for complex diseases. Here we report the results of a GWA study for Crohn's disease (CD) using family trios from the Quebec Founder Population (QFP). Haplotype-based association analyses identified multiple regions associated with the disease that met the criteria for genome-wide significance, with many containing a gene whose function appears relevant to CD. A proportion of these were replicated in two independent German Caucasian samples, including the established CD loci NOD2 and IBD5. The recently described IL23R locus was also identified and replicated. For this region, multiple individuals with all major haplotypes in the QFP were sequenced and extensive fine mapping performed to identify risk and protective alleles. Several additional loci, including a region on 3p21 containing several plausible candidate genes, a region near JAKMIP1 on 4p16.1, and two larger regions on chromosome 17 were replicated. Together with previously published loci, the spectrum of CD genes identified to date involves biochemical networks that affect epithelial defense mechanisms, innate and adaptive immune response, and the repair or remodeling of tissue.
Chakraborty, Chiranjib; Bandyopadhyay, Sanghamitra; Doss, C George Priya; Agoramoorthy, Govindasamy
2015-04-01
Maturity onset diabetes of the young (MODY) is a metabolic and genetic disorder. It is different from type 1 and type 2 diabetes, with a low occurrence level (1-2%) among all diabetes. This disorder is a consequence of β-cell dysfunction. To date, 11 subtypes of MODY have been identified, and all of them can cause gene mutations. However, very little is known about the gene mapping, molecular phylogenetics, and co-expression among MODY genes and networking between cascades. This study used the latest servers and software, such as VarioWatch, ClustalW, MUSCLE, G Blocks, Phylogeny.fr, iTOL, WebLogo, STRING, and KEGG PATHWAY, to perform comprehensive analyses of gene mapping, multiple sequence alignment, molecular phylogenetics, protein-protein network design, co-expression analysis of MODY genes, and pathway development. The MODY genes are located on chromosomes 2, 7, 8, 9, 11, 12, 13, 17, and 20. Highly aligned block shows Pro, Gly, Leu, Arg, and Pro residues are highly aligned in the positions of 296, 386, 437, 455, 456 and 598, respectively. Alignment scores show that the HNF1A and HNF1B proteins have high sequence similarity among MODY proteins. Protein-protein network design shows that HNF1A, HNF1B, HNF4A, NEUROD1, PDX1, PAX4, INS, and GCK are strongly connected, and the co-expression analyses between MODY genes also show a distinct association between the HNF1A and HNF4A genes. This study used the latest bioinformatics tools to develop a rapid method to assess the evolutionary relationship, the network development, and the associations among eleven MODY genes and cascades. The prediction of sequence conservation, molecular phylogenetics, protein-protein network and the association between the MODY cascades enhances opportunities to gain more insight into the less-known MODY disease.
Detection of multiple perturbations in multi-omics biological networks.
Griffin, Paula J; Zhang, Yuqing; Johnson, William Evan; Kolaczyk, Eric D
2018-05-17
Cellular mechanism-of-action is of fundamental concern in many biological studies. It is of particular interest for identifying the cause of disease and learning the way in which treatments act against disease. However, pinpointing such mechanisms is difficult, due to the fact that small perturbations to the cell can have wide-ranging downstream effects. Given a snapshot of cellular activity, it can be challenging to tell where a disturbance originated. The presence of an ever-greater variety of high-throughput biological data offers an opportunity to examine cellular behavior from multiple angles, but also presents the statistical challenge of how to effectively analyze data from multiple sources. In this setting, we propose a method for mechanism-of-action inference by extending network filtering to multi-attribute data. We first estimate a joint Gaussian graphical model across multiple data types using penalized regression and filter for network effects. We then apply a set of likelihood ratio tests to identify the most likely site of the original perturbation. In addition, we propose a conditional testing procedure to allow for detection of multiple perturbations. We demonstrate this methodology on paired gene expression and methylation data from The Cancer Genome Atlas (TCGA). © 2018, The International Biometric Society.
Multiple coupled landscapes and non-adiabatic dynamics with applications to self-activating genes.
Chen, Cong; Zhang, Kun; Feng, Haidong; Sasai, Masaki; Wang, Jin
2015-11-21
Many physical, chemical and biochemical systems (e.g. electronic dynamics and gene regulatory networks) are governed by continuous stochastic processes (e.g. electron dynamics on a particular electronic energy surface and protein (gene product) synthesis) coupled with discrete processes (e.g. hopping among different electronic energy surfaces and on and off switching of genes). One can also think of the underlying dynamics as the continuous motion on a particular landscape and discrete hoppings among different landscapes. The main difference of such systems from the intra-landscape dynamics alone is the emergence of the timescale involved in transitions among different landscapes in addition to the timescale involved in a particular landscape. The adiabatic limit, when inter-landscape hoppings are fast compared to continuous intra-landscape dynamics, has been studied both analytically and numerically, but the analytical treatment of the non-adiabatic regime, where the inter-landscape hoppings are slow or comparable to continuous intra-landscape dynamics, remains challenging. In this study, we show that there exists a mathematical mapping of the dynamics on 2^N discretely coupled N-dimensional continuous landscapes onto one single landscape in 2N-dimensional extended continuous space. On this 2N-dimensional landscape, eddy current emerges as a sign of non-equilibrium non-adiabatic dynamics and plays an important role in system evolution. Many interesting physical effects such as the enhancement of fluctuations, irreversibility, dissipation and optimal kinetics emerge due to non-adiabaticity manifested by the eddy current illustrated for an N = 1 self-activator. We further generalize our theory to the N-gene network with multiple binding sites and multiple synthesis rates for discretely coupled non-equilibrium stochastic physical and biological systems.
Zainudin, Suhaila; Arif, Shereena M.
2017-01-01
Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods has been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, discussion of specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by past studies were not specifically geared towards proving the ability of GRN prediction methods to avoid cascade errors. Hence, this research aims to propose Multiple Linear Regression (MLR) to infer GRN from gene expression data and to avoid wrongly inferring an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations of the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure proposed in this work. The experiment revealed that the number of cascade errors was very small. Apart from that, the Belsley collinearity test proved that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5. PMID:28250767
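The cascade-error idea described in this abstract can be illustrated with a small sketch. It is not the authors' procedure: the simulated cascade (A → B → C), the noise levels, and the helper names `solve` and `fit_mlr` are all assumptions made for illustration. The sketch fits ordinary least squares twice, once with A alone and once with A and B together, to show how including the intermediate gene B can shrink the apparent direct effect of A on C.

```python
import random

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_mlr(predictors, response):
    """Ordinary least squares with intercept; returns [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(p) for p in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve(xtx, xty)

# Simulated cascade (assumed for illustration): A drives B, and B drives C.
rng = random.Random(0)
gene_a = [rng.gauss(0, 1) for _ in range(200)]
gene_b = [0.8 * a + rng.gauss(0, 0.5) for a in gene_a]
gene_c = [1.5 * b + rng.gauss(0, 0.1) for b in gene_b]

# Regressing C on A alone suggests a strong (but indirect) A -> C effect.
_, slope_a_alone = fit_mlr([gene_a], gene_c)
# Regressing C on A and B jointly attributes the effect to B instead.
_, coef_a, coef_b = fit_mlr([gene_a, gene_b], gene_c)
```

In this toy setting the coefficient of A in the joint fit is close to zero while the coefficient of B is close to the simulated value of 1.5, which is the behaviour a method needs in order to avoid reporting A → C as a direct edge.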
A Network of Genes Antagonistic to the LIN-35 Retinoblastoma Protein of Caenorhabditis elegans
Polley, Stanley R. G.; Fay, David S.
2012-01-01
The Caenorhabditis elegans pRb ortholog, LIN-35, functions in a wide range of cellular and developmental processes. This includes a role of LIN-35 in nutrient utilization by the intestine, which it carries out redundantly with SLR-2, a zinc-finger protein. This and other redundant functions of LIN-35 were identified in genetic screens for mutations that display synthetic phenotypes in conjunction with loss of lin-35. To explore the intestinal role of LIN-35, we conducted a genome-wide RNA-interference-feeding screen for suppressors of lin-35; slr-2 early larval arrest. Of the 26 suppressors identified, 17 fall into three functional classes: (1) ribosome biogenesis genes, (2) mitochondrial prohibitins, and (3) chromatin regulators. Further characterization indicates that different categories of suppressors act through distinct molecular mechanisms. We also tested lin-35; slr-2 suppressors, as well as suppressors of the synthetic multivulval phenotype, to determine the spectrum of lin-35-synthetic phenotypes that could be suppressed following inhibition of these genes. We identified 19 genes, most of which are evolutionarily conserved, that can suppress multiple unrelated lin-35-synthetic phenotypes. Our study reveals a network of genes broadly antagonistic to LIN-35 as well as genes specific to the role of LIN-35 in intestinal and vulval development. Suppressors of multiple lin-35 phenotypes may be candidate targets for anticancer therapies. Moreover, screening for suppressors of phenotypically distinct synthetic interactions, which share a common altered gene, may prove to be a novel and effective approach for identifying genes whose activities are most directly relevant to the core functions of the shared gene. PMID:22542970
Misra, Sanchit; Pamnany, Kiran; Aluru, Srinivas
2015-01-01
Construction of whole-genome networks from large-scale gene expression data is an important problem in systems biology. While several techniques have been developed, most cannot handle network reconstruction at the whole-genome scale, and the few that can, require large clusters. In this paper, we present a solution on the Intel Xeon Phi coprocessor, taking advantage of its multi-level parallelism including many x86-based cores, multiple threads per core, and vector processing units. We also present a solution on the Intel® Xeon® processor. Our solution is based on TINGe, a fast parallel network reconstruction technique that uses mutual information and permutation testing for assessing statistical significance. We demonstrate the first ever inference of a plant whole genome regulatory network on a single chip by constructing a 15,575 gene network of the plant Arabidopsis thaliana from 3,137 microarray experiments in only 22 minutes. In addition, our optimization for parallelizing mutual information computation on the Intel Xeon Phi coprocessor holds out lessons that are applicable to other domains.
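The combination of mutual information and permutation testing mentioned in this abstract can be sketched as follows. This is not TINGe and makes no claim about its estimator or its parallelization: the equal-width binning, the plug-in estimate, the function names, and the toy data are assumptions chosen to keep the example short.

```python
import math
import random
from collections import Counter

def discretize(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=3):
    """Plug-in mutual information (in nats) computed from binned values."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    n = len(dx)
    pxy = Counter(zip(dx, dy))
    px, py = Counter(dx), Counter(dy)
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )

def permutation_p_value(x, y, n_perm=200, bins=3, seed=0):
    """Compare the observed MI with MI values obtained after shuffling y."""
    rng = random.Random(seed)
    observed = mutual_information(x, y, bins)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(x, shuffled, bins) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value away from exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (assumed): gene_b is a deterministic function of gene_a.
gene_a = [float(i % 9) for i in range(45)]
gene_b = [2.0 * v + 1.0 for v in gene_a]
mi, p = permutation_p_value(gene_a, gene_b)
```

An edge between two genes would be kept only when the permutation p-value falls below a chosen threshold; at whole-genome scale this test is repeated for every gene pair, which is where the parallelism described in the abstract matters.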
Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling
Li, Xia; Rao, Shaoqi; Jiang, Wei; Li, Chuanxing; Xiao, Yun; Guo, Zheng; Zhang, Qingpu; Wang, Lihong; Du, Lei; Li, Jing; Li, Li; Zhang, Tianwen; Wang, Qing K
2006-01-01
Background: It is one of the ultimate goals for modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, such as cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasingly available, which provides a golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results: In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion: We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets.
In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex gene regulations related to the development, aging and progressive pathogenesis of a complex disease where potential dependences between different experiment units might occur. PMID:16420705
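The notion of a delayed correlation between two expression time courses can be shown with a minimal sketch. It is not the TdGRN toolbox and does not reproduce its decision analysis; the function names `pearson` and `best_lag` and the toy series are assumptions for illustration only.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def best_lag(regulator, target, max_lag):
    """Return (lag, r) maximizing |r| between regulator[t] and target[t + lag]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = regulator[: len(regulator) - lag]  # drop the last `lag` points
        y = target[lag:]                       # drop the first `lag` points
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Toy example (assumed): the target repeats the regulator two time points later.
regulator = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 1.0, 0.0, 2.0]
target = [9.0, 9.0] + regulator[:-2]
lag, r = best_lag(regulator, target, max_lag=3)
```

Here the strongest correlation appears at a shift of two time points, which is the kind of signal a time-delayed network method would interpret as a candidate delayed regulation; with short real time courses, each additional lag leaves fewer overlapping points, so the maximum lag should stay small.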
Analysis of the SOS response of Vibrio and other bacteria with multiple chromosomes
2012-01-01
Background: The SOS response is a well-known regulatory network present in most bacteria and aimed at addressing DNA damage. It has also been linked extensively to stress-induced mutagenesis, virulence and the emergence and dissemination of antibiotic resistance determinants. Recently, the SOS response has been shown to regulate the activity of integrases in the chromosomal superintegrons of the Vibrionaceae, which encompasses a wide range of pathogenic species harboring multiple chromosomes. Here we combine in silico and in vitro techniques to perform a comparative genomics analysis of the SOS regulon in the Vibrionaceae, and we extend the methodology to map this transcriptional network in other bacterial species harboring multiple chromosomes. Results: Our analysis provides the first comprehensive description of the SOS response in a family (Vibrionaceae) that includes major human pathogens. It also identifies several previously unreported members of the SOS transcriptional network, including two proteins of unknown function. The analysis of the SOS response in other bacterial species with multiple chromosomes uncovers additional regulon members and reveals that there is a conserved core of SOS genes, and that specialized additions to this basic network take place in different phylogenetic groups. Our results also indicate that across all groups the main elements of the SOS response are always found in the large chromosome, whereas specialized additions are found in the smaller chromosomes and plasmids. Conclusions: Our findings confirm that the SOS response of the Vibrionaceae is strongly linked with pathogenicity and dissemination of antibiotic resistance, and suggest that the characterization of the newly identified members of this regulon could provide key insights into the pathogenesis of Vibrio.
The persistent location of key SOS genes in the large chromosome across several bacterial groups confirms that the SOS response plays an essential role in these organisms and sheds light on the mechanisms of evolution of global transcriptional networks involved in adaptability and rapid response to environmental changes, suggesting that small chromosomes may act as evolutionary test beds for the rewiring of transcriptional networks. PMID:22305460
Integrative Functional Genomics for Systems Genetics in GeneWeaver.org.
Bubier, Jason A; Langston, Michael A; Baker, Erich J; Chesler, Elissa J
2017-01-01
The abundance of existing functional genomics studies permits an integrative approach to interpreting and resolving the results of diverse systems genetics studies. However, a major challenge lies in assembling and harmonizing heterogeneous data sets across species for facile comparison to the positional candidate genes and coexpression networks that come from systems genetic studies. GeneWeaver is an online database and suite of tools at www.geneweaver.org that allows for fast aggregation and analysis of gene set-centric data. GeneWeaver contains curated experimental data together with resource-level data such as GO annotations, MP annotations, and KEGG pathways, along with persistent stores of user-entered data sets. These can be entered directly into GeneWeaver or transferred from widely used resources such as GeneNetwork.org. Data are analyzed using statistical tools and advanced graph algorithms to discover new relations, prioritize candidate genes, and generate function hypotheses. Here we use GeneWeaver to find genes common to multiple gene sets, prioritize candidate genes from a quantitative trait locus, and characterize a set of differentially expressed genes. Coupling a large multispecies repository of curated and empirical functional genomics data to fast computational tools allows for the rapid integrative analysis of heterogeneous data for interpreting and extrapolating systems genetics results.
Wang, Jinglu; Qu, Susu; Wang, Weixiao; Guo, Liyuan; Zhang, Kunlin; Chang, Suhua; Wang, Jing
2016-11-01
A number of gene expression profiling studies of bipolar disorder have been published. Besides different array chips and tissues, the variety of data processing procedures in different cohorts aggravated the inconsistency of results of these genome-wide gene expression profiling studies. By searching the gene expression databases, we obtained six data sets for prefrontal cortex (PFC) of bipolar disorder with raw data and combinable platforms. We used standardized pre-processing and quality control procedures to analyze each data set separately and then combined them into a large gene expression matrix with 101 bipolar disorder subjects and 106 controls. A standard linear mixed-effects model was used to calculate the differentially expressed genes (DEGs). Multiple levels of sensitivity analyses and cross validation with genetic data were conducted. Functional and network analyses were carried out on the basis of the DEGs. As a result, we identified 198 unique differentially expressed genes in the PFC of bipolar disorder and control. Among them, 115 DEGs were robust to at least three leave-one-out tests or different pre-processing methods; 51 DEGs were validated with genetic association signals. Pathway enrichment analysis showed these DEGs were related to regulation of the neurological system, cell death and apoptosis, and several basic binding processes. Protein-protein interaction network analysis further identified one key hub gene. We have contributed the most comprehensive integrated analysis of bipolar disorder expression profiling studies in PFC to date. The DEGs, especially those with multiple validations, may denote a common signature of bipolar disorder and contribute to the pathogenesis of disease. Copyright © 2016 Elsevier Ltd. All rights reserved.
Hara, Toshifumi; Jones, Matthew F.; Subramanian, Murugan; Li, Xiao Ling; Ou, Oliver; Zhu, Yuelin; Yang, Yuan; Wakefield, Lalage M.; Hussain, S. Perwez; Gaedcke, Jochen; Ried, Thomas; Luo, Ji; Caplen, Natasha J.; Lal, Ashish
2014-01-01
MicroRNAs (miRNAs) regulate the expression of hundreds of genes. However, identifying the critical targets within a miRNA-regulated gene network is challenging. One approach is to identify miRNAs that exert a context-dependent effect, followed by expression profiling to determine how specific targets contribute to this selective effect. In this study, we performed miRNA mimic screens in isogenic KRAS-Wild-type (WT) and KRAS-Mutant colorectal cancer (CRC) cell lines to identify miRNAs selectively targeting KRAS-Mutant cells. One of the miRNAs we identified as a selective inhibitor of the survival of multiple KRAS-Mutant CRC lines was miR-126. In KRAS-Mutant cells, miR-126 over-expression increased the G1 compartment, inhibited clonogenicity and tumorigenicity, while exerting no effect on KRAS-WT cells. Unexpectedly, the miR-126-regulated transcriptome of KRAS-WT and KRAS-Mutant cells showed no significant differences. However, by analyzing the overlap between miR-126 targets with the synthetic lethal genes identified by RNAi in KRAS-Mutant cells, we identified and validated a subset of miR-126-regulated genes selectively required for the survival and clonogenicity of KRAS-Mutant cells. Our strategy therefore identified critical target genes within the miR-126-regulated gene network. We propose that the selective effect of miR-126 on KRAS-Mutant cells could be utilized for the development of targeted therapy for KRAS mutant tumors. PMID:25245095
TargetCompare: A web interface to compare simultaneous miRNAs targets
Moreira, Fabiano Cordeiro; Dustan, Bruno; Hamoy, Igor G; Ribeiro-dos-Santos, André M; dos Santos, Ândrea Ribeiro
2014-01-01
MicroRNAs (miRNAs) are small non-coding nucleotide sequences between 17 and 25 nucleotides in length that primarily function in the regulation of gene expression. A single miRNA has thousands of predicted targets in a complex, regulatory cell signaling network. Therefore, it is of interest to study multiple target genes simultaneously. Hence, we describe a web tool (developed using the Java programming language and a MySQL database server) to analyse multiple targets of pre-selected miRNAs. We cross-validated the tool on the eight most highly expressed miRNAs in the antrum region of the stomach. This helped to identify 43 potential genes that are targets of at least six of the referred miRNAs. The developed tool aims to reduce the randomness and increase the chance of selecting strong candidate target genes and miRNAs responsible for playing important roles in the studied tissue. Availability: http://lghm.ufpa.br/targetcompare PMID:25352731
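The selection step described in this abstract, keeping genes that are predicted targets of at least a given number of the chosen miRNAs, can be sketched in a few lines. This is not the TargetCompare tool, which the abstract describes as a Java and MySQL web application; the function name `shared_targets` and all identifiers in the toy data are hypothetical placeholders.

```python
from collections import Counter

def shared_targets(targets_by_mirna, min_mirnas):
    """Return genes predicted as targets of at least `min_mirnas` miRNAs."""
    counts = Counter(
        gene for genes in targets_by_mirna.values() for gene in set(genes)
    )
    return sorted(g for g, c in counts.items() if c >= min_mirnas)

# Hypothetical predictions; identifiers are placeholders, not real data.
predictions = {
    "miR-A": ["GENE1", "GENE2", "GENE3"],
    "miR-B": ["GENE2", "GENE3"],
    "miR-C": ["GENE3", "GENE4"],
}
common = shared_targets(predictions, min_mirnas=2)
```

With eight miRNAs and a threshold of six, the same call would reproduce the kind of filter the abstract reports, assuming the target predictions are available as one list per miRNA.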
The WRKY transcription factor family and senescence in switchgrass.
Rinerson, Charles I; Scully, Erin D; Palmer, Nathan A; Donze-Reiner, Teresa; Rabara, Roel C; Tripathi, Prateek; Shen, Qingxi J; Sattler, Scott E; Rohila, Jai S; Sarath, Gautam; Rushton, Paul J
2015-11-09
Early aerial senescence in switchgrass (Panicum virgatum) can significantly limit biomass yields. WRKY transcription factors that can regulate senescence could be used to reprogram senescence and enhance biomass yields. All potential WRKY genes present in the version 1.0 of the switchgrass genome were identified and curated using manual and bioinformatic methods. Expression profiles of WRKY genes in switchgrass flag leaf RNA-Seq datasets were analyzed using clustering and network analyses tools to identify both WRKY and WRKY-associated gene co-expression networks during leaf development and senescence onset. We identified 240 switchgrass WRKY genes including members of the RW5 and RW6 families of resistance proteins. Weighted gene co-expression network analysis of the flag leaf transcriptomes across development readily separated clusters of co-expressed genes into thirteen modules. A visualization highlighted separation of modules associated with the early and senescence-onset phases of flag leaf growth. The senescence-associated module contained 3000 genes including 23 WRKYs. Putative promoter regions of senescence-associated WRKY genes contained several cis-element-like sequences suggestive of responsiveness to both senescence and stress signaling pathways. A phylogenetic comparison of senescence-associated WRKY genes from switchgrass flag leaf with senescence-associated WRKY genes from other plants revealed notable hotspots in Group I, IIb, and IIe of the phylogenetic tree. We have identified and named 240 WRKY genes in the switchgrass genome. Twenty three of these genes show elevated mRNA levels during the onset of flag leaf senescence. Eleven of the WRKY genes were found in hotspots of related senescence-associated genes from multiple species and thus represent promising targets for future switchgrass genetic improvement. Overall, individual WRKY gene expression profiles could be readily linked to developmental stages of flag leaves.
Passing Messages between Biological Networks to Refine Predicted Interactions
Glass, Kimberly; Huttenhower, Curtis; Quackenbush, John; Yuan, Guo-Cheng
2013-01-01
Regulatory network reconstruction is a fundamental problem in computational biology. There are significant limitations to such reconstruction using individual datasets, and increasingly people attempt to construct networks using multiple, independent datasets obtained from complementary sources, but methods for this integration are lacking. We developed PANDA (Passing Attributes between Networks for Data Assimilation), a message-passing model using multiple sources of information to predict regulatory relationships, and used it to integrate protein-protein interaction, gene expression, and sequence motif data to reconstruct genome-wide, condition-specific regulatory networks in yeast as a model. The resulting networks were not only more accurate than those produced using individual data sets and other existing methods, but they also captured information regarding specific biological mechanisms and pathways that were missed using other methodologies. PANDA is scalable to higher eukaryotes, applicable to specific tissue or cell type data and conceptually generalizable to include a variety of regulatory, interaction, expression, and other genome-scale data. An implementation of the PANDA algorithm is available at www.sourceforge.net/projects/panda-net. PMID:23741402
Wang, James K. T.; Langfelder, Peter; Horvath, Steve; Palazzolo, Michael J.
2017-01-01
Huntington's disease (HD) is a progressive and autosomal dominant neurodegeneration caused by CAG expansion in the huntingtin gene (HTT), but the pathophysiological mechanism of mutant HTT (mHTT) remains unclear. To study HD using systems biological methodologies on all published data, we undertook the first comprehensive curation of two key PubMed HD datasets: perturbation genes that impact mHTT-driven endpoints and therefore are putatively linked causally to pathogenic mechanisms, and the protein interactome of HTT that reflects its biology. We perused PubMed articles containing co-citation of gene IDs and MeSH terms of interest to generate mechanistic gene sets for iterative enrichment analyses and rank ordering. The HD Perturbation database of 1,218 genes highly overlaps the HTT Interactome of 1,619 genes, suggesting links between normal HTT biology and mHTT pathology. These two HD datasets are enriched for protein networks of key genes underlying two mechanisms not previously implicated in HD nor in each other: exosome synaptic functions and homeostatic synaptic plasticity. Moreover, proteins, possibly including HTT, and miRNA detected in exosomes from a wide variety of sources also highly overlap the HD datasets, suggesting both mechanistic and biomarker links. Finally, the HTT Interactome highly intersects protein networks of pathogenic genes underlying Parkinson's, Alzheimer's and eight non-HD polyglutamine diseases, ALS, and spinal muscular atrophy. These protein networks in turn highly overlap the exosome and homeostatic synaptic plasticity gene sets. Thus, we hypothesize that HTT and other neurodegeneration pathogenic genes form a large interlocking protein network involved in exosome and homeostatic synaptic functions, particularly where the two mechanisms intersect. 
Mutant pathogenic proteins cause dysfunctions at distinct points in this network, each altering the two mechanisms in specific fashion that contributes to distinct disease pathologies, depending on the gene mutation and the cellular and biological context. This protein network is rich with drug targets, and exosomes may provide disease biomarkers, thus enabling drug discovery. All the curated datasets are made available for other investigators. Elucidating the roles of pathogenic neurodegeneration genes in exosome and homeostatic synaptic functions may provide a unifying framework for the age-dependent, progressive and tissue selective nature of multiple neurodegenerative diseases. PMID:28611571
Wang, James K T; Langfelder, Peter; Horvath, Steve; Palazzolo, Michael J
2017-01-01
Huntington's disease (HD) is a progressive and autosomal dominant neurodegeneration caused by CAG expansion in the huntingtin gene (HTT), but the pathophysiological mechanism of mutant HTT (mHTT) remains unclear. To study HD using systems biological methodologies on all published data, we undertook the first comprehensive curation of two key PubMed HD datasets: perturbation genes that impact mHTT-driven endpoints and therefore are putatively linked causally to pathogenic mechanisms, and the protein interactome of HTT that reflects its biology. We perused PubMed articles containing co-citation of gene IDs and MeSH terms of interest to generate mechanistic gene sets for iterative enrichment analyses and rank ordering. The HD Perturbation database of 1,218 genes highly overlaps the HTT Interactome of 1,619 genes, suggesting links between normal HTT biology and mHTT pathology. These two HD datasets are enriched for protein networks of key genes underlying two mechanisms not previously implicated in HD nor in each other: exosome synaptic functions and homeostatic synaptic plasticity. Moreover, proteins, possibly including HTT, and miRNA detected in exosomes from a wide variety of sources also highly overlap the HD datasets, suggesting both mechanistic and biomarker links. Finally, the HTT Interactome highly intersects protein networks of pathogenic genes underlying Parkinson's, Alzheimer's and eight non-HD polyglutamine diseases, ALS, and spinal muscular atrophy. These protein networks in turn highly overlap the exosome and homeostatic synaptic plasticity gene sets. Thus, we hypothesize that HTT and other neurodegeneration pathogenic genes form a large interlocking protein network involved in exosome and homeostatic synaptic functions, particularly where the two mechanisms intersect. 
Mutant pathogenic proteins cause dysfunctions at distinct points in this network, each altering the two mechanisms in specific fashion that contributes to distinct disease pathologies, depending on the gene mutation and the cellular and biological context. This protein network is rich with drug targets, and exosomes may provide disease biomarkers, thus enabling drug discovery. All the curated datasets are made available for other investigators. Elucidating the roles of pathogenic neurodegeneration genes in exosome and homeostatic synaptic functions may provide a unifying framework for the age-dependent, progressive and tissue selective nature of multiple neurodegenerative diseases.
Roy, Raktim; Shilpa, P Phani; Bagh, Sangram
2016-09-01
Bacteria are important organisms for space missions due to their increased pathogenesis in microgravity that poses risks to the health of astronauts and for projected synthetic biology applications at the space station. We understand little about the effect, at the molecular systems level, of microgravity on bacteria, despite their significant incidence. In this study, we proposed a systems biology pipeline and performed an analysis on published gene expression data sets from multiple seminal studies on Pseudomonas aeruginosa and Salmonella enterica serovar Typhimurium under spaceflight and simulated microgravity conditions. By applying gene set enrichment analysis on the global gene expression data, we directly identified a large number of new, statistically significant cellular and metabolic pathways involved in response to microgravity. Alteration of metabolic pathways in microgravity has rarely been reported before, whereas in this analysis metabolic pathways are prevalent. Several of those pathways were found to be common across studies and species, indicating a common cellular response in microgravity. We clustered genes based on their expression patterns using consensus non-negative matrix factorization. The genes from different mathematically stable clusters showed protein-protein association networks with distinct biological functions, suggesting the plausible functional or regulatory network motifs in response to microgravity. The newly identified pathways and networks showed connection with increased survival of pathogens within macrophages, virulence, and antibiotic resistance in microgravity. Our work establishes a systems biology pipeline and provides an integrated insight into the effect of microgravity at the molecular systems level. Systems biology-Microgravity-Pathways and networks-Bacteria. Astrobiology 16, 677-689.
A Single Multiplex crRNA Array for FnCpf1-Mediated Human Genome Editing.
Sun, Huihui; Li, Fanfan; Liu, Jie; Yang, Fayu; Zeng, Zhenhai; Lv, Xiujuan; Tu, Mengjun; Liu, Yeqing; Ge, Xianglian; Liu, Changbao; Zhao, Junzhao; Zhang, Zongduan; Qu, Jia; Song, Zongming; Gu, Feng
2018-06-15
Cpf1 has been harnessed as a tool for genome manipulation in various species because of its simplicity and high efficiency. Our recent study demonstrated that FnCpf1 could be utilized for human genome editing with notable advantages for target sequence selection due to the flexibility of the protospacer adjacent motif (PAM) sequence. Multiplex genome editing provides a powerful tool for targeting members of multigene families, dissecting gene networks, modeling multigenic disorders in vivo, and applying gene therapy. However, there are no reports at present that show FnCpf1-mediated multiplex genome editing via a single customized CRISPR RNA (crRNA) array. In the present study, we utilize a single customized crRNA array to simultaneously target multiple genes in human cells. In addition, we also demonstrate that a single customized crRNA array to target multiple sites in one gene could be achieved. Collectively, FnCpf1, a powerful genome-editing tool for multiple genomic targets, can be harnessed for effective manipulation of the human genome. Copyright © 2018 The American Society of Gene and Cell Therapy. Published by Elsevier Inc. All rights reserved.
Gong, Wuming; Koyano-Nakagawa, Naoko; Li, Tongbin; Garry, Daniel J
2015-03-07
Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoters or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p < 10^-100), suggesting that a high LR score is a reliable indicator for functional TF binding sites. Our network inference model identified a region with an elevated LR score approximately -9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (-9435 to -8922 bp). 
TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP-CM transitions. We report a novel method to systematically integrate multi-dimensional -omics data and reconstruct the gene regulatory networks. This method will allow one to rapidly determine the cis-modules that regulate key genes during cardiac differentiation.
Dufour, Yann S.; Donohue, Timothy J.
2015-01-01
Transcriptional regulation plays a significant role in the biological response of bacteria to changing environmental conditions. Therefore, mapping transcriptional regulatory networks is an important step not only in understanding how bacteria sense and interpret their environment but also to identify the functions involved in biological responses to specific conditions. Recent experimental and computational developments have facilitated the characterization of regulatory networks on a genome-wide scale in model organisms. In addition, the multiplication of complete genome sequences has encouraged comparative analyses to detect conserved regulatory elements and infer regulatory networks in other less well-studied organisms. However, transcription regulation appears to evolve rapidly, thus, creating challenges for the transfer of knowledge to nonmodel organisms. Nevertheless, the mechanisms and constraints driving the evolution of regulatory networks have been the subjects of numerous analyses, and several models have been proposed. Overall, the contributions of mutations, recombination, and horizontal gene transfer are complex. Finally, the rapid evolution of regulatory networks plays a significant role in the remarkable capacity of bacteria to adapt to new or changing environments. Conversely, the characteristics of environmental niches determine the selective pressures and can shape the structure of regulatory network accordingly. PMID:23046950
Deregulation of an imprinted gene network in prostate cancer
Ribarska, Teodora; Goering, Wolfgang; Droop, Johanna; Bastian, Klaus-Marius; Ingenwerth, Marc; Schulz, Wolfgang A
2014-01-01
Multiple epigenetic alterations contribute to prostate cancer progression by deregulating gene expression. Epigenetic mechanisms, especially differential DNA methylation at imprinting control regions (termed DMRs), normally ensure the exclusive expression of imprinted genes from one specific parental allele. We therefore wondered to which extent imprinted genes become deregulated in prostate cancer and, if so, whether deregulation is due to altered DNA methylation at DMRs. Therefore, we selected presumptive deregulated imprinted genes from a previously conducted in silico analysis and from the literature and analyzed their expression in prostate cancer tissues by qRT-PCR. We found significantly diminished expression of PLAGL1/ZAC1, MEG3, NDN, CDKN1C, IGF2, and H19, while LIT1 was significantly overexpressed. The PPP1R9A gene, which is imprinted in selected tissues only, was strongly overexpressed, but was expressed biallelically in benign and cancerous prostatic tissues. Expression of many of these genes was strongly correlated, suggesting co-regulation, as in an imprinted gene network (IGN) reported in mice. Deregulation of the network genes also correlated with EZH2 and HOXC6 overexpression. Pyrosequencing analysis of all relevant DMRs revealed generally stable DNA methylation between benign and cancerous prostatic tissues, but frequent hypo- and hyper-methylation was observed at the H19 DMR in both benign and cancerous tissues. Re-expression of the ZAC1 transcription factor induced H19, CDKN1C and IGF2, supporting its function as a nodal regulator of the IGN. Our results indicate that a group of imprinted genes are coordinately deregulated in prostate cancers, independently of DNA methylation changes. PMID:24513574
Deregulation of an imprinted gene network in prostate cancer.
Ribarska, Teodora; Goering, Wolfgang; Droop, Johanna; Bastian, Klaus-Marius; Ingenwerth, Marc; Schulz, Wolfgang A
2014-05-01
Multiple epigenetic alterations contribute to prostate cancer progression by deregulating gene expression. Epigenetic mechanisms, especially differential DNA methylation at imprinting control regions (termed DMRs), normally ensure the exclusive expression of imprinted genes from one specific parental allele. We therefore wondered to which extent imprinted genes become deregulated in prostate cancer and, if so, whether deregulation is due to altered DNA methylation at DMRs. Therefore, we selected presumptive deregulated imprinted genes from a previously conducted in silico analysis and from the literature and analyzed their expression in prostate cancer tissues by qRT-PCR. We found significantly diminished expression of PLAGL1/ZAC1, MEG3, NDN, CDKN1C, IGF2, and H19, while LIT1 was significantly overexpressed. The PPP1R9A gene, which is imprinted in selected tissues only, was strongly overexpressed, but was expressed biallelically in benign and cancerous prostatic tissues. Expression of many of these genes was strongly correlated, suggesting co-regulation, as in an imprinted gene network (IGN) reported in mice. Deregulation of the network genes also correlated with EZH2 and HOXC6 overexpression. Pyrosequencing analysis of all relevant DMRs revealed generally stable DNA methylation between benign and cancerous prostatic tissues, but frequent hypo- and hyper-methylation was observed at the H19 DMR in both benign and cancerous tissues. Re-expression of the ZAC1 transcription factor induced H19, CDKN1C and IGF2, supporting its function as a nodal regulator of the IGN. Our results indicate that a group of imprinted genes are coordinately deregulated in prostate cancers, independently of DNA methylation changes.
Interconnected network motifs control podocyte morphology and kidney function.
Azeloglu, Evren U; Hardy, Simon V; Eungdamrong, Narat John; Chen, Yibang; Jayaraman, Gomathi; Chuang, Peter Y; Fang, Wei; Xiong, Huabao; Neves, Susana R; Jain, Mohit R; Li, Hong; Ma'ayan, Avi; Gordon, Ronald E; He, John Cijiang; Iyengar, Ravi
2014-02-04
Podocytes are kidney cells with specialized morphology that is required for glomerular filtration. Diseases, such as diabetes, or drug exposure that causes disruption of the podocyte foot process morphology results in kidney pathophysiology. Proteomic analysis of glomeruli isolated from rats with puromycin-induced kidney disease and control rats indicated that protein kinase A (PKA), which is activated by adenosine 3',5'-monophosphate (cAMP), is a key regulator of podocyte morphology and function. In podocytes, cAMP signaling activates cAMP response element-binding protein (CREB) to enhance expression of the gene encoding a differentiation marker, synaptopodin, a protein that associates with actin and promotes its bundling. We constructed and experimentally verified a β-adrenergic receptor-driven network with multiple feedback and feedforward motifs that controls CREB activity. To determine how the motifs interacted to regulate gene expression, we mapped multicompartment dynamical models, including information about protein subcellular localization, onto the network topology using Petri net formalisms. These computational analyses indicated that the juxtaposition of multiple feedback and feedforward motifs enabled the prolonged CREB activation necessary for synaptopodin expression and actin bundling. Drug-induced modulation of these motifs in diseased rats led to recovery of normal morphology and physiological function in vivo. Thus, analysis of regulatory motifs using network dynamics can provide insights into pathophysiology that enable predictions for drug intervention strategies to treat kidney disease.
Interconnected Network Motifs Control Podocyte Morphology and Kidney Function
Azeloglu, Evren U.; Hardy, Simon V.; Eungdamrong, Narat John; Chen, Yibang; Jayaraman, Gomathi; Chuang, Peter Y.; Fang, Wei; Xiong, Huabao; Neves, Susana R.; Jain, Mohit R.; Li, Hong; Ma’ayan, Avi; Gordon, Ronald E.; He, John Cijiang; Iyengar, Ravi
2014-01-01
Podocytes are kidney cells with specialized morphology that is required for glomerular filtration. Diseases, such as diabetes, or drug exposure that causes disruption of the podocyte foot process morphology results in kidney pathophysiology. Proteomic analysis of glomeruli isolated from rats with puromycin-induced kidney disease and control rats indicated that protein kinase A (PKA), which is activated by adenosine 3′,5′-monophosphate (cAMP), is a key regulator of podocyte morphology and function. In podocytes, cAMP signaling activates cAMP response element–binding protein (CREB) to enhance expression of the gene encoding a differentiation marker, synaptopodin, a protein that associates with actin and promotes its bundling. We constructed and experimentally verified a β-adrenergic receptor–driven network with multiple feedback and feedforward motifs that controls CREB activity. To determine how the motifs interacted to regulate gene expression, we mapped multicompartment dynamical models, including information about protein subcellular localization, onto the network topology using Petri net formalisms. These computational analyses indicated that the juxtaposition of multiple feedback and feedforward motifs enabled the prolonged CREB activation necessary for synaptopodin expression and actin bundling. Drug-induced modulation of these motifs in diseased rats led to recovery of normal morphology and physiological function in vivo. Thus, analysis of regulatory motifs using network dynamics can provide insights into pathophysiology that enable predictions for drug intervention strategies to treat kidney disease. PMID:24497609
Zhang, Jingpu; Zhang, Zuping; Wang, Zixiang; Liu, Yuting; Deng, Lei
2018-05-15
Long non-coding RNAs (lncRNAs) are an enormous collection of functional non-coding RNAs. Over the past decades, a large number of novel lncRNA genes have been identified. However, most of the lncRNAs remain functionally uncharacterized at present. Computational approaches provide a new insight to understand the potential functional implications of lncRNAs. Considering that each lncRNA may have multiple functions and a function may be further specialized into sub-functions, here we describe NeuraNetL2GO, a computational ontological function prediction approach for lncRNAs using a hierarchical multi-label classification strategy based on multiple neural networks. The neural networks are incrementally trained level by level, each performing the prediction of gene ontology (GO) terms belonging to a given level. In NeuraNetL2GO, we use topological features of the lncRNA similarity network as the input of the neural networks and employ the output results to annotate the lncRNAs. We show that NeuraNetL2GO achieves the best performance and the overall advantage in maximum F-measure and coverage on the manually annotated lncRNA2GO-55 dataset compared to other state-of-the-art methods. The source code and data are available at http://denglab.org/NeuraNetL2GO/. Contact: leideng@csu.edu.cn. Supplementary data are available at Bioinformatics online.
Systematic network assessment of the carcinogenic activities of cadmium
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Peizhan; Duan, Xiaohua; Li, Mian
Cadmium has been defined as a type I carcinogen for humans, but the underlying mechanisms of its carcinogenic activity and its influence on protein-protein interactions in cells are not fully elucidated. The aim of the current study was to evaluate, systematically, the carcinogenic activity of cadmium with systems biology approaches. From a literature search of 209 studies that were performed with cellular models, 208 proteins influenced by cadmium exposure were identified. All of these were assessed by Western blotting and were recognized as key nodes in network analyses. The protein-protein functional interaction networks were constructed with NetBox software and visualized with Cytoscape software. These cadmium-rewired genes were used to construct a scale-free, highly connected biological protein interaction network with 850 nodes and 8770 edges. Of the network, nine key modules were identified and 60 key signaling pathways, including the estrogen, RAS, PI3K-Akt, NF-κB, HIF-1α, Jak-STAT, and TGF-β signaling pathways, were significantly enriched. With breast cancer, colorectal and prostate cancer cellular models, we validated the key node genes in the network that had been previously reported or inferred from the network by Western blotting methods, including STAT3, JNK, p38, SMAD2/3, P65, AKT1, and HIF-1α. These results suggested the established network was robust and provided a systematic view of the carcinogenic activities of cadmium in humans. - Highlights: • A cadmium-influenced network with 850 nodes and 8770 edges was established. • The cadmium-rewired gene network was scale-free and highly connected. • Nine modules were identified, and 60 key signaling pathways related to cadmium-induced carcinogenesis were found. • Key mediators in the network were validated in multiple cellular models.
Peng, Wei; Lan, Wei; Zhong, Jiancheng; Wang, Jianxin; Pan, Yi
2017-07-15
MicroRNAs have been reported to have a close relationship with diseases due to their deregulation of the expression of target mRNAs. Detecting disease-related microRNAs is helpful for disease therapies. With the development of high-throughput experimental techniques, a large number of microRNAs have been sequenced. However, it is still a big challenge to identify which microRNAs are related to diseases. Recently, researchers have become interested in combining multiple types of biological information to identify the associations between microRNAs and diseases. In this work, we have proposed a novel method to predict microRNA-disease associations based on four biological properties: microRNA, disease, gene and environment factor. Compared with previous methods, our method makes predictions not only by using the prior knowledge of associations among microRNAs, diseases, environment factors and genes, but also by using the internal relationships among these biological properties. We constructed four biological networks based on the similarity of microRNAs, diseases, environment factors and genes, respectively. Then random walking was implemented on the four networks unequally. In the course of the walk, associations can be inferred from the neighbors in the same network. Meanwhile, the association information can be transferred from one network to another. The experimental results showed that our method achieved better prediction performance than other existing state-of-the-art methods. Copyright © 2017 Elsevier Inc. All rights reserved.
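The abstract does not give the update rule for its unequal walk over four networks, so the following is only a simplified illustration of the general idea on a single network: a standard random walk with restart, in which scores spread from seed nodes to their neighbors. The function name, parameters, and toy graph are assumptions made for this sketch, and the cross-network transfer step of the published method is not modeled.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5,
                             tol=1e-9, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    `adjacency` is a square list-of-lists of non-negative edge weights
    (hypothetical similarity network); `seeds` lists seed node indices.
    """
    n = len(adjacency)
    # Column sums are used to turn weights into transition probabilities.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    start = [0.0] * n
    for s in seeds:
        start[s] = 1.0 / len(seeds)
    scores = start[:]
    for _ in range(max_iter):
        updated = []
        for i in range(n):
            # Probability mass flowing into node i from its neighbors.
            inflow = sum(adjacency[i][j] / col_sums[j] * scores[j]
                         for j in range(n) if col_sums[j] > 0)
            updated.append((1 - restart) * inflow + restart * start[i])
        # Stop once the scores no longer change appreciably.
        if sum(abs(a - b) for a, b in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores


# Toy path graph 0 - 1 - 2 - 3 with node 0 as the only seed.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(path, seeds=[0])
```

On this toy graph the scores fall off with distance from the seed, which is the property such methods rely on when ranking candidate microRNAs for a disease by their proximity to known associations.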
Shared molecular networks in orofacial and neural tube development.
Kousa, Youssef A; Mansour, Tamer A; Seada, Haitham; Matoo, Samaneh; Schutte, Brian C
2017-01-30
Single genetic variants can affect multiple tissues during development. Thus it is possible that disruption of shared gene regulatory networks might underlie syndromic presentations. In this study, we explore this idea through examination of two critical developmental programs that control orofacial and neural tube development and identify shared regulatory factors and networks. Identification of these networks has the potential to yield additional candidate genes for poorly understood developmental disorders and assist in modeling and perhaps managing risk factors to prevent morbidity and mortality. We reviewed the literature to identify genes common between orofacial and neural tube defects and development. We then conducted a bioinformatic analysis to identify shared molecular targets and pathways in the development of these tissues. Finally, we examined publicly available RNA-Seq data to identify which of these genes are expressed in both tissues during development. We identify common regulatory factors in orofacial and neural tube development. Pathway enrichment analysis shows that folate, cancer and hedgehog signaling pathways are shared in neural tube and orofacial development. Developing neural tissues differentially express mouse exencephaly and cleft palate genes, whereas developing orofacial tissues were enriched for both clefting and neural tube defect genes. These data suggest that key developmental factors and pathways are shared between orofacial and neural tube defects. We conclude that it might be most beneficial to focus on common regulatory factors and pathways to better understand pathology and develop preventative measures for these birth defects. Birth Defects Research 109:169-179, 2017. © 2016 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Roy, Raktim; Phani Shilpa, P.; Bagh, Sangram
2016-09-01
Bacteria are important organisms for space missions due to their increased pathogenesis in microgravity that poses risks to the health of astronauts and for projected synthetic biology applications at the space station. We understand little about the effect, at the molecular systems level, of microgravity on bacteria, despite their significant incidence. In this study, we proposed a systems biology pipeline and performed an analysis on published gene expression data sets from multiple seminal studies on Pseudomonas aeruginosa and Salmonella enterica serovar Typhimurium under spaceflight and simulated microgravity conditions. By applying gene set enrichment analysis on the global gene expression data, we directly identified a large number of new, statistically significant cellular and metabolic pathways involved in response to microgravity. Alteration of metabolic pathways in microgravity has rarely been reported before, whereas in this analysis metabolic pathways are prevalent. Several of those pathways were found to be common across studies and species, indicating a common cellular response in microgravity. We clustered genes based on their expression patterns using consensus non-negative matrix factorization. The genes from different mathematically stable clusters showed protein-protein association networks with distinct biological functions, suggesting the plausible functional or regulatory network motifs in response to microgravity. The newly identified pathways and networks showed connection with increased survival of pathogens within macrophages, virulence, and antibiotic resistance in microgravity. Our work establishes a systems biology pipeline and provides an integrated insight into the effect of microgravity at the molecular systems level.
2013-01-01
Background The regenerative response of Schwann cells after peripheral nerve injury is a critical process directly related to the pathophysiology of a number of neurodegenerative diseases. This SC injury response is dependent on an intricate gene regulatory program coordinated by a number of transcription factors and microRNAs, but the interactions among them remain largely unknown. Uncovering the transcriptional and post-transcriptional regulatory networks governing the Schwann cell injury response is a key step towards a better understanding of Schwann cell biology and may help develop novel therapies for related diseases. Performing such comprehensive network analysis requires systematic bioinformatics methods to integrate multiple genomic datasets. Results In this study we present a computational pipeline to infer transcription factor and microRNA regulatory networks. Our approach combined mRNA and microRNA expression profiling data, ChIP-Seq data of transcription factors, and computational transcription factor and microRNA target prediction. Using mRNA and microRNA expression data collected in a Schwann cell injury model, we constructed a regulatory network and studied regulatory pathways involved in Schwann cell response to injury. Furthermore, we analyzed network motifs and obtained insights on cooperative regulation of transcription factors and microRNAs in Schwann cell injury recovery. Conclusions This work demonstrates a systematic method for gene regulatory network inference that may be used to gain new information on gene regulation by transcription factors and microRNAs. PMID:23387820
Mo, Chunyan; Wan, Shumin; Xia, Youquan; Ren, Ning; Zhou, Yang; Jiang, Xingyu
2018-01-01
Cassava is an energy crop that is tolerant of multiple abiotic stresses. It has been reported that the interaction between Calcineurin B-like (CBL) protein and CBL-interacting protein kinase (CIPK) is implicated in plant development and responses to various stresses. However, little is known about their functions in cassava. Herein, 8 CBL (MeCBL) and 26 CIPK (MeCIPK) genes were isolated from cassava by genome searching and cloning of cDNA sequences of Arabidopsis CBLs and CIPKs. Reverse-transcriptase polymerase chain reaction (RT-PCR) analysis showed that the expression levels of MeCBL and MeCIPK genes were different in different tissues throughout the life cycle. The expression patterns of 7 CBL and 26 CIPK genes in response to NaCl, PEG, heat and cold stresses were analyzed by quantitative real-time PCR (qRT-PCR), and it was found that the expression of each was induced by multiple stimuli. Furthermore, we found that many pairs of CBLs and CIPKs could interact with each other via investigating the interactions between 8 CBL and 25 CIPK proteins using a yeast two-hybrid system. Yeast cells co-transformed with cassava MeCIPK24, MeCBL10, and Na+/H+ antiporter MeSOS1 genes exhibited higher salt tolerance compared to those with one or two genes. These results suggest that the cassava CBL-CIPK signal network might play key roles in response to abiotic stresses.
Mo, Chunyan; Wan, Shumin; Xia, Youquan; Ren, Ning; Zhou, Yang; Jiang, Xingyu
2018-01-01
Cassava is an energy crop that is tolerant of multiple abiotic stresses. It has been reported that the interaction between Calcineurin B-like (CBL) protein and CBL-interacting protein kinase (CIPK) is implicated in plant development and responses to various stresses. However, little is known about their functions in cassava. Herein, 8 CBL (MeCBL) and 26 CIPK (MeCIPK) genes were isolated from cassava by genome searching and cloning of cDNA sequences of Arabidopsis CBLs and CIPKs. Reverse-transcriptase polymerase chain reaction (RT-PCR) analysis showed that the expression levels of MeCBL and MeCIPK genes were different in different tissues throughout the life cycle. The expression patterns of 7 CBL and 26 CIPK genes in response to NaCl, PEG, heat and cold stresses were analyzed by quantitative real-time PCR (qRT-PCR), and it was found that the expression of each was induced by multiple stimuli. Furthermore, we found that many pairs of CBLs and CIPKs could interact with each other via investigating the interactions between 8 CBL and 25 CIPK proteins using a yeast two-hybrid system. Yeast cells co-transformed with cassava MeCIPK24, MeCBL10, and Na+/H+ antiporter MeSOS1 genes exhibited higher salt tolerance compared to those with one or two genes. These results suggest that the cassava CBL-CIPK signal network might play key roles in response to abiotic stresses. PMID:29552024
Ling, Sheng; Chen, Caisheng; Wang, Yang; Sun, Xiaocong; Lu, Zhanhua; Ouyang, Yidan; Yao, Jialing
2015-02-19
The anthers and pollen grains are critical for male fertility and hybrid rice breeding. The development of rice mature anther and pollen consists of multiple continuous stages. However, molecular mechanisms regulating mature anther development were poorly understood. In this study, we have identified 291 mature anther-preferentially expressed genes (OsSTA) in rice based on Affymetrix microarray data. Gene Ontology (GO) analysis indicated that OsSTA genes mainly participated in metabolic and cellular processes that are likely important for rice anther and pollen development. The expression patterns of OsSTA genes were validated using real-time PCR and mRNA in situ hybridizations. Cis-element identification showed that most of the OsSTA genes had the cis-elements responsive to phytohormone regulation. Co-expression analysis of OsSTA genes showed that genes annotated with pectinesterase and calcium ion binding activities were rich in the network, suggesting that OsSTA genes could be involved in pollen germination and anther dehiscence. Furthermore, OsSTA RNAi transgenic lines showed male-sterility and pollen germination defects. The results suggested that OsSTA genes function in rice male fertility, pollen germination and anther dehiscence and established molecular regulating networks that lay the foundation for further functional studies.
Technologies and Approaches to Elucidate and Model the Virulence Program of Salmonella.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McDermott, Jason E.; Yoon, Hyunjin; Nakayasu, Ernesto S.
Salmonella is a primary cause of enteric diseases in a variety of animals. During its evolution into a pathogenic bacterium, Salmonella acquired an elaborate regulatory network that responds to multiple environmental stimuli within host animals and integrates them, resulting in fine regulation of the virulence program. The coordinated action by this regulatory network involves numerous virulence regulators, necessitating genome-wide profiling analysis to assess and combine efforts from multiple regulons. In this review we discuss recent high-throughput analytic approaches to understand the regulatory network of Salmonella that controls virulence processes. Application of high-throughput analyses has generated a large amount of data and driven development of computational approaches required for data integration. Therefore, we also cover computer-aided network analyses to infer regulatory networks, and demonstrate how genome-scale data can be used to construct regulatory and metabolic systems models of Salmonella pathogenesis. Genes that are coordinately controlled by multiple virulence regulators under infectious conditions are more likely to be important for pathogenesis. Thus, reconstructing the global regulatory network during infection or, at the very least, under conditions that mimic the host cellular environment not only provides a bird’s eye view of Salmonella survival strategy in response to hostile host environments but also serves as an efficient means to identify novel virulence factors that are essential for Salmonella to accomplish systemic infection in the host.
Yu, Yun; Degnan, James H.; Nakhleh, Luay
2012-01-01
Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa. PMID:22536161
Multilevel regulation of gene expression by microRNAs.
Makeyev, Eugene V; Maniatis, Tom
2008-03-28
MicroRNAs (miRNAs) are approximately 22-nucleotide-long noncoding RNAs that normally function by suppressing translation and destabilizing messenger RNAs bearing complementary target sequences. Some miRNAs are expressed in a cell- or tissue-specific manner and may contribute to the establishment and/or maintenance of cellular identity. Recent studies indicate that tissue-specific miRNAs may function at multiple hierarchical levels of gene regulatory networks, from targeting hundreds of effector genes incompatible with the differentiated state to controlling the levels of global regulators of transcription and alternative pre-mRNA splicing. This multilevel regulation may allow individual miRNAs to profoundly affect the gene expression program of differentiated cells.
Rossin, Elizabeth J.; Lage, Kasper; Raychaudhuri, Soumya; Xavier, Ramnik J.; Tatar, Diana; Benita, Yair
2011-01-01
Genome-wide association studies (GWAS) have defined over 150 genomic regions unequivocally containing variation predisposing to immune-mediated disease. Inferring disease biology from these observations, however, hinges on our ability to discover the molecular processes being perturbed by these risk variants. It has previously been observed that different genes harboring causal mutations for the same Mendelian disease often physically interact. We sought to evaluate the degree to which this is true of genes within strongly associated loci in complex disease. Using sets of loci defined in rheumatoid arthritis (RA) and Crohn's disease (CD) GWAS, we build protein–protein interaction (PPI) networks for genes within associated loci and find abundant physical interactions between protein products of associated genes. We apply multiple permutation approaches to show that these networks are more densely connected than chance expectation. To confirm biological relevance, we show that the components of the networks tend to be expressed in similar tissues relevant to the phenotypes in question, suggesting the network indicates common underlying processes perturbed by risk loci. Furthermore, we show that the RA and CD networks have predictive power by demonstrating that proteins in these networks, not encoded in the confirmed list of disease associated loci, are significantly enriched for association to the phenotypes in question in extended GWAS analysis. Finally, we test our method in 3 non-immune traits to assess its applicability to complex traits in general. We find that genes in loci associated to height and lipid levels assemble into significantly connected networks but did not detect excess connectivity among Type 2 Diabetes (T2D) loci beyond chance. 
Taken together, our results constitute evidence that, for many of the complex diseases studied here, common genetic associations implicate regions encoding proteins that physically interact in a preferential manner, in line with observations in Mendelian disease. PMID:21249183
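The permutation idea in the abstract above (are the genes in associated loci more densely connected than chance?) can be sketched in a few lines. This is a minimal illustration only, not the authors' procedure: it uses a simple random-gene-set null, whereas the study applies multiple permutation approaches whose details are not given here. The toy interactome, gene names, and function names are all hypothetical.

```python
import random

def count_internal_edges(edges, genes):
    """Count edges whose two endpoints both lie in the given gene set."""
    members = set(genes)
    return sum(1 for a, b in edges if a in members and b in members)

def connectivity_p_value(edges, all_genes, candidate_genes, n_perm=2000, seed=0):
    """Empirical p-value: how often a random gene set of the same size
    is at least as densely connected as the candidate set."""
    rng = random.Random(seed)
    observed = count_internal_edges(edges, candidate_genes)
    at_least_as_dense = 0
    for _ in range(n_perm):
        random_set = rng.sample(all_genes, len(candidate_genes))
        if count_internal_edges(edges, random_set) >= observed:
            at_least_as_dense += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (at_least_as_dense + 1) / (n_perm + 1)

# Hypothetical toy interactome: 30 genes, a 5-gene clique plus a sparse chain.
all_genes = [f"g{i}" for i in range(30)]
clique = all_genes[:5]
edges = [(a, b) for i, a in enumerate(clique) for b in clique[i + 1:]]
edges += [(all_genes[i], all_genes[i + 1]) for i in range(5, 29)]

observed, p_value = connectivity_p_value(edges, all_genes, clique)
print(observed, p_value)
```

A null that also preserves each protein's number of interaction partners would be a stricter comparison; the random-set version here is chosen only because it is the shortest to state.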
Breeding and Genetics Symposium: networks and pathways to guide genomic selection.
Snelling, W M; Cushman, R A; Keele, J W; Maltecca, C; Thomas, M G; Fortes, M R S; Reverter, A
2013-02-01
Many traits affecting profitability and sustainability of meat, milk, and fiber production are polygenic, with no single gene having an overwhelming influence on observed variation. No knowledge of the specific genes controlling these traits has been needed to make substantial improvement through selection. Significant gains have been made through phenotypic selection enhanced by pedigree relationships and continually improving statistical methodology. Genomic selection, recently enabled by assays for dense SNP located throughout the genome, promises to increase selection accuracy and accelerate genetic improvement by emphasizing the SNP most strongly correlated to phenotype although the genes and sequence variants affecting phenotype remain largely unknown. These genomic predictions theoretically rely on linkage disequilibrium (LD) between genotyped SNP and unknown functional variants, but familial linkage may increase effectiveness when predicting individuals related to those in the training data. Genomic selection with functional SNP genotypes should be less reliant on LD patterns shared by training and target populations, possibly allowing robust prediction across unrelated populations. Although the specific variants causing polygenic variation may never be known with certainty, a number of tools and resources can be used to identify those most likely to affect phenotype. Associations of dense SNP genotypes with phenotype provide a 1-dimensional approach for identifying genes affecting specific traits; in contrast, associations with multiple traits allow defining networks of genes interacting to affect correlated traits. Such networks are especially compelling when corroborated by existing functional annotation and established molecular pathways. The SNP occurring within network genes, obtained from public databases or derived from genome and transcriptome sequences, may be classified according to expected effects on gene products. 
As illustrated by functionally informed genomic predictions being more accurate than naive whole-genome predictions of beef tenderness, coupling evidence from livestock genotypes, phenotypes, gene expression, and genomic variants with existing knowledge of gene functions and interactions may provide greater insight into the genes and genomic mechanisms affecting polygenic traits and facilitate functional genomic selection for economically important traits.
Targeted brain proteomics uncover multiple pathways to Alzheimer's dementia.
Yu, Lei; Petyuk, Vladislav A; Gaiteri, Chris; Mostafavi, Sara; Young-Pearse, Tracy; Shah, Raj C; Buchman, Aron S; Schneider, Julie A; Piehowski, Paul D; Sontag, Ryan L; Fillmore, Thomas L; Shi, Tujin; Smith, Richard D; De Jager, Philip L; Bennett, David A
2018-06-16
Previous gene expression analysis identified a network of co-expressed genes that is associated with β-amyloid neuropathology and cognitive decline in older adults. The current work targeted influential genes in this network with quantitative proteomics to identify potential novel therapeutic targets. Data came from 834 community-based older persons who were followed annually, died and underwent brain autopsy. Uniform structured postmortem evaluations assessed the burden of β-amyloid and other common age-related neuropathologies. Selected reaction monitoring quantified cortical protein abundance of 12 genes prioritized from a molecular network of aging human brain that is implicated in Alzheimer's dementia. Regression and linear mixed models examined the protein associations with β-amyloid load and other neuropathologic indices as well as cognitive decline over multiple years prior to death. The average age at death was 88.6 years. 349 participants (41.9%) had Alzheimer's dementia at death. A higher level of PLXNB1 abundance was associated with more β-amyloid load (p = 1.0 × 10^-7) and higher PHFtau tangle density (p = 2.3 × 10^-7), and the association of PLXNB1 with cognitive decline is mediated by these known Alzheimer's disease pathologies. On the other hand, higher IGFBP5, HSPB2, AK4 and lower ITPK1 levels were associated with faster cognitive decline and, unlike PLXNB1, these associations were not fully explained by common neuropathologic indices, suggesting novel mechanisms leading to cognitive decline. Using targeted proteomics, this work identified cortical proteins involved in Alzheimer's dementia and begins to dissect two different molecular pathways: one affecting β-amyloid deposition and another affecting resilience without a known pathologic footprint. © 2018 American Neurological Association.
Ozerov, Ivan V; Lezhnina, Ksenia V; Izumchenko, Evgeny; Artemov, Artem V; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N; Labat, Ivan; West, Michael D; Buzdin, Anton; Cantor, Charles R; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex
2016-11-16
Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy.
Shahin, Arwa; Smulders, Marinus J. M.; van Tuyl, Jaap M.; Arens, Paul; Bakker, Freek T.
2014-01-01
Next Generation Sequencing (NGS) may enable estimating relationships among genotypes using allelic variation of multiple nuclear genes simultaneously. We explored the potential and caveats of this strategy in four genetically distant Lilium cultivars to estimate their genetic divergence from transcriptome sequences using three approaches: POFAD (Phylogeny of Organisms from Allelic Data, which uses allelic information of sequence data), RAxML (Randomized Accelerated Maximum Likelihood, tree building based on concatenated consensus sequences) and Consensus Network (constructing a network that summarizes conflicts among gene trees). Twenty-six gene contigs were chosen based on the presence of orthologous sequences in all cultivars, seven of which also had an orthologous sequence in Tulipa, used as out-group. The three approaches generated the same topology. Although the resolution offered by these approaches is high, in this case there was no extra benefit in using allelic information. We conclude that these 26 genes can be widely applied to construct a species tree for the genus Lilium. PMID:25368628
Ozerov, Ivan V.; Lezhnina, Ksenia V.; Izumchenko, Evgeny; Artemov, Artem V.; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N.; Labat, Ivan; West, Michael D.; Buzdin, Anton; Cantor, Charles R.; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex
2016-01-01
Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy. PMID:27848968
Dumas, Marc-Emmanuel; Domange, Céline; Calderari, Sophie; Martínez, Andrea Rodríguez; Ayala, Rafael; Wilder, Steven P; Suárez-Zamorano, Nicolas; Collins, Stephan C; Wallis, Robert H; Gu, Quan; Wang, Yulan; Hue, Christophe; Otto, Georg W; Argoud, Karène; Navratil, Vincent; Mitchell, Steve C; Lindon, John C; Holmes, Elaine; Cazier, Jean-Baptiste; Nicholson, Jeremy K; Gauguier, Dominique
2016-09-30
The genetic regulation of metabolic phenotypes (i.e., metabotypes) in type 2 diabetes mellitus occurs through complex organ-specific cellular mechanisms and networks contributing to impaired insulin secretion and insulin resistance. Genome-wide gene expression profiling systems can dissect the genetic contributions to metabolome and transcriptome regulations. The integrative analysis of multiple gene expression traits and metabotypes together with their underlying genetic regulation remains a challenge. Here, we introduce a systems genetics approach based on the topological analysis of a combined molecular network made of genes and metabolites identified through expression and metabotype quantitative trait locus mapping (i.e., eQTL and mQTL) to prioritise biological characterisation of candidate genes and traits. We used systematic metabotyping by 1H NMR spectroscopy and genome-wide gene expression in white adipose tissue to map molecular phenotypes to genomic blocks associated with obesity and insulin secretion in a series of rat congenic strains derived from spontaneously diabetic Goto-Kakizaki (GK) and normoglycemic Brown-Norway (BN) rats. We implemented a network biology strategy to visualize the shortest paths between metabolites and genes significantly associated with each genomic block. Despite strong genomic similarities (95-99%) among congenics, each strain exhibited specific patterns of gene expression and metabotypes, reflecting the metabolic consequences of series of linked genetic polymorphisms in the congenic intervals. We subsequently used the congenic panel to map quantitative trait loci underlying specific mQTLs and genome-wide eQTLs. Variation in key metabolites like glucose, succinate, lactate, or 3-hydroxybutyrate and second messenger precursors like inositol was associated with several independent genomic intervals, indicating functional redundancy in these regions. 
To navigate through the complexity of these association networks we mapped candidate genes and metabolites onto metabolic pathways and implemented a shortest path strategy to highlight potential mechanistic links between metabolites and transcripts at colocalized mQTLs and eQTLs. Minimizing the shortest path length drove prioritization of biological validations by gene silencing. These results underline the importance of network-based integration of multilevel systems genetics datasets to improve understanding of the genetic architecture of metabotype and transcriptomic regulation and to characterize novel functional roles for genes determining tissue-specific metabolism.
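The shortest-path strategy described in this abstract can be illustrated with a breadth-first search over an unweighted gene/metabolite network. The sketch below is only an assumption about the general technique: the abstract does not specify the network, edge weights, or software used, and every node name and function name here is a hypothetical placeholder.

```python
from collections import deque

def build_adjacency(edges):
    """Turn an undirected edge list into a node -> neighbours mapping."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def shortest_path(adjacency, source, target):
    """Breadth-first search; returns the node list of one shortest path,
    or None if the two nodes are not connected."""
    if source == target:
        return [source]
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour in previous:
                continue
            previous[neighbour] = node
            if neighbour == target:
                # Walk back through predecessors to rebuild the path.
                path = [target]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None

# Hypothetical combined network of metabolites and genes.
adjacency = build_adjacency([
    ("glucose", "GeneA"), ("GeneA", "succinate"), ("succinate", "GeneB"),
    ("glucose", "GeneC"), ("lactate", "GeneD"),
])

# Rank candidate genes by how few steps separate them from a metabolite.
paths = {gene: shortest_path(adjacency, "glucose", gene)
         for gene in ("GeneA", "GeneB", "GeneC", "GeneD")}
ranked = sorted((len(path) - 1, gene) for gene, path in paths.items() if path)
print(ranked)
```

Candidates with the shortest paths sort first, which mirrors the prioritization idea in the abstract; unreachable genes (here GeneD) are simply left out of the ranking.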
Lakatos, Anita; Goldberg, Natalie R S; Blurton-Jones, Mathew
2017-03-10
We previously demonstrated that transplantation of murine neural stem cells (NSCs) can improve motor and cognitive function in a transgenic model of Dementia with Lewy Bodies (DLB). These benefits occurred without changes in human α-synuclein pathology and were mediated in part by stem cell-induced elevation of brain-derived neurotrophic factor (BDNF). However, intrastriatal NSC transplantation likely alters the brain microenvironment via multiple mechanisms that may synergize to promote cognitive and motor recovery. The underlying neurobiology that mediates such restoration no doubt involves numerous genes acting in concert to modulate signaling within and between host brain cells and transplanted NSCs. In order to identify functionally connected gene networks and additional mechanisms that may contribute to stem cell-induced benefits, we performed weighted gene co-expression network analysis (WGCNA) on striatal tissue isolated from NSC- and vehicle-injected wild-type and DLB mice. Combining continuous behavioral and biochemical data with genome-wide expression via network analysis proved to be a powerful approach, revealing significant alterations in immune response, neurotransmission, and mitochondrial function. Taken together, these data shed further light on the gene network and biological processes that underlie the therapeutic effects of NSC transplantation on α-synuclein-induced cognitive and motor impairments, thereby highlighting additional therapeutic targets for synucleinopathies.
SYNTHETIC BIOLOGY. Emergent genetic oscillations in a synthetic microbial consortium.
Chen, Ye; Kim, Jae Kyoung; Hirning, Andrew J; Josić, Krešimir; Bennett, Matthew R
2015-08-28
A challenge of synthetic biology is the creation of cooperative microbial systems that exhibit population-level behaviors. Such systems use cellular signaling mechanisms to regulate gene expression across multiple cell types. We describe the construction of a synthetic microbial consortium consisting of two distinct cell types—an "activator" strain and a "repressor" strain. These strains produced two orthogonal cell-signaling molecules that regulate gene expression within a synthetic circuit spanning both strains. The two strains generated emergent, population-level oscillations only when cultured together. Certain network topologies of the two-strain circuit were better at maintaining robust oscillations than others. The ability to program population-level dynamics through the genetic engineering of multiple cooperative strains points the way toward engineering complex synthetic tissues and organs with multiple cell types. Copyright © 2015, American Association for the Advancement of Science.
A multicolor panel of novel lentiviral "gene ontology" (LeGO) vectors for functional gene analysis.
Weber, Kristoffer; Bartsch, Udo; Stocking, Carol; Fehse, Boris
2008-04-01
Functional gene analysis requires the possibility of overexpression, as well as downregulation of one, or ideally several, potentially interacting genes. Lentiviral vectors are well suited for this purpose as they ensure stable expression of complementary DNAs (cDNAs), as well as short-hairpin RNAs (shRNAs), and can efficiently transduce a wide spectrum of cell targets when packaged within the coat proteins of other viruses. Here we introduce a multicolor panel of novel lentiviral "gene ontology" (LeGO) vectors designed according to the "building blocks" principle. Using a wide spectrum of different fluorescent markers, including drug-selectable enhanced green fluorescent protein (eGFP)- and dTomato-blasticidin-S resistance fusion proteins, LeGO vectors allow simultaneous analysis of multiple genes and shRNAs of interest within single, easily identifiable cells. Furthermore, each functional module is flanked by unique cloning sites, ensuring flexibility and individual optimization. The efficacy of these vectors for analyzing multiple genes in a single cell was demonstrated in several different cell types, including hematopoietic, endothelial, and neural stem and progenitor cells, as well as hepatocytes. LeGO vectors thus represent a valuable tool for investigating gene networks using conditional ectopic expression and knock-down approaches simultaneously.
Sharma, Amitabh; Gulbahce, Natali; Pevzner, Samuel J.; Menche, Jörg; Ladenvall, Claes; Folkersen, Lasse; Eriksson, Per; Orho-Melander, Marju; Barabási, Albert-László
2013-01-01
Genome wide association studies (GWAS) identify susceptibility loci for complex traits, but do not identify particular genes of interest. Integration of functional and network information may help in overcoming this limitation and identifying new susceptibility loci. Using GWAS and comorbidity data, we present a network-based approach to predict candidate genes for lipid and lipoprotein traits. We apply a prediction pipeline incorporating interactome, co-expression, and comorbidity data to Global Lipids Genetics Consortium (GLGC) GWAS for four traits of interest, identifying phenotypically coherent modules. These modules provide insights regarding gene involvement in complex phenotypes with multiple susceptibility alleles and low effect sizes. To experimentally test our predictions, we selected four candidate genes and genotyped representative SNPs in the Malmö Diet and Cancer Cardiovascular Cohort. We found significant associations with LDL-C and total-cholesterol levels for a synonymous SNP (rs234706) in the cystathionine beta-synthase (CBS) gene (p = 1 × 10^-5 and adjusted-p = 0.013, respectively). Further, liver samples taken from 206 patients revealed that patients with the minor allele of rs234706 had significant dysregulation of CBS (p = 0.04). Despite the known biological role of CBS in lipid metabolism, SNPs within the locus have not yet been identified in GWAS of lipoprotein traits. Thus, the GWAS-based Comorbidity Module (GCM) approach identifies candidate genes missed by GWAS studies, serving as a broadly applicable tool for the investigation of other complex disease phenotypes. PMID:23882023
Petunia, Your Next Supermodel?
Vandenbussche, Michiel; Chambrier, Pierre; Rodrigues Bento, Suzanne; Morel, Patrice
2016-01-01
Plant biology in general, and plant evo–devo in particular, would strongly benefit from a broader range of available model systems. In recent years, technological advances have facilitated the analysis and comparison of individual gene functions in multiple species, now representing a fairly wide taxonomic range of the plant kingdom. Because genes are embedded in gene networks, studying evolution of gene function ultimately should be put in the context of studying the evolution of entire gene networks, since changes in the function of a single gene will normally go together with further changes in its network environment. For this reason, plant comparative biology/evo–devo will require the availability of a defined set of ‘super’ models occupying key taxonomic positions, in which performing gene functional analysis and testing genetic interactions ideally is as straightforward as, e.g., in Arabidopsis. Here we review why petunia has the potential to become one of these future supermodels, as a representative of the Asterid clade. We will first detail its intrinsic qualities as a model system. Next, we highlight how the revolution in sequencing technologies will now finally allow exploitation of the petunia system to its full potential, even though petunia already has a long history as a model in plant molecular biology and genetics. We conclude with a series of arguments in favor of a more diversified multi-model approach in plant biology, and we point out where the petunia model system may further play a role, based on its biological features and molecular toolkit. PMID:26870078
Sugathan, Aarathi; Biagioli, Marta; Golzio, Christelle; Erdin, Serkan; Blumenthal, Ian; Manavalan, Poornima; Ragavendran, Ashok; Brand, Harrison; Lucente, Diane; Miles, Judith; Sheridan, Steven D.; Stortchevoi, Alexei; Kellis, Manolis; Haggarty, Stephen J.; Katsanis, Nicholas; Gusella, James F.; Talkowski, Michael E.
2014-01-01
Truncating mutations of chromodomain helicase DNA-binding protein 8 (CHD8), and of many other genes with diverse functions, are strong-effect risk factors for autism spectrum disorder (ASD), suggesting multiple mechanisms of pathogenesis. We explored the transcriptional networks that CHD8 regulates in neural progenitor cells (NPCs) by reducing its expression and then integrating transcriptome sequencing (RNA sequencing) with genome-wide CHD8 binding (ChIP sequencing). Suppressing CHD8 to levels comparable with the loss of a single allele caused altered expression of 1,756 genes, 64.9% of which were up-regulated. CHD8 showed widespread binding to chromatin, with 7,324 replicated sites that marked 5,658 genes. Integration of these data suggests that a limited array of direct regulatory effects of CHD8 produced a much larger network of secondary expression changes. Genes indirectly down-regulated (i.e., without CHD8-binding sites) reflect pathways involved in brain development, including synapse formation, neuron differentiation, cell adhesion, and axon guidance, whereas CHD8-bound genes are strongly associated with chromatin modification and transcriptional regulation. Genes associated with ASD were strongly enriched among indirectly down-regulated loci (P < 10^-8) and CHD8-bound genes (P = 0.0043), which align with previously identified coexpression modules during fetal development. We also find an intriguing enrichment of cancer-related gene sets among CHD8-bound genes (P < 10^-10). In vivo suppression of chd8 in zebrafish produced macrocephaly comparable to that of humans with inactivating mutations. These data indicate that heterozygous disruption of CHD8 precipitates a network of gene-expression changes involved in neurodevelopmental pathways in which many ASD-associated genes may converge on shared mechanisms of pathogenesis. PMID:25294932
Innate immune activity conditions the effect of regulatory variants upon monocyte gene expression.
Fairfax, Benjamin P; Humburg, Peter; Makino, Seiko; Naranbhai, Vivek; Wong, Daniel; Lau, Evelyn; Jostins, Luke; Plant, Katharine; Andrews, Robert; McGee, Chris; Knight, Julian C
2014-03-07
To systematically investigate the impact of immune stimulation upon regulatory variant activity, we exposed primary monocytes from 432 healthy Europeans to interferon-γ (IFN-γ) or differing durations of lipopolysaccharide and mapped expression quantitative trait loci (eQTLs). More than half of cis-eQTLs identified, involving hundreds of genes and associated pathways, are detected specifically in stimulated monocytes. Induced innate immune activity reveals multiple master regulatory trans-eQTLs including the major histocompatibility complex (MHC), coding variants altering enzyme and receptor function, an IFN-β cytokine network showing temporal specificity, and an interferon regulatory factor 2 (IRF2) transcription factor-modulated network. Induced eQTL are significantly enriched for genome-wide association study loci, identifying context-specific associations to putative causal genes including CARD9, ATM, and IRF8. Thus, applying pathophysiologically relevant immune stimuli assists resolution of functional genetic variants.
Global Alignment of Pairwise Protein Interaction Networks for Maximal Common Conserved Patterns
Tian, Wenhong; Samatova, Nagiza F.
2013-01-01
A number of tools for the alignment of protein-protein interaction (PPI) networks have laid the foundation for PPI network analysis. Most alignment tools focus on finding conserved interaction regions across the PPI networks through either local or global mapping of similar sequences. Researchers are still trying to improve the speed, scalability, and accuracy of network alignment. In view of this, we introduce a fast connected-components-based algorithm, HopeMap, for network alignment. Observing that the number of true orthologs across species is small compared to the total number of proteins in all species, we take a different approach based on a precompiled list of homologs identified by KO terms. Applying this approach to S. cerevisiae (yeast) and D. melanogaster (fly), E. coli K12 and S. typhimurium, and E. coli K12 and C. crescentus, we analyze all clusters identified in the alignment. The results are evaluated through up-to-date known gene annotations, gene ontology (GO), and KEGG ortholog groups (KO). Compared to existing tools, our approach is fast with linear computational cost, highly accurate in terms of KO and GO term specificity and sensitivity, and can be extended to multiple alignments easily.
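The connected-components idea in this abstract can be illustrated with a minimal sketch. This is not HopeMap itself, whose details are not given in the abstract: the function name, the input layout, and the toy data are all hypothetical, and the pairwise loop below is quadratic rather than linear. Proteins from two networks are paired when they share an ortholog-group label, two pairs are linked when the interaction is present in both networks, and each connected component is reported as a candidate conserved cluster.

```python
from collections import defaultdict
from itertools import combinations


def conserved_components(edges_a, edges_b, group_a, group_b):
    """Connected components of a simple alignment graph (illustrative only).

    edges_a, edges_b: iterables of (protein, protein) interactions.
    group_a, group_b: dict mapping protein -> ortholog-group label
                      (for example a KO term); hypothetical input format.
    """
    # Index the proteins of network B by their ortholog group.
    by_group = defaultdict(list)
    for prot, grp in group_b.items():
        by_group[grp].append(prot)

    # One alignment node per cross-species pair sharing a group label.
    nodes = [(a, b) for a, grp in group_a.items() for b in by_group.get(grp, [])]

    set_a = {frozenset(e) for e in edges_a}
    set_b = {frozenset(e) for e in edges_b}

    # Link two alignment nodes when the interaction is conserved in both networks.
    adj = {n: set() for n in nodes}
    for (a1, b1), (a2, b2) in combinations(nodes, 2):
        if frozenset((a1, a2)) in set_a and frozenset((b1, b2)) in set_b:
            adj[(a1, b1)].add((a2, b2))
            adj[(a2, b2)].add((a1, b1))

    # Depth-first search to collect connected components.
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.append(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(sorted(comp))
    return components


# Toy example with made-up proteins and group labels.
toy = conserved_components(
    edges_a=[("a1", "a2"), ("a2", "a3")],
    edges_b=[("b1", "b2")],
    group_a={"a1": "K1", "a2": "K2", "a3": "K3"},
    group_b={"b1": "K1", "b2": "K2", "b3": "K3"},
)
print(toy)
```

In the toy data only the a1-a2 / b1-b2 interaction is present in both networks, so those two pairs form one cluster and the remaining pair stays isolated.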
Faraji, Farhoud; Hu, Ying; Wu, Gang; Goldberger, Natalie E.; Walker, Renard C.; Zhang, Jinghui; Hunter, Kent W.
2014-01-01
Metastasis is the result of stochastic genomic and epigenetic events leading to gene expression profiles that drive tumor dissemination. Here we exploit the principle that metastatic propensity is modified by the genetic background to generate prognostic gene expression signatures that illuminate regulators of metastasis. We also identify multiple microRNAs whose germline variation is causally linked to tumor progression and metastasis. We employ network analysis of global gene expression profiles in tumors derived from a panel of recombinant inbred mice to identify a network of co-expressed genes centered on Cnot2 that predicts metastasis-free survival. Modulating Cnot2 expression changes tumor cell metastatic potential in vivo, supporting a functional role for Cnot2 in metastasis. Small RNA sequencing of the same tumor set revealed a negative correlation between expression of the Mir216/217 cluster and tumor progression. Expression quantitative trait locus analysis (eQTL) identified cis-eQTLs at the Mir216/217 locus, indicating that differences in expression may be inherited. Ectopic expression of Mir216/217 in tumor cells suppressed metastasis in vivo. Finally, small RNA sequencing and mRNA expression profiling data were integrated to reveal that miR-3470a/b target a high proportion of network transcripts. In vivo analysis of Mir3470a/b demonstrated that both promote metastasis. Moreover, Mir3470b is a likely regulator of the Cnot2 network as its overexpression down-regulated expression of network hub genes and enhanced metastasis in vivo, phenocopying Cnot2 knockdown. The resulting data from this strategy identify Cnot2 as a novel regulator of metastasis and demonstrate the power of our systems-level approach in identifying modifiers of metastasis. PMID:24322557
Matsu-Ura, Toru; Dovzhenok, Andrey A; Coradetti, Samuel T; Subramanian, Krithika R; Meyer, Daniel R; Kwon, Jaesang J; Kim, Caleb; Salomonis, Nathan; Glass, N Louise; Lim, Sookkyung; Hong, Christian I
2018-05-18
Second-generation or lignocellulosic biofuels are a tangible source of renewable energy, which is critical to combat climate change by reducing the carbon footprint. Filamentous fungi secrete cellulose-degrading enzymes called cellulases, which are used for production of lignocellulosic biofuels. However, inefficient production of cellulases is a major obstacle for industrial-scale production of second-generation biofuels. We used computational simulations to design and implement synthetic positive feedback loops to increase gene expression of a key transcription factor, CLR-2, that activates a large number of cellulases in a filamentous fungus, Neurospora crassa. Overexpression of CLR-2 reveals previously unappreciated roles of CLR-2 in the lignocellulosic gene network, which enabled simultaneous induction of approximately 50% of 78 lignocellulosic degradation-related genes in our engineered Neurospora strains. This engineering results in dramatically increased cellulase activity due to cooperative orchestration of multiple enzymes involved in the cellulose degradation pathway. Our work provides a proof of principle in utilizing mathematical modeling and synthetic biology to improve the efficiency of cellulase synthesis for second-generation biofuel production.
A graph-theory framework for evaluating landscape connectivity and conservation planning.
Minor, Emily S; Urban, Dean L
2008-04-01
Connectivity of habitat patches is thought to be important for movement of genes, individuals, populations, and species over multiple temporal and spatial scales. We used graph theory to characterize multiple aspects of landscape connectivity in a habitat network in the North Carolina Piedmont (U.S.A). We compared this landscape with simulated networks with known topology, resistance to disturbance, and rate of movement. We introduced graph measures such as compartmentalization and clustering, which can be used to identify locations on the landscape that may be especially resilient to human development or areas that may be most suitable for conservation. Our analyses indicated that for songbirds the Piedmont habitat network was well connected. Furthermore, the habitat network had commonalities with planar networks, which exhibit slow movement, and scale-free networks, which are resistant to random disturbances. These results suggest that connectivity in the habitat network was high enough to prevent the negative consequences of isolation but not so high as to allow rapid spread of disease. Our graph-theory framework provided insight into regional and emergent global network properties in an intuitive and visual way and allowed us to make inferences about rates and paths of species movements and vulnerability to disturbance. This approach can be applied easily to assessing habitat connectivity in any fragmented or patchy landscape.
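A minimal sketch of the kind of graph measures mentioned in this abstract, under stated assumptions: patches are reduced to hypothetical (x, y) centroids, two patches are linked when they lie within an assumed maximum dispersal distance, and the helper names are illustrative rather than taken from the paper. Connected components stand in for compartments, and the local clustering coefficient is computed per patch.

```python
from itertools import combinations
from math import dist


def build_patch_graph(patches, max_dispersal):
    """patches: dict name -> (x, y). Link patches within max_dispersal."""
    adj = {name: set() for name in patches}
    for u, v in combinations(patches, 2):
        if dist(patches[u], patches[v]) <= max_dispersal:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def components(adj):
    """Connected components, each returned as a set of patch names."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.add(node)
            for nxt in adj[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        comps.append(comp)
    return comps


def clustering_coefficient(adj, node):
    """Fraction of a patch's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Toy landscape: three nearby patches and one isolated patch (made-up coordinates).
toy_patches = {"A": (0, 0), "B": (1, 0), "C": (0, 1), "D": (10, 10)}
graph = build_patch_graph(toy_patches, max_dispersal=1.5)
print(components(graph))
print(clustering_coefficient(graph, "A"))
```

A single distance threshold is the simplest possible edge rule; a species-specific dispersal kernel or resistance surface would change which patches count as connected.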
Genetic variation influences glutamate concentrations in brains of patients with multiple sclerosis.
Baranzini, Sergio E; Srinivasan, Radhika; Khankhanian, Pouya; Okuda, Darin T; Nelson, Sarah J; Matthews, Paul M; Hauser, Stephen L; Oksenberg, Jorge R; Pelletier, Daniel
2010-09-01
Glutamate is the main excitatory neurotransmitter in the mammalian brain. Appropriate transmission of nerve impulses through glutamatergic synapses is required throughout the brain and forms the basis of many processes including learning and memory. However, abnormally high levels of extracellular brain glutamate can lead to neuroaxonal cell death. We have previously reported elevated glutamate levels in the brains of patients suffering from multiple sclerosis. Here two complementary analyses to assess the extent of genomic control over glutamate levels were used. First, a genome-wide association analysis in 382 patients with multiple sclerosis using brain glutamate concentration as a quantitative trait was conducted. In a second approach, a protein interaction network was used to find associated genes within the same pathway. The top associated marker was rs794185 (P < 6.44 × 10^-7), a non-coding single nucleotide polymorphism within the gene sulphatase modifying factor 1. Our pathway approach identified a module composed of 70 genes with high relevance to glutamate biology. Individuals carrying a higher number of associated alleles from genes in this module showed the highest levels of glutamate. These individuals also showed greater decreases in N-acetylaspartate and in brain volume over 1 year of follow-up. Patients were then stratified by the amount of annual brain volume loss and the same approach was performed in the 'high' (n = 250) and 'low' (n = 132) neurodegeneration groups. The association with rs794185 was highly significant in the group with high neurodegeneration. Further, results from the network-based pathway analysis remained largely unchanged even after stratification. Results from these analyses indicated that variance in the activity of neurochemical pathways implicated in neurodegeneration is explained, at least in part, by the inheritance of common genetic polymorphisms.
Spectroscopy-based imaging provides a novel quantitative endophenotype for genetic association studies directed towards identifying new factors that contribute to the heterogeneity of clinical expression of multiple sclerosis.
Genome-Wide Detection and Analysis of Multifunctional Genes
Pritykin, Yuri; Ghersi, Dario; Singh, Mona
2015-01-01
Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655
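The extraction step described in this abstract can be sketched in a simplified form. The paper's actual criterion for when two processes count as "distinct" is not given here, so the sketch assumes a caller-supplied list of term pairs that should be treated as related; the function name, the parameter `related_pairs`, and the toy annotations are hypothetical.

```python
from itertools import combinations


def multifunctional_genes(annotations, related_pairs=()):
    """Return genes annotated with at least two terms not marked as related.

    annotations: dict gene -> iterable of process terms.
    related_pairs: pairs of terms to treat as the same broad process
                   (an assumption of this sketch, not the paper's rule).
    """
    related = {frozenset(p) for p in related_pairs}
    result = set()
    for gene, terms in annotations.items():
        for t1, t2 in combinations(sorted(set(terms)), 2):
            if frozenset((t1, t2)) not in related:
                result.add(gene)
                break  # one distinct pair is enough
    return result


# Toy annotations with made-up genes.
toy_annotations = {
    "g1": {"DNA repair", "cell adhesion"},
    "g2": {"DNA repair", "DNA replication"},
    "g3": {"translation"},
}
print(multifunctional_genes(
    toy_annotations,
    related_pairs=[("DNA repair", "DNA replication")],
))
```

Here g2 is excluded because its two terms are declared related, and g3 because it has a single annotation; only g1 is reported.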
Association of variants in innate immune genes with asthma and eczema
Sharma, Sunita; Poon, Audrey; Himes, Blanca E.; Lasky-Su, Jessica; Sordillo, Joanne E.; Belanger, Kathleen; Milton, Donald K.; Bracken, Michael B.; Triche, Elizabeth W.; Leaderer, Brian P.; Gold, Diane R.; Litonjua, Augusto A.
2012-01-01
Background The innate immune pathway is important in the pathogenesis of asthma and eczema. However, only a few variants in these genes have been associated with either disease. We investigate the association between polymorphisms of genes in the innate immune pathway with childhood asthma and eczema. In addition, we compare individual associations with those discovered using a multivariate approach. Methods Using a novel method, case control based association testing (C2BAT), 569 single nucleotide polymorphisms (SNPs) in 44 innate immune genes were tested for association with asthma and eczema in children from the Boston Home Allergens and Asthma Study and the Connecticut Childhood Asthma Study. The screening algorithm was used to identify the top SNPs associated with asthma and eczema. We next investigated the interaction of innate immune variants with asthma and eczema risk using Bayesian networks. Results After correction for multiple comparisons, 7 SNPs in 6 genes (CARD25, TGFB1, LY96, ACAA1, DEFB1, and IFNG) were associated with asthma (adjusted p-value<0.02), while 5 SNPs in 3 different genes (CD80, STAT4, and IRAKI) were significantly associated with eczema (adjusted p-value < 0.02). None of these SNPs were associated with both asthma and eczema. Bayesian network analysis identified 4 SNPs that were predictive of asthma and 10 SNPs that predicted eczema. Of the genes identified using Bayesian networks, only CD80 was associated with eczema in the single-SNP study. Using novel methodology that allows for screening and replication in the same population, we have identified associations of innate immune genes with asthma and eczema. Bayesian network analysis suggests that additional SNPs influence disease susceptibility via SNP interactions. Conclusion Our findings suggest that innate immune genes contribute to the pathogenesis of asthma and eczema, and that these diseases likely have different genetic determinants. PMID:22192168
Microfluidics-Based PCR for Fusion Transcript Detection.
Chen, Hui
2016-01-01
Microfluidic technology allows the production of networks of submillimeter-size fluidic channels and reservoirs in a variety of material systems. Microfluidics-based polymerase chain reaction (PCR) allows automated multiplexing of multiple samples and multiple assays simultaneously within a network of microfluidic channels and chambers that are coordinated in a controlled fashion by valves. Each individual PCR reaction is performed in a nanoliter volume, which allows testing on samples with limited DNA and RNA. Microfluidic devices are used in various types of PCR, such as digital PCR and single molecular emulsion PCR, for genotyping, gene expression, and miRNA expression. In this chapter, the use of a microfluidics-based PCR for simultaneous screening of 14 known fusion transcripts in patients with leukemia is described.
Ruzicka, W Brad; Subburaju, Sivan; Benes, Francine M
2015-06-01
Dysfunction related to γ-aminobutyric acid (GABA)-ergic neurotransmission in the pathophysiology of major psychosis has been well established by the work of multiple groups across several decades, including the widely replicated downregulation of GAD1. Prior gene expression and network analyses within the human hippocampus implicate a broader network of genes, termed the GAD1 regulatory network, in regulation of GAD1 expression. Several genes within this GAD1 regulatory network show diagnosis- and sector-specific expression changes within the circuitry of the hippocampus, influencing abnormal GAD1 expression in schizophrenia and bipolar disorder. We investigated the hypothesis that aberrant DNA methylation contributes to circuit- and diagnosis-specific abnormal expression of GAD1 regulatory network genes in psychotic illness. This epigenetic association study targeting GAD1 regulatory network genes was conducted between July 1, 2012, and June 30, 2014. Postmortem human hippocampus tissue samples were obtained from 8 patients with schizophrenia, 8 patients with bipolar disorder, and 8 healthy control participants matched for age, sex, postmortem interval, and other potential confounds from the Harvard Brain Tissue Resource Center, McLean Hospital, Belmont, Massachusetts. We extracted DNA from laser-microdissected stratum oriens tissue of cornu ammonis 2/3 (CA2/3) and CA1 postmortem human hippocampus, bisulfite modified it, and assessed it with the Infinium HumanMethylation450 BeadChip (Illumina, Inc). The subset of CpG loci associated with GAD1 regulatory network genes was analyzed in R version 3.1.0 software (R Foundation) using the minfi package. Findings were validated using bisulfite pyrosequencing. Methylation levels at 1308 GAD1 regulatory network-associated CpG loci were assessed both as individual sites to identify differentially methylated positions and by sharing information among colocalized probes to identify differentially methylated regions.
A total of 146 differentially methylated positions with a false detection rate lower than 0.05 were identified across all 6 groups (2 circuit locations in each of 3 diagnostic categories), and 54 differentially methylated regions with P < .01 were identified in single-group comparisons. Methylation changes were enriched in MSX1, CCND2, and DAXX at specific loci within the hippocampus of patients with schizophrenia and bipolar disorder. This work demonstrates diagnosis- and circuit-specific DNA methylation changes at a subset of GAD1 regulatory network genes in the human hippocampus in schizophrenia and bipolar disorder. These genes participate in chromatin regulation and cell cycle control, supporting the concept that the established GABAergic dysfunction in these disorders is related to disruption of GABAergic interneuron physiology at specific circuit locations within the human hippocampus.
Darlington, Todd M; McCarthy, Riley D; Cox, Ryan J; Miyamoto-Ditmon, Jill; Gallego, Xavier; Ehringer, Marissa A
2016-01-01
Hedonic substitution, where wheel running reduces voluntary ethanol consumption, has been observed in prior studies. Here we replicate and expand on previous work showing that mice decrease voluntary ethanol consumption and preference when given access to a running wheel. While earlier work has been limited mainly to behavioral studies, here we assess the underlying molecular mechanisms that may account for this interaction. From four groups of female C57BL/6J mice (control, access to two-bottle choice ethanol, access to a running wheel, and access to both two-bottle choice ethanol and a running wheel), mRNA-sequencing of the striatum identified differential gene expression. Many genes in ethanol preference quantitative trait loci were differentially expressed due to running. Furthermore, we conducted Weighted Gene Co-expression Network Analysis and identified gene networks corresponding to each effect behavioral group. Candidate genes for mediating the behavioral interaction between ethanol consumption and wheel running include multiple potassium channel genes, Oprm1, Prkcg, Stxbp1, Crhr1, Gabra3, Slc6a13, Stx1b, Pomc, Rassf5, Polr2a, and Camta2. After observing an overlap of many genes and functional groups previously identified in studies of initial sensitivity to ethanol, we hypothesized that wheel running may induce a change in sensitivity, thereby affecting ethanol consumption. A behavioral study examining Loss of Righting Reflex to ethanol following exercise trended toward supporting this hypothesis. These data provide a rich resource for future studies that may better characterize the observed transcriptional changes in gene networks in response to ethanol consumption and wheel running. PMID:27063791
Diagnosing phenotypes of single-sample individuals by edge biomarkers.
Zhang, Wanwei; Zeng, Tao; Liu, Xiaoping; Chen, Luonan
2015-06-01
Network or edge biomarkers are a reliable way to characterize phenotypes or diseases. However, obtaining edges or correlations between molecules for an individual requires measurement of multiple samples of that individual, which are generally unavailable in clinical practice. Thus, there is strong demand for diagnosing a disease by edge or network biomarkers in a one-sample-for-one-individual context. Here, we developed a new computational framework, EdgeBiomarker, to integrate edge and node biomarkers to diagnose the phenotype of each single test sample. Applied to datasets of lung and breast cancer, the method reveals new marker genes/gene-pairs and related sub-networks for distinguishing earlier and advanced cancer stages. Our method shows advantages over traditional methods: (i) edge biomarkers extracted from non-differentially expressed genes achieve better cross-validation accuracy of diagnosis than molecule or node biomarkers from differentially expressed genes, suggesting that certain pathogenic information is only present at the level of the network and is under-estimated by traditional methods; (ii) edge biomarkers categorize patients into low/high survival rate in a more reliable manner; (iii) edge biomarkers are significantly enriched in relevant biological functions or pathways, implying that the association changes in a network, rather than expression changes in individual molecules, tend to be causally related to cancer development. The new framework of edge biomarkers paves the way for diagnosing diseases and analyzing their molecular mechanisms by edges or networks on a one-sample-for-one-individual basis. This also provides a powerful tool for precision medicine or big-data medicine. © The Author (2015). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, IBCB, SIBS, CAS. All rights reserved.
Genomic Methods for Clinical and Translational Pain Research
Wang, Dan; Kim, Hyungsuk; Wang, Xiao-Min; Dionne, Raymond
2012-01-01
Pain is a complex sensory experience for which the molecular mechanisms are yet to be fully elucidated. Individual differences in pain sensitivity are mediated by a complex network of multiple gene polymorphisms, physiological and psychological processes, and environmental factors. Here, we present the methods for applying unbiased molecular-genetic approaches, genome-wide association study (GWAS), and global gene expression analysis, to help better understand the molecular basis of pain sensitivity in humans and variable responses to analgesic drugs. PMID:22351080
Novel and rare functional genomic variants in multiple autoimmune syndrome and Sjögren's syndrome.
Johar, Angad S; Mastronardi, Claudio; Rojas-Villarraga, Adriana; Patel, Hardip R; Chuah, Aaron; Peng, Kaiman; Higgins, Angela; Milburn, Peter; Palmer, Stephanie; Silva-Lara, Maria Fernanda; Velez, Jorge I; Andrews, Dan; Field, Matthew; Huttley, Gavin; Goodnow, Chris; Anaya, Juan-Manuel; Arcos-Burgos, Mauricio
2015-06-02
Multiple autoimmune syndrome (MAS), an extreme phenotype of autoimmune disorders, is a very well suited trait to tackle genomic variants of these conditions. Whole exome sequencing (WES) is a widely used strategy for detection of protein coding and splicing variants associated with inherited diseases. The DNA of eight patients affected by MAS [all of whom presented with Sjögren's syndrome (SS)], four patients affected by SS alone and 38 unaffected individuals was subjected to WES. Filters to identify novel and rare functional (pathogenic-deleterious) homozygous and/or compound heterozygous variants in these patients and controls were applied. Bioinformatics tools such as the Human gene connectome as well as pathway and network analysis were applied to test overrepresentation of genes harbouring these variants in critical pathways and networks involved in autoimmunity. Eleven novel and rare functional variants were identified in cases but not in controls, harboured in: MACF1, KIAA0754, DUSP12, ICA1, CELA1, LRP1/STAT6, GRIN3B, ANKLE1, TMEM161A, and FKRP. These were subsequently subjected to network analysis and their functional relatedness to genes already associated with autoimmunity was evaluated. Notably, the LRP1/STAT6 novel mutation was homozygous in one MAS affected patient and heterozygous in another. LRP1/STAT6 disclosed the strongest plausibility for autoimmunity. LRP1/STAT6 are involved in extracellular and intracellular anti-inflammatory pathways that play key roles in maintaining the homeostasis of the immune system. Further, network, pathway, and interaction analyses showed that LRP1 is functionally related to the HLA-B and IL10 genes and that it has a substantial impact within immunological pathways and/or reaction to bacterial and other foreign proteins (phagocytosis, regulation of phospholipase A2 activity, negative regulation of apoptosis and response to lipopolysaccharides).
Further, ICA1 and STAT6 were also closely related to AIRE and IRF5, two very well known autoimmunity genes. Novel and rare exonic mutations that may account for autoimmunity were identified. Among those, the LRP1/STAT6 novel mutation has the strongest case for being categorised as potentially causative of MAS given the presence of intriguing patterns of functional interaction with other major genes shaping autoimmunity.
Accurate Encoding and Decoding by Single Cells: Amplitude Versus Frequency Modulation
Micali, Gabriele; Aquino, Gerardo; Richards, David M.; Endres, Robert G.
2015-01-01
Cells sense external concentrations and, via biochemical signaling, respond by regulating the expression of target proteins. Both in signaling networks and gene regulation there are two main mechanisms by which the concentration can be encoded internally: amplitude modulation (AM), where the absolute concentration of an internal signaling molecule encodes the stimulus, and frequency modulation (FM), where the period between successive bursts represents the stimulus. Although both mechanisms have been observed in biological systems, the question of when it is beneficial for cells to use either AM or FM is largely unanswered. Here, we first consider a simple model for a single receptor (or ion channel), which can either signal continuously whenever a ligand is bound, or produce a burst in signaling molecule upon receptor binding. We find that bursty signaling is more accurate than continuous signaling only for sufficiently fast dynamics. This suggests that modulation based on bursts may be more common in signaling networks than in gene regulation. We then extend our model to multiple receptors, where continuous and bursty signaling are equivalent to AM and FM respectively, finding that AM is always more accurate. This implies that the reason some cells use FM is related to factors other than accuracy, such as the ability to coordinate expression of multiple genes or to implement threshold crossing mechanisms. PMID:26030820
Heart morphogenesis gene regulatory networks revealed by temporal expression analysis.
Hill, Jonathon T; Demarest, Bradley; Gorsi, Bushra; Smith, Megan; Yost, H Joseph
2017-10-01
During embryogenesis the heart forms as a linear tube that then undergoes multiple simultaneous morphogenetic events to obtain its mature shape. To understand the gene regulatory networks (GRNs) driving this phase of heart development, during which many congenital heart disease malformations likely arise, we conducted an RNA-seq timecourse in zebrafish from 30 hpf to 72 hpf and identified 5861 genes with altered expression. We clustered the genes by temporal expression pattern, identified transcription factor binding motifs enriched in each cluster, and generated a model GRN for the major gene batteries in heart morphogenesis. This approach predicted hundreds of regulatory interactions and found batteries enriched in specific cell and tissue types, indicating that the approach can be used to narrow the search for novel genetic markers and regulatory interactions. Subsequent analyses confirmed the GRN using two mutants, Tbx5 and nkx2-5, and identified sets of duplicated zebrafish genes that do not show temporal subfunctionalization. This dataset provides an essential resource for future studies on the genetic/epigenetic pathways implicated in congenital heart defects and the mechanisms of cardiac transcriptional regulation. © 2017. Published by The Company of Biologists Ltd.
Hu, Wei; Wang, Lianzhe; Tie, Weiwei; Yan, Yan; Ding, Zehong; Liu, Juhua; Li, Meiying; Peng, Ming; Xu, Biyu; Jin, Zhiqiang
2016-01-01
The basic leucine zipper (bZIP) transcription factors play important roles in multiple biological processes. However, little information is available regarding the bZIP family in the important fruit crop banana. In this study, 121 bZIP transcription factor genes were identified in the banana genome. Phylogenetic analysis showed that MabZIPs were classified into 11 subfamilies. The majority of MabZIP genes in the same subfamily shared similar gene structures and conserved motifs. The comprehensive transcriptome analysis of two banana genotypes revealed the differential expression patterns of MabZIP genes in different organs, in various stages of fruit development and ripening, and in responses to abiotic stresses, including drought, cold, and salt. Interaction networks and co-expression assays showed that group A MabZIP-mediated networks participated in various stress signaling pathways and were strongly activated in Musa ABB Pisang Awak. This study provided new insights into the complicated transcriptional control of MabZIP genes and provided robust tissue-specific, development-dependent, and abiotic stress-responsive candidate MabZIP genes for potential applications in the genetic improvement of banana cultivars. PMID:27445085
Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits.
Zhang, Futao; Xie, Dan; Liang, Meimei; Xiong, Momiao
2016-04-01
To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI's Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.
Ye, R; Carneiro, A M D; Han, Q; Airey, D; Sanders-Bush, E; Zhang, B; Lu, L; Williams, R; Blakely, R D
2014-03-01
Presynaptic serotonin (5-hydroxytryptamine, 5-HT) transporters (SERT) regulate 5-HT signaling via antidepressant-sensitive clearance of released neurotransmitter. Polymorphisms in the human SERT gene (SLC6A4) have been linked to risk for multiple neuropsychiatric disorders, including depression, obsessive-compulsive disorder and autism. Using BXD recombinant inbred mice, a genetic reference population that can support the discovery of novel determinants of complex traits, merging collective trait assessments with bioinformatics approaches, we examine phenotypic and molecular networks associated with SERT gene and protein expression. Correlational analyses revealed a network of genes that significantly associated with SERT mRNA levels. We quantified SERT protein expression levels and identified region- and gender-specific quantitative trait loci (QTLs), one of which associated with male midbrain SERT protein expression, centered on the protocadherin-15 gene (Pcdh15), overlapped with a QTL for midbrain 5-HT levels. Pcdh15 was also the only QTL-associated gene whose midbrain mRNA expression significantly associated with both SERT protein and 5-HT traits, suggesting an unrecognized role of the cell adhesion protein in the development or function of 5-HT neurons. To test this hypothesis, we assessed SERT protein and 5-HT traits in the Pcdh15 functional null line (Pcdh15(av-3J)), studies that revealed a strong, negative influence of Pcdh15 on these phenotypes. Together, our findings illustrate the power of multidimensional profiling of recombinant inbred lines in the analysis of molecular networks that support synaptic signaling, and that, as in the case of Pcdh15, can reveal novel relationships that may underlie risk for mental illness. © 2014 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.
Peng, Hui; Lan, Chaowang; Zheng, Yi; Hutvagner, Gyorgy; Tao, Dacheng; Li, Jinyan
2017-03-24
MicroRNAs always function cooperatively in their regulation of gene expression. Dysfunctions of these co-functional microRNAs can play significant roles in disease development. We are interested in those multi-disease associated co-functional microRNAs that regulate their common dysfunctional target genes cooperatively in the development of multiple diseases. The research is potentially useful for human disease studies at the transcriptional level and for the study of multi-purpose microRNA therapeutics. We designed a computational method to detect multi-disease associated co-functional microRNA pairs and conducted cross disease analysis on a reconstructed disease-gene-microRNA (DGR) tripartite network. The DGR tripartite network is constructed by integrating newly predicted disease-microRNA associations with the relationships among diseases, microRNAs and genes maintained in existing databases. The prediction method uses a set of reliable negative samples of disease-microRNA association and a pre-computed kernel matrix instead of kernel functions. From this reconstructed DGR tripartite network, multi-disease associated co-functional microRNA pairs are detected together with their common dysfunctional target genes and ranked by a novel scoring method. We also conducted proof-of-concept case studies on cancer-related co-functional microRNA pairs as well as on non-cancer disease-related microRNA pairs. With the prioritization of the co-functional microRNAs that relate to a series of diseases, we found that the co-function phenomenon is not unusual. We also confirmed that the regulation of the microRNAs for the development of cancers is more complex and has more unique properties than that of non-cancer diseases.
Lynx web services for annotations and systems analysis of multi-gene disorders.
Sulakhe, Dinanath; Taylor, Andrew; Balasubramanian, Sandhya; Feng, Bo; Xie, Bingqing; Börnigen, Daniela; Dave, Utpal J; Foster, Ian T; Gilliam, T Conrad; Maltsev, Natalia
2014-07-01
Lynx is a web-based integrated systems biology platform that supports annotation and analysis of experimental data and generation of weighted hypotheses on molecular mechanisms contributing to human phenotypes and disorders of interest. Lynx has integrated multiple classes of biomedical data (genomic, proteomic, pathways, phenotypic, toxicogenomic, contextual and others) from various public databases as well as manually curated data from our group and collaborators (LynxKB). Lynx provides tools for gene list enrichment analysis using multiple functional annotations and network-based gene prioritization. Lynx provides access to the integrated database and the analytical tools via REST based Web Services (http://lynx.ci.uchicago.edu/webservices.html). This comprises data retrieval services for specific functional annotations, services to search across the complete LynxKB (powered by Lucene), and services to access the analytical tools built within the Lynx platform. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Ficklin, Stephen P.; Feltus, Frank Alex
2013-01-01
Many traits of biological and agronomic significance in plants are controlled in a complex manner where multiple genes and environmental signals affect the expression of the phenotype. In Oryza sativa (rice), thousands of quantitative genetic signals have been mapped to the rice genome. In parallel, thousands of gene expression profiles have been generated across many experimental conditions. Through the discovery of networks with real gene co-expression relationships, it is possible to identify co-localized genetic and gene expression signals that implicate complex genotype-phenotype relationships. In this work, we used a knowledge-independent, systems genetics approach to discover a high-quality set of co-expression networks, termed Gene Interaction Layers (GILs). Twenty-two GILs were constructed from 1,306 Affymetrix microarray rice expression profiles that were pre-clustered to allow for improved capture of gene co-expression relationships. Functional genomic and genetic data, including over 8,000 QTLs and 766 phenotype-tagged SNPs (p-value ≤ 0.001) from genome-wide association studies, both covering over 230 different rice traits, were integrated with the GILs. An online systems genetics data-mining resource, the GeneNet Engine, was constructed to enable dynamic discovery of gene sets (i.e. network modules) that overlap with genetic traits. GeneNet Engine does not provide the exact set of genes underlying a given complex trait, but through the evidence of gene-marker correspondence, co-expression, and functional enrichment, site visitors can identify genes with potential shared causality for a trait which could then be used for experimental validation. A set of 2 million SNPs was incorporated into the database and serves as a potential set of testable biomarkers for genes in modules that overlap with genetic traits.
Herein, we describe two modules found using GeneNet Engine, one with significant overlap with the trait amylose content and another with significant overlap with blast disease resistance. PMID:23874666
Jones, Anya C; Troy, Niamh M; White, Elisha; Hollams, Elysia M; Gout, Alexander M; Ling, Kak-Ming; Kicic, Anthony; Stick, Stephen M; Sly, Peter D; Holt, Patrick G; Hall, Graham L; Bosco, Anthony
2018-01-24
Atopic asthma is a persistent disease characterized by intermittent wheeze and progressive loss of lung function. The disease is thought to be driven primarily by chronic aeroallergen-induced type 2-associated inflammation. However, the vast majority of atopics do not develop asthma despite ongoing aeroallergen exposure, suggesting additional mechanisms operate in conjunction with type 2 immunity to drive asthma pathogenesis. We employed RNA-Seq profiling of sputum-derived cells to identify gene networks operative at baseline in house dust mite-sensitized (HDM(S)) subjects with/without wheezing history that are characteristic of the ongoing asthmatic state. The expression of type 2 effectors (IL-5, IL-13) was equivalent in both cohorts of subjects. However, in HDM(S)-wheezers they were associated with upregulation of two coexpression modules comprising multiple type 2- and epithelial-associated genes. The first module was interlinked by the hubs EGFR, ERBB2, CDH1 and IL-13. The second module was associated with CDHR3 and mucociliary clearance genes. Our findings provide new insight into the molecular mechanisms operative at baseline in the airway mucosa in atopic asthmatics undergoing natural aeroallergen exposure, and suggest that susceptibility to asthma amongst these subjects involves complex interactions between type 2- and epithelial-associated gene networks, which are not operative in equivalently sensitized/exposed atopic non-asthmatics.
Genome network medicine: innovation to overcome huge challenges in cancer therapy.
Roukos, Dimitrios H
2014-01-01
The post-ENCODE era is shaping a new direction in biomedical research: understanding the transcriptional and signaling networks that drive gene expression and core cellular processes such as cell fate, survival, and apoptosis. Over the past half century, Francis Crick's 'central dogma' of a single gene/protein-phenotype (trait/disease) relationship has defined biology, human physiology, disease, diagnostics, and drug discovery. However, the ENCODE project and several other genomic studies, using high-throughput sequencing technologies, computational strategies, and imaging techniques to visualize regulatory networks, provide evidence that transcription and gene expression are regulated by highly complex, dynamic molecular and signaling networks. This Focus article describes the limitations of diagnostics and therapeutics based on linear experimentation in curing advanced cancer, and the need to move from reductionist to network-based approaches. Given the evident wide genomic heterogeneity, the power and challenges of next-generation sequencing (NGS) technologies in identifying a patient's personal mutational landscape, so that the best targeted drugs can be tailored to the individual patient, are discussed. However, the available drugs are not capable of targeting aberrant signaling networks, and functional transcriptional heterogeneity and functional genome organization remain poorly understood. Therefore, future clinical genome network medicine, which aims to overcome multiple problems in the new fields of regulatory DNA mapping, noncoding RNA, enhancer RNAs, and the dynamic complexity of transcriptional circuitry, is also discussed, with the expectation of new innovative technology and a strong appreciation of clinical data and evidence-based medicine. The problems and potential solutions in the discovery of next-generation biomarkers and drugs based on molecular and signaling circuitry are explored. © 2013 Wiley Periodicals, Inc.
BiologicalNetworks 2.0 - an integrative view of genome biology data
2010-01-01
Background A significant problem in the study of mechanisms of an organism's development is the elucidation of interrelated factors which are making an impact on the different levels of the organism, such as genes, biological molecules, cells, and cell systems. Numerous sources of heterogeneous data which exist for these subsystems are not yet sufficiently integrated to give researchers a straightforward opportunity to analyze them together in the same frame of study. Systematic application of data integration methods is also hampered by many factors, such as the orthogonal nature of the integrated data and naming problems. Results Here we report on a new version of BiologicalNetworks, a research environment for the integral visualization and analysis of heterogeneous biological data. BiologicalNetworks can be queried for properties of thousands of different types of biological entities (genes/proteins, promoters, COGs, pathways, binding sites, and others) and their relations (interactions, co-expression, co-citations, and others). The system includes the build-pathways infrastructure for molecular interactions/relations and module discovery in high-throughput experiments. Also implemented in BiologicalNetworks are the Integrated Genome Viewer and Comparative Genomics Browser applications, which allow for the search and analysis of gene regulatory regions and their conservation in multiple species in conjunction with molecular pathways/networks, experimental data and functional annotations. Conclusions The new release of BiologicalNetworks together with its back-end database introduces extensive functionality for a more efficient integrated multi-level analysis of microarray, sequence, regulatory, and other data. BiologicalNetworks is freely available at http://www.biologicalnetworks.org. PMID:21190573
Gérard, Claude; Novák, Béla
2013-01-01
microRNAs (miRNAs) are small noncoding RNAs that are important post-transcriptional regulators of gene expression. miRNAs can induce thresholds in protein synthesis. Such thresholds in protein output can be also achieved by oligomerization of transcription factors (TF) for the control of gene expression. First, we propose a minimal model for protein expression regulated by miRNA and by oligomerization of TF. We show that miRNA and oligomerization of TF generate a buffer, which increases the robustness of protein output towards molecular noise as well as towards random variation of kinetics parameters. Next, we extend the model by considering that the same miRNA can bind to multiple messenger RNAs, which accounts for the dynamics of a minimal competing endogenous RNAs (ceRNAs) network. The model shows that, through common miRNA regulation, TF can control the expression of all proteins formed by the ceRNA network, even if it drives the expression of only one gene in the network. The model further suggests that the threshold in protein synthesis mediated by the oligomerization of TF can be propagated to the other genes, which can increase the robustness of the expression of all genes in such ceRNA network. Furthermore, we show that a miRNA could increase the time delay of a “Goodwin-like” oscillator model, which may favor the occurrence of oscillations of large amplitude. This result predicts important roles of miRNAs in the control of the molecular mechanisms leading to the emergence of biological rhythms. Moreover, a model for the latter oscillator embedded in a ceRNA network indicates that the oscillatory behavior can be propagated, via the shared miRNA, to all proteins formed by such ceRNA network. Thus, by means of computational models, we show that miRNAs could act as vectors allowing the propagation of robustness in protein synthesis as well as oscillatory behaviors within ceRNA networks. PMID:24376695
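The "Goodwin-like" oscillator mentioned above can be illustrated with a minimal sketch of the classical three-variable Goodwin negative-feedback loop. This is not the authors' model: it omits the miRNA and ceRNA terms entirely, and the parameter values (a, b, n), variable names, and the fixed-step Runge-Kutta integrator are assumptions chosen only for illustration.

```python
def goodwin_rhs(state, a=1.0, b=0.1, n=10):
    """Right-hand side of a classical Goodwin-type loop (illustrative parameters).

    x: mRNA, y: protein, z: end-product repressor that inhibits x synthesis.
    """
    x, y, z = state
    dx = a / (1.0 + z ** n) - b * x  # synthesis repressed by z (Hill exponent n)
    dy = x - b * y                   # translation minus degradation
    dz = y - b * z                   # repressor production minus degradation
    return (dx, dy, dz)


def rk4_step(f, state, dt):
    """One fixed-step fourth-order Runge-Kutta update."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (p + 2.0 * q + 2.0 * r + w)
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )


def simulate(f, state, dt, steps):
    """Return the trajectory as a list of states, including the initial one."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(f, state, dt)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    traj = simulate(goodwin_rhs, (0.1, 0.1, 0.1), dt=0.01, steps=5000)
    print(traj[-1])
```

A delay-increasing intermediate, such as the miRNA step the abstract describes, could be explored by adding a further variable to the loop; that extension is left out here because the abstract does not specify its equations.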
Fuentes, Nathalie; Roy, Arpan; Mishra, Vikas; Cabello, Noe; Silveyra, Patricia
2018-05-08
Sex differences in the incidence and prognosis of respiratory diseases have been reported. Studies have shown that women are at greater risk of adverse health outcomes from air pollution than men, but sex-specific immune gene expression patterns and regulatory networks have not been well studied in the lung. MicroRNAs (miRNAs) are environmentally sensitive posttranscriptional regulators of gene expression that may mediate the damaging effects of inhaled pollutants in the lung, by altering the expression of innate immunity molecules. Male and female mice of the C57BL/6 background were exposed to 2 ppm of ozone or filtered air (control) for 3 h. Female mice were also exposed at different stages of the estrous cycle. Following exposure, lungs were harvested and total RNA was extracted. We used PCR arrays to study sex differences in the expression of 84 miRNAs predicted to target inflammatory and immune genes. We identified differentially expressed miRNA signatures in the lungs of male vs. female mice exposed to ozone. In silico pathway analyses identified sex-specific biological networks affected by exposure to ozone that ranged from direct predicted gene targeting to complex interactions with multiple intermediates. We also identified differences in miRNA expression and predicted regulatory networks in females exposed to ozone at different estrous cycle stages. Our results indicate that both sex and hormonal status can influence lung miRNA expression in response to ozone exposure, indicating that sex-specific miRNA regulation of inflammatory gene expression could mediate differential pollution-induced health outcomes in men and women.
Lawless, Nathan; Reinhardt, Timothy A; Bryan, Kenneth; Baker, Mike; Pesch, Bruce; Zimmerman, Duane; Zuelke, Kurt; Sonstegard, Tad; O'Farrelly, Cliona; Lippolis, John D; Lynn, David J
2014-01-27
Bovine mastitis is an inflammation-driven disease of the bovine mammary gland that costs the global dairy industry several billion dollars per year. Because disease susceptibility is a multifactorial complex phenotype, an integrative biology approach is required to dissect the molecular networks involved. Here, we report such an approach using next-generation sequencing combined with advanced network and pathway biology methods to simultaneously profile mRNA and miRNA expression at multiple time points (0, 12, 24, 36 and 48 hr) in milk and blood FACS-isolated CD14(+) monocytes from animals infected in vivo with Streptococcus uberis. More than 3700 differentially expressed (DE) genes were identified in milk-isolated monocytes (MIMs), a key immune cell recruited to the site of infection during mastitis. Upregulated genes were significantly enriched for inflammatory pathways, whereas downregulated genes were enriched for nonglycolytic metabolic pathways. Monocyte transcriptional changes in the blood, however, were more subtle but highlighted the impact of this infection systemically. Genes upregulated in blood-isolated monocytes (BIMs) showed a significant association with interferon and chemokine signaling. Furthermore, 26 miRNAs were DE in MIMs and three were DE in BIMs. Pathway analysis revealed that predicted targets of downregulated miRNAs were highly enriched for roles in innate immunity (FDR < 3.4E-8), particularly TLR signaling, whereas upregulated miRNAs preferentially targeted genes involved in metabolism. We conclude that during S. uberis infection miRNAs are key amplifiers of monocyte inflammatory response networks and repressors of several metabolic pathways. Copyright © 2014 Lawless et al.
Mathur, Deepali; María-Lafuente, Eva; Ureña-Peralta, Juan R.; Sorribes, Lucas; Hernández, Alberto; Casanova, Bonaventura; López-Rodas, Gerardo; Coret-Ferrer, Francisco; Burgal-Marti, Maria
2017-01-01
Axonal damage is widely accepted as a major cause of permanent functional disability in Multiple Sclerosis (MS). In relapsing-remitting MS, there is a possibility of remyelination by myelin producing cells and restoration of neurological function. The purpose of this study was to delineate the pathophysiological mechanisms underpinning axonal injury through hitherto unknown factors present in cerebrospinal fluid (CSF) that may regulate axonal damage, remyelinate the axon and make functional recovery possible. We employed primary cultures of rat unmyelinated cerebellar granule neurons and treated them with CSF obtained from MS and Neuromyelitis optica (NMO) patients. We performed microarray gene expression profiling to study changes in gene expression in treated neurons as compared to controls. Additionally, we determined the influence of gene-gene interaction upon the whole metabolic network in our experimental conditions using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) program. Our findings revealed the downregulated expression of genes involved in glucose metabolism in MS-derived CSF-treated neurons and upregulated expression of genes in NMO-derived CSF-treated neurons. We conclude that factors in the CSF of these patients caused a perturbation in metabolic gene(s) expression and suggest that MS appears to be linked with metabolic deformity. PMID:29267205
Yi, Ming; Mudunuri, Uma; Che, Anney; Stephens, Robert M
2009-06-29
One of the challenges in the analysis of microarray data is to integrate and compare the selected (e.g., differential) gene lists from multiple experiments for common or unique underlying biological themes. A common way to approach this problem is to extract common genes from these gene lists and then subject these genes to enrichment analysis to reveal the underlying biology. However, the capacity of this approach is largely restricted by the limited number of common genes shared by datasets from multiple experiments, which could be caused by the complexity of the biological system itself. We now introduce a new Pathway Pattern Extraction Pipeline (PPEP), which extends the existing WPS application by providing a new pathway-level comparative analysis scheme. To facilitate comparing and correlating results from different studies and sources, PPEP contains new interfaces that allow evaluation of the pathway-level enrichment patterns across multiple gene lists. As an exploratory tool, this analysis pipeline may help reveal the underlying biological themes at both the pathway and gene levels. The analysis scheme provided by PPEP begins with multiple gene lists, which may be derived from different studies in terms of the biological contexts, applied technologies, or methodologies. These lists are then subjected to pathway-level comparative analysis for extraction of pathway-level patterns. This analysis pipeline helps to explore the commonality or uniqueness of these lists at the level of pathways or biological processes from different but relevant biological systems using a combination of statistical enrichment measurements, pathway-level pattern extraction, and graphical display of the relationships of genes and their associated pathways as Gene-Term Association Networks (GTANs) within the WPS platform. As a proof of concept, we have used the new method to analyze many datasets from our collaborators as well as some public microarray datasets. 
This tool provides a new pathway-level analysis scheme for integrative and comparative analysis of data derived from different but relevant systems. The tool is freely available as a Pathway Pattern Extraction Pipeline implemented in our existing software package WPS, which can be obtained at http://www.abcc.ncifcrf.gov/wps/wps_index.php.
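The pathway-level comparison across multiple gene lists described above can be sketched as follows. The abstract does not say which enrichment statistic PPEP uses; the one-sided hypergeometric test below is a common choice and is an assumption here, as are all function names, gene identifiers, and pathway contents.

```python
from math import comb


def enrichment_pvalue(population, pathway_size, list_size, overlap):
    """P(X >= overlap) for a hypergeometric draw of list_size genes.

    population: genes in the background; pathway_size: genes annotated to the
    pathway; overlap: genes shared by the pathway and the gene list.
    """
    upper = min(pathway_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(population, list_size)


def pathway_pattern(gene_lists, pathways, universe):
    """Return {pathway: {list_name: p-value}} for side-by-side comparison."""
    pattern = {}
    for pathway_name, members in pathways.items():
        members = members & universe
        row = {}
        for list_name, genes in gene_lists.items():
            genes = genes & universe
            row[list_name] = enrichment_pvalue(
                len(universe), len(members), len(genes), len(genes & members)
            )
        pattern[pathway_name] = row
    return pattern


if __name__ == "__main__":
    # Hypothetical background, pathway, and gene lists for illustration only.
    universe = {f"g{i}" for i in range(10)}
    pathways = {"pathway_P": {"g0", "g1", "g2", "g3"}}
    gene_lists = {"study_A": {"g0", "g1", "g9"}, "study_B": {"g7", "g8", "g9"}}
    print(pathway_pattern(gene_lists, pathways, universe))
```

In this toy input the two lists share only one gene, yet the pathway row still shows how each list relates to the pathway, which is the motivation the abstract gives for comparing at the pathway level rather than intersecting gene lists.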
Mohsenizadeh, Daniel N; Dehghannasiri, Roozbeh; Dougherty, Edward R
2018-01-01
In systems biology, network models are often used to study interactions among cellular components, a salient aim being to develop drugs and therapeutic mechanisms to change the dynamical behavior of the network to avoid undesirable phenotypes. Owing to limited knowledge, model uncertainty is commonplace and network dynamics can be updated in different ways, thereby giving multiple dynamic trajectories, that is, dynamics uncertainty. In this manuscript, we propose an experimental design method that can effectively reduce the dynamics uncertainty and improve performance in an interaction-based network. Both dynamics uncertainty and experimental error are quantified with respect to the modeling objective, herein, therapeutic intervention. The aim of experimental design is to select among a set of candidate experiments the experiment whose outcome, when applied to the network model, maximally reduces the dynamics uncertainty pertinent to the intervention objective.
Dowell, Karen G; Simons, Allen K; Bai, Hao; Kell, Braden; Wang, Zack Z; Yun, Kyuson; Hibbs, Matthew A
2014-05-01
Embryonic stem cells (ESCs), characterized by their ability to both self-renew and differentiate into multiple cell lineages, are a powerful model for biomedical research and developmental biology. Human and mouse ESCs share many features, yet have distinctive aspects, including fundamental differences in the signaling pathways and cell cycle controls that support self-renewal. Here, we explore the molecular basis of human ESC self-renewal using Bayesian network machine learning to integrate cell-type-specific, high-throughput data for gene function discovery. We integrated high-throughput ESC data from 83 human studies (~1.8 million data points collected under 1,100 conditions) and 62 mouse studies (~2.4 million data points collected under 1,085 conditions) into separate human and mouse predictive networks focused on ESC self-renewal to analyze shared and distinct functional relationships among protein-coding gene orthologs. Computational evaluations show that these networks are highly accurate, literature validation confirms their biological relevance, and reverse transcriptase polymerase chain reaction (RT-PCR) validation supports our predictions. Our results reflect the importance of key regulatory genes known to be strongly associated with self-renewal and pluripotency in both species (e.g., POU5F1, SOX2, and NANOG), identify metabolic differences between species (e.g., threonine metabolism), clarify differences between human and mouse ESC developmental signaling pathways (e.g., leukemia inhibitory factor (LIF)-activated JAK/STAT in mouse; NODAL/ACTIVIN-A-activated fibroblast growth factor in human), and reveal many novel genes and pathways predicted to be functionally associated with self-renewal in each species. These interactive networks are available online at www.StemSight.org for stem cell researchers to develop new hypotheses, discover potential mechanisms involving sparsely annotated genes, and prioritize genes of interest for experimental validation. 
© 2013 AlphaMed Press.
Ghosh, Sujoy; Vivar, Juan; Nelson, Christopher P; Willenborg, Christina; Segrè, Ayellet V; Mäkinen, Ville-Petteri; Nikpay, Majid; Erdmann, Jeannette; Blankenberg, Stefan; O'Donnell, Christopher; März, Winfried; Laaksonen, Reijo; Stewart, Alexandre FR; Epstein, Stephen E; Shah, Svati H; Granger, Christopher B; Hazen, Stanley L; Kathiresan, Sekar; Reilly, Muredach P; Yang, Xia; Quertermous, Thomas; Samani, Nilesh J; Schunkert, Heribert; Assimes, Themistocles L; McPherson, Ruth
2016-01-01
Objective Genome-wide association (GWA) studies have identified multiple genetic variants affecting the risk of coronary artery disease (CAD). However, individually these explain only a small fraction of the heritability of CAD and for most, the causal biological mechanisms remain unclear. We sought to obtain further insights into potential causal processes of CAD by integrating large-scale GWA data with expertly curated databases of core human pathways and functional networks. Approaches and Results Employing pathways (gene sets) from Reactome, we carried out a two-stage gene set enrichment analysis strategy. From a meta-analyzed discovery cohort of 7 CAD GWAS data sets (9,889 cases/11,089 controls), nominally significant gene-sets were tested for replication in a meta-analysis of 9 additional studies (15,502 cases/55,730 controls) from the CARDIoGRAM Consortium. A total of 32 of 639 Reactome pathways tested showed convincing association with CAD (replication p<0.05). These pathways resided in 9 of 21 core biological processes represented in Reactome, and included pathways relevant to extracellular matrix integrity, innate immunity, axon guidance, and signaling by PDGF, NOTCH, and the TGF-β/SMAD receptor complex. Many of these pathways had strengths of association comparable to those observed in lipid transport pathways. Network analysis of unique genes within the replicated pathways further revealed several interconnected functional and topologically interacting modules representing novel associations (e.g. semaphorin regulated axonal guidance pathway) besides confirming known processes (lipid metabolism). The connectivity in the observed networks was statistically significant compared to random networks (p<0.001). Network centrality analysis (‘degree’ and ‘betweenness’) further identified genes (e.g. NCAM1, FYN, FURIN etc.) likely to play critical roles in the maintenance and functioning of several of the replicated pathways.
Conclusions These findings provide novel insights into how genetic variation, interpreted in the context of biological processes and functional interactions among genes, may help define the genetic architecture of CAD. PMID:25977570
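The 'degree' and 'betweenness' centrality measures named above can be sketched on a toy network. The gene names NCAM1, FYN and FURIN come from the abstract, but the edges, the placeholder nodes GENE_A and GENE_B, and the function names are invented for illustration and do not reflect the study's network. Betweenness is computed with Brandes' algorithm for unweighted, undirected graphs.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Number of direct neighbors per node (unnormalized)."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized shortest-path betweenness (Brandes, unweighted graph)."""
    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        order = []                                # nodes by increasing distance
        preds = {node: [] for node in adj}        # shortest-path predecessors
        paths = dict.fromkeys(adj, 0)             # number of shortest paths
        dist = dict.fromkeys(adj, -1)
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                              # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dependency = dict.fromkeys(adj, 0.0)
        for w in reversed(order):                 # back-propagate dependencies
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1.0 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each unordered pair is visited from both endpoints in an undirected graph.
    return {node: value / 2.0 for node, value in score.items()}


if __name__ == "__main__":
    # Hypothetical edges; not taken from the study.
    edges = [("FYN", "NCAM1"), ("FYN", "FURIN"),
             ("FYN", "GENE_A"), ("GENE_A", "GENE_B")]
    adj = build_adjacency(edges)
    print(degree_centrality(adj))
    print(betweenness_centrality(adj))
```

In this toy graph the node with the most neighbors also carries the most shortest paths, but the two measures can disagree in larger networks, which is presumably why the abstract reports both.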
Meyer, Miriah; Wunderlich, Zeba; Simirenko, Lisa; Luengo Hendriks, Cris L.; Keränen, Soile V. E.; Henriquez, Clara; Knowles, David W.; Biggin, Mark D.; Eisen, Michael B.; DePace, Angela H.
2011-01-01
Differences in the level, timing, or location of gene expression can contribute to alternative phenotypes at the molecular and organismal level. Understanding the origins of expression differences is complicated by the fact that organismal morphology and gene regulatory networks could potentially vary even between closely related species. To assess the scope of such changes, we used high-resolution imaging methods to measure mRNA expression in blastoderm embryos of Drosophila yakuba and Drosophila pseudoobscura and assembled these data into cellular resolution atlases, where expression levels for 13 genes in the segmentation network are averaged into species-specific, cellular resolution morphological frameworks. We demonstrate that the blastoderm embryos of these species differ in their morphology in terms of size, shape, and number of nuclei. We present an approach to compare cellular gene expression patterns between species, while accounting for varying embryo morphology, and apply it to our data and an equivalent dataset for Drosophila melanogaster. Our analysis reveals that all individual genes differ quantitatively in their spatio-temporal expression patterns between these species, primarily in terms of their relative position and dynamics. Despite many small quantitative differences, cellular gene expression profiles for the whole set of genes examined are largely similar. This suggests that cell types at this stage of development are conserved, though they can differ in their relative position by up to 3–4 cell widths and in their relative proportion between species by as much as 5-fold. Quantitative differences in the dynamics and relative level of a subset of genes between corresponding cell types may reflect altered regulatory functions between species. 
Our results emphasize that transcriptional networks can diverge over short evolutionary timescales and that even small changes can lead to distinct output in terms of the placement and number of equivalent cells. PMID:22046143
Modise, David M.; Gemeildien, Junaid; Ndimba, Bongani K.; Christoffels, Alan
2018-01-01
Background Crop response to the changing climate and unpredictable effects of global warming with adverse conditions such as drought stress has brought concerns about food security to the fore; crop yield loss is a major cause of concern in this regard. Identification of genes with multiple responses across environmental stresses is the genetic foundation that leads to crop adaptation to environmental perturbations. Methods In this paper, we introduce an integrated approach to assess candidate genes for multiple stress responses across species. The approach combines ontology-based semantic data integration with expression profiling, comparative genomics, phylogenomics, functional gene enrichment and gene enrichment network analysis to identify genes associated with plant stress phenotypes. Five different ontologies, viz., Gene Ontology (GO), Trait Ontology (TO), Plant Ontology (PO), Growth Ontology (GRO) and Environment Ontology (EO) were used to semantically integrate drought-related information. Results Target genes linked to Quantitative Trait Loci (QTLs) controlling yield and stress tolerance in sorghum (Sorghum bicolor (L.) Moench) and closely related species were identified. Based on the enriched GO terms of the biological processes, 1116 sorghum genes with potential responses to 5 different stresses, such as drought (18%), salt (32%), cold (20%), heat (8%) and oxidative stress (25%) were identified to be over-expressed. Out of 169 sorghum genes associated with drought-responsive QTLs that were identified based on expression datasets, 56% were shown to have multiple stress responses. On the other hand, out of 168 additional genes that have been evaluated for orthologous pairs, 90% were conserved across species for drought tolerance. Over 50% of identified maize and rice genes were responsive to drought and salt stresses and were co-located within multifunctional QTLs. 
Among the total identified multi-stress responsive genes, 272 targets were shown to be co-localized within QTLs associated with different traits that are responsive to multiple stresses. Ontology mapping was used to validate the identified genes, while reconstruction of the phylogenetic tree was instrumental in inferring the evolutionary relationship of the sorghum orthologs. The results also show specific genes responsible for various interrelated components of the drought response mechanism such as drought tolerance, drought avoidance and drought escape. Conclusions We submit that this approach is novel and, to our knowledge, has not been used previously in any other research; it enables us to perform cross-species queries for genes that are likely to be associated with multiple stress tolerance, as a means to identify novel targets for engineering stress resistance in sorghum and possibly, in other crop species. PMID:29590108
Wang, Zhishi; Craven, Mark; Newton, Michael A.; Ahlquist, Paul
2013-01-01
Systematic, genome-wide RNA interference (RNAi) analysis is a powerful approach to identify gene functions that support or modulate selected biological processes. An emerging challenge shared with some other genome-wide approaches is that independent RNAi studies often show limited agreement in their lists of implicated genes. To better understand this, we analyzed four genome-wide RNAi studies that identified host genes involved in influenza virus replication. These studies collectively identified and validated the roles of 614 cell genes, but pair-wise overlap among the four gene lists was only 3% to 15% (average 6.7%). However, a number of functional categories were overrepresented in multiple studies. The pair-wise overlap of these enriched-category lists was high, ∼19%, implying more agreement among studies than apparent at the gene level. Probing this further, we found that the gene lists implicated by independent studies were highly connected in interacting networks by independent functional measures such as protein-protein interactions, at rates significantly higher than predicted by chance. We also developed a general, model-based approach to gauge the effects of false-positive and false-negative factors and to estimate, from a limited number of studies, the total number of genes involved in a process. For influenza virus replication, this novel statistical approach estimates the total number of cell genes involved to be ∼2,800. This and multiple other aspects of our experimental and computational results imply that, when following good quality control practices, the low overlap between studies is primarily due to false negatives rather than false-positive gene identifications. These results and methods have implications for and applications to multiple forms of genome-wide analysis. PMID:24068911
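The pair-wise overlap figures quoted in the abstract above can be illustrated with a small sketch. The abstract does not say how overlap was computed, so the Jaccard index used here (shared genes divided by the union of the two lists) is an assumption, and the study names and gene identifiers are hypothetical toy data, not the published gene lists.

```python
from itertools import combinations


def pairwise_overlap(gene_lists):
    """Return {(study_a, study_b): Jaccard overlap} for every pair of studies.

    Assumption: overlap = |A & B| / |A | B|. The source abstract does not
    state which overlap definition the authors used.
    """
    result = {}
    for (name_a, genes_a), (name_b, genes_b) in combinations(gene_lists.items(), 2):
        set_a, set_b = set(genes_a), set(genes_b)
        union = set_a | set_b
        result[(name_a, name_b)] = len(set_a & set_b) / len(union) if union else 0.0
    return result


# Hypothetical toy gene lists, for illustration only.
studies = {
    "study_A": ["G1", "G2", "G3", "G4"],
    "study_B": ["G3", "G4", "G5"],
    "study_C": ["G6", "G1"],
}

overlaps = pairwise_overlap(studies)
mean_overlap = sum(overlaps.values()) / len(overlaps)
```

The same function could be applied to lists of enriched functional categories instead of genes, which is the kind of comparison the abstract reports as showing higher agreement.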
Evolutionary Origins of Cancer Driver Genes and Implications for Cancer Prognosis
Chu, Xin-Yi; Zhou, Xiong-Hui; Cui, Ze-Jia; Zhang, Hong-Yu
2017-01-01
The cancer atavistic theory suggests that carcinogenesis is a reverse evolution process. It is thus of great interest to explore the evolutionary origins of cancer driver genes and the relevant mechanisms underlying the carcinogenesis. Moreover, the evolutionary features of cancer driver genes could be helpful in selecting cancer biomarkers from high-throughput data. In this study, through analyzing the cancer endogenous molecular networks, we revealed that the subnetwork originating from eukaryota could control the unlimited proliferation of cancer cells, and the subnetwork originating from eumetazoa could recapitulate the other hallmarks of cancer. In addition, investigations based on multiple datasets revealed that cancer driver genes were enriched in genes originating from eukaryota, opisthokonta, and eumetazoa. These results have important implications for enhancing the robustness of cancer prognosis models through selecting the gene signatures by the gene age information. PMID:28708071
Evolutionary Origins of Cancer Driver Genes and Implications for Cancer Prognosis.
Chu, Xin-Yi; Jiang, Ling-Han; Zhou, Xiong-Hui; Cui, Ze-Jia; Zhang, Hong-Yu
2017-07-14
The cancer atavistic theory suggests that carcinogenesis is a reverse evolution process. It is thus of great interest to explore the evolutionary origins of cancer driver genes and the relevant mechanisms underlying the carcinogenesis. Moreover, the evolutionary features of cancer driver genes could be helpful in selecting cancer biomarkers from high-throughput data. In this study, through analyzing the cancer endogenous molecular networks, we revealed that the subnetwork originating from eukaryota could control the unlimited proliferation of cancer cells, and the subnetwork originating from eumetazoa could recapitulate the other hallmarks of cancer. In addition, investigations based on multiple datasets revealed that cancer driver genes were enriched in genes originating from eukaryota, opisthokonta, and eumetazoa. These results have important implications for enhancing the robustness of cancer prognosis models through selecting the gene signatures by the gene age information.
Telonis-Scott, Marina; Sgrò, Carla M.; Hoffmann, Ary A.; Griffin, Philippa C.
2016-01-01
Repeated attempts to map the genomic basis of complex traits often yield different outcomes because of the influence of genetic background, gene-by-environment interactions, and/or statistical limitations. However, where repeatability is low at the level of individual genes, overlap often occurs in gene ontology categories, genetic pathways, and interaction networks. Here we report on the genomic overlap for natural desiccation resistance from a Pool-genome-wide association study experiment and a selection experiment in flies collected from the same region in southeastern Australia in different years. We identified over 600 single nucleotide polymorphisms associated with desiccation resistance in flies derived from almost 1,000 wild-caught genotypes, a similar number of loci to that observed in our previous genomic study of selected lines, demonstrating the genetic complexity of this ecologically important trait. By harnessing the power of cross-study comparison, we narrowed the candidates from almost 400 genes in each study to a core set of 45 genes, enriched for stimulus, stress, and defense responses. In addition to gene-level overlap, there was higher order congruence at the network and functional levels, suggesting genetic redundancy in key stress sensing, stress response, immunity, signaling, and gene expression pathways. We also identified variants linked to different molecular aspects of desiccation physiology previously verified from functional experiments. Our approach provides insight into the genomic basis of a complex and ecologically important trait and predicts candidate genetic pathways to explore in multiple genetic backgrounds and related species within a functional framework. PMID:26733490
Genome-Wide Analysis of the Complex Transcriptional Networks of Rice Developing Seeds
Xue, Liang-Jiao; Zhang, Jing-Jing; Xue, Hong-Wei
2012-01-01
Background The development of rice (Oryza sativa) seed is closely associated with assimilate storage and plant yield, and is finely controlled by complex regulatory networks. Exhaustive transcriptome analysis of developing rice embryo and endosperm will help to characterize the genes possibly involved in the regulation of seed development and provide clues for yield and quality improvement. Principal Findings Our analysis showed that genes involved in metabolism regulation, hormone response and cellular organization processes are predominantly expressed during rice development. Interestingly, 191 transcription factor (TF)-encoding genes are predominantly expressed in seed and 59 TFs are regulated during seed development, some of which are homologs of seed-specific TFs or regulators of Arabidopsis seed development. Gene co-expression network analysis showed these TFs associated with multiple cellular and metabolism pathways, indicating a complex regulation of rice seed development. Further, by employing a cold-resistant cultivar Hanfeng (HF), genome-wide analyses of seed transcriptome at normal and low temperature reveal that rice seed is sensitive to low temperature at an early stage and many genes associated with seed development are down-regulated by low temperature, indicating that the delayed development of rice seed by low temperature is mainly caused by the inhibition of the development-related genes. The transcriptional response of seed and seedling to low temperature is different, and the differential expression of genes in signaling and metabolism pathways may contribute to the chilling tolerance of HF during seed development. Conclusions These results provide informative clues and will significantly improve the understanding of rice seed development regulation and the mechanism of cold response in rice seed. PMID:22363552
The DOPA decarboxylase (DDC) gene is associated with alerting attention.
Zhu, Bi; Chen, Chuansheng; Moyzis, Robert K; Dong, Qi; Chen, Chunhui; He, Qinghua; Li, Jin; Li, Jun; Lei, Xuemei; Lin, Chongde
2013-06-03
DOPA decarboxylase (DDC) is involved in the synthesis of dopamine, norepinephrine and serotonin. It has been suggested that genes involved in the dopamine, norepinephrine, and cholinergic systems play an essential role in the efficiency of human attention networks. Attention refers to the cognitive process of obtaining and maintaining the alert state, orienting to sensory events, and regulating the conflicts of thoughts and behavior. The present study tested seven single nucleotide polymorphisms (SNPs) within the DDC gene for association with attention, which was assessed by the Attention Network Test to detect three networks of attention, including alerting, orienting, and executive attention, in a healthy Han Chinese sample (N=451). Association analysis for individual SNPs indicated that four of the seven SNPs (rs3887825, rs7786398, rs10499695, and rs6969081) were significantly associated with alerting attention. Haplotype-based association analysis revealed that alerting was associated with the haplotype G-A-T for SNPs rs7786398-rs10499695-rs6969081. These associations remained significant after correcting for multiple testing by max(T) permutation. No association was found for orienting and executive attention. This study provides the first evidence for the involvement of the DDC gene in alerting attention. A better understanding of the genetic basis of distinct attention networks would allow us to develop more effective diagnosis, treatment, and prevention of deficient or underdeveloped alerting attention as well as its related prevalent neuropsychiatric disorders.
An integrated network of Arabidopsis growth regulators and its use for gene prioritization.
Sabaghian, Ehsan; Drebert, Zuzanna; Inzé, Dirk; Saeys, Yvan
2015-12-01
Elucidating the molecular mechanisms that govern plant growth has been an important topic in plant research, and current advances in large-scale data generation call for computational tools that efficiently combine these different data sources to generate novel hypotheses. In this work, we present a novel, integrated network that combines multiple large-scale data sources to characterize growth regulatory genes in Arabidopsis, one of the main plant model organisms. The contributions of this work are twofold: first, we characterized a set of carefully selected growth regulators with respect to their connectivity patterns in the integrated network, and, subsequently, we explored to which extent these connectivity patterns can be used to suggest new growth regulators. Using a large-scale comparative study, we designed new supervised machine learning methods to prioritize growth regulators. Our results show that these methods significantly improve current state-of-the-art prioritization techniques, and are able to suggest meaningful new growth regulators. In addition, the integrated network is made available to the scientific community, providing a rich data source that will be useful for many biological processes, not necessarily restricted to plant growth.
Larson, Nicholas B; McDonnell, Shannon K; Fogarty, Zach; Larson, Melissa C; Cheville, John; Riska, Shaun; Baheti, Saurabh; Weber, Alexandra M; Nair, Asha A; Wang, Liang; O'Brien, Daniel; Davila, Jaime; Schaid, Daniel J; Thibodeau, Stephen N
2017-10-17
Large-scale genome-wide association studies have identified multiple single-nucleotide polymorphisms associated with risk of prostate cancer. Many of these genetic variants are presumed to be regulatory in nature; however, follow-up expression quantitative trait loci (eQTL) association studies have to date been restricted largely to cis-acting associations due to study limitations. While trans-eQTL scans suffer from high testing dimensionality, recent evidence indicates most trans-eQTL associations are mediated by cis-regulated genes, such as transcription factors. Leveraging a data-driven gene co-expression network, we conducted a comprehensive cis-mediator analysis using RNA-Seq data from 471 normal prostate tissue samples to identify downstream regulatory associations of previously identified prostate cancer risk variants. We discovered multiple trans-eQTL associations that were significantly mediated by cis-regulated transcripts, four of which involved risk locus 17q12, proximal transcription factor HNF1B, and target trans-genes with known HNF response elements (MIA2, SRC, SEMA6A, KIF12). We additionally identified evidence of cis-acting down-regulation of MSMB via rs10993994 corresponding to reduced co-expression of NDRG1. The majority of these cis-mediator relationships demonstrated trans-eQTL replicability in 87 prostate tissue samples from the Gene-Tissue Expression Project. These findings provide further biological context to known risk loci and outline new hypotheses for investigation into the etiology of prostate cancer.
Ruzicka, W. Brad; Subburaju, Sivan; Benes, Francine M.
2017-01-01
IMPORTANCE Dysfunction related to γ-aminobutyric acid (GABA)–ergic neurotransmission in the pathophysiology of major psychosis has been well established by the work of multiple groups across several decades, including the widely replicated downregulation of GAD1. Prior gene expression and network analyses within the human hippocampus implicate a broader network of genes, termed the GAD1 regulatory network, in regulation of GAD1 expression. Several genes within this GAD1 regulatory network show diagnosis- and sector-specific expression changes within the circuitry of the hippocampus, influencing abnormal GAD1 expression in schizophrenia and bipolar disorder. OBJECTIVE To investigate the hypothesis that aberrant DNA methylation contributes to circuit- and diagnosis-specific abnormal expression of GAD1 regulatory network genes in psychotic illness. DESIGN, SETTING, AND PARTICIPANTS This epigenetic association study targeting GAD1 regulatory network genes was conducted between July 1, 2012, and June 30, 2014. Postmortem human hippocampus tissue samples were obtained from 8 patients with schizophrenia, 8 patients with bipolar disorder, and 8 healthy control participants matched for age, sex, postmortem interval, and other potential confounds from the Harvard Brain Tissue Resource Center, McLean Hospital, Belmont, Massachusetts. We extracted DNA from laser-microdissected stratum oriens tissue of cornu ammonis 2/3 (CA2/3) and CA1 postmortem human hippocampus, bisulfite modified it, and assessed it with the Infinium HumanMethylation450 BeadChip (Illumina, Inc). The subset of CpG loci associated with GAD1 regulatory network genes was analyzed in R version 3.1.0 software (R Foundation) using the minfi package. Findings were validated using bisulfite pyrosequencing. 
MAIN OUTCOMES AND MEASURES Methylation levels at 1308 GAD1 regulatory network–associated CpG loci were assessed both as individual sites to identify differentially methylated positions and by sharing information among colocalized probes to identify differentially methylated regions. RESULTS A total of 146 differentially methylated positions with a false detection rate lower than 0.05 were identified across all 6 groups (2 circuit locations in each of 3 diagnostic categories), and 54 differentially methylated regions with P < .01 were identified in single-group comparisons. Methylation changes were enriched in MSX1, CCND2, and DAXX at specific loci within the hippocampus of patients with schizophrenia and bipolar disorder. CONCLUSIONS AND RELEVANCE This work demonstrates diagnosis- and circuit-specific DNA methylation changes at a subset of GAD1 regulatory network genes in the human hippocampus in schizophrenia and bipolar disorder. These genes participate in chromatin regulation and cell cycle control, supporting the concept that the established GABAergic dysfunction in these disorders is related to disruption of GABAergic interneuron physiology at specific circuit locations within the human hippocampus. PMID:25738424
Matsuzaki, Jun; Kawahara, Yoshihiro; Izawa, Takeshi
2015-01-01
Plant circadian clocks that oscillate autonomously with a roughly 24-h period are entrained by fluctuating light and temperature and globally regulate downstream genes in the field. However, it remains unknown how punctual internal time produced by the circadian clock in the field is and how it is affected by environmental fluctuations due to weather or daylength. Using hundreds of samples of field-grown rice (Oryza sativa) leaves, we developed a statistical model for the expression of circadian clock-related genes integrating diurnally entrained circadian clock with phase setting by light, both responses to light and temperature gated by the circadian clock. We show that expression of individual genes was strongly affected by temperature. However, internal time estimated from expression of multiple genes, which may reflect transcriptional regulation of downstream genes, is punctual to 22 min and not affected by weather, daylength, or plant developmental age in the field. We also revealed perturbed progression of internal time under controlled environment or in a mutant of the circadian clock gene GIGANTEA. Thus, we demonstrated that the circadian clock is a regulatory network of multiple genes that retains accurate physical time of day by integrating the perturbations on individual genes under fluctuating environments in the field. PMID:25757473
Sartor, Maureen A.; Schnekenburger, Michael; Marlowe, Jennifer L.; Reichard, John F.; Wang, Ying; Fan, Yunxia; Ma, Ci; Karyala, Saikumar; Halbleib, Danielle; Liu, Xiangdong; Medvedovic, Mario; Puga, Alvaro
2009-01-01
Background The vertebrate aryl hydrocarbon receptor (AHR) is a ligand-activated transcription factor that regulates cellular responses to environmental polycyclic and halogenated compounds. The naive receptor is believed to reside in an inactive cytosolic complex that translocates to the nucleus and induces transcription of xenobiotic detoxification genes after activation by ligand. Objectives We conducted an integrative genomewide analysis of AHR gene targets in mouse hepatoma cells and determined whether AHR regulatory functions may take place in the absence of an exogenous ligand. Methods The network of AHR-binding targets in the mouse genome was mapped through a multipronged approach involving chromatin immunoprecipitation/chip and global gene expression signatures. The findings were integrated into a prior functional knowledge base from Gene Ontology, interaction networks, Kyoto Encyclopedia of Genes and Genomes pathways, sequence motif analysis, and literature molecular concepts. Results We found the naive receptor in unstimulated cells bound to an extensive array of gene clusters with functions in regulation of gene expression, differentiation, and pattern specification, connecting multiple morphogenetic and developmental programs. Activation by the ligand displaced the receptor from some of these targets toward sites in the promoters of xenobiotic metabolism genes. Conclusions The vertebrate AHR appears to possess unsuspected regulatory functions that may be potential targets of environmental injury. PMID:19654925
Aging effects on DNA methylation modules in human brain and blood tissue
2012-01-01
Background Several recent studies reported aging effects on DNA methylation levels of individual CpG dinucleotides. But it is not yet known whether aging-related consensus modules, in the form of clusters of correlated CpG markers, can be found that are present in multiple human tissues. Such a module could facilitate the understanding of aging effects on multiple tissues. Results We therefore employed weighted correlation network analysis of 2,442 Illumina DNA methylation arrays from brain and blood tissues, which enabled the identification of an age-related co-methylation module. Module preservation analysis confirmed that this module can also be found in diverse independent data sets. Biological evaluation showed that module membership is associated with Polycomb group target occupancy counts, CpG island status and autosomal chromosome location. Functional enrichment analysis revealed that the aging-related consensus module comprises genes that are involved in nervous system development, neuron differentiation and neurogenesis, and that it contains promoter CpGs of genes known to be down-regulated in early Alzheimer's disease. A comparison with a standard, non-module based meta-analysis revealed that selecting CpGs based on module membership leads to significantly increased gene ontology enrichment, thus demonstrating that studying aging effects via consensus network analysis enhances the biological insights gained. Conclusions Overall, our analysis revealed a robustly defined age-related co-methylation module that is present in multiple human tissues, including blood and brain. We conclude that blood is a promising surrogate for brain tissue when studying the effects of age on DNA methylation profiles. PMID:23034122
Genome-scale cold stress response regulatory networks in ten Arabidopsis thaliana ecotypes
2013-01-01
Background Low temperature leads to major crop losses every year. Although several studies have been conducted focusing on diversity of cold tolerance level in multiple phenotypically divergent Arabidopsis thaliana (A. thaliana) ecotypes, genome-scale molecular understanding is still lacking. Results In this study, we report genome-scale transcript response diversity of 10 A. thaliana ecotypes originating from different geographical locations to non-freezing cold stress (10°C). To analyze the transcriptional response diversity, we initially compared transcriptome changes in all 10 ecotypes using Arabidopsis NimbleGen ATH6 microarrays. In total 6061 transcripts were significantly cold regulated (p < 0.01) in 10 ecotypes, including 498 transcription factors and 315 transposable elements. The majority of the transcripts (75%) showed ecotype-specific expression patterns. By using sequence data available from the Arabidopsis thaliana 1001 genome project, we further investigated sequence polymorphisms in the core cold stress regulon genes. Significant numbers of non-synonymous amino acid changes were observed in the coding region of the CBF regulon genes. Considering the limited knowledge about regulatory interactions between transcription factors and their target genes in the model plant A. thaliana, we have adopted a powerful systems genetics approach, Network Component Analysis (NCA), to construct an in-silico transcriptional regulatory network model during response to cold stress. The resulting regulatory network contained 1,275 nodes and 7,720 connections, with 178 transcription factors and 1,331 target genes. Conclusions A. thaliana ecotypes exhibit considerable variation in transcriptome level responses to non-freezing cold stress treatment. Ecotype-specific transcripts and related gene ontology (GO) categories were identified to delineate natural variation of cold stress regulated differential gene expression in the model plant A. thaliana. 
The predicted regulatory network model was able to identify new ecotype specific transcription factors and their regulatory interactions, which might be crucial for their local geographic adaptation to cold temperature. Additionally, since the approach presented here is general, it could be adapted to study networks regulating biological process in any biological systems. PMID:24148294
Taroni, Jaclyn N; Greene, Casey S; Martyanov, Viktor; Wood, Tammara A; Christmann, Romy B; Farber, Harrison W; Lafyatis, Robert A; Denton, Christopher P; Hinchcliff, Monique E; Pioli, Patricia A; Mahoney, J Matthew; Whitfield, Michael L
2017-03-23
Systemic sclerosis (SSc) is a multi-organ autoimmune disease characterized by skin fibrosis. Internal organ involvement is heterogeneous. It is unknown whether disease mechanisms are common across all affected tissues or if each manifestation has a distinct underlying pathology. We used consensus clustering to compare gene expression profiles of biopsies from four SSc-affected tissues (skin, lung, esophagus, and peripheral blood) from patients with SSc, and the related conditions pulmonary fibrosis (PF) and pulmonary arterial hypertension, and derived a consensus disease-associated signature across all tissues. We used this signature to query tissue-specific functional genomic networks. We performed novel network analyses to contrast the skin and lung microenvironments and to assess the functional role of the inflammatory and fibrotic genes in each organ. Lastly, we tested the expression of macrophage activation state-associated gene sets for enrichment in skin and lung using a Wilcoxon rank sum test. We identified a common pathogenic gene expression signature, an immune-fibrotic axis, indicative of pro-fibrotic macrophages (MØs) in multiple tissues (skin, lung, esophagus, and peripheral blood mononuclear cells) affected by SSc. While the co-expression of these genes is common to all tissues, the functional consequences of this upregulation differ by organ. We used this disease-associated signature to query tissue-specific functional genomic networks to identify common and tissue-specific pathologies of SSc and related conditions. In contrast to skin, in the lung-specific functional network we identify a distinct lung-resident MØ signature associated with lipid stimulation and alternative activation. In keeping with our network results, we find distinct MØ alternative activation transcriptional programs in SSc-associated PF lung and in the skin of patients with an "inflammatory" SSc gene expression signature. 
Our results suggest that the innate immune system is central to SSc disease processes but that subtle distinctions exist between tissues. Our approach provides a framework for examining molecular signatures of disease in fibrosis and autoimmune diseases and for leveraging publicly available data to understand common and tissue-specific disease processes in complex human diseases.
Is My Network Module Preserved and Reproducible?
Langfelder, Peter; Luo, Rui; Oldham, Michael C.; Horvath, Steve
2011-01-01
In many applications, one is interested in determining which of the properties of a network module change across conditions. For example, to validate the existence of a module, it is desirable to show that it is reproducible (or preserved) in an independent test network. Here we study several types of network preservation statistics that do not require a module assignment in the test network. We distinguish network preservation statistics by the type of the underlying network. Some preservation statistics are defined for a general network (defined by an adjacency matrix) while others are only defined for a correlation network (constructed on the basis of pairwise correlations between numeric variables). Our applications show that the correlation structure facilitates the definition of particularly powerful module preservation statistics. We illustrate that evaluating module preservation is in general different from evaluating cluster preservation. We find that it is advantageous to aggregate multiple preservation statistics into summary preservation statistics. We illustrate the use of these methods in six gene co-expression network applications including 1) preservation of cholesterol biosynthesis pathway in mouse tissues, 2) comparison of human and chimpanzee brain networks, 3) preservation of selected KEGG pathways between human and chimpanzee brain networks, 4) sex differences in human cortical networks, 5) sex differences in mouse liver networks. While we find no evidence for sex specific modules in human cortical networks, we find that several human cortical modules are less preserved in chimpanzees. In particular, apoptosis genes are differentially co-expressed between humans and chimpanzees. Our simulation studies and applications show that module preservation statistics are useful for studying differences between the modular structure of networks. 
Data, R software and accompanying tutorials can be downloaded from the following webpage: http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/ModulePreservation. PMID:21283776
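To make the idea of a correlation-based preservation statistic more concrete, the following is a minimal sketch of one such measure: the correlation between the pairwise gene-gene correlations of a module in a reference dataset and the same pairwise correlations in a test dataset. This is a simplified illustration written for this listing, not the authors' R software. The function names, the dictionary data layout and the toy expression values are all hypothetical, and the published statistics include additional measures and permutation-based summaries that are not reproduced here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(expr, genes):
    """Correlations for every gene pair in the module, in a fixed pair order."""
    return [pearson(expr[g], expr[h]) for g, h in combinations(genes, 2)]


def correlation_preservation(ref_expr, test_expr, module_genes):
    """Correlate the module's correlation pattern between reference and test.

    Values near 1 suggest the module's co-expression structure is retained
    in the test data; no module assignment in the test data is needed.
    """
    return pearson(
        pairwise_correlations(ref_expr, module_genes),
        pairwise_correlations(test_expr, module_genes),
    )


# Hypothetical toy expression profiles (gene -> values across samples).
reference = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 5, 4, 5],
    "g3": [5, 3, 4, 1, 2],
}
# A test set that is a rescaled copy of the reference keeps every correlation.
test_same = {g: [2 * v + 1 for v in values] for g, values in reference.items()}
# A test set in which g3 is sign-flipped changes the correlation pattern.
test_changed = dict(reference, g3=[-v for v in reference["g3"]])

preserved = correlation_preservation(reference, test_same, ["g1", "g2", "g3"])
disrupted = correlation_preservation(reference, test_changed, ["g1", "g2", "g3"])
```

In this toy example `preserved` is close to 1 while `disrupted` is much lower, which is the kind of contrast a preservation statistic is meant to expose.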
Gioutlakis, Aris; Klapa, Maria I.
2017-01-01
It has been acknowledged that source databases recording experimentally supported human protein-protein interactions (PPIs) exhibit limited overlap. Thus, the reconstruction of a comprehensive PPI network requires appropriate integration of multiple heterogeneous primary datasets, presenting the PPIs at various genetic reference levels. Existing PPI meta-databases perform integration via normalization; namely, PPIs are merged after being converted to a certain target level. Hence, the node set of the integrated network depends each time on the number and type of the combined datasets. Moreover, the irreversible a priori normalization process hinders the identification of normalization artifacts in the integrated network, which originate from the nonlinearity characterizing the genetic information flow. PICKLE (Protein InteraCtion KnowLedgebasE) 2.0 implements a new architecture for this recently introduced human PPI meta-database. Its main novel feature over the existing meta-databases is its approach to primary PPI dataset integration via genetic information ontology. Building upon the PICKLE principles of using the reviewed human complete proteome (RHCP) of UniProtKB/Swiss-Prot as the reference protein interactor set, and filtering out protein interactions with low probability of being direct based on the available evidence, PICKLE 2.0 first assembles the RHCP genetic information ontology network by connecting the corresponding genes, nucleotide sequences (mRNAs) and proteins (UniProt entries) and then integrates PPI datasets by superimposing them on the ontology network without any a priori transformations. Importantly, this process allows the resulting heterogeneous integrated network to be reversibly normalized to any level of genetic reference without loss of the original information, the latter being used for identification of normalization biases, and enables the appraisal of potential false positive interactions through PPI source database cross-checking. 
The PICKLE web-based interface (www.pickle.gr) allows for the simultaneous query of multiple entities and provides integrated human PPI networks at either the protein (UniProt) or the gene level, at three PPI filtering modes. PMID:29023571
Genome-wide network-based pathway analysis of CSF t-tau/Aβ1-42 ratio in the ADNI cohort.
Cong, Wang; Meng, Xianglian; Li, Jin; Zhang, Qiushi; Chen, Feng; Liu, Wenjie; Wang, Ying; Cheng, Sipu; Yao, Xiaohui; Yan, Jingwen; Kim, Sungeun; Saykin, Andrew J; Liang, Hong; Shen, Li
2017-05-30
The cerebrospinal fluid (CSF) levels of total tau (t-tau) and Aβ1-42 are potential early diagnostic markers for probable Alzheimer's disease (AD). The influence of genetic variation on these CSF biomarkers has been investigated in candidate or genome-wide association studies (GWAS). However, the investigation of statistically modest associations in GWAS in the context of biological networks is still an under-explored topic in AD studies. The main objective of this study is to gain further biological insights via the integration of statistical gene associations in AD with physical protein interaction networks. The CSF and genotyping data of 843 study subjects (199 CN, 85 SMC, 239 EMCI, 207 LMCI, 113 AD) from the Alzheimer's Disease Neuroimaging Initiative (ADNI) were analyzed. PLINK was used to perform GWAS on the t-tau/Aβ1-42 ratio using quality controlled genotype data, including 563,980 single nucleotide polymorphisms (SNPs), with age, sex and diagnosis as covariates. Gene-level p-values were obtained by VEGAS2. Genes with p-value ≤ 0.05 were mapped onto a protein-protein interaction (PPI) network (9,617 nodes, 39,240 edges, from the HPRD Database). We integrated a consensus model strategy into the iPINBPA network analysis framework, and named it CM-iPINBPA. Four consensus modules (CMs) were discovered by CM-iPINBPA, and were functionally annotated using the pathway analysis tool Enrichr. The intersection of four CMs forms a common subnetwork of 29 genes, including those related to tau phosphorylation (GSK3B, SUMO1, AKAP5, CALM1 and DLG4), amyloid beta production (CASP8, PIK3R1, PPA1, PARP1, CSNK2A1, NGFR, and RHOA), and AD (BCL3, CFLAR, SMAD1, and HIF1A). This study coupled a consensus module (CM) strategy with the iPINBPA network analysis framework, and applied it to the GWAS of CSF t-tau/Aβ1-42 ratio in an AD study.
The genome-wide network analysis yielded 4 enriched CMs that share not only genes related to tau phosphorylation or amyloid beta production but also multiple genes enriching several KEGG pathways such as Alzheimer's disease, colorectal cancer, gliomas, renal cell carcinoma, Huntington's disease, and others. This study demonstrated that integration of gene-level associations with CMs could yield statistically significant findings to offer valuable biological insights (e.g., functional interaction among the protein products of these genes) and suggest high confidence candidates for subsequent analyses.
Inouye, Michael; Ripatti, Samuli; Kettunen, Johannes; Lyytikäinen, Leo-Pekka; Oksala, Niku; Laurila, Pirkka-Pekka; Kangas, Antti J.; Soininen, Pasi; Savolainen, Markku J.; Viikari, Jorma; Kähönen, Mika; Perola, Markus; Salomaa, Veikko; Raitakari, Olli; Lehtimäki, Terho; Taskinen, Marja-Riitta; Järvelin, Marjo-Riitta; Ala-Korpela, Mika; Palotie, Aarno; de Bakker, Paul I. W.
2012-01-01
Association testing of multiple correlated phenotypes offers better power than univariate analysis of single traits. We analyzed 6,600 individuals from two population-based cohorts with both genome-wide SNP data and serum metabolomic profiles. From the observed correlation structure of 130 metabolites measured by nuclear magnetic resonance, we identified 11 metabolic networks and performed a multivariate genome-wide association analysis. We identified 34 genomic loci at genome-wide significance, of which 7 are novel. In comparison to univariate tests, multivariate association analysis identified nearly twice as many significant associations in total. Multi-tissue gene expression studies identified variants in our top loci, SERPINA1 and AQP9, as eQTLs and showed that SERPINA1 and AQP9 expression in human blood was associated with metabolites from their corresponding metabolic networks. Finally, liver expression of AQP9 was associated with atherosclerotic lesion area in mice, and in human arterial tissue both SERPINA1 and AQP9 were shown to be upregulated (6.3-fold and 4.6-fold, respectively) in atherosclerotic plaques. Our study illustrates the power of multi-phenotype GWAS and highlights candidate genes for atherosclerosis. PMID:22916037
Arneson, Douglas; Bhattacharya, Anindya; Shu, Le; Mäkinen, Ville-Petteri; Yang, Xia
2016-09-09
Human diseases are commonly the result of multidimensional changes at molecular, cellular, and systemic levels. Recent advances in genomic technologies have enabled an outpouring of omics datasets that capture these changes. However, separate analyses of these various data only provide fragmented understanding and do not capture the holistic view of disease mechanisms. To meet the urgent needs for tools that effectively integrate multiple types of omics data to derive biological insights, we have developed Mergeomics, a computational pipeline that integrates multidimensional disease association data with functional genomics and molecular networks to retrieve biological pathways, gene networks, and central regulators critical for disease development. To make the Mergeomics pipeline available to a wider research community, we have implemented an online, user-friendly web server (http://mergeomics.idre.ucla.edu/). The web server features a modular implementation of the Mergeomics pipeline with detailed tutorials. Additionally, it provides curated genomic resources including tissue-specific expression quantitative trait loci, ENCODE functional annotations, biological pathways, and molecular networks, and offers interactive visualization of analytical results. Multiple computational tools including Marker Dependency Filtering (MDF), Marker Set Enrichment Analysis (MSEA), Meta-MSEA, and Weighted Key Driver Analysis (wKDA) can be used separately or in flexible combinations. User-defined summary-level genomic association datasets (e.g., genetic, transcriptomic, epigenomic) related to a particular disease or phenotype can be uploaded and computed in real time to yield biologically interpretable results, which can be viewed online and downloaded for later use.
Our Mergeomics web server offers researchers flexible and user-friendly tools to facilitate integration of multidimensional data into holistic views of disease mechanisms in the form of tissue-specific key regulators, biological pathways, and gene networks.
Robinson, Sean; Nevalainen, Jaakko; Pinna, Guillaume; Campalans, Anna; Radicella, J. Pablo; Guyon, Laurent
2017-01-01
Abstract Motivation: Incorporating gene interaction data into the identification of ‘hit’ genes in genomic experiments is a well-established approach leveraging the ‘guilt by association’ assumption to obtain a network-based hit list of functionally related genes. We aim to develop a method to allow for multivariate gene scores and multiple hit labels in order to extend the analysis of genomic screening data within such an approach. Results: We propose a Markov random field-based method to achieve our aim and show that the particular advantages of our method compared with those currently used lead to new insights in previously analysed data as well as for our own motivating data. Our method additionally achieves the best performance in an independent simulation experiment. The real data applications we consider comprise a survival analysis and differential expression experiment and a cell-based RNA interference functional screen. Availability and implementation: We provide all of the data and code related to the results in the paper. Contact: sean.j.robinson@utu.fi or laurent.guyon@cea.fr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28881978
Mikhailov, Alexander T; Torrado, Mario
2018-05-12
There is growing evidence that putative gene regulatory networks including cardio-enriched transcription factors, such as PITX2, TBX5, ZFHX3, and SHOX2, and their effector/target genes along with downstream non-coding RNAs can play a potentially important role in the process of adaptive and maladaptive atrial rhythm remodeling. In turn, expression of atrial fibrillation-associated transcription factors is under the control of upstream regulatory non-coding RNAs. This review broadly explores gene regulatory mechanisms associated with susceptibility to atrial fibrillation (with key examples from both animal models and patients) within the context of both cardiac transcription factors and non-coding RNAs. These two systems appear to have multiple levels of cross-regulation and act coordinately to achieve effective control of atrial rhythm effector gene expression. Perturbations of a dynamic expression balance between transcription factors and corresponding non-coding RNAs can provoke the development or promote the progression of atrial fibrillation. We also outline deficiencies in current models and discuss ongoing studies to clarify remaining mechanistic questions. An understanding of the function of transcription factors and non-coding RNAs in gene regulatory networks associated with atrial fibrillation risk will enable the development of innovative therapeutic strategies.
Zhu, Jie; Qin, Yufang; Liu, Taigang; Wang, Jun; Zheng, Xiaoqi
2013-01-01
Identification of gene-phenotype relationships is a fundamental challenge in human health research. Based on the observation that genes causing the same or similar phenotypes tend to correlate with each other in the protein-protein interaction network, many network-based approaches have been proposed based on different underlying models. A recent comparative study showed that diffusion-based methods achieve state-of-the-art predictive performance. In this paper, a new diffusion-based method was proposed to prioritize candidate disease genes. The diffusion profile of a disease was defined as the stationary distribution of candidate genes given a random walk with restart where similarities between phenotypes are incorporated. Then, candidate disease genes are prioritized by comparing their diffusion profiles with that of the disease. Finally, the effectiveness of our method was demonstrated through leave-one-out cross-validation against control genes from artificial linkage intervals and randomly chosen genes. A comparative study showed that our method achieves improved performance compared to some classical diffusion-based methods. To further illustrate our method, we used our algorithm to predict new causal genes of 16 multifactorial diseases including prostate cancer and Alzheimer's disease, and the top predictions were in good agreement with literature reports. Our study indicates that integration of multiple information sources, especially the phenotype similarity profile data, and introduction of a global similarity measure between disease and gene diffusion profiles are helpful for prioritizing candidate disease genes. Programs and data are available upon request.
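The random walk with restart mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a small unweighted, undirected toy graph and a single seed node; the node names, the restart probability of 0.3, and the function name are hypothetical, and the phenotype-similarity weighting described in the abstract is not modelled here.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Approximate the stationary distribution of a random walk with restart.

    adj   : dict mapping each node to a list of its neighbours (toy input;
            every node is assumed to have at least one neighbour)
    seeds : set of nodes the walker jumps back to with probability `restart`
    """
    nodes = sorted(adj)
    # Restart vector: uniform over the seed nodes, zero elsewhere.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Start each step with the restart mass.
        nxt = {n: restart * p0[n] for n in nodes}
        # Spread the remaining mass evenly over each node's neighbours.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical path graph A - B - C - D with the walker restarting at A.
toy_graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
profile = random_walk_with_restart(toy_graph, seeds={"A"})
ranking = sorted(profile, key=profile.get, reverse=True)
```

In this toy setting the resulting profile decays with distance from the seed, which is the property that diffusion-based prioritization relies on when ranking candidates.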
Exploring the bZIP transcription factor regulatory network in Neurospora crassa
Tian, Chaoguang; Li, Jingyi; Glass, N. Louise
2011-01-01
Transcription factors (TFs) are key nodes of regulatory networks in eukaryotic organisms, including filamentous fungi such as Neurospora crassa. The 178 predicted DNA-binding TFs in N. crassa are distributed primarily among six gene families, which represent an ancient expansion in filamentous ascomycete genomes; 98 TF genes show detectable expression levels during vegetative growth of N. crassa, including 35 that show a significant difference in expression level between hyphae at the periphery versus hyphae in the interior of a colony. Regulatory networks within a species genome include paralogous TFs and their respective target genes (TF regulon). To investigate TF network evolution in N. crassa, we focused on the basic leucine zipper (bZIP) TF family, which contains nine members. We performed baseline transcriptional profiling during vegetative growth of the wild-type and seven isogenic, viable bZIP deletion mutants. We further characterized the regulatory network of one member of the bZIP family, NCU03905. NCU03905 encodes an Ap1-like protein (NcAp-1), which is involved in resistance to multiple stress responses, including oxidative and heavy metal stress. Relocalization of NcAp-1 from the cytoplasm to the nucleus was associated with exposure to stress. A comparison of the NcAp-1 regulon with Ap1-like regulons in Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida albicans and Aspergillus fumigatus showed both conservation and divergence. These data indicate how N. crassa responds to stress and provide information on pathway evolution. PMID:21081763
Exploring the bZIP transcription factor regulatory network in Neurospora crassa.
Tian, Chaoguang; Li, Jingyi; Glass, N Louise
2011-03-01
Transcription factors (TFs) are key nodes of regulatory networks in eukaryotic organisms, including filamentous fungi such as Neurospora crassa. The 178 predicted DNA-binding TFs in N. crassa are distributed primarily among six gene families, which represent an ancient expansion in filamentous ascomycete genomes; 98 TF genes show detectable expression levels during vegetative growth of N. crassa, including 35 that show a significant difference in expression level between hyphae at the periphery versus hyphae in the interior of a colony. Regulatory networks within a species genome include paralogous TFs and their respective target genes (TF regulon). To investigate TF network evolution in N. crassa, we focused on the basic leucine zipper (bZIP) TF family, which contains nine members. We performed baseline transcriptional profiling during vegetative growth of the wild-type and seven isogenic, viable bZIP deletion mutants. We further characterized the regulatory network of one member of the bZIP family, NCU03905. NCU03905 encodes an Ap1-like protein (NcAp-1), which is involved in resistance to multiple stress responses, including oxidative and heavy metal stress. Relocalization of NcAp-1 from the cytoplasm to the nucleus was associated with exposure to stress. A comparison of the NcAp-1 regulon with Ap1-like regulons in Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida albicans and Aspergillus fumigatus showed both conservation and divergence. These data indicate how N. crassa responds to stress and provide information on pathway evolution.
An integrative data mining approach to identifying adverse outcome pathway signatures.
Oki, Noffisat O; Edwards, Stephen W
2016-03-28
The Adverse Outcome Pathway (AOP) framework is a tool for making biological connections and summarizing key information across different levels of biological organization to connect biological perturbations at the molecular level to adverse outcomes for an individual or population. Computational approaches to explore and determine these connections can accelerate the assembly of AOPs. By leveraging the wealth of publicly available data covering chemical effects on biological systems, computationally-predicted AOPs (cpAOPs) were assembled via data mining of high-throughput screening (HTS) in vitro data, in vivo data and other disease phenotype information. Frequent Itemset Mining (FIM) was used to find associations between the gene targets of ToxCast HTS assays and disease data from Comparative Toxicogenomics Database (CTD) by using the chemicals as the common aggregators between datasets. The method was also used to map gene expression data to disease data from CTD. A cpAOP network was defined by considering genes and diseases as nodes and FIM associations as edges. This network contained 18,283 gene to disease associations for the ToxCast data and 110,253 for CTD gene expression. Two case studies show the value of the cpAOP network by extracting subnetworks focused either on fatty liver disease or the Aryl Hydrocarbon Receptor (AHR). The subnetwork surrounding fatty liver disease included many genes known to play a role in this disease. When querying the cpAOP network with the AHR gene, an interesting subnetwork including glaucoma was identified. While substantial literature exists to support the potential for AHR ligands to elicit glaucoma, it was not explicitly captured in the public annotation information in CTD. The subnetwork from this analysis suggests a cpAOP that includes changes in CYP1B1 expression, which has been previously established in the literature as a primary cause of glaucoma. 
These case studies highlight the value in integrating multiple data sources when defining cpAOPs for HTS data. Copyright © 2016. Published by Elsevier Ireland Ltd.
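The idea of linking genes to diseases through shared chemicals can be sketched as a simple co-occurrence count, which corresponds to the size-two case of frequent itemset mining. This is only an illustrative sketch: the chemical identifiers, the mappings, and the support threshold below are made-up toy values, not data from ToxCast or CTD, and a full FIM algorithm such as Apriori would also handle larger itemsets.

```python
from collections import Counter
from itertools import product


def gene_disease_pairs(chem_to_genes, chem_to_diseases, min_support=2):
    """Count (gene, disease) pairs that co-occur through the same chemical.

    Each chemical acts as one 'transaction'; a pair is kept when at least
    `min_support` chemicals link the gene and the disease.
    """
    counts = Counter()
    # Only chemicals present in both datasets can connect a gene to a disease.
    for chem in chem_to_genes.keys() & chem_to_diseases.keys():
        genes = set(chem_to_genes[chem])
        diseases = set(chem_to_diseases[chem])
        for pair in product(genes, diseases):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical toy mappings (chemA, chemB, chemC are placeholders).
chem_to_genes = {"chemA": ["AHR", "CYP1B1"], "chemB": ["AHR"], "chemC": ["CYP1B1"]}
chem_to_diseases = {
    "chemA": ["glaucoma"],
    "chemB": ["glaucoma", "fatty liver"],
    "chemC": ["fatty liver"],
}
edges = gene_disease_pairs(chem_to_genes, chem_to_diseases)
# edges == {("AHR", "glaucoma"): 2}
```

The surviving pairs play the role of edges in a gene-to-disease network of the kind the abstract describes, with the support count available as an edge weight.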
Ye, Yusen; Gao, Lin; Zhang, Shihua
2017-01-01
Transcription factors play a key role in transcriptional regulation of genes and determination of cellular identity through combinatorial interactions. However, current studies of combinatorial regulation are deficient due to the lack of experimental data from the same cellular environment and the prevalence of data noise. Here, we adopt a Bayesian CANDECOMP/PARAFAC (CP) factorization approach (BCPF) to integrate multiple datasets in a network paradigm for determining precise TF interaction landscapes. In our first application, we apply BCPF to integrate three networks, each built from diverse datasets of multiple cell lines from ENCODE, to predict a global and precise TF interaction network. This network gives 38 novel TF interactions with distinct biological functions. In our second application, we apply BCPF to seven types of cell type TF regulatory networks and predict seven cell lineage TF interaction networks, respectively. By further exploring their dynamics and modularity, we find that cell lineage-specific hub TFs participate in cell type or lineage-specific regulation by interacting with non-specific TFs. Furthermore, we illustrate the biological function of hub TFs by taking those of cancer lineage and blood lineage as examples. Taken together, our integrative analysis can reveal a more precise and extensive description of human TF combinatorial interactions. PMID:29033978
Altered Micro-RNA Degradation Promotes Tumor Heterogeneity: A Result from Boolean Network Modeling.
Wu, Yunyi; Krueger, Gerhard R F; Wang, Guanyu
2016-02-01
Cancer heterogeneity may reflect differential dynamical outcomes of the regulatory network encompassing biomolecules at both transcriptional and post-transcriptional levels. In other words, differential gene-expression profiles may correspond to different stable steady states of a mathematical model for simulation of biomolecular networks. To test this hypothesis, we simplified a regulatory network that is important for soft-tissue sarcoma metastasis and heterogeneity, comprising transcription factors, micro-RNAs, and signaling components of the NOTCH pathway. We then used a Boolean network model to simulate the dynamics of this network, and particularly investigated the consequences of differential miRNA degradation modes. We found that efficient miRNA degradation is crucial for sustaining a homogeneous and healthy phenotype, while defective miRNA degradation may lead to multiple stable steady states and ultimately to carcinogenesis and heterogeneity. Copyright© 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.
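The notion of multiple stable steady states in a Boolean network can be made concrete with a small sketch. The two-node mutual-inhibition network below is a generic textbook toy and is an assumption for illustration only; it is not the sarcoma network from the abstract, and the node names and update rules are hypothetical.

```python
from itertools import product


def fixed_points(rules, names):
    """Enumerate states that map to themselves under synchronous update.

    rules : dict mapping node name -> function(state_dict) -> truthy/falsy
    names : ordered list of node names
    """
    found = []
    for bits in product([0, 1], repeat=len(names)):
        state = dict(zip(names, bits))
        # Apply every rule to the same current state (synchronous update).
        nxt = {n: int(rules[n](state)) for n in names}
        if nxt == state:
            found.append(bits)
    return found


# Hypothetical toggle switch: each node is on only when the other is off.
names = ["x", "y"]
rules = {
    "x": lambda s: not s["y"],
    "y": lambda s: not s["x"],
}
steady = fixed_points(rules, names)
# steady == [(0, 1), (1, 0)]
```

Exhaustive enumeration is only practical for small networks, since the state space doubles with every added node; it is used here because it makes the two coexisting steady states easy to see.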
Genomic and Epigenomic Alterations in Cancer.
Chakravarthi, Balabhadrapatruni V S K; Nepal, Saroj; Varambally, Sooryanarayana
2016-07-01
Multiple genetic and epigenetic events characterize tumor progression and define the identity of the tumors. Advances in high-throughput technologies, like gene expression profiling, next-generation sequencing, proteomics, and metabolomics, have enabled detailed molecular characterization of various tumors. The integration and analyses of these high-throughput data have unraveled many novel molecular aberrations and network alterations in tumors. These molecular alterations include multiple cancer-driving mutations, gene fusions, amplification, deletion, and post-translational modifications, among others. Many of these genomic events are being used in cancer diagnosis, whereas others are therapeutically targeted with small-molecule inhibitors. Multiple genes/enzymes that play a role in DNA and histone modifications are also altered in various cancers, changing the epigenomic landscape during cancer initiation and progression. Apart from protein-coding genes, studies are uncovering the critical regulatory roles played by noncoding RNAs and noncoding regions of the genome during cancer progression. Many of these genomic and epigenetic events function in tandem to drive tumor development and metastasis. Concurrent advances in genome-modulating technologies, like gene silencing and genome editing, are providing ability to understand in detail the process of cancer initiation, progression, and signaling as well as opening up avenues for therapeutic targeting. In this review, we discuss some of the recent advances in cancer genomic and epigenomic research. Copyright © 2016 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.
Optimal Information Processing in Biochemical Networks
NASA Astrophysics Data System (ADS)
Wiggins, Chris
2012-02-01
A variety of experimental results over the past decades provide examples of near-optimal information processing in biological networks, including in biochemical and transcriptional regulatory networks. Computing information-theoretic quantities requires first choosing or computing the joint probability distribution describing multiple nodes in such a network, for example the probability distribution of finding an integer copy number of each of two interacting reactants or gene products while respecting the 'intrinsic' small copy number noise constraining information transmission at the scale of the cell. I'll give an overview of some recent analytic and numerical work facilitating calculation of such joint distributions and the associated information, which in turn makes possible numerical optimization of information flow in models of noisy regulatory and biochemical networks. Illustrative cases include quantification of form-function relations, ideal design of regulatory cascades, and response to oscillatory driving.
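Once a joint distribution over two nodes is available, the associated information can be computed directly. The sketch below evaluates the mutual information I(X;Y) = sum over (x, y) of p(x,y) log2[p(x,y) / (p(x) p(y))] for a discrete joint distribution; the two toy distributions and the function name are assumptions chosen for illustration, not results from the talk.

```python
from math import log2


def mutual_information(joint):
    """Mutual information in bits for a discrete joint distribution.

    joint : dict mapping (x, y) -> probability, with probabilities summing to 1
    """
    px, py = {}, {}
    # Marginalise the joint distribution over each variable.
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    # Zero-probability cells contribute nothing to the sum.
    return sum(
        p * log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Toy case 1: output copies input exactly, so one full bit is transmitted.
perfect = {(0, 0): 0.5, (1, 1): 0.5}
# Toy case 2: input and output are independent, so no information flows.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

bits_perfect = mutual_information(perfect)          # 1.0
bits_independent = mutual_information(independent)  # 0.0
```

Noisy regulatory models fall between these two extremes, which is what makes the transmitted information a natural quantity to optimize.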
Zou, Chenhui; La Bonte, Laura R.; Pavlov, Vasile I.; Stahl, Gregory L.
2012-01-01
Hyperglycemia, in the absence of type 1 or 2 diabetes, is an independent risk factor for cardiovascular disease. We have previously demonstrated a central role for mannose binding lectin (MBL)-mediated cardiac dysfunction in acute hyperglycemic mice. In this study, we applied whole-genome microarray data analysis to investigate MBL’s role in systematic gene expression changes. The data predict possible intracellular events taking place in multiple cellular compartments such as enhanced insulin signaling pathway sensitivity, promoted mitochondrial respiratory function, improved cellular energy expenditure and protein quality control, improved cytoskeleton structure, and facilitated intracellular trafficking, all of which may contribute to the organismal health of MBL null mice against acute hyperglycemia. Our data show a tight association between gene expression profile and tissue function which might be a very useful tool in predicting cellular targets and regulatory networks connected with in vivo observations, providing clues for further mechanistic studies. PMID:22375142
Transcriptional Networks Controlled by NKX2-1 in the Development of Forebrain GABAergic Neurons
Sandberg, Magnus; Flandin, Pierre; Silberberg, Shanni; ...
2016-09-21
The embryonic basal ganglia generates multiple projection neurons and interneuron subtypes from distinct progenitor domains. Combinatorial interactions of transcription factors and chromatin are thought to regulate gene expression. In the medial ganglionic eminence, the NKX2-1 transcription factor controls regional identity and, with LHX6, is necessary to specify pallidal projection neurons and forebrain interneurons. Here, we dissected the molecular functions of NKX2-1 by defining its chromosomal binding, regulation of gene expression, and epigenetic state. NKX2-1 binding at distal regulatory elements led to a repressed epigenetic state and transcriptional repression in the ventricular zone. Conversely, NKX2-1 is required to establish a permissive chromatin state and transcriptional activation in the sub-ventricular and mantle zones. Moreover, combinatorial binding of NKX2-1 and LHX6 promotes transcriptionally permissive chromatin and activates genes expressed in cortical migrating interneurons. Our integrated approach gives a foundation for elucidating transcriptional networks guiding the development of the MGE and its descendants.
Convergent genetic and expression data implicate immunity in Alzheimer's disease
Jones, Lesley; Lambert, Jean-Charles; Wang, Li-San; Choi, Seung-Hoan; Harold, Denise; Vedernikov, Alexey; Escott-Price, Valentina; Stone, Timothy; Richards, Alexander; Bellenguez, Céline; Ibrahim-Verbaas, Carla A; Naj, Adam C; Sims, Rebecca; Gerrish, Amy; Jun, Gyungah; DeStefano, Anita L; Bis, Joshua C; Beecham, Gary W; Grenier-Boley, Benjamin; Russo, Giancarlo; Thornton-Wells, Tricia A; Jones, Nicola; Smith, Albert V; Chouraki, Vincent; Thomas, Charlene; Ikram, M Arfan; Zelenika, Diana; Vardarajan, Badri N; Kamatani, Yoichiro; Lin, Chiao-Feng; Schmidt, Helena; Kunkle, Brian; Dunstan, Melanie L; Ruiz, Agustin; Bihoreau, Marie-Thérèse; Reitz, Christiane; Pasquier, Florence; Hollingworth, Paul; Hanon, Olivier; Fitzpatrick, Annette L; Buxbaum, Joseph D; Campion, Dominique; Crane, Paul K; Becker, Tim; Gudnason, Vilmundur; Cruchaga, Carlos; Craig, David; Amin, Najaf; Berr, Claudine; Lopez, Oscar L; De Jager, Philip L; Deramecourt, Vincent; Johnston, Janet A; Evans, Denis; Lovestone, Simon; Letteneur, Luc; Kornhuber, Johanes; Tárraga, Lluís; Rubinsztein, David C; Eiriksdottir, Gudny; Sleegers, Kristel; Goate, Alison M; Fiévet, Nathalie; Huentelman, Matthew J; Gill, Michael; Emilsson, Valur; Brown, Kristelle; Kamboh, M Ilyas; Keller, Lina; Barberger-Gateau, Pascale; McGuinness, Bernadette; Larson, Eric B; Myers, Amanda J; Dufouil, Carole; Todd, Stephen; Wallon, David; Love, Seth; Kehoe, Pat; Rogaeva, Ekaterina; Gallacher, John; George-Hyslop, Peter St; Clarimon, Jordi; Lleὀ, Alberti; Bayer, Anthony; Tsuang, Debby W; Yu, Lei; Tsolaki, Magda; Bossù, Paola; Spalletta, Gianfranco; Proitsi, Petra; Collinge, John; Sorbi, Sandro; Garcia, Florentino Sanchez; Fox, Nick; Hardy, John; Naranjo, Maria Candida Deniz; Razquin, Cristina; Bosco, Paola; Clarke, Robert; Brayne, Carol; Galimberti, Daniela; Mancuso, Michelangelo; Moebus, Susanne; Mecocci, Patrizia; del Zompo, Maria; Maier, Wolfgang; Hampel, Harald; Pilotto, Alberto; Bullido, Maria; Panza, Francesco; Caffarra, Paolo; Nacmias, 
Benedetta; Gilbert, John R; Mayhaus, Manuel; Jessen, Frank; Dichgans, Martin; Lannfelt, Lars; Hakonarson, Hakon; Pichler, Sabrina; Carrasquillo, Minerva M; Ingelsson, Martin; Beekly, Duane; Alavarez, Victoria; Zou, Fanggeng; Valladares, Otto; Younkin, Steven G; Coto, Eliecer; Hamilton-Nelson, Kara L; Mateo, Ignacio; Owen, Michael J; Faber, Kelley M; Jonsson, Palmi V; Combarros, Onofre; O'Donovan, Michael C; Cantwell, Laura B; Soininen, Hilkka; Blacker, Deborah; Mead, Simon; Mosley, Thomas H; Bennett, David A; Harris, Tamara B; Fratiglioni, Laura; Holmes, Clive; de Bruijn, Renee FAG; Passmore, Peter; Montine, Thomas J; Bettens, Karolien; Rotter, Jerome I; Brice, Alexis; Morgan, Kevin; Foroud, Tatiana M; Kukull, Walter A; Hannequin, Didier; Powell, John F; Nalls, Michael A; Ritchie, Karen; Lunetta, Kathryn L; Kauwe, John SK; Boerwinkle, Eric; Riemenschneider, Matthias; Boada, Mercè; Hiltunen, Mikko; Martin, Eden R; Pastor, Pau; Schmidt, Reinhold; Rujescu, Dan; Dartigues, Jean-François; Mayeux, Richard; Tzourio, Christophe; Hofman, Albert; Nöthen, Markus M; Graff, Caroline; Psaty, Bruce M; Haines, Jonathan L; Lathrop, Mark; Pericak-Vance, Margaret A; Launer, Lenore J; Farrer, Lindsay A; van Duijn, Cornelia M; Van Broekhoven, Christine; Ramirez, Alfredo; Schellenberg, Gerard D; Seshadri, Sudha; Amouyel, Philippe; Holmans, Peter A
2015-01-01
Background Late-onset Alzheimer's disease (AD) is heritable with 20 genes showing genome wide association in the International Genomics of Alzheimer's Project (IGAP). To identify the biology underlying the disease we extended these genetic data in a pathway analysis. Methods The ALIGATOR and GSEA algorithms were used in the IGAP data to identify associated functional pathways and correlated gene expression networks in human brain. Results ALIGATOR identified an excess of curated biological pathways showing enrichment of association. Enriched areas of biology included the immune response (p = 3.27 × 10^-12 after multiple testing correction for pathways), regulation of endocytosis (p = 1.31 × 10^-11), cholesterol transport (p = 2.96 × 10^-9) and proteasome-ubiquitin activity (p = 1.34 × 10^-6). Correlated gene expression analysis identified four significant network modules, all related to the immune response (corrected p = 0.002-0.05). Conclusions The immune response, regulation of endocytosis, cholesterol transport and protein ubiquitination represent prime targets for AD therapeutics. PMID:25533204
Convergent genetic and expression data implicate immunity in Alzheimer's disease.
2015-06-01
Late-onset Alzheimer's disease (AD) is heritable with 20 genes showing genome-wide association in the International Genomics of Alzheimer's Project (IGAP). To identify the biology underlying the disease, we extended these genetic data in a pathway analysis. The ALIGATOR and GSEA algorithms were used in the IGAP data to identify associated functional pathways and correlated gene expression networks in human brain. ALIGATOR identified an excess of curated biological pathways showing enrichment of association. Enriched areas of biology included the immune response (P = 3.27 × 10^-12 after multiple testing correction for pathways), regulation of endocytosis (P = 1.31 × 10^-11), cholesterol transport (P = 2.96 × 10^-9), and proteasome-ubiquitin activity (P = 1.34 × 10^-6). Correlated gene expression analysis identified four significant network modules, all related to the immune response (corrected P = .002-.05). The immune response, regulation of endocytosis, cholesterol transport, and protein ubiquitination represent prime targets for AD therapeutics. Copyright © 2015. Published by Elsevier Inc.
Discovering Implicit Entity Relation with the Gene-Citation-Gene Network
Song, Min; Han, Nam-Gi; Kim, Yong-Hwan; Ding, Ying; Chambers, Tamy
2013-01-01
In this paper, we apply the entitymetrics model to our constructed Gene-Citation-Gene (GCG) network. Based on the premise that there is a hidden, but plausible, relationship between an entity in one article and an entity in its citing article, we constructed a GCG network of gene pairs implicitly connected through citation. We compare the performance of this GCG network to a gene-gene (GG) network constructed over the same corpus but which uses gene pairs explicitly connected through traditional co-occurrence. Using 331,411 MEDLINE abstracts collected from 18,323 seed articles and their references, we identify 25 gene pairs. A comparison of these pairs with interactions found in BioGRID reveals that 96% of the gene pairs in the GCG network have known interactions. We measure network performance using degree, weighted degree, closeness, betweenness centrality and PageRank. Combining all measures, we find the GCG network has more gene pairs, but a lower matching rate than the GG network. However, combining top ranked genes in both networks produces a matching rate of 35.53%. By visualizing both the GG and GCG networks, we find that cancer is the most dominant disease associated with the genes in both networks. Overall, the study indicates that the GCG network can be useful for detecting gene interaction in an implicit manner. PMID:24358368
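As an illustration of two of the network measures named in the abstract above (degree and PageRank), the following is a minimal sketch on a toy undirected graph. The gene names, the graph, and the damping factor of 0.85 are assumptions chosen for illustration; this is not the authors' pipeline or data.

```python
def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def pagerank(adj, damping=0.85, iterations=100):
    """Plain power-iteration PageRank on an adjacency dict."""
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node starts each round with the teleport share.
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            if adj[v]:
                # Split this node's rank evenly among its neighbours.
                share = damping * rank[v] / len(adj[v])
                for u in adj[v]:
                    new[u] += share
            else:
                # Isolated node: spread its rank over all nodes.
                for u in nodes:
                    new[u] += damping * rank[v] / n
        rank = new
    return rank


# Hypothetical toy network: geneA is linked to three other genes.
toy_graph = {
    "geneA": ["geneB", "geneC", "geneD"],
    "geneB": ["geneA"],
    "geneC": ["geneA"],
    "geneD": ["geneA"],
}
degrees = degree_centrality(toy_graph)
ranks = pagerank(toy_graph)
top_gene = max(ranks, key=ranks.get)
```

In this toy graph the hub node receives the highest score under both measures; on real citation-derived networks the measures can disagree, which is presumably why the abstract reports combining several of them.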
Ohyanagi, Hajime; Takano, Tomoyuki; Terashima, Shin; Kobayashi, Masaaki; Kanno, Maasa; Morimoto, Kyoko; Kanegae, Hiromi; Sasaki, Yohei; Saito, Misa; Asano, Satomi; Ozaki, Soichi; Kudo, Toru; Yokoyama, Koji; Aya, Koichiro; Suwabe, Keita; Suzuki, Go; Aoki, Koh; Kubo, Yasutaka; Watanabe, Masao; Matsuoka, Makoto; Yano, Kentaro
2015-01-01
Comprehensive integration of large-scale omics resources such as genomes, transcriptomes and metabolomes will provide deeper insights into broader aspects of molecular biology. For better understanding of plant biology, we aim to construct a next-generation sequencing (NGS)-derived gene expression network (GEN) repository for a broad range of plant species. So far we have incorporated information about 745 high-quality mRNA sequencing (mRNA-Seq) samples from eight plant species (Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Sorghum bicolor, Vitis vinifera, Solanum tuberosum, Medicago truncatula and Glycine max) from the public short read archive, digitally profiled the entire set of gene expression profiles, and drawn GENs by using correspondence analysis (CA) to take advantage of gene expression similarities. In order to understand the evolutionary significance of the GENs from multiple species, they were linked according to the orthology of each node (gene) among species. In addition to other gene expression information, functional annotation of the genes will facilitate biological comprehension. Currently we are improving the given gene annotations with natural language processing (NLP) techniques and manual curation. Here we introduce the current status of our analyses and the web database, PODC (Plant Omics Data Center; http://bioinf.mind.meiji.ac.jp/podc/), now open to the public, providing GENs, functional annotations and additional comprehensive omics resources. PMID:25505034
Gaponova, Anna V.; Deneka, Alexander Y.; Beck, Tim N.; Liu, Hanqing; Andrianov, Gregory; Nikonova, Anna S.; Nicolas, Emmanuelle; Einarson, Margret B.; Golemis, Erica A.; Serebriiskii, Ilya G.
2017-01-01
Ovarian, head and neck, and other cancers are commonly treated with cisplatin and other DNA damaging cytotoxic agents. Altered DNA damage response (DDR) contributes to resistance of these tumors to chemotherapies, some targeted therapies, and radiation. DDR involves multiple protein complexes and signaling pathways, some of which are evolutionarily ancient and involve protein orthologs conserved from yeast to humans. To identify new regulators of cisplatin-resistance in human tumors, we integrated high throughput and curated datasets describing yeast genes that regulate sensitivity to cisplatin and/or ionizing radiation. Next, we clustered highly validated genes based on chemogenomic profiling, and then mapped orthologs of these genes in expanded genomic networks for multiple metazoans, including humans. This approach identified an enriched candidate set of genes involved in the regulation of resistance to radiation and/or cisplatin in humans. Direct functional assessment of selected candidate genes using RNA interference confirmed their activity in influencing cisplatin resistance, degree of γH2AX focus formation and ATR phosphorylation, in ovarian and head and neck cancer cell lines, suggesting impaired DDR signaling as the driving mechanism. This work enlarges the set of genes that may contribute to chemotherapy resistance and provides a new contextual resource for interpreting next generation sequencing (NGS) genomic profiling of tumors. PMID:27863405
Han, Xuelei; Jiang, Tengfei; Yang, Huawei; Zhang, Qingde; Wang, Weimin; Fan, Bin; Liu, Bang
2012-06-01
Meat quality traits are economically important traits of swine, and are controlled by multiple genes as complex quantitative traits. In the present study four genes, H-FABP (heart fatty acid-binding protein), MASTR (MEF2 activating motif and SAP domain containing transcriptional regulator), UCP3 (uncoupling protein 3) and MYOD1 (myogenic differentiation 1) were researched in Large White pigs. The polymorphisms H-FABP T/C of 5'UTR, MYOD1 g.257 A>C, UCP3 g.1406 G>A in exon 3 and MASTR c.187 C>T have been reported to be associated with meat quality traits in pigs. The aim of this study was to analyze the effect of single and multiple markers for single traits in Large White pigs. The single marker association analysis showed that the H-FABP and MASTR genes were associated with IMF (intramuscular fat content) (P < 0.05), and that the g.257 A>C of MYOD1 gene was most significantly related to muscle pH value (P < 0.01). The multiple markers for IMF were analyzed by combining the markers and quantitative trait modes into the linear regression. The results revealed that H-FABP and MASTR integrate gene networks for IMF. Thus, our study results suggested that H-FABP and MASTR polymorphisms could be used as genetic markers in the marker-assisted selection towards the improvement of IMF in Large White pigs.
Pajic, Marina; Froio, Danielle; Daly, Sheridan; Doculara, Louise; Millar, Ewan; Graham, Peter H; Drury, Alison; Steinmann, Angela; de Bock, Charles E; Boulghourjian, Alice; Zaratzian, Anaiis; Carroll, Susan; Toohey, Joanne; O'Toole, Sandra A; Harris, Adrian L; Buffa, Francesca M; Gee, Harriet E; Hollway, Georgina E; Molloy, Timothy J
2018-01-15
Radiotherapy is essential to the treatment of most solid tumors and acquired or innate resistance to this therapeutic modality is a major clinical problem. Here we show that miR-139-5p is a potent modulator of radiotherapy response in breast cancer via its regulation of genes involved in multiple DNA repair and reactive oxygen species defense pathways. Treatment of breast cancer cells with a miR-139-5p mimic strongly synergized with radiation both in vitro and in vivo, resulting in significantly increased oxidative stress, accumulation of unrepaired DNA damage, and induction of apoptosis. Several miR-139-5p target genes were also strongly predictive of outcome in radiotherapy-treated patients across multiple independent breast cancer cohorts. These prognostically relevant miR-139-5p target genes were used as companion biomarkers to identify radioresistant breast cancer xenografts highly amenable to sensitization by cotreatment with a miR-139-5p mimetic. Significance: The microRNA described in this study offers a potentially useful predictive biomarker of radiosensitivity in solid tumors and a generally applicable druggable target for tumor radiosensitization. Cancer Res; 78(2); 501-15. ©2017 American Association for Cancer Research.
CoPub: a literature-based keyword enrichment tool for microarray data analysis.
Frijters, Raoul; Heupers, Bart; van Beek, Pieter; Bouwhuis, Maurice; van Schaik, René; de Vlieg, Jacob; Polman, Jan; Alkema, Wynand
2008-07-01
Medline is a rich information source, from which links between genes and keywords describing biological processes, pathways, drugs, pathologies and diseases can be extracted. We developed a publicly available tool called CoPub that uses the information in the Medline database for the biological interpretation of microarray data. CoPub allows batch input of multiple human, mouse or rat genes and produces lists of keywords from several biomedical thesauri that are significantly correlated with the set of input genes. These lists link to Medline abstracts in which the co-occurring input genes and correlated keywords are highlighted. Furthermore, CoPub can graphically visualize differentially expressed genes and over-represented keywords in a network, providing detailed insight into the relationships between genes and keywords, and revealing the most influential genes as highly connected hubs. CoPub is freely accessible at http://services.nbic.nl/cgi-bin/copub/CoPub.pl.
Bipartite Community Structure of eQTLs.
Platig, John; Castaldi, Peter J; DeMeo, Dawn; Quackenbush, John
2016-09-01
Genome Wide Association Studies (GWAS) and expression quantitative trait locus (eQTL) analyses have identified genetic associations with a wide range of human phenotypes. However, many of these variants have weak effects and understanding their combined effect remains a challenge. One hypothesis is that multiple SNPs interact in complex networks to influence functional processes that ultimately lead to complex phenotypes, including disease states. Here we present CONDOR, a method that represents both cis- and trans-acting SNPs and the genes with which they are associated as a bipartite graph and then uses the modular structure of that graph to place SNPs into a functional context. In applying CONDOR to eQTLs in chronic obstructive pulmonary disease (COPD), we found the global network "hub" SNPs were devoid of disease associations through GWAS. However, the network was organized into 52 communities of SNPs and genes, many of which were enriched for genes in specific functional classes. We identified local hubs within each community ("core SNPs") and these were enriched for GWAS SNPs for COPD and many other diseases. These results speak to our intuition: rather than single SNPs influencing single genes, we see groups of SNPs associated with the expression of families of functionally related genes and that disease SNPs are associated with the perturbation of those functions. These methods are not limited in their application to COPD and can be used in the analysis of a wide variety of disease processes and other phenotypic traits.
Systematic reconstruction of autism biology from massive genetic mutation profiles
Zhang, Chaolin; Jiang, Yong-hui
2018-01-01
Autism spectrum disorder (ASD) affects 1% of world population and has become a pressing medical and social problem worldwide. As a paradigmatic complex genetic disease, ASD has been intensively studied and thousands of gene mutations have been reported. Because these mutations rarely recur, it is difficult to (i) pinpoint the fewer disease-causing versus majority random events and (ii) replicate or verify independent studies. A coherent and systematic understanding of autism biology has not been achieved. We analyzed 3392 and 4792 autism-related mutations from two large-scale whole-exome studies across multiple resolution levels, that is, variants (single-nucleotide), genes (protein-coding unit), and pathways (molecular module). These mutations do not recur or replicate at the variant level, but significantly and increasingly do so at gene and pathway levels. Genetic association reveals a novel gene + pathway dual-hit model, where the mutation burden becomes less relevant. In multiple independent analyses, hundreds of variants or genes repeatedly converge to several canonical pathways, either novel or literature-supported. These pathways define recurrent and systematic ASD biology, distinct from previously reported gene groups or networks. They also present a catalog of novel ASD risk factors including 118 variants and 72 genes. At a subpathway level, most variants disrupt the pathway-related gene functions, and in the same gene, they tend to hit residues extremely close to each other and in the same domain. Multiple interacting variants spotlight key modules, including the cAMP (adenosine 3′,5′-monophosphate) second-messenger system and mGluR (metabotropic glutamate receptor) signaling regulation by GRKs (G protein–coupled receptor kinases). At a superpathway level, distinct pathways further interconnect and converge to three biology themes: synaptic function, morphology, and plasticity. PMID:29651456
Systematic reconstruction of autism biology from massive genetic mutation profiles.
Luo, Weijun; Zhang, Chaolin; Jiang, Yong-Hui; Brouwer, Cory R
2018-04-01
Autism spectrum disorder (ASD) affects 1% of world population and has become a pressing medical and social problem worldwide. As a paradigmatic complex genetic disease, ASD has been intensively studied and thousands of gene mutations have been reported. Because these mutations rarely recur, it is difficult to (i) pinpoint the fewer disease-causing versus majority random events and (ii) replicate or verify independent studies. A coherent and systematic understanding of autism biology has not been achieved. We analyzed 3392 and 4792 autism-related mutations from two large-scale whole-exome studies across multiple resolution levels, that is, variants (single-nucleotide), genes (protein-coding unit), and pathways (molecular module). These mutations do not recur or replicate at the variant level, but significantly and increasingly do so at gene and pathway levels. Genetic association reveals a novel gene + pathway dual-hit model, where the mutation burden becomes less relevant. In multiple independent analyses, hundreds of variants or genes repeatedly converge to several canonical pathways, either novel or literature-supported. These pathways define recurrent and systematic ASD biology, distinct from previously reported gene groups or networks. They also present a catalog of novel ASD risk factors including 118 variants and 72 genes. At a subpathway level, most variants disrupt the pathway-related gene functions, and in the same gene, they tend to hit residues extremely close to each other and in the same domain. Multiple interacting variants spotlight key modules, including the cAMP (adenosine 3',5'-monophosphate) second-messenger system and mGluR (metabotropic glutamate receptor) signaling regulation by GRKs (G protein-coupled receptor kinases). At a superpathway level, distinct pathways further interconnect and converge to three biology themes: synaptic function, morphology, and plasticity.
2009-01-01
Background Sequence identification of ESTs from non-model species offers distinct challenges, particularly when these species have duplicated genomes and when they are phylogenetically distant from sequenced model organisms. For the common carp, an environmental model of aquacultural interest, large numbers of ESTs remained unidentified using BLAST sequence alignment. We have used the expression profiles from large-scale microarray experiments to suggest gene identities. Results Expression profiles from ~700 cDNA microarrays describing responses of 7 major tissues to multiple environmental stressors were used to define a co-expression landscape. This was based on the Pearson correlation coefficient relating each gene with all other genes, from which a network description provided clusters of highly correlated genes as 'mountains'. We show that these contain genes with known identities and genes with unknown identities, and that the correlation constitutes evidence of identity in the latter. This procedure has suggested identities for 522 of 2701 unknown carp EST sequences. We also discriminate several common carp genes and gene isoforms that were not discriminated by BLAST sequence alignment alone. Precision in identification was substantially improved by use of data from multiple tissues and treatments. Conclusion The detailed analysis of co-expression landscapes is a sensitive technique for suggesting an identity for the large number of BLAST-unidentified cDNAs generated in EST projects. It is capable of detecting even subtle changes in expression profiles, and thereby of distinguishing genes with a common BLAST identity into different identities. It benefits from the use of multiple treatments or contrasts, and from the large-scale microarray data. PMID:19939286
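The general idea described in the abstract above, using correlation between expression profiles as evidence of identity, can be sketched as follows. This is a simplified illustration and not the authors' co-expression landscape procedure: the gene labels, the expression values, and the 0.9 threshold are hypothetical, and each unknown gene is simply matched to its best-correlated known gene.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation is undefined, treat as 0
    return cov / sqrt(var_x * var_y)


def suggest_identities(profiles, known, threshold=0.9):
    """Map each unknown gene to its best-correlated known gene, if r >= threshold."""
    suggestions = {}
    for gene, profile in profiles.items():
        if gene in known:
            continue
        best = max(known, key=lambda k: pearson(profile, profiles[k]))
        r = pearson(profile, profiles[best])
        if r >= threshold:
            suggestions[gene] = (known[best], r)
    return suggestions


# Hypothetical expression profiles across four conditions.
profiles = {
    "est_001": [1.0, 2.0, 3.0, 4.0],
    "est_002": [2.0, 4.0, 6.0, 8.5],
    "est_003": [4.0, 1.0, 3.0, 2.0],
}
known = {"est_001": "hypothetical_gene_X"}  # est_001 already has an identity
suggestions = suggest_identities(profiles, known)
```

Here est_002 tracks est_001 closely and inherits a suggested identity, while est_003 does not pass the threshold. A realistic analysis would need many more conditions per profile, which matches the abstract's remark that precision improved with data from multiple tissues and treatments.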
Badr, Eman; ElHefnawi, Mahmoud; Heath, Lenwood S
2016-01-01
Alternative splicing is a vital process for regulating gene expression and promoting proteomic diversity. It plays a key role in tissue-specific expressed genes. This specificity is mainly regulated by splicing factors that bind to specific sequences called splicing regulatory elements (SREs). Here, we report a genome-wide analysis to study alternative splicing on multiple tissues, including brain, heart, liver, and muscle. We propose a pipeline to identify differential exons across tissues and hence tissue-specific SREs. In our pipeline, we utilize the DEXSeq package along with our previously reported algorithms. Utilizing the publicly available RNA-Seq data set from the Human BodyMap project, we identified 28,100 differentially used exons across the four tissues. We identified tissue-specific exonic splicing enhancers that overlap with various previously published experimental and computational databases. A complicated exonic enhancer regulatory network was revealed, where multiple exonic enhancers were found across multiple tissues while some were found only in specific tissues. Putative combinatorial exonic enhancers and silencers were discovered as well, which may be responsible for exon inclusion or exclusion across tissues. Some of the exonic enhancers are found to be co-occurring with multiple exonic silencers and vice versa, which demonstrates a complicated relationship between tissue-specific exonic enhancers and silencers.
Genomic analysis of regulatory network dynamics reveals large topological changes
NASA Astrophysics Data System (ADS)
Luscombe, Nicholas M.; Madan Babu, M.; Yu, Haiyuan; Snyder, Michael; Teichmann, Sarah A.; Gerstein, Mark
2004-09-01
Network analysis has been applied widely, providing a unifying language to describe disparate systems ranging from social interactions to power grids. It has recently been used in molecular biology, but so far the resulting networks have only been analysed statically. Here we present the dynamics of a biological network on a genomic scale, by integrating transcriptional regulatory information and gene-expression data for multiple conditions in Saccharomyces cerevisiae. We develop an approach for the statistical analysis of network dynamics, called SANDY, combining well-known global topological measures, local motifs and newly derived statistics. We uncover large changes in underlying network architecture that are unexpected given current viewpoints and random simulations. In response to diverse stimuli, transcription factors alter their interactions to varying degrees, thereby rewiring the network. A few transcription factors serve as permanent hubs, but most act transiently only during certain conditions. By studying sub-network structures, we show that environmental responses facilitate fast signal propagation (for example, with short regulatory cascades), whereas the cell cycle and sporulation direct temporal progression through multiple stages (for example, with highly inter-connected transcription factors). Indeed, to drive the latter processes forward, phase-specific transcription factors inter-regulate serially, and ubiquitously active transcription factors layer above them in a two-tiered hierarchy. We anticipate that many of the concepts presented here, particularly the large-scale topological changes and hub transience, will apply to other biological networks, including complex sub-systems in higher eukaryotes.
Mach, Núria; Ramayo-Caldas, Yuliaxis; Clark, Allison; Moroldo, Marco; Robert, Céline; Barrey, Eric; López, Jesús Maria; Le Moyec, Laurence
2017-02-17
Endurance exercise in horses requires adaptive processes involving physiological, biochemical, and cognitive-behavioral responses in an attempt to regain homeostasis. We hypothesized that the identification of the relationships between blood metabolome, transcriptome, and miRNome during endurance exercise in horses could provide significant insights into the molecular response to endurance exercise. For this reason, the serum metabolome and whole-blood transcriptome and miRNome data were obtained from ten horses before and after a 160 km endurance competition. We obtained a global regulatory network based on 11 unique metabolites, 263 metabolic genes and 5 miRNAs whose expression was significantly altered at T1 (post-endurance competition) relative to T0 (baseline, pre-endurance competition). This network provided new insights into the cross talk between the distinct molecular pathways (e.g. energy and oxygen sensing, oxidative stress, and inflammation) that were not detectable when analyzing single metabolites or transcripts alone. Single metabolites and transcripts were carrying out multiple roles and thus sharing several biochemical pathways. Using a regulatory impact factor metric analysis, this regulatory network was further confirmed at the transcription factor and miRNA levels. In an extended cohort of 31 independent animals, multiple factor analysis confirmed the strong associations between lactate, methylene derivatives, miR-21-5p, miR-16-5p, let-7 family and genes that coded proteins involved in metabolic reactions primarily related to energy, ubiquitin proteasome and lipopolysaccharide immune responses after the endurance competition. Multiple factor analysis also identified potential biomarkers at T0 for an increased likelihood for failure to finish an endurance competition.
To the best of our knowledge, the present study is the first to provide a comprehensive and integrated overview of the metabolome, transcriptome, and miRNome co-regulatory networks that may have a key role in regulating the metabolic and immune response to endurance exercise in horses.
Abnormal metabolic brain networks in Parkinson's disease from blackboard to bedside.
Tang, Chris C; Eidelberg, David
2010-01-01
Metabolic imaging in the rest state has provided valuable information concerning the abnormalities of regional brain function that underlie idiopathic Parkinson's disease (PD). Moreover, network modeling procedures, such as spatial covariance analysis, have further allowed for the quantification of these changes at the systems level. In recent years, we have utilized this strategy to identify and validate three discrete metabolic networks in PD associated with the motor and cognitive manifestations of the disease. In this chapter, we will review and compare the specific functional topographies underlying parkinsonian akinesia/rigidity, tremor, and cognitive disturbance. While network activity progressed over time, the rate of change for each pattern was distinctive and paralleled the development of the corresponding clinical symptoms in early-stage patients. This approach is already showing great promise in identifying individuals with prodromal manifestations of PD and in assessing the rate of progression before clinical onset. Network modulation was found to correlate with the clinical effects of dopaminergic treatment and surgical interventions, such as subthalamic nucleus (STN) deep brain stimulation (DBS) and gene therapy. Abnormal metabolic networks have also been identified for atypical parkinsonian syndromes, such as multiple system atrophy (MSA) and progressive supranuclear palsy (PSP). Using multiple disease-related networks for PD, MSA, and PSP, we have developed a novel, fully automated algorithm for accurate classification at the single-patient level, even at early disease stages. Copyright © 2010 Elsevier B.V. All rights reserved.
Yue, Xun; Li, Xing Guo; Gao, Xin-Qi; Zhao, Xiang Yu; Dong, Yu Xiu; Zhou, Chao
2016-09-02
Phytohormone synergies and signaling interdependency are important topics in plant developmental biology. Physiological and genetic experimental evidence for phytohormone crosstalk has been accumulating and a genome-scale enzyme correlation model representing the Arabidopsis metabolic pathway has been published. However, an integrated molecular characterization of phytohormone crosstalk is still not available. A novel modeling methodology and advanced computational approaches were used to construct an enzyme-based Arabidopsis phytohormone crosstalk network (EAPCN) at the biosynthesis level. The EAPCN provided the structural connectivity architecture of phytohormone biosynthesis pathways and revealed a surprising result: enzymes localized at the highly connected nodes formed a consecutive metabolic route. Furthermore, our analysis revealed that the transcription factors (TFs) that regulate enzyme-encoding genes in the consecutive metabolic route formed structures, which we describe as circular control units operating at the transcriptional level. Furthermore, the downstream TFs in phytohormone signal transduction pathways were found to be involved in the circular control units that included the TFs regulating enzyme-encoding genes. In addition, multiple functional enzymes in the EAPCN were found to be involved in ion and pH homeostasis, environmental signal perception, cellular redox homeostasis, and circadian clocks. Last, publicly available transcriptional profiles and a protein expression map of the Arabidopsis root apical meristem were used as a case study to validate the proposed framework. Our results revealed multiple scales of coupled mechanisms in hormonal crosstalk networks that play a central role in coordinating internal developmental processes with environmental signals, and give a broader view of Arabidopsis phytohormone crosstalk. We also uncovered potential key regulators that can be further analyzed in future studies.
SZDB: A Database for Schizophrenia Genetic Research
Wu, Yong; Yao, Yong-Gang
2017-01-01
Schizophrenia (SZ) is a debilitating brain disorder with a complex genetic architecture. Genetic studies, especially recent genome-wide association studies (GWAS), have identified multiple variants (loci) conferring risk to SZ. However, how to efficiently extract meaningful biological information from bulk genetic findings of SZ remains a major challenge. There is a pressing need to integrate multiple layers of data from various sources, e.g., genetic findings from GWAS, copy number variations (CNVs), association and linkage studies, gene expression, protein–protein interaction (PPI), co-expression, expression quantitative trait loci (eQTL), and Encyclopedia of DNA Elements (ENCODE) data, to provide a comprehensive resource to facilitate the translation of genetic findings into SZ molecular diagnosis and mechanism study. Here we developed the SZDB database (http://www.szdb.org/), a comprehensive resource for SZ research. SZ genetic data, gene expression data, network-based data, brain eQTL data, and SNP function annotation information were systematically extracted, curated and deposited in SZDB. In-depth analyses and systematic integration were performed to identify top prioritized SZ genes and enriched pathways. Multiple types of data from various layers of SZ research were systematically integrated and deposited in SZDB. In-depth data analyses and integration identified top prioritized SZ genes and enriched pathways. We further showed that genes implicated in SZ are highly co-expressed in human brain and proteins encoded by the prioritized SZ risk genes are significantly interacted. The user-friendly SZDB provides high-confidence candidate variants and genes for further functional characterization. More important, SZDB provides convenient online tools for data search and browse, data integration, and customized data analyses. PMID:27451428
Caberlotto, Laura; Lauria, Mario; Nguyen, Thanh-Phuong; Scotti, Marco
2013-01-01
Alzheimer's disease is the most common cause of dementia worldwide, affecting the elderly population. It is characterized by the hallmark pathology of amyloid-β deposition, neurofibrillary tangle formation, and extensive neuronal degeneration in the brain. A wealth of data related to Alzheimer's disease has been generated to date; nevertheless, the molecular mechanism underlying the etiology and pathophysiology of the disease is still unknown. Here we describe a method for the combined analysis of multiple types of genome-wide data aimed at revealing convergent evidence of interest that would not be captured by a standard molecular approach. Lists of Alzheimer-related genes (seed genes) were obtained from different sets of data on gene expression, SNPs, and molecular targets of drugs. Network analysis was applied for identifying the regions of the human protein-protein interaction network showing a significant enrichment in seed genes, and ultimately, in genes associated with Alzheimer's disease, due to the cumulative effect of different combinations of the starting data sets. The functional properties of these enriched modules were characterized, effectively considering the role of both Alzheimer-related seed genes and genes that closely interact with them. This approach allowed us to present evidence in favor of one of the competing theories about the processes underlying AD, specifically evidence supporting a predominant role of metabolism-associated biological process terms, including autophagy, insulin and fatty acid metabolic processes in Alzheimer's disease, with a focus on AMP-activated protein kinase. This central regulator of cellular energy homeostasis regulates a series of brain functions altered in Alzheimer's disease and could link genetic perturbation with neuronal transmission and energy regulation, representing a potential candidate to be targeted by therapy.
Pluripotency gene network dynamics: System views from parametric analysis.
Akberdin, Ilya R; Omelyanchuk, Nadezda A; Fadeev, Stanislav I; Leskova, Natalya E; Oschepkova, Evgeniya A; Kazantsev, Fedor V; Matushkin, Yury G; Afonnikov, Dmitry A; Kolchanov, Nikolay A
2018-01-01
Multiple experimental data demonstrated that the core gene network orchestrating self-renewal and differentiation of mouse embryonic stem cells involves activity of Oct4, Sox2 and Nanog genes by means of a number of positive feedback loops among them. However, recent studies indicated that the architecture of the core gene network should also incorporate negative Nanog autoregulation and might not include positive feedbacks from Nanog to Oct4 and Sox2. Thorough parametric analysis of the mathematical model based on this revisited core regulatory circuit identified substantial changes in model dynamics depending on the strength of Oct4 and Sox2 activation and the molecular complexity of Nanog autorepression. The analysis showed the existence of four dynamical domains with different numbers of stable and unstable steady states. We hypothesize that these domains can constitute the checkpoints in a developmental progression from naïve to primed pluripotency and vice versa. During this transition, parametric conditions exist that generate an oscillatory behavior of the system, explaining heterogeneity in expression of pluripotent and differentiation factors in serum ESC cultures. Finally, simulations showed that addition of positive feedbacks from Nanog to Oct4 and Sox2 leads mainly to an increase of the parametric space for the naïve ESC state, in which pluripotency factors are strongly expressed while differentiation ones are repressed.
Visscher, Anne M.; Belfield, Eric J.; Vlad, Daniela; Irani, Niloufer; Moore, Ian; Harberd, Nicholas P.
2015-01-01
A subset of genes in Arabidopsis thaliana is known to be up-regulated in response to a wide range of different environmental stress factors. However, not all of these genes are characterized as yet with respect to their functions. In this study, we used transgenic knockout, overexpression and reporter gene approaches to try to elucidate the biological roles of five unknown multiple-stress responsive genes in Arabidopsis. The selected genes have the following locus identifiers: At1g18740, At1g74450, At4g27652, At4g29780 and At5g12010. Firstly, T-DNA insertion knockout lines were identified for each locus and screened for altered phenotypes. None of the lines were found to be visually different from wildtype Col-0. Secondly, 35S-driven overexpression lines were generated for each open reading frame. Analysis of these transgenic lines showed altered phenotypes for lines overexpressing the At1g74450 ORF. Plants overexpressing the multiple-stress responsive gene At1g74450 are stunted in height and have reduced male fertility. Alexander staining of anthers from flowers at developmental stage 12–13 showed either an absence or a reduction in viable pollen compared to wildtype Col-0 and At1g74450 knockout lines. Interestingly, the effects of stress on crop productivity are most severe at developmental stages such as male gametophyte development. However, the molecular factors and regulatory networks underlying environmental stress-induced male gametophytic alterations are still largely unknown. Our results indicate that the At1g74450 gene provides a potential link between multiple environmental stresses, plant height and pollen development. In addition, ruthenium red staining analysis showed that At1g74450 may affect the composition of the inner seed coat mucilage layer. Finally, C-terminal GFP fusion proteins for At1g74450 were shown to localise to the cytosol. PMID:26485022
Boolean dynamics of genetic regulatory networks inferred from microarray time series data
Martin, Shawn; Zhang, Zhaoduo; Martino, Anthony; ...
2007-01-31
Methods available for the inference of genetic regulatory networks strive to produce a single network, usually by optimizing some quantity to fit the experimental observations. In this paper we investigate the possibility that multiple networks can be inferred, all resulting in similar dynamics. This idea is motivated by theoretical work which suggests that biological networks are robust and adaptable to change, and that the overall behavior of a genetic regulatory network might be captured in terms of dynamical basins of attraction. We have developed and implemented a method for inferring genetic regulatory networks for time series microarray data. Our method first clusters and discretizes the gene expression data using k-means and support vector regression. We then enumerate Boolean activation–inhibition networks to match the discretized data. In conclusion, the dynamics of the Boolean networks are examined. We have tested our method on two immunology microarray datasets: an IL-2-stimulated T cell response dataset and a LPS-stimulated macrophage response dataset. In both cases, we discovered that many networks matched the data, and that most of these networks had similar dynamics.
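The notion of "basins of attraction" for a Boolean network can be illustrated with a small sketch. The three-gene network below is hypothetical (it is not taken from the paper), and synchronous updating is an assumption, since the abstract does not state the update scheme. The code enumerates every start state, follows it to its attractor, and counts how many states drain into each one.

```python
from itertools import product

def step(state, rules):
    """Synchronously update every gene from the current state (assumed scheme)."""
    return tuple(rule(state) for rule in rules)

def attractors(rules, n_genes):
    """Map each attractor (a frozenset of states) to the size of its basin."""
    basins = {}
    for start in product((0, 1), repeat=n_genes):
        seen = []
        state = start
        # Follow the trajectory until a state repeats; the repeat closes a cycle.
        while state not in seen:
            seen.append(state)
            state = step(state, rules)
        cycle = frozenset(seen[seen.index(state):])
        basins[cycle] = basins.get(cycle, 0) + 1
    return basins

# Hypothetical activation-inhibition ring:
# gene 0 is inhibited by gene 2, gene 1 is activated by gene 0,
# gene 2 is activated by gene 1.
ring_rules = [lambda s: 1 - s[2], lambda s: s[0], lambda s: s[1]]

basins = attractors(ring_rules, 3)
for cycle, size in basins.items():
    print(f"attractor of length {len(cycle)} with basin size {size}")
```

Two candidate networks could then be compared by checking whether they produce the same attractors and similar basin sizes, which is one way to read "similar dynamics".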
Du, Qingzhang; Tian, Jiaxing; Yang, Xiaohui; Pan, Wei; Xu, Baohua; Li, Bailian; Ingvarsson, Pär K.; Zhang, Deqiang
2015-01-01
Economically important traits in many species generally show polygenic, quantitative inheritance. The components of genetic variation (additive, dominant and epistatic effects) of these traits conferred by multiple genes in shared biological pathways remain to be defined. Here, we investigated 11 full-length genes in cellulose biosynthesis, on 10 growth and wood-property traits, within a population of 460 unrelated Populus tomentosa individuals, via multi-gene association. To validate positive associations, we conducted single-marker analysis in a linkage population of 1,200 individuals. We identified 118, 121, and 43 associations (P < 0.01) corresponding to additive, dominant, and epistatic effects, respectively, with low to moderate proportions of phenotypic variance (R2). Epistatic interaction models uncovered a combination of three non-synonymous sites from three unique genes, representing a significant epistasis for diameter at breast height and stem volume. Single-marker analysis validated 61 associations (false discovery rate, Q ≤ 0.10), representing 38 SNPs from nine genes, with an average effect (R2 = 3.8%) nearly 2-fold higher than that identified with multi-gene association, suggesting that multi-gene association can capture smaller individual variants. Moreover, a structural gene–gene network based on tissue-specific transcript abundances provides a better understanding of the multi-gene pathway affecting tree growth and lignocellulose biosynthesis. Our study highlights the importance of pathway-based multiple gene associations to uncover the nature of genetic variance for quantitative traits and may drive novel progress in molecular breeding. PMID:25428896
Ferrari, Raffaele; Forabosco, Paola; Vandrovcova, Jana; Botía, Juan A; Guelfi, Sebastian; Warren, Jason D; Momeni, Parastoo; Weale, Michael E; Ryten, Mina; Hardy, John
2016-02-24
In frontotemporal dementia (FTD) there is a critical lack in the understanding of biological and molecular mechanisms involved in disease pathogenesis. The heterogeneous genetic features associated with FTD suggest that multiple disease-mechanisms are likely to contribute to the development of this neurodegenerative condition. We here present a systems biology approach with the scope of i) shedding light on the biological processes potentially implicated in the pathogenesis of FTD and ii) identifying novel potential risk factors for FTD. We performed a gene co-expression network analysis of microarray expression data from 101 individuals without neurodegenerative diseases to explore regional-specific co-expression patterns in the frontal and temporal cortices for 12 genes (MAPT, GRN, CHMP2B, CTSC, HLA-DRA, TMEM106B, C9orf72, VCP, UBQLN2, OPTN, TARDBP and FUS) associated with FTD and we then carried out gene set enrichment and pathway analyses, and investigated known protein-protein interactors (PPIs) of FTD-genes products. Gene co-expression networks revealed that several FTD-genes (such as MAPT and GRN, CTSC and HLA-DRA, TMEM106B, and C9orf72, VCP, UBQLN2 and OPTN) were clustering in modules of relevance in the frontal and temporal cortices. Functional annotation and pathway analyses of such modules indicated enrichment for: i) DNA metabolism, i.e. transcription regulation, DNA protection and chromatin remodelling (MAPT and GRN modules); ii) immune and lysosomal processes (CTSC and HLA-DRA modules), and; iii) protein meta/catabolism (C9orf72, VCP, UBQLN2 and OPTN, and TMEM106B modules). PPI analysis supported the results of the functional annotation and pathway analyses. 
This work further characterizes known FTD-genes and elaborates on their biological relevance to disease: not only do we indicate likely impacted regional-specific biological processes driven by modules containing FTD-genes, but we also suggest novel potential risk factors among the interactors of FTD-genes as targets for further mechanistic characterization in hypothesis-driven cell biology work.
Suo, Chen; Hrydziuszko, Olga; Lee, Donghwan; Pramana, Setia; Saputra, Dhany; Joshi, Himanshu; Calza, Stefano; Pawitan, Yudi
2015-08-15
Genome and transcriptome analyses can be used to explore cancers comprehensively, and it is increasingly common to have multiple omics data measured from each individual. Furthermore, there are rich functional data such as predicted impact of mutations on protein coding and gene/protein networks. However, integration of the complex information across the different omics and functional data is still challenging. Clinical validation, particularly based on patient outcomes such as survival, is important for assessing the relevance of the integrated information and for comparing different procedures. An analysis pipeline is built for integrating genomic and transcriptomic alterations from whole-exome and RNA sequence data and functional data from protein function prediction and gene interaction networks. The method accumulates evidence for the functional implications of mutated potential driver genes found within and across patients. A driver-gene score (DGscore) is developed to capture the cumulative effect of such genes. To contribute to the score, a gene has to be frequently mutated, with high or moderate mutational impact at protein level, exhibiting an extreme expression and functionally linked to many differentially expressed neighbors in the functional gene network. The pipeline is applied to 60 matched tumor and normal samples of the same patient from The Cancer Genome Atlas breast-cancer project. In clinical validation, patients with high DGscores have worse survival than those with low scores (P = 0.001). Furthermore, the DGscore outperforms the established expression-based signatures MammaPrint and PAM50 in predicting patient survival. In conclusion, integration of mutation, expression and functional data allows identification of clinically relevant potential driver genes in cancer. The documented pipeline including annotated sample scripts can be found in http://fafner.meb.ki.se/biostatwiki/driver-genes/. 
Contact: yudi.pawitan@ki.se Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
A reproducible approach to high-throughput biological data acquisition and integration
Rahnavard, Gholamali; Waldron, Levi; McIver, Lauren; Shafquat, Afrah; Franzosa, Eric A.; Miropolsky, Larissa; Sweeney, Christopher
2015-01-01
Modern biological research requires rapid, complex, and reproducible integration of multiple experimental results generated both internally and externally (e.g., from public repositories). Although large systematic meta-analyses are among the most effective approaches both for clinical biomarker discovery and for computational inference of biomolecular mechanisms, identifying, acquiring, and integrating relevant experimental results from multiple sources for a given study can be time-consuming and error-prone. To enable efficient and reproducible integration of diverse experimental results, we developed a novel approach for standardized acquisition and analysis of high-throughput and heterogeneous biological data. This allowed, first, novel biomolecular network reconstruction in human prostate cancer, which correctly recovered and extended the NFκB signaling pathway. Next, we investigated host-microbiome interactions. In less than an hour of analysis time, the system retrieved data and integrated six germ-free murine intestinal gene expression datasets to identify the genes most influenced by the gut microbiota, which comprised a set of immune-response and carbohydrate metabolism processes. Finally, we constructed integrated functional interaction networks to compare connectivity of peptide secretion pathways in the model organisms Escherichia coli, Bacillus subtilis, and Pseudomonas aeruginosa. PMID:26157642
Gene network inference by fusing data from diverse distributions
Žitnik, Marinka; Zupan, Blaž
2015-01-01
Motivation: Markov networks are undirected graphical models that are widely used to infer relations between genes from experimental data. Their state-of-the-art inference procedures assume the data arise from a Gaussian distribution. High-throughput omics data, such as that from next generation sequencing, often violates this assumption. Furthermore, when collected data arise from multiple related but otherwise nonidentical distributions, their underlying networks are likely to have common features. New principled statistical approaches are needed that can deal with different data distributions and jointly consider collections of datasets. Results: We present FuseNet, a Markov network formulation that infers networks from a collection of nonidentically distributed datasets. Our approach is computationally efficient and general: given any number of distributions from an exponential family, FuseNet represents model parameters through shared latent factors that define neighborhoods of network nodes. In a simulation study, we demonstrate good predictive performance of FuseNet in comparison to several popular graphical models. We show its effectiveness in an application to breast cancer RNA-sequencing and somatic mutation data, a novel application of graphical models. Fusion of datasets offers substantial gains relative to inference of separate networks for each dataset. Our results demonstrate that network inference methods for non-Gaussian data can help in accurate modeling of the data generated by emergent high-throughput technologies. Availability and implementation: Source code is at https://github.com/marinkaz/fusenet. Contact: blaz.zupan@fri.uni-lj.si Supplementary information: Supplementary information is available at Bioinformatics online. PMID:26072487
Network representations of immune system complexity
Subramanian, Naeha; Torabi-Parizi, Parizad; Gottschalk, Rachel A.; Germain, Ronald N.; Dutta, Bhaskar
2015-01-01
The mammalian immune system is a dynamic multi-scale system composed of a hierarchically organized set of molecular, cellular and organismal networks that act in concert to promote effective host defense. These networks range from those involving gene regulatory and protein-protein interactions underlying intracellular signaling pathways and single cell responses to increasingly complex networks of in vivo cellular interaction, positioning and migration that determine the overall immune response of an organism. Immunity is thus not the product of simple signaling events but rather non-linear behaviors arising from dynamic, feedback-regulated interactions among many components. One of the major goals of systems immunology is to quantitatively measure these complex multi-scale spatial and temporal interactions, permitting development of computational models that can be used to predict responses to perturbation. Recent technological advances permit collection of comprehensive datasets at multiple molecular and cellular levels while advances in network biology support representation of the relationships of components at each level as physical or functional interaction networks. The latter facilitate effective visualization of patterns and recognition of emergent properties arising from the many interactions of genes, molecules, and cells of the immune system. We illustrate the power of integrating ‘omics’ and network modeling approaches for unbiased reconstruction of signaling and transcriptional networks with a focus on applications involving the innate immune system. We further discuss future possibilities for reconstruction of increasingly complex cellular and organism-level networks and development of sophisticated computational tools for prediction of emergent immune behavior arising from the concerted action of these networks. PMID:25625853
Tanimizu, Toshiyuki; Kenney, Justin W; Okano, Emiko; Kadoma, Kazune; Frankland, Paul W; Kida, Satoshi
2017-04-12
Social recognition memory is an essential and basic component of social behavior that is used to discriminate familiar and novel animals/humans. Previous studies have shown the importance of several brain regions for social recognition memories; however, the mechanisms underlying the consolidation of social recognition memory at the molecular and anatomic levels remain unknown. Here, we show a brain network necessary for the generation of social recognition memory in mice. A mouse genetic study showed that cAMP-responsive element-binding protein (CREB)-mediated transcription is required for the formation of social recognition memory. Importantly, significant inductions of the CREB target immediate-early genes c-fos and Arc were observed in the hippocampus (CA1 and CA3 regions), medial prefrontal cortex (mPFC), anterior cingulate cortex (ACC), and amygdala (basolateral region) when social recognition memory was generated. Pharmacological experiments using a microinfusion of the protein synthesis inhibitor anisomycin showed that protein synthesis in these brain regions is required for the consolidation of social recognition memory. These findings suggested that social recognition memory is consolidated through the activation of CREB-mediated gene expression in the hippocampus/mPFC/ACC/amygdala. Network analyses suggested that these four brain regions show functional connectivity with other brain regions and, more importantly, that the hippocampus functions as a hub to integrate brain networks and generate social recognition memory, whereas the ACC and amygdala are important for coordinating brain activity when social interaction is initiated by connecting with other brain regions. We have found that a brain network composed of the hippocampus/mPFC/ACC/amygdala is required for the consolidation of social recognition memory. SIGNIFICANCE STATEMENT Here, we identify brain networks composed of multiple brain regions for the consolidation of social recognition memory. 
We found that social recognition memory is consolidated through CREB-mediated gene expression in the hippocampus, medial prefrontal cortex, anterior cingulate cortex (ACC), and amygdala. Importantly, network analyses based on c-fos expression suggest that functional connectivity of these four brain regions with other brain regions is increased with time spent in social investigation toward the generation of brain networks to consolidate social recognition memory. Furthermore, our findings suggest that hippocampus functions as a hub to integrate brain networks and generate social recognition memory, whereas ACC and amygdala are important for coordinating brain activity when social interaction is initiated by connecting with other brain regions. Copyright © 2017 the authors.
Voss, Joachim G.; Dobra, Adrian; Morse, Caryn; Kovacs, Joseph A.; Danner, Robert L.; Munson, Peter J.; Logan, Carolea; Rangel, Zoila; Adelsberger, Joseph W.; McLaughlin, Mary; Adams, Larry D.; Raju, Raghavan; Dalakas, Marinos C.
2016-01-01
Purpose Human immunodeficiency virus (HIV)–related fatigue (HRF) is multicausal and potentially related to mitochondrial dysfunction caused by antiretroviral therapy with nucleoside reverse transcriptase inhibitors (NRTIs). Methodology The authors compared gene expression profiles of CD14+ cells of low versus high fatigued, NRTI-treated HIV patients to healthy controls (n = 5/group). The authors identified 32 genes predictive of low versus high fatigue and 33 genes predictive of healthy versus HIV infection. The authors constructed genetic networks to further elucidate the possible biological pathways in which these genes are involved. Relevance for nursing practice Genes including the actin cytoskeletal regulatory proteins Prokineticin 2 and Cofilin 2 along with mitochondrial inner membrane proteins are involved in multiple pathways and were predictors of fatigue status. Previously identified inflammatory and signaling genes were predictive of HIV status, clearly confirming our results and suggesting a possible further connection between mitochondrial function and HIV. Isolated CD14+ cells are easily accessible cells that could be used for further study of the connection between fatigue and mitochondrial function of HIV patients. Implication for Practice The findings from this pilot study take us one step closer to identifying biomarker targets for fatigue status and mitochondrial dysfunction. Specific biomarkers will be pertinent to the development of methodologies to diagnose, monitor, and treat fatigue and mitochondrial dysfunction. PMID:23324479
WholePathwayScope: a comprehensive pathway-based analysis tool for high-throughput data
Yi, Ming; Horton, Jay D; Cohen, Jonathan C; Hobbs, Helen H; Stephens, Robert M
2006-01-01
Background Analysis of High Throughput (HTP) Data such as microarray and proteomics data has provided a powerful methodology to study patterns of gene regulation at genome scale. A major unresolved problem in the post-genomic era is to assemble the large amounts of data generated into a meaningful biological context. We have developed a comprehensive software tool, WholePathwayScope (WPS), for deriving biological insights from analysis of HTP data. Result WPS extracts gene lists with shared biological themes through color cue templates. WPS statistically evaluates global functional category enrichment of gene lists and pathway-level pattern enrichment of data. WPS incorporates well-known biological pathways from KEGG (Kyoto Encyclopedia of Genes and Genomes) and Biocarta, GO (Gene Ontology) terms as well as user-defined pathways or relevant gene clusters or groups, and explores gene-term relationships within the derived gene-term association networks (GTANs). WPS simultaneously compares multiple datasets within biological contexts either as pathways or as association networks. WPS also integrates Genetic Association Database and Partial MedGene Database for disease-association information. We have used this program to analyze and compare microarray and proteomics datasets derived from a variety of biological systems. Application examples demonstrated the capacity of WPS to significantly facilitate the analysis of HTP data for integrative discovery. Conclusion This tool represents a pathway-based platform for discovery integration to maximize analysis power. The tool is freely available at . PMID:16423281
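Functional category enrichment of a gene list is commonly scored with a one-sided hypergeometric test. The abstract does not say which statistic WPS uses, so the sketch below is only an illustration of that common choice; the function name and the example counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_category, n_list, n_overlap):
    """P(X >= n_overlap): chance of seeing at least n_overlap category genes
    when n_list genes are drawn without replacement from n_universe genes,
    of which n_category belong to the category."""
    upper = min(n_category, n_list)
    total = comb(n_universe, n_list)
    tail = sum(
        comb(n_category, k) * comb(n_universe - n_category, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20000 genes measured, 150 in a pathway,
# 300 genes in the list of interest, 12 of them in the pathway.
p_value = hypergeom_enrichment_p(20000, 150, 300, 12)
print(f"enrichment p-value: {p_value:.3g}")
```

When many categories are tested at once, the resulting p-values would normally need a multiple-testing correction before interpretation.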
2011-01-01
Background To make sense out of gene expression profiles, such analyses must be pushed beyond the mere listing of affected genes. For example, if a group of genes persistently display similar changes in expression levels under particular experimental conditions, and the proteins encoded by these genes interact and function in the same cellular compartments, this could be taken as very strong indicators for co-regulated protein complexes. One of the key requirements is having appropriate tools to detect such regulatory patterns. Results We have analyzed the global adaptations in gene expression patterns in the budding yeast when the Hsp90 molecular chaperone complex is perturbed either pharmacologically or genetically. We integrated these results with publicly accessible expression, protein-protein interaction and intracellular localization data. But most importantly, all experimental conditions were simultaneously and dynamically visualized with an animation. This critically facilitated the detection of patterns of gene expression changes that suggested underlying regulatory networks that a standard analysis by pairwise comparison and clustering could not have revealed. Conclusions The results of the animation-assisted detection of changes in gene regulatory patterns make predictions about the potential roles of Hsp90 and its co-chaperone p23 in regulating whole sets of genes. The simultaneous dynamic visualization of microarray experiments, represented in networks built by integrating one's own experimental with publicly accessible data, represents a powerful discovery tool that allows the generation of new interpretations and hypotheses. PMID:21672238
Moschen, Sebastián; Higgins, Janet; Di Rienzo, Julio A; Heinz, Ruth A; Paniego, Norma; Fernandez, Paula
2016-06-06
In recent years, high throughput technologies have led to an increase of datasets from omics disciplines allowing the understanding of the complex regulatory networks associated with biological processes. Leaf senescence is a complex mechanism controlled by multiple genetic and environmental variables, which has a strong impact on crop yield. Transcription factors (TFs) are key proteins in the regulation of gene expression, regulating different signaling pathways; their function is crucial for triggering and/or regulating different aspects of the leaf senescence process. The study of TF interactions and their integration with metabolic profiles under different developmental conditions, especially for a non-model organism such as sunflower, will open new insights into the details of gene regulation of leaf senescence. Weighted Gene Correlation Network Analysis (WGCNA) and BioSignature Discoverer (BioSD, Gnosis Data Analysis, Heraklion, Greece) were used to integrate transcriptomic and metabolomic data. WGCNA allowed the detection of 10 metabolites and 13 TFs whereas BioSD allowed the detection of 1 metabolite and 6 TFs as potential biomarkers. The comparative analysis demonstrated that three transcription factors were detected through both methodologies, highlighting them as potentially robust biomarkers associated with leaf senescence in sunflower. The complementary use of network and BioSignature Discoverer analysis of transcriptomic and metabolomic data provided a useful tool for identifying candidate genes and metabolites which may have a role during the triggering and development of the leaf senescence process. The WGCNA tool allowed us to design and test a hypothetical network in order to infer relationships across selected transcription factor and metabolite candidate biomarkers involved in leaf senescence, whereas BioSignature Discoverer selected transcripts and metabolites which discriminate between different ages of sunflower plants. 
The methodology presented here would help to elucidate and predict novel networks and potential biomarkers of leaf senescence in sunflower.
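The core step of a weighted correlation network, as used in WGCNA, is to raise the absolute pairwise correlation to a soft-threshold power so that weak correlations are pushed towards zero. The sketch below shows that step only, in plain Python rather than the WGCNA R package; the node names, profiles, and the power of 6 are illustrative assumptions, not values from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long numeric profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    names = list(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for g in names for h in names if g != h
    }

def connectivity(adjacency, node):
    """Sum of edge weights attached to one node."""
    return sum(w for (g, _), w in adjacency.items() if g == node)

# Hypothetical profiles across five sampling times (transcripts and a metabolite).
profiles = {
    "TF_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "metabolite_A": [2.0, 4.0, 6.0, 8.0, 10.5],
    "TF_2": [5.0, 1.0, 4.0, 2.0, 3.0],
}
adjacency = soft_threshold_adjacency(profiles)
for name in profiles:
    print(name, round(connectivity(adjacency, name), 4))
```

Highly connected nodes in such a network are the usual starting point when looking for candidate biomarkers, although module detection in WGCNA involves further steps not shown here.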
Shirdel, Elize A.; Xie, Wing; Mak, Tak W.; Jurisica, Igor
2011-01-01
Background MicroRNAs are a class of small RNAs known to regulate gene expression at the transcript level, the protein level, or both. Since microRNA binding is sequence-based but possibly structure-specific, work in this area has resulted in multiple databases storing predicted microRNA:target relationships computed using diverse algorithms. We integrate prediction databases, compare predictions to in vitro data, and use cross-database predictions to model the microRNA:transcript interactome – referred to as the micronome – to study microRNA involvement in well-known signalling pathways as well as associations with disease. We make this data freely available with a flexible user interface as our microRNA Data Integration Portal — mirDIP (http://ophid.utoronto.ca/mirDIP). Results mirDIP integrates prediction databases to elucidate accurate microRNA:target relationships. Using NAViGaTOR to produce interaction networks implicating microRNAs in literature-based, KEGG-based and Reactome-based pathways, we find these signalling pathway networks have significantly more microRNA involvement compared to chance (p<0.05), suggesting microRNAs co-target many genes in a given pathway. Further examination of the micronome shows two distinct classes of microRNAs; universe microRNAs, which are involved in many signalling pathways; and intra-pathway microRNAs, which target multiple genes within one signalling pathway. We find universe microRNAs to have more targets (p<0.0001), to be more studied (p<0.0002), and to have higher degree in the KEGG cancer pathway (p<0.0001), compared to intra-pathway microRNAs. Conclusions Our pathway-based analysis of mirDIP data suggests microRNAs are involved in intra-pathway signalling. We identify two distinct classes of microRNAs, suggesting a hierarchical organization of microRNAs co-targeting genes both within and between pathways, and implying differential involvement of universe and intra-pathway microRNAs at the disease level. PMID:21364759
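One simple way to picture "cross-database predictions" is to keep a microRNA:target pair only when several prediction sources agree on it. The sketch below assumes that reading; the database names, microRNA labels, and the agreement threshold are all hypothetical, and mirDIP's actual integration procedure is not described in the abstract.

```python
from collections import defaultdict

def cross_database_support(predictions):
    """Map each (microRNA, target) pair to the set of databases predicting it."""
    support = defaultdict(set)
    for database, pairs in predictions.items():
        for pair in pairs:
            support[pair].add(database)
    return support

def consensus_pairs(predictions, min_databases=2):
    """Pairs predicted by at least min_databases independent sources."""
    support = cross_database_support(predictions)
    return sorted(pair for pair, dbs in support.items() if len(dbs) >= min_databases)

# Hypothetical prediction sets from three sources.
predictions = {
    "db_A": {("miR-x", "GENE1"), ("miR-x", "GENE2"), ("miR-y", "GENE3")},
    "db_B": {("miR-x", "GENE1"), ("miR-y", "GENE3"), ("miR-y", "GENE4")},
    "db_C": {("miR-x", "GENE1"), ("miR-z", "GENE2")},
}
print(consensus_pairs(predictions, min_databases=2))
```

Raising `min_databases` trades recall for precision, which is the kind of choice a flexible query interface would leave to the user.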
Gene panel testing for inherited cancer risk.
Hall, Michael J; Forman, Andrea D; Pilarski, Robert; Wiesner, Georgia; Giri, Veda N
2014-09-01
Next-generation sequencing technologies have ushered in the capability to assess multiple genes in parallel for genetic alterations that may contribute to inherited risk for cancers in families. Thus, gene panel testing is now an option in the setting of genetic counseling and testing for cancer risk. This article describes the many gene panel testing options clinically available to assess inherited cancer susceptibility, the potential advantages and challenges associated with various types of panels, clinical scenarios in which gene panels may be particularly useful in cancer risk assessment, and testing and counseling considerations. Given the potential issues for patients and their families, gene panel testing for inherited cancer risk is recommended to be offered in conjunction or consultation with an experienced cancer genetic specialist, such as a certified genetic counselor or geneticist, as an integral part of the testing process. Copyright © 2014 by the National Comprehensive Cancer Network.
EnRICH: Extraction and Ranking using Integration and Criteria Heuristics.
Zhang, Xia; Greenlee, M Heather West; Serb, Jeanne M
2013-01-15
High throughput screening technologies enable biologists to generate candidate genes at a rate that, due to time and cost constraints, cannot be studied by experimental approaches in the laboratory. Thus, it has become increasingly important to prioritize candidate genes for experiments. To accomplish this, researchers need to apply selection requirements based on their knowledge, which necessitates qualitative integration of heterogeneous data sources and filtration using multiple criteria. A similar approach can also be applied to putative candidate gene relationships. While automation can assist in this routine and imperative procedure, flexibility of data sources and criteria must not be sacrificed. A tool that can optimize the trade-off between automation and flexibility to simultaneously filter and qualitatively integrate data is needed to prioritize candidate genes and generate composite networks from heterogeneous data sources. We developed the Java application, EnRICH (Extraction and Ranking using Integration and Criteria Heuristics), in order to alleviate this need. Here we present a case study in which we used EnRICH to integrate and filter multiple candidate gene lists in order to identify potential retinal disease genes. As a result of this procedure, a candidate pool of several hundred genes was narrowed down to five candidate genes, of which four are confirmed retinal disease genes and one is associated with a retinal disease state. We developed a platform-independent tool that is able to qualitatively integrate multiple heterogeneous datasets and use different selection criteria to filter each of them, provided the datasets are tables that have distinct identifiers (required) and attributes (optional). With the flexibility to specify data sources and filtering criteria, EnRICH automatically prioritizes candidate genes or gene relationships for biologists based on their specific requirements. 
Here, we also demonstrate that this tool can be effectively and easily used to apply highly specific user-defined criteria and can efficiently identify high quality candidate genes from relatively sparse datasets.
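The filter-then-integrate idea described above can be sketched as follows. This is not EnRICH itself (which is a Java application whose interface the abstract does not describe); it is a minimal Python illustration in which each dataset is a table keyed by identifier, each dataset may have its own user-defined criterion, and genes are ranked by how many datasets they pass. All names, attributes, and thresholds are hypothetical.

```python
def prioritize(datasets, criteria, min_support=2):
    """Filter each table with its own criterion, then rank identifiers by
    the number of datasets in which they survive filtering."""
    support = {}
    for name, table in datasets.items():
        # A dataset without a criterion contributes all of its identifiers.
        keep = criteria.get(name, lambda attrs: True)
        for gene_id, attrs in table.items():
            if keep(attrs):
                support.setdefault(gene_id, []).append(name)
    ranked = [(g, s) for g, s in support.items() if len(s) >= min_support]
    # Most-supported first; ties broken alphabetically for a stable order.
    ranked.sort(key=lambda item: (-len(item[1]), item[0]))
    return ranked

# Hypothetical tables: identifiers are required, attributes are optional.
datasets = {
    "expression": {"geneA": {"fold_change": 3.1},
                   "geneB": {"fold_change": 1.2},
                   "geneC": {"fold_change": 2.4}},
    "linkage": {"geneA": {"lod": 3.5}, "geneC": {"lod": 1.0}, "geneD": {"lod": 4.2}},
    "literature": {"geneA": {}, "geneD": {}},
}
criteria = {
    "expression": lambda attrs: attrs["fold_change"] >= 2.0,
    "linkage": lambda attrs: attrs["lod"] >= 3.0,
}
for gene_id, sources in prioritize(datasets, criteria):
    print(gene_id, sources)
```

Keeping the criteria as plain functions is one way to preserve the flexibility the abstract emphasizes, since each data source can be filtered on whatever attributes it happens to carry.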
Multidimensional adaptive evolution of a feed-forward network and the illusion of compensation
Bullaughey, Kevin
2016-01-01
When multiple substitutions affect a trait in opposing ways, they are often assumed to be compensatory, not only with respect to the trait, but also with respect to fitness. This type of compensatory evolution has been suggested to underlie the evolution of protein structures and interactions, RNA secondary structures, and gene regulatory modules and networks. The possibility for compensatory evolution results from epistasis. Yet if epistasis is widespread, then it is also possible that the opposing substitutions are individually adaptive. I term this possibility an adaptive reversal. Although possible for arbitrary phenotype-fitness mappings, it has not yet been investigated whether such epistasis is prevalent in a biologically-realistic setting. I investigate a particular regulatory circuit, the type I coherent feed-forward loop, which is ubiquitous in natural systems and is accurately described by a simple mathematical model. I show that such reversals are common during adaptive evolution, can result solely from the topology of the fitness landscape, and can occur even when adaptation follows a modest environmental change and the network was well adapted to the original environment. The possibility of adaptive reversals warrants a systems perspective when interpreting substitution patterns in gene regulatory networks. PMID:23289561
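For readers unfamiliar with the circuit, a type I coherent feed-forward loop has X activating Y and both X and Y activating Z. The sketch below uses a widely taught logic-approximation form with an AND gate at Z and explicit Euler integration; the abstract does not give the author's equations, so the functional form, parameter names, and values here are assumptions for illustration only.

```python
def simulate_c1_ffl(x_signal, dt=0.01, beta_y=1.0, beta_z=1.0,
                    alpha_y=1.0, alpha_z=1.0, k_xy=0.5, k_xz=0.5, k_yz=0.5):
    """Euler simulation of an AND-gated type I coherent feed-forward loop.

    dY/dt = beta_y * [X > k_xy]               - alpha_y * Y
    dZ/dt = beta_z * [X > k_xz and Y > k_yz]  - alpha_z * Z
    Returns a list of (Y, Z) values, one per input sample.
    """
    y = z = 0.0
    trace = []
    for x in x_signal:
        y_on = 1.0 if x > k_xy else 0.0
        z_on = 1.0 if (x > k_xz and y > k_yz) else 0.0
        y += dt * (beta_y * y_on - alpha_y * y)
        z += dt * (beta_z * z_on - alpha_z * z)
        trace.append((y, z))
    return trace

# X is switched on for 5 time units, then off for 5 time units.
x_signal = [1.0] * 500 + [0.0] * 500
trace = simulate_c1_ffl(x_signal)

onset = next(i for i, (_, z) in enumerate(trace) if z > 0.0)
print(f"Z starts rising at t = {onset * 0.01:.2f} (delayed after X turns on)")
print(f"Z at end of ON phase: {trace[499][1]:.3f}")
```

In this form Z responds to the ON step only after Y has accumulated past its threshold, but starts to fall as soon as X is removed, which is the asymmetric behavior usually attributed to this circuit. Changing any one threshold or rate shifts that delay, which gives a concrete sense of how several parameters jointly shape one trait.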
Essential protein discovery based on a combination of modularity and conservatism.
Zhao, Bihai; Wang, Jianxin; Li, Xueyong; Wu, Fang-Xiang
2016-11-01
Essential proteins are indispensable for the survival of a living organism and play important roles in the emerging field of synthetic biology. Many computational methods have been proposed to identify essential proteins by using the topological features of interactome networks. However, most of these methods ignore the intrinsic biological meaning of proteins. Research shows that essentiality is tied not only to the protein or gene itself, but also to the molecular modules to which that protein belongs. The results of this study reveal the modularity of essential proteins. On the other hand, essential proteins are more evolutionarily conserved than nonessential proteins and frequently bind each other. That is to say, conservatism is another important feature of essential proteins. Multiple networks are constructed by integrating protein-protein interaction (PPI) networks, time course gene expression data and protein domain information. Based on these networks, a new essential protein identification method is proposed that combines the modularity and conservatism of proteins. Experimental results show that the proposed method outperforms other essential protein identification methods in terms of the number of essential proteins among the top-ranked candidates. Copyright © 2016. Published by Elsevier Inc.
An Evolutionarily Conserved Innate Immunity Protein Interaction Network
De Arras, Lesly; Seng, Amara; Lackford, Brad; Keikhaee, Mohammad R.; Bowerman, Bruce; Freedman, Jonathan H.; Schwartz, David A.; Alper, Scott
2013-01-01
The innate immune response plays a critical role in fighting infection; however, innate immunity also can affect the pathogenesis of a variety of diseases, including sepsis, asthma, cancer, and atherosclerosis. To identify novel regulators of innate immunity, we performed comparative genomics RNA interference screens in the nematode Caenorhabditis elegans and mouse macrophages. These screens have uncovered many candidate regulators of the response to lipopolysaccharide (LPS), several of which interact physically in multiple species to form an innate immunity protein interaction network. This protein interaction network contains several proteins in the canonical LPS-responsive TLR4 pathway as well as many novel interacting proteins. Using RNAi and overexpression studies, we show that almost every gene in this network can modulate the innate immune response in mouse cell lines. We validate the importance of this network in innate immunity regulation in vivo using available mutants in C. elegans and mice. PMID:23209288
Construction and analysis of gene-gene dynamics influence networks based on a Boolean model.
Mazaya, Maulida; Trinh, Hung-Cuong; Kwon, Yung-Keun
2017-12-21
Identification of novel gene-gene relations is a crucial issue to understand system-level biological phenomena. To this end, many methods based on a correlation analysis of gene expressions or structural analysis of molecular interaction networks have been proposed. They have a limitation in identifying more complicated gene-gene dynamical relations, though. To overcome this limitation, we proposed a measure to quantify a gene-gene dynamical influence (GDI) using a Boolean network model and constructed a GDI network to indicate existence of a dynamical influence for every ordered pair of genes. It represents how much a state trajectory of a target gene is changed by a knockout mutation subject to a source gene in a gene-gene molecular interaction (GMI) network. Through a topological comparison between GDI and GMI networks, we observed that the former network is denser than the latter network, which implies that there exist many gene pairs of dynamically influencing but molecularly non-interacting relations. In addition, a larger number of hub genes were generated in the GDI network. On the other hand, there was a correlation between these networks such that the degree value of a node was positively correlated to each other. We further investigated the relationships of the GDI value with structural properties and found that there are negative and positive correlations with the length of a shortest path and the number of paths, respectively. In addition, a GDI network could predict a set of genes whose steady-state expression is affected in E. coli gene-knockout experiments. More interestingly, we found that the drug-targets with side-effects have a larger number of outgoing links than the other genes in the GDI network, which implies that they are more likely to influence the dynamics of other genes. 
Finally, we found biological evidence showing that gene pairs which are not molecularly interacting but are dynamically influential can be considered as novel gene-gene relationships. Taken together, construction and analysis of the GDI network can be a useful approach to identify novel gene-gene relationships in terms of dynamical influence.
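The knockout-based influence idea described in this abstract can be illustrated with a small sketch. It is not the authors' GDI measure: the three-gene network, the synchronous update scheme, the names (RULES, trajectory, influence) and the scoring rule (fraction of time points at which the target's state differs between wild type and knockout) are all assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical 3-gene network: A activates B, B activates C, C represses A.
RULES = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}
GENES = sorted(RULES)


def step(state, knocked_out=None):
    """Synchronous update; a knocked-out gene is pinned to False."""
    nxt = {g: bool(RULES[g](state)) for g in GENES}
    if knocked_out is not None:
        nxt[knocked_out] = False
    return nxt


def trajectory(initial, steps, knocked_out=None):
    """List of states visited from `initial`, including the start state."""
    state = dict(initial)
    if knocked_out is not None:
        state[knocked_out] = False
    states = [state]
    for _ in range(steps):
        state = step(state, knocked_out)
        states.append(state)
    return states


def influence(source, target, steps=8):
    """Fraction of (initial state, time) pairs where knocking out `source`
    changes the state of `target` (an illustrative score only)."""
    differing = total = 0
    for bits in product([False, True], repeat=len(GENES)):
        initial = dict(zip(GENES, bits))
        wild = trajectory(initial, steps)
        mutant = trajectory(initial, steps, knocked_out=source)
        for w, m in zip(wild, mutant):
            differing += w[target] != m[target]
            total += 1
    return differing / total


if __name__ == "__main__":
    for src in GENES:
        for tgt in GENES:
            print(src, "->", tgt, round(influence(src, tgt), 3))
```

In this toy ring, knocking out A silences B after one step and C after two, so A has a nonzero score on C even though A does not regulate C directly, which is the kind of "dynamically influencing but molecularly non-interacting" pair the abstract refers to.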
Ghosh, Sujoy; Vivar, Juan; Nelson, Christopher P; Willenborg, Christina; Segrè, Ayellet V; Mäkinen, Ville-Petteri; Nikpay, Majid; Erdmann, Jeannette; Blankenberg, Stefan; O'Donnell, Christopher; März, Winfried; Laaksonen, Reijo; Stewart, Alexandre F R; Epstein, Stephen E; Shah, Svati H; Granger, Christopher B; Hazen, Stanley L; Kathiresan, Sekar; Reilly, Muredach P; Yang, Xia; Quertermous, Thomas; Samani, Nilesh J; Schunkert, Heribert; Assimes, Themistocles L; McPherson, Ruth
2015-07-01
Genome-wide association studies have identified multiple genetic variants affecting the risk of coronary artery disease (CAD). However, individually these explain only a small fraction of the heritability of CAD and for most, the causal biological mechanisms remain unclear. We sought to obtain further insights into potential causal processes of CAD by integrating large-scale GWA data with expertly curated databases of core human pathways and functional networks. Using pathways (gene sets) from Reactome, we carried out a 2-stage gene set enrichment analysis strategy. From a meta-analyzed discovery cohort of 7 CAD genome-wide association study data sets (9889 cases/11 089 controls), nominally significant gene sets were tested for replication in a meta-analysis of 9 additional studies (15 502 cases/55 730 controls) from the Coronary ARtery DIsease Genome wide Replication and Meta-analysis (CARDIoGRAM) Consortium. A total of 32 of 639 Reactome pathways tested showed convincing association with CAD (replication P<0.05). These pathways resided in 9 of 21 core biological processes represented in Reactome, and included pathways relevant to extracellular matrix (ECM) integrity, innate immunity, axon guidance, and signaling by PDGF (platelet-derived growth factor), NOTCH, and the transforming growth factor-β/SMAD receptor complex. Many of these pathways had strengths of association comparable to those observed in lipid transport pathways. Network analysis of unique genes within the replicated pathways further revealed several interconnected functional and topologically interacting modules representing novel associations (eg, semaphorin-regulated axonal guidance pathway) besides confirming known processes (lipid metabolism). The connectivity in the observed networks was statistically significant compared with random networks (P<0.001). 
Network centrality analysis (degree and betweenness) further identified genes (eg, NCAM1, FYN, FURIN, etc) likely to play critical roles in the maintenance and functioning of several of the replicated pathways. These findings provide novel insights into how genetic variation, interpreted in the context of biological processes and functional interactions among genes, may help define the genetic architecture of CAD. © 2015 American Heart Association, Inc.
Petrovskaya, Olga V; Petrovskiy, Evgeny D; Lavrik, Inna N; Ivanisenko, Vladimir A
2017-04-01
Gene network modeling is one of the widely used approaches in systems biology. It allows for the study of the function of complex genetic systems, including so-called mosaic gene networks, which consist of functionally interacting subnetworks. We conducted a study of a method for modeling mosaic gene networks based on integration of models of gene subnetworks by linear control functionals. An automatic modeling of 10,000 synthetic mosaic gene regulatory networks was carried out using computer experiments on gene knockdowns/knockouts. Structural analysis of the graphs of the generated mosaic gene regulatory networks revealed that the most important factor for building accurate integrated mathematical models, among those analyzed in the study, is data on the expression of genes corresponding to vertices with high centrality.
Chemical-Gene Interactions from ToxCast Bioactivity Data ...
Characterizing the effects of chemicals in biological systems is often summarized by chemical-gene interactions, which have sparse coverage in the literature. The ToxCast chemical screening program has produced bioactivity data for nearly 2000 chemicals and over 450 gene targets. To evaluate the information gained from the ToxCast project, a ToxCast bioactivity network was created comprising ToxCast chemical-gene interactions based on assay data and compared to a chemical-gene association network from literature. The literature network was compiled from PubMed articles, excluding ToxCast publications, mapped to genes and chemicals. Genes were identified by curated associations available from NCBI while chemicals were identified by PubChem submissions. The frequencies of chemical-gene associations from the literature network were log-scaled and then compared to the ToxCast bioactivity network. In total, 140 times more chemical-gene associations were present in the ToxCast network in comparison to the literature-derived network highlighting the vast increase in chemical-gene interactions putatively elucidated by the ToxCast research program. There were 165 associations found in the literature network that were reproduced by ToxCast bioactivity data, and 336 associations in the literature network were not reproduced by the ToxCast bioactivity network. The literature network relies on the assumption that chemical-gene associations represent a true chemical-gene inte
A Systems Level, Functional Genomics Analysis of Chronic Epilepsy
Bragin, Anatol; Kudo, Lili C.; Gehman, Lauren; Ruidera, Josephine; Geschwind, Daniel H.; Engel, Jerome
2011-01-01
Neither the molecular basis of the pathologic tendency of neuronal circuits to generate spontaneous seizures (epileptogenicity) nor anti-epileptogenic mechanisms that maintain a seizure-free state are well understood. Here, we performed transcriptomic analysis in the intrahippocampal kainate model of temporal lobe epilepsy in rats using both Agilent and Codelink microarray platforms to characterize the epileptic processes. The experimental design allowed subtraction of the confounding effects of the lesion, identification of expression changes associated with epileptogenicity, and genes upregulated by seizures with potential homeostatic anti-epileptogenic effects. Using differential expression analysis, we identified several hundred expression changes in chronic epilepsy, including candidate genes associated with epileptogenicity such as Bdnf and Kcnj13. To analyze these data from a systems perspective, we applied weighted gene co-expression network analysis (WGCNA) to identify groups of co-expressed genes (modules) and their central (hub) genes. One such module contained genes upregulated in the epileptogenic region, including multiple epileptogenicity candidate genes, and was found to be involved in the protection of glial cells against oxidative stress, implicating glial oxidative stress in epileptogenicity. Another distinct module corresponded to the effects of chronic seizures and represented changes in neuronal synaptic vesicle trafficking. We found that the network structure and connectivity of one hub gene, Sv2a, showed significant changes between normal and epileptogenic tissue, becoming more highly connected in epileptic brain. Since Sv2a is a target of the antiepileptic levetiracetam, this module may be important in controlling seizure activity. 
Bioinformatic analysis of this module also revealed a potential mechanism for the observed transcriptional changes via generation of longer alternatively polyadenylated transcripts through the upregulation of the RNA binding protein HuD. In summary, combining conventional statistical methods and network analysis allowed us to interpret the differentially regulated genes from a systems perspective, yielding new insight into several biological pathways underlying homeostatic anti-epileptogenic effects and epileptogenicity. PMID:21695113
Roy, Sarah H; Tobin, David V; Memar, Nadin; Beltz, Eleanor; Holmen, Jenna; Clayton, Joseph E; Chiu, Daniel J; Young, Laura D; Green, Travis H; Lubin, Isabella; Liu, Yuying; Conradt, Barbara; Saito, R Mako
2014-02-28
The development and homeostasis of multicellular animals requires precise coordination of cell division and differentiation. We performed a genome-wide RNA interference screen in Caenorhabditis elegans to reveal the components of a regulatory network that promotes developmentally programmed cell-cycle quiescence. The 107 identified genes are predicted to constitute regulatory networks that are conserved among higher animals because almost half of the genes are represented by clear human orthologs. Using a series of mutant backgrounds to assess their genetic activities, the RNA interference clones displaying similar properties were clustered to establish potential regulatory relationships within the network. This approach uncovered four distinct genetic pathways controlling cell-cycle entry during intestinal organogenesis. The enhanced phenotypes observed for animals carrying compound mutations attest to the collaboration between distinct mechanisms to ensure strict developmental regulation of cell cycles. Moreover, we characterized ubc-25, a gene encoding an E2 ubiquitin-conjugating enzyme whose human ortholog, UBE2Q2, is deregulated in several cancers. Our genetic analyses suggested that ubc-25 acts in a linear pathway with cul-1/Cul1, in parallel to pathways employing cki-1/p27 and lin-35/pRb to promote cell-cycle quiescence. Further investigation of the potential regulatory mechanism demonstrated that ubc-25 activity negatively regulates CYE-1/cyclin E protein abundance in vivo. Together, our results show that the ubc-25-mediated pathway acts within a complex network that integrates the actions of multiple molecular mechanisms to control cell cycles during development. Copyright © 2014 Roy et al.
Zhou, Xionghui; Liu, Juan
2014-01-01
Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying a phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, whereas previous studies either neglected it or only used it to select the differentially expressed genes as the inputs for constructing the gene regulatory network. Specifically, we integrate phenotype information with gene expression data to identify gene dependency pairs by using conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed a gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. Its validity is also supported by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build a classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms classification models built with published signatures. 
In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for phenotypic change.
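As a rough illustration of the conditional mutual information used above to define gene dependency pairs, the following sketch computes a plug-in estimate of I(A; P | B) from discretized samples. The function name, the binary discretization and the XOR toy data are assumptions for illustration; the abstract does not specify the estimator or thresholds the authors used.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) in bits from discrete samples."""
    n = len(xs)
    c_xyz = Counter(zip(xs, ys, zs))
    c_xz = Counter(zip(xs, zs))
    c_yz = Counter(zip(ys, zs))
    c_z = Counter(zs)
    total = 0.0
    for (x, y, z), n_xyz in c_xyz.items():
        # p(x,y,z) * log2( p(x,y|z) / (p(x|z) * p(y|z)) ), written with counts
        ratio = n_xyz * c_z[z] / (c_xz[(x, z)] * c_yz[(y, z)])
        total += (n_xyz / n) * log2(ratio)
    return total


if __name__ == "__main__":
    # Toy data: the phenotype is the XOR of two binarized genes, so gene A
    # carries no information about the phenotype alone, only jointly with B.
    gene_a = [0, 0, 1, 1]
    gene_b = [0, 1, 0, 1]
    phenotype = [a ^ b for a, b in zip(gene_a, gene_b)]
    constant = [0] * len(gene_a)  # conditioning on a constant gives plain MI
    print(conditional_mutual_information(gene_a, phenotype, constant))
    print(conditional_mutual_information(gene_a, phenotype, gene_b))
```

In the toy data, gene A alone tells nothing about the phenotype, but once gene B is known it determines the phenotype completely, so the influence of A on the phenotype depends on B in the sense of a dependency pair (A,B).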
Regenbogen, Sam; Wilkins, Angela D.; Lichtarge, Olivier
2015-01-01
Biomedicine produces copious information it cannot fully exploit. Specifically, there is considerable need to integrate knowledge from disparate studies to discover connections across domains. Here, we used a Collaborative Filtering approach, inspired by online recommendation algorithms, in which non-negative matrix factorization (NMF) predicts interactions among chemicals, genes, and diseases only from pairwise information about their interactions. Our approach, applied to matrices derived from the Comparative Toxicogenomics Database, successfully recovered Chemical-Disease, Chemical-Gene, and Disease-Gene networks in 10-fold cross-validation experiments. Additionally, we could predict each of these interaction matrices from the other two. Integrating all three CTD interaction matrices with NMF led to good predictions of STRING, an independent, external network of protein-protein interactions. Finally, this approach could integrate the CTD and STRING interaction data to improve Chemical-Gene cross-validation performance significantly, and, in a time-stamped study, it predicted information added to CTD after a given date, using only data prior to that date. We conclude that collaborative filtering can integrate information across multiple types of biological entities, and that as a first step towards precision medicine it can compute drug repurposing hypotheses. PMID:26776170
Multiple interactions amongst floral homeotic MADS box proteins.
Davies, B; Egea-Cortines, M; de Andrade Silva, E; Saedler, H; Sommer, H
1996-01-01
Most known floral homeotic genes belong to the MADS box family and their products act in combination to specify floral organ identity by an unknown mechanism. We have used a yeast two-hybrid system to investigate the network of interactions between the Antirrhinum organ identity gene products. Selective heterodimerization is observed between MADS box factors. Exclusive interactions are detected between two factors, DEFICIENS (DEF) and GLOBOSA (GLO), previously known to heterodimerize and control development of petals and stamens. In contrast, a third factor, PLENA (PLE), which is required for reproductive organ development, can interact with the products of MADS box genes expressed at early, intermediate and late stages. We also demonstrate that heterodimerization of DEF and GLO requires the K box, a domain not found in non-plant MADS box factors, indicating that the plant MADS box factors may have different criteria for interaction. The association of PLENA and the temporally intermediate MADS box factors suggests that part of their function in mediating between the meristem and organ identity genes is accomplished through direct interaction. These data reveal an unexpectedly complex network of interactions between the factors controlling flower development and have implications for the determination of organ identity. PMID:8861961
Xiong, Kun; Long, Lingling; Zhang, Xudong; Qu, Hongke; Deng, Haixiao; Ding, Yanjun; Cai, Jifeng; Wang, Shuchao; Wang, Mi; Liao, Lvshuang; Huang, Jufang; Yi, Chun-Xia; Yan, Jie
2017-10-01
Long non-coding RNAs (lncRNAs) display multiple functions including regulation of neuronal injury. However, their impact in methamphetamine (METH)-induced neurotoxicity has rarely been reported. Here, using microarray analysis, we investigated the expression profiling of lncRNAs and mRNAs in primary cultured prefrontal cortical neurons after METH treatment. We observed a difference in lncRNA and mRNA expression between the experimental and sham control groups. Using bioinformatics, we analyzed the highest enriched gene ontology (GO) terms of biological process, cellular component, and molecular function, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and pathway network analysis. Furthermore, an lncRNA-mRNA co-expression sub-network for aberrantly expressed terms revealed possible interactions of lncRNA NR_110713 and NR_027943 with their related genes. Afterwards, three lncRNAs (NR_110713, NR_027943, GAS5) and two mRNAs (Ddit3, Casp12) were targeted to validate the microarray data by qRT-PCR. This presented an overview of lncRNA and mRNA expression profiling and indicated that lncRNA might participate in METH-induced neuronal apoptosis by regulating the coding genes of neurons. Copyright © 2017 Elsevier Ltd. All rights reserved.
Ohyanagi, Hajime; Takano, Tomoyuki; Terashima, Shin; Kobayashi, Masaaki; Kanno, Maasa; Morimoto, Kyoko; Kanegae, Hiromi; Sasaki, Yohei; Saito, Misa; Asano, Satomi; Ozaki, Soichi; Kudo, Toru; Yokoyama, Koji; Aya, Koichiro; Suwabe, Keita; Suzuki, Go; Aoki, Koh; Kubo, Yasutaka; Watanabe, Masao; Matsuoka, Makoto; Yano, Kentaro
2015-01-01
Comprehensive integration of large-scale omics resources such as genomes, transcriptomes and metabolomes will provide deeper insights into broader aspects of molecular biology. For better understanding of plant biology, we aim to construct a next-generation sequencing (NGS)-derived gene expression network (GEN) repository for a broad range of plant species. So far we have incorporated information about 745 high-quality mRNA sequencing (mRNA-Seq) samples from eight plant species (Arabidopsis thaliana, Oryza sativa, Solanum lycopersicum, Sorghum bicolor, Vitis vinifera, Solanum tuberosum, Medicago truncatula and Glycine max) from the public short read archive, digitally profiled the entire set of gene expression profiles, and drawn GENs by using correspondence analysis (CA) to take advantage of gene expression similarities. In order to understand the evolutionary significance of the GENs from multiple species, they were linked according to the orthology of each node (gene) among species. In addition to other gene expression information, functional annotation of the genes will facilitate biological comprehension. Currently we are improving the given gene annotations with natural language processing (NLP) techniques and manual curation. Here we introduce the current status of our analyses and the web database, PODC (Plant Omics Data Center; http://bioinf.mind.meiji.ac.jp/podc/), now open to the public, providing GENs, functional annotations and additional comprehensive omics resources. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Bălăcescu, Loredana; Bălăcescu, O; Crişan, N; Fetica, B; Petruţ, B; Bungărdean, Cătălina; Rus, Meda; Tudoran, Oana; Meurice, G; Irimie, Al; Dragoş, N; Berindan-Neagoe, Ioana
2011-01-01
Prostate cancer is the leading cause of cancer among the Western male population, with clinical behavior ranging from indolent to metastatic disease. Although many molecules and deregulated pathways are known, the molecular mechanisms involved in the development of prostate cancer are not fully understood. The aim of this study was to explore the molecular variation underlying prostate cancer, based on microarray analysis and bioinformatics approaches. Normal and prostate cancer tissues were collected by macrodissection from prostatectomy pieces. All prostate cancer specimens used in our study were Gleason score 7. Gene expression microarray (Agilent Technologies) was used for Whole Human Genome evaluation. The bioinformatics and functional analysis were based on Limma and Ingenuity software. The microarray analysis identified 1119 differentially expressed genes between prostate cancer and normal prostate, which were up- or down-regulated at least 2-fold. P-values were adjusted for multiple testing using the Benjamini-Hochberg method with a false discovery rate of 0.01. These genes were analyzed with Ingenuity Pathway Analysis software, and 23 genetic networks were established. Our microarray results provide new information regarding the molecular networks in prostate cancer stratified as Gleason 7. These data highlight gene expression profiles for a better understanding of prostate cancer progression.
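The Benjamini-Hochberg adjustment mentioned in this abstract can be sketched as follows. This is a generic implementation of the standard step-up procedure, not the Limma code used in the study; the function name and the example p-values are made up for illustration, and only the 0.01 threshold comes from the abstract.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.001, 0.008, 0.039, 0.041, 0.2]  # hypothetical per-gene p-values
    adj = benjamini_hochberg(raw)
    for p, q in zip(raw, adj):
        print(f"p={p:.3f}  adjusted={q:.5f}  significant={q <= 0.01}")
```

A gene is then called differentially expressed when its adjusted p-value is at or below the chosen false discovery rate, usually in combination with a fold-change cutoff such as the 2-fold threshold mentioned above.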
Plasticity of genetic interactions in metabolic networks of yeast.
Harrison, Richard; Papp, Balázs; Pál, Csaba; Oliver, Stephen G; Delneri, Daniela
2007-02-13
Why are most genes dispensable? The impact of gene deletions may depend on the environment (plasticity), the presence of compensatory mechanisms (mutational robustness), or both. Here, we analyze the interaction between these two forces by exploring the condition-dependence of synthetic genetic interactions that define redundant functions and alternative pathways. We performed systems-level flux balance analysis of the yeast (Saccharomyces cerevisiae) metabolic network to identify genetic interactions and then tested the model's predictions with in vivo gene-deletion studies. We found that the majority of synthetic genetic interactions are restricted to certain environmental conditions, partly because of the lack of compensation under some (but not all) nutrient conditions. Moreover, the phylogenetic cooccurrence of synthetically interacting pairs is not significantly different from random expectation. These findings suggest that these gene pairs have at least partially independent functions, and, hence, compensation is only a byproduct of their evolutionary history. Experimental analyses that used multiple gene deletion strains not only confirmed predictions of the model but also showed that investigation of false predictions may both improve functional annotation within the model and also lead to the discovery of higher-order genetic interactions. Our work supports the view that functional redundancy may be more apparent than real, and it offers a unified framework for the evolution of environmental adaptation and mutational robustness.
Recent advances and versatility of MAGE towards industrial applications.
Singh, Vijai; Braddick, Darren
2015-12-01
The genome engineering toolkit has expanded significantly in recent years, allowing us to study the functions of genes in cellular networks and assist in over-production of proteins, drugs, chemicals and biofuels. Multiplex automated genome engineering (MAGE) has been recently developed and gained more scientific interest towards strain engineering. MAGE is a simple, rapid and efficient tool for manipulating genes simultaneously in multiple loci, assigning genetic codes and integrating non-natural amino acids. MAGE can be further expanded towards the engineering of fast, robust and over-producing strains for chemicals, drugs and biofuels at industrial scales.
The evolution of dorsal-ventral patterning mechanisms in insects.
Lynch, Jeremy A; Roth, Siegfried
2011-01-15
The gene regulatory network (GRN) underpinning dorsal-ventral (DV) patterning of the Drosophila embryo is among the most thoroughly understood GRNs, making it an ideal system for comparative studies seeking to understand the evolution of development. With the emergence of widely applicable techniques for testing gene function, species with sequenced genomes, and multiple tractable species with diverse developmental modes, a phylogenetically broad and molecularly deep understanding of the evolution of DV axis formation in insects is feasible. Here, we review recent progress made in this field, compare our emerging molecular understanding to classical embryological experiments, and suggest future directions of inquiry.
Metabolic networks in motion: 13C-based flux analysis
Sauer, Uwe
2006-01-01
Many properties of complex networks cannot be understood from monitoring the components—not even when comprehensively monitoring all protein or metabolite concentrations—unless such information is connected and integrated through mathematical models. The reason is that static component concentrations, albeit extremely informative, do not contain functional information per se. The functional behavior of a network emerges only through the nonlinear gene, protein, and metabolite interactions across multiple metabolic and regulatory layers. I argue here that intracellular reaction rates are the functional end points of these interactions in metabolic networks, hence are highly relevant for systems biology. Methods for experimental determination of metabolic fluxes differ fundamentally from component concentration measurements; that is, intracellular reaction rates cannot be detected directly, but must be estimated through computer model-based interpretation of stable isotope patterns in products of metabolism. PMID:17102807
Jinawath, Natini; Bunbanjerdsuk, Sacarin; Chayanupatkul, Maneerat; Ngamphaiboon, Nuttapong; Asavapanumas, Nithi; Svasti, Jisnuson; Charoensawan, Varodom
2016-11-22
With the wealth of data accumulated from completely sequenced genomes and other high-throughput experiments, global studies of biological systems that simultaneously investigate multiple biological entities (e.g. genes, transcripts, proteins) have become routine. Network representation is frequently used to capture the presence of these molecules as well as their relationships. Network biology has been widely used in molecular biology and genetics, where several network properties have been shown to be functionally important. Here, we discuss how such methodology can be useful to translational biomedical research, where scientists traditionally focus on one or a small set of genes, diseases, and drug candidates at any one time. We first give an overview of the network representations frequently used in biology, including what nodes and edges represent, and review their application in preclinical research to date. Using cancer as an example, we review how network biology can facilitate system-wide approaches to identify targeted small molecule inhibitors. These types of inhibitors have the potential to be more specific, resulting in high-efficacy treatments with fewer side effects compared to conventional treatments such as chemotherapy. Global analysis may provide better insight into the overall picture of human diseases, as well as identify previously overlooked problems, leading to rapid advances in medicine. From the clinicians' point of view, it is necessary to bridge the gap between theoretical network biology and practical biomedical research, in order to improve the diagnosis, prevention, and treatment of the world's major diseases.
Prioritizing chronic obstructive pulmonary disease (COPD) candidate genes in COPD-related networks
Zhang, Yihua; Li, Wan; Feng, Yuyan; Guo, Shanshan; Zhao, Xilei; Wang, Yahui; He, Yuehan; He, Weiming; Chen, Lina
2017-01-01
Chronic obstructive pulmonary disease (COPD) is a multi-factor disease, which could be caused by many factors, including disturbances of metabolism and protein-protein interactions (PPIs). In this paper, a weighted COPD-related metabolic network and a weighted COPD-related PPI network were constructed based on COPD disease genes and functional information. Candidate genes in each of these weighted COPD-related networks were prioritized using a gene prioritization method. Literature review and functional enrichment analysis of the top 100 genes in these two networks suggested a correlation between COPD and these genes. The performance of our gene prioritization method was superior to that of ToppGene and ToppNet for genes from the COPD-related metabolic network or the COPD-related PPI network when assessed using leave-one-out cross-validation, literature validation and functional enrichment analysis. The top-ranked genes prioritized from COPD-related metabolic and PPI networks could promote a better understanding of the molecular mechanism of this disease from different perspectives. The top 100 genes in the COPD-related metabolic network or the COPD-related PPI network might be potential markers for the diagnosis and treatment of COPD. PMID:29262568
Prioritizing chronic obstructive pulmonary disease (COPD) candidate genes in COPD-related networks.
Zhang, Yihua; Li, Wan; Feng, Yuyan; Guo, Shanshan; Zhao, Xilei; Wang, Yahui; He, Yuehan; He, Weiming; Chen, Lina
2017-11-28
Chronic obstructive pulmonary disease (COPD) is a multi-factor disease, which could be caused by many factors, including disturbances of metabolism and protein-protein interactions (PPIs). In this paper, a weighted COPD-related metabolic network and a weighted COPD-related PPI network were constructed based on COPD disease genes and functional information. Candidate genes in each of these weighted COPD-related networks were prioritized using a gene prioritization method. Literature review and functional enrichment analysis of the top 100 genes in these two networks suggested a correlation between COPD and these genes. The performance of our gene prioritization method was superior to that of ToppGene and ToppNet for genes from the COPD-related metabolic network or the COPD-related PPI network when assessed using leave-one-out cross-validation, literature validation and functional enrichment analysis. The top-ranked genes prioritized from COPD-related metabolic and PPI networks could promote a better understanding of the molecular mechanism of this disease from different perspectives. The top 100 genes in the COPD-related metabolic network or the COPD-related PPI network might be potential markers for the diagnosis and treatment of COPD.
Saul, M C; Majdak, P; Perez, S; Reilly, M; Garland, T; Rhodes, J S
2017-03-01
Although exercise is critical for health, many lack the motivation to exercise, and it is unclear how motivation might be increased. To uncover the molecular underpinnings of increased motivation for exercise, we analyzed the transcriptome of the striatum in four mouse lines selectively bred for high voluntary wheel running and four non-selected control lines. The striatum was dissected and RNA was extracted and sequenced from four individuals of each line. We found multiple genes and gene systems with strong relationships to both selection and running history over the previous 6 days. Among these genes were Htr1b, a serotonin receptor subunit and Slc38a2, a marker for both glutamatergic and γ-aminobutyric acid (GABA)-ergic signaling. System analysis of the raw results found enrichment of transcriptional regulation and kinase genes. Further, we identified a splice variant affecting the Wnt-related Golgi signaling gene Tmed5. Using coexpression network analysis, we found a cluster of interrelated coexpression modules with relationships to running behavior. From these modules, we built a network correlated with running that predicts a mechanistic relationship between transcriptional regulation by nucleosome structure and Htr1b expression. The Library of Integrated Network-Based Cellular Signatures identified the protein kinase C δ inhibitor, rottlerin, the tyrosine kinase inhibitor, Linifanib and the delta-opioid receptor antagonist 7-benzylidenenaltrexone as potential compounds for increasing the motivation to run. Taken together, our findings support a neurobiological framework of exercise motivation where chromatin state leads to differences in dopamine signaling through modulation of both the primary neurotransmitters glutamate and GABA, and by neuromodulators such as serotonin. © 2016 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.
A high resolution atlas of gene expression in the domestic sheep (Ovis aries)
Farquhar, Iseabail L.; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G.; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C. Bruce; Freeman, Tom C.; Archibald, Alan L.; Hume, David A.
2017-01-01
Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of ‘guilt by association’ was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages. PMID:28915238
A high resolution atlas of gene expression in the domestic sheep (Ovis aries).
Clark, Emily L; Bush, Stephen J; McCulloch, Mary E B; Farquhar, Iseabail L; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G; Wu, Chunlei; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C Bruce; Freeman, Tom C; Summers, Kim M; Archibald, Alan L; Hume, David A
2017-09-01
Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of 'guilt by association' was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages.
Gene Expression Profile of Human Cytokines in Response to B.pseudomallei Infection
2017-04-19
responses to an infection (6). Activation of leukocytes and cytokine networks are prominent features of inflammation and the septic response (7...Nationwide active surveillance for melioidosis was established in multiple state and private hospitals throughout Sri Lanka, with ethics...time of recruitment all melioidosis patients were undergoing antibacterial treatment. We also recruited healthy donors and patients fitting
Sharma, Rita; Cao, Peijian; Jung, Ki-Hong; Sharma, Manoj K.; Ronald, Pamela C.
2013-01-01
Glycoside hydrolases (GH) catalyze the hydrolysis of glycosidic bonds in cell wall polymers and can have major effects on cell wall architecture. Taking advantage of the massive datasets available in public databases, we have constructed a rice phylogenomic database of GHs (http://ricephylogenomics.ucdavis.edu/cellwalls/gh/). This database integrates multiple data types including the structural features, orthologous relationships, mutant availability, and gene expression patterns for each GH family in a phylogenomic context. The rice genome encodes 437 GH genes classified into 34 families. Based on pairwise comparison with eight dicot and four monocot genomes, we identified 138 GH genes that are highly diverged between monocots and dicots, 57 of which have diverged further in rice as compared with four monocot genomes scanned in this study. Chromosomal localization and expression analysis suggest a role for both whole-genome and localized gene duplications in expansion and diversification of GH families in rice. We examined the meta-profiles of expression patterns of GH genes in twenty different anatomical tissues of rice. Transcripts of 51 genes exhibit tissue or developmental stage-preferential expression, whereas, seventeen other genes preferentially accumulate in actively growing tissues. When queried in RiceNet, a probabilistic functional gene network that facilitates functional gene predictions, nine out of seventeen genes form a regulatory network with the well-characterized genes involved in biosynthesis of cell wall polymers including cellulose synthase and cellulose synthase-like genes of rice. Two-thirds of the GH genes in rice are up regulated in response to biotic and abiotic stress treatments indicating a role in stress adaptation. Our analyses identify potential GH targets for cell wall modification. PMID:23986771
ARNetMiT R Package: association rules based gene co-expression networks of miRNA targets.
Özgür Cingiz, M; Biricik, G; Diri, B
2017-03-31
miRNAs are key regulators that bind to target genes to suppress their gene expression level. The relations between miRNAs and their target genes enable users to derive co-expressed genes that may be involved in similar biological processes and functions in cells. We hypothesize that target genes of miRNAs are co-expressed when they are regulated by multiple miRNAs. Using these co-expressed genes, we can theoretically construct co-expression networks (GCNs) related to 152 diseases. In this study, we introduce ARNetMiT, which utilizes a hash-based association rule algorithm in a novel way to infer GCNs from miRNA-target gene data. We also present an R package of ARNetMiT, which infers and visualizes GCNs of diseases that are selected by users. Our approach treats miRNAs as transactions and target genes as their items. Support and confidence values are used to prune association rules on miRNA-target gene data to construct support-based GCNs (sGCNs) along with support- and confidence-based GCNs (scGCNs). We use overlap analysis and topological features for the performance analysis of GCNs. We also infer GCNs with popular GNI algorithms for comparison with the GCNs of ARNetMiT. Overlap analysis results show that ARNetMiT outperforms the compared GNI algorithms. We see that using high confidence values in scGCNs increases the ratio of overlapping gene-gene interactions between the compared methods. According to the evaluation of the topological features of ARNetMiT-based GCNs, the degrees of nodes follow a power-law distribution. The hub genes discovered by ARNetMiT-based GCNs are consistent with the literature.
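The transaction/item framing in this abstract can be illustrated with a minimal sketch. It assumes plain pair counting rather than the hash-based algorithm the authors describe, and the function name, thresholds, and toy miRNA and gene identifiers are all hypothetical. Each miRNA is one transaction, its target genes are the items, and a gene pair becomes an edge when its support and confidence pass the thresholds.

```python
from collections import Counter
from itertools import combinations


def gene_pair_rules(mirna_targets, min_support=0.5, min_confidence=0.6):
    """Return {(gene_a, gene_b): (support, confidence)} for pairs passing both thresholds.

    mirna_targets maps each miRNA (a transaction) to its set of target genes (items).
    """
    n_transactions = len(mirna_targets)
    gene_count = Counter()
    pair_count = Counter()
    for targets in mirna_targets.values():
        genes = sorted(set(targets))
        gene_count.update(genes)
        pair_count.update(combinations(genes, 2))

    edges = {}
    for (a, b), count in pair_count.items():
        # Support: fraction of miRNAs that target both genes.
        support = count / n_transactions
        if support < min_support:
            continue
        # Confidence is directional (a -> b, b -> a); this sketch keeps the larger one.
        confidence = max(count / gene_count[a], count / gene_count[b])
        if confidence >= min_confidence:
            edges[(a, b)] = (support, confidence)
    return edges


# Hypothetical toy data: four miRNAs and their target gene sets.
toy = {
    "miR-A": {"G1", "G2", "G3"},
    "miR-B": {"G1", "G2"},
    "miR-C": {"G2", "G3"},
    "miR-D": {"G1", "G2"},
}
edges = gene_pair_rules(toy, min_support=0.5, min_confidence=0.7)
```

Raising `min_confidence` prunes more pairs, which corresponds to the stricter scGCN setting mentioned in the abstract. Taking the larger of the two directional confidences is a choice made here for brevity, not something the abstract specifies.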
Mining Gene Regulatory Networks by Neural Modeling of Expression Time-Series.
Rubiolo, Mariano; Milone, Diego H; Stegmayer, Georgina
2015-01-01
Discovering gene regulatory networks from data is one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as time series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. They are used for modeling each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the underlying relations among genes. The results obtained on artificial and real datasets confirm the method's effectiveness for discovering regulatory networks from a proper modeling of the temporal dynamics of gene expression profiles.
Sato, Masanao; Tsuda, Kenichi; Wang, Lin; Coller, John; Watanabe, Yuichiro; Glazebrook, Jane; Katagiri, Fumiaki
2010-01-01
Biological signaling processes may be mediated by complex networks in which network components and network sectors interact with each other in complex ways. Studies of complex networks benefit from approaches in which the roles of individual components are considered in the context of the network. The plant immune signaling network, which controls inducible responses to pathogen attack, is such a complex network. We studied the Arabidopsis immune signaling network upon challenge with a strain of the bacterial pathogen Pseudomonas syringae expressing the effector protein AvrRpt2 (Pto DC3000 AvrRpt2). This bacterial strain feeds multiple inputs into the signaling network, allowing many parts of the network to be activated at once. mRNA profiles for 571 immune response genes of 22 Arabidopsis immunity mutants and wild type were collected 6 hours after inoculation with Pto DC3000 AvrRpt2. The mRNA profiles were analyzed as detailed descriptions of changes in the network state resulting from the genetic perturbations. Regulatory relationships among the genes corresponding to the mutations were inferred by recursively applying a non-linear dimensionality reduction procedure to the mRNA profile data. The resulting static network model accurately predicted 23 of 25 regulatory relationships reported in the literature, suggesting that predictions of novel regulatory relationships are also accurate. The network model revealed two striking features: (i) the components of the network are highly interconnected; and (ii) negative regulatory relationships are common between signaling sectors. Complex regulatory relationships, including a novel negative regulatory relationship between the early microbe-associated molecular pattern-triggered signaling sectors and the salicylic acid sector, were further validated. 
We propose that prevalent negative regulatory relationships among the signaling sectors make the plant immune signaling network a “sector-switching” network, which effectively balances two apparently conflicting demands, robustness against pathogenic perturbations and moderation of negative impacts of immune responses on plant fitness. PMID:20661428
Kitchen, James L.; Allaby, Robin G.
2013-01-01
Selection and adaptation of individuals to their underlying environments are highly dynamical processes, encompassing interactions between the individual and its seasonally changing environment, synergistic or antagonistic interactions between individuals and interactions amongst the regulatory genes within the individual. Plants are useful organisms to study within systems modeling because their sedentary nature simplifies interactions between individuals and the environment, and many important plant processes such as germination or flowering are dependent on annual cycles which can be disrupted by climate behavior. Sedentism makes plants relevant candidates for spatially explicit modeling that is tied in with dynamical environments. We propose that in order to fully understand the complexities behind plant adaptation, a system that couples aspects from systems biology with population and landscape genetics is required. A suitable system could be represented by spatially explicit individual-based models where the virtual individuals are located within time-variable heterogeneous environments and contain mutable regulatory gene networks. These networks could directly interact with the environment, and should provide a useful approach to studying plant adaptation. PMID:27137364
Rozengurt, Enrique; Sinnett-Smith, James; Eibl, Guido
2018-01-01
Pancreatic ductal adenocarcinoma (PDAC) is generally a fatal disease with no efficacious treatment modalities. Elucidation of signaling mechanisms that will lead to the identification of novel targets for therapy and chemoprevention is urgently needed. Here, we review the role of Yes-associated protein (YAP) and WW-domain-containing Transcriptional co-Activator with a PDZ-binding motif (TAZ) in the development of PDAC. These oncogenic proteins are at the center of a signaling network that involves multiple upstream signals and downstream YAP-regulated genes. We also discuss the clinical significance of the YAP signaling network in PDAC using a recently published interactive open-access database (www.proteinatlas.org/pathology) that allows genome-wide exploration of the impact of individual proteins on survival outcomes. Multiple YAP/TEAD-regulated genes, including AJUBA, ANLN, AREG, ARHGAP29, AURKA, BUB1, CCND1, CDK6, CXCL5, EDN2, DKK1, FOSL1, FOXM1, HBEGF, IGFBP2, JAG1, NOTCH2, RHAMM, RRM2, SERP1, and ZWILCH, are associated with unfavorable survival of PDAC patients. Similarly, components of AP-1 that synergize with YAP (FOSL1), growth factors (TGFα, EPEG, and HBEGF), a specific integrin (ITGA2), heptahelical receptors (P2Y2R, GPR87) and an inhibitor of the Hippo pathway (MUC1), all of which stimulate YAP activity, are associated with unfavorable survival of PDAC patients. By contrast, YAP inhibitory pathways (STRAD/LKB-1/AMPK, PKA/LATS, and TSC/mTORC1) indicate a favorable prognosis. These associations emphasize that the YAP signaling network correlates with poor survival of pancreatic cancer patients. We conclude that the YAP pathway is a major determinant of clinical aggressiveness in PDAC patients and a target for therapeutic and preventive strategies in this disease.
Carré, Clément; Mas, André; Krouk, Gabriel
2017-01-01
Inferring transcriptional gene regulatory networks from transcriptomic datasets is a key challenge of systems biology, with potential impacts ranging from medicine to agronomy. There are several techniques used presently to experimentally assay transcription factor-to-target relationships, defining important information about real gene regulatory network connections. These techniques include classical ChIP-seq, yeast one-hybrid, or more recently, DAP-seq or target technologies. These techniques are usually used to validate algorithm predictions. Here, we developed a reverse engineering approach based on mathematical and computer simulation to evaluate the impact that this prior knowledge on gene regulatory networks may have on training machine learning algorithms. First, we developed a gene regulatory network-simulating engine called FRANK (Fast Randomizing Algorithm for Network Knowledge) that is able to simulate large gene regulatory networks (containing 10^4 genes) with characteristics of gene regulatory networks observed in vivo. FRANK also generates stable or oscillatory gene expression directly produced by the simulated gene regulatory networks. The development of FRANK leads to important general conclusions concerning the design of large and stable gene regulatory networks harboring scale-free properties (built ex nihilo). In combination with a supervised (accepting prior knowledge) support vector machine algorithm, we (i) address biologically oriented questions concerning our capacity to accurately reconstruct gene regulatory networks and in particular we demonstrate that prior-knowledge structure is crucial for accurate learning, and (ii) draw conclusions to inform experimental design to perform learning able to solve gene regulatory networks in the future. 
By demonstrating that our predictions concerning the influence of the prior-knowledge structure on support vector machine learning capacity hold true on real data (Escherichia coli K14 network reconstruction using network and transcriptomic data), we show that the formalism used to build FRANK can to some extent be a reasonable model for gene regulatory networks in real cells.
From gene networks to drugs: systems pharmacology approaches for AUD.
Ferguson, Laura B; Harris, R Adron; Mayfield, Roy Dayne
2018-06-01
The alcohol research field has amassed an impressive number of gene expression datasets spanning key brain areas for addiction, species (humans as well as multiple animal models), and stages in the addiction cycle (binge/intoxication, withdrawal/negative affect, and preoccupation/anticipation). These data have improved our understanding of the molecular adaptations that eventually lead to dysregulation of brain function and the chronic, relapsing disorder of addiction. Identification of new medications to treat alcohol use disorder (AUD) will likely benefit from the integration of genetic, genomic, and behavioral information included in these important datasets. Systems pharmacology considers drug effects as the outcome of the complex network of interactions a drug has rather than a single drug-molecule interaction. Computational strategies based on this principle that integrate gene expression signatures of pharmaceuticals and disease states have shown promise for identifying treatments that ameliorate disease symptoms (called in silico gene mapping or connectivity mapping). In this review, we suggest that gene expression profiling for in silico mapping is critical to improve drug repurposing and discovery for AUD and other psychiatric illnesses. We highlight studies that successfully apply gene mapping computational approaches to identify or repurpose pharmaceutical treatments for psychiatric illnesses. Furthermore, we address important challenges that must be overcome to maximize the potential of these strategies to translate to the clinic and improve healthcare outcomes.
Nayak, Renuka R.; Kearns, Michael; Spielman, Richard S.; Cheung, Vivian G.
2009-01-01
Genes interact in networks to orchestrate cellular processes. Analysis of these networks provides insights into gene interactions and functions. Here, we took advantage of normal variation in human gene expression to infer gene networks, which we constructed using correlations in expression levels of more than 8.5 million gene pairs in immortalized B cells from three independent samples. The resulting networks allowed us to identify biological processes and gene functions. Among the biological pathways, we found processes such as translation and glycolysis that co-occur in the same subnetworks. We predicted the functions of poorly characterized genes, including CHCHD2 and TMEM111, and provided experimental evidence that TMEM111 is part of the endoplasmic reticulum-associated secretory pathway. We also found that IFIH1, a susceptibility gene of type 1 diabetes, interacts with YES1, which plays a role in glucose transport. Furthermore, genes that predispose to the same diseases are clustered nonrandomly in the coexpression network, suggesting that networks can provide candidate genes that influence disease susceptibility. Therefore, our analysis of gene coexpression networks offers information on the role of human genes in normal and disease processes. PMID:19797678
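A coexpression network of the kind described above can be sketched as follows. This is an illustration only: it assumes Pearson correlation with a fixed absolute threshold on a single toy sample, whereas the study compared correlations across three independent samples. The gene names, values, and threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expr, threshold=0.9):
    """Link every gene pair whose absolute correlation reaches the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical expression levels for three genes measured in four individuals.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(expr, threshold=0.9)
```

With millions of gene pairs, a vectorized correlation matrix would be the practical choice. The explicit pairwise loop is kept here only to make the edge rule visible.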
NIBBS-search for fast and accurate prediction of phenotype-biased metabolic systems.
Schmidt, Matthew C; Rocha, Andrea M; Padmanabhan, Kanchana; Shpanskaya, Yekaterina; Banfield, Jill; Scott, Kathleen; Mihelcic, James R; Samatova, Nagiza F
2012-01-01
Understanding of genotype-phenotype associations is important not only for furthering our knowledge on internal cellular processes, but also essential for providing the foundation necessary for genetic engineering of microorganisms for industrial use (e.g., production of bioenergy or biofuels). However, genotype-phenotype associations alone do not provide enough information to alter an organism's genome to either suppress or exhibit a phenotype. It is important to look at the phenotype-related genes in the context of the genome-scale network to understand how the genes interact with other genes in the organism. Identification of metabolic subsystems involved in the expression of the phenotype is one way of placing the phenotype-related genes in the context of the entire network. A metabolic system refers to a metabolic network subgraph; nodes are compounds and edge labels are the enzymes that catalyze the reactions. The metabolic subsystem could be part of a single metabolic pathway or span parts of multiple pathways. Arguably, comparative genome-scale metabolic network analysis is a promising strategy to identify these phenotype-related metabolic subsystems. Network Instance-Based Biased Subgraph Search (NIBBS) is a graph-theoretic method for genome-scale metabolic network comparative analysis that can identify metabolic systems that are statistically biased toward phenotype-expressing organismal networks. We set up experiments with target phenotypes like hydrogen production, TCA expression, and acid-tolerance. We show via extensive literature search that some of the resulting metabolic subsystems are indeed phenotype-related and formulate hypotheses for other systems in terms of their role in phenotype expression. NIBBS is also orders of magnitude faster than MULE, one of the most efficient maximal frequent subgraph mining algorithms that could be adjusted for this problem. 
Also, the set of phenotype-biased metabolic systems output by NIBBS comes very close to the set of phenotype-biased subgraphs output by an exact maximally-biased subgraph enumeration algorithm (MBS-Enum). The code (NIBBS and the module to visualize the identified subsystems) is available at http://freescience.org/cs/NIBBS.
NIBBS-Search for Fast and Accurate Prediction of Phenotype-Biased Metabolic Systems
Padmanabhan, Kanchana; Shpanskaya, Yekaterina; Banfield, Jill; Scott, Kathleen; Mihelcic, James R.; Samatova, Nagiza F.
2012-01-01
Understanding of genotype-phenotype associations is important not only for furthering our knowledge on internal cellular processes, but also essential for providing the foundation necessary for genetic engineering of microorganisms for industrial use (e.g., production of bioenergy or biofuels). However, genotype-phenotype associations alone do not provide enough information to alter an organism's genome to either suppress or exhibit a phenotype. It is important to look at the phenotype-related genes in the context of the genome-scale network to understand how the genes interact with other genes in the organism. Identification of metabolic subsystems involved in the expression of the phenotype is one way of placing the phenotype-related genes in the context of the entire network. A metabolic system refers to a metabolic network subgraph; nodes are compounds and edge labels are the enzymes that catalyze the reactions. The metabolic subsystem could be part of a single metabolic pathway or span parts of multiple pathways. Arguably, comparative genome-scale metabolic network analysis is a promising strategy to identify these phenotype-related metabolic subsystems. Network Instance-Based Biased Subgraph Search (NIBBS) is a graph-theoretic method for genome-scale metabolic network comparative analysis that can identify metabolic systems that are statistically biased toward phenotype-expressing organismal networks. We set up experiments with target phenotypes like hydrogen production, TCA expression, and acid-tolerance. We show via extensive literature search that some of the resulting metabolic subsystems are indeed phenotype-related and formulate hypotheses for other systems in terms of their role in phenotype expression. NIBBS is also orders of magnitude faster than MULE, one of the most efficient maximal frequent subgraph mining algorithms that could be adjusted for this problem. 
Also, the set of phenotype-biased metabolic systems output by NIBBS comes very close to the set of phenotype-biased subgraphs output by an exact maximally-biased subgraph enumeration algorithm (MBS-Enum). The code (NIBBS and the module to visualize the identified subsystems) is available at http://freescience.org/cs/NIBBS. PMID:22589706
Refining Pathways: A Model Comparison Approach
Moffa, Giusi; Erdmann, Gerrit; Voloshanenko, Oksana; Hundsrucker, Christian; Sadeh, Mohammad J.; Boutros, Michael; Spang, Rainer
2016-01-01
Cellular signalling pathways consolidate multiple molecular interactions into working models of signal propagation, amplification, and modulation. They are described and visualized as networks. Adjusting network topologies to experimental data is a key goal of systems biology. While network reconstruction algorithms like nested effects models are well established tools of computational biology, their data requirements can be prohibitive for their practical use. In this paper we suggest focussing on well defined aspects of a pathway and develop the computational tools to do so. We adapt the framework of nested effect models to focus on a specific aspect of activated Wnt signalling in HCT116 colon cancer cells: Does the activation of Wnt target genes depend on the secretion of Wnt ligands or do mutations in the signalling molecule β-catenin make this activation independent from them? We framed this question into two competing classes of models: Models that depend on Wnt ligands secretion versus those that do not. The model classes translate into restrictions of the pathways in the network topology. Wnt dependent models are more flexible than Wnt independent models. Bayes factors are the standard Bayesian tool to compare different models fairly on the data evidence. In our analysis, the Bayes factors depend on the number of potential Wnt signalling target genes included in the models. Stability analysis with respect to this number showed that the data strongly favours Wnt ligands dependent models for all realistic numbers of target genes. PMID:27248690
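For orientation, the Bayes factor comparing two model classes can be written in its generic textbook form. The notation is an assumption made here, not taken from the paper: $D$ is the data, $\mathcal{M}_{\mathrm{dep}}$ and $\mathcal{M}_{\mathrm{ind}}$ are the Wnt-dependent and Wnt-independent classes of network topologies, and $m$ ranges over the topologies within a class.

```latex
B = \frac{p(D \mid \mathcal{M}_{\mathrm{dep}})}{p(D \mid \mathcal{M}_{\mathrm{ind}})},
\qquad
p(D \mid \mathcal{M}_k) = \sum_{m \in \mathcal{M}_k} p(D \mid m)\, p(m \mid \mathcal{M}_k),
\quad k \in \{\mathrm{dep}, \mathrm{ind}\}.
```

A value of $B > 1$ favours the Wnt-dependent class. Because each marginal likelihood averages over every topology in its class, a more flexible class is not rewarded merely for containing more models.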
Temperature dependence of the multistability of lactose utilization network of Escherichia coli
NASA Astrophysics Data System (ADS)
Nepal, Sudip; Kumar, Pradeep
Biological systems are capable of producing multiple states out of a single set of inputs. Multistability acts like a biological switch that allows organisms to respond differently to different environmental conditions and hence plays an important role in adaptation to changing environment. One of the widely studied gene regulatory networks underlying the metabolism of bacteria is the lactose utilization network, which exhibits a multistable behavior as a function of lactose concentration. We have studied the effect of temperature on multistability of the lactose utilization network at various concentrations of thio-methylgalactoside (TMG), a synthetic lactose. We find that while the lactose utilization network exhibits a bistable behavior for temperature T > 20 °C, a graded response arises for temperature T ≤ 20 °C. Furthermore, we construct a phase diagram of the graded and bistable response of lactose utilization network as a function of temperature and TMG concentration. Our results suggest that environmental conditions, in this case temperature, can alter the nature of cellular regulation of metabolism.
2012-01-01
Background Development and application of transcriptomics-based gene classifiers for ecotoxicological applications lag far behind those of biomedical sciences. Many such classifiers discovered thus far lack vigorous statistical and experimental validations. A combination of genetic algorithm/support vector machines and genetic algorithm/K nearest neighbors was used in this study to search for classifiers of endocrine-disrupting chemicals (EDCs) in zebrafish. Searches were conducted on both tissue-specific and tissue-combined datasets, either across the entire transcriptome or within individual transcription factor (TF) networks previously linked to EDC effects. Candidate classifiers were evaluated by gene set enrichment analysis (GSEA) on both the original training data and a dedicated validation dataset. Results Multi-tissue dataset yielded no classifiers. Among the 19 chemical-tissue conditions evaluated, the transcriptome-wide searches yielded classifiers for six of them, each having approximately 20 to 30 gene features unique to a condition. Searches within individual TF networks produced classifiers for 15 chemical-tissue conditions, each containing 100 or fewer top-ranked gene features pooled from those of multiple TF networks and also unique to each condition. For the training dataset, 10 out of 11 classifiers successfully identified the gene expression profiles (GEPs) of their targeted chemical-tissue conditions by GSEA. For the validation dataset, classifiers for prochloraz-ovary and flutamide-ovary also correctly identified the GEPs of corresponding conditions while no classifier could predict the GEP from prochloraz-brain. Conclusions The discrepancies in the performance of these classifiers were attributed in part to varying data complexity among the conditions, as measured to some degree by Fisher’s discriminant ratio statistic. 
This variation in data complexity could likely be compensated by adjusting sample size for individual chemical-tissue conditions, thus suggesting a need for a preliminary survey of transcriptomic responses before launching a full scale classifier discovery effort. Classifier discovery based on individual TF networks could yield more mechanistically-oriented biomarkers. GSEA proved to be a flexible and effective tool for application of gene classifiers but a similar and more refined algorithm, connectivity mapping, should also be explored. The distribution characteristics of classifiers across tissues, chemicals, and TF networks suggested a differential biological impact among the EDCs on zebrafish transcriptome involving some basic cellular functions. PMID:22849515
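The abstract above uses Fisher's discriminant ratio as a measure of data complexity. A minimal sketch of one common form of that statistic follows: for each feature, the squared difference of class means divided by the sum of class variances, with the maximum taken over features. The exact variant used in the study is not stated, so this form, the function name, and the toy expression values are all assumptions.

```python
from statistics import mean, pvariance


def fisher_discriminant_ratio(class_a, class_b):
    """Maximum over features of (mean_a - mean_b)^2 / (var_a + var_b).

    class_a and class_b are lists of samples; each sample is a list of
    feature values (for example, expression levels of the same genes).
    """
    ratios = []
    for feat_a, feat_b in zip(zip(*class_a), zip(*class_b)):
        spread = pvariance(feat_a) + pvariance(feat_b)
        if spread == 0:
            continue  # ratio undefined for a constant feature; skipped in this sketch
        ratios.append((mean(feat_a) - mean(feat_b)) ** 2 / spread)
    return max(ratios) if ratios else 0.0


# Hypothetical two-gene expression values for treated and control samples.
treated = [[5.1, 2.0], [4.9, 2.2], [5.0, 1.8]]
control = [[3.0, 2.1], [3.2, 1.9], [2.8, 2.0]]

ratio = fisher_discriminant_ratio(treated, control)
print(f"Fisher's discriminant ratio: {ratio:.1f}")
```

A larger ratio indicates that at least one feature separates the two classes well, which corresponds to lower data complexity in the sense used above.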
Wang, Ping; Lin, Mingyan; Pedrosa, Erika; Hrabovsky, Anastasia; Zhang, Zheng; Guo, Wenjun; Lachman, Herbert M; Zheng, Deyou
2015-01-01
Disruptive mutation in the CHD8 gene is one of the top genetic risk factors in autism spectrum disorders (ASDs). Previous analyses of genome-wide CHD8 occupancy and reduced expression of CHD8 by shRNA knockdown in committed neural cells showed that CHD8 regulates multiple cell processes critical for neural functions, and its targets are enriched with ASD-associated genes. To further understand the molecular links between CHD8 functions and ASD, we have applied the CRISPR/Cas9 technology to knock out one copy of CHD8 in induced pluripotent stem cells (iPSCs) to better mimic the loss-of-function status that would exist in the developing human embryo prior to neuronal differentiation. We then carried out transcriptomic and bioinformatic analyses of neural progenitors and neurons derived from the CHD8 mutant iPSCs. Transcriptome profiling revealed that CHD8 hemizygosity (CHD8 (+/-)) affected the expression of several thousand genes in neural progenitors and early differentiating neurons. The differentially expressed genes were enriched for functions of neural development, β-catenin/Wnt signaling, extracellular matrix, and skeletal system development. They also exhibited significant overlap with genes previously associated with autism and schizophrenia, as well as the downstream transcriptional targets of multiple genes implicated in autism. Providing important insight into how CHD8 mutations might give rise to macrocephaly, we found that seven of the twelve genes associated with human brain volume or head size by genome-wide association studies (e.g., HMGA2) were dysregulated in CHD8 (+/-) neural progenitors or neurons. We have established a renewable source of CHD8 (+/-) iPSC lines that would be valuable for investigating the molecular and cellular functions of CHD8. Transcriptomic profiling showed that CHD8 regulates multiple genes implicated in ASD pathogenesis and genes associated with brain volume.
Topology association analysis in weighted protein interaction network for gene prioritization
NASA Astrophysics Data System (ADS)
Wu, Shunyao; Shao, Fengjing; Zhang, Qi; Ji, Jun; Xu, Shaojie; Sun, Rencheng; Sun, Gengxin; Du, Xiangjun; Sui, Yi
2016-11-01
Although many algorithms for disease gene prediction have been proposed, the weights of edges are rarely taken into account. In this paper, the strengths of topology associations between disease genes and essential genes are analyzed in a weighted protein interaction network. Empirical analysis demonstrates that, compared to other genes, disease genes are weakly connected with essential genes in the protein interaction network. Based on this finding, a novel global distance measurement for gene prioritization with a weighted protein interaction network is proposed. Positive and negative flow is allocated to disease and essential genes, respectively. Additionally, the network propagation model is extended to weighted networks. Experimental results on 110 diseases verify the effectiveness and potential of the proposed measurement. Moreover, weak links play a more important role than strong links in gene prioritization, a finding that helps to deepen the understanding of protein interaction networks.
Kulmuni, J; Westram, A M
2017-06-01
The possibility of intrinsic barriers to gene flow is often neglected in empirical research on local adaptation and speciation with gene flow, for example when interpreting patterns observed in genome scans. However, we draw attention to the fact that, even with gene flow, divergent ecological selection may generate intrinsic barriers involving both ecologically selected and other interacting loci. Mechanistically, the link between the two types of barriers may be generated by genes that have multiple functions (i.e., pleiotropy), and/or by gene interaction networks. Because most genes function in complex networks, and their evolution is not independent of other genes, changes evolving in response to ecological selection can generate intrinsic barriers as a by-product. A crucial question is to what extent such by-product barriers contribute to divergence and speciation, that is, whether they stably reduce gene flow. We discuss under which conditions by-product barriers may increase isolation. However, we also highlight that, depending on the conditions (e.g., the amount of gene flow and the strength of selection acting on the intrinsic vs. the ecological barrier component), the intrinsic incompatibility may actually destabilize barriers to gene flow. In practice, intrinsic barriers generated as a by-product of divergent ecological selection may generate peaks in genome scans that cannot easily be interpreted. We argue that empirical studies on divergence with gene flow should consider the possibility of both ecological and intrinsic barriers. Future progress will likely come from work combining population genomic studies, experiments quantifying fitness and molecular studies on protein function and interactions. © 2017 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.
Feng, Fan; Qi, Weiwei; Lv, Yuanda; Yan, Shumei; Xu, Liming; Yang, Wenyao; Yuan, Yue; Chen, Yihan
2018-01-01
Maize (Zea mays) endosperm is a primary tissue for nutrient storage and is highly differentiated during development. However, the regulatory networks of endosperm development and nutrient metabolism remain largely unknown. Maize opaque11 (o11) is a classic seed mutant with a small and opaque endosperm showing decreased starch and protein accumulation. We cloned O11 and found that it encodes an endosperm-specific bHLH transcription factor (TF). Loss of function of O11 significantly affected transcription of carbohydrate/amino acid metabolism and stress response genes. Genome-wide binding site analysis revealed 9885 O11 binding sites distributed over 6033 genes. Using chromatin immunoprecipitation sequencing (ChIP-seq) coupled with RNA sequencing (RNA-seq) assays, we identified 259 O11-modulated target genes. O11 was found to directly regulate key TFs in endosperm development (NKD2 and ZmDOF3) and nutrient metabolism (O2 and PBF). Moreover, O11 directly regulates cyPPDKs and multiple carbohydrate metabolic enzymes. O11 is an activator of ZmYoda, suggesting its regulatory function through the MAPK pathway in endosperm development. Many stress-response genes are also direct targets of O11. In addition, 11 O11-interacting proteins were identified, including ZmIce1, which coregulates stress response targets and ZmYoda with O11. Therefore, this study reveals an endosperm regulatory network centered around O11, which coordinates endosperm development, metabolism and stress responses. PMID:29436476
Walker, Emily; Chang, Wing Y.; Hunkapiller, Julie; Cagney, Gerard; Garcha, Kamal; Torchia, Joseph; Krogan, Nevan J.; Reiter, Jeremy F.; Stanford, William L.
2010-01-01
Summary Polycomb group (PcG) proteins are conserved epigenetic transcriptional repressors that control numerous developmental gene expression programs and have recently been implicated in modulating embryonic stem cell (ESC) fate. We identified the PcG protein PCL2 (polycomb-like 2) in a genome-wide screen for regulators of self-renewal and pluripotency and predicted that it would play an important role in mouse ESC fate determination. Using multiple biochemical strategies, we provide evidence that PCL2 is a Polycomb Repressive Complex 2 (PRC2)-associated protein in mouse ESCs. Knockdown of Pcl2 in ESCs resulted in heightened self-renewal characteristics, defects in differentiation and altered patterns of histone methylation. Integration of global gene expression and promoter occupancy analyses allowed us to identify PCL2 and PRC2 transcriptional targets and draft regulatory networks. We describe the role of PCL2 in both modulating transcription of ESC self-renewal genes in undifferentiated ESCs as well as developmental regulators during early commitment and differentiation. PMID:20144788
Culyba, Matthew J; Kubiak, Jeffrey M; Mo, Charlie Y; Goulian, Mark; Kohli, Rahul M
2018-06-01
Biochemical pathways are often genetically encoded as simple transcription regulation networks, where one transcription factor regulates the expression of multiple genes in a pathway. The relative timing of each promoter's activation and shut-off within the network can impact physiology. In the DNA damage repair pathway (known as the SOS response) of Escherichia coli, approximately 40 genes are regulated by the LexA repressor. After a DNA damaging event, LexA degradation triggers SOS gene transcription, which is temporally separated into subsets of 'early', 'middle', and 'late' genes. Although this feature plays an important role in regulating the SOS response, both the range of this separation and its underlying mechanism are not experimentally defined. Here we show that, at low doses of DNA damage, the timing of promoter activities is not separated. Instead, timing differences only emerge at higher levels of DNA damage and increase as a function of DNA damage dose. To understand mechanism, we derived a series of synthetic SOS gene promoters which vary in LexA-operator binding kinetics, but are otherwise identical, and then studied their activity over a large dose-range of DNA damage. In distinction to established models based on rapid equilibrium assumptions, the data best fit a kinetic model of repressor occupancy at promoters, where the drop in cellular LexA levels associated with higher doses of DNA damage leads to non-equilibrium binding kinetics of LexA at operators. Operators with slow LexA binding kinetics achieve their minimal occupancy state at later times than operators with fast binding kinetics, resulting in a time separation of peak promoter activity between genes. These data provide insight into this remarkable feature of the SOS pathway by demonstrating how a single transcription factor can be employed to control the relative timing of each gene's transcription as a function of stimulus dose.
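The kinetic picture in the abstract above, in which operators with slow LexA binding kinetics reach their minimal occupancy later than operators with fast kinetics, can be illustrated with a toy model. The sketch assumes a single-step binding equation, d(theta)/dt = k_on * L(t) * (1 - theta) - k_off * theta, and an invented LexA time course (an instantaneous drop followed by exponential recovery). All rate constants, units, and function names are hypothetical and are not the fitted model from the study.

```python
import math

BASELINE_LEXA = 1.0  # assumed pre-damage LexA level (arbitrary units)


def lexa_level(t, drop=0.9, recovery_time=30.0):
    """Assumed LexA pool: instantaneous drop at t = 0, then exponential recovery."""
    return BASELINE_LEXA * (1.0 - drop * math.exp(-t / recovery_time))


def time_of_minimum_occupancy(k_on, k_off, t_end=200.0, dt=0.01):
    """Forward-Euler integration of operator occupancy theta(t).

    Returns the time at which occupancy is lowest and the occupancy there.
    """
    # Start from the pre-damage equilibrium occupancy.
    theta = k_on * BASELINE_LEXA / (k_on * BASELINE_LEXA + k_off)
    best_t, best_theta = 0.0, theta
    for i in range(int(t_end / dt)):
        t = i * dt
        theta += dt * (k_on * lexa_level(t) * (1.0 - theta) - k_off * theta)
        if theta < best_theta:
            best_t, best_theta = t + dt, theta
    return best_t, best_theta


# Two hypothetical operators with the same affinity (k_off / k_on) but
# binding kinetics that differ by a factor of 20.
t_fast, theta_fast = time_of_minimum_occupancy(k_on=1.0, k_off=0.1)
t_slow, theta_slow = time_of_minimum_occupancy(k_on=0.05, k_off=0.005)
print(f"fast operator: minimum occupancy {theta_fast:.2f} at t = {t_fast:.1f}")
print(f"slow operator: minimum occupancy {theta_slow:.2f} at t = {t_slow:.1f}")
```

Because both operators share the same equilibrium affinity in this sketch, a rapid-equilibrium model would predict identical timing for them; the separation appears only when binding kinetics are tracked explicitly.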
Uncovering disease mechanisms through network biology in the era of Next Generation Sequencing
NASA Astrophysics Data System (ADS)
Piñero, Janet; Berenstein, Ariel; Gonzalez-Perez, Abel; Chernomoretz, Ariel; Furlong, Laura I.
2016-04-01
Characterizing the behavior of disease genes in the context of biological networks has the potential to shed light on disease mechanisms, and to reveal both new candidate disease genes and therapeutic targets. Previous studies addressing the network properties of disease genes have produced contradictory results. Here we have explored the causes of these discrepancies and assessed the relationship between the network roles of disease genes and their tolerance to deleterious germline variants in human populations leveraging on: the abundance of interactome resources, a comprehensive catalog of disease genes and exome variation data. We found that the most salient network features of disease genes are driven by cancer genes and that genes related to different types of diseases play network roles whose centrality is inversely correlated to their tolerance to likely deleterious germline mutations. This proved to be a multiscale signature, including global, mesoscopic and local network centrality features. Cancer driver genes, the most sensitive to deleterious variants, occupy the most central positions, followed by dominant disease genes and then by recessive disease genes, which are tolerant to variants and isolated within their network modules.
Uncovering disease mechanisms through network biology in the era of Next Generation Sequencing
Piñero, Janet; Berenstein, Ariel; Gonzalez-Perez, Abel; Chernomoretz, Ariel; Furlong, Laura I.
2016-01-01
Characterizing the behavior of disease genes in the context of biological networks has the potential to shed light on disease mechanisms, and to reveal both new candidate disease genes and therapeutic targets. Previous studies addressing the network properties of disease genes have produced contradictory results. Here we have explored the causes of these discrepancies and assessed the relationship between the network roles of disease genes and their tolerance to deleterious germline variants in human populations leveraging on: the abundance of interactome resources, a comprehensive catalog of disease genes and exome variation data. We found that the most salient network features of disease genes are driven by cancer genes and that genes related to different types of diseases play network roles whose centrality is inversely correlated to their tolerance to likely deleterious germline mutations. This proved to be a multiscale signature, including global, mesoscopic and local network centrality features. Cancer driver genes, the most sensitive to deleterious variants, occupy the most central positions, followed by dominant disease genes and then by recessive disease genes, which are tolerant to variants and isolated within their network modules. PMID:27080396
Gupta, Sanjay K.; Dahiya, Saurabh; Lundy, Robert F.; Kumar, Ashok
2010-01-01
Background Skeletal muscle wasting is a debilitating consequence of a large number of disease states and conditions. Tumor necrosis factor-α (TNF-α) is one of the most important muscle-wasting cytokines, elevated levels of which cause significant muscular abnormalities. However, the underpinning molecular mechanisms by which TNF-α causes skeletal muscle wasting are less well-understood. Methodology/Principal Findings We have used microarray, quantitative real-time PCR (QRT-PCR), Western blot, and bioinformatics tools to study the effects of TNF-α on various molecular pathways and gene networks in C2C12 cells (a mouse myoblastic cell line). Microarray analyses of C2C12 myotubes treated with TNF-α (10 ng/ml) for 18h showed differential expression of a number of genes involved in distinct molecular pathways. The genes involved in nuclear factor-kappa B (NF-kappaB) signaling, 26s proteasome pathway, Notch1 signaling, and chemokine networks are the most important ones affected by TNF-α. The expression of some of the genes in the microarray dataset showed good correlation in independent QRT-PCR and Western blot assays. Analysis of TNF-treated myotubes showed that TNF-α augments the activity of both canonical and alternative NF-κB signaling pathways in myotubes. Bioinformatics analyses of the microarray dataset revealed that TNF-α affects the activity of several important pathways including those involved in oxidative stress, hepatic fibrosis, mitochondrial dysfunction, cholesterol biosynthesis, and TGF-β signaling. Furthermore, TNF-α was found to affect the gene networks related to drug metabolism, cell cycle, cancer, neurological disease, organismal injury, and abnormalities in myotubes. Conclusions TNF-α regulates the expression of multiple genes involved in various toxic pathways which may be responsible for TNF-induced muscle loss in catabolic conditions. 
Our study suggests that TNF-α activates both canonical and alternative NF-κB signaling pathways in a time-dependent manner in skeletal muscle cells. The study provides novel insight into the mechanisms of action of TNF-α in skeletal muscle cells. PMID:20967264
Putnam, Christopher D.; Srivatsan, Anjana; Nene, Rahul V.; Martinez, Sandra L.; Clotfelter, Sarah P.; Bell, Sara N.; Somach, Steven B.; E.S. de Souza, Jorge; Fonseca, André F.; de Souza, Sandro J.; Kolodner, Richard D.
2016-01-01
Gross chromosomal rearrangements (GCRs) play an important role in human diseases, including cancer. The identity of all Genome Instability Suppressing (GIS) genes is not currently known. Here multiple Saccharomyces cerevisiae GCR assays and query mutations were crossed into arrays of mutants to identify progeny with increased GCR rates. One hundred eighty two GIS genes were identified that suppressed GCR formation. Another 438 cooperatively acting GIS genes were identified that were not GIS genes, but suppressed the increased genome instability caused by individual query mutations. Analysis of TCGA data using the human genes predicted to act in GIS pathways revealed that a minimum of 93% of ovarian and 66% of colorectal cancer cases had defects affecting one or more predicted GIS gene. These defects included loss-of-function mutations, copy-number changes associated with reduced expression, and silencing. In contrast, acute myeloid leukaemia cases did not appear to have defects affecting the predicted GIS genes. PMID:27071721
Integrative FourD omics approach profiles the target network of the carbon storage regulatory system
Sowa, Steven W.; Gelderman, Grant; Leistra, Abigail N.; Buvanendiran, Aishwarya; Lipp, Sarah; Pitaktong, Areen; Vakulskas, Christopher A.; Romeo, Tony; Baldea, Michael
2017-01-01
Abstract Multi-target regulators represent a largely untapped area for metabolic engineering and anti-bacterial development. These regulators are complex to characterize because they often act at multiple levels, affecting proteins, transcripts and metabolites. Therefore, single omics experiments cannot profile their underlying targets and mechanisms. In this work, we used an Integrative FourD omics approach (INFO) that consists of collecting and analyzing systems data throughout multiple time points, using multiple genetic backgrounds, and multiple omics approaches (transcriptomics, proteomics and high throughput sequencing crosslinking immunoprecipitation) to evaluate simultaneous changes in gene expression after imposing an environmental stress that accentuates the regulatory features of a network. Using this approach, we profiled the targets and potential regulatory mechanisms of a global regulatory system, the well-studied carbon storage regulatory (Csr) system of Escherichia coli, which is widespread among bacteria. Using 126 sets of proteomics and transcriptomics data, we identified 136 potential direct CsrA targets, including 50 novel ones, categorized their behaviors into distinct regulatory patterns, and performed in vivo fluorescence-based follow up experiments. The results of this work validate 17 novel mRNAs as authentic direct CsrA targets and demonstrate a generalizable strategy to integrate multiple lines of omics data to identify a core pool of regulator targets. PMID:28126921
IL-17A Mediates a Selective Gene Expression Profile in Asthmatic Human Airway Smooth Muscle Cells
Dragon, Stéphane; Hirst, Stuart J.; Lee, Tak H.
2014-01-01
Airway smooth muscle (ASM) cells are thought to contribute to the pathogenesis of allergic asthma by orchestrating and perpetuating airway inflammation and remodeling responses. In this study, we evaluated the IL-17RA signal transduction and gene expression profile in ASM cells from subjects with mild asthma and healthy individuals. Human primary ASM cells were treated with IL-17A and probed by the Affymetrix GeneChip array, and gene targets were validated by real-time quantitative RT-PCR. Genomic analysis underlined the proinflammatory nature of IL-17A, as multiple NF-κB regulatory factors and chemokines were induced in ASM cells. Transcriptional regulators consisting of primary response genes were overrepresented and displayed dynamic expression profiles. IL-17A poorly enhanced IL-1β or IL-22 gene responses in ASM cells from both subjects with mild asthma and healthy donors. Interestingly, protein modifications to the NF-κB regulatory network were not observed after IL-17A stimulation, although oscillations in IκBε expression were detected. ASM cells from subjects with mild asthma up-regulated more genes with greater overall variability in response to IL-17A than from healthy donors. Finally, in response to IL-17A, ASM cells displayed rapid activation of the extracellular signal–regulated kinase/ribosomal S6 kinase signaling pathway and increased nuclear levels of phosphorylated extracellular signal–regulated kinase. Taken together, our results suggest that IL-17A mediated modest gene expression response, which, in cooperation with the NF-κB signaling network, may regulate the gene expression profile in ASM cells. PMID:24393021
Hamby, Mary E.; Coppola, Giovanni; Ao, Yan; Geschwind, Daniel H.; Khakh, Baljit S.; Sofroniew, Michael V.
2012-01-01
Inflammation features in CNS disorders such as stroke, trauma, neurodegeneration, infection, and autoimmunity in which astrocytes play critical roles. To elucidate how inflammatory mediators alter astrocyte functions, we examined effects of transforming growth factor-β1 (TGF-β1), lipopolysaccharide (LPS), and interferon-gamma (IFNγ), alone and in combination, on purified, mouse primary cortical astrocyte cultures. We used microarrays to conduct whole-genome expression profiling, and measured calcium signaling, which is implicated in mediating dynamic astrocyte functions. Combinatorial exposure to TGF-β1, LPS, and IFNγ significantly modulated astrocyte expression of >6800 gene probes, including >380 synergistic changes not predicted by summing individual treatment effects. Bioinformatic analyses revealed significantly and markedly upregulated molecular networks and pathways associated in particular with immune signaling and regulation of cell injury, death, growth, and proliferation. Highly regulated genes included chemokines, growth factors, enzymes, channels, transporters, and intercellular and intracellular signal transducers. Notably, numerous genes for G-protein-coupled receptors (GPCRs) and G-protein effectors involved in calcium signaling were significantly regulated, mostly down (for example, Cxcr4, Adra2a, Ednra, P2y1, Gnao1, Gng7), but some up (for example, P2y14, P2y6, Ccrl2, Gnb4). We tested selected cases and found that changes in GPCR gene expression were accompanied by significant, parallel changes in astrocyte calcium signaling evoked by corresponding GPCR-specific ligands. These findings identify pronounced changes in the astrocyte transcriptome induced by TGF-β1, LPS, and IFNγ, and show that these inflammatory stimuli upregulate astrocyte molecular networks associated with immune- and injury-related functions and significantly alter astrocyte calcium signaling stimulated by multiple GPCRs. PMID:23077035
Robust Learning of High-dimensional Biological Networks with Bayesian Networks
NASA Astrophysics Data System (ADS)
Nägele, Andreas; Dejori, Mathäus; Stetter, Martin
Structure learning of Bayesian networks applied to gene expression data has become a potentially useful method to estimate interactions between genes. However, the NP-hardness of Bayesian network structure learning renders the reconstruction of the full genetic network with thousands of genes unfeasible. Consequently, the maximal network size is usually restricted dramatically to a small set of genes (corresponding with variables in the Bayesian network). Although this feature reduction step makes structure learning computationally tractable, on the downside, the learned structure might be adversely affected due to the introduction of missing genes. Additionally, gene expression data are usually very sparse with respect to the number of samples, i.e., the number of genes is much greater than the number of different observations. Given these problems, learning robust network features from microarray data is a challenging task. This chapter presents several approaches tackling the robustness issue in order to obtain a more reliable estimation of learned network features.
Kcne2 Deletion Creates a Multisystem Syndrome Predisposing to Sudden Cardiac Death
Hu, Zhaoyang; Kant, Ritu; Anand, Marie; King, Elizabeth C.; Krogh-Madsen, Trine; Christini, David J.; Abbott, Geoffrey W.
2014-01-01
Background Sudden cardiac death (SCD) is the leading global cause of mortality, exhibiting increased incidence in diabetics. Ion channel gene perturbations provide a well-established ventricular arrhythmogenic substrate for SCD. However, most arrhythmia susceptibility genes - including the KCNE2 K+ channel β subunit - are expressed in multiple tissues, suggesting potential multiplex SCD substrates. Methods and Results Using “whole transcript” transcriptomics, we uncovered cardiac angiotensinogen upregulation and remodeling of cardiac angiotensinogen interaction networks in P21 Kcne2−/− mouse pups, and adrenal remodeling consistent with metabolic syndrome in adult Kcne2−/− mice. This led to the discovery that Kcne2 disruption causes multiple acknowledged SCD substrates of extracardiac origin: diabetes, hypercholesterolemia, hyperkalemia, anemia and elevated angiotensin II. Kcne2 deletion was also prerequisite for aging-dependent QT prolongation, ventricular fibrillation and SCD immediately following transient ischemia, and fasting-dependent hypoglycemia, myocardial ischemia and atrioventricular block. Conclusions Disruption of a single, widely expressed arrhythmia susceptibility gene can generate a multisystem syndrome comprising manifold electrical and systemic substrates and triggers of SCD. This paradigm is expected to apply to other arrhythmia susceptibility genes, the majority of which encode ubiquitously expressed ion channel subunits or regulatory proteins. PMID:24403551
Genetic association of impulsivity in young adults: a multivariate study
Khadka, S; Narayanan, B; Meda, S A; Gelernter, J; Han, S; Sawyer, B; Aslanzadeh, F; Stevens, M C; Hawkins, K A; Anticevic, A; Potenza, M N; Pearlson, G D
2014-01-01
Impulsivity is a heritable, multifaceted construct with clinically relevant links to multiple psychopathologies. We assessed impulsivity in young adult (N~2100) participants in a longitudinal study, using self-report questionnaires and computer-based behavioral tasks. Analysis was restricted to the subset (N=426) who underwent genotyping. Multivariate association between impulsivity measures and single-nucleotide polymorphism data was implemented using parallel independent component analysis (Para-ICA). Pathways associated with multiple genes in components that correlated significantly with impulsivity phenotypes were then identified using a pathway enrichment analysis. Para-ICA revealed two significantly correlated genotype–phenotype component pairs. One impulsivity component included the reward responsiveness subscale and behavioral inhibition scale of the Behavioral-Inhibition System/Behavioral-Activation System scale, and the second impulsivity component included the non-planning subscale of the Barratt Impulsiveness Scale and the Experiential Discounting Task. Pathway analysis identified processes related to neurogenesis, nervous system signal generation/amplification, neurotransmission and immune response. We identified various genes and gene regulatory pathways associated with empirically derived impulsivity components. Our study suggests that gene networks implicated previously in brain development, neurotransmission and immune response are related to impulsive tendencies and behaviors. PMID:25268255
Kwiatkowska, Rachel M.; Platt, Naomi; Poupardin, Rodolphe; Irving, Helen; Dabire, Roch K.; Mitchell, Sara; Jones, Christopher M.; Diabaté, Abdoulaye; Ranson, Hilary; Wondji, Charles S.
2013-01-01
With the exception of target site mutations, insecticide resistance mechanisms in the principal malaria vector Anopheles gambiae remain largely uncharacterized in Burkina Faso. Here we detected a high prevalence of resistance in Vallée du Kou (VK) to pyrethroids, DDT and dieldrin, a moderate level for carbamates and full susceptibility to organophosphates. High frequencies of L1014F kdr (75%) and Rdl (87%) mutations were observed, showing strong correlation with pyrethroids/DDT and dieldrin resistance. The frequency of the ace1R mutation was low even in carbamate resistant mosquitoes. Microarray analysis identified genes significantly over-transcribed in VK. These include the cytochrome P450 genes, CYP6P3 and CYP6Z2, previously associated with pyrethroid resistance. Gene Ontology (GO) enrichment analysis suggested that elevated neurotransmitter activity is associated with resistance, with the over-transcription of target site resistance genes such as acetylcholinesterase and the GABA receptor. A rhodopsin receptor gene previously associated with pyrethroid resistance in Culex pipiens pallens was also over-transcribed in VK. This study highlights the complex network of mechanisms conferring multiple resistance in malaria vectors, and such information should be taken into account when designing and implementing resistance control strategies. PMID:23380570
Li, Jin; Wang, Limei; Guo, Maozu; Zhang, Ruijie; Dai, Qiguo; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Xuan, Ping; Zhang, Mingming
2015-01-01
In humans, despite the rapid increase in disease-associated gene discovery, a large proportion of disease-associated genes are still unknown. Many network-based approaches have been used to prioritize disease genes. Many networks, such as the protein-protein interaction (PPI), KEGG, and gene co-expression networks, have been used. Expression quantitative trait loci (eQTLs) have been successfully applied for the determination of genes associated with several diseases. In this study, we constructed an eQTL-based gene-gene co-regulation network (GGCRN) and used it to mine for disease genes. We adopted the random walk with restart (RWR) algorithm to mine for genes associated with Alzheimer disease. Compared to the Human Protein Reference Database (HPRD) PPI network alone, the integrated HPRD PPI and GGCRN networks provided faster convergence and revealed new disease-related genes. Therefore, using the RWR algorithm for integrated PPI and GGCRN is an effective method for disease-associated gene mining.
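The random walk with restart (RWR) algorithm named in the abstract above can be sketched on a small weighted graph. This is a generic textbook form of RWR, not the authors' implementation: the gene labels, edge weights, and restart probability are hypothetical, and the graph is assumed to be undirected with every node present as a key.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adjacency maps each node to a dict {neighbor: edge weight}. A walker at
    node u moves to neighbor v with probability weight(u, v) / strength(u).
    """
    nodes = sorted(adjacency)
    strength = {u: sum(adjacency[u].values()) for u in nodes}
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            if strength[u] == 0:
                continue  # isolated node: nothing to propagate in this sketch
            for v, w in adjacency[u].items():
                nxt[v] += (1.0 - restart) * p[u] * w / strength[u]
        change = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical integrated network; G1 plays the role of a known disease gene.
network = {
    "G1": {"G2": 1.0, "G3": 0.5},
    "G2": {"G1": 1.0, "G3": 1.0},
    "G3": {"G1": 0.5, "G2": 1.0, "G4": 1.0},
    "G4": {"G3": 1.0, "G5": 1.0},
    "G5": {"G4": 1.0},
}

scores = random_walk_with_restart(network, seeds={"G1"})
ranking = sorted(scores, key=scores.get, reverse=True)
print("candidate ranking:", ranking)
```

Candidate genes are ranked by their steady-state visiting probability, so nodes that are close to the seed genes in the network score higher. Integrating a second network, as the study does with the PPI and GGCRN networks, would change the adjacency structure passed to the function rather than the iteration itself.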
Wang, Chun-Hua; Zhong, Yi; Zhang, Yan; Liu, Jin-Ping; Wang, Yue-Fei; Jia, Wei-Na; Wang, Guo-Cai; Li, Zheng; Zhu, Yan; Gao, Xiu-Mei
2016-02-01
Chinese medicine is known to treat complex diseases with multiple components and multiple targets. However, the main effective components and their related key targets and functions remain to be identified. Herein, a network analysis method was developed to identify the main effective components and key targets of a Chinese medicine, Lianhua-Qingwen Formula (LQF). The LQF is commonly used for the prevention and treatment of viral influenza in China. It is composed of 11 herbs, gypsum and menthol with 61 compounds being identified in our previous work. In this paper, these 61 candidate compounds were used to find their related targets and construct the predicted-target (PT) network. An influenza-related protein-protein interaction (PPI) network was constructed and integrated with the PT network. Then the compound-effective target (CET) network and compound-ineffective target network (CIT) were extracted, respectively. A novel approach was developed to identify effective components by comparing CET and CIT networks. As a result, 15 main effective components were identified along with 61 corresponding targets. 7 of these main effective components were further experimentally validated to have antivirus efficacy in vitro. The main effective component-target (MECT) network was further constructed with main effective components and their key targets. Gene Ontology (GO) analysis of the MECT network predicted key functions such as NO production being modulated by the LQF. Interestingly, five effective components were experimentally tested and exhibited inhibitory effects on NO production in the LPS induced RAW 264.7 cell. In summary, we have developed a novel approach to identify the main effective components in a Chinese medicine LQF and experimentally validated some of the predictions.
A hybrid network-based method for the detection of disease-related genes
NASA Astrophysics Data System (ADS)
Cui, Ying; Cai, Meng; Dai, Yang; Stanley, H. Eugene
2018-02-01
Detecting disease-related genes is crucial in disease diagnosis and drug design. The accepted view is that neighbors of a disease-causing gene in a molecular network tend to cause the same or similar diseases, and network-based methods have been recently developed to identify novel hereditary disease-genes in available biomedical networks. Despite the steady increase in the discovery of disease-associated genes, there is still a large fraction of disease genes that remains under the tip of the iceberg. In this paper we exploit the topological properties of the protein-protein interaction (PPI) network to detect disease-related genes. We compute, analyze, and compare the topological properties of disease genes with non-disease genes in PPI networks. We also design an improved random forest classifier based on these network topological features, and a cross-validation test confirms that our method performs better than previous similar studies.
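As a rough illustration of what computing topological properties of genes in a PPI network can look like, the sketch below derives two common node-level features, degree and local clustering coefficient. The abstract does not list which properties the authors used, so the choice of features, the function name and the toy network are assumptions; the resulting feature dictionaries would be the kind of input a classifier such as a random forest could consume.

```python
from itertools import combinations


def topological_features(adj):
    """Return degree and local clustering coefficient for every node.

    adj : dict mapping node -> set of neighbouring nodes (undirected network)
    """
    features = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        # Count edges that exist between pairs of this node's neighbours.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        # Fraction of possible neighbour pairs that are actually connected.
        clustering = 2.0 * links / (k * (k - 1)) if k > 1 else 0.0
        features[node] = {"degree": k, "clustering": clustering}
    return features


# Hypothetical toy network: a triangle A-B-C with an extra node D attached to A.
toy_ppi = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
toy_features = topological_features(toy_ppi)
```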
Gene network biological validity based on gene-gene interaction relevance.
Gómez-Vela, Francisco; Díaz-Díaz, Norberto
2014-01-01
In recent years, gene networks have become one of the most useful tools for modeling biological processes. Many gene network inference algorithms have been developed as techniques for extracting knowledge from gene expression data. Ensuring the reliability of the inferred gene relationships is a crucial task in any study in order to prove that the algorithms used are precise. Usually, this validation process can be carried out using prior biological knowledge. The metabolic pathways stored in KEGG are one of the most widely used knowledge sources for analyzing relationships between genes. This paper introduces a new methodology, GeneNetVal, to assess the biological validity of gene networks based on the relevance of the gene-gene interactions stored in KEGG metabolic pathways. Hence, a complete KEGG pathway conversion into a gene association network and a new matching distance based on gene-gene interaction relevance are proposed. The performance of GeneNetVal was established with three different experiments. Firstly, our proposal is tested in a comparative ROC analysis. Secondly, a randomness study is presented to show the behavior of GeneNetVal when the noise is increased in the input network. Finally, the ability of GeneNetVal to detect biological functionality of the network is shown.
Ritchie, Marylyn D; White, Bill C; Parker, Joel S; Hahn, Lance W; Moore, Jason H
2003-01-01
Background Appropriate definition of neural network architecture prior to data analysis is crucial for successful data mining. This can be challenging when the underlying model of the data is unknown. The goal of this study was to determine whether optimizing neural network architecture using genetic programming as a machine learning strategy would improve the ability of neural networks to model and detect nonlinear interactions among genes in studies of common human diseases. Results Using simulated data, we show that a genetic programming optimized neural network approach is able to model gene-gene interactions as well as a traditional back propagation neural network. Furthermore, the genetic programming optimized neural network is better than the traditional back propagation neural network approach in terms of predictive ability and power to detect gene-gene interactions when non-functional polymorphisms are present. Conclusion This study suggests that a machine learning strategy for optimizing neural network architecture may be preferable to traditional trial-and-error approaches for the identification and characterization of gene-gene interactions in common, complex human diseases. PMID:12846935
Genome-wide network analysis of Wnt signaling in three pediatric cancers
NASA Astrophysics Data System (ADS)
Bao, Ju; Lee, Ho-Jin; Zheng, Jie J.
2013-10-01
Genomic structural alteration is common in pediatric cancers, and analysis of data generated by the Pediatric Cancer Genome Project reveals such tumor-related alterations in many Wnt signaling-associated genes. Most pediatric cancers are thought to arise within developing tissues that undergo substantial expansion during early organ formation, growth and maturation, and Wnt signaling plays an important role in this development. We examined three pediatric tumors--medulloblastoma, early T-cell precursor acute lymphoblastic leukemia, and retinoblastoma--that show multiple genomic structural variations within Wnt signaling pathways. We mathematically modeled this pathway to investigate the effects of cancer-related structural variations on Wnt signaling. Surprisingly, we found that an outcome measure of canonical Wnt signaling was consistently similar in matched cancer cells and normal cells, even in the context of different cancers, different mutations, and different Wnt-related genes. Our results suggest that the cancer cells maintain a normal level of Wnt signaling by developing multiple mutations.
Gene networks are rapidly growing in size and number, raising the question of which networks are most appropriate for particular applications. Here, we evaluate 21 human genome-wide interaction networks for their ability to recover 446 disease gene sets identified through literature curation, gene expression profiling, or genome-wide association studies. While all networks have some ability to recover disease genes, we observe a wide range of performance with STRING, ConsensusPathDB, and GIANT networks having the best performance overall.
A physical mechanism of cancer heterogeneity
NASA Astrophysics Data System (ADS)
Chen, Cong; Wang, Jin
2016-02-01
We studied a core cancer gene regulatory network motif to uncover possible sources of cancer heterogeneity arising from epigenetics. When the time scale of the protein regulation to the gene is faster than that of protein synthesis and degradation (adiabatic regime), a normal state, a cancer state and an intermediate premalignant state emerge. Due to epigenetic processes such as DNA methylation and histone remodification, the time scale of the protein regulation to the gene can be slower than or comparable to that of protein synthesis and degradation (non-adiabatic regime). In this case, many more states emerge as possible phenotype alterations. This gives the origin of the heterogeneity. The cancer heterogeneity is reflected in the emergence of more phenotypic states, larger protein concentration fluctuations, wider kinetic distributions and multiplicity of kinetic paths from the normal to the cancer state, higher energy cost per gene switching, and weaker stability.
Takahashi, Kei-ichiro; Takigawa, Ichigaku; Mamitsuka, Hiroshi
2013-01-01
Detecting biclusters from expression data is useful, since biclusters are coexpressed genes under only part of all given experimental conditions. We present a software called SiBIC, which from a given expression dataset, first exhaustively enumerates biclusters, which are then merged into rather independent biclusters, which finally are used to generate gene set networks, in which a gene set assigned to one node has coexpressed genes. We evaluated each step of this procedure: 1) significance of the generated biclusters biologically and statistically, 2) biological quality of merged biclusters, and 3) biological significance of gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp.
Prediction of Oncogenic Interactions and Cancer-Related Signaling Networks Based on Network Topology
Acencio, Marcio Luis; Bovolenta, Luiz Augusto; Camilo, Esther; Lemke, Ney
2013-01-01
Cancer has been increasingly recognized as a systems biology disease since many investigators have demonstrated that this malignant phenotype emerges from abnormal protein-protein, regulatory and metabolic interactions induced by simultaneous structural and regulatory changes in multiple genes and pathways. Therefore, the identification of oncogenic interactions and cancer-related signaling networks is crucial for better understanding cancer. As experimental techniques for determining such interactions and signaling networks are labor-intensive and time-consuming, the development of a computational approach capable of accomplishing this task would be of great value. For this purpose, we present here a novel computational approach based on network topology and machine learning capable of predicting oncogenic interactions and extracting relevant cancer-related signaling subnetworks from an integrated network of human gene interactions (INHGI). This approach, called graph2sig, is twofold: first, it assigns oncogenic scores to all interactions in the INHGI and then these oncogenic scores are used as edge weights to extract oncogenic signaling subnetworks from INHGI. Regarding the prediction of oncogenic interactions, we showed that graph2sig is able to recover 89% of known oncogenic interactions with a precision of 77%. Moreover, the interactions that received high oncogenic scores are enriched in genes for which mutations have been causally implicated in cancer. We also demonstrated that graph2sig is potentially useful in extracting oncogenic signaling subnetworks: more than 80% of constructed subnetworks contain more than 50% of original interactions in their corresponding oncogenic linear pathways present in the KEGG PATHWAY database. In addition, the potential oncogenic signaling subnetworks discovered by graph2sig are supported by experimental evidence.
Taken together, these results suggest that graph2sig can be a useful tool for investigators involved in cancer research interested in detecting signaling networks most prone to contribute to the emergence of the malignant phenotype. PMID:24204854
Spectraplakins: Master orchestrators of cytoskeletal dynamics
Suozzi, Kathleen C.; Wu, Xiaoyang
2012-01-01
The dynamics of different cytoskeletal networks are coordinated to bring about many fundamental cellular processes, from neuronal pathfinding to cell division. Increasing evidence points to the importance of spectraplakins in integrating cytoskeletal networks. Spectraplakins are evolutionarily conserved giant cytoskeletal cross-linkers, which belong to the spectrin superfamily. Their genes consist of multiple promoters and many exons, yielding a vast array of differential splice forms with distinct functions. Spectraplakins are also unique in their ability to associate with all three elements of the cytoskeleton: F-actin, microtubules, and intermediate filaments. Recent studies have begun to unveil their role in a wide range of processes, from cell migration to tissue integrity. PMID:22584905
Koda, Satoru; Onda, Yoshihiko; Matsui, Hidetoshi; Takahagi, Kotaro; Yamaguchi-Uehara, Yukiko; Shimizu, Minami; Inoue, Komaki; Yoshida, Takuhiro; Sakurai, Tetsuya; Honda, Hiroshi; Eguchi, Shinto; Nishii, Ryuei; Mochida, Keiichi
2017-01-01
We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX) model with a group smoothly clipped absolute deviation (SCAD) method using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals using three biological replications, and identified 3,621 periodic genes through our wavelet analysis. The expression data are suitable for inferring sparse networks based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, and post-transcriptional modification and photosynthesis are significantly enriched in the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified putative transcription factor-encoding genes that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that the networks show a typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, demonstrating the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.
Adebola, Adijat A; Di Castri, Theo; He, Chui-Zhen; Salvatierra, Laura A; Zhao, Jian; Brown, Kristy; Lin, Chyuan-Sheng; Worman, Howard J; Liem, Ronald K H
2015-04-15
Charcot-Marie-Tooth disease (CMT) is the most commonly inherited neurological disorder with a prevalence of 1 in 2500 people worldwide. Patients suffer from degeneration of the peripheral nerves that control sensory information of the foot/leg and hand/arm. Multiple mutations in the neurofilament light polypeptide gene, NEFL, cause CMT2E. Previous studies in transfected cells showed that expression of disease-associated neurofilament light chain variants results in abnormal intermediate filament networks associated with defects in axonal transport. We have now generated knock-in mice with two different point mutations in Nefl: P8R that has been reported in multiple families with variable age of onset and N98S that has been described as an early-onset, sporadic mutation in multiple individuals. Nefl(P8R/+) and Nefl(P8R/P8R) mice were indistinguishable from Nefl(+/+) in terms of behavioral phenotype. In contrast, Nefl(N98S/+) mice had a noticeable tremor, and most animals showed a hindlimb clasping phenotype. Immunohistochemical analysis revealed multiple inclusions in the cell bodies and proximal axons of spinal cord neurons, disorganized processes in the cerebellum and abnormal processes in the cerebral cortex and pons. Abnormal processes were observed as early as post-natal day 7. Electron microscopic analysis of sciatic nerves showed a reduction in the number of neurofilaments, an increase in the number of microtubules and a decrease in the axonal diameters. The Nefl(N98S/+) mice provide an excellent model to study the pathogenesis of CMT2E and should prove useful for testing potential therapies. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Transcriptional Analysis of Aggressiveness and Heterogeneity across Grades of Astrocytomas
Wang, Chunjing; Funk, Cory C.; Eddy, James A.; Price, Nathan D.
2013-01-01
Astrocytoma is the most common glioma, accounting for half of all primary brain and spinal cord tumors. Late detection and the aggressive nature of high-grade astrocytomas contribute to high mortality rates. Though many studies identify candidate biomarkers using high-throughput transcriptomic profiling to stratify grades and subtypes, few have resulted in clinically actionable results. This shortcoming can be attributed, in part, to pronounced lab effects that reduce signature robustness and varied individual gene expression among patients with the same tumor. We addressed these issues by uniformly preprocessing publicly available transcriptomic data, comprising 306 tumor samples from three astrocytoma grades (Grade 2, 3, and 4) and 30 non-tumor samples (normal brain as control tissues). Utilizing Differential Rank Conservation (DIRAC), a network-based classification approach, we examined the global and individual patterns of network regulation across tumor grades. Additionally, we applied gene-based approaches to identify genes whose expression changed consistently with increasing tumor grade and evaluated their robustness across multiple studies using statistical sampling. Applying DIRAC, we observed a global trend of greater network dysregulation with increasing tumor aggressiveness. Individual networks displaying greater differences in regulation between adjacent grades play well-known roles in calcium/PKC, EGF, and transcription signaling. Interestingly, many of the 90 individual genes found to monotonically increase or decrease with astrocytoma grade are implicated in cancer-affected processes such as calcium signaling, mitochondrial metabolism, and apoptosis. The fact that specific genes monotonically increase or decrease with increasing astrocytoma grade may reflect shared oncogenic mechanisms among phenotypically similar tumors. 
This work presents statistically significant results that enable better characterization of different human astrocytoma grades and hopefully can contribute towards improvements in diagnosis and therapy choices. Our results also identify a number of testable hypotheses relating to astrocytoma etiology that may prove helpful in developing much-needed biomarkers for earlier disease detection. PMID:24146911
Yutin, Natalya; Raoult, Didier; Koonin, Eugene V
2013-05-23
Recent advances of genomics and metagenomics reveal remarkable diversity of viruses and other selfish genetic elements. In particular, giant viruses have been shown to possess their own mobilomes that include virophages, small viruses that parasitize on giant viruses of the Mimiviridae family, and transpovirons, distinct linear plasmids. One of the virophages known as the Mavirus, a parasite of the giant Cafeteria roenbergensis virus, shares several genes with large eukaryotic self-replicating transposon of the Polinton (Maverick) family, and it has been proposed that the polintons evolved from a Mavirus-like ancestor. We performed a comprehensive phylogenomic analysis of the available genomes of virophages and traced the evolutionary connections between the virophages and other selfish genetic elements. The comparison of the gene composition and genome organization of the virophages reveals 6 conserved, core genes that are organized in partially conserved arrays. Phylogenetic analysis of those core virophage genes, for which a sufficient diversity of homologs outside the virophages was detected, including the maturation protease and the packaging ATPase, supports the monophyly of the virophages. The results of this analysis appear incompatible with the origin of polintons from a Mavirus-like agent but rather suggest that Mavirus evolved through recombination between a polinton and an unknown virus. Altogether, virophages, polintons, a distinct Tetrahymena transposable element Tlr1, transpovirons, adenoviruses, and some bacteriophages form a network of evolutionary relationships that is held together by overlapping sets of shared genes and appears to represent a distinct module in the vast total network of viruses and mobile elements. 
The results of the phylogenomic analysis of the virophages and related genetic elements are compatible with the concept of network-like evolution of the virus world and emphasize multiple evolutionary connections between bona fide viruses and other classes of capsid-less mobile elements.
2013-01-01
Background Recent advances of genomics and metagenomics reveal remarkable diversity of viruses and other selfish genetic elements. In particular, giant viruses have been shown to possess their own mobilomes that include virophages, small viruses that parasitize on giant viruses of the Mimiviridae family, and transpovirons, distinct linear plasmids. One of the virophages known as the Mavirus, a parasite of the giant Cafeteria roenbergensis virus, shares several genes with large eukaryotic self-replicating transposon of the Polinton (Maverick) family, and it has been proposed that the polintons evolved from a Mavirus-like ancestor. Results We performed a comprehensive phylogenomic analysis of the available genomes of virophages and traced the evolutionary connections between the virophages and other selfish genetic elements. The comparison of the gene composition and genome organization of the virophages reveals 6 conserved, core genes that are organized in partially conserved arrays. Phylogenetic analysis of those core virophage genes, for which a sufficient diversity of homologs outside the virophages was detected, including the maturation protease and the packaging ATPase, supports the monophyly of the virophages. The results of this analysis appear incompatible with the origin of polintons from a Mavirus-like agent but rather suggest that Mavirus evolved through recombination between a polinton and an unknown virus. Altogether, virophages, polintons, a distinct Tetrahymena transposable element Tlr1, transpovirons, adenoviruses, and some bacteriophages form a network of evolutionary relationships that is held together by overlapping sets of shared genes and appears to represent a distinct module in the vast total network of viruses and mobile elements.
Conclusions The results of the phylogenomic analysis of the virophages and related genetic elements are compatible with the concept of network-like evolution of the virus world and emphasize multiple evolutionary connections between bona fide viruses and other classes of capsid-less mobile elements. PMID:23701946
Turan, Nil; Soulet, Fabienne; Mohd Zahari, Maihafizah; Ryan, Katie R.; Durant, Sarah; He, Shan; Herbert, John; Ankers, John; Heath, John K.; Bjerkvig, Rolf; Bicknell, Roy; Hotchin, Neil A.; Bikfalvi, Andreas; Falciani, Francesco
2015-01-01
Gliomas are a highly heterogeneous group of brain tumours that are refractory to treatment, highly invasive and pro-angiogenic. Glioblastoma patients have an average survival time of less than 15 months. Understanding the molecular basis of different grades of glioma, from well differentiated, low-grade tumours to high-grade tumours, is a key step in defining new therapeutic targets. Here we use a data-driven approach to learn the structure of gene regulatory networks from observational data and use the resulting models to formulate hypotheses on the molecular determinants of glioma stage. Remarkably, integration of available knowledge with functional genomics datasets representing clinical and pre-clinical studies reveals important properties within the regulatory circuits controlling low and high-grade glioma. Our analyses first show that low and high-grade gliomas are characterised by a switch in activity of two subsets of Rho GTPases. The first one is involved in maintaining normal glial cell function, while the second is linked to the establishment of multiple hallmarks of cancer. Next, the development and application of a novel data integration methodology reveals novel functions of RND3 in controlling glioma cell migration, invasion, proliferation, angiogenesis and clinical outcome. PMID:26132659
Lobel, Lior; Herskovits, Anat A.
2016-01-01
Bacteria sense and respond to many environmental cues, rewiring their regulatory network to facilitate adaptation to new conditions/niches. Global transcription factors that co-regulate multiple pathways simultaneously are essential to this regulatory rewiring. CodY is one such global regulator, controlling expression of both metabolic and virulence genes in Gram-positive bacteria. Branched-chain amino acids (BCAAs) serve as ligands for CodY and modulate its activity. Classically, CodY was considered to function primarily as a repressor under rich growth conditions. However, our previous studies of the bacterial pathogen Listeria monocytogenes revealed that CodY is active also when the bacteria are starved for BCAAs. Under these conditions, CodY loses the ability to repress genes (e.g., metabolic genes) and functions as a direct activator of the master virulence regulator gene, prfA. This observation raised the possibility that CodY possesses multiple functions that allow it to coordinate gene expression across a wide spectrum of metabolic growth conditions, and thus better adapt bacteria to the mammalian niche. To gain a deeper understanding of CodY’s regulatory repertoire and identify direct target genes, we performed a genome wide analysis of the CodY regulon and DNA binding under both rich and minimal growth conditions, using RNA-Seq and ChIP-Seq techniques. We demonstrate here that CodY is indeed active (i.e., binds DNA) under both conditions, serving as a repressor and activator of different genes. Further, we identified new genes and pathways that are directly regulated by CodY (e.g., sigB, arg, his, actA, glpF, gadG, gdhA, poxB, glnR and fla genes), integrating metabolism, stress responses, motility and virulence in L. monocytogenes. This study establishes CodY as a multifaceted factor regulating L. monocytogenes physiology in a highly versatile manner. PMID:26895237
Degrees of separation as a statistical tool for evaluating candidate genes.
Nelson, Ronald M; Pettersson, Mats E
2014-12-01
Selection of candidate genes is an important step in the exploration of complex genetic architecture. The number of gene networks available is increasing and these can provide information to help with candidate gene selection. It is currently common to use the degree of connectedness in gene networks as validation in Genome Wide Association (GWA) and Quantitative Trait Locus (QTL) mapping studies. However, this can give misleading results if not validated properly. Here we present a method and tool for validating the gene pairs from GWA studies given the context of the network they co-occur in. It ensures that proposed interactions and gene associations are not statistical artefacts inherent to the specific gene network architecture. The CandidateBacon package provides an easy and efficient method to calculate the average degree of separation (DoS) between pairs of genes in currently available gene networks. We show how these empirical estimates of average connectedness are used to validate candidate gene pairs. Validation of interacting genes by comparing their connectedness with the average connectedness in the gene network will provide support for said interactions by utilising the growing amount of gene network information available. Copyright © 2014 Elsevier Ltd. All rights reserved.
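The idea of comparing the degree of separation of candidate gene pairs with the network-wide average can be sketched as below. This is not the CandidateBacon API: the function names and the toy network are hypothetical, the network is assumed to be an undirected, unweighted adjacency dictionary, and DoS is taken here to mean the shortest-path length in edges found by breadth-first search.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_degree_of_separation(adj, pairs):
    """Average DoS over the given gene pairs, ignoring unconnected pairs."""
    lengths = []
    for a, b in pairs:
        d = bfs_distances(adj, a).get(b)
        if d is not None:
            lengths.append(d)
    return sum(lengths) / len(lengths) if lengths else None


def network_mean_dos(adj):
    """Baseline: average DoS over all connected pairs in the network."""
    return mean_degree_of_separation(adj, combinations(sorted(adj), 2))


# Hypothetical toy network: a chain of four genes.
toy_network = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
candidate_dos = mean_degree_of_separation(toy_network, [("A", "B"), ("C", "D")])
baseline_dos = network_mean_dos(toy_network)
```

Candidate pairs whose average DoS is clearly below the network baseline would be better connected than expected for that network architecture. A real analysis would also need a significance test, for example against randomly sampled pairs, which this sketch leaves out.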
Zhang, Ting; Huang, Liyu; Wang, Yinxiao; Wang, Wensheng; Zhao, Xiuqin; Zhang, Shilai; Zhang, Jing; Hu, Fengyi; Fu, Binying; Li, Zhikang
2017-01-01
Rice (Oryza sativa) is very sensitive to chilling stress at seedling and reproductive stages, whereas wild rice, O. longistaminata, tolerates non-freezing cold temperatures and has overwintering ability. Elucidating the molecular mechanisms of chilling tolerance (CT) in O. longistaminata should thus provide a basis for rice CT improvement through molecular breeding. In this study, high-throughput RNA sequencing was performed to profile global transcriptome alterations and crucial genes involved in response to long-term low temperature in O. longistaminata shoots and rhizomes subjected to 7 days of chilling stress. A total of 605 and 403 genes were respectively identified as up- and down-regulated in O. longistaminata under 7 days of chilling stress, with 354 and 371 differentially expressed genes (DEGs) found exclusively in shoots and rhizomes, respectively. GO enrichment and KEGG pathway analyses revealed that multiple transcriptional regulatory pathways were enriched in commonly induced genes in both tissues; in contrast, only the photosynthesis pathway was prevalent in genes uniquely induced in shoots, whereas several key metabolic pathways and the programmed cell death process were enriched in genes induced only in rhizomes. Further analysis of these tissue-specific DEGs showed that the CBF/DREB1 regulon and other transcription factors (TFs), including AP2/EREBPs, MYBs, and WRKYs, were synergistically involved in transcriptional regulation of chilling stress response in shoots. Different sets of TFs, such as OsERF922, OsNAC9, OsWRKY25, and WRKY74, and eight genes encoding antioxidant enzymes were exclusively activated in rhizomes under long-term low-temperature treatment.
Furthermore, several cis-regulatory elements, including the ICE1-binding site, the GATA element for phytochrome regulation, and the W-box for WRKY binding, were highly abundant in both tissues, confirming the involvement of multiple regulatory genes and complex networks in the transcriptional regulation of CT in O. longistaminata. Finally, most chilling-induced genes with alternative splicing exclusive to shoots were associated with photosynthesis and regulation of gene expression, while those enriched in rhizomes were primarily related to stress signal transduction; this indicates that tissue-specific transcriptional and post-transcriptional regulation mechanisms synergistically contribute to O. longistaminata long-term CT. Our findings provide an overview of the complex regulatory networks of CT in O. longistaminata. PMID:29190752
Mezlini, Aziz M; Goldenberg, Anna
2017-10-01
Discovering genetic mechanisms driving complex diseases is a hard problem. Existing methods often lack power to identify the set of responsible genes. Protein-protein interaction networks have been shown to boost power when detecting gene-disease associations. We introduce a Bayesian framework, Conflux, to find disease-associated genes from exome sequencing data using networks as a prior. There are two main advantages to using networks within a probabilistic graphical model. First, networks are noisy and incomplete, a substantial impediment to gene discovery. Incorporating networks into the structure of a probabilistic model for gene inference makes the solution less sensitive to network noise than relying on the noisy network structure directly. Second, using a Bayesian framework we can keep track of the uncertainty of each gene being associated with the phenotype rather than returning a fixed list of genes. We first show that using networks clearly improves gene detection compared to individual gene testing. We then show consistently improved performance of Conflux compared to the state-of-the-art diffusion network-based method HotNet2 and a variety of other network and variant aggregation methods, using randomly generated and literature-reported gene sets. We test HotNet2 and Conflux on several network configurations to reveal biases and patterns of false positives and false negatives in each case. Our experiments show that our novel Bayesian framework Conflux incorporates many of the advantages of the current state-of-the-art methods, while offering more flexibility and improved power in many gene-disease association scenarios.
Discovering disease-associated genes in weighted protein-protein interaction networks
NASA Astrophysics Data System (ADS)
Cui, Ying; Cai, Meng; Stanley, H. Eugene
2018-04-01
Although there have been many network-based attempts to discover disease-associated genes, most of them have not taken edge weight, which quantifies the relative strength of an interaction, into consideration. We use connection weights in a protein-protein interaction (PPI) network to locate disease-related genes. We analyze the topological properties of both weighted and unweighted PPI networks and design an improved random forest classifier to distinguish disease genes from non-disease genes. We use a cross-validation test to confirm that weighted networks are better able to discover disease-associated genes than unweighted networks, which indicates that including link weight in the analysis of network properties provides a better model of complex genotype-phenotype associations.
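To make the weighted versus unweighted distinction concrete, the sketch below computes one unweighted topological property (degree) and its weighted counterpart (strength, the sum of incident edge weights) for each protein. The edge list and weights are hypothetical, and the abstract does not say which topological features the authors actually fed to their classifier.

```python
# Hypothetical weighted PPI edges: (protein_a, protein_b, interaction weight)
edges = [
    ("P1", "P2", 0.9),
    ("P1", "P3", 0.2),
    ("P2", "P3", 0.4),
    ("P3", "P4", 0.1),
]


def node_features(edge_list):
    """Return (degree, strength) dictionaries for an undirected weighted network."""
    degree, strength = {}, {}
    for a, b, weight in edge_list:
        for node in (a, b):
            degree[node] = degree.get(node, 0) + 1                 # unweighted: count of partners
            strength[node] = strength.get(node, 0.0) + weight      # weighted: sum of edge weights
    return degree, strength


degree, strength = node_features(edges)
```

In this toy network P1 and P2 have the same degree but different strength, so a classifier that sees only degree cannot tell them apart, whereas one that also sees strength can.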
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ray, Anamika; Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK 74078; Liu Jing
2010-10-15
Chlorpyrifos (CPF) is a widely used organophosphorus insecticide (OP) and putative developmental neurotoxicant in humans. The acute toxicity of CPF is elicited by acetylcholinesterase (AChE) inhibition. We characterized dose-related (0.1, 0.5, 1 and 2 mg/kg) gene expression profiles and changes in cell signaling pathways 24 h following acute CPF exposure in 7-day-old rats. Microarray experiments indicated that approximately 9% of the 44,000 genes were differentially expressed following any of the four CPF dosages studied (546, 505, 522, and 3,066 genes with 0.1, 0.5, 1 and 2 mg/kg CPF). Genes were grouped according to dose-related expression patterns using K-means clustering, while gene networks and canonical pathways were evaluated using Ingenuity Pathway Analysis®. Twenty clusters were identified and differential expression of selected genes was verified by RT-PCR. The four largest clusters (each containing from 276 to 905 genes) constituted over 50% of all differentially expressed genes and exhibited up-regulation following exposure to the highest dosage (2 mg/kg CPF). The total number of gene networks affected by CPF also rose sharply with the highest dosage of CPF (18, 16, 18 and 50 with 0.1, 0.5, 1 and 2 mg/kg CPF). Forebrain cholinesterase (ChE) activity was significantly reduced (26%) only in the highest dosage group. Based on magnitude of dose-related changes in differentially expressed genes, relative numbers of gene clusters and signaling networks affected, and forebrain ChE inhibition only at 2 mg/kg CPF, we focused subsequent analyses on this treatment group. Six canonical pathways were identified that were significantly affected by 2 mg/kg CPF (MAPK, oxidative stress, NF-κB, mitochondrial dysfunction, aryl hydrocarbon receptor and adrenergic receptor signaling).
Evaluation of different cellular functions of the differentially expressed genes suggested changes related to olfactory receptors, cell adhesion/migration, synapse/synaptic transmission and transcription/translation. Nine genes were differentially affected in all four CPF dosing groups. We conclude that the most robust, consistent changes in differential gene expression in neonatal forebrain across a range of acute CPF dosages occurred at an exposure level associated with the classical marker of OP toxicity, AChE inhibition. Disruption of multiple cellular pathways, in particular cell adhesion, may contribute to the developmental neurotoxicity potential of this pesticide.
Hybrid genetic algorithm-neural network: feature extraction for unpreprocessed microarray data.
Tong, Dong Ling; Schierz, Amanda C
2011-09-01
Suitable techniques for microarray analysis have been widely researched, particularly for the study of marker genes expressed in a specific type of cancer. Most of the machine learning methods that have been applied to significant gene selection focus on the classification ability rather than the selection ability of the method. These methods also require the microarray data to be preprocessed before analysis takes place. The objective of this study is to develop a hybrid genetic algorithm-neural network (GANN) model that emphasises feature selection and can operate on unpreprocessed microarray data. The GANN is a hybrid model where the fitness value of the genetic algorithm (GA) is based upon the number of samples correctly labelled by a standard feedforward artificial neural network (ANN). The model is evaluated by using two benchmark microarray datasets with different array platforms and differing numbers of classes (a 2-class oligonucleotide microarray dataset for acute leukaemia and a 4-class complementary DNA (cDNA) microarray dataset for small round blue cell tumours (SRBCTs)). The underlying concept of the GANN algorithm is to select highly informative genes by co-evolving both the GA fitness function and the ANN weights at the same time. The novel GANN selected approximately 50% of the same genes as the original studies. This may indicate that these common genes are more biologically significant than other genes in the datasets. The remaining 50% of the significant genes identified were used to build predictive models, and for both datasets the models based on the set of genes extracted by the GANN method produced more accurate results. The results also suggest that the GANN method not only can detect genes that are exclusively associated with a single cancer type but can also explore the genes that are differentially expressed in multiple cancer types.
The results show that the GANN model has successfully extracted statistically significant genes from the unpreprocessed microarray data as well as extracting known biologically significant genes. We also show that assessing the biological significance of genes based on classification accuracy may be misleading and that, although the GANN's set of extra genes proves to be more statistically significant than those selected by other methods, a biological assessment of these genes is highly recommended to confirm their functionality. Copyright © 2011 Elsevier B.V. All rights reserved.
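The central idea of this abstract, a genetic algorithm whose fitness is the number of samples a classifier labels correctly using only the selected genes, can be sketched as follows. To keep the example self-contained, a nearest-centroid classifier is swapped in for the feedforward ANN, so this is not the GANN model itself. The toy expression matrix, the GA settings and all function names are assumptions.

```python
import random


def count_correct(samples, labels, mask):
    """Fitness: samples correctly labelled by a nearest-centroid classifier
    (a stand-in for the ANN) using only genes whose mask bit is 1."""
    genes = [i for i, bit in enumerate(mask) if bit]
    if not genes:
        return 0
    centroids = {}
    for cls in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == cls]
        centroids[cls] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    correct = 0
    for s, l in zip(samples, labels):
        predicted = min(
            sorted(centroids),
            key=lambda c: sum((s[g] - m) ** 2 for g, m in zip(genes, centroids[c])),
        )
        correct += predicted == l
    return correct


def evolve(samples, labels, n_genes, pop_size=20, generations=30, seed=0):
    """Elitist GA over gene masks: keep the fitter half, refill by crossover plus one bit flip."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: count_correct(samples, labels, m), reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)       # single-point crossover
            child = mum[:cut] + dad[cut:]
            flip = rng.randrange(n_genes)         # mutation: flip one bit
            child[flip] = 1 - child[flip]
            children.append(child)
        population = parents + children
    return max(population, key=lambda m: count_correct(samples, labels, m))


# Hypothetical toy data: gene 0 separates the classes, genes 1-3 are noise
samples = [
    [5.0, 1.0, 7.0, 2.0],
    [5.2, 3.0, 1.0, 9.0],
    [4.8, 8.0, 4.0, 5.0],
    [1.0, 2.0, 6.0, 3.0],
    [1.1, 9.0, 2.0, 8.0],
    [0.9, 4.0, 5.0, 1.0],
]
labels = ["A", "A", "A", "B", "B", "B"]

best_mask = evolve(samples, labels, n_genes=4)
best_fitness = count_correct(samples, labels, best_mask)
```

Because fitness here is measured on the same samples used to build the centroids, it rewards classification accuracy alone. That is the limitation the abstract itself flags when it cautions against judging biological significance by classification accuracy.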
Qualitatively modelling and analysing genetic regulatory networks: a Petri net approach.
Steggles, L Jason; Banks, Richard; Shaw, Oliver; Wipat, Anil
2007-02-01
New developments in post-genomic technology now provide researchers with the data necessary to study regulatory processes in a holistic fashion at multiple levels of biological organization. One of the major challenges for the biologist is to integrate and interpret these vast data resources to gain a greater understanding of the structure and function of the molecular processes that mediate adaptive and cell-cycle-driven changes in gene expression. To achieve this, biologists require new tools and techniques that allow pathway-related data to be modelled and analysed as network structures, providing valuable insights which can then be validated and investigated in the laboratory. We propose a new technique for constructing and analysing qualitative models of genetic regulatory networks based on the Petri net formalism. We take as our starting point the Boolean network approach of treating genes as binary switches and develop a new Petri net model which uses logic minimization to automate the construction of compact qualitative models. Our approach addresses the shortcomings of Boolean networks by providing access to the wide range of existing Petri net analysis techniques and by using non-determinism to cope with incomplete and inconsistent data. The ideas we present are illustrated by a case study in which the genetic regulatory network controlling sporulation in the bacterium Bacillus subtilis is modelled and analysed. The Petri net model construction tool and the data files for the B. subtilis sporulation case study are available at http://bioinf.ncl.ac.uk/gnapn.
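A minimal sketch of how a gene treated as a binary switch can be expressed in a Petri net is given below. It assumes one common encoding, a pair of complementary places per gene (`X_on`, `X_off`), and a single hypothetical rule, "B switches on when A is on". It is not the authors' construction tool and involves no logic minimization.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in transition["inputs"].items())


def fire(marking, transition):
    """Consume tokens from input places, produce tokens in output places; returns a new marking."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, n in transition["inputs"].items():
        new_marking[place] -= n
    for place, n in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking


# Hypothetical two-gene net: each gene has complementary on/off places holding one token in total
marking = {"A_on": 1, "A_off": 0, "B_on": 0, "B_off": 1}

# Rule "B switches on when A is on": A_on is read (consumed and put back), B_off becomes B_on
activate_b = {
    "inputs": {"A_on": 1, "B_off": 1},
    "outputs": {"A_on": 1, "B_on": 1},
}

after = fire(marking, activate_b)
```

When several transitions are enabled in the same marking, any one of them may fire, which is how non-determinism enters such a model. Standard Petri net analyses such as reachability then operate on the resulting set of markings.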
Network pharmacology of JAK inhibitors
Moodley, Devapregasan; Yoshida, Hideyuki; Mostafavi, Sara; Asinovski, Natasha; Ortiz-Lopez, Adriana; Symanowicz, Peter; Telliez, Jean-Baptiste; Hegen, Martin; Clark, James D.; Mathis, Diane; Benoist, Christophe
2016-01-01
Small-molecule inhibitors of the Janus kinase family (JAKis) are clinically efficacious in multiple autoimmune diseases, albeit with increased risk of certain infections. Their precise mechanism of action is unclear, with JAKs being signaling hubs for several cytokines. We assessed the in vivo impact of pan- and isoform-specific JAKi in mice by immunologic and genomic profiling. Effects were broad across the immunogenomic network, with overlap between inhibitors. Natural killer (NK) cell and macrophage homeostasis were most immediately perturbed, with network-level analysis revealing a rewiring of coregulated modules of NK cell transcripts. The repression of IFN signature genes after repeated JAKi treatment continued even after drug clearance, with persistent changes in chromatin accessibility and phospho-STAT responsiveness to IFN. Thus, clinical use and future development of JAKi might need to balance effects on immunological networks, rather than expecting JAKis to affect a particular cytokine response, and be guided by long-lasting epigenomic modifications rather than by short-term pharmacokinetics. PMID:27516546
Mapping to Irregular Torus Topologies and Other Techniques for Petascale Biomolecular Simulation
Phillips, James C.; Sun, Yanhua; Jain, Nikhil; Bohm, Eric J.; Kalé, Laxmikant V.
2014-01-01
Currently deployed petascale supercomputers typically use toroidal network topologies in three or more dimensions. While these networks perform well for topology-agnostic codes on a few thousand nodes, leadership machines with 20,000 nodes require topology awareness to avoid network contention for communication-intensive codes. Topology adaptation is complicated by irregular node allocation shapes and holes due to dedicated input/output nodes or hardware failure. In the context of the popular molecular dynamics program NAMD, we present methods for mapping a periodic 3-D grid of fixed-size spatial decomposition domains to 3-D Cray Gemini and 5-D IBM Blue Gene/Q toroidal networks to enable hundred-million-atom full-machine simulations, and to similarly partition node allocations into compact domains for smaller simulations using multiple-copy algorithms. Additional enabling techniques are discussed and performance is reported for NCSA Blue Waters, ORNL Titan, ANL Mira, TACC Stampede, and NERSC Edison. PMID:25594075
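One reason a periodic grid of domains maps naturally onto a toroidal network is that both wrap around: neighbours across the periodic boundary can remain one hop apart. The sketch below computes hop distance on an idealised, fully regular torus. The function name and dimensions are assumptions, and the example ignores the irregular allocations and holes that the paper's methods are designed to handle.

```python
def torus_hops(a, b, dims):
    """Minimum hop count between nodes a and b on a regular torus,
    taking the shorter way round in each dimension."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


# Hypothetical 4 x 4 x 4 torus: opposite faces are adjacent thanks to the wraparound links
dims_3d = (4, 4, 4)
across_boundary = torus_hops((0, 0, 0), (3, 0, 0), dims_3d)
farthest = torus_hops((0, 0, 0), (2, 2, 2), dims_3d)

# The same function works for a 5-D torus; the dimensions here are illustrative only
dims_5d = (4, 4, 4, 4, 2)
five_d = torus_hops((0, 0, 0, 0, 0), (3, 0, 0, 0, 1), dims_5d)
```

A topology-aware mapping tries to place spatial domains that exchange data on nodes with a small `torus_hops` value, which reduces the number of links each message occupies.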
Distinct tissue-specific transcriptional regulation revealed by gene regulatory networks in maize.
Huang, Ji; Zheng, Juefei; Yuan, Hui; McGinnis, Karen
2018-06-07
Transcription factors (TFs) are proteins that can bind to DNA sequences and regulate gene expression. Many TFs are master regulators in cells that contribute to tissue-specific and cell-type-specific gene expression patterns in eukaryotes. Maize has been a model organism for over one hundred years, but little is known about its tissue-specific gene regulation through TFs. In this study, we used a network approach to elucidate gene regulatory networks (GRNs) in four tissues (leaf, root, SAM and seed) in maize. We utilized GENIE3, a machine-learning algorithm, combined with a large quantity of RNA-Seq expression data to construct four tissue-specific GRNs. Unlike some other techniques, this approach is not limited by the availability of high-quality Position Weight Matrices (PWMs), and can therefore predict GRNs for over 2000 TFs in maize. Although many TFs were expressed across multiple tissues, a multi-tiered analysis predicted tissue-specific regulatory functions for many transcription factors. Some well-studied TFs emerged within the four tissue-specific GRNs, and the GRN predictions matched expectations based upon published results for many of these examples. Our GRNs were also validated by ChIP-Seq datasets (KN1, FEA4 and O2). Key TFs were identified for each tissue and matched expectations for key regulators in each tissue, including GO enrichment and identity with known regulatory factors for that tissue. We also found functional modules in each network by clustering analysis with the MCL algorithm. By combining publicly available genome-wide expression data and network analysis, we can uncover GRNs at tissue-level resolution in maize. Since ChIP-Seq and PWMs are still limited in several model organisms, our study provides a uniform platform that can be adapted to any species with genome-wide expression data to construct GRNs. We also present a publicly available database, maize tissue-specific GRN (mGRN, https://www.bio.fsu.edu/mcginnislab/mgrn/ ), for easy querying.
All source code and data are available on GitHub ( https://github.com/timedreamer/maize_tissue-specific_GRN ).
Liu, Guiyou; Zhang, Fang; Jiang, Yongshuai; Hu, Yang; Gong, Zhongying; Liu, Shoufeng; Chen, Xiuju; Jiang, Qinghua; Hao, Junwei
2017-02-01
Much effort has been expended on identifying the genetic determinants of multiple sclerosis (MS). Existing large-scale genome-wide association study (GWAS) datasets provide strong support for using pathway and network-based analysis methods to investigate the mechanisms underlying MS. However, no shared genetic pathways have been identified to date. We hypothesize that shared genetic pathways may indeed exist in different MS-GWAS datasets. Here, we report results from a three-stage analysis of GWAS and expression datasets. In stage 1, we conducted multiple pathway analyses of two MS-GWAS datasets. In stage 2, we performed a candidate pathway analysis of the large-scale MS-GWAS dataset. In stage 3, we performed a pathway analysis using the dysregulated MS gene list from seven human MS case-control expression datasets. In stage 1, we identified 15 shared pathways. In stage 2, we successfully replicated 14 of these 15 significant pathways. In stage 3, we found that dysregulated MS genes were significantly enriched in 10 of 15 MS risk pathways identified in stages 1 and 2. We report shared genetic pathways in different MS-GWAS datasets and highlight some new MS risk pathways. Our findings provide new insights on the genetic determinants of MS.