Construction of ontology augmented networks for protein complex prediction.
Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian
2013-01-01
Protein complexes are of great importance in understanding the principles of cellular organization and function. The increase in available protein-protein interaction data, gene ontology and other resources make it possible to develop computational methods for protein complex prediction. Most existing methods focus mainly on the topological structure of protein-protein interaction networks, and largely ignore the gene ontology annotation information. In this article, we constructed ontology augmented networks with protein-protein interaction data and gene ontology, which effectively unified the topological structure of protein-protein interaction networks and the similarity of gene ontology annotations into unified distance measures. After constructing ontology augmented networks, a novel method (clustering based on ontology augmented networks) was proposed to predict protein complexes, which was capable of taking into account the topological structure of the protein-protein interaction network, as well as the similarity of gene ontology annotations. Our method was applied to two different yeast protein-protein interaction datasets and predicted many well-known complexes. The experimental results showed that (i) ontology augmented networks and the unified distance measure can effectively combine the structure closeness and gene ontology annotation similarity; (ii) our method is valuable in predicting protein complexes and has higher F1 and accuracy compared to other competing methods.
On the robustness of complex heterogeneous gene expression networks.
Gómez-Gardeñes, Jesús; Moreno, Yamir; Floría, Luis M
2005-04-01
We analyze a continuous gene expression model on the underlying topology of a complex heterogeneous network. Numerical simulations aimed at studying the chaotic and periodic dynamics of the model are performed. The results clearly indicate that there is a region in which the dynamical and structural complexity of the system avoid chaotic attractors. However, contrary to what has been reported for Random Boolean Networks, the chaotic phase cannot be completely suppressed, which has important bearings on network robustness and gene expression modeling.
Nariai, N; Kim, S; Imoto, S; Miyano, S
2004-01-01
We propose a statistical method to estimate gene networks from DNA microarray data and protein-protein interactions. Because physical interactions between proteins or multiprotein complexes are likely to regulate biological processes, using only mRNA expression data is not sufficient for estimating a gene network accurately. Our method adds knowledge about protein-protein interactions to the estimation method of gene networks under a Bayesian statistical framework. In the estimated gene network, a protein complex is modeled as a virtual node based on principal component analysis. We show the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae cell cycle data. The proposed method improves the accuracy of the estimated gene networks, and successfully identifies some biological facts.
Stability and structural properties of gene regulation networks with coregulation rules.
Warrell, Jonathan; Mhlanga, Musa
2017-05-07
Coregulation of the expression of groups of genes has been extensively demonstrated empirically in bacterial and eukaryotic systems. Such coregulation can arise through the use of shared regulatory motifs, which allow the coordinated expression of modules (and module groups) of functionally related genes across the genome. Coregulation can also arise through the physical association of multi-gene complexes through chromosomal looping, which are then transcribed together. We present a general formalism for modeling coregulation rules in the framework of Random Boolean Networks (RBN), and develop specific models for transcription factor networks with modular structure (including module groups, and multi-input modules (MIM) with autoregulation) and multi-gene complexes (including hierarchical differentiation between multi-gene complex members). We develop a mean-field approach to analyse the dynamical stability of large networks incorporating coregulation, and show that autoregulated MIM and hierarchical gene-complex models can achieve greater stability than networks without coregulation whose rules have matching activation frequency. We provide further analysis of the stability of small networks of both kinds through simulations. We also characterize several general properties of the transients and attractors in the hierarchical coregulation model, and show using simulations that the steady-state distribution factorizes hierarchically as a Bayesian network in a Markov Jump Process analogue of the RBN model. Copyright © 2017. Published by Elsevier Ltd.
Safari-Alighiarloo, Nahid; Taghizadeh, Mohammad; Tabatabaei, Seyyed Mohammad; Namaki, Saeed
2016-01-01
Background The involvement of multiple genes and missing heritability, which are dominant in complex diseases such as multiple sclerosis (MS), entail using network biology to better elucidate their molecular basis and genetic factors. We therefore aimed to integrate interactome (protein–protein interaction (PPI)) and transcriptomes data to construct and analyze PPI networks for MS disease. Methods Gene expression profiles in paired cerebrospinal fluid (CSF) and peripheral blood mononuclear cells (PBMCs) samples from MS patients, sampled in relapse or remission and controls, were analyzed. Differentially expressed genes which determined only in CSF (MS vs. control) and PBMCs (relapse vs. remission) separately integrated with PPI data to construct the Query-Query PPI (QQPPI) networks. The networks were further analyzed to investigate more central genes, functional modules and complexes involved in MS progression. Results The networks were analyzed and high centrality genes were identified. Exploration of functional modules and complexes showed that the majority of high centrality genes incorporated in biological pathways driving MS pathogenesis. Proteasome and spliceosome were also noticeable in enriched pathways in PBMCs (relapse vs. remission) which were identified by both modularity and clique analyses. Finally, STK4, RB1, CDKN1A, CDK1, RAC1, EZH2, SDCBP genes in CSF (MS vs. control) and CDC37, MAP3K3, MYC genes in PBMCs (relapse vs. remission) were identified as potential candidate genes for MS, which were the more central genes involved in biological pathways. Discussion This study showed that network-based analysis could explicate the complex interplay between biological processes underlying MS. Furthermore, an experimental validation of candidate genes can lead to identification of potential therapeutic targets. PMID:28028462
A pathway-based network analysis of hypertension-related genes
NASA Astrophysics Data System (ADS)
Wang, Huan; Hu, Jing-Bo; Xu, Chuan-Yun; Zhang, De-Hai; Yan, Qian; Xu, Ming; Cao, Ke-Fei; Zhang, Xu-Sheng
2016-02-01
Complex network approach has become an effective way to describe interrelationships among large amounts of biological data, which is especially useful in finding core functions and global behavior of biological systems. Hypertension is a complex disease caused by many reasons including genetic, physiological, psychological and even social factors. In this paper, based on the information of biological pathways, we construct a network model of hypertension-related genes of the salt-sensitive rat to explore the interrelationship between genes. Statistical and topological characteristics show that the network has the small-world but not scale-free property, and exhibits a modular structure, revealing compact and complex connections among these genes. By the threshold of integrated centrality larger than 0.71, seven key hub genes are found: Jun, Rps6kb1, Cycs, Creb312, Cdk4, Actg1 and RT1-Da. These genes should play an important role in hypertension, suggesting that the treatment of hypertension should focus on the combination of drugs on multiple genes.
Identifying gene networks underlying the neurobiology of ethanol and alcoholism.
Wolen, Aaron R; Miles, Michael F
2012-01-01
For complex disorders such as alcoholism, identifying the genes linked to these diseases and their specific roles is difficult. Traditional genetic approaches, such as genetic association studies (including genome-wide association studies) and analyses of quantitative trait loci (QTLs) in both humans and laboratory animals already have helped identify some candidate genes. However, because of technical obstacles, such as the small impact of any individual gene, these approaches only have limited effectiveness in identifying specific genes that contribute to complex diseases. The emerging field of systems biology, which allows for analyses of entire gene networks, may help researchers better elucidate the genetic basis of alcoholism, both in humans and in animal models. Such networks can be identified using approaches such as high-throughput molecular profiling (e.g., through microarray-based gene expression analyses) or strategies referred to as genetical genomics, such as the mapping of expression QTLs (eQTLs). Characterization of gene networks can shed light on the biological pathways underlying complex traits and provide the functional context for identifying those genes that contribute to disease development.
"Gene expression network" is the term used to describe the interplay, simple or complex, between two or more gene products in performing a specific cellular function. Although the delineation of such networks is complicated by the existence of multiple and subtle types of intera...
2010-01-01
Background In a recent study, two-dimensional (2D) network layouts were used to visualize and quantitatively analyze the relationship between chronic renal diseases and regulated genes. The results revealed complex relationships between disease type, gene specificity, and gene regulation type, which led to important insights about the underlying biological pathways. Here we describe an attempt to extend our understanding of these complex relationships by reanalyzing the data using three-dimensional (3D) network layouts, displayed through 2D and 3D viewing methods. Findings The 3D network layout (displayed through the 3D viewing method) revealed that genes implicated in many diseases (non-specific genes) tended to be predominantly down-regulated, whereas genes regulated in a few diseases (disease-specific genes) tended to be up-regulated. This new global relationship was quantitatively validated through comparison to 1000 random permutations of networks of the same size and distribution. Our new finding appeared to be the result of using specific features of the 3D viewing method to analyze the 3D renal network. Conclusions The global relationship between gene regulation and gene specificity is the first clue from human studies that there exist common mechanisms across several renal diseases, which suggest hypotheses for the underlying mechanisms. Furthermore, the study suggests hypotheses for why the 3D visualization helped to make salient a new regularity that was difficult to detect in 2D. Future research that tests these hypotheses should enable a more systematic understanding of when and how to use 3D network visualizations to reveal complex regularities in biological networks. PMID:21070623
Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia
2015-06-01
To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.
Mezlini, Aziz M; Goldenberg, Anna
2017-10-01
Discovering genetic mechanisms driving complex diseases is a hard problem. Existing methods often lack power to identify the set of responsible genes. Protein-protein interaction networks have been shown to boost power when detecting gene-disease associations. We introduce a Bayesian framework, Conflux, to find disease associated genes from exome sequencing data using networks as a prior. There are two main advantages to using networks within a probabilistic graphical model. First, networks are noisy and incomplete, a substantial impediment to gene discovery. Incorporating networks into the structure of a probabilistic models for gene inference has less impact on the solution than relying on the noisy network structure directly. Second, using a Bayesian framework we can keep track of the uncertainty of each gene being associated with the phenotype rather than returning a fixed list of genes. We first show that using networks clearly improves gene detection compared to individual gene testing. We then show consistently improved performance of Conflux compared to the state-of-the-art diffusion network-based method Hotnet2 and a variety of other network and variant aggregation methods, using randomly generated and literature-reported gene sets. We test Hotnet2 and Conflux on several network configurations to reveal biases and patterns of false positives and false negatives in each case. Our experiments show that our novel Bayesian framework Conflux incorporates many of the advantages of the current state-of-the-art methods, while offering more flexibility and improved power in many gene-disease association scenarios.
Fang, Lingzhao; Sørensen, Peter; Sahana, Goutam; Panitz, Frank; Su, Guosheng; Zhang, Shengli; Yu, Ying; Li, Bingjie; Ma, Li; Liu, George; Lund, Mogens Sandø; Thomsen, Bo
2018-06-19
MicroRNAs (miRNA) are key modulators of gene expression and so act as putative fine-tuners of complex phenotypes. Here, we hypothesized that causal variants of complex traits are enriched in miRNAs and miRNA-target networks. First, we conducted a genome-wide association study (GWAS) for seven functional and milk production traits using imputed sequence variants (13~15 million) and >10,000 animals from three dairy cattle breeds, i.e., Holstein (HOL), Nordic red cattle (RDC) and Jersey (JER). Second, we analyzed for enrichments of association signals in miRNAs and their miRNA-target networks. Our results demonstrated that genomic regions harboring miRNA genes were significantly (P < 0.05) enriched with GWAS signals for milk production traits and mastitis, and that enrichments within miRNA-target gene networks were significantly higher than in random gene-sets for the majority of traits. Furthermore, most between-trait and across-breed correlations of enrichments with miRNA-target networks were significantly greater than with random gene-sets, suggesting pleiotropic effects of miRNAs. Intriguingly, genes that were differentially expressed in response to mammary gland infections were significantly enriched in the miRNA-target networks associated with mastitis. All these findings were consistent across three breeds. Collectively, our observations demonstrate the importance of miRNAs and their targets for the expression of complex traits.
System Analysis of LWDH Related Genes Based on Text Mining in Biological Networks
Miao, Yingbo; Zhang, Liangcai; Wang, Yang; Feng, Rennan; Yang, Lei; Zhang, Shihua; Jiang, Yongshuai; Liu, Guiyou
2014-01-01
Liuwei-dihuang (LWDH) is widely used in traditional Chinese medicine (TCM), but its molecular mechanism about gene interactions is unclear. LWDH genes were extracted from the existing literatures based on text mining technology. To simulate the complex molecular interactions that occur in the whole body, protein-protein interaction networks (PPINs) were constructed and the topological properties of LWDH genes were analyzed. LWDH genes have higher centrality properties and may play important roles in the complex biological network environment. It was also found that the distances within LWDH genes are smaller than expected, which means that the communication of LWDH genes during the biological process is rapid and effectual. At last, a comprehensive network of LWDH genes, including the related drugs and regulatory pathways at both the transcriptional and posttranscriptional levels, was constructed and analyzed. The biological network analysis strategy used in this study may be helpful for the understanding of molecular mechanism of TCM. PMID:25243143
A global interaction network maps a wiring diagram of cellular function
Costanzo, Michael; VanderSluis, Benjamin; Koch, Elizabeth N.; Baryshnikova, Anastasia; Pons, Carles; Tan, Guihong; Wang, Wen; Usaj, Matej; Hanchard, Julia; Lee, Susan D.; Pelechano, Vicent; Styles, Erin B.; Billmann, Maximilian; van Leeuwen, Jolanda; van Dyk, Nydia; Lin, Zhen-Yuan; Kuzmin, Elena; Nelson, Justin; Piotrowski, Jeff S.; Srikumar, Tharan; Bahr, Sondra; Chen, Yiqun; Deshpande, Raamesh; Kurat, Christoph F.; Li, Sheena C.; Li, Zhijian; Usaj, Mojca Mattiazzi; Okada, Hiroki; Pascoe, Natasha; Luis, Bryan-Joseph San; Sharifpoor, Sara; Shuteriqi, Emira; Simpkins, Scott W.; Snider, Jamie; Suresh, Harsha Garadi; Tan, Yizhao; Zhu, Hongwei; Malod-Dognin, Noel; Janjic, Vuk; Przulj, Natasa; Troyanskaya, Olga G.; Stagljar, Igor; Xia, Tian; Ohya, Yoshikazu; Gingras, Anne-Claude; Raught, Brian; Boutros, Michael; Steinmetz, Lars M.; Moore, Claire L.; Rosebrock, Adam P.; Caudy, Amy A.; Myers, Chad L.; Andrews, Brenda; Boone, Charles
2017-01-01
We generated a global genetic interaction network for Saccharomyces cerevisiae, constructing over 23 million double mutants, identifying ~550,000 negative and ~350,000 positive genetic interactions. This comprehensive network maps genetic interactions for essential gene pairs, highlighting essential genes as densely connected hubs. Genetic interaction profiles enabled assembly of a hierarchical model of cell function, including modules corresponding to protein complexes and pathways, biological processes, and cellular compartments. Negative interactions connected functionally related genes, mapped core bioprocesses, and identified pleiotropic genes, whereas positive interactions often mapped general regulatory connections among gene pairs, rather than shared functionality. The global network illustrates how coherent sets of genetic interactions connect protein complex and pathway modules to map a functional wiring diagram of the cell. PMID:27708008
GENIUS: web server to predict local gene networks and key genes for biological functions.
Puelma, Tomas; Araus, Viviana; Canales, Javier; Vidal, Elena A; Cabello, Juan M; Soto, Alvaro; Gutiérrez, Rodrigo A
2017-03-01
GENIUS is a user-friendly web server that uses a novel machine learning algorithm to infer functional gene networks focused on specific genes and experimental conditions that are relevant to biological functions of interest. These functions may have different levels of complexity, from specific biological processes to complex traits that involve several interacting processes. GENIUS also enriches the network with new genes related to the biological function of interest, with accuracies comparable to highly discriminative Support Vector Machine methods. GENIUS currently supports eight model organisms and is freely available for public use at http://networks.bio.puc.cl/genius . genius.psbl@gmail.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Jalili, Mahdi; Gebhardt, Tom; Wolkenhauer, Olaf; Salehzadeh-Yazdi, Ali
2018-06-01
Decoding health and disease phenotypes is one of the fundamental objectives in biomedicine. Whereas high-throughput omics approaches are available, it is evident that any single omics approach might not be adequate to capture the complexity of phenotypes. Therefore, integrated multi-omics approaches have been used to unravel genotype-phenotype relationships such as global regulatory mechanisms and complex metabolic networks in different eukaryotic organisms. Some of the progress and challenges associated with integrated omics studies have been reviewed previously in comprehensive studies. In this work, we highlight and review the progress, challenges and advantages associated with emerging approaches, integrating gene expression and protein-protein interaction networks to unravel network-based functional features. This includes identifying disease related genes, gene prioritization, clustering protein interactions, developing the modules, extract active subnetworks and static protein complexes or dynamic/temporal protein complexes. We also discuss how these approaches contribute to our understanding of the biology of complex traits and diseases. This article is part of a Special Issue entitled: Cardiac adaptations to obesity, diabetes and insulin resistance, edited by Professors Jan F.C. Glatz, Jason R.B. Dyck and Christine Des Rosiers. Copyright © 2018 Elsevier B.V. All rights reserved.
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.
Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin
2017-08-31
Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks
Li, Min; Li, Dongyan; Tang, Yu; Wang, Jianxin
2017-01-01
Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster. PMID:28858211
Application of network methods for understanding evolutionary dynamics in discrete habitats.
Greenbaum, Gili; Fefferman, Nina H
2017-06-01
In populations occupying discrete habitat patches, gene flow between habitat patches may form an intricate population structure. In such structures, the evolutionary dynamics resulting from interaction of gene-flow patterns with other evolutionary forces may be exceedingly complex. Several models describing gene flow between discrete habitat patches have been presented in the population-genetics literature; however, these models have usually addressed relatively simple settings of habitable patches and have stopped short of providing general methodologies for addressing nontrivial gene-flow patterns. In the last decades, network theory - a branch of discrete mathematics concerned with complex interactions between discrete elements - has been applied to address several problems in population genetics by modelling gene flow between habitat patches using networks. Here, we present the idea and concepts of modelling complex gene flows in discrete habitats using networks. Our goal is to raise awareness to existing network theory applications in molecular ecology studies, as well as to outline the current and potential contribution of network methods to the understanding of evolutionary dynamics in discrete habitats. We review the main branches of network theory that have been, or that we believe potentially could be, applied to population genetics and molecular ecology research. We address applications to theoretical modelling and to empirical population-genetic studies, and we highlight future directions for extending the integration of network science with molecular ecology. © 2017 John Wiley & Sons Ltd.
Laarits, T; Bordalo, P; Lemos, B
2016-08-01
Regulatory networks play a central role in the modulation of gene expression, the control of cellular differentiation, and the emergence of complex phenotypes. Regulatory networks could constrain or facilitate evolutionary adaptation in gene expression levels. Here, we model the adaptation of regulatory networks and gene expression levels to a shift in the environment that alters the optimal expression level of a single gene. Our analyses show signatures of natural selection on regulatory networks that both constrain and facilitate rapid evolution of gene expression level towards new optima. The analyses are interpreted from the standpoint of neutral expectations and illustrate the challenge to making inferences about network adaptation. Furthermore, we examine the consequence of variable stabilizing selection across genes on the strength and direction of interactions in regulatory networks and in their subsequent adaptation. We observe that directional selection on a highly constrained gene previously under strong stabilizing selection was more efficient when the gene was embedded within a network of partners under relaxed stabilizing selection pressure. The observation leads to the expectation that evolutionarily resilient regulatory networks will contain optimal ratios of genes whose expression is under weak and strong stabilizing selection. Altogether, our results suggest that the variable strengths of stabilizing selection across genes within regulatory networks might itself contribute to the long-term adaptation of complex phenotypes. © 2016 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.
Fine-tuning gene networks using simple sequence repeats
Egbert, Robert G.; Klavins, Eric
2012-01-01
The parameters in a complex synthetic gene network must be extensively tuned before the network functions as designed. Here, we introduce a simple and general approach to rapidly tune gene networks in Escherichia coli using hypermutable simple sequence repeats embedded in the spacer region of the ribosome binding site. By varying repeat length, we generated expression libraries that incrementally and predictably sample gene expression levels over a 1,000-fold range. We demonstrate the utility of the approach by creating a bistable switch library that programmatically samples the expression space to balance the two states of the switch, and we illustrate the need for tuning by showing that the switch’s behavior is sensitive to host context. Further, we show that mutation rates of the repeats are controllable in vivo for stability or for targeted mutagenesis—suggesting a new approach to optimizing gene networks via directed evolution. This tuning methodology should accelerate the process of engineering functionally complex gene networks. PMID:22927382
Jiang, Peng; Scarpa, Joseph R.; Fitzpatrick, Karrie; Losic, Bojan; Gao, Vance D.; Hao, Ke; Summa, Keith C.; Yang, He S.; Zhang, Bin; Allada, Ravi; Vitaterna, Martha H.; Turek, Fred W.; Kasarskis, Andrew
2016-01-01
SUMMARY Sleep dysfunction and stress susceptibility are co-morbid complex traits, which often precede and predispose patients to a variety of neuropsychiatric diseases. Here, we demonstrate multi-level organizations of genetic landscape, candidate genes, and molecular networks associated with 328 stress and sleep traits in a chronically stressed population of 338 (C57BL/6J×A/J) F2 mice. We constructed striatal gene co-expression networks, revealing functionally and cell-type specific gene co-regulations important for stress and sleep. Using a composite ranking system, we identified network modules most relevant for 15 independent phenotypic categories, highlighting a mitochondria/synaptic module that links sleep and stress. The key network regulators of this module are overrepresented with genes implicated in neuropsychiatric diseases. Our work suggests the interplay between sleep, stress, and neuropathology emerge from genetic influences on gene expression and their collective organization through complex molecular networks, providing a framework to interrogate the mechanisms underlying sleep, stress susceptibility, and related neuropsychiatric disorders. PMID:25921536
Wang, Jian; Xie, Dong; Lin, Hongfei; Yang, Zhihao; Zhang, Yijia
2012-06-21
Many biological processes recognize in particular the importance of protein complexes, and various computational approaches have been developed to identify complexes from protein-protein interaction (PPI) networks. However, high false-positive rate of PPIs leads to challenging identification. A protein semantic similarity measure is proposed in this study, based on the ontology structure of Gene Ontology (GO) terms and GO annotations to estimate the reliability of interactions in PPI networks. Interaction pairs with low GO semantic similarity are removed from the network as unreliable interactions. Then, a cluster-expanding algorithm is used to detect complexes with core-attachment structure on filtered network. Our method is applied to three different yeast PPI networks. The effectiveness of our method is examined on two benchmark complex datasets. Experimental results show that our method performed better than other state-of-the-art approaches in most evaluation metrics. The method detects protein complexes from large scale PPI networks by filtering GO semantic similarity. Removing interactions with low GO similarity significantly improves the performance of complex identification. The expanding strategy is also effective to identify attachment proteins of complexes.
Digital Signal Processing and Control for the Study of Gene Networks
NASA Astrophysics Data System (ADS)
Shin, Yong-Jun
2016-04-01
Thanks to the digital revolution, digital signal processing and control has been widely used in many areas of science and engineering today. It provides practical and powerful tools to model, simulate, analyze, design, measure, and control complex and dynamic systems such as robots and aircrafts. Gene networks are also complex dynamic systems which can be studied via digital signal processing and control. Unlike conventional computational methods, this approach is capable of not only modeling but also controlling gene networks since the experimental environment is mostly digital today. The overall aim of this article is to introduce digital signal processing and control as a useful tool for the study of gene networks.
Digital Signal Processing and Control for the Study of Gene Networks.
Shin, Yong-Jun
2016-04-22
Thanks to the digital revolution, digital signal processing and control has been widely used in many areas of science and engineering today. It provides practical and powerful tools to model, simulate, analyze, design, measure, and control complex and dynamic systems such as robots and aircrafts. Gene networks are also complex dynamic systems which can be studied via digital signal processing and control. Unlike conventional computational methods, this approach is capable of not only modeling but also controlling gene networks since the experimental environment is mostly digital today. The overall aim of this article is to introduce digital signal processing and control as a useful tool for the study of gene networks.
Digital Signal Processing and Control for the Study of Gene Networks
Shin, Yong-Jun
2016-01-01
Thanks to the digital revolution, digital signal processing and control has been widely used in many areas of science and engineering today. It provides practical and powerful tools to model, simulate, analyze, design, measure, and control complex and dynamic systems such as robots and aircrafts. Gene networks are also complex dynamic systems which can be studied via digital signal processing and control. Unlike conventional computational methods, this approach is capable of not only modeling but also controlling gene networks since the experimental environment is mostly digital today. The overall aim of this article is to introduce digital signal processing and control as a useful tool for the study of gene networks. PMID:27102828
Gene expression links functional networks across cortex and striatum.
Anderson, Kevin M; Krienen, Fenna M; Choi, Eun Young; Reinen, Jenna M; Yeo, B T Thomas; Holmes, Avram J
2018-04-12
The human brain is comprised of a complex web of functional networks that link anatomically distinct regions. However, the biological mechanisms supporting network organization remain elusive, particularly across cortical and subcortical territories with vastly divergent cellular and molecular properties. Here, using human and primate brain transcriptional atlases, we demonstrate that spatial patterns of gene expression show strong correspondence with limbic and somato/motor cortico-striatal functional networks. Network-associated expression is consistent across independent human datasets and evolutionarily conserved in non-human primates. Genes preferentially expressed within the limbic network (encompassing nucleus accumbens, orbital/ventromedial prefrontal cortex, and temporal pole) relate to risk for psychiatric illness, chloride channel complexes, and markers of somatostatin neurons. Somato/motor associated genes are enriched for oligodendrocytes and markers of parvalbumin neurons. These analyses indicate that parallel cortico-striatal processing channels possess dissociable genetic signatures that recapitulate distributed functional networks, and nominate molecular mechanisms supporting cortico-striatal circuitry in health and disease.
Diaz-Montana, Juan J.; Diaz-Diaz, Norberto
2014-01-01
Gene networks are one of the main computational models used to study the interaction between different elements during biological processes being widely used to represent gene–gene, or protein–protein interaction complexes. We present GFD-Net, a Cytoscape app for visualizing and analyzing the functional dissimilarity of gene networks. PMID:25400907
Petrovskaya, Olga V; Petrovskiy, Evgeny D; Lavrik, Inna N; Ivanisenko, Vladimir A
2017-04-01
Gene network modeling is one of the widely used approaches in systems biology. It allows for the study of complex genetic systems function, including so-called mosaic gene networks, which consist of functionally interacting subnetworks. We conducted a study of a mosaic gene networks modeling method based on integration of models of gene subnetworks by linear control functionals. An automatic modeling of 10,000 synthetic mosaic gene regulatory networks was carried out using computer experiments on gene knockdowns/knockouts. Structural analysis of graphs of generated mosaic gene regulatory networks has revealed that the most important factor for building accurate integrated mathematical models, among those analyzed in the study, is data on expression of genes corresponding to the vertices with high properties of centrality.
Pleiotropic and Epistatic Network-Based Discovery: Integrated Networks for Target Gene Discovery
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weighill, Deborah; Jones, Piet; Shah, Manesh
Biological organisms are complex systems that are composed of functional networks of interacting molecules and macro-molecules. Complex phenotypes are the result of orchestrated, hierarchical, heterogeneous collections of expressed genomic variants. However, the effects of these variants are the result of historic selective pressure and current environmental and epigenetic signals, and, as such, their co-occurrence can be seen as genome-wide correlations in a number of different manners. Biomass recalcitrance (i.e., the resistance of plants to degradation or deconstruction, which ultimately enables access to a plant's sugars) is a complex polygenic phenotype of high importance to biofuels initiatives. This study makes usemore » of data derived from the re-sequenced genomes from over 800 different Populus trichocarpa genotypes in combination with metabolomic and pyMBMS data across this population, as well as co-expression and co-methylation networks in order to better understand the molecular interactions involved in recalcitrance, and identify target genes involved in lignin biosynthesis/degradation. A Lines Of Evidence (LOE) scoring system is developed to integrate the information in the different layers and quantify the number of lines of evidence linking genes to target functions. This new scoring system was applied to quantify the lines of evidence linking genes to lignin-related genes and phenotypes across the network layers, and allowed for the generation of new hypotheses surrounding potential new candidate genes involved in lignin biosynthesis in P. trichocarpa, including various AGAMOUS-LIKE genes. Lastly, the resulting Genome Wide Association Study networks, integrated with Single Nucleotide Polymorphism (SNP) correlation, co-methylation, and co-expression networks through the LOE scores are proving to be a powerful approach to determine the pleiotropic and epistatic relationships underlying cellular functions and, as such, the molecular basis for complex phenotypes, such as recalcitrance.« less
Pleiotropic and Epistatic Network-Based Discovery: Integrated Networks for Target Gene Discovery
Weighill, Deborah; Jones, Piet; Shah, Manesh; ...
2018-05-11
Biological organisms are complex systems that are composed of functional networks of interacting molecules and macro-molecules. Complex phenotypes are the result of orchestrated, hierarchical, heterogeneous collections of expressed genomic variants. However, the effects of these variants are the result of historic selective pressure and current environmental and epigenetic signals, and, as such, their co-occurrence can be seen as genome-wide correlations in a number of different manners. Biomass recalcitrance (i.e., the resistance of plants to degradation or deconstruction, which ultimately enables access to a plant's sugars) is a complex polygenic phenotype of high importance to biofuels initiatives. This study makes usemore » of data derived from the re-sequenced genomes from over 800 different Populus trichocarpa genotypes in combination with metabolomic and pyMBMS data across this population, as well as co-expression and co-methylation networks in order to better understand the molecular interactions involved in recalcitrance, and identify target genes involved in lignin biosynthesis/degradation. A Lines Of Evidence (LOE) scoring system is developed to integrate the information in the different layers and quantify the number of lines of evidence linking genes to target functions. This new scoring system was applied to quantify the lines of evidence linking genes to lignin-related genes and phenotypes across the network layers, and allowed for the generation of new hypotheses surrounding potential new candidate genes involved in lignin biosynthesis in P. trichocarpa, including various AGAMOUS-LIKE genes. Lastly, the resulting Genome Wide Association Study networks, integrated with Single Nucleotide Polymorphism (SNP) correlation, co-methylation, and co-expression networks through the LOE scores are proving to be a powerful approach to determine the pleiotropic and epistatic relationships underlying cellular functions and, as such, the molecular basis for complex phenotypes, such as recalcitrance.« less
Lamara, Mebarek; Raherison, Elie; Lenz, Patrick; Beaulieu, Jean; Bousquet, Jean; MacKay, John
2016-04-01
Association studies are widely utilized to analyze complex traits but their ability to disclose genetic architectures is often limited by statistical constraints, and functional insights are usually minimal in nonmodel organisms like forest trees. We developed an approach to integrate association mapping results with co-expression networks. We tested single nucleotide polymorphisms (SNPs) in 2652 candidate genes for statistical associations with wood density, stiffness, microfibril angle and ring width in a population of 1694 white spruce trees (Picea glauca). Associations mapping identified 229-292 genes per wood trait using a statistical significance level of P < 0.05 to maximize discovery. Over-representation of genes associated for nearly all traits was found in a xylem preferential co-expression group developed in independent experiments. A xylem co-expression network was reconstructed with 180 wood associated genes and several known MYB and NAC regulators were identified as network hubs. The network revealed a link between the gene PgNAC8, wood stiffness and microfibril angle, as well as considerable within-season variation for both genetic control of wood traits and gene expression. Trait associations were distributed throughout the network suggesting complex interactions and pleiotropic effects. Our findings indicate that integration of association mapping and co-expression networks enhances our understanding of complex wood traits. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Network Analysis Reveals Putative Genes Affecting Meat Quality in Angus Cattle.
Mateescu, Raluca G; Garrick, Dorian J; Reecy, James M
2017-01-01
Improvements in eating satisfaction will benefit consumers and should increase beef demand which is of interest to the beef industry. Tenderness, juiciness, and flavor are major determinants of the palatability of beef and are often used to reflect eating satisfaction. Carcass qualities are used as indicator traits for meat quality, with higher quality grade carcasses expected to relate to more tender and palatable meat. However, meat quality is a complex concept determined by many component traits making interpretation of genome-wide association studies (GWAS) on any one component challenging to interpret. Recent approaches combining traditional GWAS with gene network interactions theory could be more efficient in dissecting the genetic architecture of complex traits. Phenotypic measures of 23 traits reflecting carcass characteristics, components of meat quality, along with mineral and peptide concentrations were used along with Illumina 54k bovine SNP genotypes to derive an annotated gene network associated with meat quality in 2,110 Angus beef cattle. The efficient mixed model association (EMMAX) approach in combination with a genomic relationship matrix was used to directly estimate the associations between 54k SNP genotypes and each of the 23 component traits. Genomic correlated regions were identified by partial correlations which were further used along with an information theory algorithm to derive gene network clusters. Correlated SNP across 23 component traits were subjected to network scoring and visualization software to identify significant SNP. Significant pathways implicated in the meat quality complex through GO term enrichment analysis included angiogenesis, inflammation, transmembrane transporter activity, and receptor activity. These results suggest that network analysis using partial correlations and annotation of significant SNP can reveal the genetic architecture of complex traits and provide novel information regarding biological mechanisms and genes that lead to complex phenotypes, like meat quality, and the nutritional and healthfulness value of beef. Improvements in genome annotation and knowledge of gene function will contribute to more comprehensive analyses that will advance our ability to dissect the complex architecture of complex traits.
Yeast Phenomics: An Experimental Approach for Modeling Gene Interaction Networks that Buffer Disease
Hartman, John L.; Stisher, Chandler; Outlaw, Darryl A.; Guo, Jingyu; Shah, Najaf A.; Tian, Dehua; Santos, Sean M.; Rodgers, John W.; White, Richard A.
2015-01-01
The genome project increased appreciation of genetic complexity underlying disease phenotypes: many genes contribute each phenotype and each gene contributes multiple phenotypes. The aspiration of predicting common disease in individuals has evolved from seeking primary loci to marginal risk assignments based on many genes. Genetic interaction, defined as contributions to a phenotype that are dependent upon particular digenic allele combinations, could improve prediction of phenotype from complex genotype, but it is difficult to study in human populations. High throughput, systematic analysis of S. cerevisiae gene knockouts or knockdowns in the context of disease-relevant phenotypic perturbations provides a tractable experimental approach to derive gene interaction networks, in order to deduce by cross-species gene homology how phenotype is buffered against disease-risk genotypes. Yeast gene interaction network analysis to date has revealed biology more complex than previously imagined. This has motivated the development of more powerful yeast cell array phenotyping methods to globally model the role of gene interaction networks in modulating phenotypes (which we call yeast phenomic analysis). The article illustrates yeast phenomic technology, which is applied here to quantify gene X media interaction at higher resolution and supports use of a human-like media for future applications of yeast phenomics for modeling human disease. PMID:25668739
Multiscale Embedded Gene Co-expression Network Analysis
Song, Won-Min; Zhang, Bin
2015-01-01
Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778
Multiscale Embedded Gene Co-expression Network Analysis.
Song, Won-Min; Zhang, Bin
2015-11-01
Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.
Discovering disease-associated genes in weighted protein-protein interaction networks
NASA Astrophysics Data System (ADS)
Cui, Ying; Cai, Meng; Stanley, H. Eugene
2018-04-01
Although there have been many network-based attempts to discover disease-associated genes, most of them have not taken edge weight - which quantifies their relative strength - into consideration. We use connection weights in a protein-protein interaction (PPI) network to locate disease-related genes. We analyze the topological properties of both weighted and unweighted PPI networks and design an improved random forest classifier to distinguish disease genes from non-disease genes. We use a cross-validation test to confirm that weighted networks are better able to discover disease-associated genes than unweighted networks, which indicates that including link weight in the analysis of network properties provides a better model of complex genotype-phenotype associations.
Liu, Lizhen; Sun, Xiaowu; Song, Wei; Du, Chao
2018-06-01
Predicting protein complexes from protein-protein interaction (PPI) network is of great significance to recognize the structure and function of cells. A protein may interact with different proteins under different time or conditions. Existing approaches only utilize static PPI network data that may lose much temporal biological information. First, this article proposed a novel method that combines gene expression data at different time points with traditional static PPI network to construct different dynamic subnetworks. Second, to further filter out the data noise, the semantic similarity based on gene ontology is regarded as the network weight together with the principal component analysis, which is introduced to deal with the weight computing by three traditional methods. Third, after building a dynamic PPI network, a predicting protein complexes algorithm based on "core-attachment" structural feature is applied to detect complexes from each dynamic subnetworks. Finally, it is revealed from the experimental results that our method proposed in this article performs well on detecting protein complexes from dynamic weighted PPI networks.
Jiang, Li; Edwards, Stefan M; Thomsen, Bo; Workman, Christopher T; Guldbrandtsen, Bernt; Sørensen, Peter
2014-09-24
Prioritizing genetic variants is a challenge because disease susceptibility loci are often located in genes of unknown function or the relationship with the corresponding phenotype is unclear. A global data-mining exercise on the biomedical literature can establish the phenotypic profile of genes with respect to their connection to disease phenotypes. The importance of protein-protein interaction networks in the genetic heterogeneity of common diseases or complex traits is becoming increasingly recognized. Thus, the development of a network-based approach combined with phenotypic profiling would be useful for disease gene prioritization. We developed a random-set scoring model and implemented it to quantify phenotype relevance in a network-based disease gene-prioritization approach. We validated our approach based on different gene phenotypic profiles, which were generated from PubMed abstracts, OMIM, and GeneRIF records. We also investigated the validity of several vocabulary filters and different likelihood thresholds for predicted protein-protein interactions in terms of their effect on the network-based gene-prioritization approach, which relies on text-mining of the phenotype data. Our method demonstrated good precision and sensitivity compared with those of two alternative complex-based prioritization approaches. We then conducted a global ranking of all human genes according to their relevance to a range of human diseases. The resulting accurate ranking of known causal genes supported the reliability of our approach. Moreover, these data suggest many promising novel candidate genes for human disorders that have a complex mode of inheritance. We have implemented and validated a network-based approach to prioritize genes for human diseases based on their phenotypic profile. We have devised a powerful and transparent tool to identify and rank candidate genes. Our global gene prioritization provides a unique resource for the biological interpretation of data from genome-wide association studies, and will help in the understanding of how the associated genetic variants influence disease or quantitative phenotypes.
Detecting complexes from edge-weighted PPI networks via genes expression analysis.
Zhang, Zehua; Song, Jian; Tang, Jijun; Xu, Xinying; Guo, Fei
2018-04-24
Identifying complexes from PPI networks has become a key problem to elucidate protein functions and identify signal and biological processes in a cell. Proteins binding as complexes are important roles of life activity. Accurate determination of complexes in PPI networks is crucial for understanding principles of cellular organization. We propose a novel method to identify complexes on PPI networks, based on different co-expression information. First, we use Markov Cluster Algorithm with an edge-weighting scheme to calculate complexes on PPI networks. Then, we propose some significant features, such as graph information and gene expression analysis, to filter and modify complexes predicted by Markov Cluster Algorithm. To evaluate our method, we test on two experimental yeast PPI networks. On DIP network, our method has Precision and F-Measure values of 0.6004 and 0.5528. On MIPS network, our method has F-Measure and S n values of 0.3774 and 0.3453. Comparing to existing methods, our method improves Precision value by at least 0.1752, F-Measure value by at least 0.0448, S n value by at least 0.0771. Experiments show that our method achieves better results than some state-of-the-art methods for identifying complexes on PPI networks, with the prediction quality improved in terms of evaluation criteria.
Jiang, Peng; Scarpa, Joseph R; Fitzpatrick, Karrie; Losic, Bojan; Gao, Vance D; Hao, Ke; Summa, Keith C; Yang, He S; Zhang, Bin; Allada, Ravi; Vitaterna, Martha H; Turek, Fred W; Kasarskis, Andrew
2015-05-05
Sleep dysfunction and stress susceptibility are comorbid complex traits that often precede and predispose patients to a variety of neuropsychiatric diseases. Here, we demonstrate multilevel organizations of genetic landscape, candidate genes, and molecular networks associated with 328 stress and sleep traits in a chronically stressed population of 338 (C57BL/6J × A/J) F2 mice. We constructed striatal gene co-expression networks, revealing functionally and cell-type-specific gene co-regulations important for stress and sleep. Using a composite ranking system, we identified network modules most relevant for 15 independent phenotypic categories, highlighting a mitochondria/synaptic module that links sleep and stress. The key network regulators of this module are overrepresented with genes implicated in neuropsychiatric diseases. Our work suggests that the interplay among sleep, stress, and neuropathology emerges from genetic influences on gene expression and their collective organization through complex molecular networks, providing a framework for interrogating the mechanisms underlying sleep, stress susceptibility, and related neuropsychiatric disorders. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Genes2Networks: connecting lists of gene symbols using mammalian protein interactions databases.
Berger, Seth I; Posner, Jeremy M; Ma'ayan, Avi
2007-10-04
In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP), generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or interactions are predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes. Genes2Networks is a software system that integrates the content of ten mammalian interaction network datasets. Filtering techniques to prune low-confidence interactions were implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from "seed" lists of human Entrez gene symbols. The output includes a dynamic linkable three color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Genes2Networks is powerful web-based software that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes. This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in common pathways or protein complexes.
Network Medicine: From Cellular Networks to the Human Diseasome
NASA Astrophysics Data System (ADS)
Barabasi, Albert-Laszlo
2014-03-01
Given the functional interdependencies between the molecular components in a human cell, a disease is rarely a consequence of an abnormality in a single gene, but reflects the perturbations of the complex intracellular network. The tools of network science offer a platform to explore systematically not only the molecular complexity of a particular disease, leading to the identification of disease modules and pathways, but also the molecular relationships between apparently distinct (patho)phenotypes. Advances in this direction not only enrich our understanding of complex systems, but are also essential to identify new disease genes, to uncover the biological significance of disease-associated mutations identified by genome-wide association studies and full genome sequencing, and to identify drug targets and biomarkers for complex diseases.
Global Mapping of the Yeast Genetic Interaction Network
NASA Astrophysics Data System (ADS)
Tong, Amy Hin Yan; Lesage, Guillaume; Bader, Gary D.; Ding, Huiming; Xu, Hong; Xin, Xiaofeng; Young, James; Berriz, Gabriel F.; Brost, Renee L.; Chang, Michael; Chen, YiQun; Cheng, Xin; Chua, Gordon; Friesen, Helena; Goldberg, Debra S.; Haynes, Jennifer; Humphries, Christine; He, Grace; Hussein, Shamiza; Ke, Lizhu; Krogan, Nevan; Li, Zhijian; Levinson, Joshua N.; Lu, Hong; Ménard, Patrice; Munyana, Christella; Parsons, Ainslie B.; Ryan, Owen; Tonikian, Raffi; Roberts, Tania; Sdicu, Anne-Marie; Shapiro, Jesse; Sheikh, Bilal; Suter, Bernhard; Wong, Sharyl L.; Zhang, Lan V.; Zhu, Hongwei; Burd, Christopher G.; Munro, Sean; Sander, Chris; Rine, Jasper; Greenblatt, Jack; Peter, Matthias; Bretscher, Anthony; Bell, Graham; Roth, Frederick P.; Brown, Grant W.; Andrews, Brenda; Bussey, Howard; Boone, Charles
2004-02-01
A genetic interaction network containing ~1000 genes and ~4000 interactions was mapped by crossing mutations in 132 different query genes into a set of ~4700 viable gene yeast deletion mutants and scoring the double mutant progeny for fitness defects. Network connectivity was predictive of function because interactions often occurred among functionally related genes, and similar patterns of interactions tended to identify components of the same pathway. The genetic network exhibited dense local neighborhoods; therefore, the position of a gene on a partially mapped network is predictive of other genetic interactions. Because digenic interactions are common in yeast, similar networks may underlie the complex genetics associated with inherited phenotypes in other organisms.
Convergent evolution of gene networks by single-gene duplications in higher eukaryotes.
Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich
2004-03-01
By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix-loop-helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks emerging through single-gene duplications, the dominant importance of molecular modularity in the bottom-up construction of complex biological entities, and the convergent evolution of networks.
Wu, Mengmeng; Zeng, Wanwen; Liu, Wenqiang; Lv, Hairong; Chen, Ting; Jiang, Rui
2018-06-03
Genome-wide association studies (GWAS) have successfully discovered a number of disease-associated genetic variants in the past decade, providing an unprecedented opportunity for deciphering genetic basis of human inherited diseases. However, it is still a challenging task to extract biological knowledge from the GWAS data, due to such issues as missing heritability and weak interpretability. Indeed, the fact that the majority of discovered loci fall into noncoding regions without clear links to genes has been preventing the characterization of their functions and appealing for a sophisticated approach to bridge genetic and genomic studies. Towards this problem, network-based prioritization of candidate genes, which performs integrated analysis of gene networks with GWAS data, has emerged as a promising direction and attracted much attention. However, most existing methods overlook the sparse and noisy properties of gene networks and thus may lead to suboptimal performance. Motivated by this understanding, we proposed a novel method called REGENT for integrating multiple gene networks with GWAS data to prioritize candidate genes for complex diseases. We leveraged a technique called the network representation learning to embed a gene network into a compact and robust feature space, and then designed a hierarchical statistical model to integrate features of multiple gene networks with GWAS data for the effective inference of genes associated with a disease of interest. We applied our method to six complex diseases and demonstrated the superior performance of REGENT over existing approaches in recovering known disease-associated genes. We further conducted a pathway analysis and showed that the ability of REGENT to discover disease-associated pathways. We expect to see applications of our method to a broad spectrum of diseases for post-GWAS analysis. REGENT is freely available at https://github.com/wmmthu/REGENT. Copyright © 2018 Elsevier Inc. All rights reserved.
Gene expression complex networks: synthesis, identification, and analysis.
Lopes, Fabrício M; Cesar, Roberto M; Costa, Luciano Da F
2011-10-01
Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such a knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drugs design, as well as for planning high-throughput new experiments. Methods have been developed for gene networks modeling and identification from expression profiles. However, an important open problem regards how to validate such approaches and its results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdös-Rényi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabási-Albert (BA), and geographical networks (GG). The experimental results indicate that the inference method was sensitive to average degree
Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui
2017-01-01
Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. DBNCS algorithm first uses CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profiles learning, namely the construction of search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reduce the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of DBNCS algorithm is evaluated by the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs. PMID:29113310
Yu, Bin; Xu, Jia-Meng; Li, Shan; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Zhang, Yan; Wang, Ming-Hui
2017-10-06
Gene regulatory networks (GRNs) research reveals complex life phenomena from the perspective of gene interaction, which is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on dynamic Bayesian network (DBN) to construct the multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. DBNCS algorithm first uses CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profiles learning, namely the construction of search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reduce the false positive rate. Secondly, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of DBNCS algorithm is evaluated by the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in Escherichia coli , and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.
Network Medicine: A Network-based Approach to Human Disease
Barabási, Albert-László; Gulbahce, Natali; Loscalzo, Joseph
2011-01-01
Given the functional interdependencies between the molecular components in a human cell, a disease is rarely a consequence of an abnormality in a single gene, but reflects the perturbations of the complex intracellular network. The emerging tools of network medicine offer a platform to explore systematically not only the molecular complexity of a particular disease, leading to the identification of disease modules and pathways, but also the molecular relationships between apparently distinct (patho)phenotypes. Advances in this direction are essential to identify new diseases genes, to uncover the biological significance of disease-associated mutations identified by genome-wide association studies and full genome sequencing, and to identify drug targets and biomarkers for complex diseases. PMID:21164525
Integration of biological networks and gene expression data using Cytoscape
Cline, Melissa S; Smoot, Michael; Cerami, Ethan; Kuchinsky, Allan; Landys, Nerius; Workman, Chris; Christmas, Rowan; Avila-Campilo, Iliana; Creech, Michael; Gross, Benjamin; Hanspers, Kristina; Isserlin, Ruth; Kelley, Ryan; Killcoyne, Sarah; Lotia, Samad; Maere, Steven; Morris, John; Ono, Keiichiro; Pavlovic, Vuk; Pico, Alexander R; Vailaya, Aditya; Wang, Peng-Liang; Adler, Annette; Conklin, Bruce R; Hood, Leroy; Kuiper, Martin; Sander, Chris; Schmulevich, Ilya; Schwikowski, Benno; Warner, Guy J; Ideker, Trey; Bader, Gary D
2013-01-01
Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context of an interaction network obtained for genes of interest. Five major steps are described: (i) obtaining a gene or protein network, (ii) displaying the network using layout algorithms, (iii) integrating with gene expression and other functional attributes, (iv) identifying putative complexes and functional modules and (v) identifying enriched Gene Ontology annotations in the network. These steps provide a broad sample of the types of analyses performed by Cytoscape. PMID:17947979
Mohamed Salleh, Faridah Hani; Arif, Shereena Mohd; Zainudin, Suhaila; Firdaus-Raih, Mohd
2015-12-01
A gene regulatory network (GRN) is a large and complex network consisting of interacting elements that, over time, affect each other's state. The dynamics of complex gene regulatory processes are difficult to understand using intuitive approaches alone. To overcome this problem, we propose an algorithm for inferring the regulatory interactions from knock-out data using a Gaussian model combines with Pearson Correlation Coefficient (PCC). There are several problems relating to GRN construction that have been outlined in this paper. We demonstrated the ability of our proposed method to (1) predict the presence of regulatory interactions between genes, (2) their directionality and (3) their states (activation or suppression). The algorithm was applied to network sizes of 10 and 50 genes from DREAM3 datasets and network sizes of 10 from DREAM4 datasets. The predicted networks were evaluated based on AUROC and AUPR. We discovered that high false positive values were generated by our GRN prediction methods because the indirect regulations have been wrongly predicted as true relationships. We achieved satisfactory results as the majority of sub-networks achieved AUROC values above 0.5. Copyright © 2015 Elsevier Ltd. All rights reserved.
Systems Genetics as a Tool to Identify Master Genetic Regulators in Complex Disease.
Moreno-Moral, Aida; Pesce, Francesco; Behmoaras, Jacques; Petretto, Enrico
2017-01-01
Systems genetics stems from systems biology and similarly employs integrative modeling approaches to describe the perturbations and phenotypic effects observed in a complex system. However, in the case of systems genetics the main source of perturbation is naturally occurring genetic variation, which can be analyzed at the systems-level to explain the observed variation in phenotypic traits. In contrast with conventional single-variant association approaches, the success of systems genetics has been in the identification of gene networks and molecular pathways that underlie complex disease. In addition, systems genetics has proven useful in the discovery of master trans-acting genetic regulators of functional networks and pathways, which in many cases revealed unexpected gene targets for disease. Here we detail the central components of a fully integrated systems genetics approach to complex disease, starting from assessment of genetic and gene expression variation, linking DNA sequence variation to mRNA (expression QTL mapping), gene regulatory network analysis and mapping the genetic control of regulatory networks. By summarizing a few illustrative (and successful) examples, we highlight how different data-modeling strategies can be effectively integrated in a systems genetics study.
Li, Wan; Zhu, Lina; Huang, Hao; He, Yuehan; Lv, Junjie; Li, Weimin; Chen, Lina; He, Weiming
2017-10-01
Complex chronic diseases are caused by the effects of genetic and environmental factors. Single nucleotide polymorphisms (SNPs), one common type of genetic variations, played vital roles in diseases. We hypothesized that disease risk functional SNPs in coding regions and protein interaction network modules were more likely to contribute to the identification of disease susceptible genes for complex chronic diseases. This could help to further reveal the pathogenesis of complex chronic diseases. Disease risk SNPs were first recognized from public SNP data for coronary heart disease (CHD), hypertension (HT) and type 2 diabetes (T2D). SNPs in coding regions that were classified into nonsense and missense by integrating several SNP functional annotation databases were treated as functional SNPs. Then, regions significantly associated with each disease were screened using random permutations for disease risk functional SNPs. Corresponding to these regions, 155, 169 and 173 potential disease susceptible genes were identified for CHD, HT and T2D, respectively. A disease-related gene product interaction network in environmental context was constructed for interacting gene products of both disease genes and potential disease susceptible genes for these diseases. After functional enrichment analysis for disease associated modules, 5 CHD susceptible genes, 7 HT susceptible genes and 3 T2D susceptible genes were finally identified, some of which had pleiotropic effects. Most of these genes were verified to be related to these diseases in literature. This was similar for disease genes identified from another method proposed by Lee et al. from a different aspect. This research could provide novel perspectives for diagnosis and treatment of complex chronic diseases and susceptible genes identification for other diseases. Copyright © 2017 Elsevier Inc. All rights reserved.
Pathania, Shivalika; Bagler, Ganesh; Ahuja, Paramvir S.
2016-01-01
Comparative co-expression analysis of multiple species using high-throughput data is an integrative approach to determine the uniformity as well as diversification in biological processes. Rauvolfia serpentina and Catharanthus roseus, both members of Apocyanacae family, are reported to have remedial properties against multiple diseases. Despite of sharing upstream of terpenoid indole alkaloid pathway, there is significant diversity in tissue-specific synthesis and accumulation of specialized metabolites in these plants. This led us to implement comparative co-expression network analysis to investigate the modules and genes responsible for differential tissue-specific expression as well as species-specific synthesis of metabolites. Toward these goals differential network analysis was implemented to identify candidate genes responsible for diversification of metabolites profile. Three genes were identified with significant difference in connectivity leading to differential regulatory behavior between these plants. These genes may be responsible for diversification of secondary metabolism, and thereby for species-specific metabolite synthesis. The network robustness of R. serpentina, determined based on topological properties, was also complemented by comparison of gene-metabolite networks of both plants, and may have evolved to have complex metabolic mechanisms as compared to C. roseus under the influence of various stimuli. This study reveals evolution of complexity in secondary metabolism of R. serpentina, and key genes that contribute toward diversification of specific metabolites. PMID:27588023
Pathania, Shivalika; Bagler, Ganesh; Ahuja, Paramvir S
2016-01-01
Comparative co-expression analysis of multiple species using high-throughput data is an integrative approach to determine the uniformity as well as diversification in biological processes. Rauvolfia serpentina and Catharanthus roseus, both members of Apocyanacae family, are reported to have remedial properties against multiple diseases. Despite of sharing upstream of terpenoid indole alkaloid pathway, there is significant diversity in tissue-specific synthesis and accumulation of specialized metabolites in these plants. This led us to implement comparative co-expression network analysis to investigate the modules and genes responsible for differential tissue-specific expression as well as species-specific synthesis of metabolites. Toward these goals differential network analysis was implemented to identify candidate genes responsible for diversification of metabolites profile. Three genes were identified with significant difference in connectivity leading to differential regulatory behavior between these plants. These genes may be responsible for diversification of secondary metabolism, and thereby for species-specific metabolite synthesis. The network robustness of R. serpentina, determined based on topological properties, was also complemented by comparison of gene-metabolite networks of both plants, and may have evolved to have complex metabolic mechanisms as compared to C. roseus under the influence of various stimuli. This study reveals evolution of complexity in secondary metabolism of R. serpentina, and key genes that contribute toward diversification of specific metabolites.
Abduallah, Yasser; Turki, Turki; Byron, Kevin; Du, Zongxuan; Cervantes-Cervantes, Miguel; Wang, Jason T L
2017-01-01
Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand the inner workings of the cell and the complexity of gene interactions. To date, numerous algorithms have been developed to infer gene regulatory networks. However, as the number of identified genes increases and the complexity of their interactions is uncovered, networks and their regulatory mechanisms become cumbersome to test. Furthermore, prodding through experimental results requires an enormous amount of computation, resulting in slow data processing. Therefore, new approaches are needed to expeditiously analyze copious amounts of experimental data resulting from cellular GRNs. To meet this need, cloud computing is promising as reported in the literature. Here, we propose new MapReduce algorithms for inferring gene regulatory networks on a Hadoop cluster in a cloud environment. These algorithms employ an information-theoretic approach to infer GRNs using time-series microarray data. Experimental results show that our MapReduce program is much faster than an existing tool while achieving slightly better prediction accuracy than the existing tool.
Efficient Reverse-Engineering of a Developmental Gene Regulatory Network
Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes
2012-01-01
Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to discover whether there are rules or regularities governing development and evolution of complex multi-cellular organisms. PMID:22807664
Network Approach to Disease Diagnosis
NASA Astrophysics Data System (ADS)
Sharma, Amitabh; Bashan, Amir; Barabasi, Alber-Laszlo
2014-03-01
Human diseases could be viewed as perturbations of the underlying biological system. A thorough understanding of the topological and dynamical properties of the biological system is crucial to explain the mechanisms of many complex diseases. Recently network-based approaches have provided a framework for integrating multi-dimensional biological data that results in a better understanding of the pathophysiological state of complex diseases. Here we provide a network-based framework to improve the diagnosis of complex diseases. This framework is based on the integration of transcriptomics and the interactome. We analyze the overlap between the differentially expressed (DE) genes and disease genes (DGs) based on their locations in the molecular interaction network (''interactome''). Disease genes and their protein products tend to be much more highly connected than random, hence defining a disease sub-graph (called disease module) in the interactome. DE genes, even though different from the known set of DGs, may be significantly associated with the disease when considering their closeness to the disease module in the interactome. This new network approach holds the promise to improve the diagnosis of patients who cannot be diagnosed using conventional tools. Support was provided by HL066289 and HL105339 grants from the U.S. National Institutes of Health.
Bando, Silvia Yumi; Silva, Filipi Nascimento; Costa, Luciano da Fontoura; Silva, Alexandre V.; Pimentel-Silva, Luciana R.; Castro, Luiz HM.; Wen, Hung-Tzu; Amaro, Edson; Moreira-Filho, Carlos Alberto
2013-01-01
We previously described – studying transcriptional signatures of hippocampal CA3 explants – that febrile (FS) and afebrile (NFS) forms of refractory mesial temporal lobe epilepsy constitute two distinct genomic phenotypes. That network analysis was based on a limited number (hundreds) of differentially expressed genes (DE networks) among a large set of valid transcripts (close to two tens of thousands). Here we developed a methodology for complex network visualization (3D) and analysis that allows the categorization of network nodes according to distinct hierarchical levels of gene-gene connections (node degree) and of interconnection between node neighbors (concentric node degree). Hubs are highly connected nodes, VIPs have low node degree but connect only with hubs, and high-hubs have VIP status and high overall number of connections. Studying the whole set of CA3 valid transcripts we: i) obtained complete transcriptional networks (CO) for FS and NFS phenotypic groups; ii) examined how CO and DE networks are related; iii) characterized genomic and molecular mechanisms underlying FS and NFS phenotypes, identifying potential novel targets for therapeutic interventions. We found that: i) DE hubs and VIPs are evenly distributed inside the CO networks; ii) most DE hubs and VIPs are related to synaptic transmission and neuronal excitability whereas most CO hubs, VIPs and high hubs are related to neuronal differentiation, homeostasis and neuroprotection, indicating compensatory mechanisms. Complex network visualization and analysis is a useful tool for systems biology approaches to multifactorial diseases. Network centrality observed for hubs, VIPs and high hubs of CO networks, is consistent with the network disease model, where a group of nodes whose perturbation leads to a disease phenotype occupies a central position in the network. Conceivably, the chance for exerting therapeutic effects through the modulation of particular genes will be higher if these genes are highly interconnected in transcriptional networks. PMID:24278214
Insights into the Ecology and Evolution of Polyploid Plants through Network Analysis.
Gallagher, Joseph P; Grover, Corrinne E; Hu, Guanjing; Wendel, Jonathan F
2016-06-01
Polyploidy is a widespread phenomenon throughout eukaryotes, with important ecological and evolutionary consequences. Although genes operate as components of complex pathways and networks, polyploid changes in genes and gene expression have typically been evaluated as either individual genes or as a part of broad-scale analyses. Network analysis has been fruitful in associating genomic and other 'omic'-based changes with phenotype for many systems. In polyploid species, network analysis has the potential not only to facilitate a better understanding of the complex 'omic' underpinnings of phenotypic and ecological traits common to polyploidy, but also to provide novel insight into the interaction among duplicated genes and genomes. This adds perspective to the global patterns of expression (and other 'omic') change that accompany polyploidy and to the patterns of recruitment and/or loss of genes following polyploidization. While network analysis in polyploid species faces challenges common to other analyses of duplicated genomes, present technologies combined with thoughtful experimental design provide a powerful system to explore polyploid evolution. Here, we demonstrate the utility and potential of network analysis to questions pertaining to polyploidy with an example involving evolution of the transgressively superior cotton fibres found in polyploid Gossypium hirsutum. By combining network analysis with prior knowledge, we provide further insights into the role of profilins in fibre domestication and exemplify the potential for network analysis in polyploid species. © 2016 John Wiley & Sons Ltd.
Estimation of Dynamic Systems for Gene Regulatory Networks from Dependent Time-Course Data.
Kim, Yoonji; Kim, Jaejik
2018-06-15
Dynamic system consisting of ordinary differential equations (ODEs) is a well-known tool for describing dynamic nature of gene regulatory networks (GRNs), and the dynamic features of GRNs are usually captured through time-course gene expression data. Owing to high-throughput technologies, time-course gene expression data have complex structures such as heteroscedasticity, correlations between genes, and time dependence. Since gene experiments typically yield highly noisy data with small sample size, for a more accurate prediction of the dynamics, the complex structures should be taken into account in ODE models. Hence, this study proposes an ODE model considering such data structures and a fast and stable estimation method for the ODE parameters based on the generalized profiling approach with data smoothing techniques. The proposed method also provides statistical inference for the ODE estimator and it is applied to a zebrafish retina cell network.
Ritchie, Marylyn D; White, Bill C; Parker, Joel S; Hahn, Lance W; Moore, Jason H
2003-01-01
Background Appropriate definition of neural network architecture prior to data analysis is crucial for successful data mining. This can be challenging when the underlying model of the data is unknown. The goal of this study was to determine whether optimizing neural network architecture using genetic programming as a machine learning strategy would improve the ability of neural networks to model and detect nonlinear interactions among genes in studies of common human diseases. Results Using simulated data, we show that a genetic programming optimized neural network approach is able to model gene-gene interactions as well as a traditional back propagation neural network. Furthermore, the genetic programming optimized neural network is better than the traditional back propagation neural network approach in terms of predictive ability and power to detect gene-gene interactions when non-functional polymorphisms are present. Conclusion This study suggests that a machine learning strategy for optimizing neural network architecture may be preferable to traditional trial-and-error approaches for the identification and characterization of gene-gene interactions in common, complex human diseases. PMID:12846935
Language Networks as Complex Systems
ERIC Educational Resources Information Center
Lee, Max Kueiming; Ou, Sheue-Jen
2008-01-01
Starting in the late eighties, with a growing discontent with analytical methods in science and the growing power of computers, researchers began to study complex systems such as living organisms, evolution of genes, biological systems, brain neural networks, epidemics, ecology, economy, social networks, etc. In the early nineties, the research…
Cellular and synaptic network defects in autism
Peça, João; Feng, Guoping
2012-01-01
Many candidate genes are now thought to confer susceptibility to autism spectrum disorder (ASD). Here we review four interrelated complexes, each composed of multiple families of genes that functionally coalesce on common cellular pathways. We illustrate a common thread in the organization of glutamatergic synapses and suggest a link between genes involved in Tuberous Sclerosis Complex, Fragile X syndrome, Angelman syndrome and several synaptic ASD candidate genes. When viewed in this context, progress in deciphering the molecular architecture of cellular protein-protein interactions together with the unraveling of synaptic dysfunction in neural networks may prove pivotal to advancing our understanding of ASDs. PMID:22440525
Wilczynski, Bartek; Furlong, Eileen E M
2010-04-15
Development is regulated by dynamic patterns of gene expression, which are orchestrated through the action of complex gene regulatory networks (GRNs). Substantial progress has been made in modeling transcriptional regulation in recent years, including qualitative "coarse-grain" models operating at the gene level to very "fine-grain" quantitative models operating at the biophysical "transcription factor-DNA level". Recent advances in genome-wide studies have revealed an enormous increase in the size and complexity or GRNs. Even relatively simple developmental processes can involve hundreds of regulatory molecules, with extensive interconnectivity and cooperative regulation. This leads to an explosion in the number of regulatory functions, effectively impeding Boolean-based qualitative modeling approaches. At the same time, the lack of information on the biophysical properties for the majority of transcription factors within a global network restricts quantitative approaches. In this review, we explore the current challenges in moving from modeling medium scale well-characterized networks to more poorly characterized global networks. We suggest to integrate coarse- and find-grain approaches to model gene regulatory networks in cis. We focus on two very well-studied examples from Drosophila, which likely represent typical developmental regulatory modules across metazoans. Copyright (c) 2009 Elsevier Inc. All rights reserved.
Detection of Significant Pneumococcal Meningitis Biomarkers by Ego Network.
Wang, Qian; Lou, Zhifeng; Zhai, Liansuo; Zhao, Haibin
2017-06-01
To identify significant biomarkers for detection of pneumococcal meningitis based on ego network. Based on the gene expression data of pneumococcal meningitis and global protein-protein interactions (PPIs) data recruited from open access databases, the authors constructed a differential co-expression network (DCN) to identify pneumococcal meningitis biomarkers in a network view. Here EgoNet algorithm was employed to screen the significant ego networks that could accurately distinguish pneumococcal meningitis from healthy controls, by sequentially seeking ego genes, searching candidate ego networks, refinement of candidate ego networks and significance analysis to identify ego networks. Finally, the functional inference of the ego networks was performed to identify significant pathways for pneumococcal meningitis. By differential co-expression analysis, the authors constructed the DCN that covered 1809 genes and 3689 interactions. From the DCN, a total of 90 ego genes were identified. Starting from these ego genes, three significant ego networks (Module 19, Module 70 and Module 71) that could predict clinical outcomes for pneumococcal meningitis were identified by EgoNet algorithm, and the corresponding ego genes were GMNN, MAD2L1 and TPX2, respectively. Pathway analysis showed that these three ego networks were related to CDT1 association with the CDC6:ORC:origin complex, inactivation of APC/C via direct inhibition of the APC/C complex pathway, and DNA strand elongation, respectively. The authors successfully screened three significant ego modules which could accurately predict the clinical outcomes for pneumococcal meningitis and might play important roles in host response to pathogen infection in pneumococcal meningitis.
Graph Curvature for Differentiating Cancer Networks
Sandhu, Romeil; Georgiou, Tryphon; Reznik, Ed; Zhu, Liangjia; Kolesov, Ivan; Senbabaoglu, Yasin; Tannenbaum, Allen
2015-01-01
Cellular interactions can be modeled as complex dynamical systems represented by weighted graphs. The functionality of such networks, including measures of robustness, reliability, performance, and efficiency, are intrinsically tied to the topology and geometry of the underlying graph. Utilizing recently proposed geometric notions of curvature on weighted graphs, we investigate the features of gene co-expression networks derived from large-scale genomic studies of cancer. We find that the curvature of these networks reliably distinguishes between cancer and normal samples, with cancer networks exhibiting higher curvature than their normal counterparts. We establish a quantitative relationship between our findings and prior investigations of network entropy. Furthermore, we demonstrate how our approach yields additional, non-trivial pair-wise (i.e. gene-gene) interactions which may be disrupted in cancer samples. The mathematical formulation of our approach yields an exact solution to calculating pair-wise changes in curvature which was computationally infeasible using prior methods. As such, our findings lay the foundation for an analytical approach to studying complex biological networks. PMID:26169480
Transcriptional Regulatory Network Analysis of MYB Transcription Factor Family Genes in Rice.
Smita, Shuchi; Katiyar, Amit; Chinnusamy, Viswanathan; Pandey, Dev M; Bansal, Kailash C
2015-01-01
MYB transcription factor (TF) is one of the largest TF families and regulates defense responses to various stresses, hormone signaling as well as many metabolic and developmental processes in plants. Understanding these regulatory hierarchies of gene expression networks in response to developmental and environmental cues is a major challenge due to the complex interactions between the genetic elements. Correlation analyses are useful to unravel co-regulated gene pairs governing biological process as well as identification of new candidate hub genes in response to these complex processes. High throughput expression profiling data are highly useful for construction of co-expression networks. In the present study, we utilized transcriptome data for comprehensive regulatory network studies of MYB TFs by "top-down" and "guide-gene" approaches. More than 50% of OsMYBs were strongly correlated under 50 experimental conditions with 51 hub genes via "top-down" approach. Further, clusters were identified using Markov Clustering (MCL). To maximize the clustering performance, parameter evaluation of the MCL inflation score (I) was performed in terms of enriched GO categories by measuring F-score. Comparison of co-expressed cluster and clads analyzed from phylogenetic analysis signifies their evolutionarily conserved co-regulatory role. We utilized compendium of known interaction and biological role with Gene Ontology enrichment analysis to hypothesize function of coexpressed OsMYBs. In the other part, the transcriptional regulatory network analysis by "guide-gene" approach revealed 40 putative targets of 26 OsMYB TF hubs with high correlation value utilizing 815 microarray data. The putative targets with MYB-binding cis-elements enrichment in their promoter region, functional co-occurrence as well as nuclear localization supports our finding. Specially, enrichment of MYB binding regions involved in drought-inducibility implying their regulatory role in drought response in rice. Thus, the co-regulatory network analysis facilitated the identification of complex OsMYB regulatory networks, and candidate target regulon genes of selected guide MYB genes. The results contribute to the candidate gene screening, and experimentally testable hypotheses for potential regulatory MYB TFs, and their targets under stress conditions.
Wang, Quan; Jia, Peilin; Cuenco, Karen T.; Feingold, Eleanor; Marazita, Mary L.; Wang, Lily; Zhao, Zhongming
2013-01-01
A number of genetic studies have suggested numerous susceptibility genes for dental caries over the past decade with few definite conclusions. The rapid accumulation of relevant information, along with the complex architecture of the disease, provides a challenging but also unique opportunity to review and integrate the heterogeneous data for follow-up validation and exploration. In this study, we collected and curated candidate genes from four major categories: association studies, linkage scans, gene expression analyses, and literature mining. Candidate genes were prioritized according to the magnitude of evidence related to dental caries. We then searched for dense modules enriched with the prioritized candidate genes through their protein-protein interactions (PPIs). We identified 23 modules comprising of 53 genes. Functional analyses of these 53 genes revealed three major clusters: cytokine network relevant genes, matrix metalloproteinases (MMPs) family, and transforming growth factor-beta (TGF-β) family, all of which have been previously implicated to play important roles in tooth development and carious lesions. Through our extensive data collection and an integrative application of gene prioritization and PPI network analyses, we built a dental caries-specific sub-network for the first time. Our study provided insights into the molecular mechanisms underlying dental caries. The framework we proposed in this work can be applied to other complex diseases. PMID:24146904
Degrees of separation as a statistical tool for evaluating candidate genes.
Nelson, Ronald M; Pettersson, Mats E
2014-12-01
Selection of candidate genes is an important step in the exploration of complex genetic architecture. The number of gene networks available is increasing and these can provide information to help with candidate gene selection. It is currently common to use the degree of connectedness in gene networks as validation in Genome Wide Association (GWA) and Quantitative Trait Locus (QTL) mapping studies. However, it can cause misleading results if not validated properly. Here we present a method and tool for validating the gene pairs from GWA studies given the context of the network they co-occur in. It ensures that proposed interactions and gene associations are not statistical artefacts inherent to the specific gene network architecture. The CandidateBacon package provides an easy and efficient method to calculate the average degree of separation (DoS) between pairs of genes to currently available gene networks. We show how these empirical estimates of average connectedness are used to validate candidate gene pairs. Validation of interacting genes by comparing their connectedness with the average connectedness in the gene network will provide support for said interactions by utilising the growing amount of gene network information available. Copyright © 2014 Elsevier Ltd. All rights reserved.
Back to the biology in systems biology: what can we learn from biomolecular networks?
Huang, Sui
2004-02-01
Genome-scale molecular networks, including protein interaction and gene regulatory networks, have taken centre stage in the investigation of the burgeoning disciplines of systems biology and biocomplexity. What do networks tell us? Some see in networks simply the comprehensive, detailed description of all cellular pathways, others seek in networks simple, higher-order qualities that emerge from the collective action of the individual pathways. This paper discusses networks from an encompassing category of thinking that will hopefully help readers to bridge the gap between these polarised viewpoints. Systems biology so far has emphasised the characterisation of large pathway maps. Now one has to ask: where is the actual biology in 'systems biology'? As structures midway between genome and phenome, and by serving as an 'extended genotype' or an 'elementary phenotype', molecular networks open a new window to the study of evolution and gene function in complex living systems. For the study of evolution, features in network topology offer a novel starting point for addressing the old debate on the relative contributions of natural selection versus intrinsic constraints to a particular trait. To study the function of genes, it is necessary not only to see them in the context of gene networks, but also to reach beyond describing network topology and to embrace the global dynamics of networks that will reveal higher-order, collective behaviour of the interacting genes. This will pave the way to understanding how the complexity of genome-wide molecular networks collapses to produce a robust whole-cell behaviour that manifests as tightly-regulated switching between distinct cell fates - the basis for multicellular life.
Waliszewski, P; Molski, M; Konarski, J
1998-06-01
A keystone of the molecular reductionist approach to cellular biology is a specific deductive strategy relating genotype to phenotype-two distinct categories. This relationship is based on the assumption that the intermediary cellular network of actively transcribed genes and their regulatory elements is deterministic (i.e., a link between expression of a gene and a phenotypic trait can always be identified, and evolution of the network in time is predetermined). However, experimental data suggest that the relationship between genotype and phenotype is nonbijective (i.e., a gene can contribute to the emergence of more than just one phenotypic trait or a phenotypic trait can be determined by expression of several genes). This implies nonlinearity (i.e., lack of the proportional relationship between input and the outcome), complexity (i.e. emergence of the hierarchical network of multiple cross-interacting elements that is sensitive to initial conditions, possesses multiple equilibria, organizes spontaneously into different morphological patterns, and is controlled in dispersed rather than centralized manner), and quasi-determinism (i.e., coexistence of deterministic and nondeterministic events) of the network. Nonlinearity within the space of the cellular molecular events underlies the existence of a fractal structure within a number of metabolic processes, and patterns of tissue growth, which is measured experimentally as a fractal dimension. Because of its complexity, the same phenotype can be associated with a number of alternative sequences of cellular events. Moreover, the primary cause initiating phenotypic evolution of cells such as malignant transformation can be favored probabilistically, but not identified unequivocally. Thermodynamic fluctuations of energy rather than gene mutations, the material traits of the fluctuations alter both the molecular and informational structure of the network. Then, the interplay between deterministic chaos, complexity, self-organization, and natural selection drives formation of malignant phenotype. This concept offers a novel perspective for investigation of tumorigenesis without invalidating current molecular findings. The essay integrates the ideas of the sciences of complexity in a biological context.
An iterative network partition algorithm for accurate identification of dense network modules
Sun, Siqi; Dong, Xinran; Fu, Yao; Tian, Weidong
2012-01-01
A key step in network analysis is to partition a complex network into dense modules. Currently, modularity is one of the most popular benefit functions used to partition network modules. However, recent studies suggested that it has an inherent limitation in detecting dense network modules. In this study, we observed that despite the limitation, modularity has the advantage of preserving the primary network structure of the undetected modules. Thus, we have developed a simple iterative Network Partition (iNP) algorithm to partition a network. The iNP algorithm provides a general framework in which any modularity-based algorithm can be implemented in the network partition step. Here, we tested iNP with three modularity-based algorithms: multi-step greedy (MSG), spectral clustering and Qcut. Compared with the original three methods, iNP achieved a significant improvement in the quality of network partition in a benchmark study with simulated networks, identified more modules with significantly better enrichment of functionally related genes in both yeast protein complex network and breast cancer gene co-expression network, and discovered more cancer-specific modules in the cancer gene co-expression network. As such, iNP should have a broad application as a general method to assist in the analysis of biological networks. PMID:22121225
Shannon, Casey P; Chen, Virginia; Takhar, Mandeep; Hollander, Zsuzsanna; Balshaw, Robert; McManus, Bruce M; Tebbutt, Scott J; Sin, Don D; Ng, Raymond T
2016-11-14
Gene network inference (GNI) algorithms can be used to identify sets of coordinately expressed genes, termed network modules from whole transcriptome gene expression data. The identification of such modules has become a popular approach to systems biology, with important applications in translational research. Although diverse computational and statistical approaches have been devised to identify such modules, their performance behavior is still not fully understood, particularly in complex human tissues. Given human heterogeneity, one important question is how the outputs of these computational methods are sensitive to the input sample set, or stability. A related question is how this sensitivity depends on the size of the sample set. We describe here the SABRE (Similarity Across Bootstrap RE-sampling) procedure for assessing the stability of gene network modules using a re-sampling strategy, introduce a novel criterion for identifying stable modules, and demonstrate the utility of this approach in a clinically-relevant cohort, using two different gene network module discovery algorithms. The stability of modules increased as sample size increased and stable modules were more likely to be replicated in larger sets of samples. Random modules derived from permutated gene expression data were consistently unstable, as assessed by SABRE, and provide a useful baseline value for our proposed stability criterion. Gene module sets identified by different algorithms varied with respect to their stability, as assessed by SABRE. Finally, stable modules were more readily annotated in various curated gene set databases. The SABRE procedure and proposed stability criterion may provide guidance when designing systems biology studies in complex human disease and tissues.
The Robustness of a Signaling Complex to Domain Rearrangements Facilitates Network Evolution
Sato, Paloma M.; Yoganathan, Kogulan; Jung, Jae H.; Peisajovich, Sergio G.
2014-01-01
The rearrangement of protein domains is known to have key roles in the evolution of signaling networks and, consequently, is a major tool used to synthetically rewire networks. However, natural mutational events leading to the creation of proteins with novel domain combinations, such as in frame fusions followed by domain loss, retrotranspositions, or translocations, to name a few, often simultaneously replace pre-existing genes. Thus, while proteins with new domain combinations may establish novel network connections, it is not clear how the concomitant deletions are tolerated. We investigated the mechanisms that enable signaling networks to tolerate domain rearrangement-mediated gene replacements. Using as a model system the yeast mitogen activated protein kinase (MAPK)-mediated mating pathway, we analyzed 92 domain-rearrangement events affecting 11 genes. Our results indicate that, while domain rearrangement events that result in the loss of catalytic activities within the signaling complex are not tolerated, domain rearrangements can drastically alter protein interactions without impairing function. This suggests that signaling complexes can maintain function even when some components are recruited to alternative sites within the complex. Furthermore, we also found that the ability of the complex to tolerate changes in interaction partners does not depend on long disordered linkers that often connect domains. Taken together, our results suggest that some signaling complexes are dynamic ensembles with loose spatial constraints that could be easily re-shaped by evolution and, therefore, are ideal targets for cellular engineering. PMID:25490747
Model-based design of RNA hybridization networks implemented in living cells
Rodrigo, Guillermo; Prakash, Satya; Shen, Shensi; Majer, Eszter
2017-01-01
Abstract Synthetic gene circuits allow the behavior of living cells to be reprogrammed, and non-coding small RNAs (sRNAs) are increasingly being used as programmable regulators of gene expression. However, sRNAs (natural or synthetic) are generally used to regulate single target genes, while complex dynamic behaviors would require networks of sRNAs regulating each other. Here, we report a strategy for implementing such networks that exploits hybridization reactions carried out exclusively by multifaceted sRNAs that are both targets of and triggers for other sRNAs. These networks are ultimately coupled to the control of gene expression. We relied on a thermodynamic model of the different stable conformational states underlying this system at the nucleotide level. To test our model, we designed five different RNA hybridization networks with a linear architecture, and we implemented them in Escherichia coli. We validated the network architecture at the molecular level by native polyacrylamide gel electrophoresis, as well as the network function at the bacterial population and single-cell levels with a fluorescent reporter. Our results suggest that it is possible to engineer complex cellular programs based on RNA from first principles. Because these networks are mainly based on physical interactions, our designs could be expanded to other organisms as portable regulatory resources or to implement biological computations. PMID:28934501
Bauer-Mehren, Anna; Bundschus, Markus; Rautschka, Michael; Mayer, Miguel A.; Sanz, Ferran; Furlong, Laura I.
2011-01-01
Background Scientists have been trying to understand the molecular mechanisms of diseases to design preventive and therapeutic strategies for a long time. For some diseases, it has become evident that it is not enough to obtain a catalogue of the disease-related genes but to uncover how disruptions of molecular networks in the cell give rise to disease phenotypes. Moreover, with the unprecedented wealth of information available, even obtaining such catalogue is extremely difficult. Principal Findings We developed a comprehensive gene-disease association database by integrating associations from several sources that cover different biomedical aspects of diseases. In particular, we focus on the current knowledge of human genetic diseases including mendelian, complex and environmental diseases. To assess the concept of modularity of human diseases, we performed a systematic study of the emergent properties of human gene-disease networks by means of network topology and functional annotation analysis. The results indicate a highly shared genetic origin of human diseases and show that for most diseases, including mendelian, complex and environmental diseases, functional modules exist. Moreover, a core set of biological pathways is found to be associated with most human diseases. We obtained similar results when studying clusters of diseases, suggesting that related diseases might arise due to dysfunction of common biological processes in the cell. Conclusions For the first time, we include mendelian, complex and environmental diseases in an integrated gene-disease association database and show that the concept of modularity applies for all of them. We furthermore provide a functional analysis of disease-related modules providing important new biological insights, which might not be discovered when considering each of the gene-disease association repositories independently. Hence, we present a suitable framework for the study of how genetic and environmental factors, such as drugs, contribute to diseases. Availability The gene-disease networks used in this study and part of the analysis are available at http://ibi.imim.es/DisGeNET/DisGeNETweb.html#Download. PMID:21695124
Bauer-Mehren, Anna; Bundschus, Markus; Rautschka, Michael; Mayer, Miguel A; Sanz, Ferran; Furlong, Laura I
2011-01-01
Scientists have been trying to understand the molecular mechanisms of diseases to design preventive and therapeutic strategies for a long time. For some diseases, it has become evident that it is not enough to obtain a catalogue of the disease-related genes but to uncover how disruptions of molecular networks in the cell give rise to disease phenotypes. Moreover, with the unprecedented wealth of information available, even obtaining such catalogue is extremely difficult. We developed a comprehensive gene-disease association database by integrating associations from several sources that cover different biomedical aspects of diseases. In particular, we focus on the current knowledge of human genetic diseases including mendelian, complex and environmental diseases. To assess the concept of modularity of human diseases, we performed a systematic study of the emergent properties of human gene-disease networks by means of network topology and functional annotation analysis. The results indicate a highly shared genetic origin of human diseases and show that for most diseases, including mendelian, complex and environmental diseases, functional modules exist. Moreover, a core set of biological pathways is found to be associated with most human diseases. We obtained similar results when studying clusters of diseases, suggesting that related diseases might arise due to dysfunction of common biological processes in the cell. For the first time, we include mendelian, complex and environmental diseases in an integrated gene-disease association database and show that the concept of modularity applies for all of them. We furthermore provide a functional analysis of disease-related modules providing important new biological insights, which might not be discovered when considering each of the gene-disease association repositories independently. Hence, we present a suitable framework for the study of how genetic and environmental factors, such as drugs, contribute to diseases. The gene-disease networks used in this study and part of the analysis are available at http://ibi.imim.es/DisGeNET/DisGeNETweb.html#Download.
Diversified Control Paths: A Significant Way Disease Genes Perturb the Human Regulatory Network
Wang, Bingbo; Gao, Lin; Zhang, Qingfang; Li, Aimin; Deng, Yue; Guo, Xingli
2015-01-01
Background The complexity of biological systems motivates us to use the underlying networks to provide deep understanding of disease etiology and the human diseases are viewed as perturbations of dynamic properties of networks. Control theory that deals with dynamic systems has been successfully used to capture systems-level knowledge in large amount of quantitative biological interactions. But from the perspective of system control, the ways by which multiple genetic factors jointly perturb a disease phenotype still remain. Results In this work, we combine tools from control theory and network science to address the diversified control paths in complex networks. Then the ways by which the disease genes perturb biological systems are identified and quantified by the control paths in a human regulatory network. Furthermore, as an application, prioritization of candidate genes is presented by use of control path analysis and gene ontology annotation for definition of similarities. We use leave-one-out cross-validation to evaluate the ability of finding the gene-disease relationship. Results have shown compatible performance with previous sophisticated works, especially in directed systems. Conclusions Our results inspire a deeper understanding of molecular mechanisms that drive pathological processes. Diversified control paths offer a basis for integrated intervention techniques which will ultimately lead to the development of novel therapeutic strategies. PMID:26284649
Wang, Feng; Liang, Yuting; Jiang, Yuji; Yang, Yunfeng; Xue, Kai; Xiong, Jinbo; Zhou, Jizhong; Sun, Bo
2015-01-01
Plants have an important impact on soil microbial communities and their functions. However, how plants determine the microbial composition and network interactions is still poorly understood. During a four-year field experiment, we investigated the functional gene composition of three types of soils (Phaeozem, Cambisols and Acrisol) under maize planting and bare fallow regimes located in cold temperate, warm temperate and subtropical regions, respectively. The core genes were identified using high-throughput functional gene microarray (GeoChip 3.0), and functional molecular ecological networks (fMENs) were subsequently developed with the random matrix theory (RMT)-based conceptual framework. Our results demonstrated that planting significantly (P < 0.05) increased the gene alpha-diversity in terms of richness and Shannon – Simpson’s indexes for all three types of soils and 83.5% of microbial alpha-diversity can be explained by the plant factor. Moreover, planting had significant impacts on the microbial community structure and the network interactions of the microbial communities. The calculated network complexity was higher under maize planting than under bare fallow regimes. The increase of the functional genes led to an increase in both soil respiration and nitrification potential with maize planting, indicating that changes in the soil microbial communities and network interactions influenced ecological functioning. PMID:26396042
Wu, Siqi; Joseph, Antony; Hammonds, Ann S; Celniker, Susan E; Yu, Bin; Frise, Erwin
2016-04-19
Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set ofDrosophilaearly embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identified 21 principal patterns (PP). Providing a compact yet biologically interpretable representation ofDrosophilaexpression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. The performance of PP with theDrosophiladata suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.
Intrinsic noise and deviations from criticality in Boolean gene-regulatory networks
NASA Astrophysics Data System (ADS)
Villegas, Pablo; Ruiz-Franco, José; Hidalgo, Jorge; Muñoz, Miguel A.
2016-10-01
Gene regulatory networks can be successfully modeled as Boolean networks. A much discussed hypothesis says that such model networks reproduce empirical findings the best if they are tuned to operate at criticality, i.e. at the borderline between their ordered and disordered phases. Critical networks have been argued to lead to a number of functional advantages such as maximal dynamical range, maximal sensitivity to environmental changes, as well as to an excellent tradeoff between stability and flexibility. Here, we study the effect of noise within the context of Boolean networks trained to learn complex tasks under supervision. We verify that quasi-critical networks are the ones learning in the fastest possible way -even for asynchronous updating rules- and that the larger the task complexity the smaller the distance to criticality. On the other hand, when additional sources of intrinsic noise in the network states and/or in its wiring pattern are introduced, the optimally performing networks become clearly subcritical. These results suggest that in order to compensate for inherent stochasticity, regulatory and other type of biological networks might become subcritical rather than being critical, all the most if the task to be performed has limited complexity.
Guthke, Reinhard; Möller, Ulrich; Hoffmann, Martin; Thies, Frank; Töpfer, Susanne
2005-04-15
The immune response to bacterial infection represents a complex network of dynamic gene and protein interactions. We present an optimized reverse engineering strategy aimed at a reconstruction of this kind of interaction networks. The proposed approach is based on both microarray data and available biological knowledge. The main kinetics of the immune response were identified by fuzzy clustering of gene expression profiles (time series). The number of clusters was optimized using various evaluation criteria. For each cluster a representative gene with a high fuzzy-membership was chosen in accordance with available physiological knowledge. Then hypothetical network structures were identified by seeking systems of ordinary differential equations, whose simulated kinetics could fit the gene expression profiles of the cluster-representative genes. For the construction of hypothetical network structures singular value decomposition (SVD) based methods and a newly introduced heuristic Network Generation Method here were compared. It turned out that the proposed novel method could find sparser networks and gave better fits to the experimental data. Reinhard.Guthke@hki-jena.de.
Systems-level analysis of risk genes reveals the modular nature of schizophrenia.
Liu, Jiewei; Li, Ming; Luo, Xiong-Jian; Su, Bing
2018-05-19
Schizophrenia (SCZ) is a complex mental disorder with high heritability. Genetic studies (especially recent genome-wide association studies) have identified many risk genes for schizophrenia. However, the physical interactions among the proteins encoded by schizophrenia risk genes remain elusive and it is not known whether the identified risk genes converge on common molecular networks or pathways. Here we systematically investigated the network characteristics of schizophrenia risk genes using the high-confidence protein-protein interactions (PPI) from the human interactome. We found that schizophrenia risk genes encode a densely interconnected PPI network (P = 4.15 × 10 -31 ). Compared with the background genes, the schizophrenia risk genes in the interactome have significantly higher degree (P = 5.39 × 10 -11 ), closeness centrality (P = 7.56 × 10 -11 ), betweeness centrality (P = 1.29 × 10 -11 ), clustering coefficient (P = 2.22 × 10 -2 ), and shorter average shortest path length (P = 7.56 × 10 -11 ). Based on the densely interconnected PPI network, we identified 48 hub genes and 4 modules formed by highly interconnected schizophrenia genes. We showed that the proteins encoded by schizophrenia hub genes have significantly more direct physical interactions. Gene ontology (GO) analysis revealed that cell adhesion, cell cycle, immune system response, and GABR-receptor complex categories were enriched in the modules formed by highly interconnected schizophrenia risk genes. Our study reveals that schizophrenia risk genes encode a densely interconnected molecular network and demonstrates the modular nature of schizophrenia. Copyright © 2018 Elsevier B.V. All rights reserved.
Trainable Gene Regulation Networks with Applications to Drosophila Pattern Formation
NASA Technical Reports Server (NTRS)
Mjolsness, Eric
2000-01-01
This chapter will very briefly introduce and review some computational experiments in using trainable gene regulation network models to simulate and understand selected episodes in the development of the fruit fly, Drosophila melanogaster. For details the reader is referred to the papers introduced below. It will then introduce a new gene regulation network model which can describe promoter-level substructure in gene regulation. As described in chapter 2, gene regulation may be thought of as a combination of cis-acting regulation by the extended promoter of a gene (including all regulatory sequences) by way of the transcription complex, and of trans-acting regulation by the transcription factor products of other genes. If we simplify the cis-action by using a phenomenological model which can be tuned to data, such as a unit or other small portion of an artificial neural network, then the full transacting interaction between multiple genes during development can be modelled as a larger network which can again be tuned or trained to data. The larger network will in general need to have recurrent (feedback) connections since at least some real gene regulation networks do. This is the basic modeling approach taken, which describes how a set of recurrent neural networks can be used as a modeling language for multiple developmental processes including gene regulation within a single cell, cell-cell communication, and cell division. Such network models have been called "gene circuits", "gene regulation networks", or "genetic regulatory networks", sometimes without distinguishing the models from the actual modeled systems.
Modularity and evolutionary constraints in a baculovirus gene regulatory network
2013-01-01
Background The structure of regulatory networks remains an open question in our understanding of complex biological systems. Interactions during complete viral life cycles present unique opportunities to understand how host-parasite network take shape and behave. The Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is a large double-stranded DNA virus, whose genome may encode for 152 open reading frames (ORFs). Here we present the analysis of the ordered cascade of the AgMNPV gene expression. Results We observed an earlier onset of the expression than previously reported for other baculoviruses, especially for genes involved in DNA replication. Most ORFs were expressed at higher levels in a more permissive host cell line. Genes with more than one copy in the genome had distinct expression profiles, which could indicate the acquisition of new functionalities. The transcription gene regulatory network (GRN) for 149 ORFs had a modular topology comprising five communities of highly interconnected nodes that separated key genes that are functionally related on different communities, possibly maximizing redundancy and GRN robustness by compartmentalization of important functions. Core conserved functions showed expression synchronicity, distinct GRN features and significantly less genetic diversity, consistent with evolutionary constraints imposed in key elements of biological systems. This reduced genetic diversity also had a positive correlation with the importance of the gene in our estimated GRN, supporting a relationship between phylogenetic data of baculovirus genes and network features inferred from expression data. We also observed that gene arrangement in overlapping transcripts was conserved among related baculoviruses, suggesting a principle of genome organization. Conclusions Albeit with a reduced number of nodes (149), the AgMNPV GRN had a topology and key characteristics similar to those observed in complex cellular organisms, which indicates that modularity may be a general feature of biological gene regulatory networks. PMID:24006890
Modeling transcriptional networks regulating secondary growth and wood formation in forest trees
Lijun Liu; Vladimir Filkov; Andrew Groover
2013-01-01
The complex interactions among the genes that underlie a biological process can be modeled and presented as a transcriptional network, in which genes (nodes) and their interactions (edges) are shown in a graphical form similar to a wiring diagram. A large number of genes have been identified that are expressed during the radial woody growth of tree stems (secondary...
Differential network entropy reveals cancer system hallmarks
West, James; Bianconi, Ginestra; Severini, Simone; Teschendorff, Andrew E.
2012-01-01
The cellular phenotype is described by a complex network of molecular interactions. Elucidating network properties that distinguish disease from the healthy cellular state is therefore of critical importance for gaining systems-level insights into disease mechanisms and ultimately for developing improved therapies. By integrating gene expression data with a protein interaction network we here demonstrate that cancer cells are characterised by an increase in network entropy. In addition, we formally demonstrate that gene expression differences between normal and cancer tissue are anticorrelated with local network entropy changes, thus providing a systemic link between gene expression changes at the nodes and their local correlation patterns. In particular, we find that genes which drive cell-proliferation in cancer cells and which often encode oncogenes are associated with reductions in network entropy. These findings may have potential implications for identifying novel drug targets. PMID:23150773
BoolNet--an R package for generation, reconstruction and analysis of Boolean networks.
Müssel, Christoph; Hopfensitz, Martin; Kestler, Hans A
2010-05-15
As the study of information processing in living cells moves from individual pathways to complex regulatory networks, mathematical models and simulation become indispensable tools for analyzing the complex behavior of such networks and can provide deep insights into the functioning of cells. The dynamics of gene expression, for example, can be modeled with Boolean networks (BNs). These are mathematical models of low complexity, but have the advantage of being able to capture essential properties of gene-regulatory networks. However, current implementations of BNs only focus on different sub-aspects of this model and do not allow for a seamless integration into existing preprocessing pipelines. BoolNet efficiently integrates methods for synchronous, asynchronous and probabilistic BNs. This includes reconstructing networks from time series, generating random networks, robustness analysis via perturbation, Markov chain simulations, and identification and visualization of attractors. The package BoolNet is freely available from the R project at http://cran.r-project.org/ or http://www.informatik.uni-ulm.de/ni/mitarbeiter/HKestler/boolnet/ under Artistic License 2.0. hans.kestler@uni-ulm.de Supplementary data are available at Bioinformatics online.
Hansen, Bjoern Oest; Meyer, Etienne H; Ferrari, Camilla; Vaid, Neha; Movahedi, Sara; Vandepoele, Klaas; Nikoloski, Zoran; Mutwil, Marek
2018-03-01
Recent advances in gene function prediction rely on ensemble approaches that integrate results from multiple inference methods to produce superior predictions. Yet, these developments remain largely unexplored in plants. We have explored and compared two methods to integrate 10 gene co-function networks for Arabidopsis thaliana and demonstrate how the integration of these networks produces more accurate gene function predictions for a larger fraction of genes with unknown function. These predictions were used to identify genes involved in mitochondrial complex I formation, and for five of them, we confirmed the predictions experimentally. The ensemble predictions are provided as a user-friendly online database, EnsembleNet. The methods presented here demonstrate that ensemble gene function prediction is a powerful method to boost prediction performance, whereas the EnsembleNet database provides a cutting-edge community tool to guide experimentalists. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Shadows of complexity: what biological networks reveal about epistasis and pleiotropy
Tyler, Anna L.; Asselbergs, Folkert W.; Williams, Scott M.; Moore, Jason H.
2011-01-01
Pleiotropy, in which one mutation causes multiple phenotypes, has traditionally been seen as a deviation from the conventional observation in which one gene affects one phenotype. Epistasis, or gene-gene interaction, has also been treated as an exception to the Mendelian one gene-one phenotype paradigm. This simplified perspective belies the pervasive complexity of biology and hinders progress toward a deeper understanding of biological systems. We assert that epistasis and pleiotropy are not isolated occurrences, but ubiquitous and inherent properties of biomolecular networks. These phenomena should not be treated as exceptions, but rather as fundamental components of genetic analyses. A systems level understanding of epistasis and pleiotropy is, therefore, critical to furthering our understanding of human genetics and its contribution to common human disease. Finally, graph theory offers an intuitive and powerful set of tools with which to study the network bases of these important genetic phenomena. PMID:19204994
Integration of multi-omics data for integrative gene regulatory network inference.
Zarayeneh, Neda; Ko, Euiseong; Oh, Jung Hun; Suh, Sang; Liu, Chunyu; Gao, Jean; Kim, Donghyun; Kang, Mingon
2017-01-01
Gene regulatory networks provide comprehensive insights and indepth understanding of complex biological processes. The molecular interactions of gene regulatory networks are inferred from a single type of genomic data, e.g., gene expression data in most research. However, gene expression is a product of sequential interactions of multiple biological processes, such as DNA sequence variations, copy number variations, histone modifications, transcription factors, and DNA methylations. The recent rapid advances of high-throughput omics technologies enable one to measure multiple types of omics data, called 'multi-omics data', that represent the various biological processes. In this paper, we propose an Integrative Gene Regulatory Network inference method (iGRN) that incorporates multi-omics data and their interactions in gene regulatory networks. In addition to gene expressions, copy number variations and DNA methylations were considered for multi-omics data in this paper. The intensive experiments were carried out with simulation data, where iGRN's capability that infers the integrative gene regulatory network is assessed. Through the experiments, iGRN shows its better performance on model representation and interpretation than other integrative methods in gene regulatory network inference. iGRN was also applied to a human brain dataset of psychiatric disorders, and the biological network of psychiatric disorders was analysed.
Integration of multi-omics data for integrative gene regulatory network inference
Zarayeneh, Neda; Ko, Euiseong; Oh, Jung Hun; Suh, Sang; Liu, Chunyu; Gao, Jean; Kim, Donghyun
2017-01-01
Gene regulatory networks provide comprehensive insights and indepth understanding of complex biological processes. The molecular interactions of gene regulatory networks are inferred from a single type of genomic data, e.g., gene expression data in most research. However, gene expression is a product of sequential interactions of multiple biological processes, such as DNA sequence variations, copy number variations, histone modifications, transcription factors, and DNA methylations. The recent rapid advances of high-throughput omics technologies enable one to measure multiple types of omics data, called ‘multi-omics data’, that represent the various biological processes. In this paper, we propose an Integrative Gene Regulatory Network inference method (iGRN) that incorporates multi-omics data and their interactions in gene regulatory networks. In addition to gene expressions, copy number variations and DNA methylations were considered for multi-omics data in this paper. The intensive experiments were carried out with simulation data, where iGRN’s capability that infers the integrative gene regulatory network is assessed. Through the experiments, iGRN shows its better performance on model representation and interpretation than other integrative methods in gene regulatory network inference. iGRN was also applied to a human brain dataset of psychiatric disorders, and the biological network of psychiatric disorders was analysed. PMID:29354189
Ficklin, Stephen P; Feltus, Frank Alex
2013-01-01
Many traits of biological and agronomic significance in plants are controlled in a complex manner where multiple genes and environmental signals affect the expression of the phenotype. In Oryza sativa (rice), thousands of quantitative genetic signals have been mapped to the rice genome. In parallel, thousands of gene expression profiles have been generated across many experimental conditions. Through the discovery of networks with real gene co-expression relationships, it is possible to identify co-localized genetic and gene expression signals that implicate complex genotype-phenotype relationships. In this work, we used a knowledge-independent, systems genetics approach, to discover a high-quality set of co-expression networks, termed Gene Interaction Layers (GILs). Twenty-two GILs were constructed from 1,306 Affymetrix microarray rice expression profiles that were pre-clustered to allow for improved capture of gene co-expression relationships. Functional genomic and genetic data, including over 8,000 QTLs and 766 phenotype-tagged SNPs (p-value < = 0.001) from genome-wide association studies, both covering over 230 different rice traits were integrated with the GILs. An online systems genetics data-mining resource, the GeneNet Engine, was constructed to enable dynamic discovery of gene sets (i.e. network modules) that overlap with genetic traits. GeneNet Engine does not provide the exact set of genes underlying a given complex trait, but through the evidence of gene-marker correspondence, co-expression, and functional enrichment, site visitors can identify genes with potential shared causality for a trait which could then be used for experimental validation. A set of 2 million SNPs was incorporated into the database and serve as a potential set of testable biomarkers for genes in modules that overlap with genetic traits. Herein, we describe two modules found using GeneNet Engine, one with significant overlap with the trait amylose content and another with significant overlap with blast disease resistance.
Ficklin, Stephen P.; Feltus, Frank Alex
2013-01-01
Many traits of biological and agronomic significance in plants are controlled in a complex manner where multiple genes and environmental signals affect the expression of the phenotype. In Oryza sativa (rice), thousands of quantitative genetic signals have been mapped to the rice genome. In parallel, thousands of gene expression profiles have been generated across many experimental conditions. Through the discovery of networks with real gene co-expression relationships, it is possible to identify co-localized genetic and gene expression signals that implicate complex genotype-phenotype relationships. In this work, we used a knowledge-independent, systems genetics approach, to discover a high-quality set of co-expression networks, termed Gene Interaction Layers (GILs). Twenty-two GILs were constructed from 1,306 Affymetrix microarray rice expression profiles that were pre-clustered to allow for improved capture of gene co-expression relationships. Functional genomic and genetic data, including over 8,000 QTLs and 766 phenotype-tagged SNPs (p-value < = 0.001) from genome-wide association studies, both covering over 230 different rice traits were integrated with the GILs. An online systems genetics data-mining resource, the GeneNet Engine, was constructed to enable dynamic discovery of gene sets (i.e. network modules) that overlap with genetic traits. GeneNet Engine does not provide the exact set of genes underlying a given complex trait, but through the evidence of gene-marker correspondence, co-expression, and functional enrichment, site visitors can identify genes with potential shared causality for a trait which could then be used for experimental validation. A set of 2 million SNPs was incorporated into the database and serve as a potential set of testable biomarkers for genes in modules that overlap with genetic traits. Herein, we describe two modules found using GeneNet Engine, one with significant overlap with the trait amylose content and another with significant overlap with blast disease resistance. PMID:23874666
Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling
Li, Xia; Rao, Shaoqi; Jiang, Wei; Li, Chuanxing; Xiao, Yun; Guo, Zheng; Zhang, Qingpu; Wang, Lihong; Du, Lei; Li, Jing; Li, Li; Zhang, Tianwen; Wang, Qing K
2006-01-01
Background It is one of the ultimate goals for modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasing available, which provides the golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex gene regulations related to the development, aging and progressive pathogenesis of a complex disease where potential dependences between different experiment units might occurs. PMID:16420705
Discover mouse gene coexpression landscapes using dictionary learning and sparse coding.
Li, Yujie; Chen, Hanbo; Jiang, Xi; Li, Xiang; Lv, Jinglei; Peng, Hanchuan; Tsien, Joe Z; Liu, Tianming
2017-12-01
Gene coexpression patterns carry rich information regarding enormously complex brain structures and functions. Characterization of these patterns in an unbiased, integrated, and anatomically comprehensive manner will illuminate the higher-order transcriptome organization and offer genetic foundations of functional circuitry. Here using dictionary learning and sparse coding, we derived coexpression networks from the space-resolved anatomical comprehensive in situ hybridization data from Allen Mouse Brain Atlas dataset. The key idea is that if two genes use the same dictionary to represent their original signals, then their gene expressions must share similar patterns, thereby considering them as "coexpressed." For each network, we have simultaneous knowledge of spatial distributions, the genes in the network and the extent a particular gene conforms to the coexpression pattern. Gene ontologies and the comparisons with published gene lists reveal biologically identified coexpression networks, some of which correspond to major cell types, biological pathways, and/or anatomical regions.
On construction of stochastic genetic networks based on gene expression sequences.
Ching, Wai-Ki; Ng, Michael M; Fung, Eric S; Akutsu, Tatsuya
2005-08-01
Reconstruction of genetic regulatory networks from time series data of gene expression patterns is an important research topic in bioinformatics. Probabilistic Boolean Networks (PBNs) have been proposed as an effective model for gene regulatory networks. PBNs are able to cope with uncertainty, corporate rule-based dependencies between genes and discover the sensitivity of genes in their interactions with other genes. However, PBNs are unlikely to use directly in practice because of huge amount of computational cost for obtaining predictors and their corresponding probabilities. In this paper, we propose a multivariate Markov model for approximating PBNs and describing the dynamics of a genetic network for gene expression sequences. The main contribution of the new model is to preserve the strength of PBNs and reduce the complexity of the networks. The number of parameters of our proposed model is O(n2) where n is the number of genes involved. We also develop efficient estimation methods for solving the model parameters. Numerical examples on synthetic data sets and practical yeast data sequences are given to demonstrate the effectiveness of the proposed model.
Li, Yongsheng; Sahni, Nidhi; Yi, Song
2016-11-29
Comprehensive understanding of human cancer mechanisms requires the identification of a thorough list of cancer-associated genes, which could serve as biomarkers for diagnoses and therapies in various types of cancer. Although substantial progress has been made in functional studies to uncover genes involved in cancer, these efforts are often time-consuming and costly. Therefore, it remains challenging to comprehensively identify cancer candidate genes. Network-based methods have accelerated this process through the analysis of complex molecular interactions in the cell. However, the extent to which various interactome networks can contribute to prediction of candidate genes responsible for cancer is still enigmatic. In this study, we evaluated different human protein-protein interactome networks and compared their application to cancer gene prioritization. Our results indicate that network analyses can increase the power to identify novel cancer genes. In particular, such predictive power can be enhanced with the use of unbiased systematic protein interaction maps for cancer gene prioritization. Functional analysis reveals that the top ranked genes from network predictions co-occur often with cancer-related terms in literature, and further, these candidate genes are indeed frequently mutated across cancers. Finally, our study suggests that integrating interactome networks with other omics datasets could provide novel insights into cancer-associated genes and underlying molecular mechanisms.
Model-based design of RNA hybridization networks implemented in living cells.
Rodrigo, Guillermo; Prakash, Satya; Shen, Shensi; Majer, Eszter; Daròs, José-Antonio; Jaramillo, Alfonso
2017-09-19
Synthetic gene circuits allow the behavior of living cells to be reprogrammed, and non-coding small RNAs (sRNAs) are increasingly being used as programmable regulators of gene expression. However, sRNAs (natural or synthetic) are generally used to regulate single target genes, while complex dynamic behaviors would require networks of sRNAs regulating each other. Here, we report a strategy for implementing such networks that exploits hybridization reactions carried out exclusively by multifaceted sRNAs that are both targets of and triggers for other sRNAs. These networks are ultimately coupled to the control of gene expression. We relied on a thermodynamic model of the different stable conformational states underlying this system at the nucleotide level. To test our model, we designed five different RNA hybridization networks with a linear architecture, and we implemented them in Escherichia coli. We validated the network architecture at the molecular level by native polyacrylamide gel electrophoresis, as well as the network function at the bacterial population and single-cell levels with a fluorescent reporter. Our results suggest that it is possible to engineer complex cellular programs based on RNA from first principles. Because these networks are mainly based on physical interactions, our designs could be expanded to other organisms as portable regulatory resources or to implement biological computations. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Yang, Xiaofei; Gao, Lin; Guo, Xingli; Shi, Xinghua; Wu, Hao; Song, Fei; Wang, Bingbo
2014-01-01
Increasing evidence has indicated that long non-coding RNAs (lncRNAs) are implicated in and associated with many complex human diseases. Despite of the accumulation of lncRNA-disease associations, only a few studies had studied the roles of these associations in pathogenesis. In this paper, we investigated lncRNA-disease associations from a network view to understand the contribution of these lncRNAs to complex diseases. Specifically, we studied both the properties of the diseases in which the lncRNAs were implicated, and that of the lncRNAs associated with complex diseases. Regarding the fact that protein coding genes and lncRNAs are involved in human diseases, we constructed a coding-non-coding gene-disease bipartite network based on known associations between diseases and disease-causing genes. We then applied a propagation algorithm to uncover the hidden lncRNA-disease associations in this network. The algorithm was evaluated by leave-one-out cross validation on 103 diseases in which at least two genes were known to be involved, and achieved an AUC of 0.7881. Our algorithm successfully predicted 768 potential lncRNA-disease associations between 66 lncRNAs and 193 diseases. Furthermore, our results for Alzheimer's disease, pancreatic cancer, and gastric cancer were verified by other independent studies. PMID:24498199
Deciphering the Interdependence between Ecological and Evolutionary Networks.
Melián, Carlos J; Matthews, Blake; de Andreazzi, Cecilia S; Rodríguez, Jorge P; Harmon, Luke J; Fortuna, Miguel A
2018-05-24
Biological systems consist of elements that interact within and across hierarchical levels. For example, interactions among genes determine traits of individuals, competitive and cooperative interactions among individuals influence population dynamics, and interactions among species affect the dynamics of communities and ecosystem processes. Such systems can be represented as hierarchical networks, but can have complex dynamics when interdependencies among levels of the hierarchy occur. We propose integrating ecological and evolutionary processes in hierarchical networks to explore interdependencies in biological systems. We connect gene networks underlying predator-prey trait distributions to food webs. Our approach addresses longstanding questions about how complex traits and intraspecific trait variation affect the interdependencies among biological levels and the stability of meta-ecosystems. Copyright © 2018 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Siqi; Joseph, Antony; Hammonds, Ann S.
Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set of Drosophila early embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identifiedmore » 21 principal patterns (PP). Providing a compact yet biologically interpretable representation of Drosophila expression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. In conclusion, the performance of PP with the Drosophila data suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.« less
Wu, Siqi; Joseph, Antony; Hammonds, Ann S.; ...
2016-04-06
Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set of Drosophila early embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identifiedmore » 21 principal patterns (PP). Providing a compact yet biologically interpretable representation of Drosophila expression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. In conclusion, the performance of PP with the Drosophila data suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.« less
Ficklin, Stephen P.; Luo, Feng; Feltus, F. Alex
2010-01-01
Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes. PMID:20668062
Ficklin, Stephen P; Luo, Feng; Feltus, F Alex
2010-09-01
Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes.
Prioritization of Epilepsy Associated Candidate Genes by Convergent Analysis
Jia, Peilin; Ewers, Jeffrey M.; Zhao, Zhongming
2011-01-01
Background Epilepsy is a severe neurological disorder affecting a large number of individuals, yet the underlying genetic risk factors for epilepsy remain unclear. Recent studies have revealed several recurrent copy number variations (CNVs) that are more likely to be associated with epilepsy. The responsible gene(s) within these regions have yet to be definitively linked to the disorder, and the implications of their interactions are not fully understood. Identification of these genes may contribute to a better pathological understanding of epilepsy, and serve to implicate novel therapeutic targets for further research. Methodology/Principal Findings In this study, we examined genes within heterozygous deletion regions identified in a recent large-scale study, encompassing a diverse spectrum of epileptic syndromes. By integrating additional protein-protein interaction data, we constructed subnetworks for these CNV-region genes and also those previously studied for epilepsy. We observed 20 genes common to both networks, primarily concentrated within a small molecular network populated by GABA receptor, BDNF/MAPK signaling, and estrogen receptor genes. From among the hundreds of genes in the initial networks, these were designated by convergent evidence for their likely association with epilepsy. Importantly, the identified molecular network was found to contain complex interrelationships, providing further insight into epilepsy's underlying pathology. We further performed pathway enrichment and crosstalk analysis and revealed a functional map which indicates the significant enrichment of closely related neurological, immune, and kinase regulatory pathways. Conclusions/Significance The convergent framework we proposed here provides a unique and powerful approach to screening and identifying promising disease genes out of typically hundreds to thousands of genes in disease-related CNV-regions. Our network and pathway analysis provides important implications for the underlying molecular mechanisms for epilepsy. The strategy can be applied for the study of other complex diseases. PMID:21390307
Prioritization of epilepsy associated candidate genes by convergent analysis.
Jia, Peilin; Ewers, Jeffrey M; Zhao, Zhongming
2011-02-24
Epilepsy is a severe neurological disorder affecting a large number of individuals, yet the underlying genetic risk factors for epilepsy remain unclear. Recent studies have revealed several recurrent copy number variations (CNVs) that are more likely to be associated with epilepsy. The responsible gene(s) within these regions have yet to be definitively linked to the disorder, and the implications of their interactions are not fully understood. Identification of these genes may contribute to a better pathological understanding of epilepsy, and serve to implicate novel therapeutic targets for further research. In this study, we examined genes within heterozygous deletion regions identified in a recent large-scale study, encompassing a diverse spectrum of epileptic syndromes. By integrating additional protein-protein interaction data, we constructed subnetworks for these CNV-region genes and also those previously studied for epilepsy. We observed 20 genes common to both networks, primarily concentrated within a small molecular network populated by GABA receptor, BDNF/MAPK signaling, and estrogen receptor genes. From among the hundreds of genes in the initial networks, these were designated by convergent evidence for their likely association with epilepsy. Importantly, the identified molecular network was found to contain complex interrelationships, providing further insight into epilepsy's underlying pathology. We further performed pathway enrichment and crosstalk analysis and revealed a functional map which indicates the significant enrichment of closely related neurological, immune, and kinase regulatory pathways. The convergent framework we proposed here provides a unique and powerful approach to screening and identifying promising disease genes out of typically hundreds to thousands of genes in disease-related CNV-regions. Our network and pathway analysis provides important implications for the underlying molecular mechanisms for epilepsy. The strategy can be applied for the study of other complex diseases.
The Reconstruction and Analysis of Gene Regulatory Networks.
Zheng, Guangyong; Huang, Tao
2018-01-01
In post-genomic era, an important task is to explore the function of individual biological molecules (i.e., gene, noncoding RNA, protein, metabolite) and their organization in living cells. For this end, gene regulatory networks (GRNs) are constructed to show relationship between biological molecules, in which the vertices of network denote biological molecules and the edges of network present connection between nodes (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). Biologists can understand not only the function of biological molecules but also the organization of components of living cells through interpreting the GRNs, since a gene regulatory network is a comprehensively physiological map of living cells and reflects influence of genetic and epigenetic factors (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). In this paper, we will review the inference methods of GRN reconstruction and analysis approaches of network structure. As a powerful tool for studying complex diseases and biological processes, the applications of the network method in pathway analysis and disease gene identification will be introduced.
Wang, Jiguang; Sun, Yidan; Zheng, Si; Zhang, Xiang-Sun; Zhou, Huarong; Chen, Luonan
2013-01-01
Synergistic interactions among transcription factors (TFs) and their cofactors collectively determine gene expression in complex biological systems. In this work, we develop a novel graphical model, called Active Protein-Gene (APG) network model, to quantify regulatory signals of transcription in complex biomolecular networks through integrating both TF upstream-regulation and downstream-regulation high-throughput data. Firstly, we theoretically and computationally demonstrate the effectiveness of APG by comparing with the traditional strategy based only on TF downstream-regulation information. We then apply this model to study spontaneous type 2 diabetic Goto-Kakizaki (GK) and Wistar control rats. Our biological experiments validate the theoretical results. In particular, SP1 is found to be a hidden TF with changed regulatory activity, and the loss of SP1 activity contributes to the increased glucose production during diabetes development. APG model provides theoretical basis to quantitatively elucidate transcriptional regulation by modelling TF combinatorial interactions and exploiting multilevel high-throughput information.
Wang, Jiguang; Sun, Yidan; Zheng, Si; Zhang, Xiang-Sun; Zhou, Huarong; Chen, Luonan
2013-01-01
Synergistic interactions among transcription factors (TFs) and their cofactors collectively determine gene expression in complex biological systems. In this work, we develop a novel graphical model, called Active Protein-Gene (APG) network model, to quantify regulatory signals of transcription in complex biomolecular networks through integrating both TF upstream-regulation and downstream-regulation high-throughput data. Firstly, we theoretically and computationally demonstrate the effectiveness of APG by comparing with the traditional strategy based only on TF downstream-regulation information. We then apply this model to study spontaneous type 2 diabetic Goto-Kakizaki (GK) and Wistar control rats. Our biological experiments validate the theoretical results. In particular, SP1 is found to be a hidden TF with changed regulatory activity, and the loss of SP1 activity contributes to the increased glucose production during diabetes development. APG model provides theoretical basis to quantitatively elucidate transcriptional regulation by modelling TF combinatorial interactions and exploiting multilevel high-throughput information. PMID:23346354
2013-01-01
Background In recent years, various types of cellular networks have penetrated biology and are nowadays used omnipresently for studying eukaryote and prokaryote organisms. Still, the relation and the biological overlap among phenomenological and inferential gene networks, e.g., between the protein interaction network and the gene regulatory network inferred from large-scale transcriptomic data, is largely unexplored. Results We provide in this study an in-depth analysis of the structural, functional and chromosomal relationship between a protein-protein network, a transcriptional regulatory network and an inferred gene regulatory network, for S. cerevisiae and E. coli. Further, we study global and local aspects of these networks and their biological information overlap by comparing, e.g., the functional co-occurrence of Gene Ontology terms by exploiting the available interaction structure among the genes. Conclusions Although the individual networks represent different levels of cellular interactions with global structural and functional dissimilarities, we observe crucial functions of their network interfaces for the assembly of protein complexes, proteolysis, transcription, translation, metabolic and regulatory interactions. Overall, our results shed light on the integrability of these networks and their interfacing biological processes. PMID:23663484
Lezon, Timothy R; Banavar, Jayanth R; Cieplak, Marek; Maritan, Amos; Fedoroff, Nina V
2006-12-12
We describe a method based on the principle of entropy maximization to identify the gene interaction network with the highest probability of giving rise to experimentally observed transcript profiles. In its simplest form, the method yields the pairwise gene interaction network, but it can also be extended to deduce higher-order interactions. Analysis of microarray data from genes in Saccharomyces cerevisiae chemostat cultures exhibiting energy metabolic oscillations identifies a gene interaction network that reflects the intracellular communication pathways that adjust cellular metabolic activity and cell division to the limiting nutrient conditions that trigger metabolic oscillations. The success of the present approach in extracting meaningful genetic connections suggests that the maximum entropy principle is a useful concept for understanding living systems, as it is for other complex, nonequilibrium systems.
Introduction to focus issue: quantitative approaches to genetic networks.
Albert, Réka; Collins, James J; Glass, Leon
2013-06-01
All cells of living organisms contain similar genetic instructions encoded in the organism's DNA. In any particular cell, the control of the expression of each different gene is regulated, in part, by binding of molecular complexes to specific regions of the DNA. The molecular complexes are composed of protein molecules, called transcription factors, combined with various other molecules such as hormones and drugs. Since transcription factors are coded by genes, cellular function is partially determined by genetic networks. Recent research is making large strides to understand both the structure and the function of these networks. Further, the emerging discipline of synthetic biology is engineering novel gene circuits with specific dynamic properties to advance both basic science and potential practical applications. Although there is not yet a universally accepted mathematical framework for studying the properties of genetic networks, the strong analogies between the activation and inhibition of gene expression and electric circuits suggest frameworks based on logical switching circuits. This focus issue provides a selection of papers reflecting current research directions in the quantitative analysis of genetic networks. The work extends from molecular models for the binding of proteins, to realistic detailed models of cellular metabolism. Between these extremes are simplified models in which genetic dynamics are modeled using classical methods of systems engineering, Boolean switching networks, differential equations that are continuous analogues of Boolean switching networks, and differential equations in which control is based on power law functions. The mathematical techniques are applied to study: (i) naturally occurring gene networks in living organisms including: cyanobacteria, Mycoplasma genitalium, fruit flies, immune cells in mammals; (ii) synthetic gene circuits in Escherichia coli and yeast; and (iii) electronic circuits modeling genetic networks using field-programmable gate arrays. Mathematical analyses will be essential for understanding naturally occurring genetic networks in diverse organisms and for providing a foundation for the improved development of synthetic genetic networks.
Introduction to Focus Issue: Quantitative Approaches to Genetic Networks
NASA Astrophysics Data System (ADS)
Albert, Réka; Collins, James J.; Glass, Leon
2013-06-01
All cells of living organisms contain similar genetic instructions encoded in the organism's DNA. In any particular cell, the control of the expression of each different gene is regulated, in part, by binding of molecular complexes to specific regions of the DNA. The molecular complexes are composed of protein molecules, called transcription factors, combined with various other molecules such as hormones and drugs. Since transcription factors are coded by genes, cellular function is partially determined by genetic networks. Recent research is making large strides to understand both the structure and the function of these networks. Further, the emerging discipline of synthetic biology is engineering novel gene circuits with specific dynamic properties to advance both basic science and potential practical applications. Although there is not yet a universally accepted mathematical framework for studying the properties of genetic networks, the strong analogies between the activation and inhibition of gene expression and electric circuits suggest frameworks based on logical switching circuits. This focus issue provides a selection of papers reflecting current research directions in the quantitative analysis of genetic networks. The work extends from molecular models for the binding of proteins, to realistic detailed models of cellular metabolism. Between these extremes are simplified models in which genetic dynamics are modeled using classical methods of systems engineering, Boolean switching networks, differential equations that are continuous analogues of Boolean switching networks, and differential equations in which control is based on power law functions. The mathematical techniques are applied to study: (i) naturally occurring gene networks in living organisms including: cyanobacteria, Mycoplasma genitalium, fruit flies, immune cells in mammals; (ii) synthetic gene circuits in Escherichia coli and yeast; and (iii) electronic circuits modeling genetic networks using field-programmable gate arrays. Mathematical analyses will be essential for understanding naturally occurring genetic networks in diverse organisms and for providing a foundation for the improved development of synthetic genetic networks.
Arancio, Walter
2012-04-01
Hutchinson-Gilford progeria syndrome (HGPS) is a rare human genetic disease that leads to premature aging. HGPS is caused by mutation in the Lamin-A (LMNA) gene that leads, in affected young individuals, to the accumulation of the progerin protein, usually present only in aging differentiated cells. Bioinformatics analyses of the network of interactions of the LMNA gene and transcripts are presented. The LMNA gene network has been analyzed using the BioGRID database (http://thebiogrid.org/) and related analysis tools such as Osprey (http://biodata.mshri.on.ca/osprey/servlet/Index) and GeneMANIA ( http://genemania.org/). The network of interaction of LMNA transcripts has been further analyzed following the competing endogenous (ceRNA) hypotheses (RNA cross-talk via microRNAs [miRNAs]) and using the miRWalk database and tools (www.ma.uni-heidelberg.de/apps/zmf/mirwalk/). These analyses suggest particular relevance of epigenetic modifiers (via acetylase complexes and specifically HTATIP histone acetylase) and adenosine triphosphate (ATP)-dependent chromatin remodelers (via pBAF, BAF, and SWI/SNF complexes).
Jia, Peilin; Chen, Xiangning; Fanous, Ayman H; Zhao, Zhongming
2018-05-24
Genetic components susceptible to complex disease such as schizophrenia include a wide spectrum of variants, including common variants (CVs) and de novo mutations (DNMs). Although CVs and DNMs differ by origin, it remains elusive whether and how they interact at the gene, pathway, and network levels that leads to the disease. In this work, we characterized the genes harboring schizophrenia-associated CVs (CVgenes) and the genes harboring DNMs (DNMgenes) using measures from network, tissue-specific expression profile, and spatiotemporal brain expression profile. We developed an algorithm to link the DNMgenes and CVgenes in spatiotemporal brain co-expression networks. DNMgenes tended to have central roles in the human protein-protein interaction (PPI) network, evidenced in their high degree and high betweenness values. DNMgenes and CVgenes connected with each other significantly more often than with other genes in the networks. However, only CVgenes remained significantly connected after adjusting for their degree. In our gene co-expression PPI network, we found DNMgenes and CVgenes connected in a tissue-specific fashion, and such a pattern was similar to that in GTEx brain but not in other GTEx tissues. Importantly, DNMgene-CVgene subnetworks were enriched with pathways of chromatin remodeling, MHC protein complex binding, and neurotransmitter activities. In summary, our results unveiled that both DNMgenes and CVgenes contributed to a core set of biologically important pathways and networks, and their interactions may attribute to the risk for schizophrenia. Our results also suggested a stronger biological effect of DNMgenes than CVgenes in schizophrenia.
Discovery and validation of a glioblastoma co-expressed gene module
Dunwoodie, Leland J.; Poehlman, William L.; Ficklin, Stephen P.; Feltus, Frank Alexander
2018-01-01
Tumors exhibit complex patterns of aberrant gene expression. Using a knowledge-independent, noise-reducing gene co-expression network construction software called KINC, we created multiple RNAseq-based gene co-expression networks relevant to brain and glioblastoma biology. In this report, we describe the discovery and validation of a glioblastoma-specific gene module that contains 22 co-expressed genes. The genes are upregulated in glioblastoma relative to normal brain and lower grade glioma samples; they are also hypo-methylated in glioblastoma relative to lower grade glioma tumors. Among the proneural, neural, mesenchymal, and classical glioblastoma subtypes, these genes are most-highly expressed in the mesenchymal subtype. Furthermore, high expression of these genes is associated with decreased survival across each glioblastoma subtype. These genes are of interest to glioblastoma biology and our gene interaction discovery and validation workflow can be used to discover and validate co-expressed gene modules derived from any co-expression network. PMID:29541392
Discovery and validation of a glioblastoma co-expressed gene module.
Dunwoodie, Leland J; Poehlman, William L; Ficklin, Stephen P; Feltus, Frank Alexander
2018-02-16
Tumors exhibit complex patterns of aberrant gene expression. Using a knowledge-independent, noise-reducing gene co-expression network construction software called KINC, we created multiple RNAseq-based gene co-expression networks relevant to brain and glioblastoma biology. In this report, we describe the discovery and validation of a glioblastoma-specific gene module that contains 22 co-expressed genes. The genes are upregulated in glioblastoma relative to normal brain and lower grade glioma samples; they are also hypo-methylated in glioblastoma relative to lower grade glioma tumors. Among the proneural, neural, mesenchymal, and classical glioblastoma subtypes, these genes are most-highly expressed in the mesenchymal subtype. Furthermore, high expression of these genes is associated with decreased survival across each glioblastoma subtype. These genes are of interest to glioblastoma biology and our gene interaction discovery and validation workflow can be used to discover and validate co-expressed gene modules derived from any co-expression network.
EgoNet: identification of human disease ego-network modules
2014-01-01
Background Mining novel biomarkers from gene expression profiles for accurate disease classification is challenging due to small sample size and high noise in gene expression measurements. Several studies have proposed integrated analyses of microarray data and protein-protein interaction (PPI) networks to find diagnostic subnetwork markers. However, the neighborhood relationship among network member genes has not been fully considered by those methods, leaving many potential gene markers unidentified. The main idea of this study is to take full advantage of the biological observation that genes associated with the same or similar diseases commonly reside in the same neighborhood of molecular networks. Results We present EgoNet, a novel method based on egocentric network-analysis techniques, to exhaustively search and prioritize disease subnetworks and gene markers from a large-scale biological network. When applied to a triple-negative breast cancer (TNBC) microarray dataset, the top selected modules contain both known gene markers in TNBC and novel candidates, such as RAD51 and DOK1, which play a central role in their respective ego-networks by connecting many differentially expressed genes. Conclusions Our results suggest that EgoNet, which is based on the ego network concept, allows the identification of novel biomarkers and provides a deeper understanding of their roles in complex diseases. PMID:24773628
Reverse engineering and analysis of large genome-scale gene networks
Aluru, Maneesha; Zola, Jaroslaw; Nettleton, Dan; Aluru, Srinivas
2013-01-01
Reverse engineering the whole-genome networks of complex multicellular organisms continues to remain a challenge. While simpler models easily scale to large number of genes and gene expression datasets, more accurate models are compute intensive limiting their scale of applicability. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) development of parallel algorithms to reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software Gene Network Analyzer (GeNA) for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we performed analysis of 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web. PMID:23042249
Mammalian synthetic biology: emerging medical applications
Kis, Zoltán; Pereira, Hugo Sant'Ana; Homma, Takayuki; Pedrigi, Ryan M.; Krams, Rob
2015-01-01
In this review, we discuss new emerging medical applications of the rapidly evolving field of mammalian synthetic biology. We start with simple mammalian synthetic biological components and move towards more complex and therapy-oriented gene circuits. A comprehensive list of ON–OFF switches, categorized into transcriptional, post-transcriptional, translational and post-translational, is presented in the first sections. Subsequently, Boolean logic gates, synthetic mammalian oscillators and toggle switches will be described. Several synthetic gene networks are further reviewed in the medical applications section, including cancer therapy gene circuits, immuno-regulatory networks, among others. The final sections focus on the applicability of synthetic gene networks to drug discovery, drug delivery, receptor-activating gene circuits and mammalian biomanufacturing processes. PMID:25808341
Learning contextual gene set interaction networks of cancer with condition specificity
2013-01-01
Background Identifying similarities and differences in the molecular constitutions of various types of cancer is one of the key challenges in cancer research. The appearances of a cancer depend on complex molecular interactions, including gene regulatory networks and gene-environment interactions. This complexity makes it challenging to decipher the molecular origin of the cancer. In recent years, many studies reported methods to uncover heterogeneous depictions of complex cancers, which are often categorized into different subtypes. The challenge is to identify diverse molecular contexts within a cancer, to relate them to different subtypes, and to learn underlying molecular interactions specific to molecular contexts so that we can recommend context-specific treatment to patients. Results In this study, we describe a novel method to discern molecular interactions specific to certain molecular contexts. Unlike conventional approaches to build modular networks of individual genes, our focus is to identify cancer-generic and subtype-specific interactions between contextual gene sets, of which each gene set share coherent transcriptional patterns across a subset of samples, termed contextual gene set. We then apply a novel formulation for quantitating the effect of the samples from each subtype on the calculated strength of interactions observed. Two cancer data sets were analyzed to support the validity of condition-specificity of identified interactions. When compared to an existing approach, the proposed method was much more sensitive in identifying condition-specific interactions even in heterogeneous data set. The results also revealed that network components specific to different types of cancer are related to different biological functions than cancer-generic network components. We found not only the results that are consistent with previous studies, but also new hypotheses on the biological mechanisms specific to certain cancer types that warrant further investigations. Conclusions The analysis on the contextual gene sets and characterization of networks of interaction composed of these sets discovered distinct functional differences underlying various types of cancer. The results show that our method successfully reveals many subtype-specific regions in the identified maps of biological contexts, which well represent biological functions that can be connected to specific subtypes. PMID:23418942
Ficklin, Stephen P.; Feltus, F. Alex
2011-01-01
One major objective for plant biology is the discovery of molecular subsystems underlying complex traits. The use of genetic and genomic resources combined in a systems genetics approach offers a means for approaching this goal. This study describes a maize (Zea mays) gene coexpression network built from publicly available expression arrays. The maize network consisted of 2,071 loci that were divided into 34 distinct modules that contained 1,928 enriched functional annotation terms and 35 cofunctional gene clusters. Of note, 391 maize genes of unknown function were found to be coexpressed within modules along with genes of known function. A global network alignment was made between this maize network and a previously described rice (Oryza sativa) coexpression network. The IsoRankN tool was used, which incorporates both gene homology and network topology for the alignment. A total of 1,173 aligned loci were detected between the two grass networks, which condensed into 154 conserved subgraphs that preserved 4,758 coexpression edges in rice and 6,105 coexpression edges in maize. This study provides an early view into maize coexpression space and provides an initial network-based framework for the translation of functional genomic and genetic information between these two vital agricultural species. PMID:21606319
Ficklin, Stephen P; Feltus, F Alex
2011-07-01
One major objective for plant biology is the discovery of molecular subsystems underlying complex traits. The use of genetic and genomic resources combined in a systems genetics approach offers a means for approaching this goal. This study describes a maize (Zea mays) gene coexpression network built from publicly available expression arrays. The maize network consisted of 2,071 loci that were divided into 34 distinct modules that contained 1,928 enriched functional annotation terms and 35 cofunctional gene clusters. Of note, 391 maize genes of unknown function were found to be coexpressed within modules along with genes of known function. A global network alignment was made between this maize network and a previously described rice (Oryza sativa) coexpression network. The IsoRankN tool was used, which incorporates both gene homology and network topology for the alignment. A total of 1,173 aligned loci were detected between the two grass networks, which condensed into 154 conserved subgraphs that preserved 4,758 coexpression edges in rice and 6,105 coexpression edges in maize. This study provides an early view into maize coexpression space and provides an initial network-based framework for the translation of functional genomic and genetic information between these two vital agricultural species.
Coding and non-coding gene regulatory networks underlie the immune response in liver cirrhosis.
Gao, Bo; Zhang, Xueming; Huang, Yongming; Yang, Zhengpeng; Zhang, Yuguo; Zhang, Weihui; Gao, Zu-Hua; Xue, Dongbo
2017-01-01
Liver cirrhosis is recognized as being the consequence of immune-mediated hepatocyte damage and repair processes. However, the regulation of these immune responses underlying liver cirrhosis has not been elucidated. In this study, we used GEO datasets and bioinformatics methods to established coding and non-coding gene regulatory networks including transcription factor-/lncRNA-microRNA-mRNA, and competing endogenous RNA interaction networks. Our results identified 2224 mRNAs, 70 lncRNAs and 46 microRNAs were differentially expressed in liver cirrhosis. The transcription factor -/lncRNA- microRNA-mRNA network we uncovered that results in immune-mediated liver cirrhosis is comprised of 5 core microRNAs (e.g., miR-203; miR-219-5p), 3 transcription factors (i.e., FOXP3, ETS1 and FOS) and 7 lncRNAs (e.g., ENTS00000671336, ENST00000575137). The competing endogenous RNA interaction network we identified includes a complex immune response regulatory subnetwork that controls the entire liver cirrhosis network. Additionally, we found 10 overlapping GO terms shared by both liver cirrhosis and hepatocellular carcinoma including "immune response" as well. Interestingly, the overlapping differentially expressed genes in liver cirrhosis and hepatocellular carcinoma were enriched in immune response-related functional terms. In summary, a complex gene regulatory network underlying immune response processes may play an important role in the development and progression of liver cirrhosis, and its development into hepatocellular carcinoma.
2011-01-01
Background Green plant leaves have always fascinated biologists as hosts for photosynthesis and providers of basic energy to many food webs. Today, comprehensive databases of gene expression data enable us to apply increasingly more advanced computational methods for reverse-engineering the regulatory network of leaves, and to begin to understand the gene interactions underlying complex emergent properties related to stress-response and development. These new systems biology methods are now also being applied to organisms such as Populus, a woody perennial tree, in order to understand the specific characteristics of these species. Results We present a systems biology model of the regulatory network of Populus leaves. The network is reverse-engineered from promoter information and expression profiles of leaf-specific genes measured over a large set of conditions related to stress and developmental. The network model incorporates interactions between regulators, such as synergistic and competitive relationships, by evaluating increasingly more complex regulatory mechanisms, and is therefore able to identify new regulators of leaf development not found by traditional genomics methods based on pair-wise expression similarity. The approach is shown to explain available gene function information and to provide robust prediction of expression levels in new data. We also use the predictive capability of the model to identify condition-specific regulation as well as conserved regulation between Populus and Arabidopsis. Conclusions We outline a computationally inferred model of the regulatory network of Populus leaves, and show how treating genes as interacting, rather than individual, entities identifies new regulators compared to traditional genomics analysis. Although systems biology models should be used with care considering the complexity of regulatory programs and the limitations of current genomics data, methods describing interactions can provide hypotheses about the underlying cause of emergent properties and are needed if we are to identify target genes other than those constituting the "low hanging fruit" of genomic analysis. PMID:21232107
Wang, Zhiwei; Liao, Tianqi; Zhou, Zhongkai; Wang, Yuyang; Diao, Yongjia; Strappe, Padraig; Prenzler, Paul; Ayton, Jamie; Blanchard, Chris
2016-09-06
To study the mechanism underlying the liver damage induced by deep-fried oil (DO) consumption and the beneficial effects from resistant starch (RS) supplement, differential gene expression and pathway network were analyzed based on RNA sequencing data from rats. The up/down regulated genes and corresponding signaling pathways were used to construct a novel local gene network (LGN). The topology of the network showed characteristics of small-world network, with some pathways demonstrating a high degree. Some changes in genes led to a larger probability occurrence of disease or infection with DO intake. More importantly, the main pathways were found to be almost the same between the two LGNs (30 pathways overlapped in total 48) with gene expression profile. This finding may indicate that RS supplement in DO-containing diet may mainly regulate the genes that related to DO damage, and RS in the diet may provide direct signals to the liver cells and modulate its effect through a network involving complex gene regulatory events. It is the first attempt to reveal the mechanism of the attenuation of liver dysfunction from RS supplement in the DO-containing diet using differential gene expression and pathway network. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
2010-01-01
Background Growing interest and burgeoning technology for discovering genetic mechanisms that influence disease processes have ushered in a flood of genetic association studies over the last decade, yet little heritability in highly studied complex traits has been explained by genetic variation. Non-additive gene-gene interactions, which are not often explored, are thought to be one source of this "missing" heritability. Methods Stochastic methods employing evolutionary algorithms have demonstrated promise in being able to detect and model gene-gene and gene-environment interactions that influence human traits. Here we demonstrate modifications to a neural network algorithm in ATHENA (the Analysis Tool for Heritable and Environmental Network Associations) resulting in clear performance improvements for discovering gene-gene interactions that influence human traits. We employed an alternative tree-based crossover, backpropagation for locally fitting neural network weights, and incorporation of domain knowledge obtainable from publicly accessible biological databases for initializing the search for gene-gene interactions. We tested these modifications in silico using simulated datasets. Results We show that the alternative tree-based crossover modification resulted in a modest increase in the sensitivity of the ATHENA algorithm for discovering gene-gene interactions. The performance increase was highly statistically significant when backpropagation was used to locally fit NN weights. We also demonstrate that using domain knowledge to initialize the search for gene-gene interactions results in a large performance increase, especially when the search space is larger than the search coverage. Conclusions We show that a hybrid optimization procedure, alternative crossover strategies, and incorporation of domain knowledge from publicly available biological databases can result in marked increases in sensitivity and performance of the ATHENA algorithm for detecting and modelling gene-gene interactions that influence a complex human trait. PMID:20875103
Sato, Masanao; Tsuda, Kenichi; Wang, Lin; Coller, John; Watanabe, Yuichiro; Glazebrook, Jane; Katagiri, Fumiaki
2010-01-01
Biological signaling processes may be mediated by complex networks in which network components and network sectors interact with each other in complex ways. Studies of complex networks benefit from approaches in which the roles of individual components are considered in the context of the network. The plant immune signaling network, which controls inducible responses to pathogen attack, is such a complex network. We studied the Arabidopsis immune signaling network upon challenge with a strain of the bacterial pathogen Pseudomonas syringae expressing the effector protein AvrRpt2 (Pto DC3000 AvrRpt2). This bacterial strain feeds multiple inputs into the signaling network, allowing many parts of the network to be activated at once. mRNA profiles for 571 immune response genes of 22 Arabidopsis immunity mutants and wild type were collected 6 hours after inoculation with Pto DC3000 AvrRpt2. The mRNA profiles were analyzed as detailed descriptions of changes in the network state resulting from the genetic perturbations. Regulatory relationships among the genes corresponding to the mutations were inferred by recursively applying a non-linear dimensionality reduction procedure to the mRNA profile data. The resulting static network model accurately predicted 23 of 25 regulatory relationships reported in the literature, suggesting that predictions of novel regulatory relationships are also accurate. The network model revealed two striking features: (i) the components of the network are highly interconnected; and (ii) negative regulatory relationships are common between signaling sectors. Complex regulatory relationships, including a novel negative regulatory relationship between the early microbe-associated molecular pattern-triggered signaling sectors and the salicylic acid sector, were further validated. We propose that prevalent negative regulatory relationships among the signaling sectors make the plant immune signaling network a “sector-switching” network, which effectively balances two apparently conflicting demands, robustness against pathogenic perturbations and moderation of negative impacts of immune responses on plant fitness. PMID:20661428
Protein complex prediction in large ontology attributed protein-protein interaction networks.
Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian; Li, Yanpeng; Xu, Bo
2013-01-01
Protein complexes are important for unraveling the secrets of cellular organization and function. Many computational approaches have been developed to predict protein complexes in protein-protein interaction (PPI) networks. However, most existing approaches focus mainly on the topological structure of PPI networks, and largely ignore the gene ontology (GO) annotation information. In this paper, we constructed ontology attributed PPI networks with PPI data and GO resource. After constructing ontology attributed networks, we proposed a novel approach called CSO (clustering based on network structure and ontology attribute similarity). Structural information and GO attribute information are complementary in ontology attributed networks. CSO can effectively take advantage of the correlation between frequent GO annotation sets and the dense subgraph for protein complex prediction. Our proposed CSO approach was applied to four different yeast PPI data sets and predicted many well-known protein complexes. The experimental results showed that CSO was valuable in predicting protein complexes and achieved state-of-the-art performance.
Gregoretti, Francesco; Belcastro, Vincenzo; di Bernardo, Diego; Oliva, Gennaro
2010-04-21
The reverse engineering of gene regulatory networks using gene expression profile data has become crucial to gain novel biological knowledge. Large amounts of data that need to be analyzed are currently being produced due to advances in microarray technologies. Using current reverse engineering algorithms to analyze large data sets can be very computational-intensive. These emerging computational requirements can be met using parallel computing techniques. It has been shown that the Network Identification by multiple Regression (NIR) algorithm performs better than the other ready-to-use reverse engineering software. However it cannot be used with large networks with thousands of nodes--as is the case in biological networks--due to the high time and space complexity. In this work we overcome this limitation by designing and developing a parallel version of the NIR algorithm. The new implementation of the algorithm reaches a very good accuracy even for large gene networks, improving our understanding of the gene regulatory networks that is crucial for a wide range of biomedical applications.
Architecture of the human interactome defines protein communities and disease networks
Huttlin, Edward L.; Bruckner, Raphael J.; Paulo, Joao A.; Cannon, Joe R.; Ting, Lily; Baltier, Kurt; Colby, Greg; Gebreab, Fana; Gygi, Melanie P.; Parzen, Hannah; Szpyt, John; Tam, Stanley; Zarraga, Gabriela; Pontano-Vaites, Laura; Swarup, Sharan; White, Anne E.; Schweppe, Devin K.; Rad, Ramin; Erickson, Brian K.; Obar, Robert A.; Guruharsha, K.G.; Li, Kejie; Artavanis-Tsakonas, Spyros; Gygi, Steven P.; Harper, J. Wade
2017-01-01
The physiology of a cell can be viewed as the product of thousands of proteins acting in concert to shape the cellular response. Coordination is achieved in part through networks of protein-protein interactions that assemble functionally related proteins into complexes, organelles, and signal transduction pathways. Understanding the architecture of the human proteome has the potential to inform cellular, structural, and evolutionary mechanisms and is critical to elucidation of how genome variation contributes to disease1–3. Here, we present BioPlex 2.0 (Biophysical Interactions of ORFEOME-derived complexes), which employs robust affinity purification-mass spectrometry (AP-MS) methodology4 to elucidate protein interaction networks and co-complexes nucleated by more than 25% of protein coding genes from the human genome, and constitutes the largest such network to date. With >56,000 candidate interactions, BioPlex 2.0 contains >29,000 previously unknown co-associations and provides functional insights into hundreds of poorly characterized proteins while enhancing network-based analyses of domain associations, subcellular localization, and co-complex formation. Unsupervised Markov clustering (MCL)5 of interacting proteins identified more than 1300 protein communities representing diverse cellular activities. Genes essential for cell fitness6,7 are enriched within 53 communities representing central cellular functions. Moreover, we identified 442 communities associated with more than 2000 disease annotations, placing numerous candidate disease genes into a cellular framework. BioPlex 2.0 exceeds previous experimentally derived interaction networks in depth and breadth, and will be a valuable resource for exploring the biology of incompletely characterized proteins and for elucidating larger-scale patterns of proteome organization. PMID:28514442
NASA Astrophysics Data System (ADS)
Ravindran, Vandana; Sunitha, V.; Bagler, Ganesh
2017-05-01
Cancer is characterized by a complex web of regulatory mechanisms which makes it difficult to identify features that are central to its control. Molecular integrative models of cancer, generated with the help of data from experimental assays, facilitate use of control theory to probe for ways of controlling the state of such a complex dynamic network. We modeled the human cancer signaling network as a directed graph and analyzed it for its controllability, identification of driver nodes and their characterization. We identified the driver nodes using the maximum matching algorithm and classified them as backbone, peripheral and ordinary based on their role in regulatory interactions and control of the network. We found that the backbone driver nodes were key to driving the regulatory network into cancer phenotype (via mutations) as well as for steering into healthy phenotype (as drug targets). This implies that while backbone genes could lead to cancer by virtue of mutations, they are also therapeutic targets of cancer. Further, based on their impact on the size of the set of driver nodes, genes were characterized as indispensable, dispensable and neutral. Indispensable nodes within backbone of the network emerged as central to regulatory mechanisms of control of cancer. In addition to probing the cancer signaling network from the perspective of control, our findings suggest that indispensable backbone driver nodes could be potentially leveraged as therapeutic targets. This study also illustrates the application of structural controllability for studying the mechanisms underlying the regulation of complex diseases.
Rong, Junkang; Feltus, F. Alex; Waghmare, Vijay N.; Pierce, Gary J.; Chee, Peng W.; Draye, Xavier; Saranga, Yehoshua; Wright, Robert J.; Wilkins, Thea A.; May, O. Lloyd; Smith, C. Wayne; Gannaway, John R.; Wendel, Jonathan F.; Paterson, Andrew H.
2007-01-01
QTL mapping experiments yield heterogeneous results due to the use of different genotypes, environments, and sampling variation. Compilation of QTL mapping results yields a more complete picture of the genetic control of a trait and reveals patterns in organization of trait variation. A total of 432 QTL mapped in one diploid and 10 tetraploid interspecific cotton populations were aligned using a reference map and depicted in a CMap resource. Early demonstrations that genes from the non-fiber-producing diploid ancestor contribute to tetraploid lint fiber genetics gain further support from multiple populations and environments and advanced-generation studies detecting QTL of small phenotypic effect. Both tetraploid subgenomes contribute QTL at largely non-homeologous locations, suggesting divergent selection acting on many corresponding genes before and/or after polyploid formation. QTL correspondence across studies was only modest, suggesting that additional QTL for the target traits remain to be discovered. Crosses between closely-related genotypes differing by single-gene mutants yield profoundly different QTL landscapes, suggesting that fiber variation involves a complex network of interacting genes. Members of the lint fiber development network appear clustered, with cluster members showing heterogeneous phenotypic effects. Meta-analysis linked to synteny-based and expression-based information provides clues about specific genes and families involved in QTL networks. PMID:17565937
Rong, Junkang; Feltus, F Alex; Waghmare, Vijay N; Pierce, Gary J; Chee, Peng W; Draye, Xavier; Saranga, Yehoshua; Wright, Robert J; Wilkins, Thea A; May, O Lloyd; Smith, C Wayne; Gannaway, John R; Wendel, Jonathan F; Paterson, Andrew H
2007-08-01
QTL mapping experiments yield heterogeneous results due to the use of different genotypes, environments, and sampling variation. Compilation of QTL mapping results yields a more complete picture of the genetic control of a trait and reveals patterns in organization of trait variation. A total of 432 QTL mapped in one diploid and 10 tetraploid interspecific cotton populations were aligned using a reference map and depicted in a CMap resource. Early demonstrations that genes from the non-fiber-producing diploid ancestor contribute to tetraploid lint fiber genetics gain further support from multiple populations and environments and advanced-generation studies detecting QTL of small phenotypic effect. Both tetraploid subgenomes contribute QTL at largely non-homeologous locations, suggesting divergent selection acting on many corresponding genes before and/or after polyploid formation. QTL correspondence across studies was only modest, suggesting that additional QTL for the target traits remain to be discovered. Crosses between closely-related genotypes differing by single-gene mutants yield profoundly different QTL landscapes, suggesting that fiber variation involves a complex network of interacting genes. Members of the lint fiber development network appear clustered, with cluster members showing heterogeneous phenotypic effects. Meta-analysis linked to synteny-based and expression-based information provides clues about specific genes and families involved in QTL networks.
Rossin, Elizabeth J.; Lage, Kasper; Raychaudhuri, Soumya; Xavier, Ramnik J.; Tatar, Diana; Benita, Yair
2011-01-01
Genome-wide association studies (GWAS) have defined over 150 genomic regions unequivocally containing variation predisposing to immune-mediated disease. Inferring disease biology from these observations, however, hinges on our ability to discover the molecular processes being perturbed by these risk variants. It has previously been observed that different genes harboring causal mutations for the same Mendelian disease often physically interact. We sought to evaluate the degree to which this is true of genes within strongly associated loci in complex disease. Using sets of loci defined in rheumatoid arthritis (RA) and Crohn's disease (CD) GWAS, we build protein–protein interaction (PPI) networks for genes within associated loci and find abundant physical interactions between protein products of associated genes. We apply multiple permutation approaches to show that these networks are more densely connected than chance expectation. To confirm biological relevance, we show that the components of the networks tend to be expressed in similar tissues relevant to the phenotypes in question, suggesting the network indicates common underlying processes perturbed by risk loci. Furthermore, we show that the RA and CD networks have predictive power by demonstrating that proteins in these networks, not encoded in the confirmed list of disease associated loci, are significantly enriched for association to the phenotypes in question in extended GWAS analysis. Finally, we test our method in 3 non-immune traits to assess its applicability to complex traits in general. We find that genes in loci associated to height and lipid levels assemble into significantly connected networks but did not detect excess connectivity among Type 2 Diabetes (T2D) loci beyond chance. Taken together, our results constitute evidence that, for many of the complex diseases studied here, common genetic associations implicate regions encoding proteins that physically interact in a preferential manner, in line with observations in Mendelian disease. PMID:21249183
An integrative approach to inferring biologically meaningful gene modules.
Cho, Ji-Hoon; Wang, Kai; Galas, David J
2011-07-26
The ability to construct biologically meaningful gene networks and modules is critical for contemporary systems biology. Though recent studies have demonstrated the power of using gene modules to shed light on the functioning of complex biological systems, most modules in these networks have shown little association with meaningful biological function. We have devised a method which directly incorporates gene ontology (GO) annotation in construction of gene modules in order to gain better functional association. We have devised a method, Semantic Similarity-Integrated approach for Modularization (SSIM) that integrates various gene-gene pairwise similarity values, including information obtained from gene expression, protein-protein interactions and GO annotations, in the construction of modules using affinity propagation clustering. We demonstrated the performance of the proposed method using data from two complex biological responses: 1. the osmotic shock response in Saccharomyces cerevisiae, and 2. the prion-induced pathogenic mouse model. In comparison with two previously reported algorithms, modules identified by SSIM showed significantly stronger association with biological functions. The incorporation of semantic similarity based on GO annotation with gene expression and protein-protein interaction data can greatly enhance the functional relevance of inferred gene modules. In addition, the SSIM approach can also reveal the hierarchical structure of gene modules to gain a broader functional view of the biological system. Hence, the proposed method can facilitate comprehensive and in-depth analysis of high throughput experimental data at the gene network level.
A comparative study of disease genes and drug targets in the human protein interactome
2015-01-01
Background Disease genes cause or contribute genetically to the development of the most complex diseases. Drugs are the major approaches to treat the complex disease through interacting with their targets. Thus, drug targets are critical for treatment efficacy. However, the interrelationship between the disease genes and drug targets is not clear. Results In this study, we comprehensively compared the network properties of disease genes and drug targets for five major disease categories (cancer, cardiovascular disease, immune system disease, metabolic disease, and nervous system disease). We first collected disease genes from genome-wide association studies (GWAS) for five disease categories and collected their corresponding drugs based on drugs' Anatomical Therapeutic Chemical (ATC) classification. Then, we obtained the drug targets for these five different disease categories. We found that, though the intersections between disease genes and drug targets were small, disease genes were significantly enriched in targets compared to their enrichment in human protein-coding genes. We further compared network properties of the proteins encoded by disease genes and drug targets in human protein-protein interaction networks (interactome). The results showed that the drug targets tended to have higher degree, higher betweenness, and lower clustering coefficient in cancer Furthermore, we observed a clear fraction increase of disease proteins or drug targets in the near neighborhood compared with the randomized genes. Conclusions The study presents the first comprehensive comparison of the disease genes and drug targets in the context of interactome. The results provide some foundational network characteristics for further designing computational strategies to predict novel drug targets and drug repurposing. PMID:25861037
A comparative study of disease genes and drug targets in the human protein interactome.
Sun, Jingchun; Zhu, Kevin; Zheng, W; Xu, Hua
2015-01-01
Disease genes cause or contribute genetically to the development of the most complex diseases. Drugs are the major approaches to treat the complex disease through interacting with their targets. Thus, drug targets are critical for treatment efficacy. However, the interrelationship between the disease genes and drug targets is not clear. In this study, we comprehensively compared the network properties of disease genes and drug targets for five major disease categories (cancer, cardiovascular disease, immune system disease, metabolic disease, and nervous system disease). We first collected disease genes from genome-wide association studies (GWAS) for five disease categories and collected their corresponding drugs based on drugs' Anatomical Therapeutic Chemical (ATC) classification. Then, we obtained the drug targets for these five different disease categories. We found that, though the intersections between disease genes and drug targets were small, disease genes were significantly enriched in targets compared to their enrichment in human protein-coding genes. We further compared network properties of the proteins encoded by disease genes and drug targets in human protein-protein interaction networks (interactome). The results showed that the drug targets tended to have higher degree, higher betweenness, and lower clustering coefficient in cancer Furthermore, we observed a clear fraction increase of disease proteins or drug targets in the near neighborhood compared with the randomized genes. The study presents the first comprehensive comparison of the disease genes and drug targets in the context of interactome. The results provide some foundational network characteristics for further designing computational strategies to predict novel drug targets and drug repurposing.
Genomic survey, expression profile and co-expression network analysis of OsWD40 family in rice
2012-01-01
Background WD40 proteins represent a large family in eukaryotes, which have been involved in a broad spectrum of crucial functions. Systematic characterization and co-expression analysis of OsWD40 genes enable us to understand the networks of the WD40 proteins and their biological processes and gene functions in rice. Results In this study, we identify and analyze 200 potential OsWD40 genes in rice, describing their gene structures, genome localizations, and evolutionary relationship of each member. Expression profiles covering the whole life cycle in rice has revealed that transcripts of OsWD40 were accumulated differentially during vegetative and reproductive development and preferentially up or down-regulated in different tissues. Under phytohormone treatments, 25 OsWD40 genes were differentially expressed with treatments of one or more of the phytohormone NAA, KT, or GA3 in rice seedlings. We also used a combined analysis of expression correlation and Gene Ontology annotation to infer the biological role of the OsWD40 genes in rice. The results suggested that OsWD40 genes may perform their diverse functions by complex network, thus were predictive for understanding their biological pathways. The analysis also revealed that OsWD40 genes might interact with each other to take part in metabolic pathways, suggesting a more complex feedback network. Conclusions All of these analyses suggest that the functions of OsWD40 genes are diversified, which provide useful references for selecting candidate genes for further functional studies. PMID:22429805
Conserved Non-Coding Regulatory Signatures in Arabidopsis Co-Expressed Gene Modules
Spangler, Jacob B.; Ficklin, Stephen P.; Luo, Feng; Freeling, Michael; Feltus, F. Alex
2012-01-01
Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome. PMID:23024789
Conserved non-coding regulatory signatures in Arabidopsis co-expressed gene modules.
Spangler, Jacob B; Ficklin, Stephen P; Luo, Feng; Freeling, Michael; Feltus, F Alex
2012-01-01
Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome.
O'Brien, M.A.; Costin, B.N.; Miles, M.F.
2014-01-01
Postgenomic studies of the function of genes and their role in disease have now become an area of intense study since efforts to define the raw sequence material of the genome have largely been completed. The use of whole-genome approaches such as microarray expression profiling and, more recently, RNA-sequence analysis of transcript abundance has allowed an unprecedented look at the workings of the genome. However, the accurate derivation of such high-throughput data and their analysis in terms of biological function has been critical to truly leveraging the postgenomic revolution. This chapter will describe an approach that focuses on the use of gene networks to both organize and interpret genomic expression data. Such networks, derived from statistical analysis of large genomic datasets and the application of multiple bioinformatics data resources, poten-tially allow the identification of key control elements for networks associated with human disease, and thus may lead to derivation of novel therapeutic approaches. However, as discussed in this chapter, the leveraging of such networks cannot occur without a thorough understanding of the technical and statistical factors influencing the derivation of genomic expression data. Thus, while the catch phrase may be “it's the network … stupid,” the understanding of factors extending from RNA isolation to genomic profiling technique, multivariate statistics, and bioinformatics are all critical to defining fully useful gene networks for study of complex biology. PMID:23195313
Evolutionary rewiring of bacterial regulatory networks
Taylor, Tiffany B.; Mulley, Geraldine; McGuffin, Liam J.; Johnson, Louise J.; Brockhurst, Michael A.; Arseneault, Tanya; Silby, Mark W.; Jackson, Robert W.
2015-01-01
Bacteria have evolved complex regulatory networks that enable integration of multiple intracellular and extracellular signals to coordinate responses to environmental changes. However, our knowledge of how regulatory systems function and evolve is still relatively limited. There is often extensive homology between components of different networks, due to past cycles of gene duplication, divergence, and horizontal gene transfer, raising the possibility of cross-talk or redundancy. Consequently, evolutionary resilience is built into gene networks - homology between regulators can potentially allow rapid rescue of lost regulatory function across distant regions of the genome. In our recent study [Taylor, et al. Science (2015), 347(6225)] we find that mutations that facilitate cross-talk between pathways can contribute to gene network evolution, but that such mutations come with severe pleiotropic costs. Arising from this work are a number of questions surrounding how this phenomenon occurs. PMID:28357301
Chatterjee, Sumantra; Kapoor, Ashish; Akiyama, Jennifer A.; ...
2016-09-29
Common sequence variants in cis-regulatory elements (CREs) are suspected etiological causes of complex disorders. We previously identified an intronic enhancer variant in the RET gene disrupting SOX10 binding and increasing Hirschsprung disease (HSCR) risk 4-fold. We now show that two other functionally independent CRE variants, one binding Gata2 and the other binding Rarb, also reduce Ret expression and increase risk 2- and 1.7-fold. By studying human and mouse fetal gut tissues and cell lines, we demonstrate that reduced RET expression propagates throughout its gene regulatory network, exerting effects on both its positive and negative feedback components. We also provide evidencemore » that the presence of a combination of CRE variants synergistically reduces RET expression and its effects throughout the GRN. These studies show how the effects of functionally independent non-coding variants in a coordinated gene regulatory network amplify their individually small effects, providing a model for complex disorders.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chatterjee, Sumantra; Kapoor, Ashish; Akiyama, Jennifer A.
Common sequence variants in cis-regulatory elements (CREs) are suspected etiological causes of complex disorders. We previously identified an intronic enhancer variant in the RET gene disrupting SOX10 binding and increasing Hirschsprung disease (HSCR) risk 4-fold. We now show that two other functionally independent CRE variants, one binding Gata2 and the other binding Rarb, also reduce Ret expression and increase risk 2- and 1.7-fold. By studying human and mouse fetal gut tissues and cell lines, we demonstrate that reduced RET expression propagates throughout its gene regulatory network, exerting effects on both its positive and negative feedback components. We also provide evidencemore » that the presence of a combination of CRE variants synergistically reduces RET expression and its effects throughout the GRN. These studies show how the effects of functionally independent non-coding variants in a coordinated gene regulatory network amplify their individually small effects, providing a model for complex disorders.« less
Maneuvering in the Complex Path from Genotype to Phenotype
NASA Astrophysics Data System (ADS)
Strohman, Richard
2002-04-01
Human disease phenotypes are controlled not only by genes but by lawful self-organizing networks that display system-wide dynamics. These networks range from metabolic pathways to signaling pathways that regulate hormone action. When perturbed, networks alter their output of matter and energy which, depending on the environmental context, can produce either a pathological or a normal phenotype. Study of the dynamics of these networks by approaches such as metabolic control analysis may provide new insights into the pathogenesis and treatment of complex diseases.
Lezon, Timothy R.; Banavar, Jayanth R.; Cieplak, Marek; Maritan, Amos; Fedoroff, Nina V.
2006-01-01
We describe a method based on the principle of entropy maximization to identify the gene interaction network with the highest probability of giving rise to experimentally observed transcript profiles. In its simplest form, the method yields the pairwise gene interaction network, but it can also be extended to deduce higher-order interactions. Analysis of microarray data from genes in Saccharomyces cerevisiae chemostat cultures exhibiting energy metabolic oscillations identifies a gene interaction network that reflects the intracellular communication pathways that adjust cellular metabolic activity and cell division to the limiting nutrient conditions that trigger metabolic oscillations. The success of the present approach in extracting meaningful genetic connections suggests that the maximum entropy principle is a useful concept for understanding living systems, as it is for other complex, nonequilibrium systems. PMID:17138668
Sharma, Amitabh; Gulbahce, Natali; Pevzner, Samuel J.; Menche, Jörg; Ladenvall, Claes; Folkersen, Lasse; Eriksson, Per; Orho-Melander, Marju; Barabási, Albert-László
2013-01-01
Genome wide association studies (GWAS) identify susceptibility loci for complex traits, but do not identify particular genes of interest. Integration of functional and network information may help in overcoming this limitation and identifying new susceptibility loci. Using GWAS and comorbidity data, we present a network-based approach to predict candidate genes for lipid and lipoprotein traits. We apply a prediction pipeline incorporating interactome, co-expression, and comorbidity data to Global Lipids Genetics Consortium (GLGC) GWAS for four traits of interest, identifying phenotypically coherent modules. These modules provide insights regarding gene involvement in complex phenotypes with multiple susceptibility alleles and low effect sizes. To experimentally test our predictions, we selected four candidate genes and genotyped representative SNPs in the Malmö Diet and Cancer Cardiovascular Cohort. We found significant associations with LDL-C and total-cholesterol levels for a synonymous SNP (rs234706) in the cystathionine beta-synthase (CBS) gene (p = 1 × 10−5 and adjusted-p = 0.013, respectively). Further, liver samples taken from 206 patients revealed that patients with the minor allele of rs234706 had significant dysregulation of CBS (p = 0.04). Despite the known biological role of CBS in lipid metabolism, SNPs within the locus have not yet been identified in GWAS of lipoprotein traits. Thus, the GWAS-based Comorbidity Module (GCM) approach identifies candidate genes missed by GWAS studies, serving as a broadly applicable tool for the investigation of other complex disease phenotypes. PMID:23882023
Modelling and analysis of gene regulatory network using feedback control theory
NASA Astrophysics Data System (ADS)
El-Samad, H.; Khammash, M.
2010-01-01
Molecular pathways are a part of a remarkable hierarchy of regulatory networks that operate at all levels of organisation. These regulatory networks are responsible for much of the biological complexity within the cell. The dynamic character of these pathways and the prevalence of feedback regulation strategies in their operation make them amenable to systematic mathematical analysis using the same tools that have been used with success in analysing and designing engineering control systems. In this article, we aim at establishing this strong connection through various examples where the behaviour exhibited by gene networks is explained in terms of their underlying control strategies. We complement our analysis by a survey of mathematical techniques commonly used to model gene regulatory networks and analyse their dynamic behaviour.
Analyzing complex networks evolution through Information Theory quantifiers
NASA Astrophysics Data System (ADS)
Carpi, Laura C.; Rosso, Osvaldo A.; Saco, Patricia M.; Ravetti, Martín Gómez
2011-01-01
A methodology to analyze dynamical changes in complex networks based on Information Theory quantifiers is proposed. The square root of the Jensen-Shannon divergence, a measure of dissimilarity between two probability distributions, and the MPR Statistical Complexity are used to quantify states in the network evolution process. Three cases are analyzed, the Watts-Strogatz model, a gene network during the progression of Alzheimer's disease and a climate network for the Tropical Pacific region to study the El Niño/Southern Oscillation (ENSO) dynamic. We find that the proposed quantifiers are able not only to capture changes in the dynamics of the processes but also to quantify and compare states in their evolution.
Pan- and core- network analysis of co-expression genes in a model plant
He, Fei; Maslov, Sergei
2016-12-16
Genome-wide gene expression experiments have been performed using the model plant Arabidopsis during the last decade. Some studies involved construction of coexpression networks, a popular technique used to identify groups of co-regulated genes, to infer unknown gene functions. One approach is to construct a single coexpression network by combining multiple expression datasets generated in different labs. We advocate a complementary approach in which we construct a large collection of 134 coexpression networks based on expression datasets reported in individual publications. To this end we reanalyzed public expression data. To describe this collection of networks we introduced concepts of ‘pan-network’ andmore » ‘core-network’ representing union and intersection between a sizeable fractions of individual networks, respectively. Here, we showed that these two types of networks are different both in terms of their topology and biological function of interacting genes. For example, the modules of the pan-network are enriched in regulatory and signaling functions, while the modules of the core-network tend to include components of large macromolecular complexes such as ribosomes and photosynthetic machinery. Our analysis is aimed to help the plant research community to better explore the information contained within the existing vast collection of gene expression data in Arabidopsis.« less
Pan- and core- network analysis of co-expression genes in a model plant
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Fei; Maslov, Sergei
Genome-wide gene expression experiments have been performed using the model plant Arabidopsis during the last decade. Some studies involved construction of coexpression networks, a popular technique used to identify groups of co-regulated genes, to infer unknown gene functions. One approach is to construct a single coexpression network by combining multiple expression datasets generated in different labs. We advocate a complementary approach in which we construct a large collection of 134 coexpression networks based on expression datasets reported in individual publications. To this end we reanalyzed public expression data. To describe this collection of networks we introduced concepts of ‘pan-network’ andmore » ‘core-network’ representing union and intersection between a sizeable fractions of individual networks, respectively. Here, we showed that these two types of networks are different both in terms of their topology and biological function of interacting genes. For example, the modules of the pan-network are enriched in regulatory and signaling functions, while the modules of the core-network tend to include components of large macromolecular complexes such as ribosomes and photosynthetic machinery. Our analysis is aimed to help the plant research community to better explore the information contained within the existing vast collection of gene expression data in Arabidopsis.« less
Albert, Nick W.; Davies, Kevin M.; Lewis, David H.; Zhang, Huaibi; Montefiori, Mirco; Brendolise, Cyril; Boase, Murray R.; Ngo, Hanh; Jameson, Paula E.; Schwinn, Kathy E.
2014-01-01
Plants require sophisticated regulatory mechanisms to ensure the degree of anthocyanin pigmentation is appropriate to myriad developmental and environmental signals. Central to this process are the activity of MYB-bHLH-WD repeat (MBW) complexes that regulate the transcription of anthocyanin genes. In this study, the gene regulatory network that regulates anthocyanin synthesis in petunia (Petunia hybrida) has been characterized. Genetic and molecular evidence show that the R2R3-MYB, MYB27, is an anthocyanin repressor that functions as part of the MBW complex and represses transcription through its C-terminal EAR motif. MYB27 targets both the anthocyanin pathway genes and basic-helix-loop-helix (bHLH) ANTHOCYANIN1 (AN1), itself an essential component of the MBW activation complex for pigmentation. Other features of the regulatory network identified include inhibition of AN1 activity by the competitive R3-MYB repressor MYBx and the activation of AN1, MYB27, and MYBx by the MBW activation complex, providing for both reinforcement and feedback regulation. We also demonstrate the intercellular movement of the WDR protein (AN11) and R3-repressor (MYBx), which may facilitate anthocyanin pigment pattern formation. The fundamental features of this regulatory network in the Asterid model of petunia are similar to those in the Rosid model of Arabidopsis thaliana and are thus likely to be widespread in the Eudicots. PMID:24642943
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
PubNet: a flexible system for visualizing literature derived networks
Douglas, Shawn M; Montelione, Gaetano T; Gerstein, Mark
2005-01-01
We have developed PubNet, a web-based tool that extracts several types of relationships returned by PubMed queries and maps them into networks, allowing for graphical visualization, textual navigation, and topological analysis. PubNet supports the creation of complex networks derived from the contents of individual citations, such as genes, proteins, Protein Data Bank (PDB) IDs, Medical Subject Headings (MeSH) terms, and authors. This feature allows one to, for example, examine a literature derived network of genes based on functional similarity. PMID:16168087
Ethanol modulation of gene networks: implications for alcoholism.
Farris, Sean P; Miles, Michael F
2012-01-01
Alcoholism is a complex disease caused by a confluence of environmental and genetic factors influencing multiple brain pathways to produce a variety of behavioral sequelae, including addiction. Genetic factors contribute to over 50% of the risk for alcoholism and recent evidence points to a large number of genes with small effect sizes as the likely molecular basis for this disease. Recent progress in genomics (microarrays or RNA-Seq) and genetics has led to the identification of a large number of potential candidate genes influencing ethanol behaviors or alcoholism itself. To organize this complex information, investigators have begun to focus on the contribution of gene networks, rather than individual genes, for various ethanol-induced behaviors in animal models or behavioral endophenotypes comprising alcoholism. This chapter reviews some of the methods used for constructing gene networks from genomic data and some of the recent progress made in applying such approaches to the study of the neurobiology of ethanol. We show that rapid technology development in gathering genomic data, together with sophisticated experimental design and a growing collection of analysis tools are producing novel insights for understanding the molecular basis of alcoholism and that such approaches promise new opportunities for therapeutic development. Copyright © 2011 Elsevier Inc. All rights reserved.
Ma, Chihua; Luciani, Timothy; Terebus, Anna; Liang, Jie; Marai, G Elisabeta
2017-02-15
Visualizing the complex probability landscape of stochastic gene regulatory networks can further biologists' understanding of phenotypic behavior associated with specific genes. We present PRODIGEN (PRObability DIstribution of GEne Networks), a web-based visual analysis tool for the systematic exploration of probability distributions over simulation time and state space in such networks. PRODIGEN was designed in collaboration with bioinformaticians who research stochastic gene networks. The analysis tool combines in a novel way existing, expanded, and new visual encodings to capture the time-varying characteristics of probability distributions: spaghetti plots over one dimensional projection, heatmaps of distributions over 2D projections, enhanced with overlaid time curves to display temporal changes, and novel individual glyphs of state information corresponding to particular peaks. We demonstrate the effectiveness of the tool through two case studies on the computed probabilistic landscape of a gene regulatory network and of a toggle-switch network. Domain expert feedback indicates that our visual approach can help biologists: 1) visualize probabilities of stable states, 2) explore the temporal probability distributions, and 3) discover small peaks in the probability landscape that have potential relation to specific diseases.
Network representations of immune system complexity
Subramanian, Naeha; Torabi-Parizi, Parizad; Gottschalk, Rachel A.; Germain, Ronald N.; Dutta, Bhaskar
2015-01-01
The mammalian immune system is a dynamic multi-scale system composed of a hierarchically organized set of molecular, cellular and organismal networks that act in concert to promote effective host defense. These networks range from those involving gene regulatory and protein-protein interactions underlying intracellular signaling pathways and single cell responses to increasingly complex networks of in vivo cellular interaction, positioning and migration that determine the overall immune response of an organism. Immunity is thus not the product of simple signaling events but rather non-linear behaviors arising from dynamic, feedback-regulated interactions among many components. One of the major goals of systems immunology is to quantitatively measure these complex multi-scale spatial and temporal interactions, permitting development of computational models that can be used to predict responses to perturbation. Recent technological advances permit collection of comprehensive datasets at multiple molecular and cellular levels while advances in network biology support representation of the relationships of components at each level as physical or functional interaction networks. The latter facilitate effective visualization of patterns and recognition of emergent properties arising from the many interactions of genes, molecules, and cells of the immune system. We illustrate the power of integrating ‘omics’ and network modeling approaches for unbiased reconstruction of signaling and transcriptional networks with a focus on applications involving the innate immune system. We further discuss future possibilities for reconstruction of increasingly complex cellular and organism-level networks and development of sophisticated computational tools for prediction of emergent immune behavior arising from the concerted action of these networks. PMID:25625853
Identifying Dynamic Protein Complexes Based on Gene Expression Profiles and PPI Networks
Li, Min; Chen, Weijie; Wang, Jianxin; Pan, Yi
2014-01-01
Identification of protein complexes from protein-protein interaction networks has become a key problem for understanding cellular life in postgenomic era. Many computational methods have been proposed for identifying protein complexes. Up to now, the existing computational methods are mostly applied on static PPI networks. However, proteins and their interactions are dynamic in reality. Identifying dynamic protein complexes is more meaningful and challenging. In this paper, a novel algorithm, named DPC, is proposed to identify dynamic protein complexes by integrating PPI data and gene expression profiles. According to Core-Attachment assumption, these proteins which are always active in the molecular cycle are regarded as core proteins. The protein-complex cores are identified from these always active proteins by detecting dense subgraphs. Final protein complexes are extended from the protein-complex cores by adding attachments based on a topological character of “closeness” and dynamic meaning. The protein complexes produced by our algorithm DPC contain two parts: static core expressed in all the molecular cycle and dynamic attachments short-lived. The proposed algorithm DPC was applied on the data of Saccharomyces cerevisiae and the experimental results show that DPC outperforms CMC, MCL, SPICi, HC-PIN, COACH, and Core-Attachment based on the validation of matching with known complexes and hF-measures. PMID:24963481
Baresic, Mario; Salatino, Silvia; Kupr, Barbara
2014-01-01
Skeletal muscle tissue shows an extraordinary cellular plasticity, but the underlying molecular mechanisms are still poorly understood. Here, we use a combination of experimental and computational approaches to unravel the complex transcriptional network of muscle cell plasticity centered on the peroxisome proliferator-activated receptor γ coactivator 1α (PGC-1α), a regulatory nexus in endurance training adaptation. By integrating data on genome-wide binding of PGC-1α and gene expression upon PGC-1α overexpression with comprehensive computational prediction of transcription factor binding sites (TFBSs), we uncover a hitherto-underestimated number of transcription factor partners involved in mediating PGC-1α action. In particular, principal component analysis of TFBSs at PGC-1α binding regions predicts that, besides the well-known role of the estrogen-related receptor α (ERRα), the activator protein 1 complex (AP-1) plays a major role in regulating the PGC-1α-controlled gene program of the hypoxia response. Our findings thus reveal the complex transcriptional network of muscle cell plasticity controlled by PGC-1α. PMID:24912679
An Arabidopsis Gene Regulatory Network for Secondary Cell Wall Synthesis
Taylor-Teeples, M; Lin, L; de Lucas, M; Turco, G; Toal, TW; Gaudinier, A; Young, NF; Trabucco, GM; Veling, MT; Lamothe, R; Handakumbura, PP; Xiong, G; Wang, C; Corwin, J; Tsoukalas, A; Zhang, L; Ware, D; Pauly, M; Kliebenstein, DJ; Dehesh, K; Tagkopoulos, I; Breton, G; Pruneda-Paz, JL; Ahnert, SE; Kay, SA; Hazen, SP; Brady, SM
2014-01-01
Summary The plant cell wall is an important factor for determining cell shape, function and response to the environment. Secondary cell walls, such as those found in xylem, are composed of cellulose, hemicelluloses and lignin and account for the bulk of plant biomass. The coordination between transcriptional regulation of synthesis for each polymer is complex and vital to cell function. A regulatory hierarchy of developmental switches has been proposed, although the full complement of regulators remains unknown. Here, we present a protein-DNA network between Arabidopsis transcription factors and secondary cell wall metabolic genes with gene expression regulated by a series of feed-forward loops. This model allowed us to develop and validate new hypotheses about secondary wall gene regulation under abiotic stress. Distinct stresses are able to perturb targeted genes to potentially promote functional adaptation. These interactions will serve as a foundation for understanding the regulation of a complex, integral plant component. PMID:25533953
The emergence of overlapping scale-free genetic architecture in digital organisms.
Gerlee, P; Lundh, T
2008-01-01
We have studied the evolution of genetic architecture in digital organisms and found that the gene overlap follows a scale-free distribution, which is commonly found in metabolic networks of many organisms. Our results show that the slope of the scale-free distribution depends on the mutation rate and that the gene development is driven by expansion of already existing genes, which is in direct correspondence to the preferential growth algorithm that gives rise to scale-free networks. To further validate our results we have constructed a simple model of gene development, which recapitulates the results from the evolutionary process and shows that the mutation rate affects the tendency of genes to cluster. In addition we could relate the slope of the scale-free distribution to the genetic complexity of the organisms and show that a high mutation rate gives rise to a more complex genetic architecture.
Linear control theory for gene network modeling.
Shin, Yong-Jun; Bleris, Leonidas
2010-09-16
Systems biology is an interdisciplinary field that aims at understanding complex interactions in cells. Here we demonstrate that linear control theory can provide valuable insight and practical tools for the characterization of complex biological networks. We provide the foundation for such analyses through the study of several case studies including cascade and parallel forms, feedback and feedforward loops. We reproduce experimental results and provide rational analysis of the observed behavior. We demonstrate that methods such as the transfer function (frequency domain) and linear state-space (time domain) can be used to predict reliably the properties and transient behavior of complex network topologies and point to specific design strategies for synthetic networks.
Mank, Nils N; Berghoff, Bork A; Klug, Gabriele
2013-03-01
Living cells use a variety of regulatory network motifs for accurate gene expression in response to changes in their environment or during differentiation processes. In Rhodobacter sphaeroides, a complex regulatory network controls expression of photosynthesis genes to guarantee optimal energy supply on one hand and to avoid photooxidative stress on the other hand. Recently, we identified a mixed incoherent feed-forward loop comprising the transcription factor PrrA, the sRNA PcrZ and photosynthesis target genes as part of this regulatory network. This point-of-view provides a comparison to other described feed-forward loops and discusses the physiological relevance of PcrZ in more detail.
A mixed incoherent feed-forward loop contributes to the regulation of bacterial photosynthesis genes
Mank, Nils N.; Berghoff, Bork A.; Klug, Gabriele
2013-01-01
Living cells use a variety of regulatory network motifs for accurate gene expression in response to changes in their environment or during differentiation processes. In Rhodobacter sphaeroides, a complex regulatory network controls expression of photosynthesis genes to guarantee optimal energy supply on one hand and to avoid photooxidative stress on the other hand. Recently, we identified a mixed incoherent feed-forward loop comprising the transcription factor PrrA, the sRNA PcrZ and photosynthesis target genes as part of this regulatory network. This point-of-view provides a comparison to other described feed-forward loops and discusses the physiological relevance of PcrZ in more detail. PMID:23392242
Coding and non-coding gene regulatory networks underlie the immune response in liver cirrhosis
Zhang, Xueming; Huang, Yongming; Yang, Zhengpeng; Zhang, Yuguo; Zhang, Weihui; Gao, Zu-hua; Xue, Dongbo
2017-01-01
Liver cirrhosis is recognized as being the consequence of immune-mediated hepatocyte damage and repair processes. However, the regulation of these immune responses underlying liver cirrhosis has not been elucidated. In this study, we used GEO datasets and bioinformatics methods to established coding and non-coding gene regulatory networks including transcription factor-/lncRNA-microRNA-mRNA, and competing endogenous RNA interaction networks. Our results identified 2224 mRNAs, 70 lncRNAs and 46 microRNAs were differentially expressed in liver cirrhosis. The transcription factor -/lncRNA- microRNA-mRNA network we uncovered that results in immune-mediated liver cirrhosis is comprised of 5 core microRNAs (e.g., miR-203; miR-219-5p), 3 transcription factors (i.e., FOXP3, ETS1 and FOS) and 7 lncRNAs (e.g., ENTS00000671336, ENST00000575137). The competing endogenous RNA interaction network we identified includes a complex immune response regulatory subnetwork that controls the entire liver cirrhosis network. Additionally, we found 10 overlapping GO terms shared by both liver cirrhosis and hepatocellular carcinoma including “immune response” as well. Interestingly, the overlapping differentially expressed genes in liver cirrhosis and hepatocellular carcinoma were enriched in immune response-related functional terms. In summary, a complex gene regulatory network underlying immune response processes may play an important role in the development and progression of liver cirrhosis, and its development into hepatocellular carcinoma. PMID:28355233
Parenclitic networks: uncovering new functions in biological data
Zanin, Massimiliano; Alcazar, Joaquín Medina; Carbajosa, Jesus Vicente; Paez, Marcela Gomez; Papo, David; Sousa, Pedro; Menasalvas, Ernestina; Boccaletti, Stefano
2014-01-01
We introduce a novel method to represent time independent, scalar data sets as complex networks. We apply our method to investigate gene expression in the response to osmotic stress of Arabidopsis thaliana. In the proposed network representation, the most important genes for the plant response turn out to be the nodes with highest centrality in appropriately reconstructed networks. We also performed a target experiment, in which the predicted genes were artificially induced one by one, and the growth of the corresponding phenotypes compared to that of the wild-type. The joint application of the network reconstruction method and of the in vivo experiments allowed identifying 15 previously unknown key genes, and provided models of their mutual relationships. This novel representation extends the use of graph theory to data sets hitherto considered outside of the realm of its application, vastly simplifying the characterization of their underlying structure. PMID:24870931
Parenclitic Network Analysis of Methylation Data for Cancer Identification
Karsakov, Alexander; Bartlett, Thomas; Ryblov, Artem; Meyerov, Iosif; Ivanchenko, Mikhail; Zaikin, Alexey
2017-01-01
We make use of ideas from the theory of complex networks to implement a machine learning classification of human DNA methylation data, that carry signatures of cancer development. The data were obtained from patients with various kinds of cancers and represented as parenclictic networks, wherein nodes correspond to genes, and edges are weighted according to pairwise variation from control group subjects. We demonstrate that for the 10 types of cancer under study, it is possible to obtain a high performance of binary classification between cancer-positive and negative samples based on network measures. Remarkably, an accuracy as high as 93−99% is achieved with only 12 network topology indices, in a dramatic reduction of complexity from the original 15295 gene methylation levels. Moreover, it was found that the parenclictic networks are scale-free in cancer-negative subjects, and deviate from the power-law node degree distribution in cancer. The node centrality ranking and arising modular structure could provide insights into the systems biology of cancer. PMID:28107365
Therapeutic synthetic gene networks.
Karlsson, Maria; Weber, Wilfried
2012-10-01
The field of synthetic biology is rapidly expanding and has over the past years evolved from the development of simple gene networks to complex treatment-oriented circuits. The reprogramming of cell fate with open-loop or closed-loop synthetic control circuits along with biologically implemented logical functions have fostered applications spanning over a wide range of disciplines, including artificial insemination, personalized medicine and the treatment of cancer and metabolic disorders. In this review we describe several applications of interactive gene networks, a synthetic biology-based approach for future gene therapy, as well as the utilization of synthetic gene circuits as blueprints for the design of stimuli-responsive biohybrid materials. The recent progress in synthetic biology, including the rewiring of biosensing devices with the body's endogenous network as well as novel therapeutic approaches originating from interdisciplinary work, generates numerous opportunities for future biomedical applications. Copyright © 2012 Elsevier Ltd. All rights reserved.
Reconstructing directed gene regulatory network by only gene expression data.
Zhang, Lu; Feng, Xi Kang; Ng, Yen Kaow; Li, Shuai Cheng
2016-08-18
Accurately identifying gene regulatory network is an important task in understanding in vivo biological activities. The inference of such networks is often accomplished through the use of gene expression data. Many methods have been developed to evaluate gene expression dependencies between transcription factor and its target genes, and some methods also eliminate transitive interactions. The regulatory (or edge) direction is undetermined if the target gene is also a transcription factor. Some methods predict the regulatory directions in the gene regulatory networks by locating the eQTL single nucleotide polymorphism, or by observing the gene expression changes when knocking out/down the candidate transcript factors; regrettably, these additional data are usually unavailable, especially for the samples deriving from human tissues. In this study, we propose the Context Based Dependency Network (CBDN), a method that is able to infer gene regulatory networks with the regulatory directions from gene expression data only. To determine the regulatory direction, CBDN computes the influence of source to target by evaluating the magnitude changes of expression dependencies between the target gene and the others with conditioning on the source gene. CBDN extends the data processing inequality by involving the dependency direction to distinguish between direct and transitive relationship between genes. We also define two types of important regulators which can influence a majority of the genes in the network directly or indirectly. CBDN can detect both of these two types of important regulators by averaging the influence functions of candidate regulator to the other genes. In our experiments with simulated and real data, even with the regulatory direction taken into account, CBDN outperforms the state-of-the-art approaches for inferring gene regulatory network. CBDN identifies the important regulators in the predicted network: 1. TYROBP influences a batch of genes that are related to Alzheimer's disease; 2. ZNF329 and RB1 significantly regulate those 'mesenchymal' gene expression signature genes for brain tumors. By merely leveraging gene expression data, CBDN can efficiently infer the existence of gene-gene interactions as well as their regulatory directions. The constructed networks are helpful in the identification of important regulators for complex diseases.
The Importance of Normalization on Large and Heterogeneous Microarray Datasets
DNA microarray technology is a powerful functional genomics tool increasingly used for investigating global gene expression in environmental studies. Microarrays can also be used in identifying biological networks, as they give insight on the complex gene-to-gene interactions, ne...
Bagot, Rosemary C; Cates, Hannah M; Purushothaman, Immanuel; Lorsch, Zachary S; Walker, Deena M; Wang, Junshi; Huang, Xiaojie; Schlüter, Oliver M; Maze, Ian; Peña, Catherine J; Heller, Elizabeth A; Issler, Orna; Wang, Minghui; Song, Won-Min; Stein, Jason L; Liu, Xiaochuan; Doyle, Marie A; Scobie, Kimberly N; Sun, Hao Sheng; Neve, Rachael L; Geschwind, Daniel; Dong, Yan; Shen, Li; Zhang, Bin; Nestler, Eric J
2016-06-01
Depression is a complex, heterogeneous disorder and a leading contributor to the global burden of disease. Most previous research has focused on individual brain regions and genes contributing to depression. However, emerging evidence in humans and animal models suggests that dysregulated circuit function and gene expression across multiple brain regions drive depressive phenotypes. Here, we performed RNA sequencing on four brain regions from control animals and those susceptible or resilient to chronic social defeat stress at multiple time points. We employed an integrative network biology approach to identify transcriptional networks and key driver genes that regulate susceptibility to depressive-like symptoms. Further, we validated in vivo several key drivers and their associated transcriptional networks that regulate depression susceptibility and confirmed their functional significance at the levels of gene transcription, synaptic regulation, and behavior. Our study reveals novel transcriptional networks that control stress susceptibility and offers fundamentally new leads for antidepressant drug discovery. Copyright © 2016 Elsevier Inc. All rights reserved.
An integrative approach to inferring biologically meaningful gene modules
2011-01-01
Background The ability to construct biologically meaningful gene networks and modules is critical for contemporary systems biology. Though recent studies have demonstrated the power of using gene modules to shed light on the functioning of complex biological systems, most modules in these networks have shown little association with meaningful biological function. We have devised a method which directly incorporates gene ontology (GO) annotation in construction of gene modules in order to gain better functional association. Results We have devised a method, Semantic Similarity-Integrated approach for Modularization (SSIM) that integrates various gene-gene pairwise similarity values, including information obtained from gene expression, protein-protein interactions and GO annotations, in the construction of modules using affinity propagation clustering. We demonstrated the performance of the proposed method using data from two complex biological responses: 1. the osmotic shock response in Saccharomyces cerevisiae, and 2. the prion-induced pathogenic mouse model. In comparison with two previously reported algorithms, modules identified by SSIM showed significantly stronger association with biological functions. Conclusions The incorporation of semantic similarity based on GO annotation with gene expression and protein-protein interaction data can greatly enhance the functional relevance of inferred gene modules. In addition, the SSIM approach can also reveal the hierarchical structure of gene modules to gain a broader functional view of the biological system. Hence, the proposed method can facilitate comprehensive and in-depth analysis of high throughput experimental data at the gene network level. PMID:21791051
Nigam, Deepti; Sawant, Samir V
2013-01-01
Technological development led to an increased interest in systems biological approaches in plants to characterize developmental mechanism and candidate genes relevant to specific tissue or cell morphology. AUX-IAA proteins are important plant-specific putative transcription factors. There are several reports on physiological response of this family in Arabidopsis but in cotton fiber the transcriptional network through which AUX-IAA regulated its target genes is still unknown. in-silico modelling of cotton fiber development specific gene expression data (108 microarrays and 22,737 genes) using Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) reveals 3690 putative AUX-IAA target genes of which 139 genes were known to be AUX-IAA co-regulated within Arabidopsis. Further AUX-IAA targeted gene regulatory network (GRN) had substantial impact on the transcriptional dynamics of cotton fiber, as showed by, altered TF networks, and Gene Ontology (GO) biological processes and metabolic pathway associated with its target genes. Analysis of the AUX-IAA-correlated gene network reveals multiple functions for AUX-IAA target genes such as unidimensional cell growth, cellular nitrogen compound metabolic process, nucleosome organization, DNA-protein complex and process related to cell wall. These candidate networks/pathways have a variety of profound impacts on such cellular functions as stress response, cell proliferation, and cell differentiation. While these functions are fairly broad, their underlying TF networks may provide a global view of AUX-IAA regulated gene expression and a GRN that guides future studies in understanding role of AUX-IAA box protein and its targets regulating fiber development. PMID:24497725
Wang, Jianxin; Chen, Bo; Wang, Yaqun; Wang, Ningtao; Garbey, Marc; Tran-Son-Tay, Roger; Berceli, Scott A.; Wu, Rongling
2013-01-01
The capacity of an organism to respond to its environment is facilitated by the environmentally induced alteration of gene and protein expression, i.e. expression plasticity. The reconstruction of gene regulatory networks based on expression plasticity can gain not only new insights into the causality of transcriptional and cellular processes but also the complex regulatory mechanisms that underlie biological function and adaptation. We describe an approach for network inference by integrating expression plasticity into Shannon’s mutual information. Beyond Pearson correlation, mutual information can capture non-linear dependencies and topology sparseness. The approach measures the network of dependencies of genes expressed in different environments, allowing the environment-induced plasticity of gene dependencies to be tested in unprecedented details. The approach is also able to characterize the extent to which the same genes trigger different amounts of expression in response to environmental changes. We demonstrated the usefulness of this approach through analysing gene expression data from a rabbit vein graft study that includes two distinct blood flow environments. The proposed approach provides a powerful tool for the modelling and analysis of dynamic regulatory networks using gene expression data from distinct environments. PMID:23470995
The impact of network medicine in gastroenterology and hepatology.
Baffy, György
2013-10-01
In the footsteps of groundbreaking achievements made by biomedical research, another scientific revolution is unfolding. Systems biology draws from the chaos and complexity theory and applies computational models to predict emerging behavior of the interactions between genes, gene products, and environmental factors. Adaptation of systems biology to translational and clinical sciences has been termed network medicine, and is likely to change the way we think about preventing, predicting, diagnosing, and treating complex human diseases. Network medicine finds gene-disease associations by analyzing the unparalleled digital information discovered and created by high-throughput technologies (dubbed as "omics" science) and links genetic variance to clinical disease phenotypes through intermediate organizational levels of life such as the epigenome, transcriptome, proteome, and metabolome. Supported by large reference databases, unprecedented data storage capacity, and innovative computational analysis, network medicine is poised to find links between conditions that were thought to be distinct, uncover shared disease mechanisms and key drivers of the pathogenesis, predict individual disease outcomes and trajectories, identify novel therapeutic applications, and help avoid off-target and undesirable drug effects. Recent advances indicate that these perspectives are increasingly within our reach for understanding and managing complex diseases of the digestive system. Copyright © 2013 AGA Institute. Published by Elsevier Inc. All rights reserved.
Mammalian synthetic biology: emerging medical applications.
Kis, Zoltán; Pereira, Hugo Sant'Ana; Homma, Takayuki; Pedrigi, Ryan M; Krams, Rob
2015-05-06
In this review, we discuss new emerging medical applications of the rapidly evolving field of mammalian synthetic biology. We start with simple mammalian synthetic biological components and move towards more complex and therapy-oriented gene circuits. A comprehensive list of ON-OFF switches, categorized into transcriptional, post-transcriptional, translational and post-translational, is presented in the first sections. Subsequently, Boolean logic gates, synthetic mammalian oscillators and toggle switches will be described. Several synthetic gene networks are further reviewed in the medical applications section, including cancer therapy gene circuits, immuno-regulatory networks, among others. The final sections focus on the applicability of synthetic gene networks to drug discovery, drug delivery, receptor-activating gene circuits and mammalian biomanufacturing processes. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
Telonis-Scott, Marina; Sgrò, Carla M.; Hoffmann, Ary A.; Griffin, Philippa C.
2016-01-01
Repeated attempts to map the genomic basis of complex traits often yield different outcomes because of the influence of genetic background, gene-by-environment interactions, and/or statistical limitations. However, where repeatability is low at the level of individual genes, overlap often occurs in gene ontology categories, genetic pathways, and interaction networks. Here we report on the genomic overlap for natural desiccation resistance from a Pool-genome-wide association study experiment and a selection experiment in flies collected from the same region in southeastern Australia in different years. We identified over 600 single nucleotide polymorphisms associated with desiccation resistance in flies derived from almost 1,000 wild-caught genotypes, a similar number of loci to that observed in our previous genomic study of selected lines, demonstrating the genetic complexity of this ecologically important trait. By harnessing the power of cross-study comparison, we narrowed the candidates from almost 400 genes in each study to a core set of 45 genes, enriched for stimulus, stress, and defense responses. In addition to gene-level overlap, there was higher order congruence at the network and functional levels, suggesting genetic redundancy in key stress sensing, stress response, immunity, signaling, and gene expression pathways. We also identified variants linked to different molecular aspects of desiccation physiology previously verified from functional experiments. Our approach provides insight into the genomic basis of a complex and ecologically important trait and predicts candidate genetic pathways to explore in multiple genetic backgrounds and related species within a functional framework. PMID:26733490
Genome-Wide Analysis of the Complex Transcriptional Networks of Rice Developing Seeds
Xue, Liang-Jiao; Zhang, Jing-Jing; Xue, Hong-Wei
2012-01-01
Background The development of rice (Oryza sativa) seed is closely associated with assimilates storage and plant yield, and is fine controlled by complex regulatory networks. Exhaustive transcriptome analysis of developing rice embryo and endosperm will help to characterize the genes possibly involved in the regulation of seed development and provide clues of yield and quality improvement. Principal Findings Our analysis showed that genes involved in metabolism regulation, hormone response and cellular organization processes are predominantly expressed during rice development. Interestingly, 191 transcription factor (TF)-encoding genes are predominantly expressed in seed and 59 TFs are regulated during seed development, some of which are homologs of seed-specific TFs or regulators of Arabidopsis seed development. Gene co-expression network analysis showed these TFs associated with multiple cellular and metabolism pathways, indicating a complex regulation of rice seed development. Further, by employing a cold-resistant cultivar Hanfeng (HF), genome-wide analyses of seed transcriptome at normal and low temperature reveal that rice seed is sensitive to low temperature at early stage and many genes associated with seed development are down-regulated by low temperature, indicating that the delayed development of rice seed by low temperature is mainly caused by the inhibition of the development-related genes. The transcriptional response of seed and seedling to low temperature is different, and the differential expressions of genes in signaling and metabolism pathways may contribute to the chilling tolerance of HF during seed development. Conclusions These results provide informative clues and will significantly improve the understanding of rice seed development regulation and the mechanism of cold response in rice seed. PMID:22363552
Munger, Steven C.; Aylor, David L.; Syed, Haider Ali; Magwene, Paul M.; Threadgill, David W.; Capel, Blanche
2009-01-01
Despite the identification of some key genes that regulate sex determination, most cases of disorders of sexual development remain unexplained. Evidence suggests that the sexual fate decision in the developing gonad depends on a complex network of interacting factors that converge on a critical threshold. To elucidate the transcriptional network underlying sex determination, we took the first expression quantitative trait loci (eQTL) approach in a developing organ. We identified reproducible differences in the transcriptome of the embryonic day 11.5 (E11.5) XY gonad between C57BL/6J (B6) and 129S1/SvImJ (129S1), indicating that the reported sensitivity of B6 to sex reversal is consistent with a higher expression of a female-like transcriptome in B6. Gene expression is highly variable in F2 XY gonads from B6 and 129S1 intercrosses, yet strong correlations emerged. We estimated the F2 coexpression network and predicted roles for genes of unknown function based on their connectivity and position within the network. A genetic analysis of the F2 population detected autosomal regions that control the expression of many sex-related genes, including Sry (sex-determining region of the Y chromosome) and Sox9 (Sry-box containing gene 9), the key regulators of male sex determination. Our results reveal the complex transcription architecture underlying sex determination, and provide a mechanism by which individuals may be sensitized for sex reversal. PMID:19884258
The complexity of gene expression dynamics revealed by permutation entropy
2010-01-01
Background High complexity is considered a hallmark of living systems. Here we investigate the complexity of temporal gene expression patterns using the concept of Permutation Entropy (PE) first introduced in dynamical systems theory. The analysis of gene expression data has so far focused primarily on the identification of differentially expressed genes, or on the elucidation of pathway and regulatory relationships. We aim to study gene expression time series data from the viewpoint of complexity. Results Applying the PE complexity metric to abiotic stress response time series data in Arabidopsis thaliana, genes involved in stress response and signaling were found to be associated with the highest complexity not only under stress, but surprisingly, also under reference, non-stress conditions. Genes with house-keeping functions exhibited lower PE complexity. Compared to reference conditions, the PE of temporal gene expression patterns generally increased upon stress exposure. High-complexity genes were found to have longer upstream intergenic regions and more cis-regulatory motifs in their promoter regions indicative of a more complex regulatory apparatus needed to orchestrate their expression, and to be associated with higher correlation network connectivity degree. Arabidopsis genes also present in other plant species were observed to exhibit decreased PE complexity compared to Arabidopsis specific genes. Conclusions We show that Permutation Entropy is a simple yet robust and powerful approach to identify temporal gene expression profiles of varying complexity that is equally applicable to other types of molecular profile data. PMID:21176199
Park, Chihyun; Yun, So Jeong; Ryu, Sung Jin; Lee, Soyoung; Lee, Young-Sam; Yoon, Youngmi; Park, Sang Chul
2017-03-15
Cellular senescence irreversibly arrests growth of human diploid cells. In addition, recent studies have indicated that senescence is a multi-step evolving process related to important complex biological processes. Most studies analyzed only the genes and their functions representing each senescence phase without considering gene-level interactions and continuously perturbed genes. It is necessary to reveal the genotypic mechanism inferred by affected genes and their interaction underlying the senescence process. We suggested a novel computational approach to identify an integrative network which profiles an underlying genotypic signature from time-series gene expression data. The relatively perturbed genes were selected for each time point based on the proposed scoring measure denominated as perturbation scores. Then, the selected genes were integrated with protein-protein interactions to construct time point specific network. From these constructed networks, the conserved edges across time point were extracted for the common network and statistical test was performed to demonstrate that the network could explain the phenotypic alteration. As a result, it was confirmed that the difference of average perturbation scores of common networks at both two time points could explain the phenotypic alteration. We also performed functional enrichment on the common network and identified high association with phenotypic alteration. Remarkably, we observed that the identified cell cycle specific common network played an important role in replicative senescence as a key regulator. Heretofore, the network analysis from time series gene expression data has been focused on what topological structure was changed over time point. Conversely, we focused on the conserved structure but its context was changed in course of time and showed it was available to explain the phenotypic changes. We expect that the proposed method will help to elucidate the biological mechanism unrevealed by the existing approaches.
Malashchuk, Igor; Lajoie, Brian R.; Mardaryev, Andrei N.; Gdula, Michal R.; Sharov, Andrey A.; Kohwi-Shigematsu, Terumi; Fessing, Michael Y.
2017-01-01
Mammalian genomes contain several dozens of large (>0.5 Mbp) lineage-specific gene loci harbouring functionally related genes. However, spatial chromatin folding, organization of the enhancer-promoter networks and their relevance to Topologically Associating Domains (TADs) in these loci remain poorly understood. TADs are principle units of the genome folding and represents the DNA regions within which DNA interacts more frequently and less frequently across the TAD boundary. Here, we used Chromatin Conformation Capture Carbon Copy (5C) technology to characterize spatial chromatin interaction network in the 3.1 Mb Epidermal Differentiation Complex (EDC) locus harbouring 61 functionally related genes that show lineage-specific activation during terminal keratinocyte differentiation in the epidermis. 5C data validated by 3D-FISH demonstrate that the EDC locus is organized into several TADs showing distinct lineage-specific chromatin interaction networks based on their transcription activity and the gene-rich or gene-poor status. Correlation of the 5C results with genome-wide studies for enhancer-specific histone modifications (H3K4me1 and H3K27ac) revealed that the majority of spatial chromatin interactions that involves the gene-rich TADs at the EDC locus in keratinocytes include both intra- and inter-TAD interaction networks, connecting gene promoters and enhancers. Compared to thymocytes in which the EDC locus is mostly transcriptionally inactive, these interactions were found to be keratinocyte-specific. In keratinocytes, the promoter-enhancer anchoring regions in the gene-rich transcriptionally active TADs are enriched for the binding of chromatin architectural proteins CTCF, Rad21 and chromatin remodeler Brg1. In contrast to gene-rich TADs, gene-poor TADs show preferential spatial contacts with each other, do not contain active enhancers and show decreased binding of CTCF, Rad21 and Brg1 in keratinocytes. Thus, spatial interactions between gene promoters and enhancers at the multi-TAD EDC locus in skin epithelial cells are cell type-specific and involve extensive contacts within TADs as well as between different gene-rich TADs, forming the framework for lineage-specific transcription. PMID:28863138
Detecting phenotype-driven transitions in regulatory network structure.
Padi, Megha; Quackenbush, John
2018-01-01
Complex traits and diseases like human height or cancer are often not caused by a single mutation or genetic variant, but instead arise from functional changes in the underlying molecular network. Biological networks are known to be highly modular and contain dense "communities" of genes that carry out cellular processes, but these structures change between tissues, during development, and in disease. While many methods exist for inferring networks and analyzing their topologies separately, there is a lack of robust methods for quantifying differences in network structure. Here, we describe ALPACA (ALtered Partitions Across Community Architectures), a method for comparing two genome-scale networks derived from different phenotypic states to identify condition-specific modules. In simulations, ALPACA leads to more nuanced, sensitive, and robust module discovery than currently available network comparison methods. As an application, we use ALPACA to compare transcriptional networks in three contexts: angiogenic and non-angiogenic subtypes of ovarian cancer, human fibroblasts expressing transforming viral oncogenes, and sexual dimorphism in human breast tissue. In each case, ALPACA identifies modules enriched for processes relevant to the phenotype. For example, modules specific to angiogenic ovarian tumors are enriched for genes associated with blood vessel development, and modules found in female breast tissue are enriched for genes involved in estrogen receptor and ERK signaling. The functional relevance of these new modules suggests that not only can ALPACA identify structural changes in complex networks, but also that these changes may be relevant for characterizing biological phenotypes.
GARNET--gene set analysis with exploration of annotation relations.
Rho, Kyoohyoung; Kim, Bumjin; Jang, Youngjun; Lee, Sanghyun; Bae, Taejeong; Seo, Jihae; Seo, Chaehwa; Lee, Jihyun; Kang, Hyunjung; Yu, Ungsik; Kim, Sunghoon; Lee, Sanghyuk; Kim, Wan Kyu
2011-02-15
Gene set analysis is a powerful method of deducing biological meaning for an a priori defined set of genes. Numerous tools have been developed to test statistical enrichment or depletion in specific pathways or gene ontology (GO) terms. Major difficulties towards biological interpretation are integrating diverse types of annotation categories and exploring the relationships between annotation terms of similar information. GARNET (Gene Annotation Relationship NEtwork Tools) is an integrative platform for gene set analysis with many novel features. It includes tools for retrieval of genes from annotation database, statistical analysis & visualization of annotation relationships, and managing gene sets. In an effort to allow access to a full spectrum of amassed biological knowledge, we have integrated a variety of annotation data that include the GO, domain, disease, drug, chromosomal location, and custom-defined annotations. Diverse types of molecular networks (pathways, transcription and microRNA regulations, protein-protein interaction) are also included. The pair-wise relationship between annotation gene sets was calculated using kappa statistics. GARNET consists of three modules--gene set manager, gene set analysis and gene set retrieval, which are tightly integrated to provide virtually automatic analysis for gene sets. A dedicated viewer for annotation network has been developed to facilitate exploration of the related annotations. GARNET (gene annotation relationship network tools) is an integrative platform for diverse types of gene set analysis, where complex relationships among gene annotations can be easily explored with an intuitive network visualization tool (http://garnet.isysbio.org/ or http://ercsb.ewha.ac.kr/garnet/).
Cánovas, Angela; Reverter, Antonio; DeAtley, Kasey L.; Ashley, Ryan L.; Colgrave, Michelle L.; Fortes, Marina R. S.; Islas-Trejo, Alma; Lehnert, Sigrid; Porto-Neto, Laercio; Rincón, Gonzalo; Silver, Gail A.; Snelling, Warren M.; Medrano, Juan F.; Thomas, Milton G.
2014-01-01
Puberty is a complex physiological event by which animals mature into an adult capable of sexual reproduction. In order to enhance our understanding of the genes and regulatory pathways and networks involved in puberty, we characterized the transcriptome of five reproductive tissues (i.e. hypothalamus, pituitary gland, ovary, uterus, and endometrium) as well as tissues known to be relevant to growth and metabolism needed to achieve puberty (i.e., longissimus dorsi muscle, adipose, and liver). These tissues were collected from pre- and post-pubertal Brangus heifers (3/8 Brahman; Bos indicus x 5/8 Angus; Bos taurus) derived from a population of cattle used to identify quantitative trait loci associated with fertility traits (i.e., age of first observed corpus luteum (ACL), first service conception (FSC), and heifer pregnancy (HPG)). In order to exploit the power of complementary omics analyses, pre- and post-puberty co-expression gene networks were constructed by combining the results from genome-wide association studies (GWAS), RNA-Seq, and bovine transcription factors. Eight tissues among pre-pubertal and post-pubertal Brangus heifers revealed 1,515 differentially expressed and 943 tissue-specific genes within the 17,832 genes confirmed by RNA-Seq analysis. The hypothalamus experienced the most notable up-regulation of genes via puberty (i.e., 204 out of 275 genes). Combining the results of GWAS and RNA-Seq, we identified 25 loci containing a single nucleotide polymorphism (SNP) associated with ACL, FSC, and (or) HPG. Seventeen of these SNP were within a gene and 13 of the genes were expressed in uterus or endometrium. Multi-tissue omics analyses revealed 2,450 co-expressed genes relative to puberty. The pre-pubertal network had 372,861 connections whereas the post-pubertal network had 328,357 connections. A sub-network from this process revealed key transcriptional regulators (i.e., PITX2, FOXA1, DACH2, PROP1, SIX6, etc.). Results from these multi-tissue omics analyses improve understanding of the number of genes and their complex interactions for puberty in cattle. PMID:25048735
Modelling the influence of parental effects on gene-network evolution.
Odorico, Andreas; Rünneburger, Estelle; Le Rouzic, Arnaud
2018-05-01
Understanding the importance of nongenetic heredity in the evolutionary process is a major topic in modern evolutionary biology. We modified a classical gene-network model by allowing parental transmission of gene expression and studied its evolutionary properties through individual-based simulations. We identified ontogenetic time (i.e. the time gene networks have to stabilize before being submitted to natural selection) as a crucial factor in determining the evolutionary impact of this phenotypic inheritance. Indeed, fast-developing organisms display enhanced adaptation and greater robustness to mutations when evolving in presence of nongenetic inheritance (NGI). In contrast, in our model, long development reduces the influence of the inherited state of the gene network. NGI thus had a negligible effect on the evolution of gene networks when the speed at which transcription levels reach equilibrium is not constrained. Nevertheless, simulations show that intergenerational transmission of the gene-network state negatively affects the evolution of robustness to environmental disturbances for either fast- or slow-developing organisms. Therefore, these results suggest that the evolutionary consequences of NGI might not be sought only in the way species respond to selection, but also on the evolution of emergent properties (such as environmental and genetic canalization) in complex genetic architectures. © 2018 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2018 European Society For Evolutionary Biology.
Feng, Juerong; Zhou, Rui; Chang, Ying; Liu, Jing; Zhao, Qiu
2017-01-01
Hepatocellular carcinoma (HCC) has a high incidence and mortality worldwide, and its carcinogenesis and progression are influenced by a complex network of gene interactions. A weighted gene co-expression network was constructed to identify gene modules associated with the clinical traits in HCC (n = 214). Among the 13 modules, high correlation was only found between the red module and metastasis risk (classified by the HCC metastasis gene signature) (R2 = −0.74). Moreover, in the red module, 34 network hub genes for metastasis risk were identified, six of which (ABAT, AGXT, ALDH6A1, CYP4A11, DAO and EHHADH) were also hub nodes in the protein-protein interaction network of the module genes. Thus, a total of six hub genes were identified. In validation, all hub genes showed a negative correlation with the four-stage HCC progression (P for trend < 0.05) in the test set. Furthermore, in the training set, HCC samples with any hub gene lowly expressed demonstrated a higher recurrence rate and poorer survival rate (hazard ratios with 95% confidence intervals > 1). RNA-sequencing data of 142 HCC samples showed consistent results in the prognosis. Gene set enrichment analysis (GSEA) demonstrated that in the samples with any hub gene highly expressed, a total of 24 functional gene sets were enriched, most of which focused on amino acid metabolism and oxidation. In conclusion, co-expression network analysis identified six hub genes in association with HCC metastasis risk and prognosis, which might improve the prognosis by influencing amino acid metabolism and oxidation. PMID:28430663
Modeling gene regulatory network motifs using statecharts
2012-01-01
Background Gene regulatory networks are widely used by biologists to describe the interactions among genes, proteins and other components at the intra-cellular level. Recently, a great effort has been devoted to give gene regulatory networks a formal semantics based on existing computational frameworks. For this purpose, we consider Statecharts, which are a modular, hierarchical and executable formal model widely used to represent software systems. We use Statecharts for modeling small and recurring patterns of interactions in gene regulatory networks, called motifs. Results We present an improved method for modeling gene regulatory network motifs using Statecharts and we describe the successful modeling of several motifs, including those which could not be modeled or whose models could not be distinguished using the method of a previous proposal. We model motifs in an easy and intuitive way by taking advantage of the visual features of Statecharts. Our modeling approach is able to simulate some interesting temporal properties of gene regulatory network motifs: the delay in the activation and the deactivation of the "output" gene in the coherent type-1 feedforward loop, the pulse in the incoherent type-1 feedforward loop, the bistability nature of double positive and double negative feedback loops, the oscillatory behavior of the negative feedback loop, and the "lock-in" effect of positive autoregulation. Conclusions We present a Statecharts-based approach for the modeling of gene regulatory network motifs in biological systems. The basic motifs used to build more complex networks (that is, simple regulation, reciprocal regulation, feedback loop, feedforward loop, and autoregulation) can be faithfully described and their temporal dynamics can be analyzed. PMID:22536967
Rough set soft computing cancer classification and network: one stone, two birds.
Zhang, Yue
2010-07-15
Gene expression profiling provides tremendous information to help unravel the complexity of cancer. The selection of the most informative genes from huge noise for cancer classification has taken centre stage, along with predicting the function of such identified genes and the construction of direct gene regulatory networks at different system levels with a tuneable parameter. A new study by Wang and Gotoh described a novel Variable Precision Rough Sets-rooted robust soft computing method to successfully address these problems and has yielded some new insights. The significance of this progress and its perspectives will be discussed in this article.
HEDD: Human Enhancer Disease Database
Wang, Zhen; Zhang, Quanwei; Zhang, Wen; Lin, Jhih-Rong; Cai, Ying; Mitra, Joydeep
2018-01-01
Abstract Enhancers, as specialized genomic cis-regulatory elements, activate transcription of their target genes and play an important role in pathogenesis of many human complex diseases. Despite recent systematic identification of them in the human genome, currently there is an urgent need for comprehensive annotation databases of human enhancers with a focus on their disease connections. In response, we built the Human Enhancer Disease Database (HEDD) to facilitate studies of enhancers and their potential roles in human complex diseases. HEDD currently provides comprehensive genomic information for ∼2.8 million human enhancers identified by ENCODE, FANTOM5 and RoadMap with disease association scores based on enhancer–gene and gene–disease connections. It also provides Web-based analytical tools to visualize enhancer networks and score enhancers given a set of selected genes in a specific gene network. HEDD is freely accessible at http://zdzlab.einstein.yu.edu/1/hedd.php. PMID:29077884
Genes uniquely expressed in human growth plate chondrocytes uncover a distinct regulatory network.
Li, Bing; Balasubramanian, Karthika; Krakow, Deborah; Cohn, Daniel H
2017-12-20
Chondrogenesis is the earliest stage of skeletal development and is a highly dynamic process, integrating the activities and functions of transcription factors, cell signaling molecules and extracellular matrix proteins. The molecular mechanisms underlying chondrogenesis have been extensively studied and multiple key regulators of this process have been identified. However, a genome-wide overview of the gene regulatory network in chondrogenesis has not been achieved. In this study, employing RNA sequencing, we identified 332 protein coding genes and 34 long non-coding RNA (lncRNA) genes that are highly selectively expressed in human fetal growth plate chondrocytes. Among the protein coding genes, 32 genes were associated with 62 distinct human skeletal disorders and 153 genes were associated with skeletal defects in knockout mice, confirming their essential roles in skeletal formation. These gene products formed a comprehensive physical interaction network and participated in multiple cellular processes regulating skeletal development. The data also revealed 34 transcription factors and 11,334 distal enhancers that were uniquely active in chondrocytes, functioning as transcriptional regulators for the cartilage-selective genes. Our findings revealed a complex gene regulatory network controlling skeletal development whereby transcription factors, enhancers and lncRNAs participate in chondrogenesis by transcriptional regulation of key genes. Additionally, the cartilage-selective genes represent candidate genes for unsolved human skeletal disorders.
USDA-ARS?s Scientific Manuscript database
To balance the demand for uptake of essential elements with their potential toxicity living cells have complex regulatory mechanisms. Here, we describe a genome-wide screen to identify genes that impact the elemental composition (‘ionome’) of yeast Saccharomyces cerevisiae. Using inductively coupled...
Resistance Genes in Global Crop Breeding Networks.
Garrett, K A; Andersen, K F; Asche, F; Bowden, R L; Forbes, G A; Kulakow, P A; Zhou, B
2017-10-01
Resistance genes are a major tool for managing crop diseases. The networks of crop breeders who exchange resistance genes and deploy them in varieties help to determine the global landscape of resistance and epidemics, an important system for maintaining food security. These networks function as a complex adaptive system, with associated strengths and vulnerabilities, and implications for policies to support resistance gene deployment strategies. Extensions of epidemic network analysis can be used to evaluate the multilayer agricultural networks that support and influence crop breeding networks. Here, we evaluate the general structure of crop breeding networks for cassava, potato, rice, and wheat. All four are clustered due to phytosanitary and intellectual property regulations, and linked through CGIAR hubs. Cassava networks primarily include public breeding groups, whereas others are more mixed. These systems must adapt to global change in climate and land use, the emergence of new diseases, and disruptive breeding technologies. Research priorities to support policy include how best to maintain both diversity and redundancy in the roles played by individual crop breeding groups (public versus private and global versus local), and how best to manage connectivity to optimize resistance gene deployment while avoiding risks to the useful life of resistance genes. [Formula: see text] Copyright © 2017 The Author(s). This is an open access article distributed under the CC BY 4.0 International license .
Grappling with the HOX network in hematopoiesis and leukemia.
McGonigle, Glenda J; Lappin, Terence R J; Thompson, Alexander
2008-05-01
The mammalian HOX gene network encodes a family of proteins which act as master regulators of developmental processes such as embryogenesis and hematopoiesis. The complex arrangement, regulation and co-factor association of HOX has been an area of intense research, particularly in cancer biology, for over a decade. The concept of redeployment of embryonic regulators in the neoplastic arena has received support from many quarters. Observations of altered HOX gene expression in various solid tumours and leukemia appear to support the thesis that 'oncology recapitulates ontogeny' but the identification of critical HOX subsets and their functional role in cancer onset and maintenance requires further investigation. The application of novel techniques and model systems will continue to enhance our understanding of the HOX network in the years to come. Better understanding of the intricacy of the complex as well as identification of functional pathways and direct targets of the encoded proteins will permit harnessing of this family of genes for clinical application.
Gene network analysis: from heart development to cardiac therapy.
Ferrazzi, Fulvia; Bellazzi, Riccardo; Engel, Felix B
2015-03-01
Networks offer a flexible framework to represent and analyse the complex interactions between components of cellular systems. In particular gene networks inferred from expression data can support the identification of novel hypotheses on regulatory processes. In this review we focus on the use of gene network analysis in the study of heart development. Understanding heart development will promote the elucidation of the aetiology of congenital heart disease and thus possibly improve diagnostics. Moreover, it will help to establish cardiac therapies. For example, understanding cardiac differentiation during development will help to guide stem cell differentiation required for cardiac tissue engineering or to enhance endogenous repair mechanisms. We introduce different methodological frameworks to infer networks from expression data such as Boolean and Bayesian networks. Then we present currently available temporal expression data in heart development and discuss the use of network-based approaches in published studies. Collectively, our literature-based analysis indicates that gene network analysis constitutes a promising opportunity to infer therapy-relevant regulatory processes in heart development. However, the use of network-based approaches has so far been limited by the small amount of samples in available datasets. Thus, we propose to acquire high-resolution temporal expression data to improve the mathematical descriptions of regulatory processes obtained with gene network inference methodologies. Especially probabilistic methods that accommodate the intrinsic variability of biological systems have the potential to contribute to a deeper understanding of heart development.
Finding gene regulatory network candidates using the gene expression knowledge base.
Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin
2014-12-10
Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.
Ni, Jingchao; Koyuturk, Mehmet; Tong, Hanghang; Haines, Jonathan; Xu, Rong; Zhang, Xiang
2016-11-10
Accurately prioritizing candidate disease genes is an important and challenging problem. Various network-based methods have been developed to predict potential disease genes by utilizing the disease similarity network and molecular networks such as protein interaction or gene co-expression networks. Although successful, a common limitation of the existing methods is that they assume all diseases share the same molecular network and a single generic molecular network is used to predict candidate genes for all diseases. However, different diseases tend to manifest in different tissues, and the molecular networks in different tissues are usually different. An ideal method should be able to incorporate tissue-specific molecular networks for different diseases. In this paper, we develop a robust and flexible method to integrate tissue-specific molecular networks for disease gene prioritization. Our method allows each disease to have its own tissue-specific network(s). We formulate the problem of candidate gene prioritization as an optimization problem based on network propagation. When there are multiple tissue-specific networks available for a disease, our method can automatically infer the relative importance of each tissue-specific network. Thus it is robust to the noisy and incomplete network data. To solve the optimization problem, we develop fast algorithms which have linear time complexities in the number of nodes in the molecular networks. We also provide rigorous theoretical foundations for our algorithms in terms of their optimality and convergence properties. Extensive experimental results show that our method can significantly improve the accuracy of candidate gene prioritization compared with the state-of-the-art methods. In our experiments, we compare our methods with 7 popular network-based disease gene prioritization algorithms on diseases from Online Mendelian Inheritance in Man (OMIM) database. The experimental results demonstrate that our methods recover true associations more accurately than other methods in terms of AUC values, and the performance differences are significant (with paired t-test p-values less than 0.05). This validates the importance to integrate tissue-specific molecular networks for studying disease gene prioritization and show the superiority of our network models and ranking algorithms toward this purpose. The source code and datasets are available at http://nijingchao.github.io/CRstar/ .
2014-01-01
Background Network inference of gene expression data is an important challenge in systems biology. Novel algorithms may provide more detailed gene regulatory networks (GRN) for complex, chronic inflammatory diseases such as rheumatoid arthritis (RA), in which activated synovial fibroblasts (SFBs) play a major role. Since the detailed mechanisms underlying this activation are still unclear, simultaneous investigation of multi-stimuli activation of SFBs offers the possibility to elucidate the regulatory effects of multiple mediators and to gain new insights into disease pathogenesis. Methods A GRN was therefore inferred from RA-SFBs treated with 4 different stimuli (IL-1 β, TNF- α, TGF- β, and PDGF-D). Data from time series microarray experiments (0, 1, 2, 4, 12 h; Affymetrix HG-U133 Plus 2.0) were batch-corrected applying ‘ComBat’, analyzed for differentially expressed genes over time with ‘Limma’, and used for the inference of a robust GRN with NetGenerator V2.0, a heuristic ordinary differential equation-based method with soft integration of prior knowledge. Results Using all genes differentially expressed over time in RA-SFBs for any stimulus, and selecting the genes belonging to the most significant gene ontology (GO) term, i.e., ‘cartilage development’, a dynamic, robust, moderately complex multi-stimuli GRN was generated with 24 genes and 57 edges in total, 31 of which were gene-to-gene edges. Prior literature-based knowledge derived from Pathway Studio or manual searches was reflected in the final network by 25/57 confirmed edges (44%). The model contained known network motifs crucial for dynamic cellular behavior, e.g., cross-talk among pathways, positive feed-back loops, and positive feed-forward motifs (including suppression of the transcriptional repressor OSR2 by all 4 stimuli. Conclusion A multi-stimuli GRN highly concordant with literature data was successfully generated by network inference from the gene expression of stimulated RA-SFBs. The GRN showed high reliability, since 10 predicted edges were independently validated by literature findings post network inference. The selected GO term ‘cartilage development’ contained a number of differentiation markers, growth factors, and transcription factors with potential relevance for RA. Finally, the model provided new insight into the response of RA-SFBs to multiple stimuli implicated in the pathogenesis of RA, in particular to the ‘novel’ potent growth factor PDGF-D. PMID:24989895
Kinetic models of gene expression including non-coding RNAs
NASA Astrophysics Data System (ADS)
Zhdanov, Vladimir P.
2011-03-01
In cells, genes are transcribed into mRNAs, and the latter are translated into proteins. Due to the feedbacks between these processes, the kinetics of gene expression may be complex even in the simplest genetic networks. The corresponding models have already been reviewed in the literature. A new avenue in this field is related to the recognition that the conventional scenario of gene expression is fully applicable only to prokaryotes whose genomes consist of tightly packed protein-coding sequences. In eukaryotic cells, in contrast, such sequences are relatively rare, and the rest of the genome includes numerous transcript units representing non-coding RNAs (ncRNAs). During the past decade, it has become clear that such RNAs play a crucial role in gene expression and accordingly influence a multitude of cellular processes both in the normal state and during diseases. The numerous biological functions of ncRNAs are based primarily on their abilities to silence genes via pairing with a target mRNA and subsequently preventing its translation or facilitating degradation of the mRNA-ncRNA complex. Many other abilities of ncRNAs have been discovered as well. Our review is focused on the available kinetic models describing the mRNA, ncRNA and protein interplay. In particular, we systematically present the simplest models without kinetic feedbacks, models containing feedbacks and predicting bistability and oscillations in simple genetic networks, and models describing the effect of ncRNAs on complex genetic networks. Mathematically, the presentation is based primarily on temporal mean-field kinetic equations. The stochastic and spatio-temporal effects are also briefly discussed.
Portrait of Candida Species Biofilm Regulatory Network Genes.
Araújo, Daniela; Henriques, Mariana; Silva, Sónia
2017-01-01
Most cases of candidiasis have been attributed to Candida albicans, but Candida glabrata, Candida parapsilosis and Candida tropicalis, designated as non-C. albicans Candida (NCAC), have been identified as frequent human pathogens. Moreover, Candida biofilms are an escalating clinical problem associated with significant rates of mortality. Biofilms have distinct developmental phases, including adhesion/colonisation, maturation and dispersal, controlled by complex regulatory networks. This review discusses recent advances regarding Candida species biofilm regulatory network genes, which are key components for candidiasis. Copyright © 2016 Elsevier Ltd. All rights reserved.
An Arabidopsis gene regulatory network for secondary cell wall synthesis
Taylor-Teeples, M.; Lin, L.; de Lucas, M.; ...
2014-12-24
The plant cell wall is an important factor for determining cell shape, function and response to the environment. Secondary cell walls, such as those found in xylem, are composed of cellulose, hemicelluloses and lignin and account for the bulk of plant biomass. The coordination between transcriptional regulation of synthesis for each polymer is complex and vital to cell function. A regulatory hierarchy of developmental switches has been proposed, although the full complement of regulators remains unknown. In this paper, we present a protein–DNA network between Arabidopsis thaliana transcription factors and secondary cell wall metabolic genes with gene expression regulated bymore » a series of feed-forward loops. This model allowed us to develop and validate new hypotheses about secondary wall gene regulation under abiotic stress. Distinct stresses are able to perturb targeted genes to potentially promote functional adaptation. Finally, these interactions will serve as a foundation for understanding the regulation of a complex, integral plant component.« less
Co-expression networks reveal the tissue-specific regulation of transcription and splicing
Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D.H.; Jo, Brian; Gao, Chuan; McDowell, Ian C.; Engelhardt, Barbara E.
2017-01-01
Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues. PMID:29021288
Jachiet, Pierre-Alain; Colson, Philippe; Lopez, Philippe; Bapteste, Eric
2014-01-01
Complex nongradual evolutionary processes such as gene remodeling are difficult to model, to visualize, and to investigate systematically. Despite these challenges, the creation of composite (or mosaic) genes by combination of genetic segments from unrelated gene families was established as an important adaptive phenomena in eukaryotic genomes. In contrast, almost no general studies have been conducted to quantify composite genes in viruses. Although viral genome mosaicism has been well-described, the extent of gene mosaicism and its rules of emergence remain largely unexplored. Applying methods from graph theory to inclusive similarity networks, and using data from more than 3,000 complete viral genomes, we provide the first demonstration that composite genes in viruses are 1) functionally biased, 2) involved in key aspects of the arm race between cells and viruses, and 3) can be classified into two distinct types of composite genes in all viral classes. Beyond the quantification of the widespread recombination of genes among different viruses of the same class, we also report a striking sharing of genetic information between viruses of different classes and with different nucleic acid types. This latter discovery provides novel evidence for the existence of a large and complex mobilome network, which appears partly bound by the sharing of genetic information and by the formation of composite genes between mobile entities with different genetic material. Considering that there are around 10E31 viruses on the planet, gene remodeling appears as a hugely significant way of generating and moving novel sequences between different kinds of organisms on Earth. PMID:25104113
2012-01-01
Visualization and analysis of molecular networks are both central to systems biology. However, there still exists a large technological gap between them, especially when assessing multiple network levels or hierarchies. Here we present RedeR, an R/Bioconductor package combined with a Java core engine for representing modular networks. The functionality of RedeR is demonstrated in two different scenarios: hierarchical and modular organization in gene co-expression networks and nested structures in time-course gene expression subnetworks. Our results demonstrate RedeR as a new framework to deal with the multiple network levels that are inherent to complex biological systems. RedeR is available from http://bioconductor.org/packages/release/bioc/html/RedeR.html. PMID:22531049
NDRC: A Disease-Causing Genes Prioritized Method Based on Network Diffusion and Rank Concordance.
Fang, Minghong; Hu, Xiaohua; Wang, Yan; Zhao, Junmin; Shen, Xianjun; He, Tingting
2015-07-01
Disease-causing genes prioritization is very important to understand disease mechanisms and biomedical applications, such as design of drugs. Previous studies have shown that promising candidate genes are mostly ranked according to their relatedness to known disease genes or closely related disease genes. Therefore, a dangling gene (isolated gene) with no edges in the network can not be effectively prioritized. These approaches tend to prioritize those genes that are highly connected in the PPI network while perform poorly when they are applied to loosely connected disease genes. To address these problems, we propose a new disease-causing genes prioritization method that based on network diffusion and rank concordance (NDRC). The method is evaluated by leave-one-out cross validation on 1931 diseases in which at least one gene is known to be involved, and it is able to rank the true causal gene first in 849 of all 2542 cases. The experimental results suggest that NDRC significantly outperforms other existing methods such as RWR, VAVIEN, DADA and PRINCE on identifying loosely connected disease genes and successfully put dangling genes as potential candidate disease genes. Furthermore, we apply NDRC method to study three representative diseases, Meckel syndrome 1, Protein C deficiency and Peroxisome biogenesis disorder 1A (Zellweger). Our study has also found that certain complex disease-causing genes can be divided into several modules that are closely associated with different disease phenotype.
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks.
Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
2017-01-01
The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS- Bcl-2 and RAS-AKT, and found significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
Chen, Bor-Sen; Lin, Ying-Po
2013-01-01
Robust stabilization and environmental disturbance attenuation are ubiquitous systematic properties that are observed in biological systems at many different levels. The underlying principles for robust stabilization and environmental disturbance attenuation are universal to both complex biological systems and sophisticated engineering systems. In many biological networks, network robustness should be large enough to confer: intrinsic robustness for tolerating intrinsic parameter fluctuations; genetic robustness for buffering genetic variations; and environmental robustness for resisting environmental disturbances. Network robustness is needed so phenotype stability of biological network can be maintained, guaranteeing phenotype robustness. Synthetic biology is foreseen to have important applications in biotechnology and medicine; it is expected to contribute significantly to a better understanding of functioning of complex biological systems. This paper presents a unifying mathematical framework for investigating the principles of both robust stabilization and environmental disturbance attenuation for synthetic gene networks in synthetic biology. Further, from the unifying mathematical framework, we found that the phenotype robustness criterion for synthetic gene networks is the following: if intrinsic robustness + genetic robustness + environmental robustness ≦ network robustness, then the phenotype robustness can be maintained in spite of intrinsic parameter fluctuations, genetic variations, and environmental disturbances. Therefore, the trade-offs between intrinsic robustness, genetic robustness, environmental robustness, and network robustness in synthetic biology can also be investigated through corresponding phenotype robustness criteria from the systematic point of view. Finally, a robust synthetic design that involves network evolution algorithms with desired behavior under intrinsic parameter fluctuations, genetic variations, and environmental disturbances, is also proposed, together with a simulation example. PMID:23515190
Romero-Garcia, Rafael; Whitaker, Kirstie J; Váša, František; Seidlitz, Jakob; Shinn, Maxwell; Fonagy, Peter; Dolan, Raymond J; Jones, Peter B; Goodyer, Ian M; Bullmore, Edward T; Vértes, Petra E
2018-05-01
Complex network topology is characteristic of many biological systems, including anatomical and functional brain networks (connectomes). Here, we first constructed a structural covariance network from MRI measures of cortical thickness on 296 healthy volunteers, aged 14-24 years. Next, we designed a new algorithm for matching sample locations from the Allen Brain Atlas to the nodes of the SCN. Subsequently we used this to define, transcriptomic brain networks by estimating gene co-expression between pairs of cortical regions. Finally, we explored the hypothesis that transcriptional networks and structural MRI connectomes are coupled. A transcriptional brain network (TBN) and a structural covariance network (SCN) were correlated across connection weights and showed qualitatively similar complex topological properties: assortativity, small-worldness, modularity, and a rich-club. In both networks, the weight of an edge was inversely related to the anatomical (Euclidean) distance between regions. There were differences between networks in degree and distance distributions: the transcriptional network had a less fat-tailed degree distribution and a less positively skewed distance distribution than the SCN. However, cortical areas connected to each other within modules of the SCN had significantly higher levels of whole genome co-expression than expected by chance. Nodes connected in the SCN had especially high levels of expression and co-expression of a human supragranular enriched (HSE) gene set that has been specifically located to supragranular layers of human cerebral cortex and is known to be important for large-scale, long-distance cortico-cortical connectivity. This coupling of brain transcriptome and connectome topologies was largely but not entirely accounted for by the common constraint of physical distance on both networks. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Chen, Bor-Sen; Lin, Ying-Po
2013-01-01
Robust stabilization and environmental disturbance attenuation are ubiquitous systematic properties that are observed in biological systems at many different levels. The underlying principles for robust stabilization and environmental disturbance attenuation are universal to both complex biological systems and sophisticated engineering systems. In many biological networks, network robustness should be large enough to confer: intrinsic robustness for tolerating intrinsic parameter fluctuations; genetic robustness for buffering genetic variations; and environmental robustness for resisting environmental disturbances. Network robustness is needed so phenotype stability of biological network can be maintained, guaranteeing phenotype robustness. Synthetic biology is foreseen to have important applications in biotechnology and medicine; it is expected to contribute significantly to a better understanding of functioning of complex biological systems. This paper presents a unifying mathematical framework for investigating the principles of both robust stabilization and environmental disturbance attenuation for synthetic gene networks in synthetic biology. Further, from the unifying mathematical framework, we found that the phenotype robustness criterion for synthetic gene networks is the following: if intrinsic robustness + genetic robustness + environmental robustness ≦ network robustness, then the phenotype robustness can be maintained in spite of intrinsic parameter fluctuations, genetic variations, and environmental disturbances. Therefore, the trade-offs between intrinsic robustness, genetic robustness, environmental robustness, and network robustness in synthetic biology can also be investigated through corresponding phenotype robustness criteria from the systematic point of view. Finally, a robust synthetic design that involves network evolution algorithms with desired behavior under intrinsic parameter fluctuations, genetic variations, and environmental disturbances, is also proposed, together with a simulation example.
Beal, Jacob; Lu, Ting; Weiss, Ron
2011-01-01
Background The field of synthetic biology promises to revolutionize our ability to engineer biological systems, providing important benefits for a variety of applications. Recent advances in DNA synthesis and automated DNA assembly technologies suggest that it is now possible to construct synthetic systems of significant complexity. However, while a variety of novel genetic devices and small engineered gene networks have been successfully demonstrated, the regulatory complexity of synthetic systems that have been reported recently has somewhat plateaued due to a variety of factors, including the complexity of biology itself and the lag in our ability to design and optimize sophisticated biological circuitry. Methodology/Principal Findings To address the gap between DNA synthesis and circuit design capabilities, we present a platform that enables synthetic biologists to express desired behavior using a convenient high-level biologically-oriented programming language, Proto. The high level specification is compiled, using a regulatory motif based mechanism, to a gene network, optimized, and then converted to a computational simulation for numerical verification. Through several example programs we illustrate the automated process of biological system design with our platform, and show that our compiler optimizations can yield significant reductions in the number of genes () and latency of the optimized engineered gene networks. Conclusions/Significance Our platform provides a convenient and accessible tool for the automated design of sophisticated synthetic biological systems, bridging an important gap between DNA synthesis and circuit design capabilities. Our platform is user-friendly and features biologically relevant compiler optimizations, providing an important foundation for the development of sophisticated biological systems. PMID:21850228
Beal, Jacob; Lu, Ting; Weiss, Ron
2011-01-01
The field of synthetic biology promises to revolutionize our ability to engineer biological systems, providing important benefits for a variety of applications. Recent advances in DNA synthesis and automated DNA assembly technologies suggest that it is now possible to construct synthetic systems of significant complexity. However, while a variety of novel genetic devices and small engineered gene networks have been successfully demonstrated, the regulatory complexity of synthetic systems that have been reported recently has somewhat plateaued due to a variety of factors, including the complexity of biology itself and the lag in our ability to design and optimize sophisticated biological circuitry. To address the gap between DNA synthesis and circuit design capabilities, we present a platform that enables synthetic biologists to express desired behavior using a convenient high-level biologically-oriented programming language, Proto. The high level specification is compiled, using a regulatory motif based mechanism, to a gene network, optimized, and then converted to a computational simulation for numerical verification. Through several example programs we illustrate the automated process of biological system design with our platform, and show that our compiler optimizations can yield significant reductions in the number of genes (~ 50%) and latency of the optimized engineered gene networks. Our platform provides a convenient and accessible tool for the automated design of sophisticated synthetic biological systems, bridging an important gap between DNA synthesis and circuit design capabilities. Our platform is user-friendly and features biologically relevant compiler optimizations, providing an important foundation for the development of sophisticated biological systems.
Integrating Genetic and Functional Genomic Data to Elucidate Common Disease Tra
NASA Astrophysics Data System (ADS)
Schadt, Eric
2005-03-01
The reconstruction of genetic networks in mammalian systems is one of the primary goals in biological research, especially as such reconstructions relate to elucidating not only common, polygenic human diseases, but living systems more generally. Here I present a statistical procedure for inferring causal relationships between gene expression traits and more classic clinical traits, including complex disease traits. This procedure has been generalized to the gene network reconstruction problem, where naturally occurring genetic variations in segregating mouse populations are used as a source of perturbations to elucidate tissue-specific gene networks. Differences in the extent of genetic control between genders and among four different tissues are highlighted. I also demonstrate that the networks derived from expression data in segregating mouse populations using the novel network reconstruction algorithm are able to capture causal associations between genes that result in increased predictive power, compared to more classically reconstructed networks derived from the same data. This approach to causal inference in large segregating mouse populations over multiple tissues not only elucidates fundamental aspects of transcriptional control, it also allows for the objective identification of key drivers of common human diseases.
Disease-aging network reveals significant roles of aging genes in connecting genetic diseases.
Wang, Jiguang; Zhang, Shihua; Wang, Yong; Chen, Luonan; Zhang, Xiang-Sun
2009-09-01
One of the challenging problems in biology and medicine is exploring the underlying mechanisms of genetic diseases. Recent studies suggest that the relationship between genetic diseases and the aging process is important in understanding the molecular mechanisms of complex diseases. Although some intricate associations have been investigated for a long time, the studies are still in their early stages. In this paper, we construct a human disease-aging network to study the relationship among aging genes and genetic disease genes. Specifically, we integrate human protein-protein interactions (PPIs), disease-gene associations, aging-gene associations, and physiological system-based genetic disease classification information in a single graph-theoretic framework and find that (1) human disease genes are much closer to aging genes than expected by chance; and (2) diseases can be categorized into two types according to their relationships with aging. Type I diseases have their genes significantly close to aging genes, while type II diseases do not. Furthermore, we examine the topological characters of the disease-aging network from a systems perspective. Theoretical results reveal that the genes of type I diseases are in a central position of a PPI network while type II are not; (3) more importantly, we define an asymmetric closeness based on the PPI network to describe relationships between diseases, and find that aging genes make a significant contribution to associations among diseases, especially among type I diseases. In conclusion, the network-based study provides not only evidence for the intricate relationship between the aging process and genetic diseases, but also biological implications for prying into the nature of human diseases.
Yu, Hua; Jiao, Bingke; Lu, Lu; Wang, Pengfei; Chen, Shuangcheng; Liang, Chengzhi; Liu, Wei
2018-01-01
Accurately reconstructing gene co-expression network is of great importance for uncovering the genetic architecture underlying complex and various phenotypes. The recent availability of high-throughput RNA-seq sequencing has made genome-wide detecting and quantifying of the novel, rare and low-abundance transcripts practical. However, its potential merits in reconstructing gene co-expression network have still not been well explored. Using massive-scale RNA-seq samples, we have designed an ensemble pipeline, called NetMiner, for building genome-scale and high-quality Gene Co-expression Network (GCN) by integrating three frequently used inference algorithms. We constructed a RNA-seq-based GCN in one species of monocot rice. The quality of network obtained by our method was verified and evaluated by the curated gene functional association data sets, which obviously outperformed each single method. In addition, the powerful capability of network for associating genes with functions and agronomic traits was shown by enrichment analysis and case studies. In particular, we demonstrated the potential value of our proposed method to predict the biological roles of unknown protein-coding genes, long non-coding RNA (lncRNA) genes and circular RNA (circRNA) genes. Our results provided a valuable and highly reliable data source to select key candidate genes for subsequent experimental validation. To facilitate identification of novel genes regulating important biological processes and phenotypes in other plants or animals, we have published the source code of NetMiner, making it freely available at https://github.com/czllab/NetMiner.
The structure of a gene co-expression network reveals biological functions underlying eQTLs.
Villa-Vialaneix, Nathalie; Liaubet, Laurence; Laurent, Thibault; Cherel, Pierre; Gamot, Adrien; SanCristobal, Magali
2013-01-01
What are the commonalities between genes, whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations, and to infer knowledge for yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this lead to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology.
[Weighted gene co-expression network analysis in biomedicine research].
Liu, Wei; Li, Li; Ye, Hua; Tu, Wei
2017-11-25
High-throughput biological technologies are now widely applied in biology and medicine, allowing scientists to monitor thousands of parameters simultaneously in a specific sample. However, it is still an enormous challenge to mine useful information from high-throughput data. The emergence of network biology provides deeper insights into complex bio-system and reveals the modularity in tissue/cellular networks. Correlation networks are increasingly used in bioinformatics applications. Weighted gene co-expression network analysis (WGCNA) tool can detect clusters of highly correlated genes. Therefore, we systematically reviewed the application of WGCNA in the study of disease diagnosis, pathogenesis and other related fields. First, we introduced principle, workflow, advantages and disadvantages of WGCNA. Second, we presented the application of WGCNA in disease, physiology, drug, evolution and genome annotation. Then, we indicated the application of WGCNA in newly developed high-throughput methods. We hope this review will help to promote the application of WGCNA in biomedicine research.
Hwang, Sun-Goo; Kim, Dong Sub; Hwang, Jung Eun; Han, A-Reum; Jang, Cheol Seong
2014-05-15
In order to better understand the biological systems that are affected in response to cosmic ray (CR), we conducted weighted gene co-expression network analysis using the module detection method. By using the Pearson's correlation coefficient (PCC) value, we evaluated complex gene-gene functional interactions between 680 CR-responsive probes from integrated microarray data sets, which included large-scale transcriptional profiling of 1000 microarray samples. These probes were divided into 6 distinct modules that contained 20 enriched gene ontology (GO) functions, such as oxidoreductase activity, hydrolase activity, and response to stimulus and stress. In particular, modules 1 and 2 commonly showed enriched annotation categories such as oxidoreductase activity, including enriched cis-regulatory elements known as ROS-specific regulators. These results suggest that the ROS-mediated irradiation response pathway is affected by CR in modules 1 and 2. We found 243 ionizing radiation (IR)-responsive probes that exhibited similarities in expression patterns in various irradiation microarray data sets. The expression patterns of 6 randomly selected IR-responsive genes were evaluated by quantitative reverse transcription polymerase chain reaction following treatment with CR, gamma rays (GR), and ion beam (IB); similar patterns were observed among these genes under these 3 treatments. Moreover, we constructed subnetworks of IR-responsive genes and evaluated the expression levels of their neighboring genes following GR treatment; similar patterns were observed among them. These results of network-based analyses might provide a clue to understanding the complex biological system related to the CR response in plants. Copyright © 2014 Elsevier B.V. All rights reserved.
The Transcriptome of the Reference Potato Genome Solanum tuberosum Group Phureja Clone DM1-3 516R44
Massa, Alicia N.; Childs, Kevin L.; Lin, Haining; Bryan, Glenn J.; Giuliano, Giovanni; Buell, C. Robin
2011-01-01
Advances in molecular breeding in potato have been limited by its complex biological system, which includes vegetative propagation, autotetraploidy, and extreme heterozygosity. The availability of the potato genome and accompanying gene complement with corresponding gene structure, location, and functional annotation are powerful resources for understanding this complex plant and advancing molecular breeding efforts. Here, we report a reference for the potato transcriptome using 32 tissues and growth conditions from the doubled monoploid Solanum tuberosum Group Phureja clone DM1-3 516R44 for which a genome sequence is available. Analysis of greater than 550 million RNA-Seq reads permitted the detection and quantification of expression levels of over 22,000 genes. Hierarchical clustering and principal component analyses captured the biological variability that accounts for gene expression differences among tissues suggesting tissue-specific gene expression, and genes with tissue or condition restricted expression. Using gene co-expression network analysis, we identified 18 gene modules that represent tissue-specific transcriptional networks of major potato organs and developmental stages. This information provides a powerful resource for potato research as well as studies on other members of the Solanaceae family. PMID:22046362
2011-01-01
Background To make sense out of gene expression profiles, such analyses must be pushed beyond the mere listing of affected genes. For example, if a group of genes persistently display similar changes in expression levels under particular experimental conditions, and the proteins encoded by these genes interact and function in the same cellular compartments, this could be taken as very strong indicators for co-regulated protein complexes. One of the key requirements is having appropriate tools to detect such regulatory patterns. Results We have analyzed the global adaptations in gene expression patterns in the budding yeast when the Hsp90 molecular chaperone complex is perturbed either pharmacologically or genetically. We integrated these results with publicly accessible expression, protein-protein interaction and intracellular localization data. But most importantly, all experimental conditions were simultaneously and dynamically visualized with an animation. This critically facilitated the detection of patterns of gene expression changes that suggested underlying regulatory networks that a standard analysis by pairwise comparison and clustering could not have revealed. Conclusions The results of the animation-assisted detection of changes in gene regulatory patterns make predictions about the potential roles of Hsp90 and its co-chaperone p23 in regulating whole sets of genes. The simultaneous dynamic visualization of microarray experiments, represented in networks built by integrating one's own experimental with publicly accessible data, represents a powerful discovery tool that allows the generation of new interpretations and hypotheses. PMID:21672238
Bogenpohl, James W; Mignogna, Kristin M; Smith, Maren L; Miles, Michael F
2017-01-01
Complex behavioral traits, such as alcohol abuse, are caused by an interplay of genetic and environmental factors, producing deleterious functional adaptations in the central nervous system. The long-term behavioral consequences of such changes are of substantial cost to both the individual and society. Substantial progress has been made in the last two decades in understanding elements of brain mechanisms underlying responses to ethanol in animal models and risk factors for alcohol use disorder (AUD) in humans. However, treatments for AUD remain largely ineffective and few medications for this disease state have been licensed. Genome-wide genetic polymorphism analysis (GWAS) in humans, behavioral genetic studies in animal models and brain gene expression studies produced by microarrays or RNA-seq have the potential to produce nonbiased and novel insight into the underlying neurobiology of AUD. However, the complexity of such information, both statistical and informational, has slowed progress toward identifying new targets for intervention in AUD. This chapter describes one approach for integrating behavioral, genetic, and genomic information across animal model and human studies. The goal of this approach is to identify networks of genes functioning in the brain that are most relevant to the underlying mechanisms of a complex disease such as AUD. We illustrate an example of how genomic studies in animal models can be used to produce robust gene networks that have functional implications, and to integrate such animal model genomic data with human genetic studies such as GWAS for AUD. We describe several useful analysis tools for such studies: ComBAT, WGCNA, and EW_dmGWAS. The end result of this analysis is a ranking of gene networks and identification of their cognate hub genes, which might provide eventual targets for future therapeutic development. Furthermore, this combined approach may also improve our understanding of basic mechanisms underlying gene x environmental interactions affecting brain functioning in health and disease.
Bogenpohl, James W.; Mignogna, Kristin M.; Smith, Maren L.; Miles, Michael F.
2016-01-01
Complex behavioral traits, such as alcohol abuse, are caused by an interplay of genetic and environmental factors, producing deleterious functional adaptations in the central nervous system. The long-term behavioral consequences of such changes are of substantial cost to both the individual and society. Substantial progress has been made in the last two decades in understanding elements of brain mechanisms underlying responses to ethanol in animal models and risk factors for alcohol use disorder (AUD) in humans. However, treatments for AUD remain largely ineffective and few medications for this disease state have been licensed. Genome-wide genetic polymorphism analysis (GWAS) in humans, behavioral genetic studies in animal models and brain gene expression studies produced by microarrays or RNA-seq have the potential to produce non-biased and novel insight into the underlying neurobiology of AUD. However, the complexity of such information, both statistical and informational, has slowed progress toward identifying new targets for intervention in AUD. This chapter describes one approach for integrating behavioral, genetic, and genomic information across animal model and human studies. The goal of this approach is to identify networks of genes functioning in the brain that are most relevant to the underlying mechanisms of a complex disease such as AUD. We illustrate an example of how genomic studies in animal models can be used to produce robust gene networks that have functional implications, and to integrate such animal model genomic data with human genetic studies such as GWAS for AUD. We describe several useful analysis tools for such studies: ComBAT, WGCNA and EW_dmGWAS. The end result of this analysis is a ranking of gene networks and identification of their cognate hub genes, which might provide eventual targets for future therapeutic development. Furthermore, this combined approach may also improve our understanding of basic mechanisms underlying gene x environmental interactions affecting brain functioning in health and disease. PMID:27933543
NASA Astrophysics Data System (ADS)
di Bernardo, Diego
2016-07-01
The review by Martin et al. deals with a long standing problem at the interface of complex systems and molecular biology, that is the relationship between the topology of a complex network and its function. In biological terms the problem translates to relating the topology of gene regulatory networks (GRNs) to specific cellular functions. GRNs control the spatial and temporal activity of the genes encoded in the cell's genome by means of specialised proteins called Transcription Factors (TFs). A TF is able to recognise and bind specifically to a sequence (TF biding site) of variable length (order of magnitude of 10) found upstream of the sequence encoding one or more genes (at least in prokaryotes) and thus activating or repressing their transcription. TFs can thus be distinguished in activator and repressor. The picture can become more complex since some classes of TFs can form hetero-dimers consisting of a protein complex whose subunits are the individual TFs. Heterodimers can have completely different binding sites and activity compared to their individual parts. In this review the authors limit their attention to prokaryotes where the complexity of GRNs is somewhat reduced. Moreover they exploit a unique feature of living systems, i.e. evolution, to understand whether function can shape network topology. Indeed, prokaryotes such as bacteria are among the oldest living systems that have become perfectly adapted to their environment over geological scales and thus have reached an evolutionary steady-state where the fitness of the population has reached a plateau. By integrating in silico analysis and comparative evolution, the authors show that indeed function does tend to shape the structure of a GRN, however this trend is not always present and depends on the properties of the network being examined. Interestingly, the trend is more apparent for sparse networks, i.e. where the density of edges is very low. Sparsity is indeed one of the most prominent features of natural occurring GRNs, and more specifically GRNs have been found to approximate a power-law ;scale-free; degree distribution by Barabasi and Albert [2]. Why sparsity arises is still under debate, but Price in 1976 proposed a model [1], later renamed ;preferential attachment; by Barabasi and Albert [2], able to give rise to sparse scale-free networks. In this model, a network grows over time (such as GRN during evolution) by sequential addition of new nodes (caused by genome duplications) that attach with higher probability to nodes with higher degree. In this review, Martin et al. propose that sparsity could also be caused phenotypic constrains even in the absence of genome duplications, in order for the network to be robust against random mutations in the genome sequence, which in turn affect the specificity of TF binding sites. The authors also found that network motifs, i.e. subnetworks consisting of 3 or 4 nodes with a specific topology that are over-represented in the network, are also shaped by phenotypic constrains. Theoretical and computational approaches to understand the forces that shape network topology are of extreme interest in biology, although at this stage their impact has been limited. Neverteless, these approaches may soon have important practical applications. The era of synthetic biology is upon us, novel organisms with ;minimal genomes; are being built with the dual aim of simplifying engineering of new functions useful to humans and to understand which is the minimal set of genes needed to support life [3]. The first minimal organism has just been created [3] by randomly deleting genes and genomic regions until a minimal set supporting cell growth and replication was found. The GRN of this minimal organism has not been investigated yet, but it will be of limited complexity. What is the GRN structure in this organism? Will the cell phenotypes be robust to mutations? Is it possible to re-engineer the GRN in order to find an optimal structure that confers phenotypic robustness to the cell? All of these questions can be tackled only by understanding the guiding principles linking network topology to network function.
Kogelman, Lisette J A; Cirera, Susanna; Zhernakova, Daria V; Fredholm, Merete; Franke, Lude; Kadarmideen, Haja N
2014-09-30
Obesity is a complex metabolic condition in strong association with various diseases, like type 2 diabetes, resulting in major public health and economic implications. Obesity is the result of environmental and genetic factors and their interactions, including genome-wide genetic interactions. Identification of co-expressed and regulatory genes in RNA extracted from relevant tissues representing lean and obese individuals provides an entry point for the identification of genes and pathways of importance to the development of obesity. The pig, an omnivorous animal, is an excellent model for human obesity, offering the possibility to study in-depth organ-level transcriptomic regulations of obesity, unfeasible in humans. Our aim was to reveal adipose tissue co-expression networks, pathways and transcriptional regulations of obesity using RNA Sequencing based systems biology approaches in a porcine model. We selected 36 animals for RNA Sequencing from a previously created F2 pig population representing three extreme groups based on their predicted genetic risks for obesity. We applied Weighted Gene Co-expression Network Analysis (WGCNA) to detect clusters of highly co-expressed genes (modules). Additionally, regulator genes were detected using Lemon-Tree algorithms. WGCNA revealed five modules which were strongly correlated with at least one obesity-related phenotype (correlations ranging from -0.54 to 0.72, P < 0.001). Functional annotation identified pathways enlightening the association between obesity and other diseases, like osteoporosis (osteoclast differentiation, P = 1.4E-7), and immune-related complications (e.g. Natural killer cell mediated cytotoxity, P = 3.8E-5; B cell receptor signaling pathway, P = 7.2E-5). Lemon-Tree identified three potential regulator genes, using confident scores, for the WGCNA module which was associated with osteoclast differentiation: CCR1, MSR1 and SI1 (probability scores respectively 95.30, 62.28, and 34.58). Moreover, detection of differentially connected genes identified various genes previously identified to be associated with obesity in humans and rodents, e.g. CSF1R and MARC2. To our knowledge, this is the first study to apply systems biology approaches using porcine adipose tissue RNA-Sequencing data in a genetically characterized porcine model for obesity. We revealed complex networks, pathways, candidate and regulatory genes related to obesity, confirming the complexity of obesity and its association with immune-related disorders and osteoporosis.
Rough Set Soft Computing Cancer Classification and Network: One Stone, Two Birds
Zhang, Yue
2010-01-01
Gene expression profiling provides tremendous information to help unravel the complexity of cancer. The selection of the most informative genes from huge noise for cancer classification has taken centre stage, along with predicting the function of such identified genes and the construction of direct gene regulatory networks at different system levels with a tuneable parameter. A new study by Wang and Gotoh described a novel Variable Precision Rough Sets-rooted robust soft computing method to successfully address these problems and has yielded some new insights. The significance of this progress and its perspectives will be discussed in this article. PMID:20706619
The Network Organization of Cancer-associated Protein Complexes in Human Tissues
Zhao, Jing; Lee, Sang Hoon; Huss, Mikael; Holme, Petter
2013-01-01
Differential gene expression profiles for detecting disease genes have been studied intensively in systems biology. However, it is known that various biological functions achieved by proteins follow from the ability of the protein to form complexes by physically binding to each other. In other words, the functional units are often protein complexes rather than individual proteins. Thus, we seek to replace the perspective of disease-related genes by disease-related complexes, exemplifying with data on 39 human solid tissue cancers and their original normal tissues. To obtain the differential abundance levels of protein complexes, we apply an optimization algorithm to genome-wide differential expression data. From the differential abundance of complexes, we extract tissue- and cancer-selective complexes, and investigate their relevance to cancer. The method is supported by a clustering tendency of bipartite cancer-complex relationships, as well as a more concrete and realistic approach to disease-related proteomics. PMID:23567845
General and craniofacial development are complex adaptive processes influenced by diversity.
Brook, A H; O'Donnell, M Brook; Hone, A; Hart, E; Hughes, T E; Smith, R N; Townsend, G C
2014-06-01
Complex systems are present in such diverse areas as social systems, economies, ecosystems and biology and, therefore, are highly relevant to dental research, education and practice. A Complex Adaptive System in biological development is a dynamic process in which, from interacting components at a lower level, higher level phenomena and structures emerge. Diversity makes substantial contributions to the performance of complex adaptive systems. It enhances the robustness of the process, allowing multiple responses to external stimuli as well as internal changes. From diversity comes variation in outcome and the possibility of major change; outliers in the distribution enhance the tipping points. The development of the dentition is a valuable, accessible model with extensive and reliable databases for investigating the role of complex adaptive systems in craniofacial and general development. The general characteristics of such systems are seen during tooth development: self-organization; bottom-up emergence; multitasking; self-adaptation; variation; tipping points; critical phases; and robustness. Dental findings are compatible with the Random Network Model, the Threshold Model and also with the Scale Free Network Model which has a Power Law distribution. In addition, dental development shows the characteristics of Modularity and Clustering to form Hierarchical Networks. The interactions between the genes (nodes) demonstrate Small World phenomena, Subgraph Motifs and Gene Regulatory Networks. Genetic mechanisms are involved in the creation and evolution of variation during development. The genetic factors interact with epigenetic and environmental factors at the molecular level and form complex networks within the cells. From these interactions emerge the higher level tissues, tooth germs and mineralized teeth. Approaching development in this way allows investigation of why there can be variations in phenotypes from identical genotypes; the phenotype is the outcome of perturbations in the cellular systems and networks, as well as of the genotype. Understanding and applying complexity theory will bring about substantial advances not only in dental research and education but also in the organization and delivery of oral health care. © 2014 Australian Dental Association.
Integrated Genomic and Network-Based Analyses of Complex Diseases and Human Disease Network.
Al-Harazi, Olfat; Al Insaif, Sadiq; Al-Ajlan, Monirah A; Kaya, Namik; Dzimiri, Nduna; Colak, Dilek
2016-06-20
A disease phenotype generally reflects various pathobiological processes that interact in a complex network. The highly interconnected nature of the human protein interaction network (interactome) indicates that, at the molecular level, it is difficult to consider diseases as being independent of one another. Recently, genome-wide molecular measurements, data mining and bioinformatics approaches have provided the means to explore human diseases from a molecular basis. The exploration of diseases and a system of disease relationships based on the integration of genome-wide molecular data with the human interactome could offer a powerful perspective for understanding the molecular architecture of diseases. Recently, subnetwork markers have proven to be more robust and reliable than individual biomarker genes selected based on gene expression profiles alone, and achieve higher accuracy in disease classification. We have applied one of these methodologies to idiopathic dilated cardiomyopathy (IDCM) data that we have generated using a microarray and identified significant subnetworks associated with the disease. In this paper, we review the recent endeavours in this direction, and summarize the existing methodologies and computational tools for network-based analysis of complex diseases and molecular relationships among apparently different disorders and human disease network. We also discuss the future research trends and topics of this promising field. Copyright © 2015 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.
Evolution of SH2 domains and phosphotyrosine signalling networks
Liu, Bernard A.; Nash, Piers D.
2012-01-01
Src homology 2 (SH2) domains mediate selective protein–protein interactions with tyrosine phosphorylated proteins, and in doing so define specificity of phosphotyrosine (pTyr) signalling networks. SH2 domains and protein-tyrosine phosphatases expand alongside protein-tyrosine kinases (PTKs) to coordinate cellular and organismal complexity in the evolution of the unikont branch of the eukaryotes. Examination of conserved families of PTKs and SH2 domain proteins provides fiduciary marks that trace the evolutionary landscape for the development of complex cellular systems in the proto-metazoan and metazoan lineages. The evolutionary provenance of conserved SH2 and PTK families reveals the mechanisms by which diversity is achieved through adaptations in tissue-specific gene transcription, altered ligand binding, insertions of linear motifs and the gain or loss of domains following gene duplication. We discuss mechanisms by which pTyr-mediated signalling networks evolve through the development of novel and expanded families of SH2 domain proteins and the elaboration of connections between pTyr-signalling proteins. These changes underlie the variety of general and specific signalling networks that give rise to tissue-specific functions and increasingly complex developmental programmes. Examination of SH2 domains from an evolutionary perspective provides insight into the process by which evolutionary expansion and modification of molecular protein interaction domain proteins permits the development of novel protein-interaction networks and accommodates adaptation of signalling networks. PMID:22889907
Bioinformatics analysis on molecular mechanism of rheum officinale in treatment of jaundice
NASA Astrophysics Data System (ADS)
Shan, Si; Tu, Jun; Nie, Peng; Yan, Xiaojun
2017-01-01
Objective: To study the molecular mechanism of Rheum officinale in the treatment of Jaundice by building molecular networks and comparing canonical pathways. Methods: Target proteins of Rheum officinale and related genes of Jaundice were searched from Pubchem and Gene databases online respectively. Molecular networks and canonical pathways comparison analyses were performed by Ingenuity Pathway Analysis (IPA). Results: The molecular networks of Rheum officinale and Jaundice were complex and multifunctional. The 40 target proteins of Rheum officinale and 33 Homo sapiens genes of Jaundice were found in databases. There were 19 common pathways both related networks. Rheum officinale could regulate endothelial differentiation, Interleukin-1B (IL-1B) and Tumor Necrosis Factor (TNF) in these pathways. Conclusions: Rheum officinale treat Jaundice by regulating many effective nodes of Apoptotic pathway and cellular immunity related pathways.
INfORM: Inference of NetwOrk Response Modules.
Marwah, Veer Singh; Kinaret, Pia Anneli Sofia; Serra, Angela; Scala, Giovanni; Lauerma, Antti; Fortino, Vittorio; Greco, Dario
2018-06-15
Detecting and interpreting responsive modules from gene expression data by using network-based approaches is a common but laborious task. It often requires the application of several computational methods implemented in different software packages, forcing biologists to compile complex analytical pipelines. Here we introduce INfORM (Inference of NetwOrk Response Modules), an R shiny application that enables non-expert users to detect, evaluate and select gene modules with high statistical and biological significance. INfORM is a comprehensive tool for the identification of biologically meaningful response modules from consensus gene networks inferred by using multiple algorithms. It is accessible through an intuitive graphical user interface allowing for a level of abstraction from the computational steps. INfORM is freely available for academic use at https://github.com/Greco-Lab/INfORM. Supplementary data are available at Bioinformatics online.
Co-expression networks reveal the tissue-specific regulation of transcription and splicing.
Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D H; Jo, Brian; Gao, Chuan; McDowell, Ian C; Engelhardt, Barbara E; Battle, Alexis
2017-11-01
Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues. © 2017 Saha et al.; Published by Cold Spring Harbor Laboratory Press.
Computational gene network study on antibiotic resistance genes of Acinetobacter baumannii.
Anitha, P; Anbarasu, Anand; Ramaiah, Sudha
2014-05-01
Multi Drug Resistance (MDR) in Acinetobacter baumannii is one of the major threats for emerging nosocomial infections in hospital environment. Multidrug-resistance in A. baumannii may be due to the implementation of multi-combination resistance mechanisms such as β-lactamase synthesis, Penicillin-Binding Proteins (PBPs) changes, alteration in porin proteins and in efflux pumps against various existing classes of antibiotics. Multiple antibiotic resistance genes are involved in MDR. These resistance genes are transferred through plasmids, which are responsible for the dissemination of antibiotic resistance among Acinetobacter spp. In addition, these resistance genes may also have a tendency to interact with each other or with their gene products. Therefore, it becomes necessary to understand the impact of these interactions in antibiotic resistance mechanism. Hence, our study focuses on protein and gene network analysis on various resistance genes, to elucidate the role of the interacting proteins and to study their functional contribution towards antibiotic resistance. From the search tool for the retrieval of interacting gene/protein (STRING), a total of 168 functional partners for 15 resistance genes were extracted based on the confidence scoring system. The network study was then followed up with functional clustering of associated partners using molecular complex detection (MCODE). Later, we selected eight efficient clusters based on score. Interestingly, the associated protein we identified from the network possessed greater functional similarity with known resistance genes. This network-based approach on resistance genes of A. baumannii could help in identifying new genes/proteins and provide clues on their association in antibiotic resistance. Copyright © 2014 Elsevier Ltd. All rights reserved.
Dumas, Marc-Emmanuel; Domange, Céline; Calderari, Sophie; Martínez, Andrea Rodríguez; Ayala, Rafael; Wilder, Steven P; Suárez-Zamorano, Nicolas; Collins, Stephan C; Wallis, Robert H; Gu, Quan; Wang, Yulan; Hue, Christophe; Otto, Georg W; Argoud, Karène; Navratil, Vincent; Mitchell, Steve C; Lindon, John C; Holmes, Elaine; Cazier, Jean-Baptiste; Nicholson, Jeremy K; Gauguier, Dominique
2016-09-30
The genetic regulation of metabolic phenotypes (i.e., metabotypes) in type 2 diabetes mellitus occurs through complex organ-specific cellular mechanisms and networks contributing to impaired insulin secretion and insulin resistance. Genome-wide gene expression profiling systems can dissect the genetic contributions to metabolome and transcriptome regulations. The integrative analysis of multiple gene expression traits and metabolic phenotypes (i.e., metabotypes) together with their underlying genetic regulation remains a challenge. Here, we introduce a systems genetics approach based on the topological analysis of a combined molecular network made of genes and metabolites identified through expression and metabotype quantitative trait locus mapping (i.e., eQTL and mQTL) to prioritise biological characterisation of candidate genes and traits. We used systematic metabotyping by 1 H NMR spectroscopy and genome-wide gene expression in white adipose tissue to map molecular phenotypes to genomic blocks associated with obesity and insulin secretion in a series of rat congenic strains derived from spontaneously diabetic Goto-Kakizaki (GK) and normoglycemic Brown-Norway (BN) rats. We implemented a network biology strategy approach to visualize the shortest paths between metabolites and genes significantly associated with each genomic block. Despite strong genomic similarities (95-99 %) among congenics, each strain exhibited specific patterns of gene expression and metabotypes, reflecting the metabolic consequences of series of linked genetic polymorphisms in the congenic intervals. We subsequently used the congenic panel to map quantitative trait loci underlying specific mQTLs and genome-wide eQTLs. Variation in key metabolites like glucose, succinate, lactate, or 3-hydroxybutyrate and second messenger precursors like inositol was associated with several independent genomic intervals, indicating functional redundancy in these regions. To navigate through the complexity of these association networks we mapped candidate genes and metabolites onto metabolic pathways and implemented a shortest path strategy to highlight potential mechanistic links between metabolites and transcripts at colocalized mQTLs and eQTLs. Minimizing the shortest path length drove prioritization of biological validations by gene silencing. These results underline the importance of network-based integration of multilevel systems genetics datasets to improve understanding of the genetic architecture of metabotype and transcriptomic regulation and to characterize novel functional roles for genes determining tissue-specific metabolism.
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks
Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
2017-01-01
The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS- Bcl-2 and RAS-AKT, and found significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways. PMID:29049295
CORUM: the comprehensive resource of mammalian protein complexes
Ruepp, Andreas; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Stransky, Michael; Waegele, Brigitte; Schmidt, Thorsten; Doudieu, Octave Noubibou; Stümpflen, Volker; Mewes, H. Werner
2008-01-01
Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. The CORUM (http://mips.gsf.de/genre/proj/corum/index.html) database is a collection of experimentally verified mammalian protein complexes. Information is manually derived by critical reading of the scientific literature from expert annotators. Information about protein complexes includes protein complex names, subunits, literature references as well as the function of the complexes. For functional annotation, we use the FunCat catalogue that enables to organize the protein complex space into biologically meaningful subsets. The database contains more than 1750 protein complexes that are built from 2400 different genes, thus representing 12% of the protein-coding genes in human. A web-based system is available to query, view and download the data. CORUM provides a comprehensive dataset of protein complexes for discoveries in systems biology, analyses of protein networks and protein complex-associated diseases. Comparable to the MIPS reference dataset of protein complexes from yeast, CORUM intends to serve as a reference for mammalian protein complexes. PMID:17965090
Between “design” and “bricolage”: Genetic networks, levels of selection, and adaptive evolution
Wilkins, Adam S.
2007-01-01
The extent to which “developmental constraints” in complex organisms restrict evolutionary directions remains contentious. Yet, other forms of internal constraint, which have received less attention, may also exist. It will be argued here that a set of partial constraints below the level of phenotypes, those involving genes and molecules, influences and channels the set of possible evolutionary trajectories. At the top-most organizational level there are the genetic network modules, whose operations directly underlie complex morphological traits. The properties of these network modules, however, have themselves been set by the evolutionary history of the component genes and their interactions. Characterization of the components, structures, and operational dynamics of specific genetic networks should lead to a better understanding not only of the morphological traits they underlie but of the biases that influence the directions of evolutionary change. Furthermore, such knowledge may permit assessment of the relative degrees of probability of short evolutionary trajectories, those on the microevolutionary scale. In effect, a “network perspective” may help transform evolutionary biology into a scientific enterprise with greater predictive capability than it has hitherto possessed. PMID:17494754
Between "design" and "bricolage": genetic networks, levels of selection, and adaptive evolution.
Wilkins, Adam S
2007-05-15
The extent to which "developmental constraints" in complex organisms restrict evolutionary directions remains contentious. Yet, other forms of internal constraint, which have received less attention, may also exist. It will be argued here that a set of partial constraints below the level of phenotypes, those involving genes and molecules, influences and channels the set of possible evolutionary trajectories. At the top-most organizational level there are the genetic network modules, whose operations directly underlie complex morphological traits. The properties of these network modules, however, have themselves been set by the evolutionary history of the component genes and their interactions. Characterization of the components, structures, and operational dynamics of specific genetic networks should lead to a better understanding not only of the morphological traits they underlie but of the biases that influence the directions of evolutionary change. Furthermore, such knowledge may permit assessment of the relative degrees of probability of short evolutionary trajectories, those on the microevolutionary scale. In effect, a "network perspective" may help transform evolutionary biology into a scientific enterprise with greater predictive capability than it has hitherto possessed.
Jachiet, Pierre-Alain; Colson, Philippe; Lopez, Philippe; Bapteste, Eric
2014-08-07
Complex nongradual evolutionary processes such as gene remodeling are difficult to model, to visualize, and to investigate systematically. Despite these challenges, the creation of composite (or mosaic) genes by combination of genetic segments from unrelated gene families was established as an important adaptive phenomena in eukaryotic genomes. In contrast, almost no general studies have been conducted to quantify composite genes in viruses. Although viral genome mosaicism has been well-described, the extent of gene mosaicism and its rules of emergence remain largely unexplored. Applying methods from graph theory to inclusive similarity networks, and using data from more than 3,000 complete viral genomes, we provide the first demonstration that composite genes in viruses are 1) functionally biased, 2) involved in key aspects of the arm race between cells and viruses, and 3) can be classified into two distinct types of composite genes in all viral classes. Beyond the quantification of the widespread recombination of genes among different viruses of the same class, we also report a striking sharing of genetic information between viruses of different classes and with different nucleic acid types. This latter discovery provides novel evidence for the existence of a large and complex mobilome network, which appears partly bound by the sharing of genetic information and by the formation of composite genes between mobile entities with different genetic material. Considering that there are around 10E31 viruses on the planet, gene remodeling appears as a hugely significant way of generating and moving novel sequences between different kinds of organisms on Earth. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Dynamic modelling of microRNA regulation during mesenchymal stem cell differentiation.
Weber, Michael; Sotoca, Ana M; Kupfer, Peter; Guthke, Reinhard; van Zoelen, Everardus J
2013-11-12
Network inference from gene expression data is a typical approach to reconstruct gene regulatory networks. During chondrogenic differentiation of human mesenchymal stem cells (hMSCs), a complex transcriptional network is active and regulates the temporal differentiation progress. As modulators of transcriptional regulation, microRNAs (miRNAs) play a critical role in stem cell differentiation. Integrated network inference aimes at determining interrelations between miRNAs and mRNAs on the basis of expression data as well as miRNA target predictions. We applied the NetGenerator tool in order to infer an integrated gene regulatory network. Time series experiments were performed to measure mRNA and miRNA abundances of TGF-beta1+BMP2 stimulated hMSCs. Network nodes were identified by analysing temporal expression changes, miRNA target gene predictions, time series correlation and literature knowledge. Network inference was performed using NetGenerator to reconstruct a dynamical regulatory model based on the measured data and prior knowledge. The resulting model is robust against noise and shows an optimal trade-off between fitting precision and inclusion of prior knowledge. It predicts the influence of miRNAs on the expression of chondrogenic marker genes and therefore proposes novel regulatory relations in differentiation control. By analysing the inferred network, we identified a previously unknown regulatory effect of miR-524-5p on the expression of the transcription factor SOX9 and the chondrogenic marker genes COL2A1, ACAN and COL10A1. Genome-wide exploration of miRNA-mRNA regulatory relationships is a reasonable approach to identify miRNAs which have so far not been associated with the investigated differentiation process. The NetGenerator tool is able to identify valid gene regulatory networks on the basis of miRNA and mRNA time series data.
Toufighi, Kiana; Yang, Jae-Seong; Luis, Nuno Miguel; Aznar Benitah, Salvador; Lehner, Ben; Serrano, Luis; Kiel, Christina
2015-01-01
The molecular details underlying the time-dependent assembly of protein complexes in cellular networks, such as those that occur during differentiation, are largely unexplored. Focusing on the calcium-induced differentiation of primary human keratinocytes as a model system for a major cellular reorganization process, we look at the expression of genes whose products are involved in manually-annotated protein complexes. Clustering analyses revealed only moderate co-expression of functionally related proteins during differentiation. However, when we looked at protein complexes, we found that the majority (55%) are composed of non-dynamic and dynamic gene products (‘di-chromatic’), 19% are non-dynamic, and 26% only dynamic. Considering three-dimensional protein structures to predict steric interactions, we found that proteins encoded by dynamic genes frequently interact with a common non-dynamic protein in a mutually exclusive fashion. This suggests that during differentiation, complex assemblies may also change through variation in the abundance of proteins that compete for binding to common proteins as found in some cases for paralogous proteins. Considering the example of the TNF-α/NFκB signaling complex, we suggest that the same core complex can guide signals into diverse context-specific outputs by addition of time specific expressed subunits, while keeping other cellular functions constant. Thus, our analysis provides evidence that complex assembly with stable core components and competition could contribute to cell differentiation. PMID:25946651
Cooperation and coexpression: How coexpression networks shift in response to multiple mutualists.
Palakurty, Sathvik X; Stinchcombe, John R; Afkhami, Michelle E
2018-04-01
A mechanistic understanding of community ecology requires tackling the nonadditive effects of multispecies interactions, a challenge that necessitates integration of ecological and molecular complexity-namely moving beyond pairwise ecological interaction studies and the "gene at a time" approach to mechanism. Here, we investigate the consequences of multispecies mutualisms for the structure and function of genomewide differential coexpression networks for the first time, using the tractable and ecologically important interaction between legume Medicago truncatula, rhizobia and mycorrhizal fungi. First, we found that genes whose expression is affected nonadditively by multiple mutualists are more highly connected in gene networks than expected by chance and had 94% greater network centrality than genes showing additive effects, suggesting that nonadditive genes may be key players in the widespread transcriptomic responses to multispecies symbioses. Second, multispecies mutualisms substantially changed coexpression network structure of 18 modules of host plant genes and 22 modules of the fungal symbionts' genes, indicating that third-party mutualists can cause significant rewiring of plant and fungal molecular networks. Third, we found that 60% of the coexpressed gene sets that explained variation in plant performance had coexpression structures that were altered by interactive effects of rhizobia and fungi. Finally, an "across-symbiosis" approach identified sets of plant and mycorrhizal genes whose coexpression structure was unique to the multiple mutualist context and suggested coupled responses across the plant-mycorrhizal interaction to rhizobial mutualists. Taken together, these results show multispecies mutualisms have substantial effects on the molecular interactions in host plants, microbes and across symbiotic boundaries. © 2018 John Wiley & Sons Ltd.
Differential C3NET reveals disease networks of direct physical interactions
2011-01-01
Background Genes might have different gene interactions in different cell conditions, which might be mapped into different networks. Differential analysis of gene networks allows spotting condition-specific interactions that, for instance, form disease networks if the conditions are a disease, such as cancer, and normal. This could potentially allow developing better and subtly targeted drugs to cure cancer. Differential network analysis with direct physical gene interactions needs to be explored in this endeavour. Results C3NET is a recently introduced information theory based gene network inference algorithm that infers direct physical gene interactions from expression data, which was shown to give consistently higher inference performances over various networks than its competitors. In this paper, we present, DC3net, an approach to employ C3NET in inferring disease networks. We apply DC3net on a synthetic and real prostate cancer datasets, which show promising results. With loose cutoffs, we predicted 18583 interactions from tumor and normal samples in total. Although there are no reference interactions databases for the specific conditions of our samples in the literature, we found verifications for 54 of our predicted direct physical interactions from only four of the biological interaction databases. As an example, we predicted that RAD50 with TRF2 have prostate cancer specific interaction that turned out to be having validation from the literature. It is known that RAD50 complex associates with TRF2 in the S phase of cell cycle, which suggests that this predicted interaction may promote telomere maintenance in tumor cells in order to allow tumor cells to divide indefinitely. Our enrichment analysis suggests that the identified tumor specific gene interactions may be potentially important in driving the growth in prostate cancer. Additionally, we found that the highest connected subnetwork of our predicted tumor specific network is enriched for all proliferation genes, which further suggests that the genes in this network may serve in the process of oncogenesis. Conclusions Our approach reveals disease specific interactions. It may help to make experimental follow-up studies more cost and time efficient by prioritizing disease relevant parts of the global gene network. PMID:21777411
Feltus, F Alex; Ficklin, Stephen P; Gibson, Scott M; Smith, Melissa C
2013-06-05
In genomics, highly relevant gene interaction (co-expression) networks have been constructed by finding significant pair-wise correlations between genes in expression datasets. These networks are then mined to elucidate biological function at the polygenic level. In some cases networks may be constructed from input samples that measure gene expression under a variety of different conditions, such as for different genotypes, environments, disease states and tissues. When large sets of samples are obtained from public repositories it is often unmanageable to associate samples into condition-specific groups, and combining samples from various conditions has a negative effect on network size. A fixed significance threshold is often applied also limiting the size of the final network. Therefore, we propose pre-clustering of input expression samples to approximate condition-specific grouping of samples and individual network construction of each group as a means for dynamic significance thresholding. The net effect is increase sensitivity thus maximizing the total co-expression relationships in the final co-expression network compendium. A total of 86 Arabidopsis thaliana co-expression networks were constructed after k-means partitioning of 7,105 publicly available ATH1 Affymetrix microarray samples. We term each pre-sorted network a Gene Interaction Layer (GIL). Random Matrix Theory (RMT), an un-supervised thresholding method, was used to threshold each of the 86 networks independently, effectively providing a dynamic (non-global) threshold for the network. The overall gene count across all GILs reached 19,588 genes (94.7% measured gene coverage) and 558,022 unique co-expression relationships. In comparison, network construction without pre-sorting of input samples yielded only 3,297 genes (15.9%) and 129,134 relationships. in the global network. Here we show that pre-clustering of microarray samples helps approximate condition-specific networks and allows for dynamic thresholding using un-supervised methods. Because RMT ensures only highly significant interactions are kept, the GIL compendium consists of 558,022 unique high quality A. thaliana co-expression relationships across almost all of the measurable genes on the ATH1 array. For A. thaliana, these networks represent the largest compendium to date of significant gene co-expression relationships, and are a means to explore complex pathway, polygenic, and pleiotropic relationships for this focal model plant. The networks can be explored at sysbio.genome.clemson.edu. Finally, this method is applicable to any large expression profile collection for any organism and is best suited where a knowledge-independent network construction method is desired.
2013-01-01
Background In genomics, highly relevant gene interaction (co-expression) networks have been constructed by finding significant pair-wise correlations between genes in expression datasets. These networks are then mined to elucidate biological function at the polygenic level. In some cases networks may be constructed from input samples that measure gene expression under a variety of different conditions, such as for different genotypes, environments, disease states and tissues. When large sets of samples are obtained from public repositories it is often unmanageable to associate samples into condition-specific groups, and combining samples from various conditions has a negative effect on network size. A fixed significance threshold is often applied also limiting the size of the final network. Therefore, we propose pre-clustering of input expression samples to approximate condition-specific grouping of samples and individual network construction of each group as a means for dynamic significance thresholding. The net effect is increase sensitivity thus maximizing the total co-expression relationships in the final co-expression network compendium. Results A total of 86 Arabidopsis thaliana co-expression networks were constructed after k-means partitioning of 7,105 publicly available ATH1 Affymetrix microarray samples. We term each pre-sorted network a Gene Interaction Layer (GIL). Random Matrix Theory (RMT), an un-supervised thresholding method, was used to threshold each of the 86 networks independently, effectively providing a dynamic (non-global) threshold for the network. The overall gene count across all GILs reached 19,588 genes (94.7% measured gene coverage) and 558,022 unique co-expression relationships. In comparison, network construction without pre-sorting of input samples yielded only 3,297 genes (15.9%) and 129,134 relationships. in the global network. Conclusions Here we show that pre-clustering of microarray samples helps approximate condition-specific networks and allows for dynamic thresholding using un-supervised methods. Because RMT ensures only highly significant interactions are kept, the GIL compendium consists of 558,022 unique high quality A. thaliana co-expression relationships across almost all of the measurable genes on the ATH1 array. For A. thaliana, these networks represent the largest compendium to date of significant gene co-expression relationships, and are a means to explore complex pathway, polygenic, and pleiotropic relationships for this focal model plant. The networks can be explored at sysbio.genome.clemson.edu. Finally, this method is applicable to any large expression profile collection for any organism and is best suited where a knowledge-independent network construction method is desired. PMID:23738693
Stojanova, Daniela; Ceci, Michelangelo; Malerba, Donato; Dzeroski, Saso
2013-09-26
Ontologies and catalogs of gene functions, such as the Gene Ontology (GO) and MIPS-FUN, assume that functional classes are organized hierarchically, that is, general functions include more specific ones. This has recently motivated the development of several machine learning algorithms for gene function prediction that leverages on this hierarchical organization where instances may belong to multiple classes. In addition, it is possible to exploit relationships among examples, since it is plausible that related genes tend to share functional annotations. Although these relationships have been identified and extensively studied in the area of protein-protein interaction (PPI) networks, they have not received much attention in hierarchical and multi-class gene function prediction. Relations between genes introduce autocorrelation in functional annotations and violate the assumption that instances are independently and identically distributed (i.i.d.), which underlines most machine learning algorithms. Although the explicit consideration of these relations brings additional complexity to the learning process, we expect substantial benefits in predictive accuracy of learned classifiers. This article demonstrates the benefits (in terms of predictive accuracy) of considering autocorrelation in multi-class gene function prediction. We develop a tree-based algorithm for considering network autocorrelation in the setting of Hierarchical Multi-label Classification (HMC). We empirically evaluate the proposed algorithm, called NHMC (Network Hierarchical Multi-label Classification), on 12 yeast datasets using each of the MIPS-FUN and GO annotation schemes and exploiting 2 different PPI networks. The results clearly show that taking autocorrelation into account improves the predictive performance of the learned models for predicting gene function. Our newly developed method for HMC takes into account network information in the learning phase: When used for gene function prediction in the context of PPI networks, the explicit consideration of network autocorrelation increases the predictive performance of the learned models. Overall, we found that this holds for different gene features/ descriptions, functional annotation schemes, and PPI networks: Best results are achieved when the PPI network is dense and contains a large proportion of function-relevant interactions.
Farber, Charles R
2010-11-01
Bone mineral density (BMD) is influenced by a complex network of gene interactions; therefore, elucidating the relationships between genes and how those genes, in turn, influence BMD is critical for developing a comprehensive understanding of osteoporosis. To investigate the role of transcriptional networks in the regulation of BMD, we performed a weighted gene coexpression network analysis (WGCNA) using microarray expression data on monocytes from young individuals with low or high BMD. WGCNA groups genes into modules based on patterns of gene coexpression. and our analysis identified 11 gene modules. We observed that the overall expression of one module (referred to as module 9) was significantly higher in the low-BMD group (p = .03). Module 9 was highly enriched for genes belonging to the immune system-related gene ontology (GO) category "response to virus" (p = 7.6 × 10(-11)). Using publically available genome-wide association study data, we independently validated the importance of module 9 by demonstrating that highly connected module 9 hubs were more likely, relative to less highly connected genes, to be genetically associated with BMD. This study highlights the advantages of systems-level analyses to uncover coexpression modules associated with bone mass and suggests that particular monocyte expression patterns may mediate differences in BMD. © 2010 American Society for Bone and Mineral Research.
Comparison of De Novo Network Reverse Engineering Methods with Applications to Ecotoxicology
The DREAM competitions for network modeling comparisons have made several points clear: 1) incorporating knowledge beyond gene expression data may improve modeling (e.g., data from knock-out organisms), 2) most techniques do not perform better than random, and 3) more complex met...
Wei, Pi-Jing; Zhang, Di; Xia, Junfeng; Zheng, Chun-Hou
2016-12-23
Cancer is a complex disease which is characterized by the accumulation of genetic alterations during the patient's lifetime. With the development of the next-generation sequencing technology, multiple omics data, such as cancer genomic, epigenomic and transcriptomic data etc., can be measured from each individual. Correspondingly, one of the key challenges is to pinpoint functional driver mutations or pathways, which contributes to tumorigenesis, from millions of functional neutral passenger mutations. In this paper, in order to identify driver genes effectively, we applied a generalized additive model to mutation profiles to filter genes with long length and constructed a new gene-gene interaction network. Then we integrated the mutation data and expression data into the gene-gene interaction network. Lastly, greedy algorithm was used to prioritize candidate driver genes from the integrated data. We named the proposed method Length-Net-Driver (LNDriver). Experiments on three TCGA datasets, i.e., head and neck squamous cell carcinoma, kidney renal clear cell carcinoma and thyroid carcinoma, demonstrated that the proposed method was effective. Also, it can identify not only frequently mutated drivers, but also rare candidate driver genes.
Rossouw, Debra; Næs, Tormod; Bauer, Florian F
2008-01-01
Background 'Omics' tools provide novel opportunities for system-wide analysis of complex cellular functions. Secondary metabolism is an example of a complex network of biochemical pathways, which, although well mapped from a biochemical point of view, is not well understood with regards to its physiological roles and genetic and biochemical regulation. Many of the metabolites produced by this network such as higher alcohols and esters are significant aroma impact compounds in fermentation products, and different yeast strains are known to produce highly divergent aroma profiles. Here, we investigated whether we can predict the impact of specific genes of known or unknown function on this metabolic network by combining whole transcriptome and partial exo-metabolome analysis. Results For this purpose, the gene expression levels of five different industrial wine yeast strains that produce divergent aroma profiles were established at three different time points of alcoholic fermentation in synthetic wine must. A matrix of gene expression data was generated and integrated with the concentrations of volatile aroma compounds measured at the same time points. This relatively unbiased approach to the study of volatile aroma compounds enabled us to identify candidate genes for aroma profile modification. Five of these genes, namely YMR210W, BAT1, AAD10, AAD14 and ACS1 were selected for overexpression in commercial wine yeast, VIN13. Analysis of the data show a statistically significant correlation between the changes in the exo-metabome of the overexpressing strains and the changes that were predicted based on the unbiased alignment of transcriptomic and exo-metabolomic data. Conclusion The data suggest that a comparative transcriptomics and metabolomics approach can be used to identify the metabolic impacts of the expression of individual genes in complex systems, and the amenability of transcriptomic data to direct applications of biotechnological relevance. PMID:18990252
Characterizing mutation-expression network relationships in multiple cancers.
Ghazanfar, Shila; Yang, Jean Yee Hwa
2016-08-01
Data made available through large cancer consortia like The Cancer Genome Atlas make for a rich source of information to be studied across and between cancers. In recent years, network approaches have been applied to such data in uncovering the complex interrelationships between mutational and expression profiles, but lack direct testing for expression changes via mutation. In this pan-cancer study we analyze mutation and gene expression information in an integrative manner by considering the networks generated by testing for differences in expression in direct association with specific mutations. We relate our findings among the 19 cancers examined to identify commonalities and differences as well as their characteristics. Using somatic mutation and gene expression information across 19 cancers, we generated mutation-expression networks per cancer. On evaluation we found that our generated networks were significantly enriched for known cancer-related genes, such as skin cutaneous melanoma (p<0.01 using Network of Cancer Genes 4.0). Our framework identified that while different cancers contained commonly mutated genes, there was little concordance between associated gene expression changes among cancers. Comparison between cancers showed a greater overlap of network nodes for cancers with higher overall non-silent mutation load, compared to those with a lower overall non-silent mutation load. This study offers a framework that explores network information through co-analysis of somatic mutations and gene expression profiles. Our pan-cancer application of this approach suggests that while mutations are frequently common among cancer types, the impact they have on the surrounding networks via gene expression changes varies. Despite this finding, there are some cancers for which mutation-associated network behaviour appears to be similar: suggesting a potential framework for uncovering related cancers for which similar therapeutic strategies may be applicable. Our framework for understanding relationships among cancers has been integrated into an interactive R Shiny application, PAn Cancer Mutation Expression Networks (PACMEN), containing dynamic and static network visualization of the mutation-expression networks. PACMEN also features tools for further examination of network topology characteristics among cancers. Copyright © 2016 Elsevier Ltd. All rights reserved.
Guo, Xiaobo; Zhang, Ye; Hu, Wenhao; Tan, Haizhu; Wang, Xueqin
2014-01-01
Nonlinear dependence is general in regulation mechanism of gene regulatory networks (GRNs). It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measurement called the distance correlation (DC) has been shown powerful and computationally effective in nonlinear dependence for many situations. In this work, we incorporate the DC into inferring GRNs from the gene expression data without any underling distribution assumptions. We propose three DC-based GRNs inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI)-based algorithms by analyzing two simulated data: benchmark GRNs from the DREAM challenge and GRNs generated by SynTReN network generator, and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operator characteristic (ROC) curve and the precision-recall (PR) curve, our proposed algorithms significantly outperform the MI-based algorithms in GRNs inference.
Inferring Nonlinear Gene Regulatory Networks from Gene Expression Data Based on Distance Correlation
Guo, Xiaobo; Zhang, Ye; Hu, Wenhao; Tan, Haizhu; Wang, Xueqin
2014-01-01
Nonlinear dependence is general in regulation mechanism of gene regulatory networks (GRNs). It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measurement called the distance correlation (DC) has been shown powerful and computationally effective in nonlinear dependence for many situations. In this work, we incorporate the DC into inferring GRNs from the gene expression data without any underling distribution assumptions. We propose three DC-based GRNs inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI)-based algorithms by analyzing two simulated data: benchmark GRNs from the DREAM challenge and GRNs generated by SynTReN network generator, and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operator characteristic (ROC) curve and the precision-recall (PR) curve, our proposed algorithms significantly outperform the MI-based algorithms in GRNs inference. PMID:24551058
PLAU inferred from a correlation network is critical for suppressor function of regulatory T cells
He, Feng; Chen, Hairong; Probst-Kepper, Michael; Geffers, Robert; Eifes, Serge; del Sol, Antonio; Schughart, Klaus; Zeng, An-Ping; Balling, Rudi
2012-01-01
Human FOXP3+CD25+CD4+ regulatory T cells (Tregs) are essential to the maintenance of immune homeostasis. Several genes are known to be important for murine Tregs, but for human Tregs the genes and underlying molecular networks controlling the suppressor function still largely remain unclear. Here, we describe a strategy to identify the key genes directly from an undirected correlation network which we reconstruct from a very high time-resolution (HTR) transcriptome during the activation of human Tregs/CD4+ T-effector cells. We show that a predicted top-ranked new key gene PLAU (the plasminogen activator urokinase) is important for the suppressor function of both human and murine Tregs. Further analysis unveils that PLAU is particularly important for memory Tregs and that PLAU mediates Treg suppressor function via STAT5 and ERK signaling pathways. Our study demonstrates the potential for identifying novel key genes for complex dynamic biological processes using a network strategy based on HTR data, and reveals a critical role for PLAU in Treg suppressor function. PMID:23169000
Prediction of C. elegans Longevity Genes by Human and Worm Longevity Networks
de Magalhães, João Pedro; Ruvkun, Gary; Fraifeld, Vadim E.; Curran, Sean P.
2012-01-01
Intricate and interconnected pathways modulate longevity, but screens to identify the components of these pathways have not been saturating. Because biological processes are often executed by protein complexes and fine-tuned by regulatory factors, the first-order protein-protein interactors of known longevity genes are likely to participate in the regulation of longevity. Data-rich maps of protein interactions have been established for many cardinal organisms such as yeast, worms, and humans. We propose that these interaction maps could be mined for the identification of new putative regulators of longevity. For this purpose, we have constructed longevity networks in both humans and worms. We reasoned that the essential first-order interactors of known longevity-associated genes in these networks are more likely to have longevity phenotypes than randomly chosen genes. We have used C. elegans to determine whether post-developmental inactivation of these essential genes modulates lifespan. Our results suggest that the worm and human longevity networks are functionally relevant and possess a high predictive power for identifying new longevity regulators. PMID:23144747
Nim, Hieu T; Furtado, Milena B; Costa, Mauro W; Rosenthal, Nadia A; Kitano, Hiroaki; Boyd, Sarah E
2015-05-01
Existing de novo software platforms have largely overlooked a valuable resource, the expertise of the intended biologist users. Typical data representations such as long gene lists, or highly dense and overlapping transcription factor networks often hinder biologists from relating these results to their expertise. VISIONET, a streamlined visualisation tool built from experimental needs, enables biologists to transform large and dense overlapping transcription factor networks into sparse human-readable graphs via numerically filtering. The VISIONET interface allows users without a computing background to interactively explore and filter their data, and empowers them to apply their specialist knowledge on far more complex and substantial data sets than is currently possible. Applying VISIONET to the Tbx20-Gata4 transcription factor network led to the discovery and validation of Aldh1a2, an essential developmental gene associated with various important cardiac disorders, as a healthy adult cardiac fibroblast gene co-regulated by cardiogenic transcription factors Gata4 and Tbx20. We demonstrate with experimental validations the utility of VISIONET for expertise-driven gene discovery that opens new experimental directions that would not otherwise have been identified.
Identifying protein complexes based on brainstorming strategy.
Shen, Xianjun; Zhou, Jin; Yi, Li; Hu, Xiaohua; He, Tingting; Yang, Jincai
2016-11-01
Protein complexes comprising of interacting proteins in protein-protein interaction network (PPI network) play a central role in driving biological processes within cells. Recently, more and more swarm intelligence based algorithms to detect protein complexes have been emerging, which have become the research hotspot in proteomics field. In this paper, we propose a novel algorithm for identifying protein complexes based on brainstorming strategy (IPC-BSS), which is integrated into the main idea of swarm intelligence optimization and the improved K-means algorithm. Distance between the nodes in PPI network is defined by combining the network topology and gene ontology (GO) information. Inspired by human brainstorming process, IPC-BSS algorithm firstly selects the clustering center nodes, and then they are separately consolidated with the other nodes with short distance to form initial clusters. Finally, we put forward two ways of updating the initial clusters to search optimal results. Experimental results show that our IPC-BSS algorithm outperforms the other classic algorithms on yeast and human PPI networks, and it obtains many predicted protein complexes with biological significance. Copyright © 2016 Elsevier Inc. All rights reserved.
Memory functions reveal structural properties of gene regulatory networks
Perez-Carrasco, Ruben
2018-01-01
Gene regulatory networks (GRNs) control cellular function and decision making during tissue development and homeostasis. Mathematical tools based on dynamical systems theory are often used to model these networks, but the size and complexity of these models mean that their behaviour is not always intuitive and the underlying mechanisms can be difficult to decipher. For this reason, methods that simplify and aid exploration of complex networks are necessary. To this end we develop a broadly applicable form of the Zwanzig-Mori projection. By first converting a thermodynamic state ensemble model of gene regulation into mass action reactions we derive a general method that produces a set of time evolution equations for a subset of components of a network. The influence of the rest of the network, the bulk, is captured by memory functions that describe how the subnetwork reacts to its own past state via components in the bulk. These memory functions provide probes of near-steady state dynamics, revealing information not easily accessible otherwise. We illustrate the method on a simple cross-repressive transcriptional motif to show that memory functions not only simplify the analysis of the subnetwork but also have a natural interpretation. We then apply the approach to a GRN from the vertebrate neural tube, a well characterised developmental transcriptional network composed of four interacting transcription factors. The memory functions reveal the function of specific links within the neural tube network and identify features of the regulatory structure that specifically increase the robustness of the network to initial conditions. Taken together, the study provides evidence that Zwanzig-Mori projections offer powerful and effective tools for simplifying and exploring the behaviour of GRNs. PMID:29470492
Garcia, Nelson; Messing, Joachim
2017-01-01
The TEL2, TTI1, and TTI2 proteins are co-chaperones for heat shock protein 90 (HSP90) to regulate the protein folding and maturation of phosphatidylinositol 3-kinase-related kinases (PIKKs). Referred to as the TTT complex, the genes that encode them are highly conserved from man to maize. TTT complex and PIKK genes exist mostly as single copy genes in organisms where they have been characterized. Members of this interacting protein network in maize were identified and synteny analyses were performed to study their evolution. Similar to other species, there is only one copy of each of these genes in maize which was due to a loss of the duplicated copy created by ancient allotetraploidy. Moreover, the retained copies of the TTT complex and the PIKK genes tolerated extensive retrotransposon insertion in their introns that resulted in increased gene lengths and gene body methylation, without apparent effect in normal gene expression and function. The results raise an interesting question on whether the reversion to single copy was due to selection against deleterious unbalanced gene duplications between members of the complex as predicted by the gene balance hypothesis, or due to neutral loss of extra copies. Uneven alteration of dosage either by adding extra copies or modulating gene expression of complex members is being proposed as a means to investigate whether the data supports the gene balance hypothesis or not.
Networks of genetic loci and the scientific literature
NASA Astrophysics Data System (ADS)
Semeiks, J. R.; Grate, L. R.; Mian, I. S.
This work considers biological information graphs, networks in which nodes corre-spond to genetic loci (or "genes") and an (undirected) edge signifies that two genes are discussed in the same article(s) in the scientific literature ("documents"). Operations that utilize the topology of these graphs can assist researchers in the scientific discovery process. For example, a shortest path between two nodes defines an ordered series of genes and documents that can be used to explore the relationship(s) between genes of interest. This work (i) describes how topologies in which edges are likely to reflect genuine relationship(s) can be constructed from human-curated corpora of genes an-notated with documents (or vice versa), and (ii) illustrates the potential of biological information graphs in synthesizing knowledge in order to formulate new hypotheses and generate novel predictions for subsequent experimental study. In particular, the well-known LocusLink corpus is used to construct a biological information graph consisting of 10,297 nodes and 21,910 edges. The large-scale statistical properties of this gene-document network suggest that it is a new example of a power-law network. The segregation of genes on the basis of species and encoded protein molecular function indicate the presence of assortativity, the preference for nodes with similar attributes to be neighbors in a network. The practical utility of a gene-document network is illustrated by using measures such as shortest paths and centrality to analyze a subset of nodes corresponding to genes implicated in aging. Each release of a curated biomedical corpus defines a particular static graph. The topology of a gene-document network changes over time as curators add and/or remove nodes and/or edges. Such a dynamic, evolving corpus provides both the foundation for analyzing the growth and behavior of large complex networks and a substrate for examining trends in biological research.
Investigation of candidate genes for osteoarthritis based on gene expression profiles.
Dong, Shuanghai; Xia, Tian; Wang, Lei; Zhao, Qinghua; Tian, Jiwei
2016-12-01
To explore the mechanism of osteoarthritis (OA) and provide valid biological information for further investigation. Gene expression profile of GSE46750 was downloaded from Gene Expression Omnibus database. The Linear Models for Microarray Data (limma) package (Bioconductor project, http://www.bioconductor.org/packages/release/bioc/html/limma.html) was used to identify differentially expressed genes (DEGs) in inflamed OA samples. Gene Ontology function enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways enrichment analysis of DEGs were performed based on Database for Annotation, Visualization and Integrated Discovery data, and protein-protein interaction (PPI) network was constructed based on the Search Tool for the Retrieval of Interacting Genes/Proteins database. Regulatory network was screened based on Encyclopedia of DNA Elements. Molecular Complex Detection was used for sub-network screening. Two sub-networks with highest node degree were integrated with transcriptional regulatory network and KEGG functional enrichment analysis was processed for 2 modules. In total, 401 up- and 196 down-regulated DEGs were obtained. Up-regulated DEGs were involved in inflammatory response, while down-regulated DEGs were involved in cell cycle. PPI network with 2392 protein interactions was constructed. Moreover, 10 genes including Interleukin 6 (IL6) and Aurora B kinase (AURKB) were found to be outstanding in PPI network. There are 214 up- and 8 down-regulated transcription factor (TF)-target pairs in the TF regulatory network. Module 1 had TFs including SPI1, PRDM1, and FOS, while module 2 contained FOSL1. The nodes in module 1 were enriched in chemokine signaling pathway, while the nodes in module 2 were mainly enriched in cell cycle. The screened DEGs including IL6, AGT, and AURKB might be potential biomarkers for gene therapy for OA by being regulated by TFs such as FOS and SPI1, and participating in the cell cycle and cytokine-cytokine receptor interaction pathway. Copyright © 2016 Turkish Association of Orthopaedics and Traumatology. Production and hosting by Elsevier B.V. All rights reserved.
Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong
2016-01-01
Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher’s exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO’s usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher. PMID:26750448
Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong
2016-01-11
Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
Lähdesmäki, Harri; Hautaniemi, Sampsa; Shmulevich, Ilya; Yli-Harja, Olli
2006-01-01
A significant amount of attention has recently been focused on modeling of gene regulatory networks. Two frequently used large-scale modeling frameworks are Bayesian networks (BNs) and Boolean networks, the latter one being a special case of its recent stochastic extension, probabilistic Boolean networks (PBNs). PBN is a promising model class that generalizes the standard rule-based interactions of Boolean networks into the stochastic setting. Dynamic Bayesian networks (DBNs) is a general and versatile model class that is able to represent complex temporal stochastic processes and has also been proposed as a model for gene regulatory systems. In this paper, we concentrate on these two model classes and demonstrate that PBNs and a certain subclass of DBNs can represent the same joint probability distribution over their common variables. The major benefit of introducing the relationships between the models is that it opens up the possibility of applying the standard tools of DBNs to PBNs and vice versa. Hence, the standard learning tools of DBNs can be applied in the context of PBNs, and the inference methods give a natural way of handling the missing values in PBNs which are often present in gene expression measurements. Conversely, the tools for controlling the stationary behavior of the networks, tools for projecting networks onto sub-networks, and efficient learning schemes can be used for DBNs. In other words, the introduced relationships between the models extend the collection of analysis tools for both model classes. PMID:17415411
Reveal, A General Reverse Engineering Algorithm for Inference of Genetic Network Architectures
NASA Technical Reports Server (NTRS)
Liang, Shoudan; Fuhrman, Stefanie; Somogyi, Roland
1998-01-01
Given the immanent gene expression mapping covering whole genomes during development, health and disease, we seek computational methods to maximize functional inference from such large data sets. Is it possible, in principle, to completely infer a complex regulatory network architecture from input/output patterns of its variables? We investigated this possibility using binary models of genetic networks. Trajectories, or state transition tables of Boolean nets, resemble time series of gene expression. By systematically analyzing the mutual information between input states and output states, one is able to infer the sets of input elements controlling each element or gene in the network. This process is unequivocal and exact for complete state transition tables. We implemented this REVerse Engineering ALgorithm (REVEAL) in a C program, and found the problem to be tractable within the conditions tested so far. For n = 50 (elements) and k = 3 (inputs per element), the analysis of incomplete state transition tables (100 state transition pairs out of a possible 10(exp 15)) reliably produced the original rule and wiring sets. While this study is limited to synchronous Boolean networks, the algorithm is generalizable to include multi-state models, essentially allowing direct application to realistic biological data sets. The ability to adequately solve the inverse problem may enable in-depth analysis of complex dynamic systems in biology and other fields.
Yu, Fu-Dong; Yang, Shao-You; Li, Yuan-Yuan; Hu, Wei
2013-04-10
Malaria continues to be one of the most severe global infectious diseases, as a major threat to human health and economic development. Network-based biological analysis is a promising approach to uncover key genes and biological processes from a network viewpoint, which could not be recognized from individual gene-based signatures. We integrated gene co-expression profile with protein-protein interaction and transcriptional regulation information to construct a comprehensive gene co-expression network of Plasmodium falciparum. Based on this network, we identified 10 core modules by using ICE (Iterative Clique Enumeration) algorithm, which were essential for malaria parasite development in intraerythrocytic developmental cycle (IDC) stages. In each module, all genes were highly correlated probably due to co-regulation or formation of a protein complex. Some of these genes were recognized to be differentially coexpressed among three close-by IDC stages. The gene of prpf8 (PFD0265w) encoding pre-mRNA processing splicing factor 8 product was identified as DCGs (differentially co-expressed genes) among IDC stages, although this gene function was seldom reported in previous researches. Integrating the species-specific gene prediction and differential co-expression gene detection, we found some modules could perform species-specific functions according to some of genes in these modules were species-specific genes, like the module 10. Furthermore, in order to reveal the underlying mechanisms of the erythrocyte invasion by P. falciparum, Steiner Tree algorithm was employed to identify the invasion subnetwork from our gene co-expression network. The subnetwork-based analysis indicated that some important Plasmodium parasite specific genes could corporate with each other and be co-regulated during the parasite invasion process, which including a head-to-head gene pair of PfRH2a (PF13_0198) and PfRH2b (MAL13P1.176). This study based on gene co-expression network could shed new insights on the mechanisms of pathogenesis, even virulence and P. falciparum development. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.
Identification of gene networks underlying dystocia in dairy cattle
USDA-ARS?s Scientific Manuscript database
Dystocia is a trait with a high impact in the dairy industry. Among its risk factors are calf weight, gestation length, breed and conformation. Biological networks have been proposed to capture the genetic architecture of complex traits, where GWAS show limitations. The objective of this study was t...
USDA-ARS?s Scientific Manuscript database
Puberty is a complex physiological event by which animals mature into an adult capable of sexual reproduction. In order to enhance our understanding of the genes and regulatory pathways and networks involved in puberty, we characterized the transcriptome of five reproductive tissues (i.e., hypothal...
Shen, Xianjun; Yi, Li; Jiang, Xingpeng; He, Tingting; Yang, Jincai; Xie, Wei; Hu, Po; Hu, Xiaohua
2017-01-01
How to identify protein complex is an important and challenging task in proteomics. It would make great contribution to our knowledge of molecular mechanism in cell life activities. However, the inherent organization and dynamic characteristic of cell system have rarely been incorporated into the existing algorithms for detecting protein complexes because of the limitation of protein-protein interaction (PPI) data produced by high throughput techniques. The availability of time course gene expression profile enables us to uncover the dynamics of molecular networks and improve the detection of protein complexes. In order to achieve this goal, this paper proposes a novel algorithm DCA (Dynamic Core-Attachment). It detects protein-complex core comprising of continually expressed and highly connected proteins in dynamic PPI network, and then the protein complex is formed by including the attachments with high adhesion into the core. The integration of core-attachment feature into the dynamic PPI network is responsible for the superiority of our algorithm. DCA has been applied on two different yeast dynamic PPI networks and the experimental results show that it performs significantly better than the state-of-the-art techniques in terms of prediction accuracy, hF-measure and statistical significance in biology. In addition, the identified complexes with strong biological significance provide potential candidate complexes for biologists to validate.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mizrachi, Eshchar; Verbeke, Lieven; Christie, Nanette
As a consequence of their remarkable adaptability, fast growth, and superior wood properties, eucalypt tree plantations have emerged as key renewable feedstocks (over 20 million ha globally) for the production of pulp, paper, bioenergy, and other lignocellulosic products. However, most biomass properties such as growth, wood density, and wood chemistry are complex traits that are hard to improve in long-lived perennials. Systems genetics, a process of harnessing multiple levels of component trait information (e.g., transcript, protein, and metabolite variation) in populations that vary in complex traits, has proven effective for dissecting the genetics and biology of such traits. We havemore » applied a network-based data integration (NBDI) method for a systems-level analysis of genes, processes and pathways underlying biomass and bioenergy-related traits using a segregating Eucalyptus hybrid population. We show that the integrative approach can link biologically meaningful sets of genes to complex traits and at the same time reveal the molecular basis of trait variation. Gene sets identified for related woody biomass traits were found to share regulatory loci, cluster in network neighborhoods, and exhibit enrichment for molecular functions such as xylan metabolism and cell wall development. These findings offer a framework for identifying the molecular underpinnings of complex biomass and bioprocessing-related traits. Furthermore, a more thorough understanding of the molecular basis of plant biomass traits should provide additional opportunities for the establishment of a sustainable bio-based economy.« less
Mizrachi, Eshchar; Verbeke, Lieven; Christie, Nanette; ...
2017-01-17
As a consequence of their remarkable adaptability, fast growth, and superior wood properties, eucalypt tree plantations have emerged as key renewable feedstocks (over 20 million ha globally) for the production of pulp, paper, bioenergy, and other lignocellulosic products. However, most biomass properties such as growth, wood density, and wood chemistry are complex traits that are hard to improve in long-lived perennials. Systems genetics, a process of harnessing multiple levels of component trait information (e.g., transcript, protein, and metabolite variation) in populations that vary in complex traits, has proven effective for dissecting the genetics and biology of such traits. We havemore » applied a network-based data integration (NBDI) method for a systems-level analysis of genes, processes and pathways underlying biomass and bioenergy-related traits using a segregating Eucalyptus hybrid population. We show that the integrative approach can link biologically meaningful sets of genes to complex traits and at the same time reveal the molecular basis of trait variation. Gene sets identified for related woody biomass traits were found to share regulatory loci, cluster in network neighborhoods, and exhibit enrichment for molecular functions such as xylan metabolism and cell wall development. These findings offer a framework for identifying the molecular underpinnings of complex biomass and bioprocessing-related traits. Furthermore, a more thorough understanding of the molecular basis of plant biomass traits should provide additional opportunities for the establishment of a sustainable bio-based economy.« less
Mizrachi, Eshchar; Verbeke, Lieven; Christie, Nanette; Fierro, Ana C; Mansfield, Shawn D; Davis, Mark F; Gjersing, Erica; Tuskan, Gerald A; Van Montagu, Marc; Van de Peer, Yves; Marchal, Kathleen; Myburg, Alexander A
2017-01-31
As a consequence of their remarkable adaptability, fast growth, and superior wood properties, eucalypt tree plantations have emerged as key renewable feedstocks (over 20 million ha globally) for the production of pulp, paper, bioenergy, and other lignocellulosic products. However, most biomass properties such as growth, wood density, and wood chemistry are complex traits that are hard to improve in long-lived perennials. Systems genetics, a process of harnessing multiple levels of component trait information (e.g., transcript, protein, and metabolite variation) in populations that vary in complex traits, has proven effective for dissecting the genetics and biology of such traits. We have applied a network-based data integration (NBDI) method for a systems-level analysis of genes, processes and pathways underlying biomass and bioenergy-related traits using a segregating Eucalyptus hybrid population. We show that the integrative approach can link biologically meaningful sets of genes to complex traits and at the same time reveal the molecular basis of trait variation. Gene sets identified for related woody biomass traits were found to share regulatory loci, cluster in network neighborhoods, and exhibit enrichment for molecular functions such as xylan metabolism and cell wall development. These findings offer a framework for identifying the molecular underpinnings of complex biomass and bioprocessing-related traits. A more thorough understanding of the molecular basis of plant biomass traits should provide additional opportunities for the establishment of a sustainable bio-based economy.
Genome network medicine: innovation to overcome huge challenges in cancer therapy.
Roukos, Dimitrios H
2014-01-01
The post-ENCODE era shapes now a new biomedical research direction for understanding transcriptional and signaling networks driving gene expression and core cellular processes such as cell fate, survival, and apoptosis. Over the past half century, the Francis Crick 'central dogma' of single n gene/protein-phenotype (trait/disease) has defined biology, human physiology, disease, diagnostics, and drugs discovery. However, the ENCODE project and several other genomic studies using high-throughput sequencing technologies, computational strategies, and imaging techniques to visualize regulatory networks, provide evidence that transcriptional process and gene expression are regulated by highly complex dynamic molecular and signaling networks. This Focus article describes the linear experimentation-based limitations of diagnostics and therapeutics to cure advanced cancer and the need to move on from reductionist to network-based approaches. With evident a wide genomic heterogeneity, the power and challenges of next-generation sequencing (NGS) technologies to identify a patient's personal mutational landscape for tailoring the best target drugs in the individual patient are discussed. However, the available drugs are not capable of targeting aberrant signaling networks and research on functional transcriptional heterogeneity and functional genome organization is poorly understood. Therefore, the future clinical genome network medicine aiming at overcoming multiple problems in the new fields of regulatory DNA mapping, noncoding RNA, enhancer RNAs, and dynamic complexity of transcriptional circuitry are also discussed expecting in new innovation technology and strong appreciation of clinical data and evidence-based medicine. The problematic and potential solutions in the discovery of next-generation, molecular, and signaling circuitry-based biomarkers and drugs are explored. © 2013 Wiley Periodicals, Inc.
Paper-based Synthetic Gene Networks
Pardee, Keith; Green, Alexander A.; Ferrante, Tom; Cameron, D. Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J.
2014-01-01
Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides a new venue for synthetic biologists to operate, and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze-dried onto paper, enabling the inexpensive, sterile and abiotic distribution of synthetic biology-based technologies for the clinic, global health, industry, research and education. For field use, we create circuits with colorimetric outputs for detection by eye, and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors. PMID:25417167
Paper-based synthetic gene networks.
Pardee, Keith; Green, Alexander A; Ferrante, Tom; Cameron, D Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J
2014-11-06
Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides an alternate, versatile venue for synthetic biologists to operate and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze dried onto paper, enabling the inexpensive, sterile, and abiotic distribution of synthetic-biology-based technologies for the clinic, global health, industry, research, and education. For field use, we create circuits with colorimetric outputs for detection by eye and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small-molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors.
Abdeltawab, Nourtan F.; Aziz, Ramy K.; Kansal, Rita; Rowe, Sarah L.; Su, Yin; Gardner, Lidia; Brannen, Charity; Nooh, Mohammed M.; Attia, Ramy R.; Abdelsamed, Hossam A.; Taylor, William L.; Lu, Lu; Williams, Robert W.; Kotb, Malak
2008-01-01
Striking individual differences in severity of group A streptococcal (GAS) sepsis have been noted, even among patients infected with the same bacterial strain. We had provided evidence that HLA class II allelic variation contributes significantly to differences in systemic disease severity by modulating host responses to streptococcal superantigens. Inasmuch as the bacteria produce additional virulence factors that participate in the pathogenesis of this complex disease, we sought to identify additional gene networks modulating GAS sepsis. Accordingly, we applied a systems genetics approach using a panel of advanced recombinant inbred mice. By analyzing disease phenotypes in the context of mice genotypes we identified a highly significant quantitative trait locus (QTL) on Chromosome 2 between 22 and 34 Mb that strongly predicts disease severity, accounting for 25%–30% of variance. This QTL harbors several polymorphic genes known to regulate immune responses to bacterial infections. We evaluated candidate genes within this QTL using multiple parameters that included linkage, gene ontology, variation in gene expression, cocitation networks, and biological relevance, and identified interleukin1 alpha and prostaglandin E synthases pathways as key networks involved in modulating GAS sepsis severity. The association of GAS sepsis with multiple pathways underscores the complexity of traits modulating GAS sepsis and provides a powerful approach for analyzing interactive traits affecting outcomes of other infectious diseases. PMID:18421376
NASA Astrophysics Data System (ADS)
Li, Yuanyuan; Jin, Suoqin; Lei, Lei; Pan, Zishu; Zou, Xiufen
2015-03-01
The early diagnosis and investigation of the pathogenic mechanisms of complex diseases are the most challenging problems in the fields of biology and medicine. Network-based systems biology is an important technique for the study of complex diseases. The present study constructed dynamic protein-protein interaction (PPI) networks to identify dynamical network biomarkers (DNBs) and analyze the underlying mechanisms of complex diseases from a systems level. We developed a model-based framework for the construction of a series of time-sequenced networks by integrating high-throughput gene expression data into PPI data. By combining the dynamic networks and molecular modules, we identified significant DNBs for four complex diseases, including influenza caused by either H3N2 or H1N1, acute lung injury and type 2 diabetes mellitus, which can serve as warning signals for disease deterioration. Function and pathway analyses revealed that the identified DNBs were significantly enriched during key events in early disease development. Correlation and information flow analyses revealed that DNBs effectively discriminated between different disease processes and that dysfunctional regulation and disproportional information flow may contribute to the increased disease severity. This study provides a general paradigm for revealing the deterioration mechanisms of complex diseases and offers new insights into their early diagnoses.
Bag, Susmita; Ramaiah, Sudha; Anbarasu, Anand
2015-01-07
Network study on genes and proteins offers functional basics of the complexity of gene and protein, and its interacting partners. The gene fatty acid-binding protein 4 (fabp4) is found to be highly expressed in adipose tissue, and is one of the most abundant proteins in mature adipocytes. Our investigations on functional modules of fabp4 provide useful information on the functional genes interacting with fabp4, their biochemical properties and their regulatory functions. The present study shows that there are eight set of candidate genes: acp1, ext2, insr, lipe, ostf1, sncg, usp15, and vim that are strongly and functionally linked up with fabp4. Gene ontological analysis of network modules of fabp4 provides an explicit idea on the functional aspect of fabp4 and its interacting nodes. The hierarchal mapping on gene ontology indicates gene specific processes and functions as well as their compartmentalization in tissues. The fabp4 along with its interacting genes are involved in lipid metabolic activity and are integrated in multi-cellular processes of tissues and organs. They also have important protein/enzyme binding activity. Our study elucidated disease-associated nsSNP prediction for fabp4 and it is interesting to note that there are four rsID׳s (rs1051231, rs3204631, rs140925685 and rs141169989) with disease allelic variation (T104P, T126P, G27D and G90V respectively). On the whole, our gene network analysis presents a clear insight about the interactions and functions associated with fabp4 gene network. Copyright © 2014 Elsevier Ltd. All rights reserved.
Franke, Lude; Bakel, Harm van; Fokkens, Like; de Jong, Edwin D.; Egmont-Petersen, Michael; Wijmenga, Cisca
2006-01-01
Most common genetic disorders have a complex inheritance and may result from variants in many genes, each contributing only weak effects to the disease. Pinpointing these disease genes within the myriad of susceptibility loci identified in linkage studies is difficult because these loci may contain hundreds of genes. However, in any disorder, most of the disease genes will be involved in only a few different molecular pathways. If we know something about the relationships between the genes, we can assess whether some genes (which may reside in different loci) functionally interact with each other, indicating a joint basis for the disease etiology. There are various repositories of information on pathway relationships. To consolidate this information, we developed a functional human gene network that integrates information on genes and the functional relationships between genes, based on data from the Kyoto Encyclopedia of Genes and Genomes, the Biomolecular Interaction Network Database, Reactome, the Human Protein Reference Database, the Gene Ontology database, predicted protein-protein interactions, human yeast two-hybrid interactions, and microarray coexpressions. We applied this network to interrelate positional candidate genes from different disease loci and then tested 96 heritable disorders for which the Online Mendelian Inheritance in Man database reported at least three disease genes. Artificial susceptibility loci, each containing 100 genes, were constructed around each disease gene, and we used the network to rank these genes on the basis of their functional interactions. By following up the top five genes per artificial locus, we were able to detect at least one known disease gene in 54% of the loci studied, representing a 2.8-fold increase over random selection. This suggests that our method can significantly reduce the cost and effort of pinpointing true disease genes in analyses of disorders for which numerous loci have been reported but for which most of the genes are unknown. PMID:16685651
Jia, Peilin; Wang, Lily; Fanous, Ayman H.; Pato, Carlos N.; Edwards, Todd L.; Zhao, Zhongming
2012-01-01
With the recent success of genome-wide association studies (GWAS), a wealth of association data has been accomplished for more than 200 complex diseases/traits, proposing a strong demand for data integration and interpretation. A combinatory analysis of multiple GWAS datasets, or an integrative analysis of GWAS data and other high-throughput data, has been particularly promising. In this study, we proposed an integrative analysis framework of multiple GWAS datasets by overlaying association signals onto the protein-protein interaction network, and demonstrated it using schizophrenia datasets. Building on a dense module search algorithm, we first searched for significantly enriched subnetworks for schizophrenia in each single GWAS dataset and then implemented a discovery-evaluation strategy to identify module genes with consistent association signals. We validated the module genes in an independent dataset, and also examined them through meta-analysis of the related SNPs using multiple GWAS datasets. As a result, we identified 205 module genes with a joint effect significantly associated with schizophrenia; these module genes included a number of well-studied candidate genes such as DISC1, GNA12, GNA13, GNAI1, GPR17, and GRIN2B. Further functional analysis suggested these genes are involved in neuronal related processes. Additionally, meta-analysis found that 18 SNPs in 9 module genes had P meta<1×10−4, including the gene HLA-DQA1 located in the MHC region on chromosome 6, which was reported in previous studies using the largest cohort of schizophrenia patients to date. These results demonstrated our bi-directional network-based strategy is efficient for identifying disease-associated genes with modest signals in GWAS datasets. This approach can be applied to any other complex diseases/traits where multiple GWAS datasets are available. PMID:22792057
Guo, Wei-Feng; Zhang, Shao-Wu; Shi, Qian-Qian; Zhang, Cheng-Ming; Zeng, Tao; Chen, Luonan
2018-01-19
The advances in target control of complex networks not only can offer new insights into the general control dynamics of complex systems, but also be useful for the practical application in systems biology, such as discovering new therapeutic targets for disease intervention. In many cases, e.g. drug target identification in biological networks, we usually require a target control on a subset of nodes (i.e., disease-associated genes) with minimum cost, and we further expect that more driver nodes consistent with a certain well-selected network nodes (i.e., prior-known drug-target genes). Therefore, motivated by this fact, we pose and address a new and practical problem called as target control problem with objectives-guided optimization (TCO): how could we control the interested variables (or targets) of a system with the optional driver nodes by minimizing the total quantity of drivers and meantime maximizing the quantity of constrained nodes among those drivers. Here, we design an efficient algorithm (TCOA) to find the optional driver nodes for controlling targets in complex networks. We apply our TCOA to several real-world networks, and the results support that our TCOA can identify more precise driver nodes than the existing control-fucus approaches. Furthermore, we have applied TCOA to two bimolecular expert-curate networks. Source code for our TCOA is freely available from http://sysbio.sibcb.ac.cn/cb/chenlab/software.htm or https://github.com/WilfongGuo/guoweifeng . In the previous theoretical research for the full control, there exists an observation and conclusion that the driver nodes tend to be low-degree nodes. However, for target control the biological networks, we find interestingly that the driver nodes tend to be high-degree nodes, which is more consistent with the biological experimental observations. Furthermore, our results supply the novel insights into how we can efficiently target control a complex system, and especially many evidences on the practical strategic utility of TCOA to incorporate prior drug information into potential drug-target forecasts. Thus applicably, our method paves a novel and efficient way to identify the drug targets for leading the phenotype transitions of underlying biological networks.
Empirical Bayes conditional independence graphs for regulatory network recovery.
Mahdi, Rami; Madduri, Abishek S; Wang, Guoqing; Strulovici-Barel, Yael; Salit, Jacqueline; Hackett, Neil R; Crystal, Ronald G; Mezey, Jason G
2012-08-01
Computational inference methods that make use of graphical models to extract regulatory networks from gene expression data can have difficulty reconstructing dense regions of a network, a consequence of both computational complexity and unreliable parameter estimation when sample size is small. As a result, identification of hub genes is of special difficulty for these methods. We present a new algorithm, Empirical Light Mutual Min (ELMM), for large network reconstruction that has properties well suited for recovery of graphs with high-degree nodes. ELMM reconstructs the undirected graph of a regulatory network using empirical Bayes conditional independence testing with a heuristic relaxation of independence constraints in dense areas of the graph. This relaxation allows only one gene of a pair with a putative relation to be aware of the network connection, an approach that is aimed at easing multiple testing problems associated with recovering densely connected structures. Using in silico data, we show that ELMM has better performance than commonly used network inference algorithms including GeneNet, ARACNE, FOCI, GENIE3 and GLASSO. We also apply ELMM to reconstruct a network among 5492 genes expressed in human lung airway epithelium of healthy non-smokers, healthy smokers and individuals with chronic obstructive pulmonary disease assayed using microarrays. The analysis identifies dense sub-networks that are consistent with known regulatory relationships in the lung airway and also suggests novel hub regulatory relationships among a number of genes that play roles in oxidative stress and secretion. Software for running ELMM is made available at http://mezeylab.cb.bscb.cornell.edu/Software.aspx. ramimahdi@yahoo.com or jgm45@cornell.edu Supplementary data are available at Bioinformatics online.
Mills, James D; Iyer, Anand M; van Scheppingen, Jackelien; Bongaarts, Anika; Anink, Jasper J; Janssen, Bart; Zimmer, Till S; Spliet, Wim G; van Rijen, Peter C; Jansen, Floor E; Feucht, Martha; Hainfellner, Johannes A; Krsek, Pavel; Zamecnik, Josef; Kotulska, Katarzyna; Jozwiak, Sergiusz; Jansen, Anna; Lagae, Lieven; Curatolo, Paolo; Kwiatkowski, David J; Pasterkamp, R Jeroen; Senthilkumar, Ketharini; von Oerthel, Lars; Hoekman, Marco F; Gorter, Jan A; Crino, Peter B; Mühlebner, Angelika; Scicluna, Brendon P; Aronica, Eleonora
2017-08-14
Tuberous Sclerosis Complex (TSC) is a rare genetic disorder that results from a mutation in the TSC1 or TSC2 genes leading to constitutive activation of the mechanistic target of rapamycin complex 1 (mTORC1). TSC is associated with autism, intellectual disability and severe epilepsy. Cortical tubers are believed to represent the neuropathological substrates of these disabling manifestations in TSC. In the presented study we used high-throughput RNA sequencing in combination with systems-based computational approaches to investigate the complexity of the TSC molecular network. Overall we detected 438 differentially expressed genes and 991 differentially expressed small non-coding RNAs in cortical tubers compared to autopsy control brain tissue. We observed increased expression of genes associated with inflammatory, innate and adaptive immune responses. In contrast, we observed a down-regulation of genes associated with neurogenesis and glutamate receptor signaling. MicroRNAs represented the largest class of over-expressed small non-coding RNA species in tubers. In particular, our analysis revealed that the miR-34 family (including miR-34a, miR-34b and miR-34c) was significantly over-expressed. Functional studies demonstrated the ability of miR-34b to modulate neurite outgrowth in mouse primary hippocampal neuronal cultures. This study provides new insights into the TSC transcriptomic network along with the identification of potential new treatment targets.
Loohuis, Nikkie FM Olde; Kasri, Nael Nadif; Glennon, Jeffrey C; van Bokhoven, Hans; Hébert, Sébastien S; Kaplan, Barry B.; Martens, Gerard JM; Aschrafi, Armaz
2016-01-01
MicroRNAs (miRs) are small regulatory molecules, which orchestrate neuronal development and plasticity through modulation of complex gene networks. microRNA-137 (miR-137) is a brain-enriched RNA with a critical role in regulating brain development and in mediating synaptic plasticity. Importantly, mutations in this miR are associated with the pathoetiology of schizophrenia (SZ), and there is a widespread assumption that disruptions in miR-137 expression lead to aberrant expression of gene regulatory networks associated with SZ. To systematically identify the mRNA targets for this miR, we performed miR-137 gain- and loss-of-function experiments in primary rat hippocampal neurons and profiled differentially expressed mRNAs through next-generation sequencing. We identified 500 genes that were bidirectionally activated or repressed in their expression by the modulation of miR-137 levels. Gene ontology analysis using two independent software resources suggested functions for these miR-137-regulated genes in neurodevelopmental processes, neuronal maturation processes and cell maintenance, all of which known to be critical for proper brain circuitry formation. Since many of the putative miR-137 targets identified here also have been previously shown to be associated with SZ, we propose that this miR acts as a critical gene network hub contributing to the pathophysiology of this neurodevelopmental disorder. PMID:26925706
NASA Astrophysics Data System (ADS)
Bhajun, Ricky; Guyon, Laurent; Pitaval, Amandine; Sulpice, Eric; Combe, Stéphanie; Obeid, Patricia; Haguet, Vincent; Ghorbel, Itebeddine; Lajaunie, Christian; Gidrol, Xavier
2015-02-01
MiRNAs are key regulators of gene expression. By binding to many genes, they create a complex network of gene co-regulation. Here, using a network-based approach, we identified miRNA hub groups by their close connections and common targets. In one cluster containing three miRNAs, miR-612, miR-661 and miR-940, the annotated functions of the co-regulated genes suggested a role in small GTPase signalling. Although the three members of this cluster targeted the same subset of predicted genes, we showed that their overexpression impacted cell fates differently. miR-661 demonstrated enhanced phosphorylation of myosin II and an increase in cell invasion, indicating a possible oncogenic miRNA. On the contrary, miR-612 and miR-940 inhibit phosphorylation of myosin II and cell invasion. Finally, expression profiling in human breast tissues showed that miR-940 was consistently downregulated in breast cancer tissues
Pan, Weiran; Li, Gang; Yang, Xiaoxiao; Miao, Jinming
2015-04-01
This study aims to explore the potential mechanism of glioma through bioinformatic approaches. The gene expression profile (GSE4290) of glioma tumor and non-tumor samples was downloaded from Gene Expression Omnibus database. A total of 180 samples were available, including 23 non-tumor and 157 tumor samples. Then the raw data were preprocessed using robust multiarray analysis, and 8,890 differentially expressed genes (DEGs) were identified by using t-test (false discovery rate < 0.0005). Furthermore, 16 known glioma related genes were abstracted from Genetic Association Database. After mapping 8,890 DEGs and 16 known glioma related genes to Human Protein Reference Database, a glioma associated protein-protein interaction network (GAPN) was constructed. In addition, 51 sub-networks in GAPN were screened out through Molecular Complex Detection (score ≥ 1), and sub-network 1 was found to have the closest interaction (score = 3). What' more, for the top 10 sub-networks, Gene Ontology (GO) enrichment analysis (p value < 0.05) was performed, and DEGs involved in sub-network 1 and 2, such as BRMS1L and CCNA1, were predicted to regulate cell growth, cell cycle, and DNA replication via interacting with known glioma related genes. Finally, the overlaps of DEGs and human essential, housekeeping, tissue-specific genes were calculated (p value = 1.0, 1.0, and 0.00014, respectively) and visualized by Venn Diagram package in R. About 61% of human tissue-specific genes were DEGs as well. This research shed new light on the pathogenesis of glioma based on DEGs and GAPN, and our findings might provide potential targets for clinical glioma treatment.
Identification of hub subnetwork based on topological features of genes in breast cancer
ZHUANG, DA-YONG; JIANG, LI; HE, QING-QING; ZHOU, PENG; YUE, TAO
2015-01-01
The aim of this study was to provide functional insight into the identification of hub subnetworks by aggregating the behavior of genes connected in a protein-protein interaction (PPI) network. We applied a protein network-based approach to identify subnetworks which may provide new insight into the functions of pathways involved in breast cancer rather than individual genes. Five groups of breast cancer data were downloaded and analyzed from the Gene Expression Omnibus (GEO) database of high-throughput gene expression data to identify gene signatures using the genome-wide global significance (GWGS) method. A PPI network was constructed using Cytoscape and clusters that focused on highly connected nodes were obtained using the molecular complex detection (MCODE) clustering algorithm. Pathway analysis was performed to assess the functional relevance of selected gene signatures based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Topological centrality was used to characterize the biological importance of gene signatures, pathways and clusters. The results revealed that, cluster1, as well as the cell cycle and oocyte meiosis pathways were significant subnetworks in the analysis of degree and other centralities, in which hub nodes mostly distributed. The most important hub nodes, with top ranked centrality, were also similar with the common genes from the above three subnetwork intersections, which was viewed as a hub subnetwork with more reproducible than individual critical genes selected without network information. This hub subnetwork attributed to the same biological process which was essential in the function of cell growth and death. This increased the accuracy of identifying gene interactions that took place within the same functional process and was potentially useful for the development of biomarkers and networks for breast cancer. PMID:25573623
From Genes to Networks: Characterizing Gene-Regulatory Interactions in Plants.
Kaufmann, Kerstin; Chen, Dijun
2017-01-01
Plants, like other eukaryotes, have evolved complex mechanisms to coordinate gene expression during development, environmental response, and cellular homeostasis. Transcription factors (TFs), accompanied by basic cofactors and posttranscriptional regulators, are key players in gene-regulatory networks (GRNs). The coordinated control of gene activity is achieved by the interplay of these factors and by physical interactions between TFs and DNA. Here, we will briefly outline recent technological progress made to elucidate GRNs in plants. We will focus on techniques that allow us to characterize physical interactions in GRNs in plants and to analyze their regulatory consequences. Targeted manipulation allows us to test the relevance of specific gene-regulatory interactions. The combination of genome-wide experimental approaches with mathematical modeling allows us to get deeper insights into key-regulatory interactions and combinatorial control of important processes in plants.
The many faces of REST oversee epigenetic programming of neuronal genes.
Ballas, Nurit; Mandel, Gail
2005-10-01
Nervous system development relies on a complex signaling network to engineer the orderly transitions that lead to the acquisition of a neural cell fate. Progression from the non-neuronal pluripotent stem cell to a restricted neural lineage is characterized by distinct patterns of gene expression, particularly the restriction of neuronal gene expression to neurons. Concurrently, cells outside the nervous system acquire and maintain a non-neuronal fate that permanently excludes expression of neuronal genes. Studies of the transcriptional repressor REST, which regulates a large network of neuronal genes, provide a paradigm for elucidating the link between epigenetic mechanisms and neurogenesis. REST orchestrates a set of epigenetic modifications that are distinct between non-neuronal cells that give rise to neurons and those that are destined to remain as nervous system outsiders.
Deep conservation of cis-regulatory elements in metazoans
Maeso, Ignacio; Irimia, Manuel; Tena, Juan J.; Casares, Fernando; Gómez-Skarmeta, José Luis
2013-01-01
Despite the vast morphological variation observed across phyla, animals share multiple basic developmental processes orchestrated by a common ancestral gene toolkit. These genes interact with each other building complex gene regulatory networks (GRNs), which are encoded in the genome by cis-regulatory elements (CREs) that serve as computational units of the network. Although GRN subcircuits involved in ancient developmental processes are expected to be at least partially conserved, identification of CREs that are conserved across phyla has remained elusive. Here, we review recent studies that revealed such deeply conserved CREs do exist, discuss the difficulties associated with their identification and describe new approaches that will facilitate this search. PMID:24218633
Cooperative Adaptive Responses in Gene Regulatory Networks with Many Degrees of Freedom
Inoue, Masayo; Kaneko, Kunihiko
2013-01-01
Cells generally adapt to environmental changes by first exhibiting an immediate response and then gradually returning to their original state to achieve homeostasis. Although simple network motifs consisting of a few genes have been shown to exhibit such adaptive dynamics, they do not reflect the complexity of real cells, where the expression of a large number of genes activates or represses other genes, permitting adaptive behaviors. Here, we investigated the responses of gene regulatory networks containing many genes that have undergone numerical evolution to achieve high fitness due to the adaptive response of only a single target gene; this single target gene responds to changes in external inputs and later returns to basal levels. Despite setting a single target, most genes showed adaptive responses after evolution. Such adaptive dynamics were not due to common motifs within a few genes; even without such motifs, almost all genes showed adaptation, albeit sometimes partial adaptation, in the sense that expression levels did not always return to original levels. The genes split into two groups: genes in the first group exhibited an initial increase in expression and then returned to basal levels, while genes in the second group exhibited the opposite changes in expression. From this model, genes in the first group received positive input from other genes within the first group, but negative input from genes in the second group, and vice versa. Thus, the adaptation dynamics of genes from both groups were consolidated. This cooperative adaptive behavior was commonly observed if the number of genes involved was larger than the order of ten. These results have implications in the collective responses of gene expression networks in microarray measurements of yeast Saccharomyces cerevisiae and the significance to the biological homeostasis of systems with many components. PMID:23592959
Wang, Zhuo; Danziger, Samuel A; Heavner, Benjamin D; Ma, Shuyi; Smith, Jennifer J; Li, Song; Herricks, Thurston; Simeonidis, Evangelos; Baliga, Nitin S; Aitchison, John D; Price, Nathan D
2017-05-01
Gene regulatory and metabolic network models have been used successfully in many organisms, but inherent differences between them make networks difficult to integrate. Probabilistic Regulation Of Metabolism (PROM) provides a partial solution, but it does not incorporate network inference and underperforms in eukaryotes. We present an Integrated Deduced And Metabolism (IDREAM) method that combines statistically inferred Environment and Gene Regulatory Influence Network (EGRIN) models with the PROM framework to create enhanced metabolic-regulatory network models. We used IDREAM to predict phenotypes and genetic interactions between transcription factors and genes encoding metabolic activities in the eukaryote, Saccharomyces cerevisiae. IDREAM models contain many fewer interactions than PROM and yet produce significantly more accurate growth predictions. IDREAM consistently outperformed PROM using any of three popular yeast metabolic models and across three experimental growth conditions. Importantly, IDREAM's enhanced accuracy makes it possible to identify subtle synthetic growth defects. With experimental validation, these novel genetic interactions involving the pyruvate dehydrogenase complex suggested a new role for fatty acid-responsive factor Oaf1 in regulating acetyl-CoA production in glucose grown cells.
The Genome-Wide Interaction Network of Nutrient Stress Genes in Escherichia coli.
Côté, Jean-Philippe; French, Shawn; Gehrke, Sebastian S; MacNair, Craig R; Mangat, Chand S; Bharat, Amrita; Brown, Eric D
2016-11-22
Conventional efforts to describe essential genes in bacteria have typically emphasized nutrient-rich growth conditions. Of note, however, are the set of genes that become essential when bacteria are grown under nutrient stress. For example, more than 100 genes become indispensable when the model bacterium Escherichia coli is grown on nutrient-limited media, and many of these nutrient stress genes have also been shown to be important for the growth of various bacterial pathogens in vivo To better understand the genetic network that underpins nutrient stress in E. coli, we performed a genome-scale cross of strains harboring deletions in some 82 nutrient stress genes with the entire E. coli gene deletion collection (Keio) to create 315,400 double deletion mutants. An analysis of the growth of the resulting strains on rich microbiological media revealed an average of 23 synthetic sick or lethal genetic interactions for each nutrient stress gene, suggesting that the network defining nutrient stress is surprisingly complex. A vast majority of these interactions involved genes of unknown function or genes of unrelated pathways. The most profound synthetic lethal interactions were between nutrient acquisition and biosynthesis. Further, the interaction map reveals remarkable metabolic robustness in E. coli through pathway redundancies. In all, the genetic interaction network provides a powerful tool to mine and identify missing links in nutrient synthesis and to further characterize genes of unknown function in E. coli Moreover, understanding of bacterial growth under nutrient stress could aid in the development of novel antibiotic discovery platforms. With the rise of antibiotic drug resistance, there is an urgent need for new antibacterial drugs. Here, we studied a group of genes that are essential for the growth of Escherichia coli under nutrient limitation, culture conditions that arguably better represent nutrient availability during an infection than rich microbiological media. Indeed, many such nutrient stress genes are essential for infection in a variety of pathogens. Thus, the respective proteins represent a pool of potential new targets for antibacterial drugs that have been largely unexplored. We have created all possible double deletion mutants through a genetic cross of nutrient stress genes and the E. coli deletion collection. An analysis of the growth of the resulting clones on rich media revealed a robust, dense, and complex network for nutrient acquisition and biosynthesis. Importantly, our data reveal new genetic connections to guide innovative approaches for the development of new antibacterial compounds targeting bacteria under nutrient stress. Copyright © 2016 Côté et al.
Kilicoglu, Halil; Shin, Dongwook; Rindflesch, Thomas C.
2014-01-01
Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductile carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature. 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks. This methodology can make a significant contribution to understanding the complex interactions involved in cellular behavior and molecular physiology. PMID:24921649
Chen, Guocai; Cairelli, Michael J; Kilicoglu, Halil; Shin, Dongwook; Rindflesch, Thomas C
2014-06-01
Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductile carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature. 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks. This methodology can make a significant contribution to understanding the complex interactions involved in cellular behavior and molecular physiology.
Functional Module Analysis for Gene Coexpression Networks with Network Integration.
Zhang, Shuqin; Zhao, Hongyu; Ng, Michael K
2015-01-01
Network has been a general tool for studying the complex interactions between different genes, proteins, and other small molecules. Module as a fundamental property of many biological networks has been widely studied and many computational methods have been proposed to identify the modules in an individual network. However, in many cases, a single network is insufficient for module analysis due to the noise in the data or the tuning of parameters when building the biological network. The availability of a large amount of biological networks makes network integration study possible. By integrating such networks, more informative modules for some specific disease can be derived from the networks constructed from different tissues, and consistent factors for different diseases can be inferred. In this paper, we have developed an effective method for module identification from multiple networks under different conditions. The problem is formulated as an optimization model, which combines the module identification in each individual network and alignment of the modules from different networks together. An approximation algorithm based on eigenvector computation is proposed. Our method outperforms the existing methods, especially when the underlying modules in multiple networks are different in simulation studies. We also applied our method to two groups of gene coexpression networks for humans, which include one for three different cancers, and one for three tissues from the morbidly obese patients. We identified 13 modules with three complete subgraphs, and 11 modules with two complete subgraphs, respectively. The modules were validated through Gene Ontology enrichment and KEGG pathway enrichment analysis. We also showed that the main functions of most modules for the corresponding disease have been addressed by other researchers, which may provide the theoretical basis for further studying the modules experimentally.
Kim, Yongsoo; Kim, Taek-Kyun; Kim, Yungu; Yoo, Jiho; You, Sungyong; Lee, Inyoul; Carlson, George; Hood, Leroy; Choi, Seungjin; Hwang, Daehee
2011-01-01
Motivation: Systems biology attempts to describe complex systems behaviors in terms of dynamic operations of biological networks. However, there is lack of tools that can effectively decode complex network dynamics over multiple conditions. Results: We present principal network analysis (PNA) that can automatically capture major dynamic activation patterns over multiple conditions and then generate protein and metabolic subnetworks for the captured patterns. We first demonstrated the utility of this method by applying it to a synthetic dataset. The results showed that PNA correctly captured the subnetworks representing dynamics in the data. We further applied PNA to two time-course gene expression profiles collected from (i) MCF7 cells after treatments of HRG at multiple doses and (ii) brain samples of four strains of mice infected with two prion strains. The resulting subnetworks and their interactions revealed network dynamics associated with HRG dose-dependent regulation of cell proliferation and differentiation and early PrPSc accumulation during prion infection. Availability: The web-based software is available at: http://sbm.postech.ac.kr/pna. Contact: dhhwang@postech.ac.kr; seungjin@postech.ac.kr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21193522
On the origins of hierarchy in complex networks
Corominas-Murtra, Bernat; Goñi, Joaquín; Solé, Ricard V.; Rodríguez-Caso, Carlos
2013-01-01
Hierarchy seems to pervade complexity in both living and artificial systems. Despite its relevance, no general theory that captures all features of hierarchy and its origins has been proposed yet. Here we present a formal approach resulting from the convergence of theoretical morphology and network theory that allows constructing a 3D morphospace of hierarchies and hence comparing the hierarchical organization of ecological, cellular, technological, and social networks. Embedded within large voids in the morphospace of all possible hierarchies, four major groups are identified. Two of them match the expected from random networks with similar connectivity, thus suggesting that nonadaptive factors are at work. Ecological and gene networks define the other two, indicating that their topological order is the result of functional constraints. These results are consistent with an exploration of the morphospace, using in silico evolved networks. PMID:23898177
Wang, Haiying; Zheng, Huiru; Browne, Fiona; Roehe, Rainer; Dewhurst, Richard J; Engel, Felix; Hemmje, Matthias; Lu, Xiangwu; Walsh, Paul
2017-07-15
Methane is one of the major contributors to global warming. The rumen microbiota is directly involved in methane production in cattle. The link between variation in rumen microbial communities and host genetics has important applications and implications in bioscience. Having the potential to reveal the full extent of microbial gene diversity and complex microbial interactions, integrated metagenomics and network analysis holds great promise in this endeavour. This study investigates the rumen microbial community in cattle through the integration of metagenomic and network-based approaches. Based on the relative abundance of 1570 microbial genes identified in a metagenomics analysis, the co-abundance network was constructed and functional modules of microbial genes were identified. One of the main contributions is to develop a random matrix theory-based approach to automatically determining the correlation threshold used to construct the co-abundance network. The resulting network, consisting of 549 microbial genes and 3349 connections, exhibits a clear modular structure with certain trait-specific genes highly over-represented in modules. More specifically, all the 20 genes previously identified to be associated with methane emissions are found in a module (hypergeometric test, p<10 -11 ). One third of genes are involved in methane metabolism pathways. The further examination of abundance profiles across 8 samples of genes highlights that the revealed pattern of metagenomics abundance has a strong association with methane emissions. Furthermore, the module is significantly enriched with microbial genes encoding enzymes that are directly involved in methanogenesis (hypergeometric test, p<10 -9 ). Copyright © 2017 Elsevier Inc. All rights reserved.
Liu, Rong; Guo, Cheng-Xian; Zhou, Hong-Hao
2015-01-01
This study aims to identify effective gene networks and prognostic biomarkers associated with estrogen receptor positive (ER+) breast cancer using human mRNA studies. Weighted gene coexpression network analysis was performed with a complex ER+ breast cancer transcriptome to investigate the function of networks and key genes in the prognosis of breast cancer. We found a significant correlation of an expression module with distant metastasis-free survival (HR = 2.25; 95% CI .21.03-4.88 in discovery set; HR = 1.78; 95% CI = 1.07-2.93 in validation set). This module contained genes enriched in the biological process of the M phase. From this module, we further identified and validated 5 hub genes (CDK1, DLGAP5, MELK, NUSAP1, and RRM2), the expression levels of which were strongly associated with poor survival. Highly expressed MELK indicated poor survival in luminal A and luminal B breast cancer molecular subtypes. This gene was also found to be associated with tamoxifen resistance. Results indicated that a network-based approach may facilitate the discovery of biomarkers for the prognosis of ER+ breast cancer and may also be used as a basis for establishing personalized therapies. Nevertheless, before the application of this approach in clinical settings, in vivo and in vitro experiments and multi-center randomized controlled clinical trials are still needed.
Identifying critical transitions and their leading biomolecular networks in complex diseases.
Liu, Rui; Li, Meiyi; Liu, Zhi-Ping; Wu, Jiarui; Chen, Luonan; Aihara, Kazuyuki
2012-01-01
Identifying a critical transition and its leading biomolecular network during the initiation and progression of a complex disease is a challenging task, but holds the key to early diagnosis and further elucidation of the essential mechanisms of disease deterioration at the network level. In this study, we developed a novel computational method for identifying early-warning signals of the critical transition and its leading network during a disease progression, based on high-throughput data using a small number of samples. The leading network makes the first move from the normal state toward the disease state during a transition, and thus is causally related with disease-driving genes or networks. Specifically, we first define a state-transition-based local network entropy (SNE), and prove that SNE can serve as a general early-warning indicator of any imminent transitions, regardless of specific differences among systems. The effectiveness of this method was validated by functional analysis and experimental data.
Predicting disease-related proteins based on clique backbone in protein-protein interaction network.
Yang, Lei; Zhao, Xudong; Tang, Xianglong
2014-01-01
Network biology integrates different kinds of data, including physical or functional networks and disease gene sets, to interpret human disease. A clique (maximal complete subgraph) in a protein-protein interaction network is a topological module and possesses inherently biological significance. A disease-related clique possibly associates with complex diseases. Fully identifying disease components in a clique is conductive to uncovering disease mechanisms. This paper proposes an approach of predicting disease proteins based on cliques in a protein-protein interaction network. To tolerate false positive and negative interactions in protein networks, extending cliques and scoring predicted disease proteins with gene ontology terms are introduced to the clique-based method. Precisions of predicted disease proteins are verified by disease phenotypes and steadily keep to more than 95%. The predicted disease proteins associated with cliques can partly complement mapping between genotype and phenotype, and provide clues for understanding the pathogenesis of serious diseases.
Obayashi, Takeshi; Kinoshita, Kengo
2010-05-01
Gene coexpression analyses are a powerful method to predict the function of genes and/or to identify genes that are functionally related to query genes. The basic idea of gene coexpression analyses is that genes with similar functions should have similar expression patterns under many different conditions. This approach is now widely used by many experimental researchers, especially in the field of plant biology. In this review, we will summarize recent successful examples obtained by using our gene coexpression database, ATTED-II. Specifically, the examples will describe the identification of new genes, such as the subunits of a complex protein, the enzymes in a metabolic pathway and transporters. In addition, we will discuss the discovery of a new intercellular signaling factor and new regulatory relationships between transcription factors and their target genes. In ATTED-II, we provide two basic views of gene coexpression, a gene list view and a gene network view, which can be used as guide gene approach and narrow-down approach, respectively. In addition, we will discuss the coexpression effectiveness for various types of gene sets.
Xing, Li-Bo; Zhang, Dong; Li, You-Mei; Shen, Ya-Wen; Zhao, Cai-Ping; Ma, Juan-Juan; An, Na; Han, Ming-Yu
2015-10-01
Flower induction in apple (Malus domestica Borkh.) is regulated by complex gene networks that involve multiple signal pathways to ensure flower bud formation in the next year, but the molecular determinants of apple flower induction are still unknown. In this research, transcriptomic profiles from differentiating buds allowed us to identify genes potentially involved in signaling pathways that mediate the regulatory mechanisms of flower induction. A hypothetical model for this regulatory mechanism was obtained by analysis of the available transcriptomic data, suggesting that sugar-, hormone- and flowering-related genes, as well as those involved in cell-cycle induction, participated in the apple flower induction process. Sugar levels and metabolism-related gene expression profiles revealed that sucrose is the initiation signal in flower induction. Complex hormone regulatory networks involved in cytokinin (CK), abscisic acid (ABA) and gibberellic acid pathways also induce apple flower formation. CK plays a key role in the regulation of cell formation and differentiation, and in affecting flowering-related gene expression levels during these processes. Meanwhile, ABA levels and ABA-related gene expression levels gradually increased, as did those of sugar metabolism-related genes, in developing buds, indicating that ABA signals regulate apple flower induction by participating in the sugar-mediated flowering pathway. Furthermore, changes in sugar and starch deposition levels in buds can be affected by ABA content and the expression of the genes involved in the ABA signaling pathway. Thus, multiple pathways, which are mainly mediated by crosstalk between sugar and hormone signals, regulate the molecular network involved in bud growth and flower induction in apple trees. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Chen, Chengjie; Zhang, Yafeng; Xu, Zhiqiang; Luan, Aiping; Mao, Qi; Feng, Junting; Xie, Tao; Gong, Xue; Wang, Xiaoshuang; Chen, Hao; He, Yehua
2016-01-01
The pineapple (Ananas comosus) is cold sensitive. Most cultivars are injured during winter periods, especially in sub-tropical regions. There is a lack of molecular information on the pineapple’s response to cold stress. In this study, high-throughput transcriptome sequencing and gene expression analysis were performed on plantlets of a cold-tolerant genotype of the pineapple cultivar ‘Shenwan’ before and after cold treatment. A total of 1,186 candidate cold responsive genes were identified, and their credibility was confirmed by RT-qPCR. Gene set functional enrichment analysis indicated that genes related to cell wall properties, stomatal closure and ABA and ROS signal transduction play important roles in pineapple cold tolerance. In addition, a protein association network of CORs (cold responsive genes) was predicted, which could serve as an entry point to dissect the complex cold response network. Our study found a series of candidate genes and their association network, which will be helpful to cold stress response studies and pineapple breeding for cold tolerance. PMID:27656892
Hou, Chunyu; Wang, Fei; Liu, Xuewen; Chang, Guangming; Wang, Feng; Geng, Xin
2017-08-01
Telomerase reverse transcriptase (TERT) is the protein component of telomerase complex. Evidence has accumulated showing that the nontelomeric functions of TERT are independent of telomere elongation. However, the mechanisms governing the interaction between TERT and its target genes are not clearly revealed. The biological functions of TERT are not fully elucidated and have thus far been underestimated. To further explore these functions, we investigated TERT interaction networks using multiple bioinformatic databases, including BioGRID, STRING, DAVID, GeneCards, GeneMANIA, PANTHER, miRWalk, mirTarBase, miRNet, miRDB, and TargetScan. In addition, network diagrams were built using Cytoscape software. As competing endogenous RNAs (ceRNAs) are endogenous transcripts that compete for the binding of microRNAs (miRNAs) by using shared miRNA recognition elements, they are involved in creating widespread regulatory networks. Therefore, the ceRNA regulatory networks of TERT were also investigated in this study. Interestingly, we found that the three genes PABPC1, SLC7A11, and TP53 were present in both TERT interaction networks and ceRNAs target genes. It was predicted that TERT might play nontelomeric roles in the generation or development of some rare diseases, such as Rift Valley fever and dyscalculia. Thus, our data will help to decipher the interaction networks of TERT and reveal the unknown functions of telomerase in cancer and aging-related diseases.
TRACING CO-REGULATORY NETWORK DYNAMICS IN NOISY, SINGLE-CELL TRANSCRIPTOME TRAJECTORIES.
Cordero, Pablo; Stuart, Joshua M
2017-01-01
The availability of gene expression data at the single cell level makes it possible to probe the molecular underpinnings of complex biological processes such as differentiation and oncogenesis. Promising new methods have emerged for reconstructing a progression 'trajectory' from static single-cell transcriptome measurements. However, it remains unclear how to adequately model the appreciable level of noise in these data to elucidate gene regulatory network rewiring. Here, we present a framework called Single Cell Inference of MorphIng Trajectories and their Associated Regulation (SCIMITAR) that infers progressions from static single-cell transcriptomes by employing a continuous parametrization of Gaussian mixtures in high-dimensional curves. SCIMITAR yields rich models from the data that highlight genes with expression and co-expression patterns that are associated with the inferred progression. Further, SCIMITAR extracts regulatory states from the implicated trajectory-evolvingco-expression networks. We benchmark the method on simulated data to show that it yields accurate cell ordering and gene network inferences. Applied to the interpretation of a single-cell human fetal neuron dataset, SCIMITAR finds progression-associated genes in cornerstone neural differentiation pathways missed by standard differential expression tests. Finally, by leveraging the rewiring of gene-gene co-expression relations across the progression, the method reveals the rise and fall of co-regulatory states and trajectory-dependent gene modules. These analyses implicate new transcription factors in neural differentiation including putative co-factors for the multi-functional NFAT pathway.
Following the Footsteps of Chlamydial Gene Regulation
Domman, D.; Horn, M.
2015-01-01
Regulation of gene expression ensures an organism responds to stimuli and undergoes proper development. Although the regulatory networks in bacteria have been investigated in model microorganisms, nearly nothing is known about the evolution and plasticity of these networks in obligate, intracellular bacteria. The phylum Chlamydiae contains a vast array of host-associated microbes, including several human pathogens. The Chlamydiae are unique among obligate, intracellular bacteria as they undergo a complex biphasic developmental cycle in which large swaths of genes are temporally regulated. Coupled with the low number of transcription factors, these organisms offer a model to study the evolution of regulatory networks in intracellular organisms. We provide the first comprehensive analysis exploring the diversity and evolution of regulatory networks across the phylum. We utilized a comparative genomics approach to construct predicted coregulatory networks, which unveiled genus- and family-specific regulatory motifs and architectures, most notably those of virulence-associated genes. Surprisingly, our analysis suggests that few regulatory components are conserved across the phylum, and those that are conserved are involved in the exploitation of the intracellular niche. Our study thus lends insight into a component of chlamydial evolution that has otherwise remained largely unexplored. PMID:26424812
Genomic Methods for Clinical and Translational Pain Research
Wang, Dan; Kim, Hyungsuk; Wang, Xiao-Min; Dionne, Raymond
2012-01-01
Pain is a complex sensory experience for which the molecular mechanisms are yet to be fully elucidated. Individual differences in pain sensitivity are mediated by a complex network of multiple gene polymorphisms, physiological and psychological processes, and environmental factors. Here, we present the methods for applying unbiased molecular-genetic approaches, genome-wide association study (GWAS), and global gene expression analysis, to help better understand the molecular basis of pain sensitivity in humans and variable responses to analgesic drugs. PMID:22351080
Liu, Shiwei; Liu, Yihui; Zhao, Jiawei; Cai, Shitao; Qian, Hongmei; Zuo, Kaijing; Zhao, Lingxia; Zhang, Lida
2017-04-01
Rice (Oryza sativa) is one of the most important staple foods for more than half of the global population. Many rice traits are quantitative, complex and controlled by multiple interacting genes. Thus, a full understanding of genetic relationships will be critical to systematically identify genes controlling agronomic traits. We developed a genome-wide rice protein-protein interaction network (RicePPINet, http://netbio.sjtu.edu.cn/riceppinet) using machine learning with structural relationship and functional information. RicePPINet contained 708 819 predicted interactions for 16 895 non-transposable element related proteins. The power of the network for discovering novel protein interactions was demonstrated through comparison with other publicly available protein-protein interaction (PPI) prediction methods, and by experimentally determined PPI data sets. Furthermore, global analysis of domain-mediated interactions revealed RicePPINet accurately reflects PPIs at the domain level. Our studies showed the efficiency of the RicePPINet-based method in prioritizing candidate genes involved in complex agronomic traits, such as disease resistance and drought tolerance, was approximately 2-11 times better than random prediction. RicePPINet provides an expanded landscape of computational interactome for the genetic dissection of agronomically important traits in rice. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.
Multi-omics approach identifies molecular mechanisms of plant-fungus mycorrhizal interaction
Larsen, Peter E.; Sreedasyam, Avinash; Trivedi, Geetika; ...
2016-01-19
In mycorrhizal symbiosis, plant roots form close, mutually beneficial interactions with soil fungi. Before this mycorrhizal interaction can be established however, plant roots must be capable of detecting potential beneficial fungal partners and initiating the gene expression patterns necessary to begin symbiosis. To predict a plant root – mycorrhizal fungi sensor systems, we analyzed in vitro experiments of Populus tremuloides (aspen tree) and Laccaria bicolor (mycorrhizal fungi) interaction and leveraged over 200 previously published transcriptomic experimental data sets, 159 experimentally validated plant transcription factor binding motifs, and more than 120-thousand experimentally validated protein-protein interactions to generate models of pre-mycorrhizal sensormore » systems in aspen root. These sensor mechanisms link extracellular signaling molecules with gene regulation through a network comprised of membrane receptors, signal cascade proteins, transcription factors, and transcription factor biding DNA motifs. Modeling predicted four pre-mycorrhizal sensor complexes in aspen that interact with fifteen transcription factors to regulate the expression of 1184 genes in response to extracellular signals synthesized by Laccaria. Predicted extracellular signaling molecules include common signaling molecules such as phenylpropanoids, salicylate, and, jasmonic acid. Lastly, this multi-omic computational modeling approach for predicting the complex sensory networks yielded specific, testable biological hypotheses for mycorrhizal interaction signaling compounds, sensor complexes, and mechanisms of gene regulation.« less
Multi-omics approach identifies molecular mechanisms of plant-fungus mycorrhizal interaction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, Peter E.; Sreedasyam, Avinash; Trivedi, Geetika
In mycorrhizal symbiosis, plant roots form close, mutually beneficial interactions with soil fungi. Before this mycorrhizal interaction can be established however, plant roots must be capable of detecting potential beneficial fungal partners and initiating the gene expression patterns necessary to begin symbiosis. To predict a plant root – mycorrhizal fungi sensor systems, we analyzed in vitro experiments of Populus tremuloides (aspen tree) and Laccaria bicolor (mycorrhizal fungi) interaction and leveraged over 200 previously published transcriptomic experimental data sets, 159 experimentally validated plant transcription factor binding motifs, and more than 120-thousand experimentally validated protein-protein interactions to generate models of pre-mycorrhizal sensormore » systems in aspen root. These sensor mechanisms link extracellular signaling molecules with gene regulation through a network comprised of membrane receptors, signal cascade proteins, transcription factors, and transcription factor biding DNA motifs. Modeling predicted four pre-mycorrhizal sensor complexes in aspen that interact with fifteen transcription factors to regulate the expression of 1184 genes in response to extracellular signals synthesized by Laccaria. Predicted extracellular signaling molecules include common signaling molecules such as phenylpropanoids, salicylate, and, jasmonic acid. Lastly, this multi-omic computational modeling approach for predicting the complex sensory networks yielded specific, testable biological hypotheses for mycorrhizal interaction signaling compounds, sensor complexes, and mechanisms of gene regulation.« less
The computational core and fixed point organization in Boolean networks
NASA Astrophysics Data System (ADS)
Correale, L.; Leone, M.; Pagnani, A.; Weigt, M.; Zecchina, R.
2006-03-01
In this paper, we analyse large random Boolean networks in terms of a constraint satisfaction problem. We first develop an algorithmic scheme which allows us to prune simple logical cascades and underdetermined variables, returning thereby the computational core of the network. Second, we apply the cavity method to analyse the number and organization of fixed points. We find in particular a phase transition between an easy and a complex regulatory phase, the latter being characterized by the existence of an exponential number of macroscopically separated fixed point clusters. The different techniques developed are reinterpreted as algorithms for the analysis of single Boolean networks, and they are applied in the analysis of and in silico experiments on the gene regulatory networks of baker's yeast (Saccharomyces cerevisiae) and the segment-polarity genes of the fruitfly Drosophila melanogaster.
Combinatorial explosion in model gene networks
NASA Astrophysics Data System (ADS)
Edwards, R.; Glass, L.
2000-09-01
The explosive growth in knowledge of the genome of humans and other organisms leaves open the question of how the functioning of genes in interacting networks is coordinated for orderly activity. One approach to this problem is to study mathematical properties of abstract network models that capture the logical structures of gene networks. The principal issue is to understand how particular patterns of activity can result from particular network structures, and what types of behavior are possible. We study idealized models in which the logical structure of the network is explicitly represented by Boolean functions that can be represented by directed graphs on n-cubes, but which are continuous in time and described by differential equations, rather than being updated synchronously via a discrete clock. The equations are piecewise linear, which allows significant analysis and facilitates rapid integration along trajectories. We first give a combinatorial solution to the question of how many distinct logical structures exist for n-dimensional networks, showing that the number increases very rapidly with n. We then outline analytic methods that can be used to establish the existence, stability and periods of periodic orbits corresponding to particular cycles on the n-cube. We use these methods to confirm the existence of limit cycles discovered in a sample of a million randomly generated structures of networks of 4 genes. Even with only 4 genes, at least several hundred different patterns of stable periodic behavior are possible, many of them surprisingly complex. We discuss ways of further classifying these periodic behaviors, showing that small mutations (reversal of one or a few edges on the n-cube) need not destroy the stability of a limit cycle. Although these networks are very simple as models of gene networks, their mathematical transparency reveals relationships between structure and behavior, they suggest that the possibilities for orderly dynamics in such networks are extremely rich and they offer novel ways to think about how mutations can alter dynamics.
Combinatorial explosion in model gene networks.
Edwards, R.; Glass, L.
2000-09-01
The explosive growth in knowledge of the genome of humans and other organisms leaves open the question of how the functioning of genes in interacting networks is coordinated for orderly activity. One approach to this problem is to study mathematical properties of abstract network models that capture the logical structures of gene networks. The principal issue is to understand how particular patterns of activity can result from particular network structures, and what types of behavior are possible. We study idealized models in which the logical structure of the network is explicitly represented by Boolean functions that can be represented by directed graphs on n-cubes, but which are continuous in time and described by differential equations, rather than being updated synchronously via a discrete clock. The equations are piecewise linear, which allows significant analysis and facilitates rapid integration along trajectories. We first give a combinatorial solution to the question of how many distinct logical structures exist for n-dimensional networks, showing that the number increases very rapidly with n. We then outline analytic methods that can be used to establish the existence, stability and periods of periodic orbits corresponding to particular cycles on the n-cube. We use these methods to confirm the existence of limit cycles discovered in a sample of a million randomly generated structures of networks of 4 genes. Even with only 4 genes, at least several hundred different patterns of stable periodic behavior are possible, many of them surprisingly complex. We discuss ways of further classifying these periodic behaviors, showing that small mutations (reversal of one or a few edges on the n-cube) need not destroy the stability of a limit cycle. Although these networks are very simple as models of gene networks, their mathematical transparency reveals relationships between structure and behavior, they suggest that the possibilities for orderly dynamics in such networks are extremely rich and they offer novel ways to think about how mutations can alter dynamics. (c) 2000 American Institute of Physics.
Formal modeling and analysis of ER-α associated Biological Regulatory Network in breast cancer.
Khalid, Samra; Hanif, Rumeza; Tareen, Samar H K; Siddiqa, Amnah; Bibi, Zurah; Ahmad, Jamil
2016-01-01
Breast cancer (BC) is one of the leading cause of death among females worldwide. The increasing incidence of BC is due to various genetic and environmental changes which lead to the disruption of cellular signaling network(s). It is a complex disease in which several interlinking signaling cascades play a crucial role in establishing a complex regulatory network. The logical modeling approach of René Thomas has been applied to analyze the behavior of estrogen receptor-alpha (ER- α ) associated Biological Regulatory Network (BRN) for a small part of complex events that leads to BC metastasis. A discrete model was constructed using the kinetic logic formalism and its set of logical parameters were obtained using the model checking technique implemented in the SMBioNet software which is consistent with biological observations. The discrete model was further enriched with continuous dynamics by converting it into an equivalent Petri Net (PN) to analyze the logical parameters of the involved entities. In-silico based discrete and continuous modeling of ER- α associated signaling network involved in BC provides information about behaviors and gene-gene interaction in detail. The dynamics of discrete model revealed, imperative behaviors represented as cyclic paths and trajectories leading to pathogenic states such as metastasis. Results suggest that the increased expressions of receptors ER- α , IGF-1R and EGFR slow down the activity of tumor suppressor genes (TSGs) such as BRCA1, p53 and Mdm2 which can lead to metastasis. Therefore, IGF-1R and EGFR are considered as important inhibitory targets to control the metastasis in BC. The in-silico approaches allow us to increase our understanding of the functional properties of living organisms. It opens new avenues of investigations of multiple inhibitory targets (ER- α , IGF-1R and EGFR) for wet lab experiments as well as provided valuable insights in the treatment of cancers such as BC.
Liang, Yajun; Wu, Heng; Lei, Rong; Chong, Robert A.; Wei, Yong; Lu, Xin; Tagkopoulos, Ilias; Kung, Sun-Yuan; Yang, Qifeng; Hu, Guohong; Kang, Yibin
2012-01-01
The application of functional genomic analysis of breast cancer metastasis has led to the identification of a growing number of organ-specific metastasis genes, which often function in concert to facilitate different steps of the metastatic cascade. However, the gene regulatory network that controls the expression of these metastasis genes remains largely unknown. Here, we demonstrate a computational approach for the deconvolution of transcriptional networks to discover master regulators of breast cancer bone metastasis. Several known regulators of breast cancer bone metastasis such as Smad4 and HIF1 were identified in our analysis. Experimental validation of the networks revealed BACH1, a basic leucine zipper transcription factor, as the common regulator of several functional metastasis genes, including MMP1 and CXCR4. Ectopic expression of BACH1 enhanced the malignance of breast cancer cells, and conversely, BACH1 knockdown significantly reduced bone metastasis. The expression of BACH1 and its target genes was linked to the higher risk of breast cancer recurrence in patients. This study established BACH1 as the master regulator of breast cancer bone metastasis and provided a paradigm to identify molecular determinants in complex pathological processes. PMID:22875853
Fritzsch, Bernd; Jahan, Israt; Pan, Ning; Elliott, Karen L.
2014-01-01
Understanding the evolution of the neurosensory system of man, able to reflect on its own origin, is one of the major goals of comparative neurobiology. Details of the origin of neurosensory cells, their aggregation into central nervous systems and associated sensory organs, their localized patterning into remarkably different cell types aggregated into variably sized parts of the central nervous system begin to emerge. Insights at the cellular and molecular level begin to shed some light on the evolution of neurosensory cells, partially covered in this review. Molecular evidence suggests that high mobility group (HMG) proteins of pre-metazoans evolved into the definitive Sox [SRY (sex determining region Y)-box] genes used for neurosensory precursor specification in metazoans. Likewise, pre-metazoan basic helix-loop-helix (bHLH) genes evolved in metazoans into the group A bHLH genes dedicated to neurosensory differentiation in bilaterians. Available evidence suggests that the Sox and bHLH genes evolved a cross-regulatory network able to synchronize expansion of precursor populations and their subsequent differentiation into novel parts of the brain or sensory organs. Molecular evidence suggests metazoans evolved patterning gene networks early and not dedicated to neuronal development. Only later in evolution were these patterning gene networks tied into the increasing complexity of diffusible factors, many of which were already present in pre-metazoans, to drive local patterning events. It appears that the evolving molecular basis of neurosensory cell development may have led, in interaction with differentially expressed patterning genes, to local network modifications guiding unique specializations of neurosensory cells into sensory organs and various areas of the central nervous system. PMID:25416504
Fritzsch, Bernd; Jahan, Israt; Pan, Ning; Elliott, Karen L
2015-01-01
Understanding the evolution of the neurosensory system of man, able to reflect on its own origin, is one of the major goals of comparative neurobiology. Details of the origin of neurosensory cells, their aggregation into central nervous systems and associated sensory organs and their localized patterning leading to remarkably different cell types aggregated into variably sized parts of the central nervous system have begun to emerge. Insights at the cellular and molecular level have begun to shed some light on the evolution of neurosensory cells, partially covered in this review. Molecular evidence suggests that high mobility group (HMG) proteins of pre-metazoans evolved into the definitive Sox [SRY (sex determining region Y)-box] genes used for neurosensory precursor specification in metazoans. Likewise, pre-metazoan basic helix-loop-helix (bHLH) genes evolved in metazoans into the group A bHLH genes dedicated to neurosensory differentiation in bilaterians. Available evidence suggests that the Sox and bHLH genes evolved a cross-regulatory network able to synchronize expansion of precursor populations and their subsequent differentiation into novel parts of the brain or sensory organs. Molecular evidence suggests metazoans evolved patterning gene networks early, which were not dedicated to neuronal development. Only later in evolution were these patterning gene networks tied into the increasing complexity of diffusible factors, many of which were already present in pre-metazoans, to drive local patterning events. It appears that the evolving molecular basis of neurosensory cell development may have led, in interaction with differentially expressed patterning genes, to local network modifications guiding unique specializations of neurosensory cells into sensory organs and various areas of the central nervous system.
Chen, Yuefeng; Wei, Tao; Yan, Lei; Lawrence, Frank; Qian, Hui-Rong; Burkholder, Timothy P; Starling, James J; Yingling, Jonathan M; Shou, Jianyong
2008-01-01
Background Tumor angiogenesis is a highly regulated process involving intercellular communication as well as the interactions of multiple downstream signal transduction pathways. Disrupting one or even a few angiogenesis pathways is often insufficient to achieve sustained therapeutic benefits due to the complexity of angiogenesis. Targeting multiple angiogenic pathways has been increasingly recognized as a viable strategy. However, translation of the polypharmacology of a given compound to its antiangiogenic efficacy remains a major technical challenge. Developing a global functional association network among angiogenesis-related genes is much needed to facilitate holistic understanding of angiogenesis and to aid the development of more effective anti-angiogenesis therapeutics. Results We constructed a comprehensive gene functional association network or interactome by transcript profiling an in vitro angiogenesis model, in which human umbilical vein endothelial cells (HUVECs) formed capillary structures when co-cultured with normal human dermal fibroblasts (NHDFs). HUVEC competence and NHDF supportiveness of cord formation were found to be highly cell-passage dependent. An enrichment test of Biological Processes (BP) of differentially expressed genes (DEG) revealed that angiogenesis related BP categories significantly changed with cell passages. Built upon 2012 DEGs identified from two microarray studies, the resulting interactome captured 17226 functional gene associations and displayed characteristics of a scale-free network. The interactome includes the involvement of oncogenes and tumor suppressor genes in angiogenesis. We developed a network walking algorithm to extract connectivity information from the interactome and applied it to simulate the level of network perturbation by three multi-targeted anti-angiogenic kinase inhibitors. Simulated network perturbation correlated with observed anti-angiogenesis activity in a cord formation bioassay. Conclusion We established a comprehensive gene functional association network to model in vitro angiogenesis regulation. The present study provided a proof-of-concept pilot of applying network perturbation analysis to drug phenotypic activity assessment. PMID:18518970
Liu, Guangming; Wang, Yiwei; Zhao, Pengyao; Zhu, Yizhun; Yang, Xiaohan; Zheng, Tiezheng; Zhou, Xuezhong; Jin, Weilin; Sun, Changkai
2017-01-01
Epilepsy is a complex neurological disorder and a significant health problem. The pathogenesis of epilepsy remains obscure in a significant number of patients and the current treatment options are not adequate in about a third of individuals which were known as refractory epilepsies (RE). Network medicine provides an effective approach for studying the molecular mechanisms underlying complex diseases. Here we integrated 1876 disease-gene associations of RE and located those genes to human protein-protein interaction (PPI) network to obtain 42 significant RE-associated disease modules. The functional analysis of these disease modules showed novel molecular pathological mechanisms of RE, such as the novel enriched pathways (e.g., “presynaptic nicotinic acetylcholine receptors”, “signaling by insulin receptor”). Further analysis on the relationships between current drug targets and the RE-related disease genes showed the rational mechanisms of most antiepileptic drugs. In addition, we detected ten potential novel drug targets (e.g., KCNA1, KCNA4-6, KCNC3, KCND2, KCNMA1, CAMK2G, CACNB4 and GRM1) located in three RE related disease modules, which might provide novel insights into the new drug discovery for RE therapy. PMID:28388656
Structures and Boolean Dynamics in Gene Regulatory Networks
NASA Astrophysics Data System (ADS)
Szedlak, Anthony
This dissertation discusses the topological and dynamical properties of GRNs in cancer, and is divided into four main chapters. First, the basic tools of modern complex network theory are introduced. These traditional tools as well as those developed by myself (set efficiency, interset efficiency, and nested communities) are crucial for understanding the intricate topological properties of GRNs, and later chapters recall these concepts. Second, the biology of gene regulation is discussed, and a method for disease-specific GRN reconstruction developed by our collaboration is presented. This complements the traditional exhaustive experimental approach of building GRNs edge-by-edge by quickly inferring the existence of as of yet undiscovered edges using correlations across sets of gene expression data. This method also provides insight into the distribution of common mutations across GRNs. Third, I demonstrate that the structures present in these reconstructed networks are strongly related to the evolutionary histories of their constituent genes. Investigation of how the forces of evolution shaped the topology of GRNs in multicellular organisms by growing outward from a core of ancient, conserved genes can shed light upon the ''reverse evolution'' of normal cells into unicellular-like cancer states. Next, I simulate the dynamics of the GRNs of cancer cells using the Hopfield model, an infinite range spin-glass model designed with the ability to encode Boolean data as attractor states. This attractor-driven approach facilitates the integration of gene expression data into predictive mathematical models. Perturbations representing therapeutic interventions are applied to sets of genes, and the resulting deviations from their attractor states are recorded, suggesting new potential drug targets for experimentation. Finally, I extend the Hopfield model to modular networks, cyclic attractors, and complex attractors, and apply these concepts to simulations of the cell cycle process. Futher development of these and other theoretical and computational tools is necessary to analyze the deluge of experimental data produced by modern and future biological high throughput methods. (Abstract shortened by ProQuest.).
Biomechanical cell regulatory networks as complex adaptive systems in relation to cancer.
Feller, Liviu; Khammissa, Razia Abdool Gafaar; Lemmer, Johan
2017-01-01
Physiological structure and function of cells are maintained by ongoing complex dynamic adaptive processes in the intracellular molecular pathways controlling the overall profile of gene expression, and by genes in cellular gene regulatory circuits. Cytogenetic mutations and non-genetic factors such as chronic inflammation or repetitive trauma, intrinsic mechanical stresses within extracellular matrix may induce redirection of gene regulatory circuits with abnormal reactivation of embryonic developmental programmes which can now drive cell transformation and cancer initiation, and later cancer progression and metastasis. Some of the non-genetic factors that may also favour cancerization are dysregulation in epithelial-mesenchymal interactions, in cell-to-cell communication, in extracellular matrix turnover, in extracellular matrix-to-cell interactions and in mechanotransduction pathways. Persistent increase in extracellular matrix stiffness, for whatever reason, has been shown to play an important role in cell transformation, and later in cancer cell invasion. In this article we review certain cell regulatory networks driving carcinogenesis, focussing on the role of mechanical stresses modulating structure and function of cells and their extracellular matrices.
Programming Morphogenesis through Systems and Synthetic Biology.
Velazquez, Jeremy J; Su, Emily; Cahan, Patrick; Ebrahimkhani, Mo R
2018-04-01
Mammalian tissue development is an intricate, spatiotemporal process of self-organization that emerges from gene regulatory networks of differentiating stem cells. A major goal in stem cell biology is to gain a sufficient understanding of gene regulatory networks and cell-cell interactions to enable the reliable and robust engineering of morphogenesis. Here, we review advances in synthetic biology, single cell genomics, and multiscale modeling, which, when synthesized, provide a framework to achieve the ambitious goal of programming morphogenesis in complex tissues and organoids. Copyright © 2017 Elsevier Ltd. All rights reserved.
Machine Learning for Detecting Gene-Gene Interactions
McKinney, Brett A.; Reif, David M.; Ritchie, Marylyn D.; Moore, Jason H.
2011-01-01
Complex interactions among genes and environmental factors are known to play a role in common human disease aetiology. There is a growing body of evidence to suggest that complex interactions are ‘the norm’ and, rather than amounting to a small perturbation to classical Mendelian genetics, interactions may be the predominant effect. Traditional statistical methods are not well suited for detecting such interactions, especially when the data are high dimensional (many attributes or independent variables) or when interactions occur between more than two polymorphisms. In this review, we discuss machine-learning models and algorithms for identifying and characterising susceptibility genes in common, complex, multifactorial human diseases. We focus on the following machine-learning methods that have been used to detect gene-gene interactions: neural networks, cellular automata, random forests, and multifactor dimensionality reduction. We conclude with some ideas about how these methods and others can be integrated into a comprehensive and flexible framework for data mining and knowledge discovery in human genetics. PMID:16722772
Lakshminarasimhan, Mahadevan; Boanca, Gina; Banks, Charles A. S.; Hattem, Gaye L.; Gabriel, Ana E.; Groppe, Brad D.; Smoyer, Christine; Malanowski, Kate E.; Peak, Allison; Florens, Laurence; Washburn, Michael P.
2016-01-01
The highly conserved yeast R2TP complex, consisting of Rvb1, Rvb2, Pih1, and Tah1, participates in diverse cellular processes ranging from assembly of protein complexes to apoptosis. Rvb1 and Rvb2 are closely related proteins belonging to the AAA+ superfamily and are essential for cell survival. Although Rvbs have been shown to be associated with various protein complexes including the Ino80 and Swr1chromatin remodeling complexes, we performed a systematic quantitative proteomic analysis of their associated proteins and identified two additional complexes that associate with Rvb1 and Rvb2: the chaperonin-containing T-complex and the 19S regulatory particle of the proteasome complex. We also analyzed Rvb1 and Rvb2 purified from yeast strains devoid of PIH1 and TAH1. These analyses revealed that both Rvb1 and Rvb2 still associated with Hsp90 and were highly enriched with RNA polymerase II complex components. Our analyses also revealed that both Rvb1 and Rvb2 were recruited to the Ino80 and Swr1 chromatin remodeling complexes even in the absence of Pih1 and Tah1 proteins. Using further biochemical analysis, we showed that Rvb1 and Rvb2 directly interacted with Hsp90 as well as with the RNA polymerase II complex. RNA-Seq analysis of the deletion strains compared with the wild-type strains revealed an up-regulation of ribosome biogenesis and ribonucleoprotein complex biogenesis genes, down-regulation of response to abiotic stimulus genes, and down-regulation of response to temperature stimulus genes. A Gene Ontology analysis of the 80 proteins whose protein associations were altered in the PIH1 or TAH1 deletion strains found ribonucleoprotein complex proteins to be the most enriched category. This suggests an important function of the R2TP complex in ribonucleoprotein complex biogenesis at both the proteomic and genomic levels. Finally, these results demonstrate that deletion network analyses can provide novel insights into cellular systems. PMID:26831523
Local and global responses in complex gene regulation networks
NASA Astrophysics Data System (ADS)
Tsuchiya, Masa; Selvarajoo, Kumar; Piras, Vincent; Tomita, Masaru; Giuliani, Alessandro
2009-04-01
An exacerbated sensitivity to apparently minor stimuli and a general resilience of the entire system stay together side-by-side in biological systems. This apparent paradox can be explained by the consideration of biological systems as very strongly interconnected network systems. Some nodes of these networks, thanks to their peculiar location in the network architecture, are responsible for the sensitivity aspects, while the large degree of interconnection is at the basis of the resilience properties of the system. One relevant feature of the high degree of connectivity of gene regulation networks is the emergence of collective ordered phenomena influencing the entire genome and not only a specific portion of transcripts. The great majority of existing gene regulation models give the impression of purely local ‘hard-wired’ mechanisms disregarding the emergence of global ordered behavior encompassing thousands of genes while the general, genome wide, aspects are less known. Here we address, on a data analysis perspective, the discrimination between local and global scale regulations, this goal was achieved by means of the examination of two biological systems: innate immune response in macrophages and oscillating growth dynamics in yeast. Our aim was to reconcile the ‘hard-wired’ local view of gene regulation with a global continuous and scalable one borrowed from statistical physics. This reconciliation is based on the network paradigm in which the local ‘hard-wired’ activities correspond to the activation of specific crucial nodes in the regulation network, while the scalable continuous responses can be equated to the collective oscillations of the network after a perturbation.
eQTL networks unveil enriched mRNA master integrators downstream of complex disease-associated SNPs.
Li, Haiquan; Pouladi, Nima; Achour, Ikbel; Gardeux, Vincent; Li, Jianrong; Li, Qike; Zhang, Hao Helen; Martinez, Fernando D; 'Skip' Garcia, Joe G N; Lussier, Yves A
2015-12-01
The causal and interplay mechanisms of Single Nucleotide Polymorphisms (SNPs) associated with complex diseases (complex disease SNPs) investigated in genome-wide association studies (GWAS) at the transcriptional level (mRNA) are poorly understood despite recent advancements such as discoveries reported in the Encyclopedia of DNA Elements (ENCODE) and Genotype-Tissue Expression (GTex). Protein interaction network analyses have successfully improved our understanding of both single gene diseases (Mendelian diseases) and complex diseases. Whether the mRNAs downstream of complex disease genes are central or peripheral in the genetic information flow relating DNA to mRNA remains unclear and may be disease-specific. Using expression Quantitative Trait Loci (eQTL) that provide DNA to mRNA associations and network centrality metrics, we hypothesize that we can unveil the systems properties of information flow between SNPs and the transcriptomes of complex diseases. We compare different conditions such as naïve SNP assignments and stringent linkage disequilibrium (LD) free assignments for transcripts to remove confounders from LD. Additionally, we compare the results from eQTL networks between lymphoblastoid cell lines and liver tissue. Empirical permutation resampling (p<0.001) and theoretic Mann-Whitney U test (p<10(-30)) statistics indicate that mRNAs corresponding to complex disease SNPs via eQTL associations are likely to be regulated by a larger number of SNPs than expected. We name this novel property mRNA hubness in eQTL networks, and further term mRNAs with high hubness as master integrators. mRNA master integrators receive and coordinate the perturbation signals from large numbers of polymorphisms and respond to the personal genetic architecture integratively. This genetic signal integration contrasts with the mechanism underlying some Mendelian diseases, where a genetic polymorphism affecting a single protein hub produces a divergent signal that affects a large number of downstream proteins. Indeed, we verify that this property is independent of the hubness in protein networks for which these mRNAs are transcribed. Our findings provide novel insights into the pleiotropy of mRNAs targeted by complex disease polymorphisms and the architecture of the information flow between the genetic polymorphisms and transcriptomes of complex diseases. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Erdelyi, Peter; Wang, Xing; Suleski, Marina; Wicky, Chantal
2016-01-01
Mi2 proteins are evolutionarily conserved, ATP-dependent chromatin remodelers of the CHD family that play key roles in stem cell differentiation and reprogramming. In Caenorhabditis elegans, the let-418 gene encodes one of the two Mi2 homologs, which is part of at least two chromatin complexes, namely the Nucleosome Remodeling and histone Deacetylase (NuRD) complex and the MEC complex, and functions in larval development, vulval morphogenesis, lifespan regulation, and cell fate determination. To explore the mechanisms involved in the action of LET-418/Mi2, we performed a genome-wide RNA interference (RNAi) screen for suppressors of early larval arrest associated with let-418 mutations. We identified 29 suppressor genes, of which 24 encode chromatin regulators, mostly orthologs of proteins present in transcriptional activator complexes. The remaining five genes vary broadly in their predicted functions. All suppressor genes could suppress multiple aspects of the let-418 phenotype, including developmental arrest and ectopic expression of germline genes in the soma. Analysis of available transcriptomic data and quantitative PCR revealed that LET-418 and the suppressors of early larval arrest are regulating common target genes. These suppressors might represent direct competitors of LET-418 complexes for chromatin regulation of crucial genes involved in the transition to postembryonic development. PMID:28007841
Erdelyi, Peter; Wang, Xing; Suleski, Marina; Wicky, Chantal
2017-02-09
Mi2 proteins are evolutionarily conserved, ATP-dependent chromatin remodelers of the CHD family that play key roles in stem cell differentiation and reprogramming. In Caenorhabditis elegans , the let-418 gene encodes one of the two Mi2 homologs, which is part of at least two chromatin complexes, namely the Nucleosome Remodeling and histone Deacetylase (NuRD) complex and the MEC complex, and functions in larval development, vulval morphogenesis, lifespan regulation, and cell fate determination. To explore the mechanisms involved in the action of LET-418/Mi2, we performed a genome-wide RNA interference (RNAi) screen for suppressors of early larval arrest associated with let-418 mutations. We identified 29 suppressor genes, of which 24 encode chromatin regulators, mostly orthologs of proteins present in transcriptional activator complexes. The remaining five genes vary broadly in their predicted functions. All suppressor genes could suppress multiple aspects of the let-418 phenotype, including developmental arrest and ectopic expression of germline genes in the soma. Analysis of available transcriptomic data and quantitative PCR revealed that LET-418 and the suppressors of early larval arrest are regulating common target genes. These suppressors might represent direct competitors of LET-418 complexes for chromatin regulation of crucial genes involved in the transition to postembryonic development. Copyright © 2017 Erdelyi et al.
Pan, Joshua; Meyers, Robin M; Michel, Brittany C; Mashtalir, Nazar; Sizemore, Ann E; Wells, Jonathan N; Cassel, Seth H; Vazquez, Francisca; Weir, Barbara A; Hahn, William C; Marsh, Joseph A; Tsherniak, Aviad; Kadoch, Cigall
2018-05-23
Protein complexes are assemblies of subunits that have co-evolved to execute one or many coordinated functions in the cellular environment. Functional annotation of mammalian protein complexes is critical to understanding biological processes, as well as disease mechanisms. Here, we used genetic co-essentiality derived from genome-scale RNAi- and CRISPR-Cas9-based fitness screens performed across hundreds of human cancer cell lines to assign measures of functional similarity. From these measures, we systematically built and characterized functional similarity networks that recapitulate known structural and functional features of well-studied protein complexes and resolve novel functional modules within complexes lacking structural resolution, such as the mammalian SWI/SNF complex. Finally, by integrating functional networks with large protein-protein interaction networks, we discovered novel protein complexes involving recently evolved genes of unknown function. Taken together, these findings demonstrate the utility of genetic perturbation screens alone, and in combination with large-scale biophysical data, to enhance our understanding of mammalian protein complexes in normal and disease states. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Revealing networks from dynamics: an introduction
NASA Astrophysics Data System (ADS)
Timme, Marc; Casadiego, Jose
2014-08-01
What can we learn from the collective dynamics of a complex network about its interaction topology? Taking the perspective from nonlinear dynamics, we briefly review recent progress on how to infer structural connectivity (direct interactions) from accessing the dynamics of the units. Potential applications range from interaction networks in physics, to chemical and metabolic reactions, protein and gene regulatory networks as well as neural circuits in biology and electric power grids or wireless sensor networks in engineering. Moreover, we briefly mention some standard ways of inferring effective or functional connectivity.
Analyzing the genes related to Alzheimer's disease via a network and pathway-based approach.
Hu, Yan-Shi; Xin, Juncai; Hu, Ying; Zhang, Lei; Wang, Ju
2017-04-27
Our understanding of the molecular mechanisms underlying Alzheimer's disease (AD) remains incomplete. Previous studies have revealed that genetic factors provide a significant contribution to the pathogenesis and development of AD. In the past years, numerous genes implicated in this disease have been identified via genetic association studies on candidate genes or at the genome-wide level. However, in many cases, the roles of these genes and their interactions in AD are still unclear. A comprehensive and systematic analysis focusing on the biological function and interactions of these genes in the context of AD will therefore provide valuable insights to understand the molecular features of the disease. In this study, we collected genes potentially associated with AD by screening publications on genetic association studies deposited in PubMed. The major biological themes linked with these genes were then revealed by function and biochemical pathway enrichment analysis, and the relation between the pathways was explored by pathway crosstalk analysis. Furthermore, the network features of these AD-related genes were analyzed in the context of human interactome and an AD-specific network was inferred using the Steiner minimal tree algorithm. We compiled 430 human genes reported to be associated with AD from 823 publications. Biological theme analysis indicated that the biological processes and biochemical pathways related to neurodevelopment, metabolism, cell growth and/or survival, and immunology were enriched in these genes. Pathway crosstalk analysis then revealed that the significantly enriched pathways could be grouped into three interlinked modules-neuronal and metabolic module, cell growth/survival and neuroendocrine pathway module, and immune response-related module-indicating an AD-specific immune-endocrine-neuronal regulatory network. Furthermore, an AD-specific protein network was inferred and novel genes potentially associated with AD were identified. By means of network and pathway-based methodology, we explored the pathogenetic mechanism underlying AD at a systems biology level. Results from our work could provide valuable clues for understanding the molecular mechanism underlying AD. In addition, the framework proposed in this study could be used to investigate the pathological molecular network and genes relevant to other complex diseases or phenotypes.
Identification of Causal Genes, Networks, and Transcriptional Regulators of REM Sleep and Wake
Millstein, Joshua; Winrow, Christopher J.; Kasarskis, Andrew; Owens, Joseph R.; Zhou, Lili; Summa, Keith C.; Fitzpatrick, Karrie; Zhang, Bin; Vitaterna, Martha H.; Schadt, Eric E.; Renger, John J.; Turek, Fred W.
2011-01-01
Study Objective: Sleep-wake traits are well-known to be under substantial genetic control, but the specific genes and gene networks underlying primary sleep-wake traits have largely eluded identification using conventional approaches, especially in mammals. Thus, the aim of this study was to use systems genetics and statistical approaches to uncover the genetic networks underlying 2 primary sleep traits in the mouse: 24-h duration of REM sleep and wake. Design: Genome-wide RNA expression data from 3 tissues (anterior cortex, hypothalamus, thalamus/midbrain) were used in conjunction with high-density genotyping to identify candidate causal genes and networks mediating the effects of 2 QTL regulating the 24-h duration of REM sleep and one regulating the 24-h duration of wake. Setting: Basic sleep research laboratory. Patients or Participants: Male [C57BL/6J × (BALB/cByJ × C57BL/6J*) F1] N2 mice (n = 283). Interventions: None. Measurements and Results: The genetic variation of a mouse N2 mapping cross was leveraged against sleep-state phenotypic variation as well as quantitative gene expression measurement in key brain regions using integrative genomics approaches to uncover multiple causal sleep-state regulatory genes, including several surprising novel candidates, which interact as components of networks that modulate REM sleep and wake. In particular, it was discovered that a core network module, consisting of 20 genes, involved in the regulation of REM sleep duration is conserved across the cortex, hypothalamus, and thalamus. A novel application of a formal causal inference test was also used to identify those genes directly regulating sleep via control of expression. Conclusion: Systems genetics approaches reveal novel candidate genes, complex networks and specific transcriptional regulators of REM sleep and wake duration in mammals. Citation: Millstein J; Winrow CJ; Kasarskis A; Owens JR; Zhou L; Summa KC; Fitzpatrick K; Zhang B; Vitaterna MH; Schadt EE; Renger JJ; Turek FW. Identification of causal genes, networks, and transcriptional regulators of REM sleep and wake. SLEEP 2011;34(11):1469-1477. PMID:22043117
A stochastic and dynamical view of pluripotency in mouse embryonic stem cells
Lee, Esther J.
2018-01-01
Pluripotent embryonic stem cells are of paramount importance for biomedical sciences because of their innate ability for self-renewal and differentiation into all major cell lines. The fateful decision to exit or remain in the pluripotent state is regulated by complex genetic regulatory networks. The rapid growth of single-cell sequencing data has greatly stimulated applications of statistical and machine learning methods for inferring topologies of pluripotency regulating genetic networks. The inferred network topologies, however, often only encode Boolean information while remaining silent about the roles of dynamics and molecular stochasticity inherent in gene expression. Herein we develop a framework for systematically extending Boolean-level network topologies into higher resolution models of networks which explicitly account for the promoter architectures and gene state switching dynamics. We show the framework to be useful for disentangling the various contributions that gene switching, external signaling, and network topology make to the global heterogeneity and dynamics of transcription factor populations. We find the pluripotent state of the network to be a steady state which is robust to global variations of gene switching rates which we argue are a good proxy for epigenetic states of individual promoters. The temporal dynamics of exiting the pluripotent state, on the other hand, is significantly influenced by the rates of genetic switching which makes cells more responsive to changes in extracellular signals. PMID:29451874
Bipartite Community Structure of eQTLs.
Platig, John; Castaldi, Peter J; DeMeo, Dawn; Quackenbush, John
2016-09-01
Genome Wide Association Studies (GWAS) and expression quantitative trait locus (eQTL) analyses have identified genetic associations with a wide range of human phenotypes. However, many of these variants have weak effects and understanding their combined effect remains a challenge. One hypothesis is that multiple SNPs interact in complex networks to influence functional processes that ultimately lead to complex phenotypes, including disease states. Here we present CONDOR, a method that represents both cis- and trans-acting SNPs and the genes with which they are associated as a bipartite graph and then uses the modular structure of that graph to place SNPs into a functional context. In applying CONDOR to eQTLs in chronic obstructive pulmonary disease (COPD), we found the global network "hub" SNPs were devoid of disease associations through GWAS. However, the network was organized into 52 communities of SNPs and genes, many of which were enriched for genes in specific functional classes. We identified local hubs within each community ("core SNPs") and these were enriched for GWAS SNPs for COPD and many other diseases. These results speak to our intuition: rather than single SNPs influencing single genes, we see groups of SNPs associated with the expression of families of functionally related genes and that disease SNPs are associated with the perturbation of those functions. These methods are not limited in their application to COPD and can be used in the analysis of a wide variety of disease processes and other phenotypic traits.
Chatterjee, Sumantra; Sivakamasundari, V; Yap, Sook Peng; Kraus, Petra; Kumar, Vibhor; Xing, Xing; Lim, Siew Lan; Sng, Joel; Prabhakar, Shyam; Lufkin, Thomas
2014-12-05
Vertebrate organogenesis is a highly complex process involving sequential cascades of transcription factor activation or repression. Interestingly a single developmental control gene can occasionally be essential for the morphogenesis and differentiation of tissues and organs arising from vastly disparate embryological lineages. Here we elucidated the role of the mammalian homeobox gene Bapx1 during the embryogenesis of five distinct organs at E12.5 - vertebral column, spleen, gut, forelimb and hindlimb - using expression profiling of sorted wildtype and mutant cells combined with genome wide binding site analysis. Furthermore we analyzed the development of the vertebral column at the molecular level by combining transcriptional profiling and genome wide binding data for Bapx1 with similarly generated data sets for Sox9 to assemble a detailed gene regulatory network revealing genes previously not reported to be controlled by either of these two transcription factors. The gene regulatory network appears to control cell fate decisions and morphogenesis in the vertebral column along with the prevention of premature chondrocyte differentiation thus providing a detailed molecular view of vertebral column development.
2011-01-01
Background A gene's position in regulatory, protein interaction or metabolic networks can be predictive of the strength of purifying selection acting on it, but these relationships are neither universal nor invariably strong. Following work in bacteria, fungi and invertebrate animals, we explore the relationship between selective constraint and metabolic function in mammals. Results We measure the association between selective constraint, estimated by the ratio of nonsynonymous (Ka) to synonymous (Ks) substitutions, and several, primarily metabolic, measures of gene function. We find significant differences between the selective constraints acting on enzyme-coding genes from different cellular compartments, with the nucleus showing higher constraint than genes from either the cytoplasm or the mitochondria. Among metabolic genes, the centrality of an enzyme in the metabolic network is significantly correlated with Ka/Ks. In contrast to yeasts, gene expression magnitude does not appear to be the primary predictor of selective constraint in these organisms. Conclusions Our results imply that the relationship between selective constraint and enzyme centrality is complex: the strength of selective constraint acting on mammalian genes is quite variable and does not appear to exclusively follow patterns seen in other organisms. PMID:21470417
Co-Option and De Novo Gene Evolution Underlie Molluscan Shell Diversity
Aguilera, Felipe; McDougall, Carmel
2017-01-01
Abstract Molluscs fabricate shells of incredible diversity and complexity by localized secretions from the dorsal epithelium of the mantle. Although distantly related molluscs express remarkably different secreted gene products, it remains unclear if the evolution of shell structure and pattern is underpinned by the differential co-option of conserved genes or the integration of lineage-specific genes into the mantle regulatory program. To address this, we compare the mantle transcriptomes of 11 bivalves and gastropods of varying relatedness. We find that each species, including four Pinctada (pearl oyster) species that diverged within the last 20 Ma, expresses a unique mantle secretome. Lineage- or species-specific genes comprise a large proportion of each species’ mantle secretome. A majority of these secreted proteins have unique domain architectures that include repetitive, low complexity domains (RLCDs), which evolve rapidly, and have a proclivity to expand, contract and rearrange in the genome. There are also a large number of secretome genes expressed in the mantle that arose before the origin of gastropods and bivalves. Each species expresses a unique set of these more ancient genes consistent with their independent co-option into these mantle gene regulatory networks. From this analysis, we infer lineage-specific secretomes underlie shell diversity, and include both rapidly evolving RLCD-containing proteins, and the continual recruitment and loss of both ancient and recently evolved genes into the periphery of the regulatory network controlling gene expression in the mantle epithelium. PMID:28053006
Wu, Min; Kwoh, Chee-Keong; Li, Xiaoli; Zheng, Jie
2014-09-11
The regulatory mechanism of recombination is one of the most fundamental problems in genomics, with wide applications in genome wide association studies (GWAS), birth-defect diseases, molecular evolution, cancer research, etc. Recombination events cluster into short genomic regions called "recombination hotspots". Recently, a zinc finger protein PRDM9 was reported to regulate recombination hotspots in human and mouse genomes. In addition, a 13-mer motif contained in the binding sites of PRDM9 is found to be enriched in human hotspots. However, this 13-mer motif only covers a fraction of hotspots, indicating that PRDM9 is not the only regulator of recombination hotspots. Therefore, the challenge of discovering other regulators of recombination hotspots becomes significant. Furthermore, recombination is a complex process. Hence, multiple proteins acting as machinery, rather than individual proteins, are more likely to carry out this process in a precise and stable manner. Therefore, the extension of the prediction of individual trans-regulators to protein complexes is also highly desired. In this paper, we introduce a pipeline to identify genes and protein complexes associated with recombination hotspots. First, we prioritize proteins associated with hotspots based on their preference of binding to hotspots and coldspots. Second, using the above identified genes as seeds, we apply the Random Walk with Restart algorithm (RWR) to propagate their influences to other proteins in protein-protein interaction (PPI) networks. Hence, many proteins without DNA-binding information will also be assigned a score to implicate their roles in recombination hotspots. Third, we construct sub-PPI networks induced by top genes ranked by RWR for various species (e.g., yeast, human and mouse) and detect protein complexes in those sub-PPI networks. The GO term analysis show that our prioritizing methods and the RWR algorithm are capable of identifying novel genes associated with recombination hotspots. The trans-regulators predicted by our pipeline are enriched with epigenetic functions (e.g., histone modifications), demonstrating the epigenetic regulatory mechanisms of recombination hotspots. The identified protein complexes also provide us with candidates to further investigate the molecular machineries for recombination hotspots. Moreover, the experimental data and results are available on our web site http://www.ntu.edu.sg/home/zhengjie/data/RecombinationHotspot/NetPipe/.
SWIM: a computational tool to unveiling crucial nodes in complex biological networks.
Paci, Paola; Colombo, Teresa; Fiscon, Giulia; Gurtner, Aymone; Pavesi, Giulio; Farina, Lorenzo
2017-03-20
SWItchMiner (SWIM) is a wizard-like software implementation of a procedure, previously described, able to extract information contained in complex networks. Specifically, SWIM allows unearthing the existence of a new class of hubs, called "fight-club hubs", characterized by a marked negative correlation with their first nearest neighbors. Among them, a special subset of genes, called "switch genes", appears to be characterized by an unusual pattern of intra- and inter-module connections that confers them a crucial topological role, interestingly mirrored by the evidence of their clinic-biological relevance. Here, we applied SWIM to a large panel of cancer datasets from The Cancer Genome Atlas, in order to highlight switch genes that could be critically associated with the drastic changes in the physiological state of cells or tissues induced by the cancer development. We discovered that switch genes are found in all cancers we studied and they encompass protein coding genes and non-coding RNAs, recovering many known key cancer players but also many new potential biomarkers not yet characterized in cancer context. Furthermore, SWIM is amenable to detect switch genes in different organisms and cell conditions, with the potential to uncover important players in biologically relevant scenarios, including but not limited to human cancer.
Jiggins, Chris D; Wallbank, Richard W R; Hanly, Joseph J
2017-02-05
A major challenge is to understand how conserved gene regulatory networks control the wonderful diversity of form that we see among animals and plants. Butterfly wing patterns are an excellent example of this diversity. Butterfly wings form as imaginal discs in the caterpillar and are constructed by a gene regulatory network, much of which is conserved across the holometabolous insects. Recent work in Heliconius butterflies takes advantage of genomic approaches and offers insights into how the diversification of wing patterns is overlaid onto this conserved network. WntA is a patterning morphogen that alters spatial information in the wing. Optix is a transcription factor that acts later in development to paint specific wing regions red. Both of these loci fit the paradigm of conserved protein-coding loci with diverse regulatory elements and developmental roles that have taken on novel derived functions in patterning wings. These discoveries offer insights into the 'Nymphalid Ground Plan', which offers a unifying hypothesis for pattern formation across nymphalid butterflies. These loci also represent 'hotspots' for morphological change that have been targeted repeatedly during evolution. Both convergent and divergent evolution of a great diversity of patterns is controlled by complex alleles at just a few genes. We suggest that evolutionary change has become focused on one or a few genetic loci for two reasons. First, pre-existing complex cis-regulatory loci that already interact with potentially relevant transcription factors are more likely to acquire novel functions in wing patterning. Second, the shape of wing regulatory networks may constrain evolutionary change to one or a few loci. Overall, genomic approaches that have identified wing patterning loci in these butterflies offer broad insight into how gene regulatory networks evolve to produce diversity.This article is part of the themed issue 'Evo-devo in the genomics era, and the origins of morphological diversity'. © 2016 The Author(s).
Wallbank, Richard W. R.; Hanly, Joseph J.
2017-01-01
A major challenge is to understand how conserved gene regulatory networks control the wonderful diversity of form that we see among animals and plants. Butterfly wing patterns are an excellent example of this diversity. Butterfly wings form as imaginal discs in the caterpillar and are constructed by a gene regulatory network, much of which is conserved across the holometabolous insects. Recent work in Heliconius butterflies takes advantage of genomic approaches and offers insights into how the diversification of wing patterns is overlaid onto this conserved network. WntA is a patterning morphogen that alters spatial information in the wing. Optix is a transcription factor that acts later in development to paint specific wing regions red. Both of these loci fit the paradigm of conserved protein-coding loci with diverse regulatory elements and developmental roles that have taken on novel derived functions in patterning wings. These discoveries offer insights into the ‘Nymphalid Ground Plan’, which offers a unifying hypothesis for pattern formation across nymphalid butterflies. These loci also represent ‘hotspots’ for morphological change that have been targeted repeatedly during evolution. Both convergent and divergent evolution of a great diversity of patterns is controlled by complex alleles at just a few genes. We suggest that evolutionary change has become focused on one or a few genetic loci for two reasons. First, pre-existing complex cis-regulatory loci that already interact with potentially relevant transcription factors are more likely to acquire novel functions in wing patterning. Second, the shape of wing regulatory networks may constrain evolutionary change to one or a few loci. Overall, genomic approaches that have identified wing patterning loci in these butterflies offer broad insight into how gene regulatory networks evolve to produce diversity. This article is part of the themed issue ‘Evo-devo in the genomics era, and the origins of morphological diversity’. PMID:27994126
Knowledge Discovery in Spectral Data by Means of Complex Networks
Zanin, Massimiliano; Papo, David; Solís, José Luis González; Espinosa, Juan Carlos Martínez; Frausto-Reyes, Claudio; Anda, Pascual Palomares; Sevilla-Escoboza, Ricardo; Boccaletti, Stefano; Menasalvas, Ernestina; Sousa, Pedro
2013-01-01
In the last decade, complex networks have widely been applied to the study of many natural and man-made systems, and to the extraction of meaningful information from the interaction structures created by genes and proteins. Nevertheless, less attention has been devoted to metabonomics, due to the lack of a natural network representation of spectral data. Here we define a technique for reconstructing networks from spectral data sets, where nodes represent spectral bins, and pairs of them are connected when their intensities follow a pattern associated with a disease. The structural analysis of the resulting network can then be used to feed standard data-mining algorithms, for instance for the classification of new (unlabeled) subjects. Furthermore, we show how the structure of the network is resilient to the presence of external additive noise, and how it can be used to extract relevant knowledge about the development of the disease. PMID:24957895
Knowledge discovery in spectral data by means of complex networks.
Zanin, Massimiliano; Papo, David; Solís, José Luis González; Espinosa, Juan Carlos Martínez; Frausto-Reyes, Claudio; Anda, Pascual Palomares; Sevilla-Escoboza, Ricardo; Jaimes-Reategui, Rider; Boccaletti, Stefano; Menasalvas, Ernestina; Sousa, Pedro
2013-03-11
In the last decade, complex networks have widely been applied to the study of many natural and man-made systems, and to the extraction of meaningful information from the interaction structures created by genes and proteins. Nevertheless, less attention has been devoted to metabonomics, due to the lack of a natural network representation of spectral data. Here we define a technique for reconstructing networks from spectral data sets, where nodes represent spectral bins, and pairs of them are connected when their intensities follow a pattern associated with a disease. The structural analysis of the resulting network can then be used to feed standard data-mining algorithms, for instance for the classification of new (unlabeled) subjects. Furthermore, we show how the structure of the network is resilient to the presence of external additive noise, and how it can be used to extract relevant knowledge about the development of the disease.
Geronikolou, Styliani A; Pavlopoulou, Athanasia; Cokkinos, Dennis; Chrousos, George
2017-01-01
Obesity is a chronic disease of increasing prevalence reaching epidemic proportions. Genetic defects as well as epigenetic effects contribute to the obesity phenotype. Investigating gene (e.g. MC4R defects)-environment (behavior, infectious agents, stress) interactions is a relative new field of great research interest. In this study, we have made an effort to create an interactome (henceforth referred to as "obesidome"), where extrinsic stressors response, intrinsic predisposition, immunity response to inflammation and autonomous nervous system implications are integrated. These pathways are presented in one interactome network for the first time. In our study, obesity-related genes/gene products were found to form a complex interactions network.
Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering
NASA Technical Reports Server (NTRS)
Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland
2000-01-01
Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review datamining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-duster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.
Topology and Control of the Cell-Cycle-Regulated Transcriptional Circuitry
Haase, Steven B.; Wittenberg, Curt
2014-01-01
Nearly 20% of the budding yeast genome is transcribed periodically during the cell division cycle. The precise temporal execution of this large transcriptional program is controlled by a large interacting network of transcriptional regulators, kinases, and ubiquitin ligases. Historically, this network has been viewed as a collection of four coregulated gene clusters that are associated with each phase of the cell cycle. Although the broad outlines of these gene clusters were described nearly 20 years ago, new technologies have enabled major advances in our understanding of the genes comprising those clusters, their regulation, and the complex regulatory interplay between clusters. More recently, advances are being made in understanding the roles of chromatin in the control of the transcriptional program. We are also beginning to discover important regulatory interactions between the cell-cycle transcriptional program and other cell-cycle regulatory mechanisms such as checkpoints and metabolic networks. Here we review recent advances and contemporary models of the transcriptional network and consider these models in the context of eukaryotic cell-cycle controls. PMID:24395825
Rapid cell-free forward engineering of novel genetic ring oscillators
Niederholtmeyer, Henrike; Sun, Zachary Z; Hori, Yutaka; Yeung, Enoch; Verpoorte, Amanda; Murray, Richard M; Maerkl, Sebastian J
2015-01-01
While complex dynamic biological networks control gene expression in all living organisms, the forward engineering of comparable synthetic networks remains challenging. The current paradigm of characterizing synthetic networks in cells results in lengthy design-build-test cycles, minimal data collection, and poor quantitative characterization. Cell-free systems are appealing alternative environments, but it remains questionable whether biological networks behave similarly in cell-free systems and in cells. We characterized in a cell-free system the ‘repressilator’, a three-node synthetic oscillator. We then engineered novel three, four, and five-gene ring architectures, from characterization of circuit components to rapid analysis of complete networks. When implemented in cells, our novel 3-node networks produced population-wide oscillations and 95% of 5-node oscillator cells oscillated for up to 72 hr. Oscillation periods in cells matched the cell-free system results for all networks tested. An alternate forward engineering paradigm using cell-free systems can thus accurately capture cellular behavior. DOI: http://dx.doi.org/10.7554/eLife.09771.001 PMID:26430766
Inferring drug-disease associations based on known protein complexes.
Yu, Liang; Huang, Jianbin; Ma, Zhixin; Zhang, Jing; Zou, Yapeng; Gao, Lin
2015-01-01
Inferring drug-disease associations is critical in unveiling disease mechanisms, as well as discovering novel functions of available drugs, or drug repositioning. Previous work is primarily based on drug-gene-disease relationship, which throws away many important information since genes execute their functions through interacting others. To overcome this issue, we propose a novel methodology that discover the drug-disease association based on protein complexes. Firstly, the integrated heterogeneous network consisting of drugs, protein complexes, and disease are constructed, where we assign weights to the drug-disease association by using probability. Then, from the tripartite network, we get the indirect weighted relationships between drugs and diseases. The larger the weight, the higher the reliability of the correlation. We apply our method to mental disorders and hypertension, and validate the result by using comparative toxicogenomics database. Our ranked results can be directly reinforced by existing biomedical literature, suggesting that our proposed method obtains higher specificity and sensitivity. The proposed method offers new insight into drug-disease discovery. Our method is publicly available at http://1.complexdrug.sinaapp.com/Drug_Complex_Disease/Data_Download.html.
Inferring drug-disease associations based on known protein complexes
2015-01-01
Inferring drug-disease associations is critical in unveiling disease mechanisms, as well as discovering novel functions of available drugs, or drug repositioning. Previous work is primarily based on drug-gene-disease relationship, which throws away many important information since genes execute their functions through interacting others. To overcome this issue, we propose a novel methodology that discover the drug-disease association based on protein complexes. Firstly, the integrated heterogeneous network consisting of drugs, protein complexes, and disease are constructed, where we assign weights to the drug-disease association by using probability. Then, from the tripartite network, we get the indirect weighted relationships between drugs and diseases. The larger the weight, the higher the reliability of the correlation. We apply our method to mental disorders and hypertension, and validate the result by using comparative toxicogenomics database. Our ranked results can be directly reinforced by existing biomedical literature, suggesting that our proposed method obtains higher specificity and sensitivity. The proposed method offers new insight into drug-disease discovery. Our method is publicly available at http://1.complexdrug.sinaapp.com/Drug_Complex_Disease/Data_Download.html. PMID:26044949
Inference of gene regulatory networks from time series by Tsallis entropy
2011-01-01
Background The inference of gene regulatory networks (GRNs) from large-scale expression profiles is one of the most challenging problems of Systems Biology nowadays. Many techniques and models have been proposed for this task. However, it is not generally possible to recover the original topology with great accuracy, mainly due to the short time series data in face of the high complexity of the networks and the intrinsic noise of the expression measurements. In order to improve the accuracy of GRNs inference methods based on entropy (mutual information), a new criterion function is here proposed. Results In this paper we introduce the use of generalized entropy proposed by Tsallis, for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach and the conditional entropy is applied as criterion function. In order to assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expressions generated by an artificial gene network (AGN) model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks and its gene transference function is obtained from random drawing on the set of possible Boolean functions, thus creating its dynamics. On the other hand, DREAM time series data presents variation of network size and its topologies are based on real networks. The dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes. Conclusions A remarkable improvement of accuracy was observed in the experimental results by reducing the number of false connections in the inferred topology by the non-Shannon entropy. The obtained best free parameter of the Tsallis entropy was on average in the range 2.5 ≤ q ≤ 3.5 (hence, subextensive entropy), which opens new perspectives for GRNs inference methods based on information theory and for investigation of the nonextensivity of such networks. The inference algorithm and criterion function proposed here were implemented and included in the DimReduction software, which is freely available at http://sourceforge.net/projects/dimreduction and http://code.google.com/p/dimreduction/. PMID:21545720
[Fanconi anemia: genes and function(s) revisited].
Papadopoulo, Dora; Moustacchi, Ethel
2005-01-01
Fanconi anemia (FA), a rare inherited disorder, exhibits a complex phenotype including progressive bone marrow failure, congenital malformations and increased risk of cancers, mainly acute myeloid leukaemia. At the cellular level, FA is characterized by hypersensitivity to DNA cross-linking agents and by high frequencies of induced chromosomal aberrations, a property used for diagnosis. FA results from mutations in one of the eleven FANC (FANCA to FANCJ) genes. Nine of them have been identified. In addition, FANCD1 gene has been shown to be identical to BRCA2, one of the two breast cancer susceptibility genes. Seven of the FANC proteins form a complex, which exists in four different forms depending of its subcellular localisation. Four FANC proteins (D1(BRCA2), D2, I and J) are not associated to the complex. The presence of the nuclear form of the FA core complex is necessary for the mono-ubiquitinylation of FANCD2 protein, a modification required for its re-localization to nuclear foci, likely to be sites of DNA repair. A clue towards understanding the molecular function of the FANC genes comes from the recently identified connection of FANC to the BRCA1, ATM, NBS1 and ATR genes. Two of the FANC proteins (A and D2) directly interact with BRCA1, which in turn interacts with the MRE11/RAD50/NBS1 complex, which is one of the key components in the mechanisms involved in the cellular response to DNA double strand breaks (DSB). Moreover, ATM, a protein kinase that plays a central role in the network of DSB signalling, phosphorylates in vitro and in vivo FANCD2 in response to ionising radiations. Moreover, the NBS1 protein and the monoubiquitinated form of FANCD2 seem to act together in response to DNA crosslinking agents. Taken together with the previously reported impaired DSB and DNA interstrand crosslinks repair in FA cells, the connection of FANC genes to the ATM, ATR, NBS1 and BRCA1 links the FANC genes function to the finely orchestrated network involved in the sensing, signalling and repair of DNA replication-blocking lesions.
Johnson, Michael R.; Rossetti, Tiziana; Speed, Doug; Srivastava, Prashant K.; Chadeau-Hyam, Marc; Hajji, Nabil; Dabrowska, Aleksandra; Rotival, Maxime; Razzaghi, Banafsheh; Kovac, Stjepana; Wanisch, Klaus; Grillo, Federico W.; Slaviero, Anna; Langley, Sarah R.; Shkura, Kirill; Roncon, Paolo; De, Tisham; Mattheisen, Manuel; Niehusmann, Pitt; O’Brien, Terence J.; Petrovski, Slave; von Lehe, Marec; Hoffmann, Per; Eriksson, Johan; Coffey, Alison J.; Cichon, Sven; Walker, Matthew; Simonato, Michele; Danis, Bénédicte; Mazzuferi, Manuela; Foerch, Patrik; Schoch, Susanne; De Paola, Vincenzo; Kaminski, Rafal M.; Cunliffe, Vincent T.; Becker, Albert J.; Petretto, Enrico
2015-01-01
Gene-regulatory network analysis is a powerful approach to elucidate the molecular processes and pathways underlying complex disease. Here we employ systems genetics approaches to characterize the genetic regulation of pathophysiological pathways in human temporal lobe epilepsy (TLE). Using surgically acquired hippocampi from 129 TLE patients, we identify a gene-regulatory network genetically associated with epilepsy that contains a specialized, highly expressed transcriptional module encoding proconvulsive cytokines and Toll-like receptor signalling genes. RNA sequencing analysis in a mouse model of TLE using 100 epileptic and 100 control hippocampi shows the proconvulsive module is preserved across-species, specific to the epileptic hippocampus and upregulated in chronic epilepsy. In the TLE patients, we map the trans-acting genetic control of this proconvulsive module to Sestrin 3 (SESN3), and demonstrate that SESN3 positively regulates the module in macrophages, microglia and neurons. Morpholino-mediated Sesn3 knockdown in zebrafish confirms the regulation of the transcriptional module, and attenuates chemically induced behavioural seizures in vivo. PMID:25615886
Mostafavi, Sara; Gaiteri, Chris; Sullivan, Sarah E; White, Charles C; Tasaki, Shinya; Xu, Jishu; Taga, Mariko; Klein, Hans-Ulrich; Patrick, Ellis; Komashko, Vitalina; McCabe, Cristin; Smith, Robert; Bradshaw, Elizabeth M; Root, David E; Regev, Aviv; Yu, Lei; Chibnik, Lori B; Schneider, Julie A; Young-Pearse, Tracy L; Bennett, David A; De Jager, Philip L
2018-06-01
There is a need for new therapeutic targets with which to prevent Alzheimer's disease (AD), a major contributor to aging-related cognitive decline. Here we report the construction and validation of a molecular network of the aging human frontal cortex. Using RNA sequence data from 478 individuals, we first build a molecular network using modules of coexpressed genes and then relate these modules to AD and its neuropathologic and cognitive endophenotypes. We confirm these associations in two independent AD datasets. We also illustrate the use of the network in prioritizing amyloid- and cognition-associated genes for in vitro validation in human neurons and astrocytes. These analyses based on unique cohorts enable us to resolve the role of distinct cortical modules that have a direct effect on the accumulation of AD pathology from those that have a direct effect on cognitive decline, exemplifying a network approach to complex diseases.
BIOLOGICAL NETWORK EXPLORATION WITH CYTOSCAPE 3
Su, Gang; Morris, John H.; Demchak, Barry; Bader, Gary D.
2014-01-01
Cytoscape is one of the most popular open-source software tools for the visual exploration of biomedical networks composed of protein, gene and other types of interactions. It offers researchers a versatile and interactive visualization interface for exploring complex biological interconnections supported by diverse annotation and experimental data, thereby facilitating research tasks such as predicting gene function and pathway construction. Cytoscape provides core functionality to load, visualize, search, filter and save networks, and hundreds of Apps extend this functionality to address specific research needs. The latest generation of Cytoscape (version 3.0 and later) has substantial improvements in function, user interface and performance relative to previous versions. This protocol aims to jump-start new users with specific protocols for basic Cytoscape functions, such as installing Cytoscape and Cytoscape Apps, loading data, visualizing and navigating the network, visualizing network associated data (attributes) and identifying clusters. It also highlights new features that benefit experienced users. PMID:25199793
Azmi, Asfar S.; Banerjee, Sanjeev; Ali, Shadan; Wang, Zhiwei; Bao, Bin; Beck, Frances W.J.; Maitah, Main; Choi, Minsig; Shields, Tony F.; Philip, Philip A.; Sarkar, Fazlul H.; Mohammad, Ramzi M.
2011-01-01
Earlier we had shown that the MDM2 inhibitor (MI-219) belonging to the spiro-oxindole family can synergistically enhance the efficacy of platinum chemotherapeutics leading to 50% tumor free survival in a genetically complex pancreatic ductal adenocarcinoma (PDAC) xenograft model. In this report, we have taken a systems and network modeling approach in order to understand central mechanisms behind MI219-oxaliplatin synergy with validation in PDAC, colon and breast cancer cell lines. Microarray profiling of drug treatments (MI-219, oxaliplatin or their combination) in capan-2 cells reveal a similar unique set of gene alterations that is duplicated in other solid tumor cells. As single agent, MI-219 or oxaliplatin induced alterations in 48 and 761 genes respectively. The combination treatment resulted in 767 gene alterations with emergence of 286 synergy unique genes. Ingenuity network modeling of combination and synergy unique genes showed the crucial role of five key local networks CREB, CARF, EGR1, NF-kB and E Cadherin. The network signatures were validated at the protein level in all three cell lines. Individually silencing central nodes in these five hubs resulted in abrogation of MI-219-oxaliplatin activity confirming their critical role in aiding p53 mediated apoptotic response. We anticipate that our MI219-oxaliplatin network blueprints can be clinically translated in the rationale design and application of this unique therapeutic combination in a genetically pre-defined subset of patients. PMID:21623005
Genetic and environmental pathways to complex diseases.
Gohlke, Julia M; Thomas, Reuben; Zhang, Yonqing; Rosenstein, Michael C; Davis, Allan P; Murphy, Cynthia; Becker, Kevin G; Mattingly, Carolyn J; Portier, Christopher J
2009-05-05
Pathogenesis of complex diseases involves the integration of genetic and environmental factors over time, making it particularly difficult to tease apart relationships between phenotype, genotype, and environmental factors using traditional experimental approaches. Using gene-centered databases, we have developed a network of complex diseases and environmental factors through the identification of key molecular pathways associated with both genetic and environmental contributions. Comparison with known chemical disease relationships and analysis of transcriptional regulation from gene expression datasets for several environmental factors and phenotypes clustered in a metabolic syndrome and neuropsychiatric subnetwork supports our network hypotheses. This analysis identifies natural and synthetic retinoids, antipsychotic medications, Omega 3 fatty acids, and pyrethroid pesticides as potential environmental modulators of metabolic syndrome phenotypes through PPAR and adipocytokine signaling and organophosphate pesticides as potential environmental modulators of neuropsychiatric phenotypes. Identification of key regulatory pathways that integrate genetic and environmental modulators define disease associated targets that will allow for efficient screening of large numbers of environmental factors, screening that could set priorities for further research and guide public health decisions.
Moschen, Sebastián; Higgins, Janet; Di Rienzo, Julio A; Heinz, Ruth A; Paniego, Norma; Fernandez, Paula
2016-06-06
In recent years, high throughput technologies have led to an increase of datasets from omics disciplines allowing the understanding of the complex regulatory networks associated with biological processes. Leaf senescence is a complex mechanism controlled by multiple genetic and environmental variables, which has a strong impact on crop yield. Transcription factors (TFs) are key proteins in the regulation of gene expression, regulating different signaling pathways; their function is crucial for triggering and/or regulating different aspects of the leaf senescence process. The study of TF interactions and their integration with metabolic profiles under different developmental conditions, especially for a non-model organism such as sunflower, will open new insights into the details of gene regulation of leaf senescence. Weighted Gene Correlation Network Analysis (WGCNA) and BioSignature Discoverer (BioSD, Gnosis Data Analysis, Heraklion, Greece) were used to integrate transcriptomic and metabolomic data. WGCNA allowed the detection of 10 metabolites and 13 TFs whereas BioSD allowed the detection of 1 metabolite and 6 TFs as potential biomarkers. The comparative analysis demonstrated that three transcription factors were detected through both methodologies, highlighting them as potentially robust biomarkers associated with leaf senescence in sunflower. The complementary use of network and BioSignature Discoverer analysis of transcriptomic and metabolomic data provided a useful tool for identifying candidate genes and metabolites which may have a role during the triggering and development of the leaf senescence process. The WGCNA tool allowed us to design and test a hypothetical network in order to infer relationships across selected transcription factor and metabolite candidate biomarkers involved in leaf senescence, whereas BioSignature Discoverer selected transcripts and metabolites which discriminate between different ages of sunflower plants. The methodology presented here would help to elucidate and predict novel networks and potential biomarkers of leaf senescence in sunflower.
Prior knowledge based mining functional modules from Yeast PPI networks with gene ontology
2010-01-01
Background In the literature, there are fruitful algorithmic approaches for identification functional modules in protein-protein interactions (PPI) networks. Because of accumulation of large-scale interaction data on multiple organisms and non-recording interaction data in the existing PPI database, it is still emergent to design novel computational techniques that can be able to correctly and scalably analyze interaction data sets. Indeed there are a number of large scale biological data sets providing indirect evidence for protein-protein interaction relationships. Results The main aim of this paper is to present a prior knowledge based mining strategy to identify functional modules from PPI networks with the aid of Gene Ontology. Higher similarity value in Gene Ontology means that two gene products are more functionally related to each other, so it is better to group such gene products into one functional module. We study (i) to encode the functional pairs into the existing PPI networks; and (ii) to use these functional pairs as pairwise constraints to supervise the existing functional module identification algorithms. Topology-based modularity metric and complex annotation in MIPs will be used to evaluate the identified functional modules by these two approaches. Conclusions The experimental results on Yeast PPI networks and GO have shown that the prior knowledge based learning methods perform better than the existing algorithms. PMID:21172053
Dai, Jiajuan; Wang, Xusheng; Chen, Ying; Wang, Xiaodong; Zhu, Jun; Lu, Lu
2009-11-01
Previous studies have revealed that the subunit alpha 2 (Gabra2) of the gamma-aminobutyric acid receptor plays a critical role in the stress response. However, little is known about the gentetic regulatory network for Gabra2 and the stress response. We combined gene expression microarray analysis and quantitative trait loci (QTL) mapping to characterize the genetic regulatory network for Gabra2 expression in the hippocampus of BXD recombinant inbred (RI) mice. Our analysis found that the expression level of Gabra2 exhibited much variation in the hippocampus across the BXD RI strains and between the parental strains, C57BL/6J, and DBA/2J. Expression QTL (eQTL) mapping showed three microarray probe sets of Gabra2 to have highly significant linkage likelihood ratio statistic (LRS) scores. Gene co-regulatory network analysis showed that 10 genes, including Gria3, Chka, Drd3, Homer1, Grik2, Odz4, Prkag2, Grm5, Gabrb1, and Nlgn1 are directly or indirectly associated with stress responses. Eleven genes were implicated as Gabra2 downstream genes through mapping joint modulation. The genetical genomics approach demonstrates the importance and the potential power of the eQTL studies in identifying genetic regulatory networks that contribute to complex traits, such as stress responses.
NASA Astrophysics Data System (ADS)
Chen, Ye; Wolanyk, Nathaniel; Ilker, Tunc; Gao, Shouguo; Wang, Xujing
Methods developed based on bifurcation theory have demonstrated their potential in driving network identification for complex human diseases, including the work by Chen, et al. Recently bifurcation theory has been successfully applied to model cellular differentiation. However, there one often faces a technical challenge in driving network prediction: time course cellular differentiation study often only contains one sample at each time point, while driving network prediction typically require multiple samples at each time point to infer the variation and interaction structures of candidate genes for the driving network. In this study, we investigate several methods to identify both the critical time point and the driving network through examination of how each time point affects the autocorrelation and phase locking. We apply these methods to a high-throughput sequencing (RNA-Seq) dataset of 42 subsets of thymocytes and mature peripheral T cells at multiple time points during their differentiation (GSE48138 from GEO). We compare the predicted driving genes with known transcription regulators of cellular differentiation. We will discuss the advantages and limitations of our proposed methods, as well as potential further improvements of our methods.
Pegoraro, Camila; Tadiello, Alice; Girardi, César L; Chaves, Fábio C; Quecini, Vera; de Oliveira, Antonio Costa; Trainotti, Livio; Rombaldi, Cesar Valmor
2015-11-18
Postharvest fruit conservation relies on low temperatures and manipulations of hormone metabolism to maintain sensory properties. Peaches are susceptible to chilling injuries, such as 'woolliness' that is caused by juice loss leading to a 'wooly' fruit texture. Application of gibberellic acid at the initial stages of pit hardening impairs woolliness incidence, however the mechanisms controlling the response remain unknown. We have employed genome wide transcriptional profiling to investigate the effects of gibberellic acid application and cold storage on harvested peaches. Approximately half of the investigated genes exhibited significant differential expression in response to the treatments. Cellular and developmental process gene ontologies were overrepresented among the differentially regulated genes, whereas sequences in cell death and immune response categories were underrepresented. Gene set enrichment demonstrated a predominant role of cold storage in repressing the transcription of genes associated to cell wall metabolism. In contrast, genes involved in hormone responses exhibited a more complex transcriptional response, indicating an extensive network of crosstalk between hormone signaling and low temperatures. Time course transcriptional analyses demonstrate the large contribution of gene expression regulation on the biochemical changes leading to woolliness in peach. Overall, our results provide insights on the mechanisms controlling the complex phenotypes associated to postharvest textural changes in peach and suggest that hormone mediated reprogramming previous to pit hardening affects the onset of chilling injuries.
Ji, Jiadong; He, Di; Feng, Yang; He, Yong; Xue, Fuzhong; Xie, Lei
2017-10-01
A complex disease is usually driven by a number of genes interwoven into networks, rather than a single gene product. Network comparison or differential network analysis has become an important means of revealing the underlying mechanism of pathogenesis and identifying clinical biomarkers for disease classification. Most studies, however, are limited to network correlations that mainly capture the linear relationship among genes, or rely on the assumption of a parametric probability distribution of gene measurements. They are restrictive in real application. We propose a new Joint density based non-parametric Differential Interaction Network Analysis and Classification (JDINAC) method to identify differential interaction patterns of network activation between two groups. At the same time, JDINAC uses the network biomarkers to build a classification model. The novelty of JDINAC lies in its potential to capture non-linear relations between molecular interactions using high-dimensional sparse data as well as to adjust confounding factors, without the need of the assumption of a parametric probability distribution of gene measurements. Simulation studies demonstrate that JDINAC provides more accurate differential network estimation and lower classification error than that achieved by other state-of-the-art methods. We apply JDINAC to a Breast Invasive Carcinoma dataset, which includes 114 patients who have both tumor and matched normal samples. The hub genes and differential interaction patterns identified were consistent with existing experimental studies. Furthermore, JDINAC discriminated the tumor and normal sample with high accuracy by virtue of the identified biomarkers. JDINAC provides a general framework for feature selection and classification using high-dimensional sparse omics data. R scripts available at https://github.com/jijiadong/JDINAC. lxie@iscb.org. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Markov Logic Networks in the Analysis of Genetic Data
Sakhanenko, Nikita A.
2010-01-01
Abstract Complex, non-additive genetic interactions are common and can be critical in determining phenotypes. Genome-wide association studies (GWAS) and similar statistical studies of linkage data, however, assume additive models of gene interactions in looking for genotype-phenotype associations. These statistical methods view the compound effects of multiple genes on a phenotype as a sum of influences of each gene and often miss a substantial part of the heritable effect. Such methods do not use any biological knowledge about underlying mechanisms. Modeling approaches from the artificial intelligence (AI) field that incorporate deterministic knowledge into models to perform statistical analysis can be applied to include prior knowledge in genetic analysis. We chose to use the most general such approach, Markov Logic Networks (MLNs), for combining deterministic knowledge with statistical analysis. Using simple, logistic regression-type MLNs we can replicate the results of traditional statistical methods, but we also show that we are able to go beyond finding independent markers linked to a phenotype by using joint inference without an independence assumption. The method is applied to genetic data on yeast sporulation, a complex phenotype with gene interactions. In addition to detecting all of the previously identified loci associated with sporulation, our method identifies four loci with smaller effects. Since their effect on sporulation is small, these four loci were not detected with methods that do not account for dependence between markers due to gene interactions. We show how gene interactions can be detected using more complex models, which can be used as a general framework for incorporating systems biology with genetics. PMID:20958249
Evolutionary trends and functional anatomy of the human expanded autophagy network
Till, Andreas; Saito, Rintaro; Merkurjev, Daria; Liu, Jing-Jing; Syed, Gulam Hussain; Kolnik, Martin; Siddiqui, Aleem; Glas, Martin; Scheffler, Björn; Ideker, Trey; Subramani, Suresh
2015-01-01
All eukaryotic cells utilize autophagy for protein and organelle turnover, thus assuring subcellular quality control, homeostasis, and survival. In order to address recent advances in identification of human autophagy associated genes, and to describe autophagy on a system-wide level, we established an autophagy-centered gene interaction network by merging various primary data sets and by retrieving respective interaction data. The resulting network (‘AXAN’) was analyzed with respect to subnetworks, e.g. the prime gene subnetwork (including the core machinery, signaling pathways and autophagy receptors) and the transcription subnetwork. To describe aspects of evolution within this network, we assessed the presence of protein orthologs across 99 eukaryotic model organisms. We visualized evolutionary trends for prime gene categories and evolutionary tracks for selected AXAN genes. This analysis confirms the eukaryotic origin of autophagy core genes while it points to a diverse evolutionary history of autophagy receptors. Next, we used module identification to describe the functional anatomy of the network at the level of pathway modules. In addition to obvious pathways (e.g., lysosomal degradation, insulin signaling) our data unveil the existence of context-related modules such as Rho GTPase signaling. Last, we used a tripartite, image-based RNAi – screen to test candidate genes predicted to play a role in regulation of autophagy. We verified the Rho GTPase, CDC42, as a novel regulator of autophagy-related signaling. This study emphasizes the applicability of system-wide approaches to gain novel insights into a complex biological process and to describe the human autophagy pathway at a hitherto unprecedented level of detail. PMID:26103419
Brorsson, C.; Hansen, N. T.; Lage, K.; Bergholdt, R.; Brunak, S.; Pociot, F.
2009-01-01
Aim To develop novel methods for identifying new genes that contribute to the risk of developing type 1 diabetes within the Major Histocompatibility Complex (MHC) region on chromosome 6, independently of the known linkage disequilibrium (LD) between human leucocyte antigen (HLA)-DRB1, -DQA1, -DQB1 genes. Methods We have developed a novel method that combines single nucleotide polymorphism (SNP) genotyping data with protein–protein interaction (ppi) networks to identify disease-associated network modules enriched for proteins encoded from the MHC region. Approximately 2500 SNPs located in the 4 Mb MHC region were analysed in 1000 affected offspring trios generated by the Type 1 Diabetes Genetics Consortium (T1DGC). The most associated SNP in each gene was chosen and genes were mapped to ppi networks for identification of interaction partners. The association testing and resulting interacting protein modules were statistically evaluated using permutation. Results A total of 151 genes could be mapped to nodes within the protein interaction network and their interaction partners were identified. Five protein interaction modules reached statistical significance using this approach. The identified proteins are well known in the pathogenesis of T1D, but the modules also contain additional candidates that have been implicated in β-cell development and diabetic complications. Conclusions The extensive LD within the MHC region makes it important to develop new methods for analysing genotyping data for identification of additional risk genes for T1D. Combining genetic data with knowledge about functional pathways provides new insight into mechanisms underlying T1D. PMID:19143816
Chromosome Gene Orientation Inversion Networks (GOINs) of Plasmodium Proteome.
Quevedo-Tumailli, Viviana F; Ortega-Tenezaca, Bernabé; González-Díaz, Humbert
2018-03-02
The spatial distribution of genes in chromosomes seems not to be random. For instance, only 10% of genes are transcribed from bidirectional promoters in humans, and many more are organized into larger clusters. This raises intriguing questions previously asked by different authors. We would like to add a few more questions in this context, related to gene orientation inversions. Does gene orientation (inversion) follow a random pattern? Is it relevant to biological activity somehow? We define a new kind of network coined as the gene orientation inversion network (GOIN). GOIN's complex network encodes short- and long-range patterns of inversion of the orientation of pairs of gene in the chromosome. We selected Plasmodium falciparum as a case of study due to the high relevance of this parasite to public health (causal agent of malaria). We constructed here for the first time all of the GOINs for the genome of this parasite. These networks have an average of 383 nodes (genes in one chromosome) and 1314 links (pairs of gene with inverse orientation). We calculated node centralities and other parameters of these networks. These numerical parameters were used to study different properties of gene inversion patterns, for example, distribution, local communities, similarity to Erdös-Rényi random networks, randomness, and so on. We find clues that seem to indicate that gene orientation inversion does not follow a random pattern. We noted that some gene communities in the GOINs tend to group genes encoding for RIFIN-related proteins in the proteome of the parasite. RIFIN-like proteins are a second family of clonally variant proteins expressed on the surface of red cells infected with Plasmodium falciparum. Consequently, we used these centralities as input of machine learning (ML) models to predict the RIFIN-like activity of 5365 proteins in the proteome of Plasmodium sp. The best linear ML model found discriminates RIFIN-like from other proteins with sensitivity and specificity 70-80% in training and external validation series. All of these results may point to a possible biological relevance of gene orientation inversion not directly dependent on genetic sequence information. This work opens the gate to the use of GOINs as a tool for the study of the structure of chromosomes and the study of protein function in proteome research.
Woznica, Arielle; Haeussler, Maximilian; Starobinska, Ella; Jemmett, Jessica; Li, Younan; Mount, David; Davidson, Brad
2012-08-01
The complex, partially redundant gene regulatory architecture underlying vertebrate heart formation has been difficult to characterize. Here, we dissect the primary cardiac gene regulatory network in the invertebrate chordate, Ciona intestinalis. The Ciona heart progenitor lineage is first specified by Fibroblast Growth Factor/Map Kinase (FGF/MapK) activation of the transcription factor Ets1/2 (Ets). Through microarray analysis of sorted heart progenitor cells, we identified the complete set of primary genes upregulated by FGF/Ets shortly after heart progenitor emergence. Combinatorial sequence analysis of these co-regulated genes generated a hypothetical regulatory code consisting of Ets binding sites associated with a specific co-motif, ATTA. Through extensive reporter analysis, we confirmed the functional importance of the ATTA co-motif in primary heart progenitor gene regulation. We then used the Ets/ATTA combination motif to successfully predict a number of additional heart progenitor gene regulatory elements, including an intronic element driving expression of the core conserved cardiac transcription factor, GATAa. This work significantly advances our understanding of the Ciona heart gene network. Furthermore, this work has begun to elucidate the precise regulatory architecture underlying the conserved, primary role of FGF/Ets in chordate heart lineage specification. Copyright © 2012 Elsevier Inc. All rights reserved.
Detecting gene subnetworks under selection in biological pathways.
Gouy, Alexandre; Daub, Joséphine T; Excoffier, Laurent
2017-09-19
Advances in high throughput sequencing technologies have created a gap between data production and functional data analysis. Indeed, phenotypes result from interactions between numerous genes, but traditional methods treat loci independently, missing important knowledge brought by network-level emerging properties. Therefore, detecting selection acting on multiple genes affecting the evolution of complex traits remains challenging. In this context, gene network analysis provides a powerful framework to study the evolution of adaptive traits and facilitates the interpretation of genome-wide data. We developed a method to analyse gene networks that is suitable to evidence polygenic selection. The general idea is to search biological pathways for subnetworks of genes that directly interact with each other and that present unusual evolutionary features. Subnetwork search is a typical combinatorial optimization problem that we solve using a simulated annealing approach. We have applied our methodology to find signals of adaptation to high-altitude in human populations. We show that this adaptation has a clear polygenic basis and is influenced by many genetic components. Our approach, implemented in the R package signet, improves on gene-level classical tests for selection by identifying both new candidate genes and new biological processes involved in adaptation to altitude. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Williamson, Cait M.; Franks, Becca; Curley, James P.
2016-01-01
Laboratory studies of social behavior have typically focused on dyadic interactions occurring within a limited spatiotemporal context. However, this strategy prevents analyses of the dynamics of group social behavior and constrains identification of the biological pathways mediating individual differences in behavior. In the current study, we aimed to identify the spatiotemporal dynamics and hierarchical organization of a large social network of male mice. We also sought to determine if standard assays of social and exploratory behavior are predictive of social behavior in this social network and whether individual network position was associated with the mRNA expression of two plasticity-related genes, DNA methyltransferase 1 and 3a. Mice were observed to form a hierarchically organized social network and self-organized into two separate social network communities. Members of both communities exhibited distinct patterns of socio-spatial organization within the vivaria that was not limited to only agonistic interactions. We further established that exploratory and social behaviors in standard behavioral assays conducted prior to placing the mice into the large group was predictive of initial network position and behavior but were not associated with final social network position. Finally, we determined that social network position is associated with variation in mRNA levels of two neural plasticity genes, DNMT1 and DNMT3a, in the hippocampus but not the mPOA. This work demonstrates the importance of understanding the role of social context and complex social dynamics in determining the relationship between individual differences in social behavior and brain gene expression. PMID:27540359
2013-01-01
Background Differential gene expression (DGE) analysis is commonly used to reveal the deregulated molecular mechanisms of complex diseases. However, traditional DGE analysis (e.g., the t test or the rank sum test) tests each gene independently without considering interactions between them. Top-ranked differentially regulated genes prioritized by the analysis may not directly relate to the coherent molecular changes underlying complex diseases. Joint analyses of co-expression and DGE have been applied to reveal the deregulated molecular modules underlying complex diseases. Most of these methods consist of separate steps: first to identify gene-gene relationships under the studied phenotype then to integrate them with gene expression changes for prioritizing signature genes, or vice versa. It is warrant a method that can simultaneously consider gene-gene co-expression strength and corresponding expression level changes so that both types of information can be leveraged optimally. Results In this paper, we develop a gene module based method for differential gene expression analysis, named network-based differential gene expression (nDGE) analysis, a one-step integrative process for prioritizing deregulated genes and grouping them into gene modules. We demonstrate that nDGE outperforms existing methods in prioritizing deregulated genes and discovering deregulated gene modules using simulated data sets. When tested on a series of smoker and non-smoker lung adenocarcinoma data sets, we show that top differentially regulated genes identified by the rank sum test in different sets are not consistent while top ranked genes defined by nDGE in different data sets significantly overlap. nDGE results suggest that a differentially regulated gene module, which is enriched for cell cycle related genes and E2F1 targeted genes, plays a role in the molecular differences between smoker and non-smoker lung adenocarcinoma. Conclusions In this paper, we develop nDGE to prioritize deregulated genes and group them into gene modules by simultaneously considering gene expression level changes and gene-gene co-regulations. When applied to both simulated and empirical data, nDGE outperforms the traditional DGE method. More specifically, when applied to smoker and non-smoker lung cancer sets, nDGE results illustrate the molecular differences between smoker and non-smoker lung cancer. PMID:24341432
Li, Yongxin; Kikuchi, Mani; Li, Xueyan; Gao, Qionghua; Xiong, Zijun; Ren, Yandong; Zhao, Ruoping; Mao, Bingyu; Kondo, Mariko; Irie, Naoki; Wang, Wen
2018-01-01
Sea cucumbers, one main class of Echinoderms, have a very fast and drastic metamorphosis process during their development. However, the molecular basis under this process remains largely unknown. Here we systematically examined the gene expression profiles of Japanese common sea cucumber (Apostichopus japonicus) for the first time by RNA sequencing across 16 developmental time points from fertilized egg to juvenile stage. Based on the weighted gene co-expression network analysis (WGCNA), we identified 21 modules. Among them, MEdarkmagenta was highly expressed and correlated with the early metamorphosis process from late auricularia to doliolaria larva. Furthermore, gene enrichment and differentially expressed gene analysis identified several genes in the module that may play key roles in the metamorphosis process. Our results not only provide a molecular basis for experimentally studying the development and morphological complexity of sea cucumber, but also lay a foundation for improving its emergence rate. Copyright © 2017 Elsevier Inc. All rights reserved.
A link prediction method for heterogeneous networks based on BP neural network
NASA Astrophysics Data System (ADS)
Li, Ji-chao; Zhao, Dan-ling; Ge, Bing-Feng; Yang, Ke-Wei; Chen, Ying-Wu
2018-04-01
Most real-world systems, composed of different types of objects connected via many interconnections, can be abstracted as various complex heterogeneous networks. Link prediction for heterogeneous networks is of great significance for mining missing links and reconfiguring networks according to observed information, with considerable applications in, for example, friend and location recommendations and disease-gene candidate detection. In this paper, we put forward a novel integrated framework, called MPBP (Meta-Path feature-based BP neural network model), to predict multiple types of links for heterogeneous networks. More specifically, the concept of meta-path is introduced, followed by the extraction of meta-path features for heterogeneous networks. Next, based on the extracted meta-path features, a supervised link prediction model is built with a three-layer BP neural network. Then, the solution algorithm of the proposed link prediction model is put forward to obtain predicted results by iteratively training the network. Last, numerical experiments on the dataset of examples of a gene-disease network and a combat network are conducted to verify the effectiveness and feasibility of the proposed MPBP. It shows that the MPBP with very good performance is superior to the baseline methods.
Lung evolution as a cipher for physiology
Torday, J. S.; Rehan, V. K.
2009-01-01
In the postgenomic era, we need an algorithm to readily translate genes into physiologic principles. The failure to advance biomedicine is due to the false hope raised in the wake of the Human Genome Project (HGP) by the promise of systems biology as a ready means of reconstructing physiology from genes. like the atom in physics, the cell, not the gene, is the smallest completely functional unit of biology. Trying to reassemble gene regulatory networks without accounting for this fundamental feature of evolution will result in a genomic atlas, but not an algorithm for functional genomics. For example, the evolution of the lung can be “deconvoluted” by applying cell-cell communication mechanisms to all aspects of lung biology development, homeostasis, and regeneration/repair. Gene regulatory networks common to these processes predict ontogeny, phylogeny, and the disease-related consequences of failed signaling. This algorithm elucidates characteristics of vertebrate physiology as a cascade of emergent and contingent cellular adaptational responses. By reducing complex physiological traits to gene regulatory networks and arranging them hierarchically in a self-organizing map, like the periodic table of elements in physics, the first principles of physiology will emerge. PMID:19366785
NASA Astrophysics Data System (ADS)
Palla, Gergely; Derenyi, Imre; Farkas, Illes J.; Vicsek, Tamas
2006-03-01
Most tasks in a cell are performed not by individual proteins, but by functional groups of proteins (either physically interacting with each other or associated in other ways). In gene (protein) association networks these groups show up as sets of densely connected nodes. In the yeast, Saccharomyces cerevisiae, known physically interacting groups of proteins (called protein complexes) strongly overlap: the total number of proteins contained by these complexes by far underestimates the sum of their sizes (2750 vs. 8932). Thus, most functional groups of proteins, both physically interacting and other, are likely to share many of their members with other groups. However, current algorithms searching for dense groups of nodes in networks usually exclude overlaps. With the aim to discover both novel functions of individual proteins and novel protein functional groups we combine in protein association networks (i) a search for overlapping dense subgraphs based on the Clique Percolation Method (CPM) (Palla, G., et.al. Nature 435, 814-818 (2005), http://angel.elte.hu/clustering), which explicitly allows for overlaps among the groups, and (ii) a verification and characterization of the identified groups of nodes (proteins) with the help of standard annotation databases listing known functions.
Singh, Ajeet Pratap; Archer, Trevor K.
2014-01-01
The regulatory networks of differentiation programs and the molecular mechanisms of lineage-specific gene regulation in mammalian embryos remain only partially defined. We document differential expression and temporal switching of BRG1-associated factor (BAF) subunits, core pluripotency factors and cardiac-specific genes during post-implantation development and subsequent early organogenesis. Using affinity purification of BRG1 ATPase coupled to mass spectrometry, we characterized the cardiac-enriched remodeling complexes present in E8.5 mouse embryos. The relative abundance and combinatorial assembly of the BAF subunits provides functional specificity to Switch/Sucrose NonFermentable (SWI/SNF) complexes resulting in a unique gene expression profile in the developing heart. Remarkably, the specific depletion of the BAF250a subunit demonstrated differential effects on cardiac-specific gene expression and resulted in arrhythmic contracting cardiomyocytes in vitro. Indeed, the BAF250a physically interacts and functionally cooperates with Nucleosome Remodeling and Histone Deacetylase (NURD) complex subunits to repressively regulate chromatin structure of the cardiac genes by switching open and poised chromatin marks associated with active and repressed gene expression. Finally, BAF250a expression modulates BRG1 occupancy at the loci of cardiac genes regulatory regions in P19 cell differentiation. These findings reveal specialized and novel cardiac-enriched SWI/SNF chromatin-remodeling complexes, which are required for heart formation and critical for cardiac gene expression regulation at the early stages of heart development. PMID:24335282
RRW: repeated random walks on genome-scale protein networks for local cluster discovery
Macropol, Kathy; Can, Tolga; Singh, Ambuj K
2009-01-01
Background We propose an efficient and biologically sensitive algorithm based on repeated random walks (RRW) for discovering functional modules, e.g., complexes and pathways, within large-scale protein networks. Compared to existing cluster identification techniques, RRW implicitly makes use of network topology, edge weights, and long range interactions between proteins. Results We apply the proposed technique on a functional network of yeast genes and accurately identify statistically significant clusters of proteins. We validate the biological significance of the results using known complexes in the MIPS complex catalogue database and well-characterized biological processes. We find that 90% of the created clusters have the majority of their catalogued proteins belonging to the same MIPS complex, and about 80% have the majority of their proteins involved in the same biological process. We compare our method to various other clustering techniques, such as the Markov Clustering Algorithm (MCL), and find a significant improvement in the RRW clusters' precision and accuracy values. Conclusion RRW, which is a technique that exploits the topology of the network, is more precise and robust in finding local clusters. In addition, it has the added flexibility of being able to find multi-functional proteins by allowing overlapping clusters. PMID:19740439
Functional networks inference from rule-based machine learning models.
Lazzarini, Nicola; Widera, Paweł; Williamson, Stuart; Heer, Rakesh; Krasnogor, Natalio; Bacardit, Jaume
2016-01-01
Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables, that often are different/complementary to the similarity-based methods. We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes, that genes used together within a rule-based machine learning model to classify the samples, might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view. We analyse the enriched biological terms in the set of network nodes and the relationships between known disease-associated genes in a context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods and demonstrates its potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process. The implementation of our network inference protocol is available at: http://ico2s.org/software/funel.html.
NASA Astrophysics Data System (ADS)
Christensen, Claire Petra
Across diverse fields ranging from physics to biology, sociology, and economics, the technological advances of the past decade have engendered an unprecedented explosion of data on highly complex systems with thousands, if not millions of interacting components. These systems exist at many scales of size and complexity, and it is becoming ever-more apparent that they are, in fact, universal, arising in every field of study. Moreover, they share fundamental properties---chief among these, that the individual interactions of their constituent parts may be well-understood, but the characteristic behaviour produced by the confluence of these interactions---by these complex networks---is unpredictable; in a nutshell, the whole is more than the sum of its parts. There is, perhaps, no better illustration of this concept than the discoveries being made regarding complex networks in the biological sciences. In particular, though the sequencing of the human genome in 2003 was a remarkable feat, scientists understand that the "cellular-level blueprints" for the human being are cellular-level parts lists, but they say nothing (explicitly) about cellular-level processes. The challenge of modern molecular biology is to understand these processes in terms of the networks of parts---in terms of the interactions among proteins, enzymes, genes, and metabolites---as it is these processes that ultimately differentiate animate from inanimate, giving rise to life! It is the goal of systems biology---an umbrella field encapsulating everything from molecular biology to epidemiology in social systems---to understand processes in terms of fundamental networks of core biological parts, be they proteins or people. By virtue of the fact that there are literally countless complex systems, not to mention tools and techniques used to infer, simulate, analyze, and model these systems, it is impossible to give a truly comprehensive account of the history and study of complex systems. The author's own publications have contributed network inference, simulation, modeling, and analysis methods to the much larger body of work in systems biology, and indeed, in network science. The aim of this thesis is therefore twofold: to present this original work in the historical context of network science, but also to provide sufficient review and reference regarding complex systems (with an emphasis on complex networks in systems biology) and tools and techniques for their inference, simulation, analysis, and modeling, such that the reader will be comfortable in seeking out further information on the subject. The review-like Chapters 1, 2, and 4 are intended to convey the co-evolution of network science and the slow but noticeable breakdown of boundaries between disciplines in academia as research and comparison of diverse systems has brought to light the shared properties of these systems. It is the author's hope that theses chapters impart some sense of the remarkable and rapid progress in complex systems research that has led to this unprecedented academic synergy. Chapters 3 and 5 detail the author's original work in the context of complex systems research. Chapter 3 presents the methods and results of a two-stage modeling process that generates candidate gene-regulatory networks of the bacterium B.subtilis from experimentally obtained, yet mathematically underdetermined microchip array data. These networks are then analyzed from a graph theoretical perspective, and their biological viability is critiqued by comparing the networks' graph theoretical properties to those of other biological systems. The results of topological perturbation analyses revealing commonalities in behavior at multiple levels of complexity are also presented, and are shown to be an invaluable means by which to ascertain the level of complexity to which the network inference process is robust to noise. Chapter 5 outlines a learning algorithm for the development of a realistic, evolving social network (a city) into which a disease is introduced. The results of simulations in populations spanning two orders of magnitude are compared to prevaccine era measles data for England and Wales and demonstrate that the simulations are able to capture the quantitative and qualitative features of epidemics in populations as small as 10,000 people. The work presented in Chapter 5 validates the utility of network simulation in concurrently probing contact network dynamics and disease dynamics.
The common transcriptional subnetworks of the grape berry skin in the late stages of ripening.
Ghan, Ryan; Petereit, Juli; Tillett, Richard L; Schlauch, Karen A; Toubiana, David; Fait, Aaron; Cramer, Grant R
2017-05-30
Wine grapes are important economically in many countries around the world. Defining the optimum time for grape harvest is a major challenge to the grower and winemaker. Berry skins are an important source of flavor, color and other quality traits in the ripening stage. Senescent-like processes such as chloroplast disorganization and cell death characterize the late ripening stage. To better understand the molecular and physiological processes involved in the late stages of berry ripening, RNA-seq analysis of the skins of seven wine grape cultivars (Cabernet Franc, Cabernet Sauvignon, Merlot, Pinot Noir, Chardonnay, Sauvignon Blanc and Semillon) was performed. RNA-seq analysis identified approximately 2000 common differentially expressed genes for all seven cultivars across four different berry sugar levels (20 to 26 °Brix). Network analyses, both a posteriori (standard) and a priori (gene co-expression network analysis), were used to elucidate transcriptional subnetworks and hub genes associated with traits in the berry skins of the late stages of berry ripening. These independent approaches revealed genes involved in photosynthesis, catabolism, and nucleotide metabolism. The transcript abundance of most photosynthetic genes declined with increasing sugar levels in the berries. The transcript abundance of other processes increased such as nucleic acid metabolism, chromosome organization and lipid catabolism. Weighted gene co-expression network analysis (WGCNA) identified 64 gene modules that were organized into 12 subnetworks of three modules or more and six higher order gene subnetworks. Some gene subnetworks were highly correlated with sugar levels and some subnetworks were highly enriched in the chloroplast and nucleus. The petal R package was utilized independently to construct a true small-world and scale-free complex gene co-expression network model. A subnetwork of 216 genes with the highest connectivity was elucidated, consistent with the module results from WGCNA. Hub genes in these subnetworks were identified including numerous members of the core circadian clock, RNA splicing, proteolysis and chromosome organization. An integrated model was constructed linking light sensing with alternative splicing, chromosome remodeling and the circadian clock. A common set of differentially expressed genes and gene subnetworks from seven different cultivars were examined in the skin of the late stages of grapevine berry ripening. A densely connected gene subnetwork was elucidated involving a complex interaction of berry senescent processes (autophagy), catabolism, the circadian clock, RNA splicing, proteolysis and epigenetic regulation. Hypotheses were induced from these data sets involving sugar accumulation, light, autophagy, epigenetic regulation, and fruit development. This work provides a better understanding of berry development and the transcriptional processes involved in the late stages of ripening.
From pull-down data to protein interaction networks and complexes with biological relevance.
Zhang, Bing; Park, Byung-Hoon; Karpinets, Tatiana; Samatova, Nagiza F
2008-04-01
Recent improvements in high-throughput Mass Spectrometry (MS) technology have expedited genome-wide discovery of protein-protein interactions by providing a capability of detecting protein complexes in a physiological setting. Computational inference of protein interaction networks and protein complexes from MS data are challenging. Advances are required in developing robust and seamlessly integrated procedures for assessment of protein-protein interaction affinities, mathematical representation of protein interaction networks, discovery of protein complexes and evaluation of their biological relevance. A multi-step but easy-to-follow framework for identifying protein complexes from MS pull-down data is introduced. It assesses interaction affinity between two proteins based on similarity of their co-purification patterns derived from MS data. It constructs a protein interaction network by adopting a knowledge-guided threshold selection method. Based on the network, it identifies protein complexes and infers their core components using a graph-theoretical approach. It deploys a statistical evaluation procedure to assess biological relevance of each found complex. On Saccharomyces cerevisiae pull-down data, the framework outperformed other more complicated schemes by at least 10% in F(1)-measure and identified 610 protein complexes with high-functional homogeneity based on the enrichment in Gene Ontology (GO) annotation. Manual examination of the complexes brought forward the hypotheses on cause of false identifications. Namely, co-purification of different protein complexes as mediated by a common non-protein molecule, such as DNA, might be a source of false positives. Protein identification bias in pull-down technology, such as the hydrophilic bias could result in false negatives.
Vermeirssen, Vanessa; De Clercq, Inge; Van Parys, Thomas; Van Breusegem, Frank; Van de Peer, Yves
2014-01-01
The abiotic stress response in plants is complex and tightly controlled by gene regulation. We present an abiotic stress gene regulatory network of 200,014 interactions for 11,938 target genes by integrating four complementary reverse-engineering solutions through average rank aggregation on an Arabidopsis thaliana microarray expression compendium. This ensemble performed the most robustly in benchmarking and greatly expands upon the availability of interactions currently reported. Besides recovering 1182 known regulatory interactions, cis-regulatory motifs and coherent functionalities of target genes corresponded with the predicted transcription factors. We provide a valuable resource of 572 abiotic stress modules of coregulated genes with functional and regulatory information, from which we deduced functional relationships for 1966 uncharacterized genes and many regulators. Using gain- and loss-of-function mutants of seven transcription factors grown under control and salt stress conditions, we experimentally validated 141 out of 271 predictions (52% precision) for 102 selected genes and mapped 148 additional transcription factor-gene regulatory interactions (49% recall). We identified an intricate core oxidative stress regulatory network where NAC13, NAC053, ERF6, WRKY6, and NAC032 transcription factors interconnect and function in detoxification. Our work shows that ensemble reverse-engineering can generate robust biological hypotheses of gene regulation in a multicellular eukaryote that can be tested by medium-throughput experimental validation. PMID:25549671
Vermeirssen, Vanessa; De Clercq, Inge; Van Parys, Thomas; Van Breusegem, Frank; Van de Peer, Yves
2014-12-01
The abiotic stress response in plants is complex and tightly controlled by gene regulation. We present an abiotic stress gene regulatory network of 200,014 interactions for 11,938 target genes by integrating four complementary reverse-engineering solutions through average rank aggregation on an Arabidopsis thaliana microarray expression compendium. This ensemble performed the most robustly in benchmarking and greatly expands upon the availability of interactions currently reported. Besides recovering 1182 known regulatory interactions, cis-regulatory motifs and coherent functionalities of target genes corresponded with the predicted transcription factors. We provide a valuable resource of 572 abiotic stress modules of coregulated genes with functional and regulatory information, from which we deduced functional relationships for 1966 uncharacterized genes and many regulators. Using gain- and loss-of-function mutants of seven transcription factors grown under control and salt stress conditions, we experimentally validated 141 out of 271 predictions (52% precision) for 102 selected genes and mapped 148 additional transcription factor-gene regulatory interactions (49% recall). We identified an intricate core oxidative stress regulatory network where NAC13, NAC053, ERF6, WRKY6, and NAC032 transcription factors interconnect and function in detoxification. Our work shows that ensemble reverse-engineering can generate robust biological hypotheses of gene regulation in a multicellular eukaryote that can be tested by medium-throughput experimental validation. © 2014 American Society of Plant Biologists. All rights reserved.
NASA Technical Reports Server (NTRS)
Kim, Myoung K.; Jeon, Jae-Heung; Davin, Laurence B.; Lewis, Norman G.
2002-01-01
The discovery of a nine-member multigene dirigent family involved in control of monolignol radical-radical coupling in the ancient gymnosperm, western red cedar, suggested that a complex multidimensional network had evolved to regulate such processes in vascular plants. Accordingly, in this study, the corresponding promoter regions for each dirigent multigene member were obtained by genome-walking, with Arabidopsis being subsequently transformed to express each promoter fused to the beta-glucuronidase (GUS) reporter gene. It was found that each component gene of the proposed network is apparently differentially expressed in individual tissues, organs and cells at all stages of plant growth and development. The data so obtained thus further support the hypothesis that a sophisticated monolignol radical-radical coupling network exists in plants which has been highly conserved throughout vascular plant evolution.
A transcription factor hierarchy defines an environmental stress response network.
Song, Liang; Huang, Shao-Shan Carol; Wise, Aaron; Castanon, Rosa; Nery, Joseph R; Chen, Huaming; Watanabe, Marina; Thomas, Jerushah; Bar-Joseph, Ziv; Ecker, Joseph R
2016-11-04
Environmental stresses are universally encountered by microbes, plants, and animals. Yet systematic studies of stress-responsive transcription factor (TF) networks in multicellular organisms have been limited. The phytohormone abscisic acid (ABA) influences the expression of thousands of genes, allowing us to characterize complex stress-responsive regulatory networks. Using chromatin immunoprecipitation sequencing, we identified genome-wide targets of 21 ABA-related TFs to construct a comprehensive regulatory network in Arabidopsis thaliana Determinants of dynamic TF binding and a hierarchy among TFs were defined, illuminating the relationship between differential gene expression patterns and ABA pathway feedback regulation. By extrapolating regulatory characteristics of observed canonical ABA pathway components, we identified a new family of transcriptional regulators modulating ABA and salt responsiveness and demonstrated their utility to modulate plant resilience to osmotic stress. Copyright © 2016, American Association for the Advancement of Science.
Fung, David C Y; Wilkins, Marc R; Hart, David; Hong, Seok-Hee
2010-07-01
The force-directed layout is commonly used in computer-generated visualizations of protein-protein interaction networks. While it is good for providing a visual outline of the protein complexes and their interactions, it has two limitations when used as a visual analysis method. The first is poor reproducibility. Repeated running of the algorithm does not necessarily generate the same layout, therefore, demanding cognitive readaptation on the investigator's part. The second limitation is that it does not explicitly display complementary biological information, e.g. Gene Ontology, other than the protein names or gene symbols. Here, we present an alternative layout called the clustered circular layout. Using the human DNA replication protein-protein interaction network as a case study, we compared the two network layouts for their merits and limitations in supporting visual analysis.
Extending gene ontology with gene association networks.
Peng, Jiajie; Wang, Tao; Wang, Jixuan; Wang, Yadong; Chen, Jin
2016-04-15
Gene ontology (GO) is a widely used resource to describe the attributes for gene products. However, automatic GO maintenance remains to be difficult because of the complex logical reasoning and the need of biological knowledge that are not explicitly represented in the GO. The existing studies either construct whole GO based on network data or only infer the relations between existing GO terms. None is purposed to add new terms automatically to the existing GO. We proposed a new algorithm 'GOExtender' to efficiently identify all the connected gene pairs labeled by the same parent GO terms. GOExtender is used to predict new GO terms with biological network data, and connect them to the existing GO. Evaluation tests on biological process and cellular component categories of different GO releases showed that GOExtender can extend new GO terms automatically based on the biological network. Furthermore, we applied GOExtender to the recent release of GO and discovered new GO terms with strong support from literature. Software and supplementary document are available at www.msu.edu/%7Ejinchen/GOExtender jinchen@msu.edu or ydwang@hit.edu.cn Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Roy, Sarah H; Tobin, David V; Memar, Nadin; Beltz, Eleanor; Holmen, Jenna; Clayton, Joseph E; Chiu, Daniel J; Young, Laura D; Green, Travis H; Lubin, Isabella; Liu, Yuying; Conradt, Barbara; Saito, R Mako
2014-02-28
The development and homeostasis of multicellular animals requires precise coordination of cell division and differentiation. We performed a genome-wide RNA interference screen in Caenorhabditis elegans to reveal the components of a regulatory network that promotes developmentally programmed cell-cycle quiescence. The 107 identified genes are predicted to constitute regulatory networks that are conserved among higher animals because almost half of the genes are represented by clear human orthologs. Using a series of mutant backgrounds to assess their genetic activities, the RNA interference clones displaying similar properties were clustered to establish potential regulatory relationships within the network. This approach uncovered four distinct genetic pathways controlling cell-cycle entry during intestinal organogenesis. The enhanced phenotypes observed for animals carrying compound mutations attest to the collaboration between distinct mechanisms to ensure strict developmental regulation of cell cycles. Moreover, we characterized ubc-25, a gene encoding an E2 ubiquitin-conjugating enzyme whose human ortholog, UBE2Q2, is deregulated in several cancers. Our genetic analyses suggested that ubc-25 acts in a linear pathway with cul-1/Cul1, in parallel to pathways employing cki-1/p27 and lin-35/pRb to promote cell-cycle quiescence. Further investigation of the potential regulatory mechanism demonstrated that ubc-25 activity negatively regulates CYE-1/cyclin E protein abundance in vivo. Together, our results show that the ubc-25-mediated pathway acts within a complex network that integrates the actions of multiple molecular mechanisms to control cell cycles during development. Copyright © 2014 Roy et al.
Iida, M; Takemoto, K
2018-09-30
Environmental contaminant exposure can pose significant risks to human health. Therefore, evaluating the impact of this exposure is of great importance; however, it is often difficult because both the molecular mechanism of disease and the mode of action of the contaminants are complex. We used network biology techniques to quantitatively assess the impact of environmental contaminants on the human interactome and diseases with a particular focus on seven major contaminant categories: persistent organic pollutants (POPs), dioxins, polycyclic aromatic hydrocarbons (PAHs), pesticides, perfluorochemicals (PFCs), metals, and pharmaceutical and personal care products (PPCPs). We integrated publicly available data on toxicogenomics, the diseasome, protein-protein interactions (PPIs), and gene essentiality and found that a few contaminants were targeted to many genes, and a few genes were targeted by many contaminants. The contaminant targets were hub proteins in the human PPI network, whereas the target proteins in most categories did not contain abundant essential proteins. Generally, contaminant targets and disease-associated proteins were closely associated with the PPI network, and the closeness of the associations depended on the disease type and chemical category. Network biology techniques were used to identify environmental contaminants with broad effects on the human interactome and contaminant-sensitive biomarkers. Moreover, this method enabled us to quantify the relationship between environmental contaminants and human diseases, which was supported by epidemiological and experimental evidence. These methods and findings have facilitated the elucidation of the complex relationship between environmental exposure and adverse health outcomes. Copyright © 2018 Elsevier Inc. All rights reserved.
Genetic interaction networks: better understand to better predict
Boucher, Benjamin; Jenna, Sarah
2013-01-01
A genetic interaction (GI) between two genes generally indicates that the phenotype of a double mutant differs from what is expected from each individual mutant. In the last decade, genome scale studies of quantitative GIs were completed using mainly synthetic genetic array technology and RNA interference in yeast and Caenorhabditis elegans. These studies raised questions regarding the functional interpretation of GIs, the relationship of genetic and molecular interaction networks, the usefulness of GI networks to infer gene function and co-functionality, the evolutionary conservation of GI, etc. While GIs have been used for decades to dissect signaling pathways in genetic models, their functional interpretations are still not trivial. The existence of a GI between two genes does not necessarily imply that these two genes code for interacting proteins or that the two genes are even expressed in the same cell. In fact, a GI only implies that the two genes share a functional relationship. These two genes may be involved in the same biological process or pathway; or they may also be involved in compensatory pathways with unrelated apparent function. Considering the powerful opportunity to better understand gene function, genetic relationship, robustness and evolution, provided by a genome-wide mapping of GIs, several in silico approaches have been employed to predict GIs in unicellular and multicellular organisms. Most of these methods used weighted data integration. In this article, we will review the later knowledge acquired on GI networks in metazoans by looking more closely into their relationship with pathways, biological processes and molecular complexes but also into their modularity and organization. We will also review the different in silico methods developed to predict GIs and will discuss how the knowledge acquired on GI networks can be used to design predictive tools with higher performances. PMID:24381582
A gene network simulator to assess reverse engineering algorithms.
Di Camillo, Barbara; Toffolo, Gianna; Cobelli, Claudio
2009-03-01
In the context of reverse engineering of biological networks, simulators are helpful to test and compare the accuracy of different reverse-engineering approaches in a variety of experimental conditions. A novel gene-network simulator is presented that resembles some of the main features of transcriptional regulatory networks related to topology, interaction among regulators of transcription, and expression dynamics. The simulator generates network topology according to the current knowledge of biological network organization, including scale-free distribution of the connectivity and clustering coefficient independent of the number of nodes in the network. It uses fuzzy logic to represent interactions among the regulators of each gene, integrated with differential equations to generate continuous data, comparable to real data for variety and dynamic complexity. Finally, the simulator accounts for saturation in the response to regulation and transcription activation thresholds and shows robustness to perturbations. It therefore provides a reliable and versatile test bed for reverse engineering algorithms applied to microarray data. Since the simulator describes regulatory interactions and expression dynamics as two distinct, although interconnected aspects of regulation, it can also be used to test reverse engineering approaches that use both microarray and protein-protein interaction data in the process of learning. A first software release is available at http://www.dei.unipd.it/~dicamill/software/netsim as an R programming language package.
Modularity and design principles in the sea urchin embryo gene regulatory network
Peter, Isabelle S.; Davidson, Eric H.
2010-01-01
The gene regulatory network (GRN) established experimentally for the pre-gastrular sea urchin embryo provides causal explanations of the biological functions required for spatial specification of embryonic regulatory states. Here we focus on the structure of the GRN which controls the progressive increase in complexity of territorial regulatory states during embryogenesis; and on the types of modular subcircuits of which the GRN is composed. Each of these subcircuit topologies executes a particular operation of spatial information processing. The GRN architecture reflects the particular mode of embryogenesis represented by sea urchin development. Network structure not only specifies the linkages constituting the genomic regulatory code for development, but also indicates the various regulatory requirements of regional developmental processes. PMID:19932099
Martínez-Romero, Marcos; Vázquez-Naya, José M; Rabuñal, Juan R; Pita-Fernández, Salvador; Macenlle, Ramiro; Castro-Alvariño, Javier; López-Roses, Leopoldo; Ulla, José L; Martínez-Calvo, Antonio V; Vázquez, Santiago; Pereira, Javier; Porto-Pazos, Ana B; Dorado, Julián; Pazos, Alejandro; Munteanu, Cristian R
2010-05-01
Colorectal cancer is one of the most frequent types of cancer in the world and generates important social impact. The understanding of the specific metabolism of this disease and the transformations of the specific drugs will allow finding effective prevention, diagnosis and treatment of the colorectal cancer. All the terms that describe the drug metabolism contribute to the construction of ontology in order to help scientists to link the correlated information and to find the most useful data about this topic. The molecular components involved in this metabolism are included in complex network such as metabolic pathways in order to describe all the molecular interactions in the colorectal cancer. The graphical method of processing biological information such as graphs and complex networks leads to the numerical characterization of the colorectal cancer drug metabolic network by using invariant values named topological indices. Thus, this method can help scientists to study the most important elements in the metabolic pathways and the dynamics of the networks during mutations, denaturation or evolution for any type of disease. This review presents the last studies regarding ontology and complex networks of the colorectal cancer drug metabolism and a basic topology characterization of the drug metabolic process sub-ontology from the Gene Ontology.
Cytoscape: a software environment for integrated models of biomolecular interaction networks.
Shannon, Paul; Markiel, Andrew; Ozier, Owen; Baliga, Nitin S; Wang, Jonathan T; Ramage, Daniel; Amin, Nada; Schwikowski, Benno; Ideker, Trey
2003-11-01
Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.
The genetic basis of alcoholism: multiple phenotypes, many genes, complex networks.
Morozova, Tatiana V; Goldman, David; Mackay, Trudy F C; Anholt, Robert R H
2012-02-20
Alcoholism is a significant public health problem. A picture of the genetic architecture underlying alcohol-related phenotypes is emerging from genome-wide association studies and work on genetically tractable model organisms.
SWIM: a computational tool to unveiling crucial nodes in complex biological networks
Paci, Paola; Colombo, Teresa; Fiscon, Giulia; Gurtner, Aymone; Pavesi, Giulio; Farina, Lorenzo
2017-01-01
SWItchMiner (SWIM) is a wizard-like software implementation of a procedure, previously described, able to extract information contained in complex networks. Specifically, SWIM allows unearthing the existence of a new class of hubs, called “fight-club hubs”, characterized by a marked negative correlation with their first nearest neighbors. Among them, a special subset of genes, called “switch genes”, appears to be characterized by an unusual pattern of intra- and inter-module connections that confers them a crucial topological role, interestingly mirrored by the evidence of their clinic-biological relevance. Here, we applied SWIM to a large panel of cancer datasets from The Cancer Genome Atlas, in order to highlight switch genes that could be critically associated with the drastic changes in the physiological state of cells or tissues induced by the cancer development. We discovered that switch genes are found in all cancers we studied and they encompass protein coding genes and non-coding RNAs, recovering many known key cancer players but also many new potential biomarkers not yet characterized in cancer context. Furthermore, SWIM is amenable to detect switch genes in different organisms and cell conditions, with the potential to uncover important players in biologically relevant scenarios, including but not limited to human cancer. PMID:28317894
Yeap, Wan-Chin; Lee, Fong-Chin; Shabari Shan, Dilip Kumar; Musa, Hamidah; Appleton, David Ross; Kulaveerasingam, Harikrishna
2017-07-01
The oil biosynthesis pathway must be tightly controlled to maximize oil yield. Oil palm accumulates exceptionally high oil content in its mesocarp, suggesting the existence of a unique fruit-specific fatty acid metabolism transcriptional network. We report the complex fruit-specific network of transcription factors responsible for modulation of oil biosynthesis genes in oil palm mesocarp. Transcriptional activation of EgWRI1-1 encoding a key master regulator that activates expression of oil biosynthesis genes, is activated by three ABA-responsive transcription factors, EgNF-YA3, EgNF-YC2 and EgABI5. Overexpression of EgWRI1-1 and its activators in Arabidopsis accelerated flowering, increased seed size and oil content, and altered expression levels of oil biosynthesis genes. Protein-protein interaction experiments demonstrated that EgNF-YA3 interacts directly with EgWRI1-1, forming a transcription complex with EgNF-YC2 and EgABI5 to modulate transcription of oil biosynthesis pathway genes. Furthermore, EgABI5 acts downstream of EgWRKY40, a repressor that interacts with EgWRKY2 to inhibit the transcription of oil biosynthesis genes. We showed that expression of these activators and repressors in oil biosynthesis can be induced by phytohormones coordinating fruit development in oil palm. We propose a model highlighting a hormone signaling network coordinating fruit development and fatty acid biosynthesis. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.
2012-01-01
Background The use of biological molecular network information for diagnostic and prognostic purposes and elucidation of molecular disease mechanism is a key objective in systems biomedicine. The network of regulatory miRNA-target and functional protein interactions is a rich source of information to elucidate the function and the prognostic value of miRNAs in cancer. The objective of this study is to identify miRNAs that have high influence on target protein complexes in prostate cancer as a case study. This could provide biomarkers or therapeutic targets relevant for prostate cancer treatment. Results Our findings demonstrate that a miRNA’s functional role can be explained by its target protein connectivity within a physical and functional interaction network. To detect miRNAs with high influence on target protein modules, we integrated miRNA and mRNA expression profiles with a sequence based miRNA-target network and human functional and physical protein interactions (FPI). miRNAs with high influence on target protein complexes play a role in prostate cancer progression and are promising diagnostic or prognostic biomarkers. We uncovered several miRNA-regulated protein modules which were enriched in focal adhesion and prostate cancer genes. Several miRNAs such as miR-96, miR-182, and miR-143 demonstrated high influence on their target protein complexes and could explain most of the gene expression changes in our analyzed prostate cancer data set. Conclusions We describe a novel method to identify active miRNA-target modules relevant to prostate cancer progression and outcome. miRNAs with high influence on protein networks are valuable biomarkers that can be used in clinical investigations for prostate cancer treatment. PMID:22929553
SOX9 Duplication Linked to Intersex in Deer
Kropatsch, Regina; Dekomien, Gabriele; Akkad, Denis A.; Gerding, Wanda M.; Petrasch-Parwez, Elisabeth; Young, Neil D.; Altmüller, Janine; Nürnberg, Peter; Gasser, Robin B.; Epplen, Jörg T.
2013-01-01
A complex network of genes determines sex in mammals. Here, we studied a European roe deer with an intersex phenotype that was consistent with a XY genotype with incomplete male-determination. Whole genome sequencing and quantitative real-time PCR analyses revealed a triple dose of the SOX9 gene, allowing insights into a new genetic defect in a wild animal. PMID:24040047
Monte Carlo simulation of a simple gene network yields new evolutionary insights.
Andrecut, M; Cloud, D; Kauffman, S A
2008-02-07
Monte Carlo simulations of a genetic toggle switch show that its behavior can be more complex than analytic models would suggest. We show here that as a result of the interplay between frequent and infrequent reaction events, such a switch can have more stable states than an analytic model would predict, and that the number and character of these states depend to a large extent on the propensity of transcription factors to bind to and dissociate from promoters. The effects of gene duplications differ even more; in analytic models, these seem to result in the disappearance of bi-stability and thus a loss of the switching function, but a Monte Carlo simulation shows that they can result in the appearance of new stable states without the loss of old ones, and thus in an increase of the complexity of the switch's behavior which may facilitate the evolution of new cellular functions. These differences are of interest with respect to the evolution of gene networks, particularly in clonal lines of cancer cells, where the duplication of active genes is an extremely common event, and often seems to result in the appearance of viable new cellular phenotypes.
Network Reconstruction Using Nonparametric Additive ODE Models
Henderson, James; Michailidis, George
2014-01-01
Network representations of biological systems are widespread and reconstructing unknown networks from data is a focal problem for computational biologists. For example, the series of biochemical reactions in a metabolic pathway can be represented as a network, with nodes corresponding to metabolites and edges linking reactants to products. In a different context, regulatory relationships among genes are commonly represented as directed networks with edges pointing from influential genes to their targets. Reconstructing such networks from data is a challenging problem receiving much attention in the literature. There is a particular need for approaches tailored to time-series data and not reliant on direct intervention experiments, as the former are often more readily available. In this paper, we introduce an approach to reconstructing directed networks based on dynamic systems models. Our approach generalizes commonly used ODE models based on linear or nonlinear dynamics by extending the functional class for the functions involved from parametric to nonparametric models. Concomitantly we limit the complexity by imposing an additive structure on the estimated slope functions. Thus the submodel associated with each node is a sum of univariate functions. These univariate component functions form the basis for a novel coupling metric that we define in order to quantify the strength of proposed relationships and hence rank potential edges. We show the utility of the method by reconstructing networks using simulated data from computational models for the glycolytic pathway of Lactocaccus Lactis and a gene network regulating the pluripotency of mouse embryonic stem cells. For purposes of comparison, we also assess reconstruction performance using gene networks from the DREAM challenges. We compare our method to those that similarly rely on dynamic systems models and use the results to attempt to disentangle the distinct roles of linearity, sparsity, and derivative estimation. PMID:24732037
Su, Lining; Wang, Chunjie; Zheng, Chenqing; Wei, Huiping; Song, Xiaoqing
2018-04-13
Parkinson's disease (PD) is a long-term degenerative disease that is caused by environmental and genetic factors. The networks of genes and their regulators that control the progression and development of PD require further elucidation. We examine common differentially expressed genes (DEGs) from several PD blood and substantia nigra (SN) microarray datasets by meta-analysis. Further we screen the PD-specific genes from common DEGs using GCBI. Next, we used a series of bioinformatics software to analyze the miRNAs, lncRNAs and SNPs associated with the common PD-specific genes, and then identify the mTF-miRNA-gene-gTF network. Our results identified 36 common DEGs in PD blood studies and 17 common DEGs in PD SN studies, and five of the genes were previously known to be associated with PD. Further study of the regulatory miRNAs associated with the common PD-specific genes revealed 14 PD-specific miRNAs in our study. Analysis of the mTF-miRNA-gene-gTF network about PD-specific genes revealed two feed-forward loops: one involving the SPRK2 gene, hsa-miR-19a-3p and SPI1, and the second involving the SPRK2 gene, hsa-miR-17-3p and SPI. The long non-coding RNA (lncRNA)-mediated regulatory network identified lncRNAs associated with PD-specific genes and PD-specific miRNAs. Moreover, single nucleotide polymorphism (SNP) analysis of the PD-specific genes identified two significant SNPs, and SNP analysis of the neurodegenerative disease-specific genes identified seven significant SNPs. Most of these SNPs are present in the 3'-untranslated region of genes and are controlled by several miRNAs. Our study identified a total of 53 common DEGs in PD patients compared with healthy controls in blood and brain datasets and five of these genes were previously linked with PD. Regulatory network analysis identified PD-specific miRNAs, associated long non-coding RNA and feed-forward loops, which contribute to our understanding of the mechanisms underlying PD. The SNPs identified in our study can determine whether a genetic variant is associated with PD. Overall, these findings will help guide our study of the complex molecular mechanism of PD.
Narimani, Zahra; Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger
2017-01-01
Inferring the structure of molecular networks from time series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity and thus make large scale inference infeasible. This is specifically true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive against several state-of-the-art existing network inference methods.
Graphite Web: web tool for gene set analysis exploiting pathway topology
Sales, Gabriele; Calura, Enrica; Martini, Paolo; Romualdi, Chiara
2013-01-01
Graphite web is a novel web tool for pathway analyses and network visualization for gene expression data of both microarray and RNA-seq experiments. Several pathway analyses have been proposed either in the univariate or in the global and multivariate context to tackle the complexity and the interpretation of expression results. These methods can be further divided into ‘topological’ and ‘non-topological’ methods according to their ability to gain power from pathway topology. Biological pathways are, in fact, not only gene lists but can be represented through a network where genes and connections are, respectively, nodes and edges. To this day, the most used approaches are non-topological and univariate although they miss the relationship among genes. On the contrary, topological and multivariate approaches are more powerful, but difficult to be used by researchers without bioinformatic skills. Here we present Graphite web, the first public web server for pathway analysis on gene expression data that combines topological and multivariate pathway analyses with an efficient system of interactive network visualizations for easy results interpretation. Specifically, Graphite web implements five different gene set analyses on three model organisms and two pathway databases. Graphite Web is freely available at http://graphiteweb.bio.unipd.it/. PMID:23666626
Core regulatory network motif underlies the ocellar complex patterning in Drosophila melanogaster
NASA Astrophysics Data System (ADS)
Aguilar-Hidalgo, D.; Lemos, M. C.; Córdoba, A.
2015-03-01
During organogenesis, developmental programs governed by Gene Regulatory Networks (GRN) define the functionality, size and shape of the different constituents of living organisms. Robustness, thus, is an essential characteristic that GRNs need to fulfill in order to maintain viability and reproducibility in a species. In the present work we analyze the robustness of the patterning for the ocellar complex formation in Drosophila melanogaster fly. We have systematically pruned the GRN that drives the development of this visual system to obtain the minimum pathway able to satisfy this pattern. We found that the mechanism underlying the patterning obeys to the dynamics of a 3-nodes network motif with a double negative feedback loop fed by a morphogenetic gradient that triggers the inhibition in a French flag problem fashion. A Boolean modeling of the GRN confirms robustness in the patterning mechanism showing the same result for different network complexity levels. Interestingly, the network provides a steady state solution in the interocellar part of the patterning and an oscillatory regime in the ocelli. This theoretical result predicts that the ocellar pattern may underlie oscillatory dynamics in its genetic regulation.
Stojanova, Angelina; Tu, William B.; Ponzielli, Romina; Kotlyar, Max; Chan, Pak-Kei; Boutros, Paul C.; Khosravi, Fereshteh; Jurisica, Igor; Raught, Brian; Penn, Linda Z.
2016-01-01
ABSTRACT MYC is a key driver of cellular transformation and is deregulated in most human cancers. Studies of MYC and its interactors have provided mechanistic insight into its role as a regulator of gene transcription. MYC has been previously linked to chromatin regulation through its interaction with INI1 (SMARCB1/hSNF5/BAF47), a core member of the SWI/SNF chromatin remodeling complex. INI1 is a potent tumor suppressor that is inactivated in several types of cancers, most prominently as the hallmark alteration in pediatric malignant rhabdoid tumors. However, the molecular and functional interaction of MYC and INI1 remains unclear. Here, we characterize the MYC-INI1 interaction in mammalian cells, mapping their minimal binding domains to functionally significant regions of MYC (leucine zipper) and INI1 (repeat motifs), and demonstrating that the interaction does not interfere with MYC-MAX interaction. Protein-protein interaction network analysis expands the MYC-INI1 interaction to the SWI/SNF complex and a larger network of chromatin regulatory complexes. Genome-wide analysis reveals that the DNA-binding regions and target genes of INI1 significantly overlap with those of MYC. In an INI1-deficient rhabdoid tumor system, we observe that with re-expression of INI1, MYC and INI1 bind to common target genes and have opposing effects on gene expression. Functionally, INI1 re-expression suppresses cell proliferation and MYC-potentiated transformation. Our findings thus establish the antagonistic roles of the INI1 and MYC transcriptional regulators in mediating cellular and oncogenic functions. PMID:27267444
Mu, Chuang; Wang, Ruijia; Li, Tianqi; Li, Yuqiang; Tian, Meilin; Jiao, Wenqian; Huang, Xiaoting; Zhang, Lingling; Hu, Xiaoli; Wang, Shi; Bao, Zhenmin
2016-08-01
Long non-coding RNA (lncRNA) structurally resembles mRNA but cannot be translated into protein. Although the systematic identification and characterization of lncRNAs have been increasingly reported in model species, information concerning non-model species is still lacking. Here, we report the first systematic identification and characterization of lncRNAs in two sea cucumber species: (1) Apostichopus japonicus during lipopolysaccharide (LPS) challenge and in heathy tissues and (2) Holothuria glaberrima during radial organ complex regeneration, using RNA-seq datasets and bioinformatics analysis. We identified A. japonicus and H. glaberrima lncRNAs that were differentially expressed during LPS challenge and radial organ complex regeneration, respectively. Notably, the predicted lncRNA-microRNA-gene trinities revealed that, in addition to targeting protein-coding transcripts, miRNAs might also target lncRNAs, thereby participating in a potential novel layer of regulatory interactions among non-coding RNA classes in echinoderms. Furthermore, the constructed coding-non-coding network implied the potential involvement of lncRNA-gene interactions during the regulation of several important genes (e.g., Toll-like receptor 1 [TLR1] and transglutaminase-1 [TGM1]) in response to LPS challenge and radial organ complex regeneration in sea cucumbers. Overall, this pioneer systematic identification, annotation, and characterization of lncRNAs in echinoderm pave the way for similar studies and future genetic, genomic, and evolutionary research in non-model species.
Regeneration in the era of functional genomics and gene network analysis.
Smith, Joel; Morgan, Jennifer R; Zottoli, Steven J; Smith, Peter J; Buxbaum, Joseph D; Bloom, Ona E
2011-08-01
What gives an organism the ability to regrow tissues and to recover function where another organism fails is the central problem of regenerative biology. The challenge is to describe the mechanisms of regeneration at the molecular level, delivering detailed insights into the many components that are cross-regulated. In other words, a broad, yet deep dissection of the system-wide network of molecular interactions is needed. Functional genomics has been used to elucidate gene regulatory networks (GRNs) in developing tissues, which, like regeneration, are complex systems. Therefore, we reason that the GRN approach, aided by next generation technologies, can also be applied to study the molecular mechanisms underlying the complex functions of regeneration. We ask what characteristics a model system must have to support a GRN analysis. Our discussion focuses on regeneration in the central nervous system, where loss of function has particularly devastating consequences for an organism. We examine a cohort of cells conserved across all vertebrates, the reticulospinal (RS) neurons, which lend themselves well to experimental manipulations. In the lamprey, a jawless vertebrate, there are giant RS neurons whose large size and ability to regenerate make them particularly suited for a GRN analysis. Adding to their value, a distinct subset of lamprey RS neurons reproducibly fail to regenerate, presenting an opportunity for side-by-side comparison of gene networks that promote or inhibit regeneration. Thus, determining the GRN for regeneration in RS neurons will provide a mechanistic understanding of the fundamental cues that lead to success or failure to regenerate.
Deconstructing the core dynamics from a complex time-lagged regulatory biological circuit.
Eriksson, O; Brinne, B; Zhou, Y; Björkegren, J; Tegnér, J
2009-03-01
Complex regulatory dynamics is ubiquitous in molecular networks composed of genes and proteins. Recent progress in computational biology and its application to molecular data generate a growing number of complex networks. Yet, it has been difficult to understand the governing principles of these networks beyond graphical analysis or extensive numerical simulations. Here the authors exploit several simplifying biological circumstances which thereby enable to directly detect the underlying dynamical regularities driving periodic oscillations in a dynamical nonlinear computational model of a protein-protein network. System analysis is performed using the cell cycle, a mathematically well-described complex regulatory circuit driven by external signals. By introducing an explicit time delay and using a 'tearing-and-zooming' approach the authors reduce the system to a piecewise linear system with two variables that capture the dynamics of this complex network. A key step in the analysis is the identification of functional subsystems by identifying the relations between state-variables within the model. These functional subsystems are referred to as dynamical modules operating as sensitive switches in the original complex model. By using reduced mathematical representations of the subsystems the authors derive explicit conditions on how the cell cycle dynamics depends on system parameters, and can, for the first time, analyse and prove global conditions for system stability. The approach which includes utilising biological simplifying conditions, identification of dynamical modules and mathematical reduction of the model complexity may be applicable to other well-characterised biological regulatory circuits. [Includes supplementary material].
2012-01-01
Background Development and application of transcriptomics-based gene classifiers for ecotoxicological applications lag far behind those of biomedical sciences. Many such classifiers discovered thus far lack vigorous statistical and experimental validations. A combination of genetic algorithm/support vector machines and genetic algorithm/K nearest neighbors was used in this study to search for classifiers of endocrine-disrupting chemicals (EDCs) in zebrafish. Searches were conducted on both tissue-specific and tissue-combined datasets, either across the entire transcriptome or within individual transcription factor (TF) networks previously linked to EDC effects. Candidate classifiers were evaluated by gene set enrichment analysis (GSEA) on both the original training data and a dedicated validation dataset. Results Multi-tissue dataset yielded no classifiers. Among the 19 chemical-tissue conditions evaluated, the transcriptome-wide searches yielded classifiers for six of them, each having approximately 20 to 30 gene features unique to a condition. Searches within individual TF networks produced classifiers for 15 chemical-tissue conditions, each containing 100 or fewer top-ranked gene features pooled from those of multiple TF networks and also unique to each condition. For the training dataset, 10 out of 11 classifiers successfully identified the gene expression profiles (GEPs) of their targeted chemical-tissue conditions by GSEA. For the validation dataset, classifiers for prochloraz-ovary and flutamide-ovary also correctly identified the GEPs of corresponding conditions while no classifier could predict the GEP from prochloraz-brain. Conclusions The discrepancies in the performance of these classifiers were attributed in part to varying data complexity among the conditions, as measured to some degree by Fisher’s discriminant ratio statistic. This variation in data complexity could likely be compensated by adjusting sample size for individual chemical-tissue conditions, thus suggesting a need for a preliminary survey of transcriptomic responses before launching a full scale classifier discovery effort. Classifier discovery based on individual TF networks could yield more mechanistically-oriented biomarkers. GSEA proved to be a flexible and effective tool for application of gene classifiers but a similar and more refined algorithm, connectivity mapping, should also be explored. The distribution characteristics of classifiers across tissues, chemicals, and TF networks suggested a differential biological impact among the EDCs on zebrafish transcriptome involving some basic cellular functions. PMID:22849515
Hybrid stochastic simplifications for multiscale gene networks.
Crudu, Alina; Debussche, Arnaud; Radulescu, Ovidiu
2009-09-07
Stochastic simulation of gene networks by Markov processes has important applications in molecular biology. The complexity of exact simulation algorithms scales with the number of discrete jumps to be performed. Approximate schemes reduce the computational time by reducing the number of simulated discrete events. Also, answering important questions about the relation between network topology and intrinsic noise generation and propagation should be based on general mathematical results. These general results are difficult to obtain for exact models. We propose a unified framework for hybrid simplifications of Markov models of multiscale stochastic gene networks dynamics. We discuss several possible hybrid simplifications, and provide algorithms to obtain them from pure jump processes. In hybrid simplifications, some components are discrete and evolve by jumps, while other components are continuous. Hybrid simplifications are obtained by partial Kramers-Moyal expansion [1-3] which is equivalent to the application of the central limit theorem to a sub-model. By averaging and variable aggregation we drastically reduce simulation time and eliminate non-critical reactions. Hybrid and averaged simplifications can be used for more effective simulation algorithms and for obtaining general design principles relating noise to topology and time scales. The simplified models reproduce with good accuracy the stochastic properties of the gene networks, including waiting times in intermittence phenomena, fluctuation amplitudes and stationary distributions. The methods are illustrated on several gene network examples. Hybrid simplifications can be used for onion-like (multi-layered) approaches to multi-scale biochemical systems, in which various descriptions are used at various scales. Sets of discrete and continuous variables are treated with different methods and are coupled together in a physically justified approach.
Impact of environmental inputs on reverse-engineering approach to network structures.
Wu, Jianhua; Sinfield, James L; Buchanan-Wollaston, Vicky; Feng, Jianfeng
2009-12-04
Uncovering complex network structures from a biological system is one of the main topic in system biology. The network structures can be inferred by the dynamical Bayesian network or Granger causality, but neither techniques have seriously taken into account the impact of environmental inputs. With considerations of natural rhythmic dynamics of biological data, we propose a system biology approach to reveal the impact of environmental inputs on network structures. We first represent the environmental inputs by a harmonic oscillator and combine them with Granger causality to identify environmental inputs and then uncover the causal network structures. We also generalize it to multiple harmonic oscillators to represent various exogenous influences. This system approach is extensively tested with toy models and successfully applied to a real biological network of microarray data of the flowering genes of the model plant Arabidopsis Thaliana. The aim is to identify those genes that are directly affected by the presence of the sunlight and uncover the interactive network structures associating with flowering metabolism. We demonstrate that environmental inputs are crucial for correctly inferring network structures. Harmonic causal method is proved to be a powerful technique to detect environment inputs and uncover network structures, especially when the biological data exhibit periodic oscillations.
Pathway mapping and development of disease-specific biomarkers: protein-based network biomarkers
Chen, Hao; Zhu, Zhitu; Zhu, Yichun; Wang, Jian; Mei, Yunqing; Cheng, Yunfeng
2015-01-01
It is known that a disease is rarely a consequence of an abnormality of a single gene, but reflects the interactions of various processes in a complex network. Annotated molecular networks offer new opportunities to understand diseases within a systems biology framework and provide an excellent substrate for network-based identification of biomarkers. The network biomarkers and dynamic network biomarkers (DNBs) represent new types of biomarkers with protein–protein or gene–gene interactions that can be monitored and evaluated at different stages and time-points during development of disease. Clinical bioinformatics as a new way to combine clinical measurements and signs with human tissue-generated bioinformatics is crucial to translate biomarkers into clinical application, validate the disease specificity, and understand the role of biomarkers in clinical settings. In this article, the recent advances and developments on network biomarkers and DNBs are comprehensively reviewed. How network biomarkers help a better understanding of molecular mechanism of diseases, the advantages and constraints of network biomarkers for clinical application, clinical bioinformatics as a bridge to the development of diseases-specific, stage-specific, severity-specific and therapy predictive biomarkers, and the potentials of network biomarkers are also discussed. PMID:25560835
Ahnert, S E; Fink, T M A
2016-07-01
Network motifs have been studied extensively over the past decade, and certain motifs, such as the feed-forward loop, play an important role in regulatory networks. Recent studies have used Boolean network motifs to explore the link between form and function in gene regulatory networks and have found that the structure of a motif does not strongly determine its function, if this is defined in terms of the gene expression patterns the motif can produce. Here, we offer a different, higher-level definition of the 'function' of a motif, in terms of two fundamental properties of its dynamical state space as a Boolean network. One is the basin entropy, which is a complexity measure of the dynamics of Boolean networks. The other is the diversity of cyclic attractor lengths that a given motif can produce. Using these two measures, we examine all 104 topologically distinct three-node motifs and show that the structural properties of a motif, such as the presence of feedback loops and feed-forward loops, predict fundamental characteristics of its dynamical state space, which in turn determine aspects of its functional versatility. We also show that these higher-level properties have a direct bearing on real regulatory networks, as both basin entropy and cycle length diversity show a close correspondence with the prevalence, in neural and genetic regulatory networks, of the 13 connected motifs without self-interactions that have been studied extensively in the literature. © 2016 The Authors.
Willsey, A. Jeremy; Sanders, Stephan J.; Li, Mingfeng; Dong, Shan; Tebbenkamp, Andrew T.; Muhle, Rebecca A.; Reilly, Steven K.; Lin, Leon; Fertuzinhos, Sofia; Miller, Jeremy A.; Murtha, Michael T.; Bichsel, Candace; Niu, Wei; Cotney, Justin; Ercan-Sencicek, A. Gulhan; Gockley, Jake; Gupta, Abha; Han, Wenqi; He, Xin; Hoffman, Ellen; Klei, Lambertus; Lei, Jing; Liu, Wenzhong; Liu, Li; Lu, Cong; Xu, Xuming; Zhu, Ying; Mane, Shrikant M.; Lein, Edward S.; Wei, Liping; Noonan, James P.; Roeder, Kathryn; Devlin, Bernie; Šestan, Nenad; State, Matthew W.
2013-01-01
SUMMARY Autism spectrum disorder (ASD) is a complex developmental syndrome of unknown etiology. Recent studies employing exome- and genome-wide sequencing have identified nine high-confidence ASD (hcASD) genes. Working from the hypothesis that ASD-associated mutations in these biologically pleiotropic genes will disrupt intersecting developmental processes to contribute to a common phenotype, we have attempted to identify time periods, brain regions, and cell types in which these genes converge. We have constructed coexpression networks based on the hcASD “seed” genes, leveraging a rich expression data set encompassing multiple human brain regions across human development and into adulthood. By assessing enrichment of an independent set of probable ASD (pASD) genes, derived from the same sequencing studies, we demonstrate a key point of convergence in midfetal layer 5/6 cortical projection neurons. This approach informs when, where, and in what cell types mutations in these specific genes may be productively studied to clarify ASD pathophysiology. PMID:24267886
NASA Astrophysics Data System (ADS)
Ehler, Martin; Rajapakse, Vinodh; Zeeberg, Barry; Brooks, Brian; Brown, Jacob; Czaja, Wojciech; Bonner, Robert F.
The gene networks underlying closure of the optic fissure during vertebrate eye development are poorly understood. We used a novel clustering method based on Laplacian Eigenmaps, a nonlinear dimension reduction method, to analyze microarray data from laser capture microdissected (LCM) cells at the site and developmental stages (days 10.5 to 12.5) of optic fissure closure. Our new method provided greater biological specificity than classical clustering algorithms in terms of identifying more biological processes and functions related to eye development as defined by Gene Ontology at lower false discovery rates. This new methodology builds on the advantages of LCM to isolate pure phenotypic populations within complex tissues and allows improved ability to identify critical gene products expressed at lower copy number. The combination of LCM of embryonic organs, gene expression microarrays, and extracting spatial and temporal co-variations appear to be a powerful approach to understanding the gene regulatory networks that specify mammalian organogenesis.
A network approach to predict pathogenic genes for Fusarium graminearum.
Liu, Xiaoping; Tang, Wei-Hua; Zhao, Xing-Ming; Chen, Luonan
2010-10-04
Fusarium graminearum is the pathogenic agent of Fusarium head blight (FHB), which is a destructive disease on wheat and barley, thereby causing huge economic loss and health problems to human by contaminating foods. Identifying pathogenic genes can shed light on pathogenesis underlying the interaction between F. graminearum and its plant host. However, it is difficult to detect pathogenic genes for this destructive pathogen by time-consuming and expensive molecular biological experiments in lab. On the other hand, computational methods provide an alternative way to solve this problem. Since pathogenesis is a complicated procedure that involves complex regulations and interactions, the molecular interaction network of F. graminearum can give clues to potential pathogenic genes. Furthermore, the gene expression data of F. graminearum before and after its invasion into plant host can also provide useful information. In this paper, a novel systems biology approach is presented to predict pathogenic genes of F. graminearum based on molecular interaction network and gene expression data. With a small number of known pathogenic genes as seed genes, a subnetwork that consists of potential pathogenic genes is identified from the protein-protein interaction network (PPIN) of F. graminearum, where the genes in the subnetwork are further required to be differentially expressed before and after the invasion of the pathogenic fungus. Therefore, the candidate genes in the subnetwork are expected to be involved in the same biological processes as seed genes, which imply that they are potential pathogenic genes. The prediction results show that most of the pathogenic genes of F. graminearum are enriched in two important signal transduction pathways, including G protein coupled receptor pathway and MAPK signaling pathway, which are known related to pathogenesis in other fungi. In addition, several pathogenic genes predicted by our method are verified in other pathogenic fungi, which demonstrate the effectiveness of the proposed method. The results presented in this paper not only can provide guidelines for future experimental verification, but also shed light on the pathogenesis of the destructive fungus F. graminearum.
FunCoup 3.0: database of genome-wide functional coupling networks
Schmitt, Thomas; Ogris, Christoph; Sonnhammer, Erik L. L.
2014-01-01
We present an update of the FunCoup database (http://FunCoup.sbc.su.se) of functional couplings, or functional associations, between genes and gene products. Identifying these functional couplings is an important step in the understanding of higher level mechanisms performed by complex cellular processes. FunCoup distinguishes between four classes of couplings: participation in the same signaling cascade, participation in the same metabolic process, co-membership in a protein complex and physical interaction. For each of these four classes, several types of experimental and statistical evidence are combined by Bayesian integration to predict genome-wide functional coupling networks. The FunCoup framework has been completely re-implemented to allow for more frequent future updates. It contains many improvements, such as a regularization procedure to automatically downweight redundant evidences and a novel method to incorporate phylogenetic profile similarity. Several datasets have been updated and new data have been added in FunCoup 3.0. Furthermore, we have developed a new Web site, which provides powerful tools to explore the predicted networks and to retrieve detailed information about the data underlying each prediction. PMID:24185702
FunCoup 3.0: database of genome-wide functional coupling networks.
Schmitt, Thomas; Ogris, Christoph; Sonnhammer, Erik L L
2014-01-01
We present an update of the FunCoup database (http://FunCoup.sbc.su.se) of functional couplings, or functional associations, between genes and gene products. Identifying these functional couplings is an important step in the understanding of higher level mechanisms performed by complex cellular processes. FunCoup distinguishes between four classes of couplings: participation in the same signaling cascade, participation in the same metabolic process, co-membership in a protein complex and physical interaction. For each of these four classes, several types of experimental and statistical evidence are combined by Bayesian integration to predict genome-wide functional coupling networks. The FunCoup framework has been completely re-implemented to allow for more frequent future updates. It contains many improvements, such as a regularization procedure to automatically downweight redundant evidences and a novel method to incorporate phylogenetic profile similarity. Several datasets have been updated and new data have been added in FunCoup 3.0. Furthermore, we have developed a new Web site, which provides powerful tools to explore the predicted networks and to retrieve detailed information about the data underlying each prediction.
The genetic basis of alcoholism: multiple phenotypes, many genes, complex networks
2012-01-01
Alcoholism is a significant public health problem. A picture of the genetic architecture underlying alcohol-related phenotypes is emerging from genome-wide association studies and work on genetically tractable model organisms. PMID:22348705
Genome-wide association and network analysis of lung function in the Framingham Heart Study.
Liao, Shu-Yi; Lin, Xihong; Christiani, David C
2014-09-01
Single nucleotide polymorphisms have been found to be associated with pulmonary function using genome-wide association studies. However, lung function is a complex trait that is likely to be influenced by multiple gene-gene interactions besides individual genes. Our goal is to build a cellular network to explore the relationship between pulmonary function and genotypes by combining SNP level and network analyses using longitudinal lung function data from the Framingham Heart Study. We analyzed 2,698 genotyped participants from the Offspring cohort that had an average of 3.35 spirometry measurements per person for a mean length of 13 years. Repeated forced expiratory volume in one second (FEV1 ) and the ratio of FEV1 to forced vital capacity (FVC) were used as outcomes. Data were analyzed using linear-mixed models for the association between lung function and alleles by accounting for the correlation among repeated measures over time within the same subject and within-family correlation. Network analyses were performed using dmGWAS and validated with data from the Third Generation cohort. Analyses identified SMAD3, TGFBR2, CD44, CTGF, VCAN, CTNNB1, SCGB1A1, PDE4D, NRG1, EPHB1, and LYN as contributors to pulmonary function. Most of these genes were novel that were not found previously using solely SNP-level analysis. These novel genes are involving the transforming growth factor beta (TGFB)-SMAD pathway, Wnt/beta-catenin pathway, etc. Therefore, combining SNP-level and network analyses using longitudinal lung function data is a useful alternative strategy to identify risk genes. © 2014 WILEY PERIODICALS, INC.
Recent Coselection in Human Populations Revealed by Protein–Protein Interaction Network
Qian, Wei; Zhou, Hang; Tang, Kun
2015-01-01
Genome-wide scans for signals of natural selection in human populations have identified a large number of candidate loci that underlie local adaptations. This is surprising given the relatively short evolutionary time since the divergence of the human population. One hypothesis that has not been formally examined is whether and how the recent human evolution may have been shaped by coselection in the context of complex molecular interactome. In this study, genome-wide signals of selection were scanned in East Asians, Europeans, and Africans using 1000 Genome data, and subsequently mapped onto the protein–protein interaction (PPI) network. We found that the candidate genes of recent positive selection localized significantly closer to each other on the PPI network than expected, revealing substantial clustering of selected genes. Furthermore, gene pairs of shorter PPI network distances showed higher similarities of their recent evolutionary paths than those further apart. Last, subnetworks enriched with recent coselection signals were identified, which are substantially overrepresented in biological pathways related to signal transduction, neurogenesis, and immune function. These results provide the first genome-wide evidence for association of recent selection signals with the PPI network, shedding light on the potential mechanisms of recent coselection in the human genome. PMID:25532814
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deng, Ye; Zhang, Ping; Qin, Yujia
When trying to discern network interactions among different species/populations in microbial communities interests have been evoked in recent years, but little information is available about temporal dynamics of microbial network interactions in response to environmental perturbations. We modified the random matrix theory-based network approach to discern network succession in groundwater microbial communities in response to emulsified vegetable oil (EVO) amendment for uranium bioremediation. Groundwater microbial communities from one control and seven monitor wells were analysed with a functional gene array (GeoChip 3.0), and functional molecular ecological networks (fMENs) at different time points were reconstructed. Our results showed that the networkmore » interactions were dramatically altered by EVO amendment. Dynamic and resilient succession was evident: fairly simple at the initial stage (Day 0), increasingly complex at the middle period (Days 4, 17, 31), most complex at Day 80, and then decreasingly complex at a later stage (140–269 days). Unlike previous studies in other habitats, negative interactions predominated in a time-series fMEN, suggesting strong competition among different microbial species in the groundwater systems after EVO injection. In particular, several keystone sulfate-reducing bacteria showed strong negative interactions with their network neighbours. These results provide mechanistic understanding of the decreased phylogenetic diversity during environmental perturbations.« less
Reyes-Gibby, Cielito C; Yuan, Christine; Wang, Jian; Yeung, Sai-Ching J; Shete, Sanjay
2015-06-05
Addictions to alcohol and tobacco, known risk factors for cancer, are complex heritable disorders. Addictive behaviors have a bidirectional relationship with pain. We hypothesize that the associations between alcohol, smoking, and opioid addiction observed in cancer patients have a genetic basis. Therefore, using bioinformatics tools, we explored the underlying genetic basis and identified new candidate genes and common biological pathways for smoking, alcohol, and opioid addiction. Literature search showed 56 genes associated with alcohol, smoking and opioid addiction. Using Core Analysis function in Ingenuity Pathway Analysis software, we found that ERK1/2 was strongly interconnected across all three addiction networks. Genes involved in immune signaling pathways were shown across all three networks. Connect function from IPA My Pathway toolbox showed that DRD2 is the gene common to both the list of genetic variations associated with all three addiction phenotypes and the components of the brain neuronal signaling network involved in substance addiction. The top canonical pathways associated with the 56 genes were: 1) calcium signaling, 2) GPCR signaling, 3) cAMP-mediated signaling, 4) GABA receptor signaling, and 5) G-alpha i signaling. Cancer patients are often prescribed opioids for cancer pain thus increasing their risk for opioid abuse and addiction. Our findings provide candidate genes and biological pathways underlying addiction phenotypes, which may be future targets for treatment of addiction. Further study of the variations of the candidate genes could allow physicians to make more informed decisions when treating cancer pain with opioid analgesics.
A network approach to analyzing highly recombinant malaria parasite genes.
Larremore, Daniel B; Clauset, Aaron; Buckee, Caroline O
2013-01-01
The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences.
A Network Approach to Analyzing Highly Recombinant Malaria Parasite Genes
Larremore, Daniel B.; Clauset, Aaron; Buckee, Caroline O.
2013-01-01
The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences. PMID:24130474
Emergent adaptive behaviour of GRN-controlled simulated robots in a changing environment.
Yao, Yao; Storme, Veronique; Marchal, Kathleen; Van de Peer, Yves
2016-01-01
We developed a bio-inspired robot controller combining an artificial genome with an agent-based control system. The genome encodes a gene regulatory network (GRN) that is switched on by environmental cues and, following the rules of transcriptional regulation, provides output signals to actuators. Whereas the genome represents the full encoding of the transcriptional network, the agent-based system mimics the active regulatory network and signal transduction system also present in naturally occurring biological systems. Using such a design that separates the static from the conditionally active part of the gene regulatory network contributes to a better general adaptive behaviour. Here, we have explored the potential of our platform with respect to the evolution of adaptive behaviour, such as preying when food becomes scarce, in a complex and changing environment and show through simulations of swarm robots in an A-life environment that evolution of collective behaviour likely can be attributed to bio-inspired evolutionary processes acting at different levels, from the gene and the genome to the individual robot and robot population.
Emergent adaptive behaviour of GRN-controlled simulated robots in a changing environment
Yao, Yao; Storme, Veronique; Marchal, Kathleen
2016-01-01
We developed a bio-inspired robot controller combining an artificial genome with an agent-based control system. The genome encodes a gene regulatory network (GRN) that is switched on by environmental cues and, following the rules of transcriptional regulation, provides output signals to actuators. Whereas the genome represents the full encoding of the transcriptional network, the agent-based system mimics the active regulatory network and signal transduction system also present in naturally occurring biological systems. Using such a design that separates the static from the conditionally active part of the gene regulatory network contributes to a better general adaptive behaviour. Here, we have explored the potential of our platform with respect to the evolution of adaptive behaviour, such as preying when food becomes scarce, in a complex and changing environment and show through simulations of swarm robots in an A-life environment that evolution of collective behaviour likely can be attributed to bio-inspired evolutionary processes acting at different levels, from the gene and the genome to the individual robot and robot population. PMID:28028477
2018-01-01
Signaling pathways represent parts of the global biological molecular network which connects them into a seamless whole through complex direct and indirect (hidden) crosstalk whose structure can change during development or in pathological conditions. We suggest a novel methodology, called Googlomics, for the structural analysis of directed biological networks using spectral analysis of their Google matrices, using parallels with quantum scattering theory, developed for nuclear and mesoscopic physics and quantum chaos. We introduce analytical “reduced Google matrix” method for the analysis of biological network structure. The method allows inferring hidden causal relations between the members of a signaling pathway or a functionally related group of genes. We investigate how the structure of hidden causal relations can be reprogrammed as a result of changes in the transcriptional network layer during cancerogenesis. The suggested Googlomics approach rigorously characterizes complex systemic changes in the wiring of large causal biological networks in a computationally efficient way. PMID:29370181
Karim, Ahmad Faisal; Chandra, Pallavi; Chopra, Aanchal; Siddiqui, Zaved; Bhaskar, Ashima; Singh, Amit; Kumar, Dhiraj
2011-11-18
Global gene expression profiling has emerged as a major tool in understanding complex response patterns of biological systems to perturbations. However, a lack of unbiased analytical approaches has restricted the utility of complex microarray data to gain novel system level insights. Here we report a strategy, express path analysis (EPA), that helps to establish various pathways differentially recruited to achieve specific cellular responses under contrasting environmental conditions in an unbiased manner. The analysis superimposes differentially regulated genes between contrasting environments onto the network of functional protein associations followed by a series of iterative enrichments and network analysis. To test the utility of the approach, we infected THP1 macrophage cells with a virulent Mycobacterium tuberculosis strain (H37Rv) or the attenuated non-virulent strain H37Ra as contrasting perturbations and generated the temporal global expression profiles. EPA of the results provided details of response-specific and time-dependent host molecular network perturbations. Further analysis identified tyrosine kinase Src as the major regulatory hub discriminating the responses between wild-type and attenuated Mtb infection. We were then able to verify this novel role of Src experimentally and show that Src executes its role through regulating two vital antimicrobial processes of the host cells (i.e. autophagy and acidification of phagolysosome). These results bear significant potential for developing novel anti-tuberculosis therapy. We propose that EPA could prove extremely useful in understanding complex cellular responses for a variety of perturbations, including pathogenic infections.
Feltus, F Alex
2014-06-01
Understanding the control of any trait optimally requires the detection of causal genes, gene interaction, and mechanism of action to discover and model the biochemical pathways underlying the expressed phenotype. Functional genomics techniques, including RNA expression profiling via microarray and high-throughput DNA sequencing, allow for the precise genome localization of biological information. Powerful genetic approaches, including quantitative trait locus (QTL) and genome-wide association study mapping, link phenotype with genome positions, yet genetics is less precise in localizing the relevant mechanistic information encoded in DNA. The coupling of salient functional genomic signals with genetically mapped positions is an appealing approach to discover meaningful gene-phenotype relationships. Techniques used to define this genetic-genomic convergence comprise the field of systems genetics. This short review will address an application of systems genetics where RNA profiles are associated with genetically mapped genome positions of individual genes (eQTL mapping) or as gene sets (co-expression network modules). Both approaches can be applied for knowledge independent selection of candidate genes (and possible control mechanisms) underlying complex traits where multiple, likely unlinked, genomic regions might control specific complex traits. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Evidence for Transcript Networks Composed of Chimeric RNAs in Human Cells
Borel, Christelle; Mudge, Jonathan M.; Howald, Cédric; Foissac, Sylvain; Ucla, Catherine; Chrast, Jacqueline; Ribeca, Paolo; Martin, David; Murray, Ryan R.; Yang, Xinping; Ghamsari, Lila; Lin, Chenwei; Bell, Ian; Dumais, Erica; Drenkow, Jorg; Tress, Michael L.; Gelpí, Josep Lluís; Orozco, Modesto; Valencia, Alfonso; van Berkum, Nynke L.; Lajoie, Bryan R.; Vidal, Marc; Stamatoyannopoulos, John; Batut, Philippe; Dobin, Alex; Harrow, Jennifer; Hubbard, Tim; Dekker, Job; Frankish, Adam; Salehi-Ashtiani, Kourosh; Reymond, Alexandre; Antonarakis, Stylianos E.; Guigó, Roderic; Gingeras, Thomas R.
2012-01-01
The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in yeast to human has blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggest that chimeric transcripts should not be studied in isolation, but together, as an RNA network. PMID:22238572
Analysis of gene expression profile microarray data in complex regional pain syndrome.
Tan, Wulin; Song, Yiyan; Mo, Chengqiang; Jiang, Shuangjian; Wang, Zhongxing
2017-09-01
The aim of the present study was to predict key genes and proteins associated with complex regional pain syndrome (CRPS) using bioinformatics analysis. The gene expression profiling microarray data, GSE47603, which included peripheral blood samples from 4 patients with CRPS and 5 healthy controls, was obtained from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) in CRPS patients compared with healthy controls were identified using the GEO2R online tool. Functional enrichment analysis was then performed using The Database for Annotation Visualization and Integrated Discovery online tool. Protein‑protein interaction (PPI) network analysis was subsequently performed using Search Tool for the Retrieval of Interaction Genes database and analyzed with Cytoscape software. A total of 257 DEGs were identified, including 243 upregulated genes and 14 downregulated ones. Genes in the human leukocyte antigen (HLA) family were most significantly differentially expressed. Enrichment analysis demonstrated that signaling pathways, including immune response, cell motion, adhesion and angiogenesis were associated with CRPS. PPI network analysis revealed that key genes, including early region 1A binding protein p300 (EP300), CREB‑binding protein (CREBBP), signal transducer and activator of transcription (STAT)3, STAT5A and integrin α M were associated with CRPS. The results suggest that the immune response may therefore serve an important role in CRPS development. In addition, genes in the HLA family, such as HLA‑DQB1 and HLA‑DRB1, may present potential biomarkers for the diagnosis of CRPS. Furthermore, EP300, its paralog CREBBP, and the STAT family genes, STAT3 and STAT5 may be important in the development of CRPS.
Martin, Timothy M; Wysocki, Beata J; Beyersdorf, Jared P; Wysocki, Tadeusz A; Pannier, Angela K
2014-08-01
Gene delivery systems transport exogenous genetic information to cells or biological systems with the potential to directly alter endogenous gene expression and behavior with applications in functional genomics, tissue engineering, medical devices, and gene therapy. Nonviral systems offer advantages over viral systems because of their low immunogenicity, inexpensive synthesis, and easy modification but suffer from lower transfection levels. The representation of gene transfer using models offers perspective and interpretation of complex cellular mechanisms,including nonviral gene delivery where exact mechanisms are unknown. Here, we introduce a novel telecommunications model of the nonviral gene delivery process in which the delivery of the gene to a cell is synonymous with delivery of a packet of information to a destination computer within a packet-switched computer network. Such a model uses nodes and layers to simplify the complexity of modeling the transfection process and to overcome several challenges of existing models. These challenges include a limited scope and limited time frame, which often does not incorporate biological effects known to affect transfection. The telecommunication model was constructed in MATLAB to model lipoplex delivery of the gene encoding the green fluorescent protein to HeLa cells. Mitosis and toxicity events were included in the model resulting in simulation outputs of nuclear internalization and transfection efficiency that correlated with experimental data. A priori predictions based on model sensitivity analysis suggest that increasing endosomal escape and decreasing lysosomal degradation, protein degradation, and GFP-induced toxicity can improve transfection efficiency by three-fold. Application of the telecommunications model to nonviral gene delivery offers insight into the development of new gene delivery systems with therapeutically relevant transfection levels.
Signor, Sarah A; Arbeitman, Michelle N; Nuzhdin, Sergey V
2016-05-01
Animal development is the product of distinct components and interactions-genes, regulatory networks, and cells-and it exhibits emergent properties that cannot be inferred from the components in isolation. Often the focus is on the genotype-to-phenotype map, overlooking the process of development that turns one into the other. We propose a move toward micro-evolutionary analysis of development, incorporating new tools that enable cell type resolution and single-cell microscopy. Using the sex determination pathway in Drosophila to illustrate potential avenues of research, we highlight some of the questions that these emerging technologies can address. For example, they provide an unprecedented opportunity to study heterogeneity within cell populations, and the potential to add the dimension of time to gene regulatory network analysis. Challenges still remain in developing methods to analyze this data and to increase the throughput. However this line of research has the potential to bridge the gaps between previously more disparate fields, such as population genetics and development, opening up new avenues of research. © 2016 Wiley Periodicals, Inc.
Jang, Sumin; Choubey, Sandeep; Furchtgott, Leon; Zou, Ling-Nan; Doyle, Adele; Menon, Vilas; Loew, Ethan B; Krostag, Anne-Rachel; Martinez, Refugio A; Madisen, Linda; Levi, Boaz P; Ramanathan, Sharad
2017-01-01
The complexity of gene regulatory networks that lead multipotent cells to acquire different cell fates makes a quantitative understanding of differentiation challenging. Using a statistical framework to analyze single-cell transcriptomics data, we infer the gene expression dynamics of early mouse embryonic stem (mES) cell differentiation, uncovering discrete transitions across nine cell states. We validate the predicted transitions across discrete states using flow cytometry. Moreover, using live-cell microscopy, we show that individual cells undergo abrupt transitions from a naïve to primed pluripotent state. Using the inferred discrete cell states to build a probabilistic model for the underlying gene regulatory network, we further predict and experimentally verify that these states have unique response to perturbations, thus defining them functionally. Our study provides a framework to infer the dynamics of differentiation from single cell transcriptomics data and to build predictive models of the gene regulatory networks that drive the sequence of cell fate decisions during development. DOI: http://dx.doi.org/10.7554/eLife.20487.001 PMID:28296635
Reconstruction and Analysis of Human Kidney-Specific Metabolic Network Based on Omics Data
Zhang, Ai-Di; Dai, Shao-Xing; Huang, Jing-Fei
2013-01-01
With the advent of the high-throughput data production, recent studies of tissue-specific metabolic networks have largely advanced our understanding of the metabolic basis of various physiological and pathological processes. However, for kidney, which plays an essential role in the body, the available kidney-specific model remains incomplete. This paper reports the reconstruction and characterization of the human kidney metabolic network based on transcriptome and proteome data. In silico simulations revealed that house-keeping genes were more essential than kidney-specific genes in maintaining kidney metabolism. Importantly, a total of 267 potential metabolic biomarkers for kidney-related diseases were successfully explored using this model. Furthermore, we found that the discrepancies in metabolic processes of different tissues are directly corresponding to tissue's functions. Finally, the phenotypes of the differentially expressed genes in diabetic kidney disease were characterized, suggesting that these genes may affect disease development through altering kidney metabolism. Thus, the human kidney-specific model constructed in this study may provide valuable information for the metabolism of kidney and offer excellent insights into complex kidney diseases. PMID:24222897
Integrated systems analysis reveals a molecular network underlying autism spectrum disorders
Li, Jingjing; Shi, Minyi; Ma, Zhihai; Zhao, Shuchun; Euskirchen, Ghia; Ziskin, Jennifer; Urban, Alexander; Hallmayer, Joachim; Snyder, Michael
2014-01-01
Autism is a complex disease whose etiology remains elusive. We integrated previously and newly generated data and developed a systems framework involving the interactome, gene expression and genome sequencing to identify a protein interaction module with members strongly enriched for autism candidate genes. Sequencing of 25 patients confirmed the involvement of this module in autism, which was subsequently validated using an independent cohort of over 500 patients. Expression of this module was dichotomized with a ubiquitously expressed subcomponent and another subcomponent preferentially expressed in the corpus callosum, which was significantly affected by our identified mutations in the network center. RNA-sequencing of the corpus callosum from patients with autism exhibited extensive gene mis-expression in this module, and our immunochemical analysis showed that the human corpus callosum is predominantly populated by oligodendrocyte cells. Analysis of functional genomic data further revealed a significant involvement of this module in the development of oligodendrocyte cells in mouse brain. Our analysis delineates a natural network involved in autism, helps uncover novel candidate genes for this disease and improves our understanding of its molecular pathology. PMID:25549968
2014-01-01
Background Plant secondary metabolites are critical to various biological processes. However, the regulations of these metabolites are complex because of regulatory rewiring or crosstalk. To unveil how regulatory behaviors on secondary metabolism reshape biological processes, we constructed and analyzed a dynamic regulatory network of secondary metabolic pathways in Arabidopsis. Results The dynamic regulatory network was constructed through integrating co-expressed gene pairs and regulatory interactions. Regulatory interactions were either predicted by conserved transcription factor binding sites (TFBSs) or proved by experiments. We found that integrating two data (co-expression and predicted regulatory interactions) enhanced the number of highly confident regulatory interactions by over 10% compared with using single data. The dynamic changes of regulatory network systematically manifested regulatory rewiring to explain the mechanism of regulation, such as in terpenoids metabolism, the regulatory crosstalk of RAV1 (AT1G13260) and ATHB1 (AT3G01470) on HMG1 (hydroxymethylglutaryl-CoA reductase, AT1G76490); and regulation of RAV1 on epoxysqualene biosynthesis and sterol biosynthesis. Besides, we investigated regulatory rewiring with expression, network topology and upstream signaling pathways. Regulatory rewiring was revealed by the variability of genes’ expression: pathway genes and transcription factors (TFs) were significantly differentially expressed under different conditions (such as terpenoids biosynthetic genes in tissue experiments and E2F/DP family members in genotype experiments). Both network topology and signaling pathways supported regulatory rewiring. For example, we discovered correlation among the numbers of pathway genes, TFs and network topology: one-gene pathways (such as δ-carotene biosynthesis) were regulated by a fewer TFs, and were not critical to metabolic network because of their low degrees in topology. Upstream signaling pathways of 50 TFs were identified to comprehend the underlying mechanism of TFs’ regulatory rewiring. Conclusion Overall, this dynamic regulatory network largely improves the understanding of perplexed regulatory rewiring in secondary metabolism in Arabidopsis. PMID:24993737
Maximizing information exchange between complex networks
NASA Astrophysics Data System (ADS)
West, Bruce J.; Geneston, Elvis L.; Grigolini, Paolo
2008-10-01
Science is not merely the smooth progressive interaction of hypothesis, experiment and theory, although it sometimes has that form. More realistically the scientific study of any given complex phenomenon generates a number of explanations, from a variety of perspectives, that eventually requires synthesis to achieve a deep level of insight and understanding. One such synthesis has created the field of out-of-equilibrium statistical physics as applied to the understanding of complex dynamic networks. Over the past forty years the concept of complexity has undergone a metamorphosis. Complexity was originally seen as a consequence of memory in individual particle trajectories, in full agreement with a Hamiltonian picture of microscopic dynamics and, in principle, macroscopic dynamics could be derived from the microscopic Hamiltonian picture. The main difficulty in deriving macroscopic dynamics from microscopic dynamics is the need to take into account the actions of a very large number of components. The existence of events such as abrupt jumps, considered by the conventional continuous time random walk approach to describing complexity was never perceived as conflicting with the Hamiltonian view. Herein we review many of the reasons why this traditional Hamiltonian view of complexity is unsatisfactory. We show that as a result of technological advances, which make the observation of single elementary events possible, the definition of complexity has shifted from the conventional memory concept towards the action of non-Poisson renewal events. We show that the observation of crucial processes, such as the intermittent fluorescence of blinking quantum dots as well as the brain’s response to music, as monitored by a set of electrodes attached to the scalp, has forced investigators to go beyond the traditional concept of complexity and to establish closer contact with the nascent field of complex networks. Complex networks form one of the most challenging areas of modern research overarching all of the traditional scientific disciplines. The transportation networks of planes, highways and railroads; the economic networks of global finance and stock markets; the social networks of terrorism, governments, businesses and churches; the physical networks of telephones, the Internet, earthquakes and global warming and the biological networks of gene regulation, the human body, clusters of neurons and food webs, share a number of apparently universal properties as the networks become increasingly complex. Ubiquitous aspects of such complex networks are the appearance of non-stationary and non-ergodic statistical processes and inverse power-law statistical distributions. Herein we review the traditional dynamical and phase-space methods for modeling such networks as their complexity increases and focus on the limitations of these procedures in explaining complex networks. Of course we will not be able to review the entire nascent field of network science, so we limit ourselves to a review of how certain complexity barriers have been surmounted using newly applied theoretical concepts such as aging, renewal, non-ergodic statistics and the fractional calculus. One emphasis of this review is information transport between complex networks, which requires a fundamental change in perception that we express as a transition from the familiar stochastic resonance to the new concept of complexity matching.
The regulatory software of cellular metabolism.
Segrè, Daniel
2004-06-01
Understanding the regulation of metabolic pathways in the cell is like unraveling the 'software' that is running on the 'hardware' of the metabolic network. Transcriptional regulation of enzymes is an important component of this software. A recent systematic analysis of metabolic gene-expression data in Saccharomyces cerevisiae reveals a complex modular organization of co-expressed genes, which could increase our ability to understand and engineer cellular metabolic functions.
Xu, Ke; Schadt, Eric E.; Pollard, Katherine S.; Roussos, Panos; Dudley, Joel T.
2015-01-01
The population persistence of schizophrenia despite associated reductions in fitness and fecundity suggests that the genetic basis of schizophrenia has a complex evolutionary history. A recent meta-analysis of schizophrenia genome-wide association studies offers novel opportunities for assessment of the evolutionary trajectories of schizophrenia-associated loci. In this study, we hypothesize that components of the genetic architecture of schizophrenia are attributable to human lineage-specific evolution. Our results suggest that schizophrenia-associated loci enrich in genes near previously identified human accelerated regions (HARs). Specifically, we find that genes near HARs conserved in nonhuman primates (pHARs) are enriched for schizophrenia-associated loci, and that pHAR-associated schizophrenia genes are under stronger selective pressure than other schizophrenia genes and other pHAR-associated genes. We further evaluate pHAR-associated schizophrenia genes in regulatory network contexts to investigate associated molecular functions and mechanisms. We find that pHAR-associated schizophrenia genes significantly enrich in a GABA-related coexpression module that was previously found to be differentially regulated in schizophrenia affected individuals versus healthy controls. In another two independent networks constructed from gene expression profiles from prefrontal cortex samples, we find that pHAR-associated schizophrenia genes are located in more central positions and their average path lengths to the other nodes are significantly shorter than those of other schizophrenia genes. Together, our results suggest that HARs are associated with potentially important functional roles in the genetic architecture of schizophrenia. PMID:25681384
Yeung, Ka Yee
2016-01-01
Reproducibility is vital in science. For complex computational methods, it is often necessary, not just to recreate the code, but also the software and hardware environment to reproduce results. Virtual machines, and container software such as Docker, make it possible to reproduce the exact environment regardless of the underlying hardware and operating system. However, workflows that use Graphical User Interfaces (GUIs) remain difficult to replicate on different host systems as there is no high level graphical software layer common to all platforms. GUIdock allows for the facile distribution of a systems biology application along with its graphics environment. Complex graphics based workflows, ubiquitous in systems biology, can now be easily exported and reproduced on many different platforms. GUIdock uses Docker, an open source project that provides a container with only the absolutely necessary software dependencies and configures a common X Windows (X11) graphic interface on Linux, Macintosh and Windows platforms. As proof of concept, we present a Docker package that contains a Bioconductor application written in R and C++ called networkBMA for gene network inference. Our package also includes Cytoscape, a java-based platform with a graphical user interface for visualizing and analyzing gene networks, and the CyNetworkBMA app, a Cytoscape app that allows the use of networkBMA via the user-friendly Cytoscape interface. PMID:27045593
Hung, Ling-Hong; Kristiyanto, Daniel; Lee, Sung Bong; Yeung, Ka Yee
2016-01-01
Reproducibility is vital in science. For complex computational methods, it is often necessary, not just to recreate the code, but also the software and hardware environment to reproduce results. Virtual machines, and container software such as Docker, make it possible to reproduce the exact environment regardless of the underlying hardware and operating system. However, workflows that use Graphical User Interfaces (GUIs) remain difficult to replicate on different host systems as there is no high level graphical software layer common to all platforms. GUIdock allows for the facile distribution of a systems biology application along with its graphics environment. Complex graphics based workflows, ubiquitous in systems biology, can now be easily exported and reproduced on many different platforms. GUIdock uses Docker, an open source project that provides a container with only the absolutely necessary software dependencies and configures a common X Windows (X11) graphic interface on Linux, Macintosh and Windows platforms. As proof of concept, we present a Docker package that contains a Bioconductor application written in R and C++ called networkBMA for gene network inference. Our package also includes Cytoscape, a java-based platform with a graphical user interface for visualizing and analyzing gene networks, and the CyNetworkBMA app, a Cytoscape app that allows the use of networkBMA via the user-friendly Cytoscape interface.
Trevino, Victor; Cassese, Alberto; Nagy, Zsuzsanna; Zhuang, Xiaodong; Herbert, John; Antzack, Philipp; Clarke, Kim; Davies, Nicholas; Rahman, Ayesha; Campbell, Moray J.; Bicknell, Roy; Vannucci, Marina; Falciani, Francesco
2016-01-01
Abstract The advent of functional genomics has enabled the genome-wide characterization of the molecular state of cells and tissues, virtually at every level of biological organization. The difficulty in organizing and mining this unprecedented amount of information has stimulated the development of computational methods designed to infer the underlying structure of regulatory networks from observational data. These important developments had a profound impact in biological sciences since they triggered the development of a novel data-driven investigative approach. In cancer research, this strategy has been particularly successful. It has contributed to the identification of novel biomarkers, to a better characterization of disease heterogeneity and to a more in depth understanding of cancer pathophysiology. However, so far these approaches have not explicitly addressed the challenge of identifying networks representing the interaction of different cell types in a complex tissue. Since these interactions represent an essential part of the biology of both diseased and healthy tissues, it is of paramount importance that this challenge is addressed. Here we report the definition of a network reverse engineering strategy designed to infer directional signals linking adjacent cell types within a complex tissue. The application of this inference strategy to prostate cancer genome-wide expression profiling data validated the approach and revealed that normal epithelial cells exert an anti-tumour activity on prostate carcinoma cells. Moreover, by using a Bayesian hierarchical model integrating genetics and gene expression data and combining this with survival analysis, we show that the expression of putative cell communication genes related to focal adhesion and secretion is affected by epistatic gene copy number variation and it is predictive of patient survival. Ultimately, this study represents a generalizable approach to the challenge of deciphering cell communication networks in a wide spectrum of biological systems. PMID:27124473
Trevino, Victor; Cassese, Alberto; Nagy, Zsuzsanna; Zhuang, Xiaodong; Herbert, John; Antczak, Philipp; Clarke, Kim; Davies, Nicholas; Rahman, Ayesha; Campbell, Moray J; Guindani, Michele; Bicknell, Roy; Vannucci, Marina; Falciani, Francesco
2016-04-01
The advent of functional genomics has enabled the genome-wide characterization of the molecular state of cells and tissues, virtually at every level of biological organization. The difficulty in organizing and mining this unprecedented amount of information has stimulated the development of computational methods designed to infer the underlying structure of regulatory networks from observational data. These important developments had a profound impact in biological sciences since they triggered the development of a novel data-driven investigative approach. In cancer research, this strategy has been particularly successful. It has contributed to the identification of novel biomarkers, to a better characterization of disease heterogeneity and to a more in depth understanding of cancer pathophysiology. However, so far these approaches have not explicitly addressed the challenge of identifying networks representing the interaction of different cell types in a complex tissue. Since these interactions represent an essential part of the biology of both diseased and healthy tissues, it is of paramount importance that this challenge is addressed. Here we report the definition of a network reverse engineering strategy designed to infer directional signals linking adjacent cell types within a complex tissue. The application of this inference strategy to prostate cancer genome-wide expression profiling data validated the approach and revealed that normal epithelial cells exert an anti-tumour activity on prostate carcinoma cells. Moreover, by using a Bayesian hierarchical model integrating genetics and gene expression data and combining this with survival analysis, we show that the expression of putative cell communication genes related to focal adhesion and secretion is affected by epistatic gene copy number variation and it is predictive of patient survival. Ultimately, this study represents a generalizable approach to the challenge of deciphering cell communication networks in a wide spectrum of biological systems.
Networks of lexical borrowing and lateral gene transfer in language and genome evolution
List, Johann-Mattis; Nelson-Sathi, Shijulal; Geisler, Hans; Martin, William
2014-01-01
Like biological species, languages change over time. As noted by Darwin, there are many parallels between language evolution and biological evolution. Insights into these parallels have also undergone change in the past 150 years. Just like genes, words change over time, and language evolution can be likened to genome evolution accordingly, but what kind of evolution? There are fundamental differences between eukaryotic and prokaryotic evolution. In the former, natural variation entails the gradual accumulation of minor mutations in alleles. In the latter, lateral gene transfer is an integral mechanism of natural variation. The study of language evolution using biological methods has attracted much interest of late, most approaches focusing on language tree construction. These approaches may underestimate the important role that borrowing plays in language evolution. Network approaches that were originally designed to study lateral gene transfer may provide more realistic insights into the complexities of language evolution. PMID:24375688
Constructing networks with correlation maximization methods.
Mellor, Joseph C; Wu, Jie; Delisi, Charles
2004-01-01
Problems of inference in systems biology are ideally reduced to formulations which can efficiently represent the features of interest. In the case of predicting gene regulation and pathway networks, an important feature which describes connected genes and proteins is the relationship between active and inactive forms, i.e. between the "on" and "off" states of the components. While not optimal at the limits of resolution, these logical relationships between discrete states can often yield good approximations of the behavior in larger complex systems, where exact representation of measurement relationships may be intractable. We explore techniques for extracting binary state variables from measurement of gene expression, and go on to describe robust measures for statistical significance and information that can be applied to many such types of data. We show how statistical strength and information are equivalent criteria in limiting cases, and demonstrate the application of these measures to simple systems of gene regulation.
Effects of threshold on the topology of gene co-expression networks.
Couto, Cynthia Martins Villar; Comin, César Henrique; Costa, Luciano da Fontoura
2017-09-26
Several developments regarding the analysis of gene co-expression profiles using complex network theory have been reported recently. Such approaches usually start with the construction of an unweighted gene co-expression network, therefore requiring the selection of a suitable threshold defining which pairs of vertices will be connected. We aimed at addressing such an important problem by suggesting and comparing five different approaches for threshold selection. Each of the methods considers a respective biologically-motivated criterion for electing a potentially suitable threshold. A set of 21 microarray experiments from different biological groups was used to investigate the effect of applying the five proposed criteria to several biological situations. For each experiment, we used the Pearson correlation coefficient to measure the relationship between each gene pair, and the resulting weight matrices were thresholded considering several values, generating respective adjacency matrices (co-expression networks). Each of the five proposed criteria was then applied in order to select the respective threshold value. The effects of these thresholding approaches on the topology of the resulting networks were compared by using several measurements, and we verified that, depending on the database, the impact on the topological properties can be large. However, a group of databases was verified to be similarly affected by most of the considered criteria. Based on such results, it can be suggested that when the generated networks present similar measurements, the thresholding method can be chosen with greater freedom. If the generated networks are markedly different, the thresholding method that better suits the interests of each specific research study represents a reasonable choice.
Kümpers, Britta M. C.; Smith-Unna, Richard D.; Hibberd, Julian M.
2014-01-01
With at least 60 independent origins spanning monocotyledons and dicotyledons, the C4 photosynthetic pathway represents one of the most remarkable examples of convergent evolution. The recurrent evolution of this highly complex trait involving alterations to leaf anatomy, cell biology and biochemistry allows an increase in productivity by ∼50% in tropical and subtropical areas. The extent to which separate lineages of C4 plants use the same genetic networks to maintain C4 photosynthesis is unknown. We developed a new informatics framework to enable deep evolutionary comparison of gene expression in species lacking reference genomes. We exploited this to compare gene expression in species representing two independent C4 lineages (Cleome gynandra and Zea mays) whose last common ancestor diverged ∼140 million years ago. We define a cohort of 3,335 genes that represent conserved components of leaf and photosynthetic development in these species. Furthermore, we show that genes encoding proteins of the C4 cycle are recruited into networks defined by photosynthesis-related genes. Despite the wide evolutionary separation and independent origins of the C4 phenotype, we report that these species use homologous transcription factors to both induce C4 photosynthesis and to maintain the cell specific gene expression required for the pathway to operate. We define a core molecular signature associated with leaf and photosynthetic maturation that is likely shared by angiosperm species derived from the last common ancestor of the monocotyledons and dicotyledons. We show that deep evolutionary comparisons of gene expression can reveal novel insight into the molecular convergence of highly complex phenotypes and that parallel evolution of trans-factors underpins the repeated appearance of C4 photosynthesis. Thus, exploitation of extant natural variation associated with complex traits can be used to identify regulators. Moreover, the transcription factors that are shared by independent C4 lineages are key targets for engineering the C4 pathway into C3 crops such as rice. PMID:24901697
A new multi-scale method to reveal hierarchical modular structures in biological networks.
Jiao, Qing-Ju; Huang, Yan; Shen, Hong-Bin
2016-11-15
Biological networks are effective tools for studying molecular interactions. Modular structure, in which genes or proteins may tend to be associated with functional modules or protein complexes, is a remarkable feature of biological networks. Mining modular structure from biological networks enables us to focus on a set of potentially important nodes, which provides a reliable guide to future biological experiments. The first fundamental challenge in mining modular structure from biological networks is that the quality of the observed network data is usually low owing to noise and incompleteness in the obtained networks. The second problem that poses a challenge to existing approaches to the mining of modular structure is that the organization of both functional modules and protein complexes in networks is far more complicated than was ever thought. For instance, the sizes of different modules vary considerably from each other and they often form multi-scale hierarchical structures. To solve these problems, we propose a new multi-scale protocol for mining modular structure (named ISIMB) driven by a node similarity metric, which works in an iteratively converged space to reduce the effects of the low data quality of the observed network data. The multi-scale node similarity metric couples both the local and the global topology of the network with a resolution regulator. By varying this resolution regulator to give different weightings to the local and global terms in the metric, the ISIMB method is able to fit the shape of modules and to detect them on different scales. Experiments on protein-protein interaction and genetic interaction networks show that our method can not only mine functional modules and protein complexes successfully, but can also predict functional modules from specific to general and reveal the hierarchical organization of protein complexes.
Large scale analysis of signal reachability.
Todor, Andrei; Gabr, Haitham; Dobra, Alin; Kahveci, Tamer
2014-06-15
Major disorders, such as leukemia, have been shown to alter the transcription of genes. Understanding how gene regulation is affected by such aberrations is of utmost importance. One promising strategy toward this objective is to compute whether signals can reach to the transcription factors through the transcription regulatory network (TRN). Due to the uncertainty of the regulatory interactions, this is a #P-complete problem and thus solving it for very large TRNs remains to be a challenge. We develop a novel and scalable method to compute the probability that a signal originating at any given set of source genes can arrive at any given set of target genes (i.e., transcription factors) when the topology of the underlying signaling network is uncertain. Our method tackles this problem for large networks while providing a provably accurate result. Our method follows a divide-and-conquer strategy. We break down the given network into a sequence of non-overlapping subnetworks such that reachability can be computed autonomously and sequentially on each subnetwork. We represent each interaction using a small polynomial. The product of these polynomials express different scenarios when a signal can or cannot reach to target genes from the source genes. We introduce polynomial collapsing operators for each subnetwork. These operators reduce the size of the resulting polynomial and thus the computational complexity dramatically. We show that our method scales to entire human regulatory networks in only seconds, while the existing methods fail beyond a few tens of genes and interactions. We demonstrate that our method can successfully characterize key reachability characteristics of the entire transcriptions regulatory networks of patients affected by eight different subtypes of leukemia, as well as those from healthy control samples. All the datasets and code used in this article are available at bioinformatics.cise.ufl.edu/PReach/scalable.htm. © The Author 2014. Published by Oxford University Press.
Evolution of the Max and Mlx networks in animals.
McFerrin, Lisa G; Atchley, William R
2011-01-01
Transcription factors (TFs) are essential for the regulation of gene expression and often form emergent complexes to perform vital roles in cellular processes. In this paper, we focus on the parallel Max and Mlx networks of TFs because of their critical involvement in cell cycle regulation, proliferation, growth, metabolism, and apoptosis. A basic-helix-loop-helix-zipper (bHLHZ) domain mediates the competitive protein dimerization and DNA binding among Max and Mlx network members to form a complex system of cell regulation. To understand the importance of these network interactions, we identified the bHLHZ domain of Max and Mlx network proteins across the animal kingdom and carried out several multivariate statistical analyses. The presence and conservation of Max and Mlx network proteins in animal lineages stemming from the divergence of Metazoa indicate that these networks have ancient and essential functions. Phylogenetic analysis of the bHLHZ domain identified clear relationships among protein families with distinct points of radiation and divergence. Multivariate discriminant analysis further isolated specific amino acid changes within the bHLHZ domain that classify proteins, families, and network configurations. These analyses on Max and Mlx network members provide a model for characterizing the evolution of TFs involved in essential networks.
Synthetic analog and digital circuits for cellular computation and memory.
Purcell, Oliver; Lu, Timothy K
2014-10-01
Biological computation is a major area of focus in synthetic biology because it has the potential to enable a wide range of applications. Synthetic biologists have applied engineering concepts to biological systems in order to construct progressively more complex gene circuits capable of processing information in living cells. Here, we review the current state of computational genetic circuits and describe artificial gene circuits that perform digital and analog computation. We then discuss recent progress in designing gene networks that exhibit memory, and how memory and computation have been integrated to yield more complex systems that can both process and record information. Finally, we suggest new directions for engineering biological circuits capable of computation. Copyright © 2014 The Authors. Published by Elsevier Ltd.. All rights reserved.
METscout: a pathfinder exploring the landscape of metabolites, enzymes and transporters.
Geffers, Lars; Tetzlaff, Benjamin; Cui, Xiao; Yan, Jun; Eichele, Gregor
2013-01-01
METscout (http://metscout.mpg.de) brings together metabolism and gene expression landscapes. It is a MySQL relational database linking biochemical pathway information with 3D patterns of gene expression determined by robotic in situ hybridization in the E14.5 mouse embryo. The sites of expression of ∼1500 metabolic enzymes and of ∼350 solute carriers (SLCs) were included and are accessible as single cell resolution images and in the form of semi-quantitative image abstractions. METscout provides several graphical web-interfaces allowing navigation through complex anatomical and metabolic information. Specifically, the database shows where in the organism each of the many metabolic reactions take place and where SLCs transport metabolites. To link enzymatic reactions and transport, the KEGG metabolic reaction network was extended to include metabolite transport. This network in conjunction with spatial expression pattern of the network genes allows for a tracing of metabolic reactions and transport processes across the entire body of the embryo.
Kiefer, Christiane; Koch, Marcus A.
2012-01-01
74 of the currently accepted 111 taxa of the North American genus Boechera (Brassicaceae) were subject to pyhlogenetic reconstruction and network analysis. The dataset comprised 911 accessions for which ITS sequences were analyzed. Phylogenetic analyses yielded largely unresolved trees. Together with the network analysis confirming this result this can be interpreted as an indication for multiple, independent, and rapid diversification events. Network analyses were superimposed with datasets describing i) geographical distribution, ii) taxonomy, iii) reproductive mode, and iv) distribution history based on phylogeographic evidence. Our results provide first direct evidence for enormous reticulate evolution in the entire genus and give further insights into the evolutionary history of this complex genus on a continental scale. In addition two novel single-copy gene markers, orthologues of the Arabidopsis thaliana genes At2g25920 and At3g18900, were analyzed for subsets of taxa and confirmed the findings obtained through the ITS data. PMID:22606266
Genomic signatures of evolutionary transitions from solitary to group living
Kapheim, Karen M.; Pan, Hailin; Li, Cai; Salzberg, Steven L.; Puiu, Daniela; Magoc, Tanja; Robertson, Hugh M.; Hudson, Matthew E.; Venkat, Aarti; Fischman, Brielle J.; Hernandez, Alvaro; Yandell, Mark; Ence, Daniel; Holt, Carson; Yocum, George D.; Kemp, William P.; Bosch, Jordi; Waterhouse, Robert M.; Zdobnov, Evgeny M.; Stolle, Eckart; Kraus, F. Bernhard; Helbing, Sophie; Moritz, Robin F. A.; Glastad, Karl M.; Hunt, Brendan G.; Goodisman, Michael A. D.; Hauser, Frank; Grimmelikhuijzen, Cornelis J. P.; Pinheiro, Daniel Guariz; Nunes, Francis Morais Franco; Soares, Michelle Prioli Miranda; Tanaka, Érica Donato; Simões, Zilá Luz Paulino; Hartfelder, Klaus; Evans, Jay D.; Barribeau, Seth M.; Johnson, Reed M.; Massey, Jonathan H.; Southey, Bruce R.; Hasselmann, Martin; Hamacher, Daniel; Biewer, Matthias; Kent, Clement F.; Zayed, Amro; Blatti, Charles; Sinha, Saurabh; Johnston, J. Spencer; Hanrahan, Shawn J.; Kocher, Sarah D.; Wang, Jun; Robinson, Gene E.; Zhang, Guojie
2017-01-01
The evolution of eusociality is one of the major transitions in evolution, but the underlying genomic changes are unknown. We compared the genomes of 10 bee species that vary in social complexity, representing multiple independent transitions in social evolution, and report three major findings. First, many important genes show evidence of neutral evolution as a consequence of relaxed selection with increasing social complexity. Second, there is no single road map to eusociality; independent evolutionary transitions in sociality have independent genetic underpinnings. Third, though clearly independent in detail, these transitions do have similar general features, including an increase in constrained protein evolution accompanied by increases in the potential for gene regulation and decreases in diversity and abundance of transposable elements. Eusociality may arise through different mechanisms each time, but would likely always involve an increase in the complexity of gene networks. PMID:25977371
Social evolution. Genomic signatures of evolutionary transitions from solitary to group living.
Kapheim, Karen M; Pan, Hailin; Li, Cai; Salzberg, Steven L; Puiu, Daniela; Magoc, Tanja; Robertson, Hugh M; Hudson, Matthew E; Venkat, Aarti; Fischman, Brielle J; Hernandez, Alvaro; Yandell, Mark; Ence, Daniel; Holt, Carson; Yocum, George D; Kemp, William P; Bosch, Jordi; Waterhouse, Robert M; Zdobnov, Evgeny M; Stolle, Eckart; Kraus, F Bernhard; Helbing, Sophie; Moritz, Robin F A; Glastad, Karl M; Hunt, Brendan G; Goodisman, Michael A D; Hauser, Frank; Grimmelikhuijzen, Cornelis J P; Pinheiro, Daniel Guariz; Nunes, Francis Morais Franco; Soares, Michelle Prioli Miranda; Tanaka, Érica Donato; Simões, Zilá Luz Paulino; Hartfelder, Klaus; Evans, Jay D; Barribeau, Seth M; Johnson, Reed M; Massey, Jonathan H; Southey, Bruce R; Hasselmann, Martin; Hamacher, Daniel; Biewer, Matthias; Kent, Clement F; Zayed, Amro; Blatti, Charles; Sinha, Saurabh; Johnston, J Spencer; Hanrahan, Shawn J; Kocher, Sarah D; Wang, Jun; Robinson, Gene E; Zhang, Guojie
2015-06-05
The evolution of eusociality is one of the major transitions in evolution, but the underlying genomic changes are unknown. We compared the genomes of 10 bee species that vary in social complexity, representing multiple independent transitions in social evolution, and report three major findings. First, many important genes show evidence of neutral evolution as a consequence of relaxed selection with increasing social complexity. Second, there is no single road map to eusociality; independent evolutionary transitions in sociality have independent genetic underpinnings. Third, though clearly independent in detail, these transitions do have similar general features, including an increase in constrained protein evolution accompanied by increases in the potential for gene regulation and decreases in diversity and abundance of transposable elements. Eusociality may arise through different mechanisms each time, but would likely always involve an increase in the complexity of gene networks. Copyright © 2015, American Association for the Advancement of Science.
Identification of possible genetic polymorphisms involved in cancer cachexia: a systematic review.
Tan, Benjamin H L; Ross, James A; Kaasa, Stein; Skorpen, Frank; Fearon, Kenneth C H
2011-04-01
Cancer cachexia is a polygenic and complex syndrome. Genetic variations in regulation of the inflammatory response, muscle and fat metabolic pathways, and pathways in appetite regulation are likely to contribute to the susceptibility or resistance to developing cancer cachexia. A systematic search of Medline and EmBase databases, covering 1986-2008 was performed for potential candidate genes/genetic polymorphisms relating to cancer cachexia. Related genes were then identified using pathway functional analysis software. All candidate genes were reviewed for functional polymorphisms or clinically significant polymorphisms associated with cachexia using the OMIM and GeneRIF databases. Genes with variants which had functional or clinical associations with cachexia and replicated in at least one study were entered into pathway analysis software to reveal possible network associations between genes. A total of 184 polymorphisms with functional or clinical relevance to cancer cachexia were identified in 92 candidate genes. Of these, 42 polymorphisms (in 33 genes) were replicated in more than one study with 13 polymorphisms found to influence two or more hallmarks of cachexia (i.e. inflammation, loss of fat mass and/or lean mass and reduced survival). Thirty-three genes were found to be significantly interconnected in two major networks with four genes (ADIPOQ, IL6, NFKB1 and TLR4) interlinking both networks. Selection of candidate genes and polymorphisms is a key element of multigene study design. The present study provides an initial framework to select genes/polymorphisms for further study in cancer cachexia, and to develop their potential as susceptibility biomarkers of developing cachexia.
Li, Yongsheng; Xu, Juan; Chen, Hong; Zhao, Zheng; Li, Shengli; Bai, Jing; Wu, Aiwei; Jiang, Chunjie; Wang, Yuan; Su, Bin; Li, Xia
2013-01-01
DNA methylation is an essential epigenetic mechanism involved in transcriptional control. However, how genes with different methylation patterns are assembled in the protein-protein interaction network (PPIN) remains a mystery. In the present study, we systematically dissected the characterization of genes with different methylation patterns in the PPIN. A negative association was detected between the methylation levels in the brain tissues and topological centralities. By focusing on two classes of genes with considerably different methylation levels in the brain tissues, namely the low methylated genes (LMGs) and high methylated genes (HMGs), we found that their organizing principles in the PPIN are distinct. The LMGs tend to be the center of the PPIN, and attacking them causes a more deleterious effect on the network integrity. Furthermore, the LMGs express their functions in a modular pattern and substantial differences in functions are observed between the two types of genes. The LMGs are enriched in the basic biological functions, such as binding activity and regulation of transcription. More importantly, cancer genes, especially recessive cancer genes, essential genes, and aging-related genes were all found more often in the LMGs. Additionally, our analysis presented that the intra-classes communications are enhanced, but inter-classes communications are repressed. Finally, a functional complementation was revealed between methylation and miRNA regulation in the human genome. We have elucidated the assembling principles of genes with different methylation levels in the context of the PPIN, providing key insights into the complex epigenetic regulation mechanisms.
Zhao, Zheng; Li, Shengli; Bai, Jing; Wu, Aiwei; Jiang, Chunjie; Wang, Yuan; Su, Bin; Li, Xia
2013-01-01
Background DNA methylation is an essential epigenetic mechanism involved in transcriptional control. However, how genes with different methylation patterns are assembled in the protein-protein interaction network (PPIN) remains a mystery. Results In the present study, we systematically dissected the characterization of genes with different methylation patterns in the PPIN. A negative association was detected between the methylation levels in the brain tissues and topological centralities. By focusing on two classes of genes with considerably different methylation levels in the brain tissues, namely the low methylated genes (LMGs) and high methylated genes (HMGs), we found that their organizing principles in the PPIN are distinct. The LMGs tend to be the center of the PPIN, and attacking them causes a more deleterious effect on the network integrity. Furthermore, the LMGs express their functions in a modular pattern and substantial differences in functions are observed between the two types of genes. The LMGs are enriched in the basic biological functions, such as binding activity and regulation of transcription. More importantly, cancer genes, especially recessive cancer genes, essential genes, and aging-related genes were all found more often in the LMGs. Additionally, our analysis presented that the intra-classes communications are enhanced, but inter-classes communications are repressed. Finally, a functional complementation was revealed between methylation and miRNA regulation in the human genome. Conclusions We have elucidated the assembling principles of genes with different methylation levels in the context of the PPIN, providing key insights into the complex epigenetic regulation mechanisms. PMID:23776563
Ceccarelli, Michele; Cerulo, Luigi; Santone, Antonella
2014-10-01
Reverse engineering of gene regulatory relationships from genomics data is a crucial task to dissect the complex underlying regulatory mechanism occurring in a cell. From a computational point of view the reconstruction of gene regulatory networks is an undetermined problem as the large number of possible solutions is typically high in contrast to the number of available independent data points. Many possible solutions can fit the available data, explaining the data equally well, but only one of them can be the biologically true solution. Several strategies have been proposed in literature to reduce the search space and/or extend the amount of independent information. In this paper we propose a novel algorithm based on formal methods, mathematically rigorous techniques widely adopted in engineering to specify and verify complex software and hardware systems. Starting with a formal specification of gene regulatory hypotheses we are able to mathematically prove whether a time course experiment belongs or not to the formal specification, determining in fact whether a gene regulation exists or not. The method is able to detect both direction and sign (inhibition/activation) of regulations whereas most of literature methods are limited to undirected and/or unsigned relationships. We empirically evaluated the approach on experimental and synthetic datasets in terms of precision and recall. In most cases we observed high levels of accuracy outperforming the current state of art, despite the computational cost increases exponentially with the size of the network. We made available the tool implementing the algorithm at the following url: http://www.bioinformatics.unisannio.it. Copyright © 2014 Elsevier Inc. All rights reserved.
Herman, Dorota; Slabbinck, Bram; Pè, Mario Enrico
2016-01-01
Leaves are vital organs for biomass and seed production because of their role in the generation of metabolic energy and organic compounds. A better understanding of the molecular networks underlying leaf development is crucial to sustain global requirements for food and renewable energy. Here, we combined transcriptome profiling of proliferative leaf tissue with in-depth phenotyping of the fourth leaf at later stages of development in 197 recombinant inbred lines of two different maize (Zea mays) populations. Previously, correlation analysis in a classical biparental mapping population identified 1,740 genes correlated with at least one of 14 traits. Here, we extended these results with data from a multiparent advanced generation intercross population. As expected, the phenotypic variability was found to be larger in the latter population than in the biparental population, although general conclusions on the correlations among the traits are comparable. Data integration from the two diverse populations allowed us to identify a set of 226 genes that are robustly associated with diverse leaf traits. This set of genes is enriched for transcriptional regulators and genes involved in protein synthesis and cell wall metabolism. In order to investigate the molecular network context of the candidate gene set, we integrated our data with publicly available functional genomics data and identified a growth regulatory network of 185 genes. Our results illustrate the power of combining in-depth phenotyping with transcriptomics in mapping populations to dissect the genetic control of complex traits and present a set of candidate genes for use in biomass improvement. PMID:26754667
Baute, Joke; Herman, Dorota; Coppens, Frederik; De Block, Jolien; Slabbinck, Bram; Dell'Acqua, Matteo; Pè, Mario Enrico; Maere, Steven; Nelissen, Hilde; Inzé, Dirk
2016-03-01
Leaves are vital organs for biomass and seed production because of their role in the generation of metabolic energy and organic compounds. A better understanding of the molecular networks underlying leaf development is crucial to sustain global requirements for food and renewable energy. Here, we combined transcriptome profiling of proliferative leaf tissue with in-depth phenotyping of the fourth leaf at later stages of development in 197 recombinant inbred lines of two different maize (Zea mays) populations. Previously, correlation analysis in a classical biparental mapping population identified 1,740 genes correlated with at least one of 14 traits. Here, we extended these results with data from a multiparent advanced generation intercross population. As expected, the phenotypic variability was found to be larger in the latter population than in the biparental population, although general conclusions on the correlations among the traits are comparable. Data integration from the two diverse populations allowed us to identify a set of 226 genes that are robustly associated with diverse leaf traits. This set of genes is enriched for transcriptional regulators and genes involved in protein synthesis and cell wall metabolism. In order to investigate the molecular network context of the candidate gene set, we integrated our data with publicly available functional genomics data and identified a growth regulatory network of 185 genes. Our results illustrate the power of combining in-depth phenotyping with transcriptomics in mapping populations to dissect the genetic control of complex traits and present a set of candidate genes for use in biomass improvement. © 2016 American Society of Plant Biologists. All Rights Reserved.
s-core network decomposition: A generalization of k-core analysis to weighted networks
NASA Astrophysics Data System (ADS)
Eidsaa, Marius; Almaas, Eivind
2013-12-01
A broad range of systems spanning biology, technology, and social phenomena may be represented and analyzed as complex networks. Recent studies of such networks using k-core decomposition have uncovered groups of nodes that play important roles. Here, we present s-core analysis, a generalization of k-core (or k-shell) analysis to complex networks where the links have different strengths or weights. We demonstrate the s-core decomposition approach on two random networks (ER and configuration model with scale-free degree distribution) where the link weights are (i) random, (ii) correlated, and (iii) anticorrelated with the node degrees. Finally, we apply the s-core decomposition approach to the protein-interaction network of the yeast Saccharomyces cerevisiae in the context of two gene-expression experiments: oxidative stress in response to cumene hydroperoxide (CHP), and fermentation stress response (FSR). We find that the innermost s-cores are (i) different from innermost k-cores, (ii) different for the two stress conditions CHP and FSR, and (iii) enriched with proteins whose biological functions give insight into how yeast manages these specific stresses.
Hill, W D; Davies, G; van de Lagemaat, L N; Christoforou, A; Marioni, R E; Fernandes, C P D; Liewald, D C; Croning, M D R; Payton, A; Craig, L C A; Whalley, L J; Horan, M; Ollier, W; Hansell, N K; Wright, M J; Martin, N G; Montgomery, G W; Steen, V M; Le Hellard, S; Espeseth, T; Lundervold, A J; Reinvang, I; Starr, J M; Pendleton, N; Grant, S G N; Bates, T C; Deary, I J
2014-01-01
Differences in general cognitive ability (intelligence) account for approximately half of the variation in any large battery of cognitive tests and are predictive of important life events including health. Genome-wide analyses of common single-nucleotide polymorphisms indicate that they jointly tag between a quarter and a half of the variance in intelligence. However, no single polymorphism has been reliably associated with variation in intelligence. It remains possible that these many small effects might be aggregated in networks of functionally linked genes. Here, we tested a network of 1461 genes in the postsynaptic density and associated complexes for an enriched association with intelligence. These were ascertained in 3511 individuals (the Cognitive Ageing Genetics in England and Scotland (CAGES) consortium) phenotyped for general cognitive ability, fluid cognitive ability, crystallised cognitive ability, memory and speed of processing. By analysing the results of a genome wide association study (GWAS) using Gene Set Enrichment Analysis, a significant enrichment was found for fluid cognitive ability for the proteins found in the complexes of N-methyl-D-aspartate receptor complex; P=0.002. Replication was sought in two additional cohorts (N=670 and 2062). A meta-analytic P-value of 0.003 was found when these were combined with the CAGES consortium. The results suggest that genetic variation in the macromolecular machines formed by membrane-associated guanylate kinase (MAGUK) scaffold proteins and their interaction partners contributes to variation in intelligence. PMID:24399044
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna
Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. In this paper, we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains, including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated themore » identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. Finally, these efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches.« less
Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna; ...
2015-04-09
Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. In this paper, we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains, including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated themore » identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. Finally, these efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches.« less
Duncan, Katherine R.; Crüsemann, Max; Lechner, Anna; Sarkar, Anindita; Li, Jie; Ziemert, Nadine; Wang, Mingxun; Bandeira, Nuno; Moore, Bradley S.; Dorrestein, Pieter C.; Jensen, Paul R.
2015-01-01
Summary Genome sequencing has revealed that bacteria contain many more biosynthetic gene clusters than predicted based on the number of secondary metabolites discovered to date. While this biosynthetic reservoir has fostered interest in new tools for natural product discovery, there remains a gap between gene cluster detection and compound discovery. Here we apply molecular networking and the new concept of pattern-based genome mining to 35 Salinispora strains including 30 for which draft genome sequences were either available or obtained for this study. The results provide a method to simultaneously compare large numbers of complex microbial extracts, which facilitated the identification of media components, known compounds and their derivatives, and new compounds that could be prioritized for structure elucidation. These efforts revealed considerable metabolite diversity and led to several molecular family-gene cluster pairings, of which the quinomycin-type depsipeptide retimycin A was characterized and linked to gene cluster NRPS40 using pattern-based bioinformatic approaches. PMID:25865308
NASA Astrophysics Data System (ADS)
Bartell, Jennifer A.; Blazier, Anna S.; Yen, Phillip; Thøgersen, Juliane C.; Jelsbak, Lars; Goldberg, Joanna B.; Papin, Jason A.
2017-03-01
Virulence-linked pathways in opportunistic pathogens are putative therapeutic targets that may be associated with less potential for resistance than targets in growth-essential pathways. However, efficacy of virulence-linked targets may be affected by the contribution of virulence-related genes to metabolism. We evaluate the complex interrelationships between growth and virulence-linked pathways using a genome-scale metabolic network reconstruction of Pseudomonas aeruginosa strain PA14 and an updated, expanded reconstruction of P. aeruginosa strain PAO1. The PA14 reconstruction accounts for the activity of 112 virulence-linked genes and virulence factor synthesis pathways that produce 17 unique compounds. We integrate eight published genome-scale mutant screens to validate gene essentiality predictions in rich media, contextualize intra-screen discrepancies and evaluate virulence-linked gene distribution across essentiality datasets. Computational screening further elucidates interconnectivity between inhibition of virulence factor synthesis and growth. Successful validation of selected gene perturbations using PA14 transposon mutants demonstrates the utility of model-driven screening of therapeutic targets.
Molecular Interaction Map of the Mammalian Cell Cycle Control and DNA Repair Systems
Kohn, Kurt W.
1999-01-01
Eventually to understand the integrated function of the cell cycle regulatory network, we must organize the known interactions in the form of a diagram, map, and/or database. A diagram convention was designed capable of unambiguous representation of networks containing multiprotein complexes, protein modifications, and enzymes that are substrates of other enzymes. To facilitate linkage to a database, each molecular species is symbolically represented only once in each diagram. Molecular species can be located on the map by means of indexed grid coordinates. Each interaction is referenced to an annotation list where pertinent information and references can be found. Parts of the network are grouped into functional subsystems. The map shows how multiprotein complexes could assemble and function at gene promoter sites and at sites of DNA damage. It also portrays the richness of connections between the p53-Mdm2 subsystem and other parts of the network. PMID:10436023
Hybrid stochastic simplifications for multiscale gene networks
Crudu, Alina; Debussche, Arnaud; Radulescu, Ovidiu
2009-01-01
Background Stochastic simulation of gene networks by Markov processes has important applications in molecular biology. The complexity of exact simulation algorithms scales with the number of discrete jumps to be performed. Approximate schemes reduce the computational time by reducing the number of simulated discrete events. Also, answering important questions about the relation between network topology and intrinsic noise generation and propagation should be based on general mathematical results. These general results are difficult to obtain for exact models. Results We propose a unified framework for hybrid simplifications of Markov models of multiscale stochastic gene networks dynamics. We discuss several possible hybrid simplifications, and provide algorithms to obtain them from pure jump processes. In hybrid simplifications, some components are discrete and evolve by jumps, while other components are continuous. Hybrid simplifications are obtained by partial Kramers-Moyal expansion [1-3] which is equivalent to the application of the central limit theorem to a sub-model. By averaging and variable aggregation we drastically reduce simulation time and eliminate non-critical reactions. Hybrid and averaged simplifications can be used for more effective simulation algorithms and for obtaining general design principles relating noise to topology and time scales. The simplified models reproduce with good accuracy the stochastic properties of the gene networks, including waiting times in intermittence phenomena, fluctuation amplitudes and stationary distributions. The methods are illustrated on several gene network examples. Conclusion Hybrid simplifications can be used for onion-like (multi-layered) approaches to multi-scale biochemical systems, in which various descriptions are used at various scales. Sets of discrete and continuous variables are treated with different methods and are coupled together in a physically justified approach. PMID:19735554
So, Nina; Franks, Becca; Lim, Sean; Curley, James P
2015-01-01
Modelling complex social behavior in the laboratory is challenging and requires analyses of dyadic interactions occurring over time in a physically and socially complex environment. In the current study, we approached the analyses of complex social interactions in group-housed male CD1 mice living in a large vivarium. Intensive observations of social interactions during a 3-week period indicated that male mice form a highly linear and steep dominance hierarchy that is maintained by fighting and chasing behaviors. Individual animals were classified as dominant, sub-dominant or subordinate according to their David's Scores and I& SI ranking. Using a novel dynamic temporal Glicko rating method, we ascertained that the dominance hierarchy was stable across time. Using social network analyses, we characterized the behavior of individuals within 66 unique relationships in the social group. We identified two individual network metrics, Kleinberg's Hub Centrality and Bonacich's Power Centrality, as accurate predictors of individual dominance and power. Comparing across behaviors, we establish that agonistic, grooming and sniffing social networks possess their own distinctive characteristics in terms of density, average path length, reciprocity out-degree centralization and out-closeness centralization. Though grooming ties between individuals were largely independent of other social networks, sniffing relationships were highly predictive of the directionality of agonistic relationships. Individual variation in dominance status was associated with brain gene expression, with more dominant individuals having higher levels of corticotropin releasing factor mRNA in the medial and central nuclei of the amygdala and the medial preoptic area of the hypothalamus, as well as higher levels of hippocampal glucocorticoid receptor and brain-derived neurotrophic factor mRNA. This study demonstrates the potential and significance of combining complex social housing and intensive behavioral characterization of group-living animals with the utilization of novel statistical methods to further our understanding of the neurobiological basis of social behavior at the individual, relationship and group levels.
So, Nina; Franks, Becca; Lim, Sean; Curley, James P.
2015-01-01
Modelling complex social behavior in the laboratory is challenging and requires analyses of dyadic interactions occurring over time in a physically and socially complex environment. In the current study, we approached the analyses of complex social interactions in group-housed male CD1 mice living in a large vivarium. Intensive observations of social interactions during a 3-week period indicated that male mice form a highly linear and steep dominance hierarchy that is maintained by fighting and chasing behaviors. Individual animals were classified as dominant, sub-dominant or subordinate according to their David’s Scores and I& SI ranking. Using a novel dynamic temporal Glicko rating method, we ascertained that the dominance hierarchy was stable across time. Using social network analyses, we characterized the behavior of individuals within 66 unique relationships in the social group. We identified two individual network metrics, Kleinberg’s Hub Centrality and Bonacich’s Power Centrality, as accurate predictors of individual dominance and power. Comparing across behaviors, we establish that agonistic, grooming and sniffing social networks possess their own distinctive characteristics in terms of density, average path length, reciprocity out-degree centralization and out-closeness centralization. Though grooming ties between individuals were largely independent of other social networks, sniffing relationships were highly predictive of the directionality of agonistic relationships. Individual variation in dominance status was associated with brain gene expression, with more dominant individuals having higher levels of corticotropin releasing factor mRNA in the medial and central nuclei of the amygdala and the medial preoptic area of the hypothalamus, as well as higher levels of hippocampal glucocorticoid receptor and brain-derived neurotrophic factor mRNA. This study demonstrates the potential and significance of combining complex social housing and intensive behavioral characterization of group-living animals with the utilization of novel statistical methods to further our understanding of the neurobiological basis of social behavior at the individual, relationship and group levels. PMID:26226265
Deng, Ye; Zhang, Ping; Qin, Yujia; Tu, Qichao; Yang, Yunfeng; He, Zhili; Schadt, Christopher Warren; Zhou, Jizhong
2016-01-01
Discerning network interactions among different species/populations in microbial communities has evoked substantial interests in recent years, but little information is available about temporal dynamics of microbial network interactions in response to environmental perturbations. Here, we modified the random matrix theory-based network approach to discern network succession in groundwater microbial communities in response to emulsified vegetable oil (EVO) amendment for uranium bioremediation. Groundwater microbial communities from one control and seven monitor wells were analysed with a functional gene array (GeoChip 3.0), and functional molecular ecological networks (fMENs) at different time points were reconstructed. Our results showed that the network interactions were dramatically altered by EVO amendment. Dynamic and resilient succession was evident: fairly simple at the initial stage (Day 0), increasingly complex at the middle period (Days 4, 17, 31), most complex at Day 80, and then decreasingly complex at a later stage (140-269 days). Unlike previous studies in other habitats, negative interactions predominated in a time-series fMEN, suggesting strong competition among different microbial species in the groundwater systems after EVO injection. Particularly, several keystone sulfate-reducing bacteria showed strong negative interactions with their network neighbours. These results provide mechanistic understanding of the decreased phylogenetic diversity during environmental perturbations. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.
AlignNemo: a local network alignment method to integrate homology and topology.
Ciriello, Giovanni; Mina, Marco; Guzzi, Pietro H; Cannataro, Mario; Guerra, Concettina
2012-01-01
Local network alignment is an important component of the analysis of protein-protein interaction networks that may lead to the identification of evolutionary related complexes. We present AlignNemo, a new algorithm that, given the networks of two organisms, uncovers subnetworks of proteins that relate in biological function and topology of interactions. The discovered conserved subnetworks have a general topology and need not to correspond to specific interaction patterns, so that they more closely fit the models of functional complexes proposed in the literature. The algorithm is able to handle sparse interaction data with an expansion process that at each step explores the local topology of the networks beyond the proteins directly interacting with the current solution. To assess the performance of AlignNemo, we ran a series of benchmarks using statistical measures as well as biological knowledge. Based on reference datasets of protein complexes, AlignNemo shows better performance than other methods in terms of both precision and recall. We show our solutions to be biologically sound using the concept of semantic similarity applied to Gene Ontology vocabularies. The binaries of AlignNemo and supplementary details about the algorithms and the experiments are available at: sourceforge.net/p/alignnemo.
Impact of gene gains, losses and duplication modes on the origin and diversification of vertebrates.
Cañestro, Cristian; Albalat, Ricard; Irimia, Manuel; Garcia-Fernàndez, Jordi
2013-02-01
The study of the evolutionary origin of vertebrates has been linked to the study of genome duplications since Susumo Ohno suggested that the successful diversification of vertebrate innovations was facilitated by two rounds of whole-genome duplication (2R-WGD) in the stem vertebrate. Since then, studies on the functional evolution of many genes duplicated in the vertebrate lineage have provided the grounds to support experimentally this link. This article reviews cases of gene duplications derived either from the 2R-WGD or from local gene duplication events in vertebrates, analyzing their impact on the evolution of developmental innovations. We analyze how gene regulatory networks can be rewired by the activity of transposable elements after genome duplications, discuss how different mechanisms of duplication might affect the fate of duplicated genes, and how the loss of gene duplicates might influence the fate of surviving paralogs. We also discuss the evolutionary relationships between gene duplication and alternative splicing, in particular in the vertebrate lineage. Finally, we discuss the role that the 2R-WGD might have played in the evolution of vertebrate developmental gene networks, paying special attention to those related to vertebrate key features such as neural crest cells, placodes, and the complex tripartite brain. In this context, we argue that current evidences points that the 2R-WGD may not be linked to the origin of vertebrate innovations, but to their subsequent diversification in a broad variety of complex structures and functions that facilitated the successful transition from peaceful filter-feeding non-vertebrate ancestors to voracious vertebrate predators. Copyright © 2013 Elsevier Ltd. All rights reserved.
Gong, Wuming; Koyano-Nakagawa, Naoko; Li, Tongbin; Garry, Daniel J
2015-03-07
Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoter or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p <10(-100)), suggesting that a high LR score is a reliable indicator for functional TF binding sites. Our network inference model identified a region with an elevated LR score approximately -9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (-9435 to -8922 bp). TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP-CM transitions. We report a novel method to systematically integrate multi-dimensional -omics data and reconstruct the gene regulatory networks. This method will allow one to rapidly determine the cis-modules that regulate key genes during cardiac differentiation.
Boensch, C; Huang, S S; Connolly, D T; Huang, J S
1999-04-09
The cell surface retention sequence (CRS) binding protein-1 (CRSBP-1) is a newly identified membrane glycoprotein which is hypothesized to be responsible for cell surface retention of the oncogene v-sis and c-sis gene products and other secretory proteins containing CRSs. In simian sarcoma virus-transformed NIH 3T3 cells (SSV-NIH 3T3 cells), a fraction of CRSBP-1 was demonstrated at the cell surface and underwent internalization/recycling as revealed by cell surface 125I labeling and its resistance/sensitivity to trypsin digestion. However, the majority of CRSBP-1 was localized in intracellular compartments as evidenced by the resistance of most of the 35S-metabolically labeled CRSBP-1 to trypsin digestion, and by indirect immunofluorescent staining. CRSBP-1 appeared to form complexes with proteolytically processed forms (generated at and/or after the trans-Golgi network) of the v-sis gene product and with a approximately 140-kDa proteolytically cleaved form of the platelet-derived growth factor (PDGF) beta-type receptor, as demonstrated by metabolic labeling and co-immunoprecipitation. CRSBP-1, like the v-sis gene product and PDGF beta-type receptor, underwent rapid turnover which was blocked in the presence of 100 microM suramin. In normal and other transformed NIH 3T3 cells, CRSBP-1 was relatively stable and did not undergo rapid turnover and internalization/recycling at the cell surface. These results suggest that in SSV-NIH 3T3 cells, CRSBP-1 interacts with and forms ternary and binary complexes with the newly synthesized v-sis gene product and PDGF beta-type receptor at the trans-Golgi network and that the stable binary (CRSBP-1.v-sis gene product) complex is transported to the cell surface where it presents the v-sis gene product to unoccupied PDGF beta-type receptors during internalization/recycling.
Chen, Chi-Kan
2017-07-26
The identification of genetic regulatory networks (GRNs) provides insights into complex cellular processes. A class of recurrent neural networks (RNNs) captures the dynamics of GRN. Algorithms combining the RNN and machine learning schemes were proposed to reconstruct small-scale GRNs using gene expression time series. We present new GRN reconstruction methods with neural networks. The RNN is extended to a class of recurrent multilayer perceptrons (RMLPs) with latent nodes. Our methods contain two steps: the edge rank assignment step and the network construction step. The former assigns ranks to all possible edges by a recursive procedure based on the estimated weights of wires of RNN/RMLP (RE RNN /RE RMLP ), and the latter constructs a network consisting of top-ranked edges under which the optimized RNN simulates the gene expression time series. The particle swarm optimization (PSO) is applied to optimize the parameters of RNNs and RMLPs in a two-step algorithm. The proposed RE RNN -RNN and RE RMLP -RNN algorithms are tested on synthetic and experimental gene expression time series of small GRNs of about 10 genes. The experimental time series are from the studies of yeast cell cycle regulated genes and E. coli DNA repair genes. The unstable estimation of RNN using experimental time series having limited data points can lead to fairly arbitrary predicted GRNs. Our methods incorporate RNN and RMLP into a two-step structure learning procedure. Results show that the RE RMLP using the RMLP with a suitable number of latent nodes to reduce the parameter dimension often result in more accurate edge ranks than the RE RNN using the regularized RNN on short simulated time series. Combining by a weighted majority voting rule the networks derived by the RE RMLP -RNN using different numbers of latent nodes in step one to infer the GRN, the method performs consistently and outperforms published algorithms for GRN reconstruction on most benchmark time series. The framework of two-step algorithms can potentially incorporate with different nonlinear differential equation models to reconstruct the GRN.
A comparative study of covariance selection models for the inference of gene regulatory networks.
Stifanelli, Patrizia F; Creanza, Teresa M; Anglani, Roberto; Liuzzi, Vania C; Mukherjee, Sayan; Schena, Francesco P; Ancona, Nicola
2013-10-01
The inference, or 'reverse-engineering', of gene regulatory networks from expression data and the description of the complex dependency structures among genes are open issues in modern molecular biology. In this paper we compared three regularized methods of covariance selection for the inference of gene regulatory networks, developed to circumvent the problems raising when the number of observations n is smaller than the number of genes p. The examined approaches provided three alternative estimates of the inverse covariance matrix: (a) the 'PINV' method is based on the Moore-Penrose pseudoinverse, (b) the 'RCM' method performs correlation between regression residuals and (c) 'ℓ(2C)' method maximizes a properly regularized log-likelihood function. Our extensive simulation studies showed that ℓ(2C) outperformed the other two methods having the most predictive partial correlation estimates and the highest values of sensitivity to infer conditional dependencies between genes even when a few number of observations was available. The application of this method for inferring gene networks of the isoprenoid biosynthesis pathways in Arabidopsis thaliana allowed to enlighten a negative partial correlation coefficient between the two hubs in the two isoprenoid pathways and, more importantly, provided an evidence of cross-talk between genes in the plastidial and the cytosolic pathways. When applied to gene expression data relative to a signature of HRAS oncogene in human cell cultures, the method revealed 9 genes (p-value<0.0005) directly interacting with HRAS, sharing the same Ras-responsive binding site for the transcription factor RREB1. This result suggests that the transcriptional activation of these genes is mediated by a common transcription factor downstream of Ras signaling. Software implementing the methods in the form of Matlab scripts are available at: http://users.ba.cnr.it/issia/iesina18/CovSelModelsCodes.zip. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
C. elegans network biology: a beginning.
Piano, Fabio; Gunsalus, Kristin C; Hill, David E; Vidal, Marc
2006-01-01
The architecture and dynamics of molecular networks can provide an understanding of complex biological processes complementary to that obtained from the in-depth study of single genes and proteins. With a completely sequenced and well-annotated genome, a fully characterized cell lineage, and powerful tools available to dissect development, Caenorhabditis elegans, among metazoans, provides an optimal system to bridge cellular and organismal biology with the global properties of macromolecular networks. This chapter considers omic technologies available for C. elegans to describe molecular networks--encompassing transcriptional and phenotypic profiling as well as physical interaction mapping--and discusses how their individual and integrated applications are paving the way for a network-level understanding of C. elegans biology. PMID:18050437
BioLayout(Java): versatile network visualisation of structural and functional relationships.
Goldovsky, Leon; Cases, Ildefonso; Enright, Anton J; Ouzounis, Christos A
2005-01-01
Visualisation of biological networks is becoming a common task for the analysis of high-throughput data. These networks correspond to a wide variety of biological relationships, such as sequence similarity, metabolic pathways, gene regulatory cascades and protein interactions. We present a general approach for the representation and analysis of networks of variable type, size and complexity. The application is based on the original BioLayout program (C-language implementation of the Fruchterman-Rheingold layout algorithm), entirely re-written in Java to guarantee portability across platforms. BioLayout(Java) provides broader functionality, various analysis techniques, extensions for better visualisation and a new user interface. Examples of analysis of biological networks using BioLayout(Java) are presented.
Core and region-enriched networks of behaviorally regulated genes and the singing genome
Whitney, Osceola; Pfenning, Andreas R.; Howard, Jason T.; Blatti, Charles A; Liu, Fang; Ward, James M.; Wang, Rui; Audet, Jean-Nicolas; Kellis, Manolis; Mukherjee, Sayan; Sinha, Saurabh; Hartemink, Alexander J.; West, Anne E.; Jarvis, Erich D.
2015-01-01
Songbirds represent an important model organism for elucidating molecular mechanisms that link genes with complex behaviors, in part because they have discrete vocal learning circuits that have parallels with those that mediate human speech. We found that ~10% of the genes in the avian genome were regulated by singing, and we found a striking regional diversity of both basal and singing-induced programs in the four key song nuclei of the zebra finch, a vocal learning songbird. The region-enriched patterns were a result of distinct combinations of region-enriched transcription factors (TFs), their binding motifs, and presinging acetylation of histone 3 at lysine 27 (H3K27ac) enhancer activity in the regulatory regions of the associated genes. RNA interference manipulations validated the role of the calcium-response transcription factor (CaRF) in regulating genes preferentially expressed in specific song nuclei in response to singing. Thus, differential combinatorial binding of a small group of activity-regulated TFs and predefined epigenetic enhancer activity influences the anatomical diversity of behaviorally regulated gene networks. PMID:25504732
Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae
Reguly, Teresa; Breitkreutz, Ashton; Boucher, Lorrie; Breitkreutz, Bobby-Joe; Hon, Gary C; Myers, Chad L; Parsons, Ainslie; Friesen, Helena; Oughtred, Rose; Tong, Amy; Stark, Chris; Ho, Yuen; Botstein, David; Andrews, Brenda; Boone, Charles; Troyanskya, Olga G; Ideker, Trey; Dolinski, Kara; Batada, Nizar N; Tyers, Mike
2006-01-01
Background The study of complex biological networks and prediction of gene function has been enabled by high-throughput (HTP) methods for detection of genetic and protein interactions. Sparse coverage in HTP datasets may, however, distort network properties and confound predictions. Although a vast number of well substantiated interactions are recorded in the scientific literature, these data have not yet been distilled into networks that enable system-level inference. Results We describe here a comprehensive database of genetic and protein interactions, and associated experimental evidence, for the budding yeast Saccharomyces cerevisiae, as manually curated from over 31,793 abstracts and online publications. This literature-curated (LC) dataset contains 33,311 interactions, on the order of all extant HTP datasets combined. Surprisingly, HTP protein-interaction datasets currently achieve only around 14% coverage of the interactions in the literature. The LC network nevertheless shares attributes with HTP networks, including scale-free connectivity and correlations between interactions, abundance, localization, and expression. We find that essential genes or proteins are enriched for interactions with other essential genes or proteins, suggesting that the global network may be functionally unified. This interconnectivity is supported by a substantial overlap of protein and genetic interactions in the LC dataset. We show that the LC dataset considerably improves the predictive power of network-analysis approaches. The full LC dataset is available at the BioGRID () and SGD () databases. Conclusion Comprehensive datasets of biological interactions derived from the primary literature provide critical benchmarks for HTP methods, augment functional prediction, and reveal system-level attributes of biological networks. PMID:16762047
Fang, Jiansong; Gao, Li; Ma, Huili; Wu, Qihui; Wu, Tian; Wu, Jun; Wang, Qi; Cheng, Feixiong
2017-01-01
Aging that refers the accumulation of genetic and physiology changes in cells and tissues over a lifetime has been shown a high risk of developing various complex diseases, such as neurodegenerative disease, cardiovascular disease and cancer. Over the past several decades, natural products have been demonstrated as anti-aging interveners via extending lifespan and preventing aging-associated disorders. In this study, we developed an integrated systems pharmacology infrastructure to uncover new indications for aging-associated disorders by natural products. Specifically, we incorporated 411 high-quality aging-associated human genes or human-orthologous genes from mus musculus (MM), saccharomyces cerevisiae (SC), c aenorhabditis elegans (CE), and drosophila melanogaster (DM). We constructed a global drug-target network of natural products by integrating both experimental and computationally predicted drug-target interactions (DTI). We further built the statistical network models for identification of new anti-aging indications of natural products through integration of the curated aging-associated genes and drug-target network of natural products. High accuracy was achieved on the network models. We showcased several network-predicted anti-aging indications of four typical natural products (caffeic acid, metformin, myricetin, and resveratrol) with new mechanism-of-actions. In summary, this study offers a powerful systems pharmacology infrastructure to identify natural products for treatment of aging-associated disorders.
Fang, Jiansong; Gao, Li; Ma, Huili; Wu, Qihui; Wu, Tian; Wu, Jun; Wang, Qi; Cheng, Feixiong
2017-01-01
Aging that refers the accumulation of genetic and physiology changes in cells and tissues over a lifetime has been shown a high risk of developing various complex diseases, such as neurodegenerative disease, cardiovascular disease and cancer. Over the past several decades, natural products have been demonstrated as anti-aging interveners via extending lifespan and preventing aging-associated disorders. In this study, we developed an integrated systems pharmacology infrastructure to uncover new indications for aging-associated disorders by natural products. Specifically, we incorporated 411 high-quality aging-associated human genes or human-orthologous genes from mus musculus (MM), saccharomyces cerevisiae (SC), caenorhabditis elegans (CE), and drosophila melanogaster (DM). We constructed a global drug-target network of natural products by integrating both experimental and computationally predicted drug-target interactions (DTI). We further built the statistical network models for identification of new anti-aging indications of natural products through integration of the curated aging-associated genes and drug-target network of natural products. High accuracy was achieved on the network models. We showcased several network-predicted anti-aging indications of four typical natural products (caffeic acid, metformin, myricetin, and resveratrol) with new mechanism-of-actions. In summary, this study offers a powerful systems pharmacology infrastructure to identify natural products for treatment of aging-associated disorders. PMID:29093681
From integrative genomics to systems genetics in the rat to link genotypes to phenotypes
Moreno-Moral, Aida
2016-01-01
ABSTRACT Complementary to traditional gene mapping approaches used to identify the hereditary components of complex diseases, integrative genomics and systems genetics have emerged as powerful strategies to decipher the key genetic drivers of molecular pathways that underlie disease. Broadly speaking, integrative genomics aims to link cellular-level traits (such as mRNA expression) to the genome to identify their genetic determinants. With the characterization of several cellular-level traits within the same system, the integrative genomics approach evolved into a more comprehensive study design, called systems genetics, which aims to unravel the complex biological networks and pathways involved in disease, and in turn map their genetic control points. The first fully integrated systems genetics study was carried out in rats, and the results, which revealed conserved trans-acting genetic regulation of a pro-inflammatory network relevant to type 1 diabetes, were translated to humans. Many studies using different organisms subsequently stemmed from this example. The aim of this Review is to describe the most recent advances in the fields of integrative genomics and systems genetics applied in the rat, with a focus on studies of complex diseases ranging from inflammatory to cardiometabolic disorders. We aim to provide the genetics community with a comprehensive insight into how the systems genetics approach came to life, starting from the first integrative genomics strategies [such as expression quantitative trait loci (eQTLs) mapping] and concluding with the most sophisticated gene network-based analyses in multiple systems and disease states. Although not limited to studies that have been directly translated to humans, we will focus particularly on the successful investigations in the rat that have led to primary discoveries of genes and pathways relevant to human disease. PMID:27736746
Hejnol, Andreas; Lowe, Christopher J
2015-12-19
Molecular biology has provided a rich dataset to develop hypotheses of nervous system evolution. The startling patterning similarities between distantly related animals during the development of their central nervous system (CNS) have resulted in the hypothesis that a CNS with a single centralized medullary cord and a partitioned brain is homologous across bilaterians. However, the ability to precisely reconstruct ancestral neural architectures from molecular genetic information requires that these gene networks specifically map with particular neural anatomies. A growing body of literature representing the development of a wider range of metazoan neural architectures demonstrates that patterning gene network complexity is maintained in animals with more modest levels of neural complexity. Furthermore, a robust phylogenetic framework that provides the basis for testing the congruence of these homology hypotheses has been lacking since the advent of the field of 'evo-devo'. Recent progress in molecular phylogenetics is refining the necessary framework to test previous homology statements that span large evolutionary distances. In this review, we describe recent advances in animal phylogeny and exemplify for two neural characters-the partitioned brain of arthropods and the ventral centralized nerve cords of annelids-a test for congruence using this framework. The sequential sister taxa at the base of Ecdysozoa and Spiralia comprise small, interstitial groups. This topology is not consistent with the hypothesis of homology of tripartitioned brain of arthropods and vertebrates as well as the ventral arthropod and rope-like ladder nervous system of annelids. There can be exquisite conservation of gene regulatory networks between distantly related groups with contrasting levels of nervous system centralization and complexity. Consequently, the utility of molecular characters to reconstruct ancestral neural organization in deep time is limited. © 2015 The Authors.
Hejnol, Andreas; Lowe, Christopher J.
2015-01-01
Molecular biology has provided a rich dataset to develop hypotheses of nervous system evolution. The startling patterning similarities between distantly related animals during the development of their central nervous system (CNS) have resulted in the hypothesis that a CNS with a single centralized medullary cord and a partitioned brain is homologous across bilaterians. However, the ability to precisely reconstruct ancestral neural architectures from molecular genetic information requires that these gene networks specifically map with particular neural anatomies. A growing body of literature representing the development of a wider range of metazoan neural architectures demonstrates that patterning gene network complexity is maintained in animals with more modest levels of neural complexity. Furthermore, a robust phylogenetic framework that provides the basis for testing the congruence of these homology hypotheses has been lacking since the advent of the field of ‘evo-devo’. Recent progress in molecular phylogenetics is refining the necessary framework to test previous homology statements that span large evolutionary distances. In this review, we describe recent advances in animal phylogeny and exemplify for two neural characters—the partitioned brain of arthropods and the ventral centralized nerve cords of annelids—a test for congruence using this framework. The sequential sister taxa at the base of Ecdysozoa and Spiralia comprise small, interstitial groups. This topology is not consistent with the hypothesis of homology of tripartitioned brain of arthropods and vertebrates as well as the ventral arthropod and rope-like ladder nervous system of annelids. There can be exquisite conservation of gene regulatory networks between distantly related groups with contrasting levels of nervous system centralization and complexity. Consequently, the utility of molecular characters to reconstruct ancestral neural organization in deep time is limited. PMID:26554039
From integrative genomics to systems genetics in the rat to link genotypes to phenotypes.
Moreno-Moral, Aida; Petretto, Enrico
2016-10-01
Complementary to traditional gene mapping approaches used to identify the hereditary components of complex diseases, integrative genomics and systems genetics have emerged as powerful strategies to decipher the key genetic drivers of molecular pathways that underlie disease. Broadly speaking, integrative genomics aims to link cellular-level traits (such as mRNA expression) to the genome to identify their genetic determinants. With the characterization of several cellular-level traits within the same system, the integrative genomics approach evolved into a more comprehensive study design, called systems genetics, which aims to unravel the complex biological networks and pathways involved in disease, and in turn map their genetic control points. The first fully integrated systems genetics study was carried out in rats, and the results, which revealed conserved trans-acting genetic regulation of a pro-inflammatory network relevant to type 1 diabetes, were translated to humans. Many studies using different organisms subsequently stemmed from this example. The aim of this Review is to describe the most recent advances in the fields of integrative genomics and systems genetics applied in the rat, with a focus on studies of complex diseases ranging from inflammatory to cardiometabolic disorders. We aim to provide the genetics community with a comprehensive insight into how the systems genetics approach came to life, starting from the first integrative genomics strategies [such as expression quantitative trait loci (eQTLs) mapping] and concluding with the most sophisticated gene network-based analyses in multiple systems and disease states. Although not limited to studies that have been directly translated to humans, we will focus particularly on the successful investigations in the rat that have led to primary discoveries of genes and pathways relevant to human disease. © 2016. Published by The Company of Biologists Ltd.
Variable-free exploration of stochastic models: a gene regulatory network example.
Erban, Radek; Frewen, Thomas A; Wang, Xiao; Elston, Timothy C; Coifman, Ronald; Nadler, Boaz; Kevrekidis, Ioannis G
2007-04-21
Finding coarse-grained, low-dimensional descriptions is an important task in the analysis of complex, stochastic models of gene regulatory networks. This task involves (a) identifying observables that best describe the state of these complex systems and (b) characterizing the dynamics of the observables. In a previous paper [R. Erban et al., J. Chem. Phys. 124, 084106 (2006)] the authors assumed that good observables were known a priori, and presented an equation-free approach to approximate coarse-grained quantities (i.e., effective drift and diffusion coefficients) that characterize the long-time behavior of the observables. Here we use diffusion maps [R. Coifman et al., Proc. Natl. Acad. Sci. U.S.A. 102, 7426 (2005)] to extract appropriate observables ("reduction coordinates") in an automated fashion; these involve the leading eigenvectors of a weighted Laplacian on a graph constructed from network simulation data. We present lifting and restriction procedures for translating between physical variables and these data-based observables. These procedures allow us to perform equation-free, coarse-grained computations characterizing the long-term dynamics through the design and processing of short bursts of stochastic simulation initialized at appropriate values of the data-based observables.
Dufour, Yann S.; Donohue, Timothy J.
2015-01-01
Transcriptional regulation plays a significant role in the biological response of bacteria to changing environmental conditions. Therefore, mapping transcriptional regulatory networks is an important step not only in understanding how bacteria sense and interpret their environment but also to identify the functions involved in biological responses to specific conditions. Recent experimental and computational developments have facilitated the characterization of regulatory networks on a genome-wide scale in model organisms. In addition, the multiplication of complete genome sequences has encouraged comparative analyses to detect conserved regulatory elements and infer regulatory networks in other less well-studied organisms. However, transcription regulation appears to evolve rapidly, thus, creating challenges for the transfer of knowledge to nonmodel organisms. Nevertheless, the mechanisms and constraints driving the evolution of regulatory networks have been the subjects of numerous analyses, and several models have been proposed. Overall, the contributions of mutations, recombination, and horizontal gene transfer are complex. Finally, the rapid evolution of regulatory networks plays a significant role in the remarkable capacity of bacteria to adapt to new or changing environments. Conversely, the characteristics of environmental niches determine the selective pressures and can shape the structure of regulatory network accordingly. PMID:23046950
Reverse Engineering Validation using a Benchmark Synthetic Gene Circuit in Human Cells
Kang, Taek; White, Jacob T.; Xie, Zhen; Benenson, Yaakov; Sontag, Eduardo; Bleris, Leonidas
2013-01-01
Multi-component biological networks are often understood incompletely, in large part due to the lack of reliable and robust methodologies for network reverse engineering and characterization. As a consequence, developing automated and rigorously validated methodologies for unraveling the complexity of biomolecular networks in human cells remains a central challenge to life scientists and engineers. Today, when it comes to experimental and analytical requirements, there exists a great deal of diversity in reverse engineering methods, which renders the independent validation and comparison of their predictive capabilities difficult. In this work we introduce an experimental platform customized for the development and verification of reverse engineering and pathway characterization algorithms in mammalian cells. Specifically, we stably integrate a synthetic gene network in human kidney cells and use it as a benchmark for validating reverse engineering methodologies. The network, which is orthogonal to endogenous cellular signaling, contains a small set of regulatory interactions that can be used to quantify the reconstruction performance. By performing successive perturbations to each modular component of the network and comparing protein and RNA measurements, we study the conditions under which we can reliably reconstruct the causal relationships of the integrated synthetic network. PMID:23654266
Reverse engineering validation using a benchmark synthetic gene circuit in human cells.
Kang, Taek; White, Jacob T; Xie, Zhen; Benenson, Yaakov; Sontag, Eduardo; Bleris, Leonidas
2013-05-17
Multicomponent biological networks are often understood incompletely, in large part due to the lack of reliable and robust methodologies for network reverse engineering and characterization. As a consequence, developing automated and rigorously validated methodologies for unraveling the complexity of biomolecular networks in human cells remains a central challenge to life scientists and engineers. Today, when it comes to experimental and analytical requirements, there exists a great deal of diversity in reverse engineering methods, which renders the independent validation and comparison of their predictive capabilities difficult. In this work we introduce an experimental platform customized for the development and verification of reverse engineering and pathway characterization algorithms in mammalian cells. Specifically, we stably integrate a synthetic gene network in human kidney cells and use it as a benchmark for validating reverse engineering methodologies. The network, which is orthogonal to endogenous cellular signaling, contains a small set of regulatory interactions that can be used to quantify the reconstruction performance. By performing successive perturbations to each modular component of the network and comparing protein and RNA measurements, we study the conditions under which we can reliably reconstruct the causal relationships of the integrated synthetic network.
B-cell Ligand Processing Pathways Detected by Large-scale Comparative Analysis
Towfic, Fadi; Gupta, Shakti; Honavar, Vasant; Subramaniam, Shankar
2012-01-01
The initiation of B-cell ligand recognition is a critical step for the generation of an immune response against foreign bodies. We sought to identify the biochemical pathways involved in the B-cell ligand recognition cascade and sets of ligands that trigger similar immunological responses. We utilized several comparative approaches to analyze the gene coexpression networks generated from a set of microarray experiments spanning 33 different ligands. First, we compared the degree distributions of the generated networks. Second, we utilized a pairwise network alignment algorithm, BiNA, to align the networks based on the hubs in the networks. Third, we aligned the networks based on a set of KEGG pathways. We summarized our results by constructing a consensus hierarchy of pathways that are involved in B cell ligand recognition. The resulting pathways were further validated through literature for their common physiological responses. Collectively, the results based on our comparative analyses of degree distributions, alignment of hubs, and alignment based on KEGG pathways provide a basis for molecular characterization of the immune response states of B-cells and demonstrate the power of comparative approaches (e.g., gene coexpression network alignment algorithms) in elucidating biochemical pathways involved in complex signaling events in cells. PMID:22917187