determination gene network: Topics by Science.gov

Sample records for determination gene network

Analysis of gene network robustness based on saturated fixed point attractors

PubMed Central

2014-01-01

The analysis of gene network robustness to noise and mutation is important for fundamental and practical reasons. Robustness refers to the stability of the equilibrium expression state of a gene network to variations of the initial expression state and network topology. Numerical simulation of these variations is commonly used for the assessment of robustness. Since there exists a great number of possible gene network topologies and initial states, even millions of simulations may still be too few to give reliable results. When the initial and equilibrium expression states are restricted to being saturated (i.e., their elements can only take values 1 or −1 corresponding to maximum activation and maximum repression of genes), an analytical gene network robustness assessment is possible. We present this analytical treatment based on determination of the saturated fixed point attractors for sigmoidal function models. The analysis can determine (a) for a given network, which and how many saturated equilibrium states exist and which and how many saturated initial states converge to each of these saturated equilibrium states and (b) for a given saturated equilibrium state or a given pair of saturated equilibrium and initial states, which and how many gene networks, referred to as viable, share this saturated equilibrium state or the pair of saturated equilibrium and initial states. We also show that the viable networks sharing a given saturated equilibrium state must follow certain patterns. These capabilities of the analytical treatment make it possible to properly define and accurately determine robustness to noise and mutation for gene networks. Previous network research conclusions drawn from performing millions of simulations follow directly from the results of our analytical treatment. Furthermore, the analytical results provide criteria for the identification of model validity and suggest modified models of gene network dynamics. The yeast cell-cycle network is used as an illustration of the practical application of this analytical treatment. PMID:24650364
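Question (a) in the abstract above can be illustrated by brute force on a very small network. The sketch below is not the paper's analytical method. It assumes a synchronous update s(t+1) = sign(W s(t)), taken here as the steep limit of a sigmoidal model; the weight matrix, the function names, and the rule that a zero input maps to −1 are all assumptions made for illustration.

```python
from itertools import product

def sign(x):
    # Tie rule (assumption): a net input of zero maps to -1 (repressed).
    return 1 if x > 0 else -1

def step(W, s):
    # One synchronous update: gene i receives sum_j W[i][j] * s[j].
    return tuple(sign(sum(w * x for w, x in zip(row, s))) for row in W)

def saturated_fixed_points(W):
    # Enumerate all states in {-1, 1}^n and keep those mapped to themselves.
    n = len(W)
    return [s for s in product((-1, 1), repeat=n) if step(W, s) == s]

def basin_sizes(W, max_steps=100):
    # Count how many saturated initial states converge to each fixed point.
    fixed = set(saturated_fixed_points(W))
    counts = {f: 0 for f in fixed}
    for s in product((-1, 1), repeat=len(W)):
        for _ in range(max_steps):
            if s in fixed:
                break
            s = step(W, s)
        if s in fixed:
            counts[s] += 1
    return counts

# Hypothetical two-gene network: weak self-activation, strong mutual repression.
W_EXAMPLE = [[1, -2], [-2, 1]]
print(saturated_fixed_points(W_EXAMPLE))
print(basin_sizes(W_EXAMPLE))
```

In this toy network the two mixed states are fixed points, while the remaining two states alternate with each other and never settle. Exhaustive enumeration grows as 2^n, which is the cost the analytical treatment is meant to avoid.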
Integrated Module and Gene-Specific Regulatory Inference Implicates Upstream Signaling Networks

PubMed Central

Roy, Sushmita; Lagree, Stephen; Hou, Zhonggang; Thomson, James A.; Stewart, Ron; Gasch, Audrey P.

2013-01-01

Regulatory networks that control gene expression are important in diverse biological contexts including stress response and development. Each gene's regulatory program is determined by module-level regulation (e.g. co-regulation via the same signaling system), as well as gene-specific determinants that can fine-tune expression. We present a novel approach, Modular regulatory network learning with per gene information (MERLIN), that infers regulatory programs for individual genes while probabilistically constraining these programs to reveal module-level organization of regulatory networks. Using edge-, regulator- and module-based comparisons of simulated networks of known ground truth, we find MERLIN reconstructs regulatory programs of individual genes as well as or better than existing approaches of network reconstruction, while additionally identifying modular organization of the regulatory networks. We use MERLIN to dissect global transcriptional behavior in two biological contexts: yeast stress response and human embryonic stem cell differentiation. Regulatory modules inferred by MERLIN capture co-regulatory relationships between signaling proteins and downstream transcription factors, thereby revealing the upstream signaling systems controlling transcriptional responses. The inferred networks are enriched for regulators with genetic or physical interactions, supporting the inference, and identify modules of functionally related genes bound by the same transcriptional regulators. Our method combines the strengths of per-gene and per-module methods to reveal new insights into transcriptional regulation in stress and development. PMID:24146602
Detecting recurrent gene mutation in interaction network context using multi-scale graph diffusion.

PubMed

Babaei, Sepideh; Hulsman, Marc; Reinders, Marcel; de Ridder, Jeroen

2013-01-23

Delineating the molecular drivers of cancer, i.e. determining cancer genes and the pathways which they deregulate, is an important challenge in cancer research. In this study, we aim to identify pathways of frequently mutated genes by exploiting their network neighborhood encoded in the protein-protein interaction network. To this end, we introduce a multi-scale diffusion kernel and apply it to a large collection of murine retroviral insertional mutagenesis data. The diffusion strength plays the role of a scale parameter, determining the size of the network neighborhood that is taken into account. As a result, in addition to detecting genes with frequent mutations in their genomic vicinity, we find genes that harbor frequent mutations in their interaction network context. We identify densely connected components of known and putatively novel cancer genes and demonstrate that they are strongly enriched for cancer-related pathways across the diffusion scales. Moreover, the mutations in the clusters exhibit a significant pattern of mutual exclusion, supporting the conjecture that such genes are functionally linked. Using the multi-scale diffusion kernel, various infrequently mutated genes are found to harbor significant numbers of mutations in their interaction network neighborhood. Many of them are well-known cancer genes. The results demonstrate the importance of defining recurrent mutations while taking into account the interaction network context. Importantly, the putative cancer genes and networks detected in this study are found to be significant at different diffusion scales, confirming the necessity of a multi-scale analysis.
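The role of diffusion strength as a scale parameter can be seen in a small sketch. The abstract does not give the kernel's form, so this example assumes the standard graph diffusion kernel K = exp(−βL), with L the graph Laplacian and β the diffusion strength. The four-node path graph, the mutation counts, and the function names are hypothetical.

```python
def laplacian(adj):
    # L = D - A for an undirected graph without self-loops.
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def diffusion_kernel(adj, beta, terms=40):
    # Truncated Taylor series of exp(-beta * L); adequate for small graphs
    # and moderate beta, not a general-purpose matrix exponential.
    n = len(adj)
    m = [[-beta * x for x in row] for row in laplacian(adj)]
    kernel = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        for i in range(n):
            for j in range(n):
                kernel[i][j] += term[i][j]
    return kernel

def smooth(kernel, counts):
    # Spread per-gene mutation counts over the network neighborhood.
    return [sum(k * c for k, c in zip(row, counts)) for row in kernel]

# Hypothetical path graph 0 - 1 - 2 - 3, with all mutations on node 0.
PATH_GRAPH = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
for beta in (0.1, 1.0):
    print(beta, smooth(diffusion_kernel(PATH_GRAPH, beta), [3, 0, 0, 0]))
```

A small β keeps almost all of the signal on the mutated node, while a larger β passes more of it to distant neighbors. Scanning several values of β is what makes the analysis multi-scale.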
Mining disease genes using integrated protein-protein interaction and gene-gene co-regulation information.

PubMed

Li, Jin; Wang, Limei; Guo, Maozu; Zhang, Ruijie; Dai, Qiguo; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Xuan, Ping; Zhang, Mingming

2015-01-01

In humans, despite the rapid increase in disease-associated gene discovery, a large proportion of disease-associated genes are still unknown. Many network-based approaches have been used to prioritize disease genes. Many networks, such as the protein-protein interaction (PPI), KEGG, and gene co-expression networks, have been used. Expression quantitative trait loci (eQTLs) have been successfully applied for the determination of genes associated with several diseases. In this study, we constructed an eQTL-based gene-gene co-regulation network (GGCRN) and used it to mine for disease genes. We adopted the random walk with restart (RWR) algorithm to mine for genes associated with Alzheimer disease. Compared to the Human Protein Reference Database (HPRD) PPI network alone, the integrated HPRD PPI and GGCRN networks provided faster convergence and revealed new disease-related genes. Therefore, using the RWR algorithm for integrated PPI and GGCRN is an effective method for disease-associated gene mining.
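The random walk with restart named above can be sketched in a few lines. This is the textbook iteration p ← (1 − r)·W·p + r·p0 on a column-normalized adjacency matrix, not the authors' code. The toy graph, the restart probability of 0.7, and the function name are assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    # W[i][j] is the probability of stepping from node j to node i.
    W = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    # Start (and restart) uniformly on the known disease genes.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

# Hypothetical path graph 0 - 1 - 2 - 3 with node 0 as the only seed gene.
TOY_GRAPH = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
scores = random_walk_with_restart(TOY_GRAPH, seeds={0})
print(scores)
```

Candidate genes are then ranked by their steady-state score. One simple way to integrate two networks, again an assumption rather than the paper's procedure, is to take the union of their edges before normalizing.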
Genomics of sex determination.

PubMed

Zhang, Jisen; Boualem, Adnane; Bendahmane, Abdelhafid; Ming, Ray

2014-04-01

Sex determination is a major switch in the evolutionary history of angiosperms, resulting in 11% monoecious and dioecious species. The genomic sequences of papaya sex chromosomes unveiled the molecular basis of recombination suppression in the sex determination region, and candidate genes for sex determination. Identification and analyses of sex determination genes in cucurbits and maize demonstrated conservation of the sex determination mechanism in one lineage and divergence between the two systems. Epigenetic control and hormonal influence of sex determination were elucidated in both plants and animals. Intensive investigation of potential sex determination genes in model species will improve our understanding of the sex determination gene network. Such a network will in turn accelerate the identification of sex determination genes in dioecious species with sex chromosomes, which is burdensome because there is no recombination in sex-determining regions. The sex determination genes in dioecious species are crucial for understanding the origin of dioecy and sex chromosomes, particularly in their early stage of evolution. Copyright © 2014 Elsevier Ltd. All rights reserved.
Optimization of neural network architecture using genetic programming improves detection and modeling of gene-gene interactions in studies of human diseases

PubMed Central

Ritchie, Marylyn D; White, Bill C; Parker, Joel S; Hahn, Lance W; Moore, Jason H

2003-01-01

Background: Appropriate definition of neural network architecture prior to data analysis is crucial for successful data mining. This can be challenging when the underlying model of the data is unknown. The goal of this study was to determine whether optimizing neural network architecture using genetic programming as a machine learning strategy would improve the ability of neural networks to model and detect nonlinear interactions among genes in studies of common human diseases. Results: Using simulated data, we show that a genetic programming optimized neural network approach is able to model gene-gene interactions as well as a traditional back propagation neural network. Furthermore, the genetic programming optimized neural network is better than the traditional back propagation neural network approach in terms of predictive ability and power to detect gene-gene interactions when non-functional polymorphisms are present. Conclusion: This study suggests that a machine learning strategy for optimizing neural network architecture may be preferable to traditional trial-and-error approaches for the identification and characterization of gene-gene interactions in common, complex human diseases. PMID:12846935
Stationary and structural control in gene regulatory networks: basic concepts

NASA Astrophysics Data System (ADS)

Dougherty, Edward R.; Pal, Ranadip; Qian, Xiaoning; Bittner, Michael L.; Datta, Aniruddha

2010-01-01

A major reason for constructing gene regulatory networks is to use them as models for determining therapeutic intervention strategies by deriving ways of altering their long-run dynamics in such a way as to reduce the likelihood of entering undesirable states. In general, two paradigms have been taken for gene network intervention: (1) stationary external control is based on optimally altering the status of a control gene (or genes) over time to drive network dynamics; and (2) structural intervention involves an optimal one-time change of the network structure (wiring) to beneficially alter the long-run behaviour of the network. These intervention approaches have mainly been developed within the context of the probabilistic Boolean network model for gene regulation. This article reviews both types of intervention and applies them to reducing the metastatic competence of cells via intervention in a melanoma-related network.
Elucidation of the transcription network governing mammalian sex determination by exploiting strain-specific susceptibility to sex reversal

PubMed Central

Munger, Steven C.; Aylor, David L.; Syed, Haider Ali; Magwene, Paul M.; Threadgill, David W.; Capel, Blanche

2009-01-01

Despite the identification of some key genes that regulate sex determination, most cases of disorders of sexual development remain unexplained. Evidence suggests that the sexual fate decision in the developing gonad depends on a complex network of interacting factors that converge on a critical threshold. To elucidate the transcriptional network underlying sex determination, we took the first expression quantitative trait loci (eQTL) approach in a developing organ. We identified reproducible differences in the transcriptome of the embryonic day 11.5 (E11.5) XY gonad between C57BL/6J (B6) and 129S1/SvImJ (129S1), indicating that the reported sensitivity of B6 to sex reversal is consistent with a higher expression of a female-like transcriptome in B6. Gene expression is highly variable in F2 XY gonads from B6 and 129S1 intercrosses, yet strong correlations emerged. We estimated the F2 coexpression network and predicted roles for genes of unknown function based on their connectivity and position within the network. A genetic analysis of the F2 population detected autosomal regions that control the expression of many sex-related genes, including Sry (sex-determining region of the Y chromosome) and Sox9 (Sry-box containing gene 9), the key regulators of male sex determination. Our results reveal the complex transcription architecture underlying sex determination, and provide a mechanism by which individuals may be sensitized for sex reversal. PMID:19884258
Efficient Reverse-Engineering of a Developmental Gene Regulatory Network

PubMed Central

Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes

2012-01-01

Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. 
Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to discover whether there are rules or regularities governing development and evolution of complex multi-cellular organisms. PMID:22807664
Hormonal response to bidirectional selection on social behavior

USDA-ARS's Scientific Manuscript database

Behavior is a quantitative trait determined through the actions of multiple genes. These genes form pleiotropic networks that are sensitive to environmental variation and genetic background. One aspect of behavioral gene networks that is of special interest includes effects during early development....
BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks.

PubMed

Maere, Steven; Heymans, Karel; Kuiper, Martin

2005-08-15

The Biological Networks Gene Ontology tool (BiNGO) is an open-source Java tool to determine which Gene Ontology (GO) terms are significantly overrepresented in a set of genes. BiNGO can be used either on a list of genes, pasted as text, or interactively on subgraphs of biological networks visualized in Cytoscape. BiNGO maps the predominant functional themes of the tested gene set on the GO hierarchy, and takes advantage of Cytoscape's versatile visualization environment to produce an intuitive and customizable visual representation of the results.
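An overrepresentation test of the kind described above can be written as a one-sided hypergeometric tail probability. The abstract does not say which statistic BiNGO applies, so the choice of test here is an assumption, as are the function name and the example numbers. A real analysis would also need a multiple-testing correction across GO terms.

```python
from math import comb

def enrichment_p_value(population, annotated, selected, hits):
    """P(X >= hits) when `selected` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, selected)
    favourable = sum(comb(annotated, i) * comb(population - annotated, selected - i)
                     for i in range(hits, upper + 1))
    return favourable / comb(population, selected)

# Hypothetical numbers: 10 genes in total, 5 carry the term,
# and all 5 selected genes turn out to carry it.
print(enrichment_p_value(population=10, annotated=5, selected=5, hits=5))
```

A small p-value means the term occurs in the selected gene set more often than random sampling from the whole annotation would produce.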
Network Analysis of Rodent Transcriptomes in Spaceflight

NASA Technical Reports Server (NTRS)

Ramachandran, Maya; Fogle, Homer; Costes, Sylvain

2017-01-01

Network analysis methods leverage prior knowledge of cellular systems and the statistical and conceptual relationships between analyte measurements to determine gene connectivity. Correlation and conditional metrics are used to infer a network topology and provide a systems-level context for cellular responses. Integration across multiple experimental conditions and omics domains can reveal the regulatory mechanisms that underlie gene expression. GeneLab has assembled rich multi-omic (transcriptomics, proteomics, epigenomics, and epitranscriptomics) datasets for multiple murine tissues from the Rodent Research 1 (RR-1) experiment. RR-1 assesses the impact of 37 days of spaceflight on gene expression across a variety of tissue types, such as adrenal glands, quadriceps, gastrocnemius, tibialis anterior, extensor digitorum longus, soleus, eye, and kidney. Network analysis is particularly useful for RR-1 -omics datasets because it reinforces subtle relationships that may be overlooked in isolated analyses and subdues confounding factors. Our objective is to use network analysis to determine potential target nodes for therapeutic intervention and identify similarities with existing disease models. Multiple network algorithms are used for a higher confidence consensus.
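The simplest correlation metric for inferring a topology is a thresholded Pearson correlation between expression profiles. The sketch below shows only that baseline. The gene names, expression values, and 0.8 cutoff are invented for illustration, and the abstract does not state which algorithms the authors actually used.

```python
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(vx * vy)

def correlation_network(expr, threshold=0.8):
    # Connect two genes when |r| reaches the threshold; keep r as edge weight.
    genes = sorted(expr)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(expr[g], expr[h])
            if abs(r) >= threshold:
                edges.append((g, h, r))
    return edges

# Hypothetical expression profiles across four samples.
EXPR = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [1, -1, 1, -1]}
print(correlation_network(EXPR))
```

Marginal correlation cannot separate direct from indirect associations. Conditional metrics such as partial correlation address this, and comparing several methods gives the higher-confidence consensus the abstract mentions.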
Transcriptome display during tilapia sex determination and differentiation as revealed by RNA-Seq analysis.

PubMed

Tao, Wenjing; Chen, Jinlin; Tan, Dejie; Yang, Jing; Sun, Lina; Wei, Jing; Conte, Matthew A; Kocher, Thomas D; Wang, Deshou

2018-05-15

The factors determining sex in teleosts are diverse. Great efforts have been made to characterize the underlying genetic network in various species. However, only seven master sex-determining genes have been identified in teleosts. While the function of a few genes involved in sex determination and differentiation has been studied, we are far from fully understanding how genes interact to coordinate this process. To enable systematic insights into fish sexual differentiation, we generated a dynamic co-expression network from tilapia gonadal transcriptomes at 5, 20, 30, 40, 90, and 180 dah (days after hatching), plus 45 and 90 dat (days after treatment), and linked gene expression profiles to both development and sexual differentiation. Transcriptomic profiles of female and male gonads at 5 and 20 dah exhibited high similarities except for a small number of genes that were involved in sex determination, while drastic changes were observed from 90 to 180 dah, with a group of differentially expressed genes that were involved in gonadal differentiation and gametogenesis. Weighted gene correlation network analysis identified changes in the expression of Borealin, Gtsf1, tesk1, Zar1, Cdn15, and Rpl that were correlated with the expression of genes previously known to be involved in sex differentiation, such as Foxl2, Cyp19a1a, Gsdf, Dmrt1, and Amh. Global gonadal gene expression kinetics during sex determination and differentiation have been extensively profiled in tilapia. These findings provide insights into the genetic framework underlying sex determination and sexual differentiation, and expand our current understanding of developmental pathways during teleost sex determination.
State Space Model with hidden variables for reconstruction of gene regulatory networks.

PubMed

Wu, Xi; Li, Peng; Wang, Nan; Gong, Ping; Perkins, Edward J; Deng, Youping; Zhang, Chaoyang

2011-01-01

State Space Model (SSM) is a relatively new approach to inferring gene regulatory networks. It requires less computational time than Dynamic Bayesian Networks (DBN). There are two types of variables in the linear SSM: observed variables and hidden variables. SSM uses an iterative method, namely Expectation-Maximization, to infer regulatory relationships from microarray datasets. The hidden variables cannot be directly observed from experiments. How the number of hidden variables is determined has a significant impact on the accuracy of network inference. In this study, we used SSM to infer gene regulatory networks (GRNs) from synthetic time series datasets, investigated Bayesian Information Criterion (BIC) and Principal Component Analysis (PCA) approaches to determining the number of hidden variables in SSM, and evaluated the performance of SSM in comparison with DBN. True GRNs and synthetic gene expression datasets were generated using GeneNetWeaver. Both DBN and linear SSM were used to infer GRNs from the synthetic datasets. The inferred networks were compared with the true networks. Our results show that inference precision varied with the number of hidden variables. For some regulatory networks, the inference precision of DBN was higher, but SSM performed better in other cases. Although the overall performance of the two approaches is comparable, SSM is much faster and capable of inferring much larger networks than DBN. This study provides useful information on handling the hidden variables and improving the inference precision.
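Choosing the number of hidden variables by BIC amounts to fitting the model for each candidate dimension and keeping the one with the lowest score, where BIC = k·ln(n) − 2·ln(L̂). The sketch below shows only that selection step. The log-likelihoods and parameter counts are invented placeholders standing in for the output of an EM fit, not results from the study, and the function names are assumptions.

```python
from math import log

def bic(log_likelihood, n_params, n_obs):
    # Lower is better: the k*ln(n) term penalizes extra parameters.
    return n_params * log(n_obs) - 2.0 * log_likelihood

def select_hidden_dim(candidates, n_obs):
    # candidates maps hidden dimension -> (maximized log-likelihood, parameter count).
    return min(candidates, key=lambda d: bic(*candidates[d], n_obs))

# Invented placeholder values for illustration only.
CANDIDATES = {
    1: (-520.0, 30),
    2: (-470.0, 45),
    3: (-465.0, 62),
    4: (-463.0, 81),
}
print(select_hidden_dim(CANDIDATES, n_obs=200))
```

With these placeholder numbers the likelihood improves only marginally beyond two hidden variables while the penalty keeps growing, so the criterion selects two.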
Genetic regulation of maize flower development and sex determination.

PubMed

Li, Qinglin; Liu, Baoshen

2017-01-01

The processes determining pistil fate are central to maize sex determination and are mainly regulated by a genetic network in which the sex-determining genes SILKLESS 1, TASSEL SEED 1, TASSEL SEED 2 and the paramutagenic locus Required to maintain repression 6 play pivotal roles. Maize silks, which emerge from the ear shoot and are derived from the pistil, are the functional stigmas of female flowers and play a pivotal role in pollination. Previous studies on sex-related mutants have revealed that sex-determining genes and phytohormones play an important role in the regulation of flower organogenesis. The processes determining pistil fate are central to flower development, where a silk identity gene, SILKLESS 1 (SK1), is required to protect pistil primordia from a cell death signal produced by two commonly known genes, TASSEL SEED 1 (TS1) and TASSEL SEED 2 (TS2). In this review, the maize flower developmental process is presented together with a focus on important sex-determining mutants and hormonal signaling affecting pistil development. The role of sex-determining genes, microRNAs, phytohormones, and the paramutagenic locus Required to maintain repression 6 (Rmr6) in forming a regulatory network that determines pistil fate is discussed. Cloning SK1 and clarifying its function were crucial in understanding the regulation network of sex determination. The signaling mechanisms of phytohormones in sex determination are also an important research focus.
Enriching regulatory networks by bootstrap learning using optimised GO-based gene similarity and gene links mined from PubMed abstracts

DOE Office of Scientific and Technical Information (OSTI.GOV)

Taylor, Ronald C.; Sanfilippo, Antonio P.; McDermott, Jason E.

2011-02-18

Transcriptional regulatory networks are being determined using “reverse engineering” methods that infer connections based on correlations in gene state. Corroboration of such networks through independent means such as evidence from the biomedical literature is desirable. Here, we explore a novel approach, a bootstrapping version of our previous Cross-Ontological Analytic method (XOA) that can be used for semi-automated annotation and verification of inferred regulatory connections, as well as for discovery of additional functional relationships between the genes. First, we use our annotation and network expansion method on a biological network learned entirely from the literature. We show how new relevant links between genes can be iteratively derived using a gene similarity measure based on the Gene Ontology that is optimized on the input network at each iteration. Second, we apply our method to annotation, verification, and expansion of a set of regulatory connections found by the Context Likelihood of Relatedness algorithm.
Transcriptional Regulatory Networks in Saccharomyces cerevisiae

NASA Astrophysics Data System (ADS)

Lee, Tong Ihn; Rinaldi, Nicola J.; Robert, François; Odom, Duncan T.; Bar-Joseph, Ziv; Gerber, Georg K.; Hannett, Nancy M.; Harbison, Christopher T.; Thompson, Craig M.; Simon, Itamar; Zeitlinger, Julia; Jennings, Ezra G.; Murray, Heather L.; Gordon, D. Benjamin; Ren, Bing; Wyrick, John J.; Tagne, Jean-Bosco; Volkert, Thomas L.; Fraenkel, Ernest; Gifford, David K.; Young, Richard A.

2002-10-01

We have determined how most of the transcriptional regulators encoded in the eukaryote Saccharomyces cerevisiae associate with genes across the genome in living cells. Just as maps of metabolic networks describe the potential pathways that may be used by a cell to accomplish metabolic processes, this network of regulator-gene interactions describes potential pathways yeast cells can use to regulate global gene expression programs. We use this information to identify network motifs, the simplest units of network architecture, and demonstrate that an automated process can use motifs to assemble a transcriptional regulatory network structure. Our results reveal that eukaryotic cellular functions are highly connected through networks of transcriptional regulators that regulate other transcriptional regulators.
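Motif identification can be illustrated with one commonly discussed motif type, the feed-forward loop, in which regulator A controls regulator B and both control target C. The sketch below is a plain enumeration over a directed edge list, not the authors' pipeline, and the node names are hypothetical.

```python
def feed_forward_loops(edges):
    """Return sorted (a, b, c) triples with edges a->b, b->c and a->c."""
    targets = {}
    for regulator, target in edges:
        targets.setdefault(regulator, set()).add(target)
    loops = []
    for a, a_targets in targets.items():
        for b in a_targets:
            if b == a:
                continue  # skip autoregulation
            for c in targets.get(b, ()):
                if c != a and c != b and c in a_targets:
                    loops.append((a, b, c))
    return sorted(loops)

# Hypothetical regulator-gene interactions.
EDGES = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
print(feed_forward_loops(EDGES))
```

Other motif types, such as autoregulation or several genes sharing a single regulator, can be enumerated from the same adjacency map with similar loops.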
Sex determination in insects: a binary decision based on alternative splicing.

PubMed

Salz, Helen K

2011-08-01

The gene regulatory networks that control sex determination vary between species. Despite these differences, comparative studies in insects have found that alternative splicing is reiteratively used in evolution to control expression of the key sex-determining genes. Sex determination is best understood in Drosophila, where activation of the RNA binding protein-encoding gene Sex-lethal is the central female-determining event. Sex-lethal serves as a genetic switch because once activated it controls its own expression by a positive feedback splicing mechanism. Sex fate choice is also maintained by self-sustaining positive feedback splicing mechanisms in other dipteran and hymenopteran insects, although different RNA binding protein-encoding genes function as the binary switch. Studies exploring the mechanisms of sex-specific splicing have revealed the extent to which sex determination is integrated with other developmental regulatory networks. Copyright © 2011 Elsevier Ltd. All rights reserved.
Mouse Social Network Dynamics and Community Structure are Associated with Plasticity-Related Brain Gene Expression

PubMed Central

Williamson, Cait M.; Franks, Becca; Curley, James P.

2016-01-01

Laboratory studies of social behavior have typically focused on dyadic interactions occurring within a limited spatiotemporal context. However, this strategy prevents analyses of the dynamics of group social behavior and constrains identification of the biological pathways mediating individual differences in behavior. In the current study, we aimed to identify the spatiotemporal dynamics and hierarchical organization of a large social network of male mice. We also sought to determine if standard assays of social and exploratory behavior are predictive of social behavior in this social network and whether individual network position was associated with the mRNA expression of two plasticity-related genes, DNA methyltransferase 1 and 3a. Mice were observed to form a hierarchically organized social network and self-organized into two separate social network communities. Members of both communities exhibited distinct patterns of socio-spatial organization within the vivaria that were not limited to agonistic interactions. We further established that exploratory and social behaviors in standard behavioral assays conducted prior to placing the mice into the large group were predictive of initial network position and behavior but were not associated with final social network position. Finally, we determined that social network position is associated with variation in mRNA levels of two neural plasticity genes, DNMT1 and DNMT3a, in the hippocampus but not the mPOA. This work demonstrates the importance of understanding the role of social context and complex social dynamics in determining the relationship between individual differences in social behavior and brain gene expression. PMID:27540359
Harnessing Diversity towards the Reconstructing of Large Scale Gene Regulatory Networks

PubMed Central

Yamanaka, Ryota; Kitano, Hiroaki

2013-01-01

Elucidating gene regulatory networks (GRNs) from large scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus driven approaches combining different algorithms, have become a potentially promising strategy to infer accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) a simple strategy to combine many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii) TopkNet integrating only high-performance algorithms provides significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can be useful to determine potential optimal algorithms for reconstruction of unknown regulatory networks, i.e., if expression data associated with a known regulatory network are similar to those associated with an unknown regulatory network, optimal algorithms determined for the known regulatory network can be repurposed to infer the unknown regulatory network. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks. PMID:24278007

Gene expression, signal transduction pathways and functional networks associated with growth of sporadic vestibular schwannomas.

PubMed

Sass, Hjalte C R; Borup, Rehannah; Alanin, Mikkel; Nielsen, Finn Cilius; Cayé-Thomasen, Per

2017-01-01

The objective of this study was to determine global gene expression in relation to vestibular schwannoma (VS) growth rate and to identify signal transduction pathways and functional molecular networks associated with growth. Repeated magnetic resonance imaging (MRI) prior to surgery determined tumor growth rate. Following tissue sampling during surgery, mRNA was extracted from 16 sporadic VS. Double stranded cDNA was synthesized from the mRNA and used as template for an in vitro transcription reaction to synthesize biotin-labeled antisense cRNA, which was hybridized to Affymetrix HG-U133A arrays and analyzed by dChip software. Differential gene expression was defined as a 1.5-fold difference between fast and slow growing tumors (> or < 0.5 ccm/year), employing a p-value <0.01. Deregulated transcripts were matched against established gene ontology. Ingenuity Pathway Analysis was used for identification of signal transduction pathways and functional molecular networks associated with tumor growth. In total 109 genes were deregulated in relation to tumor growth rate. Genes associated with apoptosis, growth and cell proliferation were deregulated. Gene ontology included regulation of the cell cycle, cell differentiation and proliferation, among other functions. Fourteen pathways were associated with tumor growth. Five functional molecular networks were generated. This first study on global gene expression in relation to vestibular schwannoma growth rate identified several genes, signal transduction pathways and functional networks associated with tumor progression. Specific genes involved in apoptosis, cell growth and proliferation were deregulated in fast growing tumors. Fourteen pathways were associated with tumor growth. Generated functional networks underlined the importance of the PI3K family, among others.
Sox5 is involved in germ-cell regulation and sex determination in medaka following co-option of nested transposable elements.

PubMed

Schartl, Manfred; Schories, Susanne; Wakamatsu, Yuko; Nagao, Yusuke; Hashimoto, Hisashi; Bertin, Chloé; Mourot, Brigitte; Schmidt, Cornelia; Wilhelm, Dagmar; Centanin, Lazaro; Guiguen, Yann; Herpin, Amaury

2018-01-29

Sex determination relies on a hierarchically structured network of genes, and is one of the most plastic processes in evolution. The evolution of sex-determining genes within a network, by neo- or sub-functionalization, also requires the regulatory landscape to be rewired to accommodate these novel gene functions. We previously showed that in medaka fish, the regulatory landscape of the master male-determining gene dmrt1bY underwent a profound rearrangement, concomitantly with acquiring a dominant position within the sex-determining network. This rewiring was brought about by the exaptation of a transposable element (TE) called Izanagi, which is co-opted to act as a silencer to turn off the dmrt1bY gene after it performed its function in sex determination. We now show that a second TE, Rex1, has been incorporated into Izanagi. The insertion of Rex1 brought in a preformed regulatory element for the transcription factor Sox5, which here functions in establishing the temporal and cell-type-specific expression pattern of dmrt1bY. Mutant analysis demonstrates the importance of Sox5 in the gonadal development of medaka, and possibly in mice, in a dmrt1bY-independent manner. Moreover, Sox5 medaka mutants have complete female-to-male sex reversal. Our work reveals an unexpected complexity in TE-mediated transcriptional rewiring, with the exaptation of a second TE into a network already rewired by a TE. We also show a dual role for Sox5 during sex determination: first, as an evolutionarily conserved regulator of germ-cell number in medaka, and second, by de novo regulation of dmrt1 transcriptional activity during primary sex determination due to exaptation of the Rex1 transposable element.
Transcriptional network control of normal and leukaemic haematopoiesis

PubMed Central

Sive, Jonathan I.; Göttgens, Berthold

2014-01-01

Transcription factors (TFs) play a key role in determining the gene expression profiles of stem/progenitor cells, and defining their potential to differentiate into mature cell lineages. TF interactions within gene-regulatory networks are vital to these processes, and dysregulation of these networks by TF overexpression, deletion or abnormal gene fusions have been shown to cause malignancy. While investigation of these processes remains a challenge, advances in genome-wide technologies and growing interactions between laboratory and computational science are starting to produce increasingly accurate network models. The haematopoietic system provides an attractive experimental system to elucidate gene regulatory mechanisms, and allows experimental investigation of both normal and dysregulated networks. In this review we examine the principles of TF-controlled gene regulatory networks and the key experimental techniques used to investigate them. We look in detail at examples of how these approaches can be used to dissect out the regulatory mechanisms controlling normal haematopoiesis, as well as the dysregulated networks associated with haematological malignancies. PMID:25014893
Transcriptional network control of normal and leukaemic haematopoiesis.

PubMed

Sive, Jonathan I; Göttgens, Berthold

2014-12-10

Transcription factors (TFs) play a key role in determining the gene expression profiles of stem/progenitor cells, and defining their potential to differentiate into mature cell lineages. TF interactions within gene-regulatory networks are vital to these processes, and dysregulation of these networks by TF overexpression, deletion or abnormal gene fusions have been shown to cause malignancy. While investigation of these processes remains a challenge, advances in genome-wide technologies and growing interactions between laboratory and computational science are starting to produce increasingly accurate network models. The haematopoietic system provides an attractive experimental system to elucidate gene regulatory mechanisms, and allows experimental investigation of both normal and dysregulated networks. In this review we examine the principles of TF-controlled gene regulatory networks and the key experimental techniques used to investigate them. We look in detail at examples of how these approaches can be used to dissect out the regulatory mechanisms controlling normal haematopoiesis, as well as the dysregulated networks associated with haematological malignancies. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
Massive-scale gene co-expression network construction and robustness testing using random matrix theory.

PubMed

Gibson, Scott M; Ficklin, Stephen P; Isaacson, Sven; Luo, Feng; Feltus, Frank A; Smith, Melissa C

2013-01-01

The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. For such a test, hundreds of networks are needed and hence a tool to rapidly construct these networks. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust.
Reconstructing directed gene regulatory network by only gene expression data.

PubMed

Zhang, Lu; Feng, Xi Kang; Ng, Yen Kaow; Li, Shuai Cheng

2016-08-18

Accurately identifying gene regulatory networks is an important task in understanding in vivo biological activities. The inference of such networks is often accomplished through the use of gene expression data. Many methods have been developed to evaluate gene expression dependencies between a transcription factor and its target genes, and some methods also eliminate transitive interactions. The regulatory (or edge) direction is undetermined if the target gene is also a transcription factor. Some methods predict the regulatory directions in gene regulatory networks by locating the eQTL single nucleotide polymorphism, or by observing the gene expression changes when knocking out/down the candidate transcription factors; regrettably, these additional data are usually unavailable, especially for samples derived from human tissues. In this study, we propose the Context Based Dependency Network (CBDN), a method that is able to infer gene regulatory networks with the regulatory directions from gene expression data only. To determine the regulatory direction, CBDN computes the influence of source on target by evaluating the magnitude changes of expression dependencies between the target gene and the others when conditioning on the source gene. CBDN extends the data processing inequality by involving the dependency direction to distinguish between direct and transitive relationships between genes. We also define two types of important regulators which can influence a majority of the genes in the network directly or indirectly. CBDN can detect both types of important regulators by averaging the influence functions of a candidate regulator on the other genes. In our experiments with simulated and real data, even with the regulatory direction taken into account, CBDN outperforms the state-of-the-art approaches for inferring gene regulatory networks. CBDN identifies the important regulators in the predicted network: (1) TYROBP influences a batch of genes that are related to Alzheimer's disease; (2) ZNF329 and RB1 significantly regulate the 'mesenchymal' gene expression signature genes for brain tumors. By merely leveraging gene expression data, CBDN can efficiently infer the existence of gene-gene interactions as well as their regulatory directions. The constructed networks are helpful in the identification of important regulators for complex diseases.
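
The data processing inequality (DPI) mentioned in the abstract above can be illustrated with a minimal sketch. This is the plain, undirected DPI pruning step familiar from ARACNE-style methods, not CBDN's directional extension, which the abstract does not specify in enough detail to reproduce. The function name `dpi_prune` and the toy mutual information values are hypothetical.

```python
from itertools import combinations

def dpi_prune(mi):
    """Drop the weakest edge of every fully connected gene triple.

    mi maps frozenset({gene_a, gene_b}) to a mutual information score.
    Under the DPI, the weakest edge of a triangle is treated as a
    transitive (indirect) interaction. Assumption: undirected edges only.
    """
    genes = sorted({g for edge in mi for g in edge})
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        triangle = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(e in mi for e in triangle):
            to_remove.add(min(triangle, key=mi.get))
    return {e: v for e, v in mi.items() if e not in to_remove}

# Toy scores (hypothetical): A -> B -> C with an indirect A-C dependency.
mi = {
    frozenset(("A", "B")): 0.8,
    frozenset(("B", "C")): 0.7,
    frozenset(("A", "C")): 0.3,
}
pruned = dpi_prune(mi)
print(sorted(tuple(sorted(e)) for e in pruned))
```

Edges are marked first and removed afterwards, so the result does not depend on the order in which triples are visited.
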
Conserved Non-Coding Regulatory Signatures in Arabidopsis Co-Expressed Gene Modules

PubMed Central

Spangler, Jacob B.; Ficklin, Stephen P.; Luo, Feng; Freeling, Michael; Feltus, F. Alex

2012-01-01

Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome. PMID:23024789
Conserved non-coding regulatory signatures in Arabidopsis co-expressed gene modules.

PubMed

Spangler, Jacob B; Ficklin, Stephen P; Luo, Feng; Freeling, Michael; Feltus, F Alex

2012-01-01

Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome.
Form and function in gene regulatory networks: the structure of network motifs determines fundamental properties of their dynamical state space.

PubMed

Ahnert, S E; Fink, T M A

2016-07-01

Network motifs have been studied extensively over the past decade, and certain motifs, such as the feed-forward loop, play an important role in regulatory networks. Recent studies have used Boolean network motifs to explore the link between form and function in gene regulatory networks and have found that the structure of a motif does not strongly determine its function, if this is defined in terms of the gene expression patterns the motif can produce. Here, we offer a different, higher-level definition of the 'function' of a motif, in terms of two fundamental properties of its dynamical state space as a Boolean network. One is the basin entropy, which is a complexity measure of the dynamics of Boolean networks. The other is the diversity of cyclic attractor lengths that a given motif can produce. Using these two measures, we examine all 104 topologically distinct three-node motifs and show that the structural properties of a motif, such as the presence of feedback loops and feed-forward loops, predict fundamental characteristics of its dynamical state space, which in turn determine aspects of its functional versatility. We also show that these higher-level properties have a direct bearing on real regulatory networks, as both basin entropy and cycle length diversity show a close correspondence with the prevalence, in neural and genetic regulatory networks, of the 13 connected motifs without self-interactions that have been studied extensively in the literature. © 2016 The Authors.
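
The two state-space measures named in the abstract above can be sketched for a single motif. The sketch assumes synchronous updates and takes basin entropy to be the Shannon entropy of the relative basin sizes; both are assumptions, since the abstract does not state the update scheme or the formula. The update rule `feed_forward` is a hypothetical example, not one of the motifs analysed in the paper.

```python
from itertools import product
from math import log2

def attractors_and_basins(update, n):
    """Return {attractor (frozenset of states): basin size} for an n-node network."""
    basin_sizes = {}
    for state in product((0, 1), repeat=n):
        seen = []
        # Follow the trajectory until a state repeats.
        while state not in seen:
            seen.append(state)
            state = update(state)
        cycle = frozenset(seen[seen.index(state):])
        basin_sizes[cycle] = basin_sizes.get(cycle, 0) + 1
    return basin_sizes

def basin_entropy(basin_sizes):
    """Shannon entropy (bits) of the basin size distribution."""
    total = sum(basin_sizes.values())
    return -sum((s / total) * log2(s / total) for s in basin_sizes.values())

# Hypothetical three-node feed-forward loop: A keeps its value,
# B copies A, and C is on only when both A and B are on.
def feed_forward(state):
    a, b, c = state
    return (a, a, a and b)

basins = attractors_and_basins(feed_forward, 3)
cycle_lengths = sorted({len(cycle) for cycle in basins})
print(basin_entropy(basins), cycle_lengths)
```

Exhaustive enumeration is practical here because a three-node motif has only eight states.
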
CMIP: a software package capable of reconstructing genome-wide regulatory networks using gene expression data.

PubMed

Zheng, Guangyong; Xu, Yaochen; Zhang, Xiujun; Liu, Zhi-Ping; Wang, Zhuo; Chen, Luonan; Zhu, Xin-Guang

2016-12-23

A gene regulatory network (GRN) represents interactions of genes inside a cell or tissue, in which vertices and edges stand for genes and their regulatory interactions, respectively. Reconstruction of gene regulatory networks, in particular genome-scale networks, is essential for comparative exploration of different species and mechanistic investigation of biological processes. Currently, most network inference methods are computationally intensive; they are usually effective for small-scale tasks (e.g., networks with a few hundred genes) but struggle to construct GRNs at genome scale. Here, we present a software package for gene regulatory network reconstruction at a genomic level, in which gene interaction is measured by the conditional mutual information measurement using a parallel computing framework (so the package is named CMIP). The package is a greatly improved implementation of our previous PCA-CMI algorithm. In CMIP, we provide not only an automatic threshold determination method but also an effective parallel computing framework for network inference. Performance tests on benchmark datasets show that the accuracy of CMIP is comparable to most current network inference methods. Moreover, running tests on synthetic datasets demonstrate that CMIP can handle large datasets, especially genome-wide datasets, within an acceptable time period. In addition, successful application on a real genomic dataset confirms the practical applicability of the package. This new software package provides a powerful tool for genomic network reconstruction to the biological community. The software can be accessed at http://www.picb.ac.cn/CMIP/ .
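
Conditional mutual information, the dependency measure named in the abstract above, can be illustrated with a minimal sketch. It uses a plug-in estimator on discretized values, which is an assumption: the estimator that CMIP applies to continuous expression data is not described here. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(*columns):
    """Joint Shannon entropy (bits) of one or more discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())

def mutual_information(x, y):
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return entropy(x) + entropy(y) - entropy(x, y)

def conditional_mutual_information(x, y, z):
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z)
    return entropy(x, z) + entropy(y, z) - entropy(z) - entropy(x, y, z)

# Toy data (hypothetical): regulator Z drives both X and Y.
z = [0, 0, 1, 1, 0, 0, 1, 1]
x = z[:]  # X copies Z
y = z[:]  # Y copies Z

mi_xy = mutual_information(x, y)
cmi_xy_given_z = conditional_mutual_information(x, y, z)
print(mi_xy, cmi_xy_given_z)
```

X and Y look strongly dependent on their own, but the dependency vanishes once Z is conditioned on, which is why conditioning helps separate direct from indirect interactions.
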
A Consensus Network of Gene Regulatory Factors in the Human Frontal Lobe

PubMed Central

Berto, Stefano; Perdomo-Sabogal, Alvaro; Gerighausen, Daniel; Qin, Jing; Nowick, Katja

2016-01-01

Cognitive abilities, such as memory, learning, language, problem solving, and planning, involve the frontal lobe and other brain areas. Not much is known yet about the molecular basis of cognitive abilities, but it seems clear that cognitive abilities are determined by the interplay of many genes. One approach for analyzing the genetic networks involved in cognitive functions is to study the coexpression networks of genes with known importance for proper cognitive functions, such as genes that have been associated with cognitive disorders like intellectual disability (ID) or autism spectrum disorders (ASD). Because many of these genes are gene regulatory factors (GRFs) we aimed to provide insights into the gene regulatory networks active in the human frontal lobe. Using genome wide human frontal lobe expression data from 10 independent data sets, we first derived 10 individual coexpression networks for all GRFs including their potential target genes. We observed a high level of variability among these 10 independently derived networks, pointing out that relying on results from a single study can only provide limited biological insights. To instead focus on the most confident information from these 10 networks we developed a method for integrating such independently derived networks into a consensus network. This consensus network revealed robust GRF interactions that are conserved across the frontal lobes of different healthy human individuals. Within this network, we detected a strong central module that is enriched for 166 GRFs known to be involved in brain development and/or cognitive disorders. Interestingly, several hubs of the consensus network encode for GRFs that have not yet been associated with brain functions. Their central role in the network suggests them as excellent new candidates for playing an essential role in the regulatory network of the human frontal lobe, which should be investigated in future studies. PMID:27014338
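
One simple way to combine independently derived networks is to keep only edges supported by a minimum number of them. The sketch below shows that vote-counting rule. It is an assumption made for illustration: the abstract above does not describe the integration method the authors developed. The gene labels and the `min_support` parameter are hypothetical.

```python
from collections import Counter

def consensus_network(networks, min_support):
    """Keep undirected edges that appear in at least min_support networks."""
    counts = Counter()
    for edges in networks:
        # frozenset ignores direction; the set de-duplicates within one network.
        counts.update({frozenset(edge) for edge in edges})
    return {edge for edge, k in counts.items() if k >= min_support}

# Three toy coexpression networks (hypothetical regulator and target names).
networks = [
    {("GRF1", "T1"), ("GRF1", "T2")},
    {("T1", "GRF1"), ("GRF2", "T3")},
    {("GRF1", "T1"), ("GRF1", "T2"), ("GRF2", "T3")},
]
consensus = consensus_network(networks, min_support=3)
print(consensus)
```

Raising `min_support` trades coverage for confidence: with a support of 2 all three edges survive, while a support of 3 keeps only the edge found in every network.
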
A novel gene network inference algorithm using predictive minimum description length approach.

PubMed

Chaitankar, Vijender; Ghosh, Preetam; Perkins, Edward J; Gong, Ping; Deng, Youping; Zhang, Chaoyang

2010-05-28

Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold which defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we proposed a new inference algorithm which incorporated mutual information (MI), conditional mutual information (CMI) and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need for a user-specified fine tuning parameter. The performance of the proposed algorithm was evaluated using both synthetic time series data sets and a biological time series data set for the yeast Saccharomyces cerevisiae. The benchmark quantities precision and recall were used as performance measures. The results show that the proposed algorithm produced fewer false edges and significantly improved the precision, as compared to the existing algorithm. For further analysis the performance of the algorithms was observed over different sizes of data. We have proposed a new algorithm that implements the PMDL principle for inferring gene regulatory networks from time series DNA microarray data that eliminates the need for a fine tuning parameter. The evaluation results obtained from both synthetic and actual biological data sets show that the PMDL principle is effective in determining the MI threshold and the developed algorithm improves precision of gene regulatory network inference. Based on the sensitivity analysis of all tested cases, an optimal CMI threshold value has been identified. Finally it was observed that the performance of the algorithms saturates at a certain threshold of data size.
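
The effect of the MI threshold on the precision and recall measures used in the abstract above can be shown with a minimal sketch. It does not implement the PMDL threshold selection, which is not specified here; the thresholds are simply tried by hand. The edge scores, the reference edges and the function names are hypothetical.

```python
def threshold_edges(mi_scores, threshold):
    """Keep gene pairs whose mutual information reaches the threshold."""
    return {edge for edge, mi in mi_scores.items() if mi >= threshold}

def precision_recall(predicted, true):
    """Precision and recall of a predicted edge set against a reference set."""
    predicted, true = set(predicted), set(true)
    tp = len(predicted & true)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(true) if true else 0.0
    return precision, recall

# Toy MI scores and reference network (hypothetical values).
mi_scores = {
    ("g1", "g2"): 0.9,
    ("g2", "g3"): 0.6,
    ("g1", "g3"): 0.2,
    ("g3", "g4"): 0.5,
}
true_edges = {("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g1", "g4")}

for threshold in (0.1, 0.55):
    predicted = threshold_edges(mi_scores, threshold)
    print(threshold, precision_recall(predicted, true_edges))
```

A stricter threshold removes false edges and raises precision, at the cost of recall, which is the trade-off an automatic threshold selection has to balance.
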
Displayed Trees Do Not Determine Distinguishability Under the Network Multispecies Coalescent

PubMed Central

Zhu, Sha; Degnan, James H.

2017-01-01

Recent work in estimating species relationships from gene trees has included inferring networks assuming that past hybridization has occurred between species. Probabilistic models using the multispecies coalescent can be used in this framework for likelihood-based inference of both network topologies and parameters, including branch lengths and hybridization parameters. A difficulty for such methods is that it is not always clear whether, or to what extent, networks are identifiable, that is, whether there could be two distinct networks that lead to the same distribution of gene trees. For cases in which incomplete lineage sorting occurs in addition to hybridization, we demonstrate a new representation of the species network likelihood that expresses the probability distribution of the gene tree topologies as a linear combination of gene tree distributions given a set of species trees. This representation makes it clear that in some cases in which two distinct networks give the same distribution of gene trees when sampling one allele per species, the two networks can be distinguished theoretically when multiple individuals are sampled per species. This result means that network identifiability is not only a function of the trees displayed by the networks but also depends on allele sampling within species. We additionally give an example in which two networks that display exactly the same trees can be distinguished from their gene trees even when there is only one lineage sampled per species. PMID:27780899
Differential Network Analysis Reveals Evolutionary Complexity in Secondary Metabolism of Rauvolfia serpentina over Catharanthus roseus

PubMed Central

Pathania, Shivalika; Bagler, Ganesh; Ahuja, Paramvir S.

2016-01-01

Comparative co-expression analysis of multiple species using high-throughput data is an integrative approach to determine the uniformity as well as diversification in biological processes. Rauvolfia serpentina and Catharanthus roseus, both members of the Apocynaceae family, are reported to have remedial properties against multiple diseases. Despite sharing the upstream steps of the terpenoid indole alkaloid pathway, there is significant diversity in tissue-specific synthesis and accumulation of specialized metabolites in these plants. This led us to implement comparative co-expression network analysis to investigate the modules and genes responsible for differential tissue-specific expression as well as species-specific synthesis of metabolites. Toward these goals differential network analysis was implemented to identify candidate genes responsible for diversification of metabolite profiles. Three genes were identified with significant difference in connectivity leading to differential regulatory behavior between these plants. These genes may be responsible for diversification of secondary metabolism, and thereby for species-specific metabolite synthesis. The network robustness of R. serpentina, determined based on topological properties, was also complemented by comparison of gene-metabolite networks of both plants, and may have evolved to have complex metabolic mechanisms as compared to C. roseus under the influence of various stimuli. This study reveals evolution of complexity in secondary metabolism of R. serpentina, and key genes that contribute toward diversification of specific metabolites. PMID:27588023
Differential Network Analysis Reveals Evolutionary Complexity in Secondary Metabolism of Rauvolfia serpentina over Catharanthus roseus.

PubMed

Pathania, Shivalika; Bagler, Ganesh; Ahuja, Paramvir S

2016-01-01

Comparative co-expression analysis of multiple species using high-throughput data is an integrative approach to determine the uniformity as well as diversification in biological processes. Rauvolfia serpentina and Catharanthus roseus, both members of the Apocynaceae family, are reported to have remedial properties against multiple diseases. Despite sharing the upstream steps of the terpenoid indole alkaloid pathway, there is significant diversity in tissue-specific synthesis and accumulation of specialized metabolites in these plants. This led us to implement comparative co-expression network analysis to investigate the modules and genes responsible for differential tissue-specific expression as well as species-specific synthesis of metabolites. Toward these goals differential network analysis was implemented to identify candidate genes responsible for diversification of metabolite profiles. Three genes were identified with significant difference in connectivity leading to differential regulatory behavior between these plants. These genes may be responsible for diversification of secondary metabolism, and thereby for species-specific metabolite synthesis. The network robustness of R. serpentina, determined based on topological properties, was also complemented by comparison of gene-metabolite networks of both plants, and may have evolved to have complex metabolic mechanisms as compared to C. roseus under the influence of various stimuli. This study reveals evolution of complexity in secondary metabolism of R. serpentina, and key genes that contribute toward diversification of specific metabolites.
Systems genetics identifies a convergent gene network for cognition and neurodevelopmental disease.

PubMed

Johnson, Michael R; Shkura, Kirill; Langley, Sarah R; Delahaye-Duriez, Andree; Srivastava, Prashant; Hill, W David; Rackham, Owen J L; Davies, Gail; Harris, Sarah E; Moreno-Moral, Aida; Rotival, Maxime; Speed, Doug; Petrovski, Slavé; Katz, Anaïs; Hayward, Caroline; Porteous, David J; Smith, Blair H; Padmanabhan, Sandosh; Hocking, Lynne J; Starr, John M; Liewald, David C; Visconti, Alessia; Falchi, Mario; Bottolo, Leonardo; Rossetti, Tiziana; Danis, Bénédicte; Mazzuferi, Manuela; Foerch, Patrik; Grote, Alexander; Helmstaedter, Christoph; Becker, Albert J; Kaminski, Rafal M; Deary, Ian J; Petretto, Enrico

2016-02-01

Genetic determinants of cognition are poorly characterized, and their relationship to genes that confer risk for neurodevelopmental disease is unclear. Here we performed a systems-level analysis of genome-wide gene expression data to infer gene-regulatory networks conserved across species and brain regions. Two of these networks, M1 and M3, showed replicable enrichment for common genetic variants underlying healthy human cognitive abilities, including memory. Using exome sequence data from 6,871 trios, we found that M3 genes were also enriched for mutations ascertained from patients with neurodevelopmental disease generally, and intellectual disability and epileptic encephalopathy in particular. M3 consists of 150 genes whose expression is tightly developmentally regulated, but which are collectively poorly annotated for known functional pathways. These results illustrate how systems-level analyses can reveal previously unappreciated relationships between neurodevelopmental disease-associated genes in the developed human brain, and provide empirical support for a convergent gene-regulatory network influencing cognition and neurodevelopmental disease.
Selection Shapes Transcriptional Logic and Regulatory Specialization in Genetic Networks.

PubMed

Fogelmark, Karl; Peterson, Carsten; Troein, Carl

2016-01-01

Living organisms need to regulate their gene expression in response to environmental signals and internal cues. This is a computational task where genes act as logic gates that connect to form transcriptional networks, which are shaped at all scales by evolution. Large-scale mutations such as gene duplications and deletions add and remove network components, whereas smaller mutations alter the connections between them. Selection determines what mutations are accepted, but its importance for shaping the resulting networks has been debated. To investigate the effects of selection in the shaping of transcriptional networks, we derive transcriptional logic from a combinatorially powerful yet tractable model of the binding between DNA and transcription factors. By evolving the resulting networks based on their ability to function as either a simple decision system or a circadian clock, we obtain information on the regulation and logic rules encoded in functional transcriptional networks. Comparisons are made between networks evolved for different functions, as well as with structurally equivalent but non-functional (neutrally evolved) networks, and predictions are validated against the transcriptional network of E. coli. We find that the logic rules governing gene expression depend on the function performed by the network. Unlike the decision systems, the circadian clocks show strong cooperative binding and negative regulation, which achieves tight temporal control of gene expression. Furthermore, we find that transcription factors act preferentially as either activators or repressors, both when binding multiple sites for a single target gene and globally in the transcriptional networks. This separation into positive and negative regulators requires gene duplications, which highlights the interplay between mutation and selection in shaping the transcriptional networks.
Patterns of Metabolite Changes Identified from Large-Scale Gene Perturbations in Arabidopsis Using a Genome-Scale Metabolic Network

PubMed Central

Kim, Taehyong; Dreher, Kate; Nilo-Poyanco, Ricardo; Lee, Insuk; Fiehn, Oliver; Lange, Bernd Markus; Nikolau, Basil J.; Sumner, Lloyd; Welti, Ruth; Wurtele, Eve S.; Rhee, Seung Y.

2015-01-01

Metabolomics enables quantitative evaluation of metabolic changes caused by genetic or environmental perturbations. However, little is known about how perturbing a single gene changes the metabolic system as a whole and which network and functional properties are involved in this response. To answer this question, we investigated the metabolite profiles from 136 mutants with single gene perturbations of functionally diverse Arabidopsis (Arabidopsis thaliana) genes. Fewer than 10 metabolites were changed significantly relative to the wild type in most of the mutants, indicating that the metabolic network was robust to perturbations of single metabolic genes. These changed metabolites were closer to each other in a genome-scale metabolic network than expected by chance, supporting the notion that the genetic perturbations changed the network more locally than globally. Surprisingly, the changed metabolites were close to the perturbed reactions in only 30% of the mutants of the well-characterized genes. To determine the factors that contributed to the distance between the observed metabolic changes and the perturbation site in the network, we examined nine network and functional properties of the perturbed genes. Only the isozyme number affected the distance between the perturbed reactions and changed metabolites. This study revealed patterns of metabolic changes from large-scale gene perturbations and relationships between characteristics of the perturbed genes and metabolic changes. PMID:25670818
NetDecoder: a network biology platform that decodes context-specific biological networks and gene activities.

PubMed

da Rocha, Edroaldo Lummertz; Ung, Choong Yong; McGehee, Cordelia D; Correia, Cristina; Li, Hu

2016-06-02

The sequential chain of interactions altering the binary state of a biomolecule represents the 'information flow' within a cellular network that determines phenotypic properties. Given the lack of computational tools to dissect context-dependent networks and gene activities, we developed NetDecoder, a network biology platform that models context-dependent information flows using pairwise phenotypic comparative analyses of protein-protein interactions. Using breast cancer, dyslipidemia and Alzheimer's disease as case studies, we demonstrate that NetDecoder dissects subnetworks to identify key players significantly impacting cell behaviour specific to a given disease context. We further show that genes residing in disease-specific subnetworks are enriched in disease-related signalling pathways and information flow profiles, which drive the resulting disease phenotypes. We also devise a novel scoring scheme to quantify key genes: network routers, which influence many genes; key targets, which are influenced by many genes; and high impact genes, which experience a significant change in regulation. We show the robustness of our results against parameter changes. Our network biology platform includes freely available source code (http://www.NetDecoder.org) for researchers to explore genome-wide context-dependent information flow profiles and key genes, given a set of genes of particular interest and transcriptome data. More importantly, NetDecoder will enable researchers to uncover context-dependent drug targets. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Massive-Scale Gene Co-Expression Network Construction and Robustness Testing Using Random Matrix Theory

PubMed Central

Isaacson, Sven; Luo, Feng; Feltus, Frank A.; Smith, Melissa C.

2013-01-01

The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. Such a test requires hundreds of networks and hence a tool to construct them rapidly. To examine the functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested the functional robustness of networks for human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate a dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust. PMID:23409071

The G-Box Transcriptional Regulatory Code in Arabidopsis

PubMed Central

Shepherd, Samuel J.K.; Brestovitsky, Anna; Dickinson, Patrick; Biswas, Surojit

2017-01-01

Plants have significantly more transcription factor (TF) families than animals and fungi, and plant TF families tend to contain more genes; these expansions are linked to adaptation to environmental stressors. Many TF family members bind to similar or identical sequence motifs, such as G-boxes (CACGTG), so it is difficult to predict regulatory relationships. We determined that the flanking sequences near G-boxes help determine in vitro specificity but that this is insufficient to predict the transcription pattern of genes near G-boxes. Therefore, we constructed a gene regulatory network that identifies the set of bZIPs and bHLHs that are most predictive of the expression of genes downstream of perfect G-boxes. This network accurately predicts transcriptional patterns and reconstructs known regulatory subnetworks. Finally, we present Ara-BOX-cis (araboxcis.org), a Web site that provides interactive visualizations of the G-box regulatory network, a useful resource for generating predictions for gene regulatory relations. PMID:28864470
Long-Term Oil Contamination Alters the Molecular Ecological Networks of Soil Microbial Functional Genes

PubMed Central

Liang, Yuting; Zhao, Huihui; Deng, Ye; Zhou, Jizhong; Li, Guanghe; Sun, Bo

2016-01-01

With knowledge of microbial composition and diversity, investigation of within-community interactions is a further step to elucidate microbial ecological functions, such as the biodegradation of hazardous contaminants. In this work, microbial functional molecular ecological networks were studied in both contaminated and uncontaminated soils to determine the possible influences of oil contamination on microbial interactions and potential functions. Soil samples were obtained from an oil-exploring site located in South China, and the microbial functional genes were analyzed with GeoChip, a high-throughput functional microarray. By building random networks based on a null model, we demonstrated that overall network structures and properties were significantly different between contaminated and uncontaminated soils (P < 0.001). Network connectivity, module numbers, and modularity were all reduced with contamination. Moreover, the topological roles of the genes (module hubs and connectors) were altered with oil contamination. Subnetworks of genes involved in alkane and polycyclic aromatic hydrocarbon degradation were also constructed. Negative co-occurrence patterns prevailed among functional genes, thereby indicating probable competition relationships. The potential “keystone” genes, defined as either “hubs” or genes with the highest connectivities in the network, were further identified. The network constructed in this study predicted the potential effects of anthropogenic contamination on microbial community co-occurrence interactions. PMID:26870020
Computational exploration of cis-regulatory modules in rhythmic expression data using the "Exploration of Distinctive CREs and CRMs" (EDCC) and "CRM Network Generator" (CNG) programs.

PubMed

Bekiaris, Pavlos Stephanos; Tekath, Tobias; Staiger, Dorothee; Danisman, Selahattin

2018-01-01

Understanding the effect of cis-regulatory elements (CRE) and clusters of CREs, which are called cis-regulatory modules (CRM), in eukaryotic gene expression is a challenge of computational biology. We developed two programs that allow simple, fast and reliable analysis of candidate CREs and CRMs that may affect specific gene expression and that determine positional features between individual CREs within a CRM. The first program, "Exploration of Distinctive CREs and CRMs" (EDCC), correlates candidate CREs and CRMs with specific gene expression patterns. For pairs of CREs, EDCC also determines positional preferences of the single CREs in relation to each other and to the transcriptional start site. The second program, "CRM Network Generator" (CNG), prioritizes these positional preferences using a neural network and thus allows unbiased rating of the positional preferences that were determined by EDCC. We tested these programs with data from a microarray study of circadian gene expression in Arabidopsis thaliana. Analyzing more than 1.5 million pairwise CRE combinations, we found 22 candidate combinations, of which several contained known clock promoter elements together with elements that had not been identified as relevant to circadian gene expression before. CNG analysis further identified positional preferences of these CRE pairs, hinting at positional information that may be relevant for circadian gene expression. Future wet lab experiments will have to determine which of these combinations confer daytime specific circadian gene expression. PMID:29298348
Influence maximization in time bounded network identifies transcription factors regulating perturbed pathways

PubMed Central

Jo, Kyuri; Jung, Inuk; Moon, Ji Hwan; Kim, Sun

2016-01-01

Motivation: To understand the dynamic nature of the biological process, it is crucial to identify perturbed pathways in an altered environment and also to infer regulators that trigger the response. Current time-series analysis methods, however, are not powerful enough to identify perturbed pathways and regulators simultaneously. Widely used methods include methods to determine gene sets such as differentially expressed genes or gene clusters, and these gene sets need to be further interpreted in terms of biological pathways using other tools. Most pathway analysis methods are not designed for time series data and they do not consider gene-gene influence on the time dimension. Results: In this article, we propose a novel time-series analysis method, TimeTP, for determining transcription factors (TFs) regulating pathway perturbation, which narrows the focus to perturbed sub-pathways and utilizes the gene regulatory network and protein–protein interaction network to locate TFs triggering the perturbation. TimeTP first identifies perturbed sub-pathways that propagate the expression changes over time. Starting points of the perturbed sub-pathways are mapped into the network and the most influential TFs are determined by an influence maximization technique. The analysis result is visually summarized in a TF-Pathway map in time clock. TimeTP was applied to a PIK3CA knock-in dataset and found significant sub-pathways and their regulators relevant to the PIP3 signaling pathway. Availability and Implementation: TimeTP is implemented in Python and available at http://biohealth.snu.ac.kr/software/TimeTP/. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: sunkim.bioinfo@snu.ac.kr PMID:27307609
Predictive minimum description length principle approach to inferring gene regulatory networks.

PubMed

Chaitankar, Vijender; Zhang, Chaoyang; Ghosh, Preetam; Gong, Ping; Perkins, Edward J; Deng, Youping

2011-01-01

Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we propose a new inference algorithm that incorporates mutual information (MI), conditional mutual information (CMI), and the predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes, and the PMDL principle method attempts to determine the best MI threshold without the need for a user-specified fine tuning parameter. The performance of the proposed algorithm is evaluated using both synthetic time series data sets and a biological time series data set (Saccharomyces cerevisiae). The results show that the proposed algorithm produced fewer false edges and significantly improved the precision when compared to the existing MDL algorithm.
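The thresholding step that the abstract above describes as problematic can be illustrated with a minimal sketch. It computes mutual information between discretized expression profiles and keeps gene pairs whose MI reaches a cutoff. The gene names, the binary discretization, and the fixed cutoff of 0.5 bits are assumptions made for illustration; the PMDL method itself, which chooses the threshold automatically, is not reproduced here.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        p_ab = c / n
        # p(a,b) / (p(a) p(b)) written with raw counts
        mi += p_ab * math.log2(p_ab * n * n / (count_x[a] * count_y[b]))
    return mi


def threshold_edges(expr, threshold):
    """Return gene pairs whose MI is at least the (user-chosen) threshold."""
    genes = sorted(expr)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if mutual_information(expr[g], expr[h]) >= threshold:
                edges.append((g, h))
    return edges


# Hypothetical discretized profiles (0 = low, 1 = high expression).
expr = {
    "geneA": [0, 0, 1, 1, 0, 0, 1, 1],
    "geneB": [0, 0, 1, 1, 0, 0, 1, 1],  # identical to geneA
    "geneC": [0, 1, 0, 1, 0, 1, 0, 1],  # statistically independent of geneA
}

edges = threshold_edges(expr, threshold=0.5)
print(edges)
```

With these toy profiles only the geneA/geneB pair survives the cutoff. Changing the cutoff changes the inferred network, which is the sensitivity the abstract says the PMDL principle is meant to remove.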
The HOX genes are expressed, in vivo, in human tooth germs: in vitro cAMP exposure of dental pulp cells results in parallel HOX network activation and neuronal differentiation.

PubMed

D'Antò, Vincenzo; Cantile, Monica; D'Armiento, Maria; Schiavo, Giulia; Spagnuolo, Gianrico; Terracciano, Luigi; Vecchione, Raffaela; Cillo, Clemente

2006-03-01

Homeobox-containing genes play a crucial role in odontogenesis. After the detection of Dlx and Msx genes in overlapping domains along maxillary and mandibular processes, a homeobox odontogenic code has been proposed to explain the interaction between different homeobox genes during dental lamina patterning. No role has so far been assigned to the Hox gene network in the homeobox odontogenic code due to studies on specific Hox genes and evolutionary considerations. Despite its involvement in early patterning during embryonal development, the HOX gene network, the most repeat-poor regions of the human genome, controls the phenotype identity of adult eukaryotic cells. Here, according to our results, the HOX gene network appears to be active in human tooth germs between 18 and 24 weeks of development. The immunohistochemical localization of specific HOX proteins mostly concerns the epithelial tooth germ compartment. Furthermore, only a few genes of the network are active in embryonal retromolar tissues, as well as in ectomesenchymal dental pulp cells (DPC) grown in vitro from adult human molar. Exposure of DPCs to cAMP induces the expression of from three to nine total HOX genes of the network in parallel with phenotype modifications with traits of neuronal differentiation. Our observations suggest that: (i) by combining its component genes, the HOX gene network determines the phenotype identity of epithelial and ectomesenchymal cells interacting in the generation of human tooth germ; (ii) cAMP treatment activates the HOX network and induces, in parallel, a neuronal-like phenotype in human primary ectomesenchymal dental pulp cells. 2005 Wiley-Liss, Inc.
Selection Shapes Transcriptional Logic and Regulatory Specialization in Genetic Networks

PubMed Central

Fogelmark, Karl; Peterson, Carsten; Troein, Carl

2016-01-01

Background: Living organisms need to regulate their gene expression in response to environmental signals and internal cues. This is a computational task where genes act as logic gates that connect to form transcriptional networks, which are shaped at all scales by evolution. Large-scale mutations such as gene duplications and deletions add and remove network components, whereas smaller mutations alter the connections between them. Selection determines what mutations are accepted, but its importance for shaping the resulting networks has been debated. Methodology: To investigate the effects of selection in the shaping of transcriptional networks, we derive transcriptional logic from a combinatorially powerful yet tractable model of the binding between DNA and transcription factors. By evolving the resulting networks based on their ability to function as either a simple decision system or a circadian clock, we obtain information on the regulation and logic rules encoded in functional transcriptional networks. Comparisons are made between networks evolved for different functions, as well as with structurally equivalent but non-functional (neutrally evolved) networks, and predictions are validated against the transcriptional network of E. coli. Principal Findings: We find that the logic rules governing gene expression depend on the function performed by the network. Unlike the decision systems, the circadian clocks show strong cooperative binding and negative regulation, which achieves tight temporal control of gene expression. Furthermore, we find that transcription factors act preferentially as either activators or repressors, both when binding multiple sites for a single target gene and globally in the transcriptional networks. This separation into positive and negative regulators requires gene duplications, which highlights the interplay between mutation and selection in shaping the transcriptional networks. PMID:26927540
Network-based analysis of differentially expressed genes in cerebrospinal fluid (CSF) and blood reveals new candidate genes for multiple sclerosis

PubMed Central

Safari-Alighiarloo, Nahid; Taghizadeh, Mohammad; Tabatabaei, Seyyed Mohammad; Namaki, Saeed

2016-01-01

Background: The involvement of multiple genes and missing heritability, which are dominant in complex diseases such as multiple sclerosis (MS), entail using network biology to better elucidate their molecular basis and genetic factors. We therefore aimed to integrate interactome (protein–protein interaction (PPI)) and transcriptome data to construct and analyze PPI networks for MS. Methods: Gene expression profiles in paired cerebrospinal fluid (CSF) and peripheral blood mononuclear cells (PBMCs) samples from MS patients, sampled in relapse or remission, and from controls were analyzed. Differentially expressed genes that were determined only in CSF (MS vs. control) and PBMCs (relapse vs. remission) were separately integrated with PPI data to construct the Query-Query PPI (QQPPI) networks. The networks were further analyzed to investigate more central genes, functional modules and complexes involved in MS progression. Results: The networks were analyzed and high centrality genes were identified. Exploration of functional modules and complexes showed that the majority of high centrality genes were incorporated in biological pathways driving MS pathogenesis. Proteasome and spliceosome were also noticeable in enriched pathways in PBMCs (relapse vs. remission), which were identified by both modularity and clique analyses. Finally, STK4, RB1, CDKN1A, CDK1, RAC1, EZH2, SDCBP genes in CSF (MS vs. control) and CDC37, MAP3K3, MYC genes in PBMCs (relapse vs. remission) were identified as potential candidate genes for MS, which were the more central genes involved in biological pathways. Discussion: This study showed that network-based analysis could explicate the complex interplay between biological processes underlying MS. Furthermore, an experimental validation of candidate genes can lead to identification of potential therapeutic targets. PMID:28028462
Modelling the influence of parental effects on gene-network evolution.

PubMed

Odorico, Andreas; Rünneburger, Estelle; Le Rouzic, Arnaud

2018-05-01

Understanding the importance of nongenetic heredity in the evolutionary process is a major topic in modern evolutionary biology. We modified a classical gene-network model by allowing parental transmission of gene expression and studied its evolutionary properties through individual-based simulations. We identified ontogenetic time (i.e. the time gene networks have to stabilize before being submitted to natural selection) as a crucial factor in determining the evolutionary impact of this phenotypic inheritance. Indeed, fast-developing organisms display enhanced adaptation and greater robustness to mutations when evolving in presence of nongenetic inheritance (NGI). In contrast, in our model, long development reduces the influence of the inherited state of the gene network. NGI thus had a negligible effect on the evolution of gene networks when the speed at which transcription levels reach equilibrium is not constrained. Nevertheless, simulations show that intergenerational transmission of the gene-network state negatively affects the evolution of robustness to environmental disturbances for either fast- or slow-developing organisms. Therefore, these results suggest that the evolutionary consequences of NGI might not be sought only in the way species respond to selection, but also on the evolution of emergent properties (such as environmental and genetic canalization) in complex genetic architectures. © 2018 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2018 European Society For Evolutionary Biology.
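The claim above that long development reduces the influence of the inherited network state can be illustrated with a toy sketch. It iterates a gene-network state through a sigmoid update from two different starting states, one standing in for a default state and one for a parentally inherited state. The weight matrix, the logistic update rule, and both starting states are hypothetical choices made for this example and are not taken from the authors' model.

```python
import math


def develop(weights, initial_state, steps):
    """Iterate state <- sigmoid(W @ state) for a given number of developmental steps."""
    state = list(initial_state)
    for _ in range(steps):
        state = [
            1.0 / (1.0 + math.exp(-sum(w * s for w, s in zip(row, state))))
            for row in weights
        ]
    return state


def max_gap(a, b):
    """Largest per-gene difference between two expression states."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical two-gene network; small weights make the update a contraction,
# so every starting state converges to the same equilibrium.
W = [[0.5, -1.0], [1.0, 0.5]]
default_start = [0.5, 0.5]    # assumed non-inherited initial expression
inherited_start = [0.9, 0.1]  # assumed expression state transmitted by a parent

gap_fast = max_gap(develop(W, default_start, 1), develop(W, inherited_start, 1))
gap_slow = max_gap(develop(W, default_start, 50), develop(W, inherited_start, 50))
print(gap_fast, gap_slow)
```

After one step the two phenotypes still differ noticeably, whereas after 50 steps they are numerically indistinguishable, so in this toy setting inheritance only matters when development is short.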
Using SPEEDES to simulate the blue gene interconnect network

NASA Technical Reports Server (NTRS)

Springer, P.; Upchurch, E.

2003-01-01

JPL and the Center for Advanced Computer Architecture (CACR) are conducting application and simulation analyses of BG/L in order to establish a range of effectiveness for the Blue Gene/L MPP architecture in performing important classes of computations and to determine the design sensitivity of the global interconnect network in support of real-world ASCI application execution.
Gene regulatory network identification from the yeast cell cycle based on a neuro-fuzzy system.

PubMed

Wang, B H; Lim, J W; Lim, J S

2016-08-30

Many studies exist for reconstructing gene regulatory networks (GRNs). In this paper, we propose a method based on an advanced neuro-fuzzy system for gene regulatory network reconstruction from microarray time-series data. This approach uses a neural network with a weighted fuzzy function to model the relationships between genes. Fuzzy rules, which determine the regulators of genes, are greatly simplified through this method. Additionally, a regulator selection procedure is proposed, which extracts the exact dynamic relationship between genes, using the information obtained from the weighted fuzzy function. Time-series related features are extracted from the original data to employ the characteristics of temporal data that are useful for accurate GRN reconstruction. The microarray dataset of the yeast cell cycle was used for our study. We measured the mean squared prediction error for the efficiency of the proposed approach and evaluated the accuracy in terms of precision, sensitivity, and F-score. The proposed method outperformed the other existing approaches.
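The accuracy measures named in the abstract above (precision, sensitivity, and F-score) can be computed for a reconstructed network by comparing predicted regulatory edges against a reference set. The sketch below uses the standard definitions; the gene labels and both edge sets are hypothetical and do not come from the yeast cell cycle data.

```python
def edge_scores(predicted, reference):
    """Precision, sensitivity (recall) and F-score of predicted edges against a reference set."""
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)   # edges predicted and present in the reference
    fp = len(predicted - reference)   # edges predicted but absent from the reference
    fn = len(reference - predicted)   # reference edges that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    total = precision + sensitivity
    f_score = 2 * precision * sensitivity / total if total else 0.0
    return precision, sensitivity, f_score


# Hypothetical directed edges (regulator, target).
reference_edges = {("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g1", "g4")}
predicted_edges = {("g1", "g2"), ("g2", "g3"), ("g2", "g4")}

precision, sensitivity, f_score = edge_scores(predicted_edges, reference_edges)
print(precision, sensitivity, f_score)
```

Here two of three predictions are correct (precision 2/3) and two of four reference edges are recovered (sensitivity 1/2), giving an F-score of 4/7.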
Robust gene network analysis reveals alteration of the STAT5a network as a hallmark of prostate cancer.

PubMed

Reddy, Anupama; Huang, C Chris; Liu, Huiqing; Delisi, Charles; Nevalainen, Marja T; Szalma, Sandor; Bhanot, Gyan

2010-01-01

We develop a general method to identify gene networks from pair-wise correlations between genes in a microarray data set and apply it to a public prostate cancer gene expression data set from 69 primary prostate tumors. We define the degree of a node as the number of genes significantly associated with the node and identify hub genes as those with the highest degree. The correlation network was pruned using transcription factor binding information in VisANT (http://visant.bu.edu/) as a biological filter. The reliability of hub genes was determined using a strict permutation test. Separate networks for normal prostate samples, and prostate cancer samples from African Americans (AA) and European Americans (EA) were generated and compared. We found that the same hubs control disease progression in AA and EA networks. Combining AA and EA samples, we generated networks for low (<7) and high (≥7) Gleason grade tumors. A comparison of their major hubs with those of the network for normal samples identified two types of changes associated with disease: (i) Some hub genes increased their degree in the tumor network compared to their degree in the normal network, suggesting that these genes are associated with gain of regulatory control in cancer (e.g. possible turning on of oncogenes). (ii) Some hubs reduced their degree in the tumor network compared to their degree in the normal network, suggesting that these genes are associated with loss of regulatory control in cancer (e.g. possible loss of tumor suppressor genes). A striking result was that for both AA and EA tumor samples, STAT5a, CEBPB and EGR1 are major hubs that gain neighbors compared to the normal prostate network. Conversely, HIF-1α is a major hub that loses connections in the prostate cancer network compared to the normal prostate network. We also find that the degree of these hubs changes progressively from normal to low grade to high grade disease, suggesting that these hubs are master regulators of prostate cancer and mark disease progression. STAT5a was identified as a central hub, with ~120 neighbors in the prostate cancer network and only 81 neighbors in the normal prostate network. Of the 120 neighbors of STAT5a, 57 are known cancer-related genes, known to be involved in functional pathways associated with tumorigenesis. Our method is general and can easily be extended to identify and study networks associated with any two phenotypes.
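The degree and hub definitions in the abstract above can be sketched in a few lines. The example computes pairwise Pearson correlations, counts for each gene how many others exceed a correlation cutoff, and reports the gene with the highest degree. The fixed cutoff of 0.7 stands in for the significance criterion and is an assumption, as are the gene names and expression values; the VisANT pruning and the permutation test are not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def node_degrees(expr, cutoff):
    """Degree of each gene = number of other genes with |r| at or above the cutoff."""
    genes = list(expr)
    degree = {g: 0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= cutoff:
                degree[g] += 1
                degree[h] += 1
    return degree


# Hypothetical expression values across four samples.
expr = {
    "hub": [2.0, 0.0, 0.0, -2.0],  # sum of g1 and g2, so correlated with both
    "g1": [1.0, -1.0, 1.0, -1.0],
    "g2": [1.0, 1.0, -1.0, -1.0],
    "g3": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with every other gene
}

degree = node_degrees(expr, cutoff=0.7)
top_hub = max(degree, key=degree.get)
print(degree, top_hub)
```

Comparing such degree tables between two sample groups (for example normal versus tumor) is the kind of hub gain or loss comparison the abstract reports.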
Resistance Genes in Global Crop Breeding Networks.

PubMed

Garrett, K A; Andersen, K F; Asche, F; Bowden, R L; Forbes, G A; Kulakow, P A; Zhou, B

2017-10-01

Resistance genes are a major tool for managing crop diseases. The networks of crop breeders who exchange resistance genes and deploy them in varieties help to determine the global landscape of resistance and epidemics, an important system for maintaining food security. These networks function as a complex adaptive system, with associated strengths and vulnerabilities, and implications for policies to support resistance gene deployment strategies. Extensions of epidemic network analysis can be used to evaluate the multilayer agricultural networks that support and influence crop breeding networks. Here, we evaluate the general structure of crop breeding networks for cassava, potato, rice, and wheat. All four are clustered due to phytosanitary and intellectual property regulations, and linked through CGIAR hubs. Cassava networks primarily include public breeding groups, whereas others are more mixed. These systems must adapt to global change in climate and land use, the emergence of new diseases, and disruptive breeding technologies. Research priorities to support policy include how best to maintain both diversity and redundancy in the roles played by individual crop breeding groups (public versus private and global versus local), and how best to manage connectivity to optimize resistance gene deployment while avoiding risks to the useful life of resistance genes. Copyright © 2017 The Author(s). This is an open access article distributed under the CC BY 4.0 International license.
atBioNet--an integrated network analysis tool for genomics and biomarker discovery.

PubMed

Ding, Yijun; Chen, Minjun; Liu, Zhichao; Ding, Don; Ye, Yanbin; Zhang, Min; Kelly, Reagan; Guo, Li; Su, Zhenqiang; Harris, Stephen C; Qian, Feng; Ge, Weigong; Fang, Hong; Xu, Xiaowei; Tong, Weida

2012-07-20

Large amounts of mammalian protein-protein interaction (PPI) data have been generated and are available for public use. From a systems biology perspective, protein/gene interactions encode the key mechanisms distinguishing disease and health, and such mechanisms can be uncovered through network analysis. An effective network analysis tool should integrate different content-specific PPI databases into a comprehensive network format with a user-friendly platform to identify key functional modules/pathways and the underlying mechanisms of disease and toxicity. atBioNet integrates seven publicly available PPI databases into a network-specific knowledge base. Knowledge expansion is achieved by expanding a user-supplied protein/gene list with interactions from its integrated PPI network. The statistically significant functional modules are determined by applying a fast network-clustering algorithm (SCAN: a Structural Clustering Algorithm for Networks). The functional modules can be visualized either separately or together in the context of the whole network. Integration of pathway information enables enrichment analysis and assessment of the biological function of modules. Three case studies are presented using publicly available disease gene signatures as a basis to discover new biomarkers for acute leukemia, systemic lupus erythematosus, and breast cancer. The results demonstrated that atBioNet can not only identify functional modules and pathways related to the studied diseases, but this information can also be used to hypothesize novel biomarkers for future analysis. atBioNet is a free web-based network analysis tool that provides a systematic insight into protein/gene interactions through examining significant functional modules. The identified functional modules are useful for determining underlying mechanisms of disease and biomarker discovery. It can be accessed at: http://www.fda.gov/ScienceResearch/BioinformaticsTools/ucm285284.htm.
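The SCAN clustering step mentioned above rests on a structural similarity score between connected nodes. The sketch below computes that score under the commonly cited definition, in which each node's neighborhood includes the node itself and the shared neighborhood size is normalized by the geometric mean of the two neighborhood sizes. The four-protein graph is hypothetical, and the rest of SCAN (the similarity and core-size thresholds, hub and outlier labeling) and how atBioNet applies it are not shown.

```python
import math


def structural_similarity(adj, u, v):
    """Shared closed-neighborhood size, normalized by the geometric mean of both sizes."""
    neighborhood_u = adj[u] | {u}
    neighborhood_v = adj[v] | {v}
    shared = len(neighborhood_u & neighborhood_v)
    return shared / math.sqrt(len(neighborhood_u) * len(neighborhood_v))


# Hypothetical interaction graph: a triangle P1-P2-P3 with P4 attached to P3.
adj = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2", "P4"},
    "P4": {"P3"},
}

sim_inside_module = structural_similarity(adj, "P1", "P2")
sim_to_periphery = structural_similarity(adj, "P3", "P4")
print(sim_inside_module, sim_to_periphery)
```

Edges inside the densely connected triangle score higher than the edge to the peripheral node, which is why thresholding this score tends to group tightly interacting proteins into the same module.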
Use of Bayesian Networks to Probabilistically Model and Improve the Likelihood of Validation of Microarray Findings by RT-PCR

PubMed Central

English, Sangeeta B.; Shih, Shou-Ching; Ramoni, Marco F.; Smith, Lois E.; Butte, Atul J.

2014-01-01

Though genome-wide technologies, such as microarrays, are widely used, data from these methods are considered noisy; there is still varied success in downstream biological validation. We report a method that increases the likelihood of successfully validating microarray findings using real time RT-PCR, including genes at low expression levels and with small differences. We use a Bayesian network to identify the most relevant sources of noise based on the successes and failures in validation for an initial set of selected genes, and then improve our subsequent selection of genes for validation based on eliminating these sources of noise. The network displays the significant sources of noise in an experiment, and scores the likelihood of validation for every gene. We show how the method can significantly increase validation success rates. In conclusion, in this study, we have successfully added a new automated step to determine the contributory sources of noise that determine successful or unsuccessful downstream biological validation. PMID:18790084
Precision matters for position decoding in the early fly embryo

NASA Astrophysics Data System (ADS)

Petkova, Mariela D.; Tkacik, Gasper; Wieschaus, Eric F.; Bialek, William; Gregor, Thomas

Genetic networks can determine cell fates in multicellular organisms with precision that often reaches the physical limits of the system. However, it is unclear how the organism uses this precision and whether it has biological content. Here we address this question in the developing fly embryo, in which a genetic network of patterning genes reaches 1% precision in positioning cells along the embryo axis. The network consists of three interconnected layers: an input layer of maternal gradients, a processing layer of gap genes, and an output layer of pair-rule genes with seven-striped patterns. From measurements of gap gene protein expression in hundreds of wild-type embryos we construct a "decoder", which is a look-up table that determines cellular positions from the concentration means, variances and co-variances. When we apply the decoder to measurements in mutant embryos lacking various combinations of the maternal inputs, we predict quantitative changes in the output layer such as missing, altered or displaced stripes. We confirm these predictions by measuring pair-rule expression in the mutant embryos. Our results thereby show that the precision of the patterning network is biologically meaningful and a necessary feature for decoding cell positions in the early fly embryo.
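The idea of a look-up-table decoder can be sketched for the simplest possible case: a single gene whose concentration at each position is modeled as Gaussian with a known mean and variance. The decoder returns the position whose distribution makes the observed concentration most likely. The single-gene restriction (which drops the covariances mentioned in the abstract), the Gaussian assumption, and all numerical values are illustrative assumptions, not the authors' decoder.

```python
import math


def decode_position(concentration, positions, mean, var):
    """Return the position with the highest Gaussian log-likelihood for the observation."""
    def log_likelihood(i):
        return (-0.5 * math.log(2 * math.pi * var[i])
                - (concentration - mean[i]) ** 2 / (2 * var[i]))

    best = max(range(len(positions)), key=log_likelihood)
    return positions[best]


# Hypothetical look-up table: fractional positions along the axis with the
# mean and variance of one gene's concentration measured at each position.
positions = [0.1, 0.3, 0.5, 0.7, 0.9]
mean = [1.0, 0.8, 0.6, 0.4, 0.2]
var = [0.01, 0.01, 0.01, 0.01, 0.01]

decoded_mid = decode_position(0.62, positions, mean, var)
decoded_end = decode_position(0.21, positions, mean, var)
print(decoded_mid, decoded_end)
```

Applying the same table to concentrations measured in a perturbed system yields predicted positions that may be shifted or ambiguous, which is the spirit of the mutant-embryo predictions described above.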
Inferring Gene Regulatory Networks by Singular Value Decomposition and Gravitation Field Algorithm

PubMed Central

Zheng, Ming; Wu, Jia-nan; Huang, Yan-xin; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang

2012-01-01

Reconstruction of gene regulatory networks (GRNs) is of utmost interest and has become a challenging computational problem in systems biology. However, every existing algorithm for inference from gene expression profiles has its own advantages and disadvantages. In particular, the effectiveness and efficiency of previous algorithms are not high enough. In this work, we proposed a novel inference algorithm from gene expression data based on a differential equation model. In this algorithm, two methods were included for inferring GRNs. Before reconstructing GRNs, the singular value decomposition method was used to decompose gene expression data, determine the algorithm solution space, and get all candidate solutions of GRNs. Within this generated family of candidate solutions, a modified gravitation field algorithm was used to infer GRNs by optimizing the criteria of the differential equation model and searching for the best network structure. The proposed algorithm is validated on both a simulated scale-free network and a real benchmark gene regulatory network from a network database. Both the Bayesian method and the traditional differential equation model were also used to infer GRNs, and their results were compared with those of the proposed algorithm. Genetic algorithm and simulated annealing were also used to evaluate the gravitation field algorithm. The cross-validation results confirmed the effectiveness of our algorithm, which significantly outperforms previous algorithms. PMID:23226565
Molecular networks discriminating mouse bladder responses to intravesical bacillus Calmette-Guerin (BCG), LPS, and TNF-α

PubMed Central

Saban, Marcia R; O'Donnell, Michael A; Hurst, Robert E; Wu, Xue-Ru; Simpson, Cindy; Dozmorov, Igor; Davis, Carole; Saban, Ricardo

2008-01-01

Background Despite being a mainstay for treating superficial bladder carcinoma and a promising agent for interstitial cystitis, the precise mechanism of Bacillus Calmette-Guerin (BCG) remains poorly understood. It is particularly unclear whether BCG is capable of altering gene expression in the bladder target organ beyond its well-recognized pro-inflammatory effects and how this relates to its therapeutic efficacy. The objective of this study was to determine differentially expressed genes in the mouse bladder following chronic intravesical BCG therapy and to compare the results to non-specific pro inflammatory stimuli (LPS and TNF-α). For this purpose, C57BL/6 female mice received four weekly instillations of BCG, LPS, or TNF-α. Seven days after the last instillation, the urothelium along with the submucosa was removed from detrusor muscle and the RNA was extracted from both layers for cDNA array experiments. Microarray results were normalized by a robust regression analysis and only genes with an expression above a conditional threshold of 0.001 (3SD above background) were selected for analysis. Next, genes presenting a 3-fold ratio in regard to the control group were entered in Ingenuity Pathway Analysis (IPA) for a comparative analysis in order to determine genes specifically regulated by BCG, TNF-α, and LPS. In addition, the transcriptome was precipitated with an antibody against RNA polymerase II and real-time polymerase chain reaction assay (Q-PCR) was used to confirm some of the BCG-specific transcripts. Results Molecular networks of treatment-specific genes generated several hypotheses regarding the mode of action of BCG. BCG-specific genes involved small GTPases and BCG-specific networks overlapped with the following canonical signaling pathways: axonal guidance, B cell receptor, aryl hydrocarbon receptor, IL-6, PPAR, Wnt/β-catenin, and cAMP. 
In addition, a specific detrusor network expressed a high degree of overlap with the development of the lymphatic system. Interestingly, TNF-α-specific networks overlapped with the following canonical signaling pathways: PPAR, death receptor, and apoptosis. Finally, LPS-specific networks overlapped with the LPS/IL-1 mediated inhibition of RXR. Because NF-kappaB occupied a central position in several networks, we further determined whether this transcription factor was part of the responses to BCG. Electrophoretic mobility shift assays confirmed the participation of NF-kappaB in the mouse bladder responses to BCG. In addition, BCG treatment of a human urothelial cancer cell line (J82) also increased the binding activity of NF-kappaB, as determined by precipitation of the chromatin by a NF-kappaB-p65 antibody and Q-PCR of genes bearing a NF-kappaB consensus sequence. Next, we tested the hypothesis of whether small GTPases such as LRG-47 are involved in the uptake of BCG by the bladder urothelium. Conclusion As expected, BCG treatment induces the transcription of genes belonging to common pro-inflammatory networks. However, BCG also induces unique genes belonging to molecular networks involved in axonal guidance and lymphatic system development within the bladder target organ. In addition, NF-kappaB seems to play a predominant role in the bladder responses to BCG therapy. Finally, in intact urothelium, BCG-GFP internalizes in LRG-47-positive vesicles. These results provide a molecular framework for the further study of the involvement of immune and nervous systems in the bladder responses to BCG therapy. PMID:18267009
Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

PubMed

Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che

2014-01-16

To improve on the tedious task of reconstructing gene networks by experimentally testing the possible interactions between genes, it has become a trend to adopt an automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, adopting cloud computing is advocated as a promising solution; most popular is the MapReduce programming model, a fault-tolerant framework for implementing parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use the well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. 
By coupling the parallel model population-based optimization method and the parallel computational framework, high quality solutions can be obtained within relatively short time. This integrated approach is a promising way for inferring large networks.

Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment

PubMed Central

2014-01-01

Background To improve on the tedious task of reconstructing gene networks by experimentally testing the possible interactions between genes, it has become a trend to adopt an automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, adopting cloud computing is advocated as a promising solution; most popular is the MapReduce programming model, a fault-tolerant framework for implementing parallel algorithms for inferring large gene networks. Results This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use the well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Conclusions Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. 
By coupling the parallel model population-based optimization method and the parallel computational framework, high quality solutions can be obtained within relatively short time. This integrated approach is a promising way for inferring large networks. PMID:24428926
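The map/reduce structure of a parallel population-based search can be sketched in a few lines. This is not the hybrid GA-PSO method or the Hadoop setup from the abstract: it is a deliberately simplified, mutation-only evolutionary loop, with Python's built-in `map` and `functools.reduce` standing in for the distributed phases. The `target` parameters, population sizes, and function names are all hypothetical.

```python
import random
from functools import reduce

def fitness(params, target):
    """Squared error between candidate parameters and a hypothetical target."""
    return sum((p - t) ** 2 for p, t in zip(params, target))

def evolve_island(task):
    """Map task: evolve one sub-population independently, return its best member."""
    seed, target, generations = task
    rng = random.Random(seed)  # fixed seed keeps each island reproducible
    pop = [[rng.uniform(-1, 1) for _ in target] for _ in range(20)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(ind, target))
        parents = pop[:10]  # keep the better half unchanged (elitism)
        # Offspring are mutated copies of randomly chosen parents.
        children = [[g + rng.gauss(0, 0.05) for g in rng.choice(parents)]
                    for _ in range(10)]
        pop = parents + children
    best = min(pop, key=lambda ind: fitness(ind, target))
    return fitness(best, target), best

def keep_better(a, b):
    """Reduce step: keep whichever island result has the lower error."""
    return a if a[0] <= b[0] else b

target = [0.3, -0.5, 0.8]
tasks = [(seed, target, 50) for seed in range(4)]
island_results = list(map(evolve_island, tasks))  # stands in for the map phase
best_error, best_params = reduce(keep_better, island_results)
```

Because each island only needs its own seed and the shared target, the map tasks are independent, which is the property that makes this kind of search straightforward to distribute.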
On the holistic approach in cellular and cancer biology: nonlinearity, complexity, and quasi-determinism of the dynamic cellular network.

PubMed

Waliszewski, P; Molski, M; Konarski, J

1998-06-01

A keystone of the molecular reductionist approach to cellular biology is a specific deductive strategy relating genotype to phenotype, two distinct categories. This relationship is based on the assumption that the intermediary cellular network of actively transcribed genes and their regulatory elements is deterministic (i.e., a link between expression of a gene and a phenotypic trait can always be identified, and evolution of the network in time is predetermined). However, experimental data suggest that the relationship between genotype and phenotype is nonbijective (i.e., a gene can contribute to the emergence of more than just one phenotypic trait or a phenotypic trait can be determined by expression of several genes). This implies nonlinearity (i.e., lack of the proportional relationship between input and the outcome), complexity (i.e., emergence of the hierarchical network of multiple cross-interacting elements that is sensitive to initial conditions, possesses multiple equilibria, organizes spontaneously into different morphological patterns, and is controlled in dispersed rather than centralized manner), and quasi-determinism (i.e., coexistence of deterministic and nondeterministic events) of the network. Nonlinearity within the space of the cellular molecular events underlies the existence of a fractal structure within a number of metabolic processes, and patterns of tissue growth, which is measured experimentally as a fractal dimension. Because of its complexity, the same phenotype can be associated with a number of alternative sequences of cellular events. Moreover, the primary cause initiating phenotypic evolution of cells such as malignant transformation can be favored probabilistically, but not identified unequivocally. Thermodynamic fluctuations of energy, rather than gene mutations (the material traits of the fluctuations), alter both the molecular and informational structure of the network. 
Then, the interplay between deterministic chaos, complexity, self-organization, and natural selection drives formation of malignant phenotype. This concept offers a novel perspective for investigation of tumorigenesis without invalidating current molecular findings. The essay integrates the ideas of the sciences of complexity in a biological context.
Estimation of the proteomic cancer co-expression sub networks by using association estimators.

PubMed

Erdoğan, Cihat; Kurt, Zeyneb; Diri, Banu

2017-01-01

In this study, the association estimators, which have a significant influence on gene network inference methods and are used for determining the molecular interactions, were examined within the co-expression network inference concept. By using the proteomic data from five different cancer types, the hub genes/proteins within the disease-associated gene-gene/protein-protein interaction sub networks were identified. Proteomic data from various cancer types is collected from The Cancer Proteome Atlas (TCPA). Nine correlation- and mutual information (MI)-based association estimators that are commonly used in the literature were compared in this study. As the gold standard to measure the association estimators' performance, a multi-layer data integration platform on gene-disease associations (DisGeNET) and the Molecular Signatures Database (MSigDB) were used. Fisher's exact test was used to evaluate the performance of the association estimators by comparing the created co-expression networks with the disease-associated pathways. It was observed that the MI based estimators provided more successful results than the Pearson and Spearman correlation approaches, which are used in the estimation of biological networks in the weighted correlation network analysis (WGCNA) package. In correlation-based methods, the best average success rate for five cancer types was 60%, while in MI-based methods the average success ratio was 71% for the James-Stein Shrinkage (Shrink) and 64% for the Schurmann-Grassberger (SG) association estimators, respectively. Moreover, the hub genes and the inferred sub networks are presented for the consideration of researchers and experimentalists.
Estimation of the proteomic cancer co-expression sub networks by using association estimators

PubMed Central

Kurt, Zeyneb; Diri, Banu

2017-01-01

In this study, the association estimators, which have a significant influence on gene network inference methods and are used for determining the molecular interactions, were examined within the co-expression network inference concept. By using the proteomic data from five different cancer types, the hub genes/proteins within the disease-associated gene-gene/protein-protein interaction sub networks were identified. Proteomic data from various cancer types is collected from The Cancer Proteome Atlas (TCPA). Nine correlation- and mutual information (MI)-based association estimators that are commonly used in the literature were compared in this study. As the gold standard to measure the association estimators’ performance, a multi-layer data integration platform on gene-disease associations (DisGeNET) and the Molecular Signatures Database (MSigDB) were used. Fisher's exact test was used to evaluate the performance of the association estimators by comparing the created co-expression networks with the disease-associated pathways. It was observed that the MI based estimators provided more successful results than the Pearson and Spearman correlation approaches, which are used in the estimation of biological networks in the weighted correlation network analysis (WGCNA) package. In correlation-based methods, the best average success rate for five cancer types was 60%, while in MI-based methods the average success ratio was 71% for the James-Stein Shrinkage (Shrink) and 64% for the Schurmann-Grassberger (SG) association estimators, respectively. Moreover, the hub genes and the inferred sub networks are presented for the consideration of researchers and experimentalists. PMID:29145449
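A small sketch can show why a mutual information estimator may pick up associations that a correlation coefficient misses. It uses the plain plug-in (empirical histogram) MI estimate, not the James-Stein shrinkage or Schurmann-Grassberger estimators named in the abstract, and the data, bin count, and function names are hypothetical.

```python
import math
from collections import Counter

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def discretize(values, bins):
    """Assign each value to one of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=4):
    """Plug-in mutual information estimate, in nats."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())

# Hypothetical data: y depends on x, but not linearly.
x = [i / 10 for i in range(-20, 21)]
y = [v * v for v in x]

r = pearson(x, y)
mi = mutual_information(x, y)
```

Here the symmetric quadratic dependence gives a Pearson coefficient of essentially zero, while the MI estimate is clearly positive.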
Genomic connectivity networks based on the BrainSpan atlas of the developing human brain

NASA Astrophysics Data System (ADS)

Mahfouz, Ahmed; Ziats, Mark N.; Rennert, Owen M.; Lelieveldt, Boudewijn P. F.; Reinders, Marcel J. T.

2014-03-01

The human brain comprises systems of networks that span the molecular, cellular, anatomic and functional levels. Molecular studies of the developing brain have focused on elucidating networks among gene products that may drive cellular brain development by functioning together in biological pathways. On the other hand, studies of the brain connectome attempt to determine how anatomically distinct brain regions are connected to each other, either anatomically (diffusion tensor imaging) or functionally (functional MRI and EEG), and how they change over development. A global examination of the relationship between gene expression and connectivity in the developing human brain is necessary to understand how the genetic signature of different brain regions instructs connections to other regions. Furthermore, analyzing the development of connectivity networks based on the spatio-temporal dynamics of gene expression provides a new insight into the effect of neurodevelopmental disease genes on brain networks. In this work, we construct connectivity networks between brain regions based on the similarity of their gene expression signature, termed "Genomic Connectivity Networks" (GCNs). Genomic connectivity networks were constructed using data from the BrainSpan Transcriptional Atlas of the Developing Human Brain. Our goal was to understand how the genetic signatures of anatomically distinct brain regions relate to each other across development. We assessed the neurodevelopmental changes in connectivity patterns of brain regions when networks were constructed with genes implicated in the neurodevelopmental disorder autism (autism spectrum disorder; ASD). Using graph theory metrics to characterize the GCNs, we show that ASD-GCNs are relatively less connected later in development with the cerebellum showing a very distinct expression of ASD-associated genes compared to other brain regions.
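The construction of a connectivity network from expression signatures can be sketched as follows. The abstract does not say which similarity measure or threshold was used, so Pearson correlation and the cutoff of 0.8 are assumptions, and the region names and values are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def connectivity_network(signatures, threshold):
    """Connect two regions when their signatures correlate at or above threshold."""
    edges = set()
    for a, b in combinations(sorted(signatures), 2):
        if pearson(signatures[a], signatures[b]) >= threshold:
            edges.add((a, b))
    return edges

def density(edges, n_regions):
    """Fraction of possible region pairs that are connected."""
    return len(edges) / (n_regions * (n_regions - 1) / 2)

# Hypothetical expression signatures (one value per gene) for three regions.
signatures = {
    "region_A": [1.0, 2.0, 3.0, 4.0],
    "region_B": [2.0, 4.1, 5.9, 8.0],
    "region_C": [4.0, 3.0, 2.0, 1.0],
}
edges = connectivity_network(signatures, threshold=0.8)
network_density = density(edges, len(signatures))
```

Repeating this per developmental stage and comparing graph metrics such as `network_density` is one simple way to track how connectivity changes over time.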
Dynamic modelling of microRNA regulation during mesenchymal stem cell differentiation.

PubMed

Weber, Michael; Sotoca, Ana M; Kupfer, Peter; Guthke, Reinhard; van Zoelen, Everardus J

2013-11-12

Network inference from gene expression data is a typical approach to reconstruct gene regulatory networks. During chondrogenic differentiation of human mesenchymal stem cells (hMSCs), a complex transcriptional network is active and regulates the temporal differentiation progress. As modulators of transcriptional regulation, microRNAs (miRNAs) play a critical role in stem cell differentiation. Integrated network inference aims at determining interrelations between miRNAs and mRNAs on the basis of expression data as well as miRNA target predictions. We applied the NetGenerator tool in order to infer an integrated gene regulatory network. Time series experiments were performed to measure mRNA and miRNA abundances of TGF-beta1+BMP2 stimulated hMSCs. Network nodes were identified by analysing temporal expression changes, miRNA target gene predictions, time series correlation and literature knowledge. Network inference was performed using NetGenerator to reconstruct a dynamical regulatory model based on the measured data and prior knowledge. The resulting model is robust against noise and shows an optimal trade-off between fitting precision and inclusion of prior knowledge. It predicts the influence of miRNAs on the expression of chondrogenic marker genes and therefore proposes novel regulatory relations in differentiation control. By analysing the inferred network, we identified a previously unknown regulatory effect of miR-524-5p on the expression of the transcription factor SOX9 and the chondrogenic marker genes COL2A1, ACAN and COL10A1. Genome-wide exploration of miRNA-mRNA regulatory relationships is a reasonable approach to identify miRNAs which have so far not been associated with the investigated differentiation process. The NetGenerator tool is able to identify valid gene regulatory networks on the basis of miRNA and mRNA time series data.
A Framework for Engineering Stress Resilient Plants Using Genetic Feedback Control and Regulatory Network Rewiring.

PubMed

Foo, Mathias; Gherman, Iulia; Zhang, Peijun; Bates, Declan G; Denby, Katherine J

2018-05-23

Crop disease leads to significant waste worldwide, both pre- and postharvest, with subsequent economic and sustainability consequences. Disease outcome is determined both by the plants' response to the pathogen and by the ability of the pathogen to suppress defense responses and manipulate the plant to enhance colonization. The defense response of a plant is characterized by significant transcriptional reprogramming mediated by underlying gene regulatory networks, and components of these networks are often targeted by attacking pathogens. Here, using gene expression data from Botrytis cinerea-infected Arabidopsis plants, we develop a systematic approach for mitigating the effects of pathogen-induced network perturbations, using the tools of synthetic biology. We employ network inference and system identification techniques to build an accurate model of an Arabidopsis defense subnetwork that contains key genes determining susceptibility of the plant to the pathogen attack. Once validated against time-series data, we use this model to design and test perturbation mitigation strategies based on the use of genetic feedback control. We show how a synthetic feedback controller can be designed to attenuate the effect of external perturbations on the transcription factor CHE in our subnetwork. We investigate and compare two approaches for implementing such a controller biologically (direct implementation of the genetic feedback controller, and rewiring the regulatory regions of multiple genes) to achieve the network motif required to implement the controller. Our results highlight the potential of combining feedback control theory with synthetic biology for engineering plants with enhanced resilience to environmental stress.
Inferring gene regression networks with model trees

PubMed Central

2010-01-01

Background Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes, building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful for determining whether two genes have a strong global similarity, but they do not detect local similarities. Results We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph of all the relationships among output and input genes is built, taking into account whether each pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: the Saccharomyces cerevisiae and E. coli data sets. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results to those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. 
They are very often more precise than linear regression models because they can add just different linear regressions to separate areas of the search space favoring to infer localized similarities over a more global similarity. Furthermore, experimental results show the good performance of REGNET. PMID:20950452
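The abstract mentions a statistical procedure to control the false discovery rate without naming it. As an illustration only, the sketch below uses the Benjamini-Hochberg step-up procedure, which is one common choice and is an assumption here; the p-values are hypothetical and would correspond to candidate gene pairs.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected

# Hypothetical p-values, one per candidate gene pair.
p_values = [0.041, 0.001, 0.205, 0.008, 0.039, 0.06, 0.074, 0.042]
significant = benjamini_hochberg(p_values, alpha=0.05)
```

Only the pairs flagged `True` would be kept as edges in the resulting association network.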
Network-assisted crop systems genetics: network inference and integrative analysis.

PubMed

Lee, Tak; Kim, Hyojin; Lee, Insuk

2015-04-01

Although next-generation sequencing (NGS) technology has enabled the decoding of many crop species genomes, most of the underlying genetic components for economically important crop traits remain to be determined. Network approaches have proven useful for the study of the reference plant, Arabidopsis thaliana, and the success of network-based crop genetics will also require the availability of genome-scale functional networks for crop species. In this review, we discuss how to construct functional networks and elucidate the holistic view of a crop system. The crop gene network then can be used for gene prioritization and the analysis of resequencing-based genome-wide association study (GWAS) data, the amount of which will rapidly grow in the field of crop science in the coming years. Copyright © 2015 Elsevier Ltd. All rights reserved.
Reverse engineering the gap gene network of Drosophila melanogaster.

PubMed

Perkins, Theodore J; Jaeger, Johannes; Reinitz, John; Glass, Leon

2006-05-01

A fundamental problem in functional genomics is to determine the structure and dynamics of genetic networks based on expression data. We describe a new strategy for solving this problem and apply it to recently published data on early Drosophila melanogaster development. Our method is orders of magnitude faster than current fitting methods and allows us to fit different types of rules for expressing regulatory relationships. Specifically, we use our approach to fit models using a smooth nonlinear formalism for modeling gene regulation (gene circuits) as well as models using logical rules based on activation and repression thresholds for transcription factors. Our technique also allows us to infer regulatory relationships de novo or to test network structures suggested by the literature. We fit a series of models to test several outstanding questions about gap gene regulation, including regulation of and by hunchback and the role of autoactivation. Based on our modeling results and validation against the experimental literature, we propose a revised network structure for the gap gene system. Interestingly, some relationships in standard textbook models of gap gene regulation appear to be unnecessary for or even inconsistent with the details of gap gene expression during wild-type development.
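The smooth nonlinear "gene circuit" formalism mentioned above can be illustrated with a toy simulation. The sketch assumes the commonly used form da_i/dt = R_i g(sum_j T_ij a_j + h_i) - lambda_i a_i with a sigmoid g, omits diffusion, and uses forward-Euler integration; the two-gene mutual-repression matrix `T` and all parameter values are hypothetical and are not the fitted gap gene model.

```python
import math

def g(u):
    """Smooth sigmoid rising from 0 to 1."""
    return 0.5 * (u / math.sqrt(u * u + 1.0) + 1.0)

def simulate(T, h, R, lam, a0, dt=0.01, steps=2000):
    """Forward-Euler integration of
    da_i/dt = R_i * g(sum_j T[i][j] * a_j + h_i) - lam_i * a_i."""
    a = list(a0)
    n = len(a)
    for _ in range(steps):
        rates = [
            R[i] * g(sum(T[i][j] * a[j] for j in range(n)) + h[i]) - lam[i] * a[i]
            for i in range(n)
        ]
        a = [a[i] + dt * rates[i] for i in range(n)]
    return a

# Hypothetical two-gene circuit: each gene represses the other.
T = [[0.0, -5.0],
     [-5.0, 0.0]]
final = simulate(T, h=[1.0, 1.0], R=[1.0, 1.0], lam=[1.0, 1.0], a0=[0.8, 0.2])
```

With these values the gene that starts higher stays on and keeps the other repressed; fitting would amount to adjusting `T`, `h`, `R` and `lam` until simulated levels match measured expression.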
Towards the prediction of essential genes by integration of network topology, cellular localization and biological process information

PubMed Central

2009-01-01

Background The identification of essential genes is important for the understanding of the minimal requirements for cellular life and for practical purposes, such as drug design. However, the experimental techniques for essential genes discovery are labor-intensive and time-consuming. Considering these experimental constraints, a computational approach capable of accurately predicting essential genes would be of great value. We therefore present here a machine learning-based computational approach relying on network topological features, cellular localization and biological process information for prediction of essential genes. Results We constructed a decision tree-based meta-classifier and trained it on datasets with individual and grouped attributes (network topological features, cellular compartments and biological processes) to generate various predictors of essential genes. We showed that the predictors with better performances are those generated by datasets with integrated attributes. Using the predictor with all attributes, i.e., network topological features, cellular compartments and biological processes, we obtained the best predictor of essential genes that was then used to classify yeast genes with unknown essentiality status. Finally, we generated decision trees by training the J48 algorithm on datasets with all network topological features, cellular localization and biological process information to discover cellular rules for essentiality. We found that the number of protein physical interactions, the nuclear localization of proteins and the number of regulating transcription factors are the most important factors determining gene essentiality. Conclusion We were able to demonstrate that network topological features, cellular localization and biological process information are reliable predictors of essential genes. Moreover, by constructing decision trees based on these data, we could discover cellular rules governing essentiality. PMID:19758426
In silico experiment system for testing hypothesis on gene functions using three condition specific biological networks.

PubMed

Lee, Chai-Jin; Kang, Dongwon; Lee, Sangseon; Lee, Sunwon; Kang, Jaewoo; Kim, Sun

2018-05-25

Determining functions of a gene requires time-consuming, expensive biological experiments. Scientists can speed up this experimental process if the literature information and biological networks can be adequately provided. In this paper, we present a web-based information system that can perform in silico experiments that computationally test hypotheses about the function of a gene. A hypothesis that is specified in English by the user is converted to genes using a literature and knowledge mining system called BEST. Condition-specific TF, miRNA and PPI (protein-protein interaction) networks are automatically generated by projecting gene and miRNA expression data to template networks. Then, an in silico experiment tests how well the target genes are connected from the knockout gene through the condition-specific networks. The test result visualizes the path from the knockout gene to the target genes in the three networks. Statistical and information-theoretic scores are provided on the resulting web page to help scientists either accept or reject the hypothesis being tested. Our web-based system was extensively tested using three data sets: the E2f1, Lrrk2, and Dicer1 knockout data sets. We were able to reproduce gene functions reported in the original research papers. In addition, we comprehensively tested with all disease names in MalaCards as hypotheses to show the effectiveness of our system. Our in silico experiment system can be very useful in suggesting biological mechanisms which can be further tested in vivo or in vitro. http://biohealth.snu.ac.kr/software/insilico/. Copyright © 2018 Elsevier Inc. All rights reserved.
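The idea of testing how well target genes are connected from a knockout gene can be sketched with a breadth-first search. This is a minimal illustration, not the scoring used by the system described above; the network and gene names are hypothetical.

```python
from collections import deque

def shortest_path(network, source, target):
    """Breadth-first search; returns the node list of a shortest path, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parent links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in network.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical condition-specific network: regulator -> regulated genes.
network = {
    "knockout_gene": ["tf_1", "tf_2"],
    "tf_1": ["target_1"],
    "tf_2": [],
    "target_2": ["target_1"],
}
path_to_target_1 = shortest_path(network, "knockout_gene", "target_1")
path_to_target_2 = shortest_path(network, "knockout_gene", "target_2")
```

A target with a short path from the knockout gene would count as evidence for the hypothesis in this toy setting, while an unreachable target (here `target_2`) would count against it.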
The immunoglobulin-like genetic predetermination of the brain: the protocadherins, blueprint of the neuronal network

NASA Astrophysics Data System (ADS)

Hilschmann, N.; Barnikol, H. U.; Barnikol-Watanabe, S.; Götz, H.; Kratzin, H.; Thinnes, F. P.

2001-01-01

The morphogenesis of the brain is governed by synaptogenesis. Synaptogenesis in turn is determined by cell adhesion molecules, which bridge the synaptic cleft and, by homophilic contact, decide which neurons are connected and which are not. Because of their enormous diversification in specificities, protocadherins (pcdhα, pcdhβ, pcdhγ), a new class of cadherins, play a decisive role. Surprisingly, the genetic control of the protocadherins is very similar to that of the immunoglobulins. There are three sets of variable (V) genes followed by a corresponding constant (C) gene. Applying the rules of the immunoglobulin genes to the protocadherin genes leads, despite this similarity, to quite different results in the central nervous system. The lymphocyte expresses one single receptor molecule specifically directed against an outside stimulus. In contrast, there are three specific recognition sites in each neuron, each expressing a different protocadherin. In this way, 4,950 different neurons arising from one stem cell form a neuronal network, in which homophilic contacts can be formed in 52 layers, permitting an enormous number of different connections and restraints between neurons. This network is one module of the central computer of the brain. Since the V-genes are generated during evolution and V-gene translocation during embryogenesis, outside stimuli have no influence on this network. The network is an inborn property of the protocadherin genes. Every circuit produced, as well as learning and memory, has to be based on this genetically predetermined network. This network is so universal that it can cope with everything, even the unexpected. In this respect the neuronal network resembles the recognition sites of the immunoglobulins.
Analyzing the Coordinated Gene Network Underlying Temperature-Dependent Sex Determination in Reptiles

PubMed Central

Shoemaker, Christina M.; Crews, David

2009-01-01

Although gonadogenesis has been extensively studied in vertebrates with genetic sex determination, investigations at the molecular level in nontraditional model organisms with temperature-dependent sex determination are a relatively new area of research. Results show that while the key players of the molecular network underlying gonad development appear to be retained, their functions range from conserved to novel roles. In this review, we summarize experiments investigating candidate molecular players underlying temperature-dependent sex determination. We discuss some of the problems encountered unraveling this network, pose potential solutions, and suggest rewarding future directions of research. PMID:19022389
Polyploidy in animals: effects of gene expression on sex determination, evolution and ecology.

PubMed

Wertheim, B; Beukeboom, L W; van de Zande, L

2013-01-01

Polyploidy is rarer in animals than in plants. Why? Since Muller's observation in 1925, many hypotheses have been proposed and tested, but none were able to completely explain this intriguing fact. New genomic technologies enable the study of whole genomes to explain the constraints on or consequences of polyploidization, rather than focusing on specific genes or life history characteristics. Here, we review a selection of old and recent literature on polyploidy in animals, with emphasis on the consequences of polyploidization for gene expression patterns and genomic network interactions. We propose a conceptual model to contrast various scenarios for changes in genomic networks, which may serve as a framework to explain the different evolutionary dynamics of polyploidy in animals and plants. We also present new insights of genetic sex determination in animals and our emerging understanding of how animal sex determination systems may hamper or enable polyploidization, including some recent data on haplodiploids. We discuss the role of polyploidy in evolution and ecology, using a gene regulation perspective, and conclude with a synopsis regarding the effects of whole genome duplications on the balance of genomic networks. See also the sister articles focusing on plants by Ashman et al. and Madlung and Wendel in this themed issue. Copyright © 2013 S. Karger AG, Basel.
DEFINING THE PLAYERS IN HIGHER-ORDER NETWORKS: PREDICTIVE MODELING FOR REVERSE ENGINEERING FUNCTIONAL INFLUENCE NETWORKS

DOE Office of Scientific and Technical Information (OSTI.GOV)

McDermott, Jason E.; Costa, Michelle N.; Stevens, S.L.

A difficult problem that is currently growing rapidly due to the sharp increase in the amount of high-throughput data available for many systems is that of determining useful and informative causative influence networks. These networks can be used to predict behavior given observation of a small number of components, predict behavior at a future time point, or identify components that are critical to the functioning of the system under particular conditions. In these endeavors incorporating observations of systems from a wide variety of viewpoints can be particularly beneficial, but has often been undertaken with the objective of inferring networks that are generally applicable. The focus of the current work is to integrate both general observations and measurements taken for a particular pathology, that of ischemic stroke, to provide improved ability to produce useful predictions of systems behavior. A number of hybrid approaches have recently been proposed for network generation in which the Gene Ontology is used to filter or enrich network links inferred from gene expression data through reverse engineering methods. These approaches have been shown to improve the biological plausibility of the inferred relationships determined, but still treat knowledge-based and machine-learning inferences as incommensurable inputs. In this paper, we explore how further improvements may be achieved through a full integration of network inference insights achieved through application of the Gene Ontology and reverse engineering methods with specific reference to the construction of dynamic models of transcriptional regulatory networks. We show that integrating two approaches to network construction, one based on reverse-engineering from conditional transcriptional data, one based on reverse-engineering from in situ hybridization data, and another based on functional associations derived from Gene Ontology, using probabilities can improve results of clustering as evaluated by a predictive model of transcriptional expression levels.
Planting increases the abundance and structure complexity of soil core functional genes relevant to carbon and nitrogen cycling

PubMed Central

Wang, Feng; Liang, Yuting; Jiang, Yuji; Yang, Yunfeng; Xue, Kai; Xiong, Jinbo; Zhou, Jizhong; Sun, Bo

2015-01-01

Plants have an important impact on soil microbial communities and their functions. However, how plants determine the microbial composition and network interactions is still poorly understood. During a four-year field experiment, we investigated the functional gene composition of three types of soils (Phaeozem, Cambisols and Acrisol) under maize planting and bare fallow regimes located in cold temperate, warm temperate and subtropical regions, respectively. The core genes were identified using high-throughput functional gene microarray (GeoChip 3.0), and functional molecular ecological networks (fMENs) were subsequently developed with the random matrix theory (RMT)-based conceptual framework. Our results demonstrated that planting significantly (P < 0.05) increased the gene alpha-diversity in terms of richness and Shannon – Simpson’s indexes for all three types of soils and 83.5% of microbial alpha-diversity can be explained by the plant factor. Moreover, planting had significant impacts on the microbial community structure and the network interactions of the microbial communities. The calculated network complexity was higher under maize planting than under bare fallow regimes. The increase of the functional genes led to an increase in both soil respiration and nitrification potential with maize planting, indicating that changes in the soil microbial communities and network interactions influenced ecological functioning. PMID:26396042
Uncovering co-expression gene network modules regulating fruit acidity in diverse apples.

PubMed

Bai, Yang; Dougherty, Laura; Cheng, Lailiang; Zhong, Gan-Yuan; Xu, Kenong

2015-08-16

Acidity is a major contributor to fruit quality. Several organic acids are present in apple fruit, but malic acid is predominant and determines fruit acidity. The trait is largely controlled by the Malic acid (Ma) locus, for which Ma1, which putatively encodes a vacuolar aluminum-activated malate transporter1 (ALMT1)-like protein, is a strong candidate gene. We hypothesize that fruit acidity is governed by a gene network in which Ma1 is a key member. The goal of this study is to identify the gene network and the potential mechanisms through which the network operates. Guided by Ma1, we analyzed the transcriptomes of mature fruit of contrasting acidity from six apple accessions of genotype Ma_ (MaMa or Mama) and four of mama using RNA-seq and identified 1301 fruit acidity associated genes, among which 18 were most significant acidity genes (MSAGs). Network inference using weighted gene co-expression network analysis (WGCNA) revealed five co-expression gene network modules of significant (P < 0.001) correlation with malate. Of these, the Ma1 containing module (Turquoise) of 336 genes showed the highest correlation (0.79). We also identified 12 intramodular hub genes from each of the five modules and 18 enriched gene ontology (GO) terms and MapMan sub-bins, including two GO terms (GO:0015979 and GO:0009765) and two MapMan sub-bins (1.3.4 and 1.1.1.1) related to photosynthesis in module Turquoise. Using Lemon-Tree algorithms, we identified 12 regulator genes of probabilistic scores 35.5-81.0, including MDP0000525602 (an LRR receptor kinase), MDP0000319170 (an IQD2-like CaM binding protein) and MDP0000190273 (an EIN3-like transcription factor) of greater interest for being one of the 18 MSAGs or one of the 12 intramodular hub genes in Turquoise, and/or a regulator to the cluster containing Ma1. 
The most relevant finding of this study is the identification of the MSAGs, intramodular hub genes, enriched photosynthesis related processes, and regulator genes in a WGCNA module Turquoise that not only encompasses Ma1 but also shows the highest modular correlation with acidity. Overall, this study provides important insight into the Ma1-mediated gene network controlling acidity in mature apple fruit of diverse genetic background.
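The co-expression workflow summarized above (soft-thresholded correlation network, module summary, correlation of the module with a trait such as malate) can be illustrated with a minimal NumPy sketch. This is not the WGCNA package and not the authors' pipeline; the function names, the soft-threshold power `beta=6`, and the use of summed adjacency as a hub criterion are assumptions chosen only for illustration.

```python
import numpy as np

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned co-expression adjacency |cor|^beta.

    expr: array of shape (samples, genes). beta=6 is an assumed
    soft-threshold power, not a value taken from the study.
    """
    corr = np.corrcoef(expr, rowvar=False)
    adj = np.abs(corr) ** beta
    np.fill_diagonal(adj, 0.0)  # ignore self-connections
    return adj

def module_eigengene(expr_module):
    """First principal component of the standardized module expression."""
    z = (expr_module - expr_module.mean(axis=0)) / expr_module.std(axis=0)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, 0] * s[0]  # sign is arbitrary, as for any principal component

def hub_gene(adj, module_idx):
    """Index of the gene with the largest summed adjacency inside the module."""
    sub = adj[np.ix_(module_idx, module_idx)]
    return module_idx[int(np.argmax(sub.sum(axis=0)))]
```

A module-trait correlation would then be `np.corrcoef(module_eigengene(expr[:, idx]), trait)[0, 1]`, where `idx` and `trait` are hypothetical inputs; because the eigengene sign is arbitrary, only the absolute value is meaningful in this sketch.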
Dual regulation of gene expression mediated by extended MAPK activation and salicylic acid contributes to robust innate immunity in Arabidopsis thaliana.

PubMed

Tsuda, Kenichi; Mine, Akira; Bethke, Gerit; Igarashi, Daisuke; Botanga, Christopher J; Tsuda, Yayoi; Glazebrook, Jane; Sato, Masanao; Katagiri, Fumiaki

2013-01-01

Network robustness is a crucial property of the plant immune signaling network because pathogens are under a strong selection pressure to perturb plant network components to dampen plant immune responses. Nevertheless, modulation of network robustness is an area of network biology that has rarely been explored. While two modes of plant immunity, Effector-Triggered Immunity (ETI) and Pattern-Triggered Immunity (PTI), extensively share signaling machinery, the network output is much more robust against perturbations during ETI than PTI, suggesting modulation of network robustness. Here, we report a molecular mechanism underlying the modulation of the network robustness in Arabidopsis thaliana. The salicylic acid (SA) signaling sector regulates a major portion of the plant immune response and is important in immunity against biotrophic and hemibiotrophic pathogens. In Arabidopsis, SA signaling was required for the proper regulation of the vast majority of SA-responsive genes during PTI. However, during ETI, regulation of most SA-responsive genes, including the canonical SA marker gene PR1, could be controlled by SA-independent mechanisms as well as by SA. The activation of the two immune-related MAPKs, MPK3 and MPK6, persisted for several hours during ETI but less than one hour during PTI. Sustained MAPK activation was sufficient to confer SA-independent regulation of most SA-responsive genes. Furthermore, the MPK3 and SA signaling sectors were compensatory to each other for inhibition of bacterial growth as well as for PR1 expression during ETI. These results indicate that the duration of the MAPK activation is a critical determinant for modulation of robustness of the immune signaling network. Our findings with the plant immune signaling network imply that the robustness level of a biological network can be modulated by the activities of network components.
Gene network polymorphism is the raw material of natural selection: the selfish gene network hypothesis.

PubMed

Boldogköi, Zsolt

2004-09-01

Population genetics, the mathematical theory of modern evolutionary biology, defines evolution as the alteration of the frequency of distinct gene variants (alleles) differing in fitness over time. The major problem with this view is that in gene and protein sequences we can find little evidence concerning the molecular basis of phenotypic variance, especially variance that would confer adaptive benefit to its bearers. Some novel data, however, suggest that a large amount of genetic variation exists in the regulatory region of genes within populations. In addition, comparison of homologous DNA sequences of various species shows that evolution appears to depend more strongly on gene expression than on the genes themselves. Furthermore, it has been demonstrated in several systems that genes form functional networks, whose products exhibit interrelated expression profiles. Finally, it has been found that regulatory circuits of development behave as evolutionary units. These data demonstrate that our view of evolution calls for a new synthesis. In this article I propose a novel concept, termed the selfish gene network hypothesis, which is based on an overall consideration of the above findings. The major statements of this hypothesis are as follows. (1) Instead of individual genes, gene networks (GNs) are responsible for the determination of traits and behaviors. (2) The primary source of microevolution is the intraspecific polymorphism in GNs and not the allelic variation in either the coding or the regulatory sequences of individual genes. (3) GN polymorphism is generated by the variation in the regulatory regions of the component genes and not by the variance in their coding sequences. (4) Evolution proceeds through continuous restructuring of the composition of GNs rather than fixing of specific alleles or GN variants.

Analysis of genetic association in Listeria and Diabetes using Hierarchical Clustering and Silhouette Index

NASA Astrophysics Data System (ADS)

Pagnuco, Inti A.; Pastore, Juan I.; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L.

2016-04-01

It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, where significant groups of genes are defined based on some criteria. This task is usually performed by clustering algorithms, where the whole family of genes, or a subset of them, is clustered into meaningful groups based on their expression values in a set of experiments. In this work we used a methodology based on the Silhouette index as a measure of cluster quality for individual gene groups, and a combination of several variants of hierarchical clustering to generate the candidate groups, to obtain sets of co-expressed genes for two real data examples. We analyzed the quality of the best ranked groups, obtained by the algorithm, using an online bioinformatics tool that provides network information for the selected genes. Moreover, to verify the performance of the algorithm, considering the fact that it doesn’t find all possible subsets, we compared its results against a full search, to determine the number of good co-regulated sets not detected.
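The general idea described above, generating candidate gene groups with several hierarchical clustering variants and ranking each group by its own Silhouette index, can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the correlation distance, the three linkage methods, the fixed `n_clusters`, and the function names are all hypothetical choices.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform

def silhouette_per_cluster(dist, labels):
    """Mean silhouette width of each cluster, from a square distance matrix."""
    uniq = np.unique(labels)
    scores = {}
    if len(uniq) < 2:
        return scores  # silhouette is undefined with a single cluster
    for c in uniq:
        members = np.where(labels == c)[0]
        if len(members) < 2:
            continue  # skip singleton groups
        widths = []
        for i in members:
            a = dist[i, members[members != i]].mean()  # cohesion
            b = min(dist[i, labels == k].mean() for k in uniq if k != c)  # separation
            denom = max(a, b)
            widths.append(0.0 if denom == 0 else (b - a) / denom)
        scores[int(c)] = float(np.mean(widths))
    return scores

def rank_gene_groups(expr, n_clusters=3, methods=("average", "complete", "single")):
    """expr: genes x conditions. Returns (score, method, gene indices), best first."""
    condensed = pdist(expr, metric="correlation")  # 1 - Pearson correlation
    dist = squareform(condensed)
    candidates = []
    for method in methods:
        labels = fcluster(linkage(condensed, method=method),
                          t=n_clusters, criterion="maxclust")
        for c, score in silhouette_per_cluster(dist, labels).items():
            genes = tuple(int(i) for i in np.where(labels == c)[0])
            candidates.append((score, method, genes))
    candidates.sort(key=lambda item: item[0], reverse=True)
    return candidates
```

Scoring groups individually, rather than scoring a whole partition, lets a tight group found by one linkage variant rank highly even when the rest of that partition is poor.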
Prioritization of orphan disease-causing genes using topological feature and GO similarity between proteins in interaction networks.

PubMed

Li, Min; Li, Qi; Ganegoda, Gamage Upeksha; Wang, JianXin; Wu, FangXiang; Pan, Yi

2014-11-01

Identification of disease-causing genes among a large number of candidates is a fundamental challenge in human disease studies. However, it is still time-consuming and laborious to determine the real disease-causing genes by biological experiments. With the advances of the high-throughput techniques, a large number of protein-protein interactions have been produced. Therefore, to address this issue, several methods based on protein interaction network have been proposed. In this paper, we propose a shortest path-based algorithm, named SPranker, to prioritize disease-causing genes in protein interaction networks. Considering the fact that diseases with similar phenotypes are generally caused by functionally related genes, we further propose an improved algorithm SPGOranker by integrating the semantic similarity of GO annotations. SPGOranker not only considers the topological similarity between protein pairs in a protein interaction network but also takes their functional similarity into account. The proposed algorithms SPranker and SPGOranker were applied to 1598 known orphan disease-causing genes from 172 orphan diseases and compared with three state-of-the-art approaches, ICN, VS and RWR. The experimental results show that SPranker and SPGOranker outperform ICN, VS, and RWR for the prioritization of orphan disease-causing genes. Importantly, for the case study of severe combined immunodeficiency, SPranker and SPGOranker predict several novel causal genes.
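The abstract does not give the SPranker scoring function, so the following is only a generic sketch of shortest-path-based prioritization: candidates are scored by their closeness, in an unweighted interaction graph, to known disease genes. The scoring rule (sum of inverse shortest-path lengths), the graph representation, and all names are assumptions for illustration.

```python
from collections import deque

def bfs_distances(graph, source):
    """Shortest-path lengths (in edges) from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def rank_candidates(graph, seeds, candidates):
    """Rank candidates by summed inverse distance to known disease genes (seeds).

    Hypothetical scoring rule, not the published SPranker formula.
    Unreachable candidates keep a score of 0.
    """
    scores = {c: 0.0 for c in candidates}
    for s in seeds:
        dist = bfs_distances(graph, s)
        for c in candidates:
            if c != s and c in dist:
                scores[c] += 1.0 / dist[c]
    # Highest score first; ties broken by name for reproducibility.
    return sorted(candidates, key=lambda c: (-scores[c], c))
```

A GO-aware variant in the spirit of SPGOranker could weight each edge by a semantic similarity between the two proteins and replace the breadth-first search with Dijkstra's algorithm; that extension is not shown here.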
Network Analysis Reveals Putative Genes Affecting Meat Quality in Angus Cattle.

PubMed

Mateescu, Raluca G; Garrick, Dorian J; Reecy, James M

2017-01-01

Improvements in eating satisfaction will benefit consumers and should increase beef demand, which is of interest to the beef industry. Tenderness, juiciness, and flavor are major determinants of the palatability of beef and are often used to reflect eating satisfaction. Carcass qualities are used as indicator traits for meat quality, with higher quality grade carcasses expected to relate to more tender and palatable meat. However, meat quality is a complex concept determined by many component traits, making genome-wide association studies (GWAS) on any one component challenging to interpret. Recent approaches combining traditional GWAS with gene network interactions theory could be more efficient in dissecting the genetic architecture of complex traits. Phenotypic measures of 23 traits reflecting carcass characteristics, components of meat quality, along with mineral and peptide concentrations were used along with Illumina 54k bovine SNP genotypes to derive an annotated gene network associated with meat quality in 2,110 Angus beef cattle. The efficient mixed model association (EMMAX) approach in combination with a genomic relationship matrix was used to directly estimate the associations between 54k SNP genotypes and each of the 23 component traits. Genomic correlated regions were identified by partial correlations which were further used along with an information theory algorithm to derive gene network clusters. Correlated SNP across 23 component traits were subjected to network scoring and visualization software to identify significant SNP. Significant pathways implicated in the meat quality complex through GO term enrichment analysis included angiogenesis, inflammation, transmembrane transporter activity, and receptor activity. 
These results suggest that network analysis using partial correlations and annotation of significant SNP can reveal the genetic architecture of complex traits and provide novel information regarding biological mechanisms and genes that lead to complex phenotypes, like meat quality, and the nutritional and healthfulness value of beef. Improvements in genome annotation and knowledge of gene function will contribute to more comprehensive analyses that will advance our ability to dissect the complex architecture of complex traits.
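Partial correlations, which the study above uses to identify correlated regions, measure the association between two variables after removing the linear effect of all the others. A minimal sketch of the standard computation from the inverse covariance (precision) matrix is shown below; it is a generic illustration, not the authors' pipeline, and the function name is hypothetical.

```python
import numpy as np

def partial_correlations(data):
    """Pairwise partial correlations given all remaining variables.

    data: array of shape (samples, variables).
    Uses pcor_ij = -P_ij / sqrt(P_ii * P_jj), where P is the precision matrix.
    """
    cov = np.cov(data, rowvar=False)
    prec = np.linalg.pinv(cov)  # pseudo-inverse tolerates near-singular covariances
    d = np.sqrt(np.diag(prec))
    pcor = -prec / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor
```

In a chain where x influences y and y influences z, x and z are clearly correlated, but their partial correlation given y is close to zero, which is why partial correlations help separate direct from indirect links in a network.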
High-resolution gene expression data from blastoderm embryos of the scuttle fly Megaselia abdita

PubMed Central

Wotton, Karl R; Jiménez-Guri, Eva; Crombach, Anton; Cicin-Sain, Damjan; Jaeger, Johannes

2015-01-01

Gap genes are involved in segment determination during early development in dipteran insects (flies, midges, and mosquitoes). We carried out a systematic quantitative comparative analysis of the gap gene network across different dipteran species. Our work provides mechanistic insights into the evolution of this pattern-forming network. As a central component of our project, we created a high-resolution quantitative spatio-temporal data set of gap and maternal co-ordinate gene expression in the blastoderm embryo of the non-drosophilid scuttle fly, Megaselia abdita. Our data include expression patterns in both wild-type and RNAi-treated embryos. The data—covering 10 genes, 10 time points, and over 1,000 individual embryos—consist of original embryo images, quantified expression profiles, extracted positions of expression boundaries, and integrated expression patterns, plus metadata and intermediate processing steps. These data provide a valuable resource for researchers interested in the comparative study of gene regulatory networks and pattern formation, an essential step towards a more quantitative and mechanistic understanding of developmental evolution. PMID:25977812
Prediction of C. elegans Longevity Genes by Human and Worm Longevity Networks

PubMed Central

de Magalhães, João Pedro; Ruvkun, Gary; Fraifeld, Vadim E.; Curran, Sean P.

2012-01-01

Intricate and interconnected pathways modulate longevity, but screens to identify the components of these pathways have not been saturating. Because biological processes are often executed by protein complexes and fine-tuned by regulatory factors, the first-order protein-protein interactors of known longevity genes are likely to participate in the regulation of longevity. Data-rich maps of protein interactions have been established for many cardinal organisms such as yeast, worms, and humans. We propose that these interaction maps could be mined for the identification of new putative regulators of longevity. For this purpose, we have constructed longevity networks in both humans and worms. We reasoned that the essential first-order interactors of known longevity-associated genes in these networks are more likely to have longevity phenotypes than randomly chosen genes. We have used C. elegans to determine whether post-developmental inactivation of these essential genes modulates lifespan. Our results suggest that the worm and human longevity networks are functionally relevant and possess a high predictive power for identifying new longevity regulators. PMID:23144747
Engineering of a synthetic quadrastable gene network to approach Waddington landscape and cell fate determination.

PubMed

Wu, Fuqing; Su, Ri-Qi; Lai, Ying-Cheng; Wang, Xiao

2017-04-11

The process of cell fate determination has been depicted intuitively as cells travelling and resting on a rugged landscape, which has been probed by various theoretical studies. However, few studies have experimentally demonstrated how underlying gene regulatory networks shape the landscape and hence orchestrate cellular decision-making in the presence of both signal and noise. Here we tested different topologies and verified a synthetic gene circuit with mutual inhibition and auto-activations to be quadrastable, which enables direct study of quadruple cell fate determination on an engineered landscape. We show that cells indeed gravitate towards local minima and signal inductions dictate cell fates through modulating the shape of the multistable landscape. Experiments, guided by model predictions, reveal that sequential inductions generate distinct cell fates by changing landscape in sequence and hence navigating cells to different final states. This work provides a synthetic biology framework to approach cell fate determination and suggests a landscape-based explanation of fixed induction sequences for targeted differentiation.
Inferring nonlinear gene regulatory networks from gene expression data based on distance correlation.

PubMed

Guo, Xiaobo; Zhang, Ye; Hu, Wenhao; Tan, Haizhu; Wang, Xueqin

2014-01-01

Nonlinear dependence is common in the regulatory mechanisms of gene regulatory networks (GRNs). It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measure called the distance correlation (DC) has been shown to be powerful and computationally effective for detecting nonlinear dependence in many situations. In this work, we incorporate the DC into inferring GRNs from gene expression data without any underlying distribution assumptions. We propose three DC-based GRN inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI)-based algorithms by analyzing two simulated data sets (benchmark GRNs from the DREAM challenge and GRNs generated by the SynTReN network generator) and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operating characteristic (ROC) curve and the precision-recall (PR) curve, our proposed algorithms significantly outperform the MI-based algorithms in GRN inference.
Inferring Nonlinear Gene Regulatory Networks from Gene Expression Data Based on Distance Correlation

PubMed Central

Guo, Xiaobo; Zhang, Ye; Hu, Wenhao; Tan, Haizhu; Wang, Xueqin

2014-01-01

Nonlinear dependence is common in the regulatory mechanisms of gene regulatory networks (GRNs). It is vital to properly measure or test nonlinear dependence from real data for reconstructing GRNs and understanding the complex regulatory mechanisms within the cellular system. A recently developed measure called the distance correlation (DC) has been shown to be powerful and computationally effective for detecting nonlinear dependence in many situations. In this work, we incorporate the DC into inferring GRNs from gene expression data without any underlying distribution assumptions. We propose three DC-based GRN inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI)-based algorithms by analyzing two simulated data sets (benchmark GRNs from the DREAM challenge and GRNs generated by the SynTReN network generator) and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operating characteristic (ROC) curve and the precision-recall (PR) curve, our proposed algorithms significantly outperform the MI-based algorithms in GRN inference. PMID:24551058
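To make the distance correlation concrete, here is a minimal NumPy sketch of the sample statistic based on double-centered pairwise distance matrices. It illustrates the measure only; it is not the CLR-DC, MRNET-DC or REL-DC algorithms, and the function name is an assumption.

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation between two samples of equal length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x.reshape(x.shape[0], -1)
    y = y.reshape(y.shape[0], -1)

    def centered_distances(v):
        # Pairwise Euclidean distances, then double centering.
        d = np.linalg.norm(v[:, None, :] - v[None, :, :], axis=-1)
        return (d - d.mean(axis=0, keepdims=True)
                  - d.mean(axis=1, keepdims=True) + d.mean())

    A = centered_distances(x)
    B = centered_distances(y)
    dcov2 = (A * B).mean()       # squared distance covariance
    dvar_x = (A * A).mean()      # squared distance variance of x
    dvar_y = (B * B).mean()      # squared distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))
```

For a symmetric nonlinear relation such as y = x² on an interval centered at zero, the Pearson correlation is essentially zero while the distance correlation stays clearly positive, which is the property that motivates its use for nonlinear regulatory dependence.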
A framework for analyzing the relationship between gene expression and morphological, topological, and dynamical patterns in neuronal networks.

PubMed

de Arruda, Henrique Ferraz; Comin, Cesar Henrique; Miazaki, Mauro; Viana, Matheus Palhares; Costa, Luciano da Fontoura

2015-04-30

A key point in developmental biology is to understand how gene expression influences the morphological and dynamical patterns that are observed in living beings. In this work we propose a methodology capable of addressing this problem that is based on estimating the mutual information and Pearson correlation between the intensity of gene expression and measurements of several morphological properties of the cells. A similar approach is applied in order to identify effects of gene expression over the system dynamics. Neuronal networks were artificially grown over a lattice by considering a reference model used to generate artificial neurons. The input parameters of the artificial neurons were determined according to two distinct patterns of gene expression and the dynamical response was assessed by considering the integrate-and-fire model. As far as single gene dependence is concerned, we found that the interaction between the gene expression and the network topology, as well as between the former and the dynamics response, is strongly affected by the gene expression pattern. In addition, we observed a high correlation between the gene expression and some topological measurements of the neuronal network for particular patterns of gene expression. To our best understanding, there are no similar analyses to compare with. A proper understanding of gene expression influence requires jointly studying the morphology, topology, and dynamics of neurons. The proposed framework represents a first step towards predicting gene expression patterns from morphology and connectivity. Copyright © 2015. Published by Elsevier B.V.
Characterization of Pisrt1/Foxl2 in Ellobius lutescens and exclusion as sex-determining genes.

PubMed

Baumstark, Annette; Hameister, Horst; Hakhverdyan, Mikhayil; Bakloushinskaya, Irina; Just, Walter

2005-04-01

The rodent Ellobius lutescens is an exceptional mammal which determines male sex constitutively without the SRY gene and, therefore, may serve as an animal model for human 46,XX female-to-male sex reversal. It was suggested that other factors of the network of sex-determining genes determine maleness in these animals. However, some sex-determining genes like SOX9 and SF1 have already been excluded by segregation analysis as primary sex-determining factors in E. lutescens. In this work, we have cloned and characterized two genes of the PIS (polled intersex syndrome) gene interval, which were reported as candidates in female-to-male sex reversal in hornless goats recently. The genes Foxl2 and Pisrt1 from that interval were identified in E. lutescens DNA and mapped to Chromosome 8. We have excluded linkage of Foxl2 and Pisrt1 loci with the sex of the animals. Hence, the involvement of this gene region in sex determination may be specific for goats and is not a general mechanism of XX sex reversal or XX male sex determination.
Coherent organization in gene regulation: a study on six networks

NASA Astrophysics Data System (ADS)

Aral, Neşe; Kabakçıoğlu, Alkan

2016-04-01

Structural and dynamical fingerprints of evolutionary optimization in biological networks are still unclear. Here we analyze the dynamics of genetic regulatory networks responsible for the regulation of cell cycle and cell differentiation in three organisms or cell types each, and show that they follow a version of Hebb's rule which we have termed coherence. More precisely, we find that simultaneously expressed genes with a common target are less likely to act antagonistically at the attractors of the regulatory dynamics. We then investigate the dependence of coherence on structural parameters, such as the mean number of inputs per node and the activatory/repressory interaction ratio, as well as on dynamically determined quantities, such as the basin size and the number of expressed genes.
A provisional regulatory gene network for specification of endomesoderm in the sea urchin embryo

NASA Technical Reports Server (NTRS)

Davidson, Eric H.; Rast, Jonathan P.; Oliveri, Paola; Ransick, Andrew; Calestani, Cristina; Yuh, Chiou-Hwa; Minokawa, Takuya; Amore, Gabriele; Hinman, Veronica; Arenas-Mena, Cesar;

2002-01-01

We present the current form of a provisional DNA sequence-based regulatory gene network that explains in outline how endomesodermal specification in the sea urchin embryo is controlled. The model of the network is in a continuous process of revision and growth as new genes are added and new experimental results become available; see http://www.its.caltech.edu/mirsky/endomeso.htm (End-mes Gene Network Update) for the latest version. The network contains over 40 genes at present, many newly uncovered in the course of this work, and most encoding DNA-binding transcriptional regulatory factors. The architecture of the network was approached initially by construction of a logic model that integrated the extensive experimental evidence now available on endomesoderm specification. The internal linkages between genes in the network have been determined functionally, by measurement of the effects of regulatory perturbations on the expression of all relevant genes in the network. Five kinds of perturbation have been applied: (1) use of morpholino antisense oligonucleotides targeted to many of the key regulatory genes in the network; (2) transformation of other regulatory factors into dominant repressors by construction of Engrailed repressor domain fusions; (3) ectopic expression of given regulatory factors, from genetic expression constructs and from injected mRNAs; (4) blockade of the beta-catenin/Tcf pathway by introduction of mRNA encoding the intracellular domain of cadherin; and (5) blockade of the Notch signaling pathway by introduction of mRNA encoding the extracellular domain of the Notch receptor. The network model predicts the cis-regulatory inputs that link each gene into the network. Therefore, its architecture is testable by cis-regulatory analysis. Strongylocentrotus purpuratus and Lytechinus variegatus genomic BAC recombinants that include a large number of the genes in the network have been sequenced and annotated. 
Tests of the cis-regulatory predictions of the model are greatly facilitated by interspecific computational sequence comparison, which affords a rapid identification of likely cis-regulatory elements in advance of experimental analysis. The network specifies genomically encoded regulatory processes between early cleavage and gastrula stages. These control the specification of the micromere lineage and of the initial veg(2) endomesodermal domain; the blastula-stage separation of the central veg(2) mesodermal domain (i.e., the secondary mesenchyme progenitor field) from the peripheral veg(2) endodermal domain; the stabilization of specification state within these domains; and activation of some downstream differentiation genes. Each of the temporal-spatial phases of specification is represented in a subelement of the network model, that treats regulatory events within the relevant embryonic nuclei at particular stages. (c) 2002 Elsevier Science (USA).

Pleiotropic and Epistatic Network-Based Discovery: Integrated Networks for Target Gene Discovery

DOE Office of Scientific and Technical Information (OSTI.GOV)

Weighill, Deborah; Jones, Piet; Shah, Manesh

Biological organisms are complex systems that are composed of functional networks of interacting molecules and macro-molecules. Complex phenotypes are the result of orchestrated, hierarchical, heterogeneous collections of expressed genomic variants. However, the effects of these variants are the result of historic selective pressure and current environmental and epigenetic signals, and, as such, their co-occurrence can be seen as genome-wide correlations in a number of different manners. Biomass recalcitrance (i.e., the resistance of plants to degradation or deconstruction, which ultimately enables access to a plant's sugars) is a complex polygenic phenotype of high importance to biofuels initiatives. This study makes use of data derived from the re-sequenced genomes from over 800 different Populus trichocarpa genotypes in combination with metabolomic and pyMBMS data across this population, as well as co-expression and co-methylation networks in order to better understand the molecular interactions involved in recalcitrance, and identify target genes involved in lignin biosynthesis/degradation. A Lines Of Evidence (LOE) scoring system is developed to integrate the information in the different layers and quantify the number of lines of evidence linking genes to target functions. This new scoring system was applied to quantify the lines of evidence linking genes to lignin-related genes and phenotypes across the network layers, and allowed for the generation of new hypotheses surrounding potential new candidate genes involved in lignin biosynthesis in P. trichocarpa, including various AGAMOUS-LIKE genes. 
Lastly, the resulting Genome Wide Association Study networks, integrated with Single Nucleotide Polymorphism (SNP) correlation, co-methylation, and co-expression networks through the LOE scores are proving to be a powerful approach to determine the pleiotropic and epistatic relationships underlying cellular functions and, as such, the molecular basis for complex phenotypes, such as recalcitrance.
Pleiotropic and Epistatic Network-Based Discovery: Integrated Networks for Target Gene Discovery

DOE PAGES

Weighill, Deborah; Jones, Piet; Shah, Manesh; ...

2018-05-11

Biological organisms are complex systems that are composed of functional networks of interacting molecules and macro-molecules. Complex phenotypes are the result of orchestrated, hierarchical, heterogeneous collections of expressed genomic variants. However, the effects of these variants are the result of historic selective pressure and current environmental and epigenetic signals, and, as such, their co-occurrence can be seen as genome-wide correlations in a number of different manners. Biomass recalcitrance (i.e., the resistance of plants to degradation or deconstruction, which ultimately enables access to a plant's sugars) is a complex polygenic phenotype of high importance to biofuels initiatives. This study makes use of data derived from the re-sequenced genomes from over 800 different Populus trichocarpa genotypes in combination with metabolomic and pyMBMS data across this population, as well as co-expression and co-methylation networks in order to better understand the molecular interactions involved in recalcitrance, and identify target genes involved in lignin biosynthesis/degradation. A Lines Of Evidence (LOE) scoring system is developed to integrate the information in the different layers and quantify the number of lines of evidence linking genes to target functions. This new scoring system was applied to quantify the lines of evidence linking genes to lignin-related genes and phenotypes across the network layers, and allowed for the generation of new hypotheses surrounding potential new candidate genes involved in lignin biosynthesis in P. trichocarpa, including various AGAMOUS-LIKE genes. 
Lastly, the resulting Genome Wide Association Study networks, integrated with Single Nucleotide Polymorphism (SNP) correlation, co-methylation, and co-expression networks through the LOE scores are proving to be a powerful approach to determine the pleiotropic and epistatic relationships underlying cellular functions and, as such, the molecular basis for complex phenotypes, such as recalcitrance.
Introduction to focus issue: quantitative approaches to genetic networks.

PubMed

Albert, Réka; Collins, James J; Glass, Leon

2013-06-01

All cells of living organisms contain similar genetic instructions encoded in the organism's DNA. In any particular cell, the control of the expression of each different gene is regulated, in part, by binding of molecular complexes to specific regions of the DNA. The molecular complexes are composed of protein molecules, called transcription factors, combined with various other molecules such as hormones and drugs. Since transcription factors are coded by genes, cellular function is partially determined by genetic networks. Recent research is making large strides to understand both the structure and the function of these networks. Further, the emerging discipline of synthetic biology is engineering novel gene circuits with specific dynamic properties to advance both basic science and potential practical applications. Although there is not yet a universally accepted mathematical framework for studying the properties of genetic networks, the strong analogies between the activation and inhibition of gene expression and electric circuits suggest frameworks based on logical switching circuits. This focus issue provides a selection of papers reflecting current research directions in the quantitative analysis of genetic networks. The work extends from molecular models for the binding of proteins, to realistic detailed models of cellular metabolism. Between these extremes are simplified models in which genetic dynamics are modeled using classical methods of systems engineering, Boolean switching networks, differential equations that are continuous analogues of Boolean switching networks, and differential equations in which control is based on power law functions. 
The mathematical techniques are applied to study: (i) naturally occurring gene networks in living organisms including: cyanobacteria, Mycoplasma genitalium, fruit flies, immune cells in mammals; (ii) synthetic gene circuits in Escherichia coli and yeast; and (iii) electronic circuits modeling genetic networks using field-programmable gate arrays. Mathematical analyses will be essential for understanding naturally occurring genetic networks in diverse organisms and for providing a foundation for the improved development of synthetic genetic networks.
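The Boolean switching networks mentioned above can be illustrated with a minimal sketch. The two-gene mutual-repression circuit, the synchronous update rule, and the function names `step` and `fixed_points` are assumptions chosen for illustration; they are not taken from any paper in the focus issue.

```python
from itertools import product

def step(state):
    """Synchronous update of a hypothetical two-gene toggle switch.

    Each gene is ON (1) at the next step only if its repressor is OFF (0) now.
    """
    a, b = state
    return (1 - b, 1 - a)

def fixed_points(update, n_genes):
    """Enumerate all 2**n_genes Boolean states and keep those left unchanged by the update."""
    return [s for s in product((0, 1), repeat=n_genes) if update(s) == s]

# The two stable expression states of the toggle: exactly one gene ON.
FIXED = fixed_points(step, 2)
```

Under this synchronous rule the remaining two states, (0, 0) and (1, 1), map onto each other and form a 2-cycle rather than a fixed point; an asynchronous update scheme would behave differently.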
Introduction to Focus Issue: Quantitative Approaches to Genetic Networks

NASA Astrophysics Data System (ADS)

Albert, Réka; Collins, James J.; Glass, Leon

2013-06-01

All cells of living organisms contain similar genetic instructions encoded in the organism's DNA. In any particular cell, the control of the expression of each different gene is regulated, in part, by binding of molecular complexes to specific regions of the DNA. The molecular complexes are composed of protein molecules, called transcription factors, combined with various other molecules such as hormones and drugs. Since transcription factors are coded by genes, cellular function is partially determined by genetic networks. Recent research is making large strides to understand both the structure and the function of these networks. Further, the emerging discipline of synthetic biology is engineering novel gene circuits with specific dynamic properties to advance both basic science and potential practical applications. Although there is not yet a universally accepted mathematical framework for studying the properties of genetic networks, the strong analogies between the activation and inhibition of gene expression and electric circuits suggest frameworks based on logical switching circuits. This focus issue provides a selection of papers reflecting current research directions in the quantitative analysis of genetic networks. The work extends from molecular models for the binding of proteins, to realistic detailed models of cellular metabolism. Between these extremes are simplified models in which genetic dynamics are modeled using classical methods of systems engineering, Boolean switching networks, differential equations that are continuous analogues of Boolean switching networks, and differential equations in which control is based on power law functions. 
The mathematical techniques are applied to study: (i) naturally occurring gene networks in living organisms including: cyanobacteria, Mycoplasma genitalium, fruit flies, immune cells in mammals; (ii) synthetic gene circuits in Escherichia coli and yeast; and (iii) electronic circuits modeling genetic networks using field-programmable gate arrays. Mathematical analyses will be essential for understanding naturally occurring genetic networks in diverse organisms and for providing a foundation for the improved development of synthetic genetic networks.
Integrated metagenomic analysis of the rumen microbiome of cattle reveals key biological mechanisms associated with methane traits.

PubMed

Wang, Haiying; Zheng, Huiru; Browne, Fiona; Roehe, Rainer; Dewhurst, Richard J; Engel, Felix; Hemmje, Matthias; Lu, Xiangwu; Walsh, Paul

2017-07-15

Methane is one of the major contributors to global warming. The rumen microbiota is directly involved in methane production in cattle. The link between variation in rumen microbial communities and host genetics has important applications and implications in bioscience. Having the potential to reveal the full extent of microbial gene diversity and complex microbial interactions, integrated metagenomics and network analysis holds great promise in this endeavour. This study investigates the rumen microbial community in cattle through the integration of metagenomic and network-based approaches. Based on the relative abundance of 1570 microbial genes identified in a metagenomics analysis, the co-abundance network was constructed and functional modules of microbial genes were identified. One of the main contributions is a random matrix theory-based approach for automatically determining the correlation threshold used to construct the co-abundance network. The resulting network, consisting of 549 microbial genes and 3349 connections, exhibits a clear modular structure with certain trait-specific genes highly over-represented in modules. More specifically, all 20 genes previously identified as associated with methane emissions are found in a single module (hypergeometric test, p < 10^-11). One third of genes are involved in methane metabolism pathways. Further examination of the abundance profiles of genes across 8 samples highlights that the revealed pattern of metagenomic abundance has a strong association with methane emissions. Furthermore, the module is significantly enriched with microbial genes encoding enzymes that are directly involved in methanogenesis (hypergeometric test, p < 10^-9). Copyright © 2017 Elsevier Inc. All rights reserved.
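The hypergeometric enrichment test cited in this abstract can be sketched as follows. This is a minimal illustration of the upper-tail calculation only; the function name `hypergeom_sf` and the small numbers in the example are assumptions, and the abstract does not report the module size or background set used by the authors.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """Upper-tail probability P(X >= k) of the hypergeometric distribution.

    N: genes in the background (e.g., the whole network)
    K: background genes carrying the annotation of interest
    n: genes in the module being tested
    k: annotated genes observed in the module
    """
    upper = min(K, n)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total

# Toy example (made-up numbers): 10 genes, 3 annotated, a module of 3 genes
# that contains all 3 annotated genes.
P_TOY = hypergeom_sf(3, 10, 3, 3)
```

A small p-value says that finding this many annotated genes in the module by chance is unlikely given the background; it does not by itself establish a causal link to the trait.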
Quantifying the underlying landscape and paths of cancer

PubMed Central

Li, Chunhe; Wang, Jin

2014-01-01

Cancer is a disease regulated by the underlying gene networks. The emergence of normal and cancer states as well as the transformation between them can be thought of as a result of the gene network interactions and associated changes. We developed a global potential landscape and path framework to quantify cancer and associated processes. We constructed a cancer gene regulatory network based on experimental evidence and uncovered the underlying landscape. The resulting tristable landscape characterizes important biological states: normal, cancer and apoptosis. The landscape topography in terms of barrier heights between stable state attractors quantifies the global stability of the cancer network system. We propose two mechanisms of cancerization: one is by the changes of landscape topography through the changes in regulation strengths of the gene networks. The other is by the fluctuations that help the system to go over the critical barrier at fixed landscape topography. The kinetic paths from the least action principle quantify the transition processes among the normal, cancer and apoptosis states. The kinetic rates provide the quantification of transition speeds among normal, cancer and apoptosis attractors. By the global sensitivity analysis of the gene network parameters on the landscape topography, we uncovered some key gene regulations determining the transitions between cancer and normal states. This can be used to guide the design of new anti-cancer tactics, through a cocktail strategy of targeting multiple key regulation links simultaneously, for preventing cancer occurrence or transforming the early cancer state back to the normal state. PMID:25232051
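The idea of reading stability from barrier heights can be illustrated with a one-dimensional toy. This sketch assumes the common convention U = -ln(P_ss), where P_ss is the steady-state probability, and a Boltzmann-like distribution over a hypothetical double-well potential; the abstract does not state the authors' equations, and their network is high-dimensional and tristable, so none of the names or numbers below come from the study.

```python
from math import exp, log

def landscape(potential, xs, noise):
    """Return U = -ln(P) on a grid, with P proportional to exp(-potential / noise)."""
    weights = [exp(-potential(x) / noise) for x in xs]
    z = sum(weights)
    return [-log(w / z) for w in weights]

def barrier_height(u, from_idx, to_idx):
    """Highest point of U between two grid indices, measured from the starting basin."""
    lo, hi = sorted((from_idx, to_idx))
    return max(u[lo:hi + 1]) - u[from_idx]

def double_well(x):
    """Hypothetical potential with minima at x = -1 and x = +1 and a saddle at x = 0."""
    return x ** 4 / 4 - x ** 2 / 2

XS = [i / 100 for i in range(-200, 201)]   # grid from -2.00 to 2.00
U = landscape(double_well, XS, noise=0.05)
BARRIER = barrier_height(U, 100, 300)       # from the basin at x = -1 to the basin at x = +1
```

Lowering `noise` deepens the basins relative to the saddle, which raises the barrier; changing the shape of the potential instead corresponds loosely to the first mechanism the abstract describes, a change in landscape topography.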
Retinal Determination genes function along with cell-cell signals to regulate Drosophila eye development: examples of multi-layered regulation by Master Regulators

PubMed Central

Baker, Nicholas E.; Firth, Lucy C.

2015-01-01

It is thought that Retinal Determination gene products define the response made to cell-cell signals within the eye developmental field by binding to enhancers of genes that are also regulated by cell-cell signaling pathways. In Drosophila, Retinal Determination genes including Eyeless, teashirt, eyes absent, dachsous and sine oculis, are required for normal eye development and can induce ectopic eyes when mis-expressed. Characterization of the enhancers responsible for eye expression of the hedgehog, shaven, and atonal genes, as well as the dynamics of Retinal Determination gene expression itself, now suggests a multilayered network whereby transcriptional regulation by either Retinal Determination genes or cell-cell signaling pathways can sometimes be indirect and mediated by other transcription factor intermediates. In this updated view of the interaction between extracellular information and cell intrinsic programs during development, regulation of individual genes might sometimes be several steps removed from either the Retinal Determination genes or cell-cell signaling pathways that nevertheless govern their expression. PMID:21607995
Differential Connectivity in Colorectal Cancer Gene Expression Network

PubMed

Izadi, Fereshteh

2018-05-30

Colorectal cancer (CRC) is one of the challenging types of cancer; thus, identifying effective biomarkers related to CRC could lead to significant progress toward the treatment of this disease. In the present study, CRC gene expression datasets were reanalyzed. Differentially expressed genes shared across 294 normal mucosa and adjacent tumoral samples were then used to build two independent transcriptional regulatory networks. By analyzing the networks topologically, genes with differential global connectivity related to cancer state were determined, and their potential transcriptional regulators, including transcription factors, were identified. The majority of differentially connected genes (DCGs) were up-regulated in colorectal transcriptome experiments. Moreover, a number of these genes have been experimentally validated as cancer or CRC-associated genes. The DCGs, including GART, TGFB1, ITGA2, SLC16A5, SOX9, and MMP7, were investigated across 12 cancer types. Functional enrichment analysis followed by detailed data mining indicated that these candidate genes could be related to CRC by mediating the metastatic cascade, in addition to sharing pathways with 12 cancer types by triggering inflammatory events. Our study uncovered correlated alterations in gene expression related to CRC susceptibility and progression, and the candidate biomarkers identified could provide a link to the disease.
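One simple way to think about differential connectivity is to compare each gene's degree in the two networks. The sketch below assumes that definition (tumor degree minus normal degree); the abstract does not specify the authors' connectivity measure, and the gene labels and edges are placeholders, not data from the study.

```python
from collections import Counter

def degrees(edges):
    """Count how many edges touch each gene in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def differential_connectivity(normal_edges, tumor_edges):
    """Return {gene: tumor degree - normal degree} over all genes seen in either network."""
    d_normal, d_tumor = degrees(normal_edges), degrees(tumor_edges)
    genes = set(d_normal) | set(d_tumor)
    # Counter returns 0 for genes absent from one network.
    return {g: d_tumor[g] - d_normal[g] for g in genes}

# Placeholder networks with hypothetical gene labels.
NORMAL = [("geneA", "geneB"), ("geneC", "geneD")]
TUMOR = [("geneA", "geneB"), ("geneA", "geneD"), ("geneA", "geneE")]
DIFF = differential_connectivity(NORMAL, TUMOR)
```

Genes with a large positive or negative value gain or lose connections between states; in practice one would also want a significance estimate, for example by permuting sample labels, which this sketch omits.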

Networking of differentially expressed genes in human cancer cells resistant to methotrexate

PubMed Central

2009-01-01

Background: The need for an integrated view of data obtained from high-throughput technologies gave rise to network analyses. These are especially useful to rationalize how external perturbations propagate through the expression of genes. To address this issue in the case of drug resistance, we constructed biological association networks of genes differentially expressed in cell lines resistant to methotrexate (MTX). Methods: Seven cell lines representative of different types of cancer, including colon cancer (HT29 and Caco2), breast cancer (MCF-7 and MDA-MB-468), pancreatic cancer (MIA PaCa-2), erythroblastic leukemia (K562) and osteosarcoma (Saos-2), were used. The differential expression pattern between sensitive and MTX-resistant cells was determined by whole human genome microarrays and analyzed with the GeneSpring GX software package. Genes deregulated in common between the different cancer cell lines served to generate biological association networks using the Pathway Architect software. Results: Dickkopf homolog-1 (DKK1) is a highly interconnected node in the network generated with genes in common between the two colon cancer cell lines, and functional validations of this target using small interfering RNAs (siRNAs) showed a chemosensitization toward MTX. Members of the UDP-glucuronosyltransferase 1A (UGT1A) family formed a network of genes differentially expressed in the two breast cancer cell lines. siRNA treatment against UGT1A also showed an increase in MTX sensitivity. Eukaryotic translation elongation factor 1 alpha 1 (EEF1A1) was overexpressed among the pancreatic cancer, leukemia and osteosarcoma cell lines, and siRNA treatment against EEF1A1 produced a chemosensitization toward MTX. Conclusions: Biological association networks identified DKK1, UGT1As and EEF1A1 as important gene nodes in MTX-resistance. Treatments using siRNA technology against these three genes showed chemosensitization toward MTX. PMID:19732436
Sequence-based Network Completion Reveals the Integrality of Missing Reactions in Metabolic Networks

PubMed Central

Krumholz, Elias W.; Libourel, Igor G. L.

2015-01-01

Genome-scale metabolic models are central in connecting genotypes to metabolic phenotypes. However, even for well studied organisms, such as Escherichia coli, draft networks do not contain a complete biochemical network. Missing reactions are referred to as gaps. These gaps need to be filled to enable functional analysis, and gap-filling choices influence model predictions. To investigate whether functional networks existed where all gap-filling reactions were supported by sequence similarity to annotated enzymes, four draft networks were supplemented with all reactions from the Model SEED database for which minimal sequence similarity was found in their genomes. Quadratic programming revealed that the number of reactions that could partake in a gap-filling solution was vast: 3,270 in the case of E. coli, where 72% of the metabolites in the draft network could connect a gap-filling solution. Nonetheless, no network could be completed without the inclusion of orphaned enzymes, suggesting that parts of the biochemistry integral to biomass precursor formation are uncharacterized. However, many gap-filling reactions were well determined, and the resulting networks showed improved prediction of gene essentiality compared with networks generated through canonical gap filling. In addition, gene essentiality predictions that were sensitive to poorly determined gap-filling reactions were of poor quality, suggesting that damage to the network structure resulting from the inclusion of erroneous gap-filling reactions may be predictable. PMID:26041773
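To make the notion of a gap concrete, the sketch below uses simple network expansion: starting from seed metabolites, it repeatedly fires any reaction whose substrates are all available, then reports targets that remain unreachable. This is a swapped-in, simplified technique for illustration and not the quadratic programming formulation used in the study; the reaction names, metabolite labels, and function names are hypothetical.

```python
def producible(reactions, seeds):
    """Return every metabolite reachable from the seeds.

    reactions: dict mapping a reaction name to (substrates, products), both sets.
    """
    have = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            # Fire a reaction only when all substrates are present and it adds something new.
            if substrates <= have and not products <= have:
                have |= products
                changed = True
    return have

def gaps(reactions, seeds, targets):
    """Targets (e.g., biomass precursors) that the network cannot produce from the seeds."""
    return set(targets) - producible(reactions, seeds)

# Hypothetical draft network missing the step from m2 to m3.
DRAFT = {"r1": ({"m1"}, {"m2"}), "r3": ({"m3"}, {"m4"})}
CANDIDATE = {"r2": ({"m2"}, {"m3"})}   # a database reaction that could fill the gap

BEFORE = gaps(DRAFT, {"m1"}, {"m4"})
AFTER = gaps({**DRAFT, **CANDIDATE}, {"m1"}, {"m4"})
```

Reachability of this kind ignores stoichiometry and flux balance, so a network that passes this check may still fail to produce biomass in a constraint-based model.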
Gene co-expression network analysis in Rhodobacter capsulatus and application to comparative expression analysis of Rhodobacter sphaeroides

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pena-Castillo, Lourdes; Mercer, Ryan; Gurinovich, Anastasia

2014-08-28

The genus Rhodobacter contains purple nonsulfur bacteria found mostly in freshwater environments. Representative strains of two Rhodobacter species, R. capsulatus and R. sphaeroides, have had their genomes fully sequenced and both have been the subject of transcriptional profiling studies. Gene co-expression networks can be used to identify modules of genes with similar expression profiles. Functional analysis of gene modules can then associate co-expressed genes with biological pathways, and network statistics can determine the degree of module preservation in related networks. In this paper, we constructed an R. capsulatus gene co-expression network, performed functional analysis of identified gene modules, and investigated preservation of these modules in R. capsulatus proteomics data and in R. sphaeroides transcriptomics data. Results: The analysis identified 40 gene co-expression modules in R. capsulatus. Investigation of the module gene contents and expression profiles revealed patterns that were validated based on previous studies supporting the biological relevance of these modules. We identified two R. capsulatus gene modules preserved in the protein abundance data. We also identified several gene modules preserved between both Rhodobacter species, which indicate that these cellular processes are conserved between the species and are candidates for functional information transfer between species. Many gene modules were non-preserved, providing insight into processes that differentiate the two species. In addition, using Local Network Similarity (LNS), a recently proposed metric for expression divergence, we assessed the expression conservation of between-species pairs of orthologs, and within-species gene-protein expression profiles. Conclusions: Our analyses provide new sources of information for functional annotation in R. capsulatus because uncharacterized genes in modules are now connected with groups of genes that constitute a joint functional annotation. We identified R. capsulatus modules enriched with genes for ribosomal proteins, porphyrin and bacteriochlorophyll anabolism, and biosynthesis of secondary metabolites to be preserved in R. sphaeroides whereas modules related to RcGTA production and signalling showed lack of preservation in R. sphaeroides. In addition, we demonstrated that network statistics may also be applied within-species to identify congruence between mRNA expression and protein abundance data for which simple correlation measurements have previously had mixed results.
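A gene co-expression network and its modules can be sketched in a few lines. This minimal version assumes absolute Pearson correlation with a hard threshold and takes connected components as modules; the abstract does not describe the authors' construction, which may well differ (for example, soft thresholding with hierarchical clustering). Gene labels, expression values, and the 0.9 cutoff are made up for illustration.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpression_modules(profiles, threshold=0.9):
    """Link genes whose |correlation| >= threshold; return connected components as modules."""
    adj = {g: set() for g in profiles}
    for g, h in combinations(profiles, 2):
        if abs(pearson(profiles[g], profiles[h])) >= threshold:
            adj[g].add(h)
            adj[h].add(g)
    modules, seen = [], set()
    for g in profiles:
        if g in seen:
            continue
        stack, component = [g], set()
        while stack:                       # depth-first search over the thresholded graph
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adj[node] - component)
        seen |= component
        modules.append(component)
    return modules

# Hypothetical expression profiles across four conditions.
PROFILES = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 1.0, 3.0, 2.0],
    "g4": [8.0, 2.0, 6.0, 4.0],
}
MODULES = coexpression_modules(PROFILES)
```

Module preservation between species, as studied in the paper, would then ask whether genes grouped together here remain densely connected in the second organism's network; that step is not shown.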
Evolving gene regulation networks into cellular networks guiding adaptive behavior: an outline how single cells could have evolved into a centralized neurosensory system

PubMed Central

Fritzsch, Bernd; Jahan, Israt; Pan, Ning; Elliott, Karen L.

2014-01-01

Understanding the evolution of the neurosensory system of man, able to reflect on its own origin, is one of the major goals of comparative neurobiology. Details of the origin of neurosensory cells, their aggregation into central nervous systems and associated sensory organs, their localized patterning into remarkably different cell types aggregated into variably sized parts of the central nervous system begin to emerge. Insights at the cellular and molecular level begin to shed some light on the evolution of neurosensory cells, partially covered in this review. Molecular evidence suggests that high mobility group (HMG) proteins of pre-metazoans evolved into the definitive Sox [SRY (sex determining region Y)-box] genes used for neurosensory precursor specification in metazoans. Likewise, pre-metazoan basic helix-loop-helix (bHLH) genes evolved in metazoans into the group A bHLH genes dedicated to neurosensory differentiation in bilaterians. Available evidence suggests that the Sox and bHLH genes evolved a cross-regulatory network able to synchronize expansion of precursor populations and their subsequent differentiation into novel parts of the brain or sensory organs. Molecular evidence suggests metazoans evolved patterning gene networks early, which were not dedicated to neuronal development. Only later in evolution were these patterning gene networks tied into the increasing complexity of diffusible factors, many of which were already present in pre-metazoans, to drive local patterning events. It appears that the evolving molecular basis of neurosensory cell development may have led, in interaction with differentially expressed patterning genes, to local network modifications guiding unique specializations of neurosensory cells into sensory organs and various areas of the central nervous system. PMID:25416504
Evolving gene regulatory networks into cellular networks guiding adaptive behavior: an outline how single cells could have evolved into a centralized neurosensory system.

PubMed

Fritzsch, Bernd; Jahan, Israt; Pan, Ning; Elliott, Karen L

2015-01-01

Understanding the evolution of the neurosensory system of man, able to reflect on its own origin, is one of the major goals of comparative neurobiology. Details of the origin of neurosensory cells, their aggregation into central nervous systems and associated sensory organs and their localized patterning leading to remarkably different cell types aggregated into variably sized parts of the central nervous system have begun to emerge. Insights at the cellular and molecular level have begun to shed some light on the evolution of neurosensory cells, partially covered in this review. Molecular evidence suggests that high mobility group (HMG) proteins of pre-metazoans evolved into the definitive Sox [SRY (sex determining region Y)-box] genes used for neurosensory precursor specification in metazoans. Likewise, pre-metazoan basic helix-loop-helix (bHLH) genes evolved in metazoans into the group A bHLH genes dedicated to neurosensory differentiation in bilaterians. Available evidence suggests that the Sox and bHLH genes evolved a cross-regulatory network able to synchronize expansion of precursor populations and their subsequent differentiation into novel parts of the brain or sensory organs. Molecular evidence suggests metazoans evolved patterning gene networks early, which were not dedicated to neuronal development. Only later in evolution were these patterning gene networks tied into the increasing complexity of diffusible factors, many of which were already present in pre-metazoans, to drive local patterning events. It appears that the evolving molecular basis of neurosensory cell development may have led, in interaction with differentially expressed patterning genes, to local network modifications guiding unique specializations of neurosensory cells into sensory organs and various areas of the central nervous system.
SOX9 Duplication Linked to Intersex in Deer

PubMed Central

Kropatsch, Regina; Dekomien, Gabriele; Akkad, Denis A.; Gerding, Wanda M.; Petrasch-Parwez, Elisabeth; Young, Neil D.; Altmüller, Janine; Nürnberg, Peter; Gasser, Robin B.; Epplen, Jörg T.

2013-01-01

A complex network of genes determines sex in mammals. Here, we studied a European roe deer with an intersex phenotype that was consistent with an XY genotype with incomplete male-determination. Whole genome sequencing and quantitative real-time PCR analyses revealed a triple dose of the SOX9 gene, allowing insights into a new genetic defect in a wild animal. PMID:24040047
Regulatory modules controlling maize inflorescence architecture

USDA-ARS's Scientific Manuscript database

Genetic control of branching is a primary determinant of yield, regulating seed number and harvesting ability, yet little is known about the molecular networks that shape grain-bearing inflorescences of cereal crops. Here, we used the maize (Zea mays) inflorescence to investigate gene networks that...
Network topology and parameter estimation: from experimental design methods to gene regulatory network kinetics using a community based approach

PubMed Central

2014-01-01

Background: Accurate estimation of parameters of biochemical models is required to characterize the dynamics of molecular processes. This problem is intimately linked to identifying the most informative experiments for accomplishing such tasks. While significant progress has been made, effective experimental strategies for parameter identification and for distinguishing among alternative network topologies remain unclear. We approached these questions in an unbiased manner using a unique community-based approach in the context of the DREAM initiative (Dialogue for Reverse Engineering Assessment of Methods). We created an in silico test framework under which participants could probe a network with hidden parameters by requesting a range of experimental assays; results of these experiments were simulated according to a model of network dynamics only partially revealed to participants. Results: We proposed two challenges; in the first, participants were given the topology and underlying biochemical structure of a 9-gene regulatory network and were asked to determine its parameter values. In the second challenge, participants were given an incomplete topology with 11 genes and asked to find three missing links in the model. In both challenges, a budget was provided to buy experimental data generated in silico with the model and mimicking the features of different common experimental techniques, such as microarrays and fluorescence microscopy. Data could be bought at any stage, allowing participants to implement an iterative loop of experiments and computation. Conclusions: A total of 19 teams participated in this competition. The results suggest that the combination of state-of-the-art parameter estimation and a varied set of experimental methods using a few datasets, mostly fluorescence imaging data, can accurately determine parameters of biochemical models of gene regulation. 
However, the task is considerably more difficult if the gene network topology is not completely defined, as in challenge 2. Importantly, we found that aggregating independent parameter predictions and network topology across submissions creates a solution that can be better than the one from the best-performing submission. PMID:24507381
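The closing observation about aggregating submissions can be illustrated with a small sketch. It assumes a per-parameter median as the aggregation rule and a mean squared log-ratio as the error score; the abstract does not say how the organizers aggregated or scored predictions, and the parameter names and values below are contrived so that the effect is visible.

```python
from math import log
from statistics import median

def aggregate(submissions):
    """Combine parameter estimates from several teams by taking the median of each parameter."""
    names = submissions[0].keys()
    return {p: median(s[p] for s in submissions) for p in names}

def log_error(estimate, truth):
    """Mean squared log-ratio between estimated and true (positive) parameter values."""
    return sum(log(estimate[p] / truth[p]) ** 2 for p in truth) / len(truth)

# Hypothetical ground truth and three contrived team submissions.
TRUTH = {"k_prod": 1.0, "k_deg": 0.1}
SUBMISSIONS = [
    {"k_prod": 0.5, "k_deg": 0.1},
    {"k_prod": 1.0, "k_deg": 0.3},
    {"k_prod": 2.0, "k_deg": 0.1},
]

CONSENSUS = aggregate(SUBMISSIONS)
CONSENSUS_ERROR = log_error(CONSENSUS, TRUTH)
BEST_SINGLE_ERROR = min(log_error(s, TRUTH) for s in SUBMISSIONS)
```

In this toy case each team errs on a different parameter, so the median cancels the individual mistakes; when teams share a systematic bias, aggregation offers no such guarantee.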
Interplay of Noisy Gene Expression and Dynamics Explains Patterns of Bacterial Operon Organization

NASA Astrophysics Data System (ADS)

Igoshin, Oleg

2011-03-01

Bacterial chromosomes are organized into operons, sets of genes co-transcribed into polycistronic messenger RNA. Hypotheses explaining the emergence and maintenance of operons include proportional co-regulation, horizontal transfer of intact "selfish" operons, emergence via gene duplication, and co-production of physically interacting proteins to speed their association. We hypothesized an alternative: operons can reduce or increase intrinsic gene expression noise in a manner dependent on the post-translational interactions, thereby resulting in selection for or against operons depending on the network architecture. We devised five classes of two-gene network modules and show that the effects of operons on intrinsic noise depend on class membership. Two classes exhibit decreased noise with co-transcription, two others reveal increased noise, and the remaining one does not show a significant difference. To test our modeling predictions we employed bioinformatic analysis to determine the relationship between gene expression noise and operon organization. The results confirm the overrepresentation of noise-minimizing operon architectures and provide evidence against other hypotheses. Our results thereby suggest a central role for gene expression noise in selecting for or maintaining operons in bacterial chromosomes. This demonstrates how post-translational network dynamics may provide selective pressure for organizing bacterial chromosomes, and has practical consequences for designing synthetic gene networks. This work is supported by National Institutes of Health grant 1R01GM096189-01.
Gene Circuit Analysis of the Terminal Gap Gene huckebein

PubMed Central

Ashyraliyev, Maksat; Siggens, Ken; Janssens, Hilde; Blom, Joke; Akam, Michael; Jaeger, Johannes

2009-01-01

The early embryo of Drosophila melanogaster provides a powerful model system to study the role of genes in pattern formation. The gap gene network constitutes the first zygotic regulatory tier in the hierarchy of the segmentation genes involved in specifying the position of body segments. Here, we use an integrative, systems-level approach to investigate the regulatory effect of the terminal gap gene huckebein (hkb) on gap gene expression. We present quantitative expression data for the Hkb protein, which enable us to include hkb in gap gene circuit models. Gap gene circuits are mathematical models of gene networks used as computational tools to extract regulatory information from spatial expression data. This is achieved by fitting the model to gap gene expression patterns, in order to obtain estimates for regulatory parameters which predict a specific network topology. We show how considering variability in the data combined with analysis of parameter determinability significantly improves the biological relevance and consistency of the approach. Our models are in agreement with earlier results, which they extend in two important respects: First, we show that Hkb is involved in the regulation of the posterior hunchback (hb) domain, but does not have any other essential function. Specifically, Hkb is required for the anterior shift in the posterior border of this domain, which is now reproduced correctly in our models. Second, gap gene circuits presented here are able to reproduce mutants of terminal gap genes, while previously published models were unable to reproduce any null mutants correctly. As a consequence, our models now capture the expression dynamics of all posterior gap genes and some variational properties of the system correctly. This is an important step towards a better, quantitative understanding of the developmental and evolutionary dynamics of the gap gene network. PMID:19876378
Quantitative trait loci mapping and gene network analysis implicate protocadherin-15 as a determinant of brain serotonin transporter expression.

PubMed

Ye, R; Carneiro, A M D; Han, Q; Airey, D; Sanders-Bush, E; Zhang, B; Lu, L; Williams, R; Blakely, R D

2014-03-01

Presynaptic serotonin (5-hydroxytryptamine, 5-HT) transporters (SERT) regulate 5-HT signaling via antidepressant-sensitive clearance of released neurotransmitter. Polymorphisms in the human SERT gene (SLC6A4) have been linked to risk for multiple neuropsychiatric disorders, including depression, obsessive-compulsive disorder and autism. Using BXD recombinant inbred mice, a genetic reference population that can support the discovery of novel determinants of complex traits, merging collective trait assessments with bioinformatics approaches, we examine phenotypic and molecular networks associated with SERT gene and protein expression. Correlational analyses revealed a network of genes that significantly associated with SERT mRNA levels. We quantified SERT protein expression levels and identified region- and gender-specific quantitative trait loci (QTLs). One of these, associated with male midbrain SERT protein expression and centered on the protocadherin-15 gene (Pcdh15), overlapped with a QTL for midbrain 5-HT levels. Pcdh15 was also the only QTL-associated gene whose midbrain mRNA expression significantly associated with both SERT protein and 5-HT traits, suggesting an unrecognized role of the cell adhesion protein in the development or function of 5-HT neurons. To test this hypothesis, we assessed SERT protein and 5-HT traits in the Pcdh15 functional null line (Pcdh15(av-3J)), studies that revealed a strong, negative influence of Pcdh15 on these phenotypes. Together, our findings illustrate the power of multidimensional profiling of recombinant inbred lines in the analysis of molecular networks that support synaptic signaling, and that, as in the case of Pcdh15, can reveal novel relationships that may underlie risk for mental illness. © 2014 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.
Dimorphic DNA methylation during temperature-dependent sex determination in the sea turtle Lepidochelys olivacea.

PubMed

Venegas, Daniela; Marmolejo-Valencia, Alejandro; Valdes-Quezada, Christian; Govenzensky, Tzipe; Recillas-Targa, Félix; Merchant-Larios, Horacio

2016-09-15

Sex determination in vertebrates depends on the expression of a conserved network of genes. Sea turtles such as Lepidochelys olivacea have temperature-dependent sex determination. The present work analyses some of the epigenetic processes involved in this. We describe sexual dimorphism in global DNA methylation patterns between ovaries and testes of L. olivacea and show that the differences may arise from a combination of DNA methylation and demethylation events that occur during sex determination. Irrespective of incubation temperature, 5-hydroxymethylcytosine was abundant in the bipotential gonad; however, following sex determination, this modification was no longer found in pre-Sertoli cells in the testes. These changes correlate with the establishment of the sexually dimorphic DNA methylation patterns, downregulation of Sox9 gene expression in ovaries and irreversible gonadal commitment towards a male or female differentiation pathway. Thus, DNA methylation changes may be necessary for the stabilization of the gene expression networks that drive the differentiation of the bipotential gonad to form either an ovary or a testis in L. olivacea and probably in other species that manifest temperature-dependent sex determination. Copyright © 2016 Elsevier Inc. All rights reserved.
Transcriptional Network Analysis Identifies BACH1 as a Master Regulator of Breast Cancer Bone Metastasis

PubMed Central

Liang, Yajun; Wu, Heng; Lei, Rong; Chong, Robert A.; Wei, Yong; Lu, Xin; Tagkopoulos, Ilias; Kung, Sun-Yuan; Yang, Qifeng; Hu, Guohong; Kang, Yibin

2012-01-01

The application of functional genomic analysis of breast cancer metastasis has led to the identification of a growing number of organ-specific metastasis genes, which often function in concert to facilitate different steps of the metastatic cascade. However, the gene regulatory network that controls the expression of these metastasis genes remains largely unknown. Here, we demonstrate a computational approach for the deconvolution of transcriptional networks to discover master regulators of breast cancer bone metastasis. Several known regulators of breast cancer bone metastasis such as Smad4 and HIF1 were identified in our analysis. Experimental validation of the networks revealed BACH1, a basic leucine zipper transcription factor, as the common regulator of several functional metastasis genes, including MMP1 and CXCR4. Ectopic expression of BACH1 enhanced the malignancy of breast cancer cells, and conversely, BACH1 knockdown significantly reduced bone metastasis. The expression of BACH1 and its target genes was linked to a higher risk of breast cancer recurrence in patients. This study established BACH1 as the master regulator of breast cancer bone metastasis and provided a paradigm to identify molecular determinants in complex pathological processes. PMID:22875853
A negative genetic interaction map in isogenic cancer cell lines reveals cancer cell vulnerabilities

PubMed Central

Vizeacoumar, Franco J; Arnold, Roland; Vizeacoumar, Frederick S; Chandrashekhar, Megha; Buzina, Alla; Young, Jordan T F; Kwan, Julian H M; Sayad, Azin; Mero, Patricia; Lawo, Steffen; Tanaka, Hiromasa; Brown, Kevin R; Baryshnikova, Anastasia; Mak, Anthony B; Fedyshyn, Yaroslav; Wang, Yadong; Brito, Glauber C; Kasimer, Dahlia; Makhnevych, Taras; Ketela, Troy; Datti, Alessandro; Babu, Mohan; Emili, Andrew; Pelletier, Laurence; Wrana, Jeff; Wainberg, Zev; Kim, Philip M; Rottapel, Robert; O'Brien, Catherine A; Andrews, Brenda; Boone, Charles; Moffat, Jason

2013-01-01

Improved efforts are necessary to define the functional product of cancer mutations currently being revealed through large-scale sequencing efforts. Using genome-scale pooled shRNA screening technology, we mapped negative genetic interactions across a set of isogenic cancer cell lines and confirmed hundreds of these interactions in orthogonal co-culture competition assays to generate a high-confidence genetic interaction network of differentially essential or differential essentiality (DiE) genes. The network uncovered examples of conserved genetic interactions, densely connected functional modules derived from comparative genomics with model systems data, functions for uncharacterized genes in the human genome and targetable vulnerabilities. Finally, we demonstrate a general applicability of DiE gene signatures in determining genetic dependencies of other non-isogenic cancer cell lines. For example, the PTEN−/− DiE genes reveal a signature that can preferentially classify PTEN-dependent genotypes across a series of non-isogenic cell lines derived from the breast, pancreas and ovarian cancers. Our reference network suggests that many cancer vulnerabilities remain to be discovered through systematic derivation of a network of differentially essential genes in an isogenic cancer cell model. PMID:24104479
Reconstructing the regulatory circuit of cell fate determination in yeast mating response.

PubMed

Shao, Bin; Yuan, Haiyu; Zhang, Rongfei; Wang, Xuan; Zhang, Shuwen; Ouyang, Qi; Hao, Nan; Luo, Chunxiong

2017-07-01

Massive technological advances enabled high-throughput measurements of proteomic changes in biological processes. However, retrieving biological insights from large-scale protein dynamics data remains a challenging task. Here we used the mating differentiation in yeast Saccharomyces cerevisiae as a model and developed integrated experimental and computational approaches to analyze the proteomic dynamics during the process of cell fate determination. When exposed to a high dose of mating pheromone, the yeast cell undergoes growth arrest and forms a shmoo-like morphology; however, at intermediate doses, chemotropic elongated growth is initialized. To understand the gene regulatory networks that control this differentiation switch, we employed a high-throughput microfluidic imaging system that allows real-time and simultaneous measurements of cell growth and protein expression. Using kinetic modeling of protein dynamics, we classified the stimulus-dependent changes in protein abundance into two sources: global changes due to physiological alterations and gene-specific changes. A quantitative framework was proposed to decouple gene-specific regulatory modes from the growth-dependent global modulation of protein abundance. Based on the temporal patterns of gene-specific regulation, we established the network architectures underlying distinct cell fates using a reverse engineering method and uncovered the dose-dependent rewiring of gene regulatory network during mating differentiation. Furthermore, our results suggested a potential crosstalk between the pheromone response pathway and the target of rapamycin (TOR)-regulated ribosomal biogenesis pathway, which might underlie a cell differentiation switch in yeast mating response. In summary, our modeling approach addresses the distinct impacts of the global and gene-specific regulation on the control of protein dynamics and provides new insights into the mechanisms of cell fate determination. 
We anticipate that our integrated experimental and modeling strategies could be widely applicable to other biological systems.
From Coexpression to Coregulation: An Approach to Inferring Transcriptional Regulation Among Gene Classes from Large-Scale Expression Data

NASA Technical Reports Server (NTRS)

Mjolsness, Eric; Castano, Rebecca; Mann, Tobias; Wold, Barbara

2000-01-01

We provide preliminary evidence that existing algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (1) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continuous-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determining transcriptional regulatory relationships such as coregulation.
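Steps (2) and (3) above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the model form tau * dv/dt = -v + sigmoid(W v + h), the Euler integrator, the linear cooling schedule, the proposal width, and all function names (`simulate`, `cost`, `anneal`) are assumptions chosen for brevity.

```python
import math
import random


def sigmoid(x):
    # Clamp the argument so math.exp cannot overflow.
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def simulate(weights, bias, tau, v0, dt, steps):
    """Euler-integrate tau * dv_i/dt = -v_i + sigmoid(sum_j W_ij v_j + h_i)."""
    n = len(v0)
    v = list(v0)
    traj = [list(v)]
    for _ in range(steps):
        u = [sum(weights[i][j] * v[j] for j in range(n)) + bias[i] for i in range(n)]
        v = [v[i] + dt * (-v[i] + sigmoid(u[i])) / tau for i in range(n)]
        traj.append(list(v))
    return traj


def cost(weights, bias, tau, target, dt, decay):
    """Squared error against the cluster-mean time courses plus weight decay."""
    traj = simulate(weights, bias, tau, target[0], dt, len(target) - 1)
    err = sum((a - b) ** 2 for sim, obs in zip(traj, target) for a, b in zip(sim, obs))
    penalty = decay * sum(w * w for row in weights for w in row)
    return err + penalty


def anneal(target, n_iter=2000, temp0=1.0, decay=1e-3, dt=0.1, tau=1.0, seed=0):
    """Simulated annealing over W and h; returns (best_cost, weights, bias)."""
    rng = random.Random(seed)
    n = len(target[0])
    weights = [[0.0] * n for _ in range(n)]
    bias = [0.0] * n
    current = cost(weights, bias, tau, target, dt, decay)
    best = (current, [row[:] for row in weights], bias[:])
    for k in range(n_iter):
        temp = temp0 * (1.0 - k / n_iter) + 1e-9  # assumed linear cooling
        i, j = rng.randrange(n), rng.randrange(n + 1)  # j == n selects the bias
        old = bias[i] if j == n else weights[i][j]
        new_value = old + rng.gauss(0.0, 0.3)
        if j == n:
            bias[i] = new_value
        else:
            weights[i][j] = new_value
        new = cost(weights, bias, tau, target, dt, decay)
        if new <= current or rng.random() < math.exp((current - new) / temp):
            current = new  # accept the move (Metropolis rule)
            if new < best[0]:
                best = (new, [row[:] for row in weights], bias[:])
        elif j == n:
            bias[i] = old  # reject: restore the previous value
        else:
            weights[i][j] = old
    return best
```

Running `anneal` several times with different seeds and comparing the returned connection matrices would correspond loosely to step (4), the search for commonalities across fits.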
Stability analysis of a model gene network links aging, stress resistance, and negligible senescence.

PubMed

Kogan, Valeria; Molodtsov, Ivan; Menshikov, Leonid I; Shmookler Reis, Robert J; Fedichev, Peter

2015-08-28

Several animal species are considered to exhibit what is called negligible senescence, i.e. they do not show signs of functional decline or any increase of mortality with age. Recent studies in naked mole rat and long-lived sea urchins showed that these species do not alter their gene-expression profiles with age as much as other organisms do. This is consistent with exceptional endurance of naked mole rat tissues to various genotoxic stresses. We conjectured, therefore, that the lifelong transcriptional stability of an organism may be a key determinant of longevity. We analyzed the stability of a simple genetic-network model and found that under most common circumstances, such a gene network is inherently unstable. Over time it undergoes an exponential accumulation of gene-regulation deviations leading to death. However, should the repair systems be sufficiently effective, the gene network can stabilize so that gene damage remains constrained along with mortality of the organism. We investigate the relationship between stress-resistance and aging and suggest that the unstable regime may provide a mathematical basis for the Gompertz "law" of aging in many species. At the same time, this model accounts for the apparently age-independent mortality observed in some exceptionally long-lived animals.
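The contrast between the two regimes can be illustrated with a linearized, one-variable caricature. The symbols below (a deviation δx from the normal expression state, a growth rate λ, a mortality rate M with Gompertz exponent Γ) are illustrative assumptions and are not taken from the abstract, which does not give the model equations.

```latex
\frac{d\,\delta x}{dt} = \lambda\,\delta x
\quad\Longrightarrow\quad
\delta x(t) = \delta x(0)\,e^{\lambda t},
\qquad
M(t) = M_0\,e^{\Gamma t}.
```

For λ > 0 the deviation grows exponentially, which is at least qualitatively compatible with the exponential rise of mortality with age stated by the Gompertz law. For λ ≤ 0 the deviation stays bounded, matching the stabilized regime in which mortality does not increase with age. How λ maps onto Γ depends on model details not stated here.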
Deciphering the Interdependence between Ecological and Evolutionary Networks.

PubMed

Melián, Carlos J; Matthews, Blake; de Andreazzi, Cecilia S; Rodríguez, Jorge P; Harmon, Luke J; Fortuna, Miguel A

2018-05-24

Biological systems consist of elements that interact within and across hierarchical levels. For example, interactions among genes determine traits of individuals, competitive and cooperative interactions among individuals influence population dynamics, and interactions among species affect the dynamics of communities and ecosystem processes. Such systems can be represented as hierarchical networks, but can have complex dynamics when interdependencies among levels of the hierarchy occur. We propose integrating ecological and evolutionary processes in hierarchical networks to explore interdependencies in biological systems. We connect gene networks underlying predator-prey trait distributions to food webs. Our approach addresses longstanding questions about how complex traits and intraspecific trait variation affect the interdependencies among biological levels and the stability of meta-ecosystems. Copyright © 2018 Elsevier Ltd. All rights reserved.
Regulatory Divergence between Parental Alleles Determines Gene Expression Patterns in Hybrids

PubMed Central

Combes, Marie-Christine; Hueber, Yann; Dereeper, Alexis; Rialle, Stéphanie; Herrera, Juan-Carlos; Lashermes, Philippe

2015-01-01

Both hybridization and allopolyploidization generate novel phenotypes by conciliating divergent genomes and regulatory networks in the same cellular context. To understand the rewiring of gene expression in hybrids, the total expression of 21,025 genes and the allele-specific expression of over 11,000 genes were quantified in interspecific hybrids and their parental species, Coffea canephora and Coffea eugenioides, using RNA-seq technology. Between parental species, cis- and trans-regulatory divergences affected around 32% and 35% of analyzed genes, respectively, with nearly 17% of them showing both. The relative importance of trans-regulatory divergences between both species could be related to their low genetic divergence and perennial habit. In hybrids, among genes divergently expressed between parental species and hybrids, 77% were expressed like one parent (expression level dominance), including 65% like C. eugenioides. Gene expression was shown to result from the expression of both alleles affected by intertwined parental trans-regulatory factors. A strong impact of C. eugenioides trans-regulatory factors on the upregulation of C. canephora alleles was revealed. The gene expression patterns appeared determined by complex combinations of cis- and trans-regulatory divergences. In particular, the observed biased expression level dominance seemed to be derived from the asymmetric effects of trans-regulatory parental factors on regulation of alleles. More generally, this study illustrates the effects of divergent trans-regulatory parental factors on the gene expression pattern in hybrids. The characteristics of the transcriptional response to hybridization appear to be determined by the compatibility of gene regulatory networks and therefore depend on genetic divergences between the parental species and their evolutionary history. PMID:25819221

Engineering of a synthetic quadrastable gene network to approach Waddington landscape and cell fate determination

PubMed Central

Wu, Fuqing; Su, Ri-Qi; Lai, Ying-Cheng; Wang, Xiao

2017-01-01

The process of cell fate determination has been depicted intuitively as cells travelling and resting on a rugged landscape, which has been probed by various theoretical studies. However, few studies have experimentally demonstrated how underlying gene regulatory networks shape the landscape and hence orchestrate cellular decision-making in the presence of both signal and noise. Here we tested different topologies and verified a synthetic gene circuit with mutual inhibition and auto-activations to be quadrastable, which enables direct study of quadruple cell fate determination on an engineered landscape. We show that cells indeed gravitate towards local minima and signal inductions dictate cell fates through modulating the shape of the multistable landscape. Experiments, guided by model predictions, reveal that sequential inductions generate distinct cell fates by changing landscape in sequence and hence navigating cells to different final states. This work provides a synthetic biology framework to approach cell fate determination and suggests a landscape-based explanation of fixed induction sequences for targeted differentiation. DOI: http://dx.doi.org/10.7554/eLife.23702.001 PMID:28397688
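A circuit with mutual inhibition and auto-activation can be explored numerically along the following lines. This is a sketch under stated assumptions, not the model used in the study: the additive Hill-function form, the parameter values, and the helper names (`settle`, `find_attractors`) are all hypothetical, and this particular parameter set is not claimed to be quadrastable. The number of attractors found depends on the parameters and on the grid of initial conditions.

```python
def hill_act(x, k, n):
    """Activating Hill function."""
    return x ** n / (k ** n + x ** n)


def hill_rep(x, k, n):
    """Repressing Hill function."""
    return k ** n / (k ** n + x ** n)


def settle(x, y, a=1.0, b=1.0, k=0.5, n=4, d=1.0, dt=0.05, steps=2000):
    """Euler-integrate two genes that each self-activate and repress the other."""
    for _ in range(steps):
        dx = a * hill_act(x, k, n) + b * hill_rep(y, k, n) - d * x
        dy = a * hill_act(y, k, n) + b * hill_rep(x, k, n) - d * y
        x, y = x + dt * dx, y + dt * dy
    return x, y


def find_attractors(grid, digits=1):
    """Integrate from every grid point and collect the rounded end states."""
    found = set()
    for x0 in grid:
        for y0 in grid:
            x, y = settle(x0, y0)
            found.add((round(x, digits), round(y, digits)))
    return found


grid = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
attractors = find_attractors(grid)
print(sorted(attractors))
```

Each rounded end state is a candidate local minimum of the landscape. Changing a parameter such as `a` or `k` partway through a run would play the role of an induction signal that reshapes the landscape.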
Inferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles

PubMed Central

Michailidis, George

2014-01-01

Reconstructing transcriptional regulatory networks is an important task in functional genomics. Data obtained from experiments that perturb genes by knockouts or RNA interference contain useful information for addressing this reconstruction problem. However, such data can be limited in size and/or are expensive to acquire. On the other hand, observational data of the organism in steady state (e.g., wild-type) are more readily available, but their informational content is inadequate for the task at hand. We develop a computational approach to appropriately utilize both data sources for estimating a regulatory network. The proposed approach is based on a three-step algorithm to estimate the underlying directed but cyclic network, that uses as input both perturbation screens and steady state gene expression data. In the first step, the algorithm determines causal orderings of the genes that are consistent with the perturbation data, by combining an exhaustive search method with a fast heuristic that in turn couples a Monte Carlo technique with a fast search algorithm. In the second step, for each obtained causal ordering, a regulatory network is estimated using a penalized likelihood based method, while in the third step a consensus network is constructed from the highest scored ones. Extensive computational experiments show that the algorithm performs well in reconstructing the underlying network and clearly outperforms competing approaches that rely only on a single data source. Further, it is established that the algorithm produces a consistent estimate of the regulatory network. PMID:24586224
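The first and third steps of such a procedure can be sketched as below. This is a toy illustration under strong assumptions, not the published algorithm: it uses exhaustive search only (no Monte Carlo heuristic), it assumes that a perturbation effect of gene a on gene b simply requires a to precede b, and it replaces the penalized-likelihood estimation of step two with a hypothetical stand-in (`allowed_edges`) that keeps every edge compatible with an ordering.

```python
from collections import Counter
from itertools import permutations


def consistent_orderings(genes, effects):
    """Step 1 (exhaustive only): orderings where each perturbed gene precedes
    every gene it was observed to affect. `effects` holds (perturbed, affected)."""
    result = []
    for order in permutations(genes):
        pos = {g: i for i, g in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in effects):
            result.append(order)
    return result


def allowed_edges(order):
    """Stand-in for step 2: every earlier -> later pair is a candidate edge.
    A real implementation would fit a penalized likelihood here instead."""
    return {(a, b) for i, a in enumerate(order) for b in order[i + 1:]}


def consensus(networks, threshold=0.5):
    """Step 3: keep edges present in at least `threshold` of the networks."""
    counts = Counter(edge for net in networks for edge in net)
    return {e for e, c in counts.items() if c / len(networks) >= threshold}


orders = consistent_orderings(["A", "B", "C"], {("A", "B")})
network = consensus([allowed_edges(o) for o in orders], threshold=1.0)
```

Exhaustive enumeration grows factorially with the number of genes, which is presumably why the abstract couples it with a fast heuristic for larger problems.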
A stele-enriched gene regulatory network in the Arabidopsis root

PubMed Central

Brady, Siobhan M; Zhang, Lifang; Megraw, Molly; Martinez, Natalia J; Jiang, Eric; Yi, Charles S; Liu, Weilin; Zeng, Anna; Taylor-Teeples, Mallorie; Kim, Dahae; Ahnert, Sebastian; Ohler, Uwe; Ware, Doreen; Walhout, Albertha J M; Benfey, Philip N

2011-01-01

Tightly controlled gene expression is a hallmark of multicellular development and is accomplished by transcription factors (TFs) and microRNAs (miRNAs). Although many studies have focused on identifying downstream targets of these molecules, less is known about the factors that regulate their differential expression. We used data from high spatial resolution gene expression experiments and yeast one-hybrid (Y1H) and two-hybrid (Y2H) assays to delineate a subset of interactions occurring within a gene regulatory network (GRN) that determines tissue-specific TF and miRNA expression in plants. We find that upstream TFs are expressed in more diverse cell types than their targets and that promoters that are bound by a relatively large number of TFs correspond to key developmental regulators. The regulatory consequence of many TFs for their target was experimentally determined using genetic analysis. Remarkably, molecular phenotypes were identified for 65% of the TFs, but morphological phenotypes were associated with only 16%. This indicates that the GRN is robust, and that gene expression changes may be canalized or buffered. PMID:21245844
Generation of oscillating gene regulatory network motifs

NASA Astrophysics Data System (ADS)

van Dorp, M.; Lannoo, B.; Carlon, E.

2013-07-01

Using an improved version of an evolutionary algorithm originally proposed by François and Hakim [Proc. Natl. Acad. Sci. USA 101, 580 (2004), doi:10.1073/pnas.0304532101], we generated small gene regulatory networks in which the concentration of a target protein oscillates in time. These networks may serve as candidates for oscillatory modules to be found in larger regulatory networks and protein interaction networks. The algorithm was run 10^5 times to produce a large set of oscillating modules, which were systematically classified and analyzed. The robustness of the oscillations against variations of the kinetic rates was also determined, to filter out the least robust cases. Furthermore, we show that the set of evolved networks can serve as a database of models whose behavior can be compared to experimentally observed oscillations. The algorithm found three smallest (core) oscillators in which nonlinearities and number of components are minimal. Two of those are two-gene modules: the mixed feedback loop, already discussed in the literature, and an autorepressed gene coupled with a heterodimer. The third one is a single gene module which is competitively regulated by a monomer and a dimer. The evolutionary algorithm also generated larger oscillating networks, which are in part extensions of the three core modules and in part genuinely new modules. The latter includes oscillators which do not rely on feedback induced by transcription factors, but are purely of post-transcriptional type. Analysis of post-transcriptional mechanisms of oscillation may provide useful information for circadian clock research, as recent experiments showed that circadian rhythms are maintained even in the absence of transcription.
Sequence-based Network Completion Reveals the Integrality of Missing Reactions in Metabolic Networks.

PubMed

Krumholz, Elias W; Libourel, Igor G L

2015-07-31

Genome-scale metabolic models are central in connecting genotypes to metabolic phenotypes. However, even for well studied organisms, such as Escherichia coli, draft networks do not contain a complete biochemical network. Missing reactions are referred to as gaps. These gaps need to be filled to enable functional analysis, and gap-filling choices influence model predictions. To investigate whether functional networks existed where all gap-filling reactions were supported by sequence similarity to annotated enzymes, four draft networks were supplemented with all reactions from the Model SEED database for which minimal sequence similarity was found in their genomes. Quadratic programming revealed that the number of reactions that could partake in a gap-filling solution was vast: 3,270 in the case of E. coli, where 72% of the metabolites in the draft network could connect a gap-filling solution. Nonetheless, no network could be completed without the inclusion of orphaned enzymes, suggesting that parts of the biochemistry integral to biomass precursor formation are uncharacterized. However, many gap-filling reactions were well determined, and the resulting networks showed improved prediction of gene essentiality compared with networks generated through canonical gap filling. In addition, gene essentiality predictions that were sensitive to poorly determined gap-filling reactions were of poor quality, suggesting that damage to the network structure resulting from the inclusion of erroneous gap-filling reactions may be predictable. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Inference of developmental gene regulatory networks beyond classical model systems: new approaches in the post-genomic era.

PubMed

Fernandez-Valverde, Selene L; Aguilera, Felipe; Ramos-Díaz, René Alexander

2018-06-18

The advent of high-throughput sequencing technologies has revolutionized the way we understand the transformation of genetic information into morphological traits. Elucidating the network of interactions between genes that govern cell differentiation through development is one of the core challenges in genome research. These networks are known as developmental gene regulatory networks (dGRNs) and consist largely of the functional linkage between developmental control genes, cis-regulatory modules and differentiation genes, which generate spatially and temporally refined patterns of gene expression. Over the last 20 years, great advances have been made in determining these gene interactions mainly in classical model systems, including human, mouse, sea urchin, fruit fly, and worm. This has brought about a radical transformation in the fields of developmental biology and evolutionary biology, allowing the generation of high-resolution gene regulatory maps to analyse cell differentiation during animal development. Such maps have enabled the identification of gene regulatory circuits and have led to the development of network inference methods that can recapitulate the differentiation of specific cell-types or developmental stages. In contrast, dGRN research in non-classical model systems has been limited to the identification of developmental control genes via the candidate gene approach and the characterization of their spatiotemporal expression patterns, as well as to the discovery of cis-regulatory modules via patterns of sequence conservation and/or predicted transcription-factor binding sites. However, thanks to the continuous advances in high-throughput sequencing technologies, this scenario is rapidly changing. Here, we give a historical overview on the architecture and elucidation of the dGRNs. 
Subsequently, we summarize the approaches available to unravel these regulatory networks, highlighting the vast range of possibilities for integrating multiple technical advances and theoretical approaches to expand our understanding of global gene regulation during animal development in non-classical model systems. Such new knowledge will lead to greater insights not only into the evolution of molecular mechanisms underlying cell identity and animal body plans, but also into the evolution of key morphological innovations in animals.
DNA-Binding Kinetics Determines the Mechanism of Noise-Induced Switching in Gene Networks

PubMed Central

Tse, Margaret J.; Chu, Brian K.; Roy, Mahua; Read, Elizabeth L.

2015-01-01

Gene regulatory networks are multistable dynamical systems in which attractor states represent cell phenotypes. Spontaneous, noise-induced transitions between these states are thought to underlie critical cellular processes, including cell developmental fate decisions, phenotypic plasticity in fluctuating environments, and carcinogenesis. As such, there is increasing interest in the development of theoretical and computational approaches that can shed light on the dynamics of these stochastic state transitions in multistable gene networks. We applied a numerical rare-event sampling algorithm to study transition paths of spontaneous noise-induced switching for a ubiquitous gene regulatory network motif, the bistable toggle switch, in which two mutually repressive genes compete for dominant expression. We find that the method can efficiently uncover detailed switching mechanisms that involve fluctuations both in occupancies of DNA regulatory sites and copy numbers of protein products. In addition, we show that the rate parameters governing binding and unbinding of regulatory proteins to DNA strongly influence the switching mechanism. In a regime of slow DNA-binding/unbinding kinetics, spontaneous switching occurs relatively frequently and is driven primarily by fluctuations in DNA-site occupancies. In contrast, in a regime of fast DNA-binding/unbinding kinetics, switching occurs rarely and is driven by fluctuations in levels of expressed protein. Our results demonstrate how spontaneous cell phenotype transitions involve collective behavior of both regulatory proteins and DNA. Computational approaches capable of simulating dynamics over many system variables are thus well suited to exploring dynamic mechanisms in gene networks. PMID:26488666
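The interplay of promoter occupancy and protein copy number in a toggle switch can be illustrated with a plain Gillespie (stochastic simulation algorithm) run. This is a minimal sketch, not the rare-event sampling method used in the study: the reaction scheme, the rate names (`g` production, `k` degradation, `h` binding, `f` unbinding), their values, and the simplification that binding does not consume the repressor molecule are all assumptions.

```python
import random


def gillespie_toggle(t_end, g=20.0, k=1.0, h=0.01, f=0.1, seed=0):
    """Simulate two mutually repressive genes with explicit promoter states.

    protein[i] is the copy number of protein i; bound[i] is True when the
    promoter of gene i is occupied by the other gene's protein (repressed).
    Returns a list of (time, copies_of_A, copies_of_B) tuples.
    """
    rng = random.Random(seed)
    t = 0.0
    protein = [0, 0]
    bound = [False, False]
    history = [(t, protein[0], protein[1])]
    while True:
        rates = []
        for i in (0, 1):
            other = 1 - i
            rates.append(0.0 if bound[i] else g)                   # production
            rates.append(k * protein[i])                           # degradation
            rates.append(0.0 if bound[i] else h * protein[other])  # repressor binds
            rates.append(f if bound[i] else 0.0)                   # repressor unbinds
        total = sum(rates)
        if total == 0.0:
            break
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            break
        # Choose a reaction with probability proportional to its rate.
        r = rng.random() * total
        idx, acc = 0, rates[0]
        while acc <= r and idx < len(rates) - 1:
            idx += 1
            acc += rates[idx]
        i, kind = divmod(idx, 4)
        if kind == 0:
            protein[i] += 1
        elif kind == 1:
            protein[i] -= 1
        elif kind == 2:
            bound[i] = True
        else:
            bound[i] = False
        history.append((t, protein[0], protein[1]))
    return history
```

Lowering `h` and `f` together would mimic the slow DNA-binding regime described above and raising them the fast regime. Counting how often the dominant protein changes in `history` gives a crude switching rate for comparing the two.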
A meta-analysis of public microarray data identifies biological regulatory networks in Parkinson's disease.

PubMed

Su, Lining; Wang, Chunjie; Zheng, Chenqing; Wei, Huiping; Song, Xiaoqing

2018-04-13

Parkinson's disease (PD) is a long-term degenerative disease that is caused by environmental and genetic factors. The networks of genes and their regulators that control the progression and development of PD require further elucidation. We examined common differentially expressed genes (DEGs) from several PD blood and substantia nigra (SN) microarray datasets by meta-analysis. We then screened the PD-specific genes from the common DEGs using GCBI. Next, we used a series of bioinformatics tools to analyze the miRNAs, lncRNAs and SNPs associated with the common PD-specific genes, and then identified the mTF-miRNA-gene-gTF network. Our results identified 36 common DEGs in PD blood studies and 17 common DEGs in PD SN studies, and five of the genes were previously known to be associated with PD. Further study of the regulatory miRNAs associated with the common PD-specific genes revealed 14 PD-specific miRNAs in our study. Analysis of the mTF-miRNA-gene-gTF network for the PD-specific genes revealed two feed-forward loops: one involving the SPRK2 gene, hsa-miR-19a-3p and SPI1, and the second involving the SPRK2 gene, hsa-miR-17-3p and SPI. The long non-coding RNA (lncRNA)-mediated regulatory network identified lncRNAs associated with PD-specific genes and PD-specific miRNAs. Moreover, single nucleotide polymorphism (SNP) analysis of the PD-specific genes identified two significant SNPs, and SNP analysis of the neurodegenerative disease-specific genes identified seven significant SNPs. Most of these SNPs are present in the 3'-untranslated region of genes and are controlled by several miRNAs. Our study identified a total of 53 common DEGs in PD patients compared with healthy controls in blood and brain datasets, and five of these genes were previously linked with PD. Regulatory network analysis identified PD-specific miRNAs, associated long non-coding RNAs and feed-forward loops, which contribute to our understanding of the mechanisms underlying PD. The SNPs identified in our study can determine whether a genetic variant is associated with PD. Overall, these findings will help guide our study of the complex molecular mechanism of PD.
A Systems' Biology Approach to Study MicroRNA-Mediated Gene Regulatory Networks

PubMed Central

Kunz, Manfred; Vera, Julio; Wolkenhauer, Olaf

2013-01-01

MicroRNAs (miRNAs) are potent effectors in gene regulatory networks where aberrant miRNA expression can contribute to human diseases such as cancer. For a better understanding of the regulatory role of miRNAs in coordinating gene expression, we here present a systems biology approach combining data-driven modeling and model-driven experiments. Such an approach is characterized by an iterative process, including biological data acquisition and integration, network construction, mathematical modeling and experimental validation. To demonstrate the application of this approach, we adopt it to investigate mechanisms of collective repression on p21 by multiple miRNAs. We first construct a p21 regulatory network based on data from the literature and further expand it using algorithms that predict molecular interactions. Based on the network structure, a detailed mechanistic model is established and its parameter values are determined using data. Finally, the calibrated model is used to study the effect of different miRNA expression profiles and cooperative target regulation on p21 expression levels in different biological contexts. PMID:24350286
Network neighborhood analysis with the multi-node topological overlap measure.

PubMed

Li, Ai; Horvath, Steve

2007-01-15

The goal of neighborhood analysis is to find a set of genes (the neighborhood) that is similar to an initial 'seed' set of genes. Neighborhood analysis methods for network data are important in systems biology. If individual network connections are susceptible to noise, it can be advantageous to define neighborhoods on the basis of a robust interconnectedness measure, e.g. the topological overlap measure. Since the use of multiple nodes in the seed set may lead to more informative neighborhoods, it can be advantageous to define multi-node similarity measures. The pairwise topological overlap measure is generalized to multiple network nodes and subsequently used in a recursive neighborhood construction method. A local permutation scheme is used to determine the neighborhood size. Using four network applications and a simulated example, we provide empirical evidence that the resulting neighborhoods are biologically meaningful, e.g. we use neighborhood analysis to identify brain cancer related genes. An executable Windows program and tutorial for multi-node topological overlap measure (MTOM) based analysis can be downloaded from the webpage (http://www.genetics.ucla.edu/labs/horvath/MTOM/).
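For readers unfamiliar with the topological overlap measure, the following is a minimal sketch of the standard pairwise version for an unweighted, undirected network. It does not implement the multi-node generalization (MTOM) or the recursive neighborhood construction described in the abstract, and the function name topological_overlap is chosen here for illustration only.

```python
def topological_overlap(adj):
    """Pairwise topological overlap for a symmetric 0/1 adjacency matrix.

    adj is a list of lists with a zero diagonal. For nodes i != j:
        TOM[i][j] = (shared_neighbors + adj[i][j])
                    / (min(degree_i, degree_j) + 1 - adj[i][j])
    so two nodes score highly when they are linked and share most neighbors.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    tom = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                tom[i][j] = 1.0
                continue
            # Count nodes connected to both i and j.
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            denom = min(degree[i], degree[j]) + 1 - adj[i][j]
            tom[i][j] = (shared + adj[i][j]) / denom
    return tom
```

In a four-node network with edges 0-1, 0-2, 1-2 and 2-3, nodes 0 and 1 are linked and share their only other neighbor, so their overlap is 1.0, while the unlinked nodes 0 and 3 share one neighbor and score 0.5.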
Elucidating the genotype-phenotype relationships and network perturbations of human shared and specific disease genes from an evolutionary perspective.

PubMed

Begum, Tina; Ghosh, Tapash Chandra

2014-10-05

To date, numerous studies have been attempted to determine the extent of variation in evolutionary rates between human disease and nondisease (ND) genes. In our present study, we have considered human autosomal monogenic (Mendelian) disease genes, which were classified into two groups according to the number of phenotypic defects, that is, specific disease (SPD) gene (one gene: one defect) and shared disease (SHD) gene (one gene: multiple defects). Here, we have compared the evolutionary rates of these two groups of genes, that is, SPD genes and SHD genes with respect to ND genes. We observed that the average evolutionary rates are slow in SHD group, intermediate in SPD group, and fast in ND group. Group-to-group evolutionary rate differences remain statistically significant regardless of their gene expression levels and number of defects. We demonstrated that disease genes are under strong selective constraint if they emerge through edgetic perturbation or drug-induced perturbation of the interactome network, show tissue-restricted expression, and are involved in transmembrane transport. Among all the factors, our regression analyses interestingly suggest the independent effects of 1) drug-induced perturbation and 2) the interaction term of expression breadth and transmembrane transport on protein evolutionary rates. We reasoned that the drug-induced network disruption is a combination of several edgetic perturbations and, thus, has more severe effect on gene phenotypes. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Functional genomics annotation of a statistical epistasis network associated with bladder cancer susceptibility.

PubMed

Hu, Ting; Pan, Qinxin; Andrew, Angeline S; Langer, Jillian M; Cole, Michael D; Tomlinson, Craig R; Karagas, Margaret R; Moore, Jason H

2014-04-11

Several different genetic and environmental factors have been identified as independent risk factors for bladder cancer in population-based studies. Recent studies have turned to understanding the role of gene-gene and gene-environment interactions in determining risk. We previously developed the bioinformatics framework of statistical epistasis networks (SEN) to characterize the global structure of interacting genetic factors associated with a particular disease or clinical outcome. By applying SEN to a population-based study of bladder cancer among Caucasians in New Hampshire, we were able to identify a set of connected genetic factors with strong and significant interaction effects on bladder cancer susceptibility. To support our statistical findings using networks, in the present study, we performed pathway enrichment analyses on the set of genes identified using SEN, and found that they are associated with the carcinogen benzo[a]pyrene, a component of tobacco smoke. We further carried out an mRNA expression microarray experiment to validate statistical genetic interactions, and to determine if the set of genes identified in the SEN were differentially expressed in a normal bladder cell line and a bladder cancer cell line in the presence or absence of benzo[a]pyrene. Significant nonrandom sets of genes from the SEN were found to be differentially expressed in response to benzo[a]pyrene in both the normal bladder cells and the bladder cancer cells. In addition, the patterns of gene expression were significantly different between these two cell types. The enrichment analyses and the gene expression microarray results support the idea that SEN analysis of bladder cancer in population-based studies is able to identify biologically meaningful statistical patterns. These results bring us a step closer to a systems genetic approach to understanding cancer susceptibility that integrates population and laboratory-based studies.
Discovering Implicit Entity Relation with the Gene-Citation-Gene Network

PubMed Central

Song, Min; Han, Nam-Gi; Kim, Yong-Hwan; Ding, Ying; Chambers, Tamy

2013-01-01

In this paper, we apply the entitymetrics model to our constructed Gene-Citation-Gene (GCG) network. Based on the premise that there is a hidden, but plausible, relationship between an entity in one article and an entity in its citing article, we constructed a GCG network of gene pairs implicitly connected through citation. We compare the performance of this GCG network to a gene-gene (GG) network constructed over the same corpus but which uses gene pairs explicitly connected through traditional co-occurrence. Using 331,411 MEDLINE abstracts collected from 18,323 seed articles and their references, we identify 25 gene pairs. A comparison of these pairs with interactions found in BioGRID reveals that 96% of the gene pairs in the GCG network have known interactions. We measure network performance using degree, weighted degree, closeness, betweenness centrality and PageRank. Combining all measures, we find the GCG network has more gene pairs, but a lower matching rate than the GG network. However, combining top ranked genes in both networks produces a matching rate of 35.53%. By visualizing both the GG and GCG networks, we find that cancer is the most dominant disease associated with the genes in both networks. Overall, the study indicates that the GCG network can be useful for detecting gene interaction in an implicit manner. PMID:24358368
Identifying significant genetic regulatory networks in the prostate cancer from microarray data based on transcription factor analysis and conditional independency.

PubMed

Yeh, Hsiang-Yuan; Cheng, Shih-Wu; Lin, Yu-Chun; Yeh, Cheng-Yu; Lin, Shih-Fang; Soo, Von-Wun

2009-12-21

Prostate cancer is a leading cancer worldwide and is characterized by aggressive metastasis. Owing to its clinical heterogeneity, prostate cancer displays different stages and grades related to aggressive metastatic disease. Although numerous studies have used microarray analysis and traditional clustering methods to identify individual genes involved in the disease processes, the important gene regulations remain unclear. We present a computational method for automatically inferring genetic regulatory networks from microarray data, using transcription factor analysis and conditional independence testing to explore potentially significant gene regulatory networks that are correlated with cancer, tumor grade and stage in prostate cancer. To deal with missing values in microarray data, we used a K-nearest-neighbors (KNN) algorithm to estimate the missing expression values. We applied web services technology to wrap the bioinformatics toolkits and databases in order to automatically extract the promoter regions of DNA sequences and to predict the transcription factors that regulate gene expression. We adopted a microarray dataset consisting of 62 primary tumors and 41 normal prostate tissues from the Stanford Microarray Database (SMD) as a target dataset to evaluate our method. The predicted results identified possible biomarker genes related to cancer and indicated that androgen functions and processes may be involved in the development of prostate cancer and may promote cell death in the cell cycle. Our predicted results showed that sub-networks of genes SREBF1, STAT6 and PBX1 are strongly related to a high extent while ETS transcription factors ELK1, JUN and EGR2 are related to a low extent. Gene SLC22A3 may explain clinically the differentiation associated with high grade cancer compared with low grade cancer. Enhancer of Zeste Homolog 2 (EZH2) regulated by RUNX1 and STAT3 is correlated to the pathological stage. We provide a computational framework to reconstruct the genetic regulatory network from the microarray data using biological knowledge and constraint-based inferences. Our method is helpful in verifying possible interaction relations in gene regulatory networks and filtering out incorrect relations inferred by imperfect methods. We predicted not only individual genes related to cancer but also discovered significant gene regulation networks. Our method is also validated by several published papers and databases, and the significant gene regulatory networks perform critical biological functions and processes including cell adhesion molecules, androgen and estrogen metabolism, smooth muscle contraction, and GO-annotated processes. Those significant gene regulations and the critical concept of tumor progression are useful to understand cancer biology and disease treatment.
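The KNN step for missing microarray values mentioned above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes genes are rows, samples are columns, missing entries are marked with None, and distance is a root-mean-square difference over the samples both genes have measured. The function name knn_impute and the default k are hypothetical.

```python
import math


def knn_impute(matrix, k=2):
    """Fill missing entries (None) with the mean of the k most similar genes.

    matrix is a list of gene rows, each a list of expression values.
    Similarity between two genes uses only samples observed in both rows.
    Entries with no usable neighbor are left as None.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors must have an observed value in column c.
            candidates = [(distance(row, other), other[c])
                          for j, other in enumerate(matrix)
                          if j != i and other[c] is not None]
            candidates = [cand for cand in candidates if cand[0] < math.inf]
            candidates.sort(key=lambda cand: cand[0])
            nearest = candidates[:k]
            if nearest:
                imputed[i][c] = sum(v for _, v in nearest) / len(nearest)
    return imputed
```

The input matrix is not modified, so the imputed and original values can be compared afterwards. In practice the choice of k and of the distance measure would need to be tuned to the dataset.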
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.

PubMed

Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin

2017-08-31

Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks

PubMed Central

Li, Min; Li, Dongyan; Tang, Yu; Wang, Jianxin

2017-01-01

Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster. PMID:28858211
Fyn-Dependent Gene Networks in Acute Ethanol Sensitivity

PubMed Central

Farris, Sean P.; Miles, Michael F.

2013-01-01

Studies in humans and animal models document that acute behavioral responses to ethanol are a predisposing factor for the risk of long-term drinking behavior. Prior microarray data from our laboratory document strain- and brain region-specific variation in gene expression profile responses to acute ethanol that may be underlying regulators of ethanol behavioral phenotypes. The non-receptor tyrosine kinase Fyn has previously been mechanistically implicated in the sedative-hypnotic response to acute ethanol. To further understand how Fyn may modulate ethanol behaviors, we used whole-genome expression profiling. We characterized basal and acute ethanol-evoked (3 g/kg) gene expression patterns in nucleus accumbens (NAC), prefrontal cortex (PFC), and ventral midbrain (VMB) of control and Fyn knockout mice. Bioinformatics analysis identified a set of Fyn-related gene networks differently regulated by acute ethanol across the three brain regions. In particular, our analysis suggested a coordinate basal decrease in myelin-associated gene expression within NAC and PFC as an underlying factor in sensitivity of Fyn null animals to ethanol sedation. An in silico analysis across the BXD recombinant inbred (RI) strains of mice identified a significant correlation between Fyn expression and a previously published ethanol loss-of-righting-reflex (LORR) phenotype. By combining PFC gene expression correlates to Fyn and LORR across multiple genomic datasets, we identified robust Fyn-centric gene networks related to LORR. Our results thus suggest that multiple system-wide changes exist within specific brain regions of Fyn knockout mice, and that distinct Fyn-dependent expression networks within PFC may be important determinants of the LORR due to acute ethanol. These results add to the interpretation of acute ethanol behavioral sensitivity in Fyn kinase null animals, and identify Fyn-centric gene networks influencing variance in ethanol LORR. Such networks may also inform future design of pharmacotherapies for the treatment and prevention of alcohol use disorders. PMID:24312422
Statistical mechanics of scale-free gene expression networks

NASA Astrophysics Data System (ADS)

Gross, Eitan

2012-12-01

The gene co-expression networks of many organisms including bacteria, mice and man exhibit scale-free distribution. This heterogeneous distribution of connections decreases the vulnerability of the network to random attacks and thus may confer on the genetic replication machinery an intrinsic resilience to such attacks, triggered by changing environmental conditions that the organism may be subject to during evolution. This resilience to random attacks comes at an energetic cost, however, reflected by the lower entropy of the scale-free distribution compared to the more homogeneous, random network. In this study we found that the cell cycle-regulated gene expression pattern of the yeast Saccharomyces cerevisiae obeys a power-law distribution with an exponent α = 2.1 and an entropy of 1.58. The latter is very close to the maximal value of 1.65 obtained from linear optimization of the entropy function under the constraint of a constant cost function, determined by the average degree connectivity. We further show that the yeast's gene expression network can achieve scale-free distribution in a process that does not involve growth but rather via re-wiring of the connections between nodes of an ordered network. Our results support the idea of an evolutionary selection, which acts at the level of the protein sequence, and is compatible with the notion of greater biological importance of highly connected nodes in the protein interaction network. Our constrained re-wiring model provides a theoretical framework for a putative thermodynamically driven evolutionary selection process.
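The comparison between the entropy of a scale-free degree distribution and that of a more homogeneous one can be made concrete with a short calculation. The sketch below assumes a discrete power law truncated at a maximum degree k_max and uses natural logarithms. The abstract does not state the degree range or the logarithm base, so this is not expected to reproduce the reported values of 1.58 and 1.65, and the function names are illustrative.

```python
import math


def power_law_distribution(alpha, k_max):
    """Normalized P(k) proportional to k**(-alpha) for k = 1..k_max."""
    weights = [k ** (-alpha) for k in range(1, k_max + 1)]
    z = sum(weights)
    return [w / z for w in weights]


def shannon_entropy(p):
    """Shannon entropy in nats of a discrete probability distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


if __name__ == "__main__":
    k_max = 100  # assumed truncation, chosen only for illustration
    scale_free = power_law_distribution(2.1, k_max)
    uniform = [1.0 / k_max] * k_max
    print("power-law entropy:", shannon_entropy(scale_free))
    print("uniform entropy:  ", shannon_entropy(uniform))
```

For any fixed range of degrees the uniform distribution attains the maximum entropy, log(k_max), so the power-law value is necessarily lower, which is the qualitative point made in the abstract.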
Regulatory divergence between parental alleles determines gene expression patterns in hybrids.

PubMed

Combes, Marie-Christine; Hueber, Yann; Dereeper, Alexis; Rialle, Stéphanie; Herrera, Juan-Carlos; Lashermes, Philippe

2015-03-29

Both hybridization and allopolyploidization generate novel phenotypes by conciliating divergent genomes and regulatory networks in the same cellular context. To understand the rewiring of gene expression in hybrids, the total expression of 21,025 genes and the allele-specific expression of over 11,000 genes were quantified in interspecific hybrids and their parental species, Coffea canephora and Coffea eugenioides using RNA-seq technology. Between parental species, cis- and trans-regulatory divergences affected around 32% and 35% of analyzed genes, respectively, with nearly 17% of them showing both. The relative importance of trans-regulatory divergences between both species could be related to their low genetic divergence and perennial habit. In hybrids, among divergently expressed genes between parental species and hybrids, 77% was expressed like one parent (expression level dominance), including 65% like C. eugenioides. Gene expression was shown to result from the expression of both alleles affected by intertwined parental trans-regulatory factors. A strong impact of C. eugenioides trans-regulatory factors on the upregulation of C. canephora alleles was revealed. The gene expression patterns appeared determined by complex combinations of cis- and trans-regulatory divergences. In particular, the observed biased expression level dominance seemed to be derived from the asymmetric effects of trans-regulatory parental factors on regulation of alleles. More generally, this study illustrates the effects of divergent trans-regulatory parental factors on the gene expression pattern in hybrids. The characteristics of the transcriptional response to hybridization appear to be determined by the compatibility of gene regulatory networks and therefore depend on genetic divergences between the parental species and their evolutionary history. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
SiGN-SSM: open source parallel software for estimating gene networks with state space models.

PubMed

Tamada, Yoshinori; Yamaguchi, Rui; Imoto, Seiya; Hirose, Osamu; Yoshida, Ryo; Nagasaki, Masao; Miyano, Satoru

2011-04-15

SiGN-SSM is open-source gene network estimation software that runs in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), a statistical dynamic model suitable for analyzing short time and/or replicated time series gene expression profiles. SiGN-SSM implements a novel parameter constraint that is effective in stabilizing the estimated models. Also, by using a supercomputer, it is able to determine the gene network structure by a statistical permutation test in a practical time. SiGN-SSM is applicable not only to analyzing temporal regulatory dependencies between genes, but also to extracting the differentially regulated genes from time series expression profiles. SiGN-SSM is distributed under GNU Affero General Public Licence (GNU AGPL) version 3 and can be downloaded at http://sign.hgc.jp/signssm/. The pre-compiled binaries for some architectures are available in addition to the source code. The pre-installed binaries are also available on the Human Genome Center supercomputer system. The online manual and the supplementary information of SiGN-SSM are available on our web site. Contact: tamada@ims.u-tokyo.ac.jp.

Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining

PubMed Central

2012-01-01

Background: Fever is one of the most common adverse events of vaccines. The detailed mechanisms of fever and vaccine-associated gene interaction networks are not fully understood. In the present study, we employed a genome-wide, Centrality and Ontology-based Network Discovery using Literature data (CONDL) approach to analyse the genes and gene interaction networks associated with fever or vaccine-related fever responses. Results: Over 170,000 fever-related articles from PubMed abstracts and titles were retrieved and analysed at the sentence level using natural language processing techniques to identify genes and vaccines (including 186 Vaccine Ontology terms) as well as their interactions. This resulted in a generic fever network consisting of 403 genes and 577 gene interactions. A vaccine-specific fever sub-network consisting of 29 genes and 28 gene interactions was extracted from articles that are related to both fever and vaccines. In addition, gene-vaccine interactions were identified. Vaccines (including 4 specific vaccine names) were found to directly interact with 26 genes. Gene set enrichment analysis was performed using the genes in the generated interaction networks. Moreover, the genes in these networks were prioritized using network centrality metrics. Making scientific discoveries and generating new hypotheses were possible by using network centrality and gene set enrichment analyses. For example, our study found that the genes in the generic fever network were more enriched in cell death and responses to wounding, and the vaccine sub-network had more gene enrichment in leukocyte activation and phosphorylation regulation. The most central genes in the vaccine-specific fever network are predicted to be highly relevant to vaccine-induced fever, whereas genes that are central only in the generic fever network are likely to be highly relevant to generic fever responses. Interestingly, no Toll-like receptors (TLRs) were found in the gene-vaccine interaction network. Since multiple TLRs were found in the generic fever network, it is reasonable to hypothesize that vaccine-TLR interactions may play an important role in inducing fever response, which deserves a further investigation. Conclusions: This study demonstrated that ontology-based literature mining is a powerful method for analyzing gene interaction networks and generating new scientific hypotheses. PMID:23256563
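Prioritizing genes by network centrality, as described above, can be sketched with the simplest such metric, degree centrality. The abstract does not say which centrality metrics were used, so this is one common choice rather than the CONDL method itself. The function names and the gene identifiers in the usage example are placeholders.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality for an undirected gene interaction network.

    edges is an iterable of (gene_a, gene_b) pairs. Each gene's score is
    its number of distinct partners divided by (number of genes - 1).
    Self-interactions are ignored.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {gene: 0.0 for gene in neighbors}
    return {gene: len(partners) / (n - 1)
            for gene, partners in neighbors.items()}


def rank_genes(edges):
    """Genes sorted from most to least central, ties broken by name."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda gene: (-scores[gene], gene))


if __name__ == "__main__":
    # Placeholder gene names, not taken from the study.
    toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
    print(rank_genes(toy_edges))
```

Running the same ranking on a generic network and on a sub-network, then comparing the two lists, mirrors the kind of contrast the abstract draws between genes central to vaccine-specific fever and genes central only to generic fever.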
A systems biology approach toward understanding seed composition in soybean.

PubMed

Li, Ling; Hur, Manhoi; Lee, Joon-Yong; Zhou, Wenxu; Song, Zhihong; Ransom, Nick; Demirkale, Cumhur Yusuf; Nettleton, Dan; Westgate, Mark; Arendsee, Zebulun; Iyer, Vidya; Shanks, Jackie; Nikolau, Basil; Wurtele, Eve Syrkin

2015-01-01

The molecular, biochemical, and genetic mechanisms that regulate the complex metabolic network of soybean seed development determine the ultimate balance of protein, lipid, and carbohydrate stored in the mature seed. Many of the genes and metabolites that participate in seed metabolism are unknown or poorly defined; even more remains to be understood about the regulation of their metabolic networks. A global omics analysis can provide insights into the regulation of seed metabolism, even without a priori assumptions about the structure of these networks. With the future goal of predictive biology in mind, we have combined metabolomics, transcriptomics, and metabolic flux technologies to reveal the global developmental and metabolic networks that determine the structure and composition of the mature soybean seed. We have coupled this global approach with interactive bioinformatics and statistical analyses to gain insights into the biochemical programs that determine soybean seed composition. For this purpose, we used Plant/Eukaryotic and Microbial Metabolomics Systems Resource (PMR, http://www.metnetdb.org/pmr), a platform that incorporates metabolomics data to develop hypotheses concerning the organization and regulation of metabolic networks, and MetNet systems biology tools (http://www.metnetdb.org) for plant omics data, a framework to enable interactive visualization of metabolic and regulatory networks. This combination of high-throughput experimental data and bioinformatics analyses has revealed sets of specific genes, genetic perturbations and mechanisms, and metabolic changes that are associated with the developmental variation in soybean seed composition. Researchers can explore these metabolomics and transcriptomics data interactively at PMR.
Pan-histone deacetylase inhibitors regulate signaling pathways involved in proliferative and pro-inflammatory mechanisms in H9c2 cells

PubMed Central

2012-01-01

Background: We have shown previously that pan-HDAC inhibitors (HDACIs) m-carboxycinnamic acid bis-hydroxamide (CBHA) and trichostatin A (TSA) attenuated cardiac hypertrophy in BALB/c mice by inducing hyper-acetylation of cardiac chromatin that was accompanied by suppression of pro-inflammatory gene networks. However, it was not feasible to determine the precise contribution of the myocytes and non-myocytes to HDACI-induced gene expression in the intact heart. Therefore, the current study was undertaken with a primary goal of elucidating temporal changes in the transcriptomes of cardiac myocytes exposed to CBHA and TSA. Results: We incubated H9c2 cardiac myocytes in growth medium containing either of the two HDACIs for 6 h and 24 h and analyzed changes in gene expression using Illumina microarrays. Exposure of H9c2 cells to TSA for 6 h and 24 h led to differential expression of 468 and 231 genes, respectively. In contrast, incubation of cardiac myocytes with CBHA for 6 h and 24 h elicited differential expression of 768 and 999 genes, respectively. We analyzed CBHA- and TSA-induced differentially expressed genes by Ingenuity Pathway (IPA), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Core_TF programs and discovered that CBHA and TSA impinged on several common gene networks. Thus, both HDACIs induced a repertoire of signaling kinases (PTEN-PI3K-AKT and MAPK) and transcription factors (Myc, p53, NFkB and HNF4A) representing canonical TGFβ, TNF-α, IFNγ and IL-6 specific networks. An overrepresentation of E2F, AP2, EGR1 and SP1 specific motifs was also found in the promoters of the differentially expressed genes. Apparently, TSA elicited predominantly TGFβ- and TNF-α-intensive gene networks regardless of the duration of treatment. In contrast, CBHA elicited TNF-α and IFNγ specific networks at 6 h, followed by elicitation of IL-6 and IFNγ-centered gene networks at 24 h. Conclusions: Our data show that both CBHA and TSA induced similar, but not identical, time-dependent gene networks in H9c2 cardiac myocytes. Initially, both HDACIs impinged on numerous genes associated with adipokine signaling, intracellular metabolism and energetics, and cell cycle. A continued exposure to either CBHA or TSA led to the emergence of a number of apoptosis- and inflammation-specific gene networks that were apparently suppressed by both HDACIs. Based on these data we posit that the anti-inflammatory and anti-proliferative actions of HDACIs are myocyte-intrinsic. These findings advance our understanding of the mechanisms of actions of HDACIs on cardiac myocytes and reveal potential signaling pathways that may be targeted therapeutically. PMID:23249388
Method for determining gene knockouts

DOEpatents

Maranas, Costas D [Port Matilda, PA]; Burgard, Anthony R [State College, PA]; Pharkya, Priti [State College, PA]

2011-09-27

A method for determining candidates for gene deletions and additions using a model of a metabolic network associated with an organism. The model includes a plurality of metabolic reactions defining metabolite relationships. The method includes selecting a bioengineering objective for the organism, selecting at least one cellular objective, forming an optimization problem that couples the at least one cellular objective with the bioengineering objective, and solving the optimization problem to yield at least one candidate.
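Coupling a cellular objective with a bioengineering objective is commonly written as a bilevel optimization problem. The formulation below is a sketch of that general idea for deletions only, not the claim language of the patent. The symbols are notation assumed here: S is the stoichiometric matrix, v_j are reaction fluxes, y_j = 0 marks a deleted reaction, and K is a cap on the number of deletions. Biomass as the cellular objective and a product flux as the bioengineering objective are also assumptions made for illustration.

```latex
\begin{aligned}
\max_{y}\quad & v_{\text{product}} && \text{(bioengineering objective)}\\
\text{s.t.}\quad & v \in \arg\max_{v}\; v_{\text{biomass}} && \text{(cellular objective)}\\
& \qquad \text{s.t.}\;\; \sum_{j} S_{ij}\, v_j = 0 \quad \forall i,\\
& \qquad\qquad v_j^{\min}\, y_j \le v_j \le v_j^{\max}\, y_j \quad \forall j,\\
& y_j \in \{0,1\} \quad \forall j, \qquad \sum_{j} (1 - y_j) \le K .
\end{aligned}
```

The inner problem models the organism optimizing its own objective under whatever reactions remain; the outer problem chooses which reactions to remove so that this inner optimum also yields a high product flux.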
Method for determining gene knockouts

DOEpatents

Maranas, Costas D; Burgard, Anthony R; Pharkya, Priti

2013-06-04

A method for determining candidates for gene deletions and additions using a model of a metabolic network associated with an organism. The model includes a plurality of metabolic reactions defining metabolite relationships. The method includes selecting a bioengineering objective for the organism, selecting at least one cellular objective, forming an optimization problem that couples the at least one cellular objective with the bioengineering objective, and solving the optimization problem to yield at least one candidate.
Stochasticity versus determinism: consequences for realistic gene regulatory network modelling and evolution.

PubMed

Jenkins, Dafyd J; Stekel, Dov J

2010-02-01

Gene regulation is one important mechanism in producing observed phenotypes and heterogeneity. Consequently, the study of gene regulatory network (GRN) architecture, function and evolution now forms a major part of modern biology. However, it is impossible to experimentally observe the evolution of GRNs on the timescales on which living species evolve. In silico evolution provides an approach to studying the long-term evolution of GRNs, but many models have either considered network architecture from non-adaptive evolution, or evolution to non-biological objectives. Here, we address a number of important modelling and biological questions about the evolution of GRNs to the realistic goal of biomass production. Can different commonly used simulation paradigms, in particular deterministic and stochastic Boolean networks, with and without basal gene expression, be used to compare adaptive with non-adaptive evolution of GRNs? Are these paradigms together with this goal sufficient to generate a range of solutions? Will the interaction between a biological goal and evolutionary dynamics produce trade-offs between growth and mutational robustness? We show that stochastic basal gene expression forces shrinkage of genomes due to energetic constraints and is a prerequisite for some solutions. In systems that are able to evolve rates of basal expression, two optima, one with and one without basal expression, are observed. Simulation paradigms without basal expression generate bloated networks with non-functional elements. Further, a range of functional solutions was observed under identical conditions only in stochastic networks. Moreover, there are trade-offs between efficiency and yield, indicating an inherent intertwining of fitness and evolutionary dynamics.
Origins of extrinsic variability in eukaryotic gene expression

NASA Astrophysics Data System (ADS)

Volfson, Dmitri; Marciniak, Jennifer; Blake, William J.; Ostroff, Natalie; Tsimring, Lev S.; Hasty, Jeff

2006-02-01

Variable gene expression within a clonal population of cells has been implicated in a number of important processes including mutation and evolution, determination of cell fates and the development of genetic disease. Recent studies have demonstrated that a significant component of expression variability arises from extrinsic factors thought to influence multiple genes simultaneously, yet the biological origins of this extrinsic variability have received little attention. Here we combine computational modelling with fluorescence data generated from multiple promoter-gene inserts in Saccharomyces cerevisiae to identify two major sources of extrinsic variability. One unavoidable source arising from the coupling of gene expression with population dynamics leads to a ubiquitous lower limit for expression variability. A second source, which is modelled as originating from a common upstream transcription factor, exemplifies how regulatory networks can convert noise in upstream regulator expression into extrinsic noise at the output of a target gene. Our results highlight the importance of the interplay of gene regulatory networks with population heterogeneity for understanding the origins of cellular diversity.
Origins of extrinsic variability in eukaryotic gene expression

NASA Astrophysics Data System (ADS)

Volfson, Dmitri; Marciniak, Jennifer; Blake, William J.; Ostroff, Natalie; Tsimring, Lev S.; Hasty, Jeff

2006-03-01

Variable gene expression within a clonal population of cells has been implicated in a number of important processes including mutation and evolution, determination of cell fates and the development of genetic disease. Recent studies have demonstrated that a significant component of expression variability arises from extrinsic factors thought to influence multiple genes in concert, yet the biological origins of this extrinsic variability have received little attention. Here we combine computational modeling with fluorescence data generated from multiple promoter-gene inserts in Saccharomyces cerevisiae to identify two major sources of extrinsic variability. One unavoidable source arising from the coupling of gene expression with population dynamics leads to a ubiquitous noise floor in expression variability. A second source which is modeled as originating from a common upstream transcription factor exemplifies how regulatory networks can convert noise in upstream regulator expression into extrinsic noise at the output of a target gene. Our results highlight the importance of the interplay of gene regulatory networks with population heterogeneity for understanding the origins of cellular diversity.
Functional network analysis of genes differentially expressed during xylogenesis in soc1ful woody Arabidopsis plants.

PubMed

Davin, Nicolas; Edger, Patrick P; Hefer, Charles A; Mizrachi, Eshchar; Schuetz, Mathias; Smets, Erik; Myburg, Alexander A; Douglas, Carl J; Schranz, Michael E; Lens, Frederic

2016-06-01

Many plant genes are known to be involved in the development of cambium and wood, but how the expression and functional interaction of these genes determine the unique biology of wood remains largely unknown. We used the soc1ful loss of function mutant - the woodiest genotype known in the otherwise herbaceous model plant Arabidopsis - to investigate the expression and interactions of genes involved in secondary growth (wood formation). Detailed anatomical observations of the stem in combination with mRNA sequencing were used to assess transcriptome remodeling during xylogenesis in wild-type and woody soc1ful plants. To interpret the transcriptome changes, we constructed functional gene association networks of differentially expressed genes using the STRING database. This analysis revealed functionally enriched gene association hubs that are differentially expressed in herbaceous and woody tissues. In particular, we observed the differential expression of genes related to mechanical stress and jasmonate biosynthesis/signaling during wood formation in soc1ful plants that may be an effect of greater tension within woody tissues. Our results suggest that habit shifts from herbaceous to woody life forms observed in many angiosperm lineages could have evolved convergently by genetic changes that modulate the gene expression and interaction network, and thereby redeploy the conserved wood developmental program. © 2016 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
SoxB1-driven transcriptional network underlies neural-specific interpretation of morphogen signals.

PubMed

Oosterveen, Tony; Kurdija, Sanja; Ensterö, Mats; Uhde, Christopher W; Bergsland, Maria; Sandberg, Magnus; Sandberg, Rickard; Muhr, Jonas; Ericson, Johan

2013-04-30

The reiterative deployment of a small cadre of morphogen signals underlies patterning and growth of most tissues during embryogenesis, but how such inductive events result in tissue-specific responses remains poorly understood. By characterizing cis-regulatory modules (CRMs) associated with genes regulated by Sonic hedgehog (Shh), retinoids, or bone morphogenetic proteins in the CNS, we provide evidence that the neural-specific interpretation of morphogen signaling reflects a direct integration of these pathways with SoxB1 proteins at the CRM level. Moreover, expression of SoxB1 proteins in the limb bud confers on mesodermal cells the potential to activate neural-specific target genes upon Shh, retinoid, or bone morphogenetic protein signaling, and the collocation of binding sites for SoxB1 and morphogen-mediatory transcription factors in CRMs faithfully predicts neural-specific gene activity. Thus, an unexpectedly simple transcriptional paradigm appears to conceptually explain the neural-specific interpretation of pleiotropic signaling during vertebrate development. Importantly, genes induced in a SoxB1-dependent manner appear to constitute repressive gene regulatory networks that are directly interlinked at the CRM level to constrain the regional expression of patterning genes. Accordingly, not only does the topology of SoxB1-driven gene regulatory networks provide a tissue-specific mode of gene activation, but it also determines the spatial expression pattern of target genes within the developing neural tube.
Past climate change on Sky Islands drives novelty in a core developmental gene network and its phenotype.

PubMed

Favé, Marie-Julie; Johnson, Robert A; Cover, Stefan; Handschuh, Stephan; Metscher, Brian D; Müller, Gerd B; Gopalan, Shyamalika; Abouheif, Ehab

2015-09-04

A fundamental and enduring problem in evolutionary biology is to understand how populations differentiate in the wild, yet little is known about what role organismal development plays in this process. Organismal development integrates environmental inputs with the action of gene regulatory networks to generate the phenotype. Core developmental gene networks have been highly conserved for millions of years across all animals, and therefore, organismal development may bias variation available for selection to work on. Biased variation may facilitate repeatable phenotypic responses when exposed to similar environmental inputs and ecological changes. To gain a more complete understanding of population differentiation in the wild, we integrated evolutionary developmental biology with population genetics, morphology, paleoecology and ecology. This integration was made possible by studying how populations of the ant species Monomorium emersoni respond to climatic and ecological changes across five 'Sky Islands' in Arizona, which are mountain ranges separated by vast 'seas' of desert. Sky Islands represent a replicated natural experiment allowing us to determine how repeatable is the response of M. emersoni populations to climate and ecological changes at the phenotypic, developmental, and gene network levels. We show that a core developmental gene network and its phenotype have kept pace with ecological and climate change on each Sky Island over the last ~90,000 years before present (BP). This response has produced two types of evolutionary change within an ant species: one type is unpredictable and contingent on the pattern of isolation of Sky Island populations by climate warming, resulting in slight changes in gene expression, organ growth, and morphology. The other type is predictable and deterministic, resulting in the repeated evolution of a novel wingless queen phenotype and its underlying gene network in response to habitat changes induced by climate warming.
Our findings reveal dynamics of developmental gene network evolution in wild populations. This holds important implications: (1) for understanding how phenotypic novelty is generated in the wild; (2) for providing a possible bridge between micro- and macroevolution; and (3) for understanding how development mediates the response of organisms to past, and potentially, future climate change.
Differential network as an indicator of osteoporosis with network entropy.

PubMed

Ma, Lili; Du, Hongmei; Chen, Guangdong

2018-07-01

Osteoporosis is a common skeletal disorder characterized by a decrease in bone mass and density. The peak bone mass (PBM) is a significant determinant of osteoporosis. To gain insight into the value of PBM as an indicator of osteoporosis, this study focused on characterizing the PBM networks and identifying key genes. A biological data set with 12 monocyte low-PBM samples and 11 high-PBM samples was used to construct protein-protein interaction networks (PPINs). A module-identification algorithm based on clique merging was used to identify modules from the PPINs. Systematic calculation and comparison were performed to test whether network entropy can discriminate the low-PBM network from the high-PBM network. We constructed 32 destination networks with 66 modules divided from the monocyte low- and high-PBM networks. Among them, network 11 was the only significantly differential one (P<0.05) with 8 nodes and 28 edges. All genes belonged to precursors of osteoclasts, which were related to calcium transport as well as blood monocytes. In conclusion, based on the entropy in PBM PPINs, the differential network appears to be a novel therapeutic indicator for osteoporosis during the bone monocyte progression; these findings are helpful in disclosing the pathogenetic mechanisms of osteoporosis.
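The abstract does not say how network entropy was defined, so the sketch below uses one common choice as an assumption: the Shannon entropy of the normalised edge weights around each node, averaged over the network. The function names and the unweighted test graph are illustrative only. If the network is a simple undirected graph, 8 nodes and 28 edges mean that every pair of nodes is connected, since 8 × 7 / 2 = 28.

```python
import math
from itertools import combinations

def local_entropy(weights):
    """Shannon entropy of the normalised (positive) edge weights around one node."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log(p) for p in probs)

def network_entropy(adjacency):
    """Mean normalised local entropy over all nodes with at least two neighbours.

    adjacency maps node -> {neighbour: weight}. Each local entropy is divided
    by log(degree), so 1.0 means perfectly uniform weights around every node.
    """
    values = []
    for nbrs in adjacency.values():
        if len(nbrs) < 2:
            continue
        values.append(local_entropy(list(nbrs.values())) / math.log(len(nbrs)))
    return sum(values) / len(values)

def complete_graph(n, weight=1.0):
    """Undirected complete graph on n nodes with equal edge weights."""
    adj = {i: {} for i in range(n)}
    for i, j in combinations(range(n), 2):
        adj[i][j] = weight
        adj[j][i] = weight
    return adj

k8 = complete_graph(8)
edge_count = sum(len(nbrs) for nbrs in k8.values()) // 2
```

With expression-derived edge weights in place of the uniform weights used here, the same function could be evaluated separately on the low-PBM and high-PBM versions of a module and the two values compared.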
Physiologically Shrinking the Solution Space of a Saccharomyces cerevisiae Genome-Scale Model Suggests the Role of the Metabolic Network in Shaping Gene Expression Noise.

PubMed

Chi, Baofang; Tao, Shiheng; Liu, Yanlin

2015-01-01

Sampling the solution space of genome-scale models is generally conducted to determine the feasible region for metabolic flux distribution. Because the region for actual metabolic states resides only in a small fraction of the entire space, it is necessary to shrink the solution space to improve the predictive power of a model. A common strategy is to constrain models by integrating extra datasets such as high-throughput datasets and C13-labeled flux datasets. However, studies refining these approaches by performing a meta-analysis of massive experimental metabolic flux measurements, which are closely linked to cellular phenotypes, are limited. In the present study, experimentally identified metabolic flux data from 96 published reports were systematically reviewed. Several strong associations among metabolic flux phenotypes were observed. These phenotype-phenotype associations at the flux level were quantified and integrated into a Saccharomyces cerevisiae genome-scale model as extra physiological constraints. By sampling the shrunken solution space of the model, the metabolic flux fluctuation level, which is an intrinsic trait of metabolic reactions determined by the network, was estimated and utilized to explore its relationship to gene expression noise. Although no correlation was observed across all enzyme-coding genes, a relationship between metabolic flux fluctuation and expression noise of genes associated with enzyme-dosage sensitive reactions was detected, suggesting that the metabolic network plays a role in shaping gene expression noise. Such correlation was mainly attributed to the genes corresponding to non-essential reactions, rather than essential ones. This was, at least partially, due to regulation underlying the flux phenotype-phenotype associations. Altogether, this study proposes a new approach to shrinking the solution space of a genome-scale model, sampling of which provides new insights into gene expression noise.
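The idea of shrinking a solution space with an extra physiological constraint and then estimating flux fluctuation from samples can be shown on a toy example. Everything in the sketch is an assumption made for illustration: the three-reaction branch point, the bound `uptake_max`, the ratio constraint `ratio_max`, and the use of the coefficient of variation as the fluctuation measure. Plain rejection sampling is used for brevity; it would be far too inefficient for a genome-scale model.

```python
import random
import statistics

def sample_fluxes(n, uptake_max=10.0, ratio_max=None, seed=0):
    """Rejection-sample steady-state fluxes of a toy branch point.

    Hypothetical network: uptake v1 produces metabolite A, which is consumed
    by v2 and v3, so steady state requires v1 = v2 + v3. If ratio_max is set,
    the extra constraint v3 <= ratio_max * v2 is applied, standing in for a
    phenotype-phenotype association between fluxes.
    """
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        v2 = rng.uniform(0.0, uptake_max)
        v3 = rng.uniform(0.0, uptake_max)
        v1 = v2 + v3                      # mass balance on A
        if v1 > uptake_max:
            continue                      # violates the uptake bound
        if ratio_max is not None and v3 > ratio_max * v2:
            continue                      # violates the extra constraint
        samples.append((v1, v2, v3))
    return samples

def fluctuation(samples, index):
    """Coefficient of variation of one flux across the sampled space."""
    values = [s[index] for s in samples]
    return statistics.pstdev(values) / statistics.mean(values)

unconstrained = sample_fluxes(2000)
constrained = sample_fluxes(2000, ratio_max=0.2)
```

Calling `fluctuation(unconstrained, 2)` and `fluctuation(constrained, 2)` gives the per-reaction fluctuation of v3 before and after shrinking, which is the kind of quantity the study relates to gene expression noise.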
Mutual regulatory interactions of the trunk gap genes during blastoderm patterning in the hemipteran Oncopeltus fasciatus.

PubMed

Ben-David, Jonathan; Chipman, Ariel D

2010-10-01

The early embryo of the milkweed bug, Oncopeltus fasciatus, appears as a single cell layer - the embryonic blastoderm - covering the entire egg. It is at this blastoderm stage that morphological domains are first determined, long before the appearance of overt segmentation. Central to the process of patterning the blastoderm into distinct domains are a group of transcription factors known as gap genes. In Drosophila melanogaster these genes form a network of interactions, and maintain sharp expression boundaries through strong mutual repression. Their restricted expression domains define specific areas along the entire body. We have studied the expression domains of the four trunk gap gene homologues in O. fasciatus and have determined their interactions through dsRNA gene knockdown experiments, followed by expression analyses. While the blastoderm in O. fasciatus includes only the first six segments of the embryo, the expression domains of the gap genes within these segments are broadly similar to those in Drosophila where the blastoderm includes all 15 segments. However, the interactions between the gap genes are surprisingly different from those in Drosophila, and mutual repression between the genes seems to play a much less significant role. This suggests that the well-studied interaction pattern in Drosophila is evolutionarily derived, and has evolved from a less strongly interacting network. Copyright © 2010 Elsevier Inc. All rights reserved.
SorghumFDB: sorghum functional genomics database with multidimensional network analysis.

PubMed

Tian, Tian; You, Qi; Zhang, Liwei; Yi, Xin; Yan, Hengyu; Xu, Wenying; Su, Zhen

2016-01-01

Sorghum (Sorghum bicolor [L.] Moench) has excellent agronomic traits and biological properties, such as heat and drought tolerance. It is a C4 grass and potential bioenergy-producing plant, which makes it an important crop worldwide. With the sorghum genome sequence released, it is essential to establish a sorghum functional genomics data mining platform. We collected genomic data and some functional annotations to construct a sorghum functional genomics database (SorghumFDB). SorghumFDB integrated knowledge of sorghum gene family classifications (transcription regulators/factors, carbohydrate-active enzymes, protein kinases, ubiquitins, cytochrome P450, monolignol biosynthesis related enzymes, R-genes and organelle-genes), detailed gene annotations, miRNA and target gene information, orthologous pairs in the model plants Arabidopsis, rice and maize, gene loci conversions and a genome browser. We further constructed a dynamic network of multidimensional biological relationships, comprised of the co-expression data, protein-protein interactions and miRNA-target pairs. We took effective measures to combine the network, gene set enrichment and motif analyses to determine the key regulators that participate in related metabolic pathways, such as the lignin pathway, which is a major biological process in bioenergy-producing plants. Database URL: http://structuralbiology.cau.edu.cn/sorghum/index.html. © The Author(s) 2016. Published by Oxford University Press.
Population Connectivity Measures of Fishery-Targeted Coral Reef Species to Inform Marine Reserve Network Design in Fiji.

PubMed

Eastwood, Erin K; López, Elora H; Drew, Joshua A

2016-01-25

Coral reef fish serve as food sources to coastal communities worldwide, yet are vulnerable to mounting anthropogenic pressures like overfishing and climate change. Marine reserve networks have become important tools for mitigating these pressures, and one of the most critical factors in determining their spatial design is the degree of connectivity among different populations of species prioritized for protection. To help inform the spatial design of an expanded reserve network in Fiji, we used rapidly evolving mitochondrial genes to investigate connectivity patterns of three coral reef species targeted by fisheries in Fiji: Epinephelus merra (Serranidae), Halichoeres trimaculatus (Labridae), and Holothuria atra (Holothuriidae). The two fish species, E. merra and Ha. trimaculatus, exhibited low genetic structuring and high amounts of gene flow, whereas the sea cucumber Ho. atra displayed high genetic partitioning and predominantly westward gene flow. The idiosyncratic patterns observed among these species indicate that patterns of connectivity in Fiji are likely determined by a combination of oceanographic and ecological characteristics. Our data indicate that in the cases of species with high connectivity, other factors such as representation or political availability may dictate where reserves are placed. In low connectivity species, ensuring upstream and downstream connections is critical.
Gene expression profiles reveal key genes for early diagnosis and treatment of adamantinomatous craniopharyngioma.

PubMed

Yang, Jun; Hou, Ziming; Wang, Changjiang; Wang, Hao; Zhang, Hongbing

2018-04-23

Adamantinomatous craniopharyngioma (ACP) is an aggressive brain tumor that occurs predominantly in the pediatric population. Conventional diagnostic methods and standard therapies cannot treat ACPs effectively. In this paper, we aimed to identify key genes for ACP early diagnosis and treatment. Datasets GSE94349 and GSE68015 were obtained from the Gene Expression Omnibus database. Consensus clustering was applied to discover the gene clusters in the expression data of GSE94349 and functional enrichment analysis was performed on the gene set in each cluster. The protein-protein interaction (PPI) network was built by the Search Tool for the Retrieval of Interacting Genes, and hubs were selected. A support vector machine (SVM) model was built based on the signature genes identified from enrichment analysis and PPI network. Dataset GSE94349 was used for training and testing, and GSE68015 was used for validation. Besides, RT-qPCR analysis was performed to analyze the expression of signature genes in ACP samples compared with normal controls. Seven gene clusters were discovered in the differentially expressed genes identified from the GSE94349 dataset. Enrichment analysis of each cluster identified 25 pathways that were highly associated with ACP. A PPI network was built and 46 hubs were determined. Twenty-five pathway-related genes that overlapped with the hubs in the PPI network were used as signatures to establish the SVM diagnosis model for ACP. The prediction accuracies of the SVM model for training, testing, and validation data were 94, 85, and 74%, respectively. The expression of CDH1, CCL2, ITGA2, COL8A1, COL6A2, and COL6A3 was significantly upregulated in ACP tumor samples, while CAMK2A, RIMS1, NEFL, SYT1, and STX1A were significantly downregulated, which was consistent with the differentially expressed gene analysis. The SVM model is a promising classification tool for screening and early diagnosis of ACP.
The ACP-related pathways and signature genes will advance our knowledge of ACP pathogenesis and help improve therapy.
Candidate gene prioritization by network analysis of differential expression using machine learning approaches

PubMed Central

2010-01-01

Background Discovering novel disease genes is still challenging for diseases for which no prior knowledge - such as known disease genes or disease-related pathways - is available. Performing genetic studies frequently results in large lists of candidate genes of which only few can be followed up for further investigation. We have recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge by experimental data of differential gene expression between affected and healthy individuals. To improve the performance of our prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network. Results We have proposed three strategies scoring disease candidate genes relying on network-based machine learning approaches, such as kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison purposes, a local measure based on the expression of the direct neighbors is also computed. We have benchmarked these strategies on 40 publicly available knockout experiments in mice, and performance was assessed against results obtained using a standard procedure in genetics that ranks candidate genes based solely on their differential expression levels (Simple Expression Ranking). Our results showed that our four strategies could outperform this standard procedure and that the best results were obtained using the Heat Kernel Diffusion Ranking leading to an average ranking position of 8 out of 100 genes, an AUC value of 92.3% and an error reduction of 52.8% relative to the standard procedure approach which ranked the knockout gene on average at position 17 with an AUC value of 83.7%. 
Conclusion In this study we could identify promising candidate genes using network-based machine learning approaches even if no knowledge is available about the disease or phenotype. PMID:20840752
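The core idea of scoring a candidate by the differential expression of its network neighbourhood can be sketched as heat diffusion on a graph. This is a simplified illustration and not the authors' implementation: the gene names, edges, expression scores, `beta`, and the explicit Euler approximation of exp(-beta L) are all assumptions made here.

```python
def heat_diffusion_scores(edges, expression, beta=0.5, n_steps=100):
    """Diffuse differential-expression scores over an undirected network.

    Approximates exp(-beta * L) applied to x, where L is the graph Laplacian
    and x the per-gene expression score, by n_steps explicit Euler steps.
    Every gene named in edges must also appear in expression.
    """
    neighbours = {g: set() for g in expression}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    x = dict(expression)
    dt = beta / n_steps
    for _ in range(n_steps):
        # (L x)_g = degree(g) * x_g - sum of x over the neighbours of g
        lx = {g: len(nb) * x[g] - sum(x[h] for h in nb)
              for g, nb in neighbours.items()}
        x = {g: x[g] - dt * lx[g] for g in x}
    return x

# Hypothetical candidates: G1 is not differentially expressed itself but sits
# next to strongly changed genes; G5 is connected only to a nearly unchanged gene.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G5", "G6")]
expression = {"G1": 0.0, "G2": 3.0, "G3": 2.5, "G4": 2.0, "G5": 0.0, "G6": 0.1}
scores = heat_diffusion_scores(edges, expression)
ranking = sorted(["G1", "G5"], key=scores.get, reverse=True)
```

Ranking by raw expression alone would tie G1 and G5 at zero, whereas the diffused scores place G1 first because of its neighbourhood. The reported error reduction is also consistent with the AUC values in the abstract: the error falls from 1 − 0.837 = 0.163 to 1 − 0.923 = 0.077, a relative reduction of about 52.8%.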
An Arabidopsis Gene Regulatory Network for Secondary Cell Wall Synthesis

PubMed Central

Taylor-Teeples, M; Lin, L; de Lucas, M; Turco, G; Toal, TW; Gaudinier, A; Young, NF; Trabucco, GM; Veling, MT; Lamothe, R; Handakumbura, PP; Xiong, G; Wang, C; Corwin, J; Tsoukalas, A; Zhang, L; Ware, D; Pauly, M; Kliebenstein, DJ; Dehesh, K; Tagkopoulos, I; Breton, G; Pruneda-Paz, JL; Ahnert, SE; Kay, SA; Hazen, SP; Brady, SM

2014-01-01

Summary The plant cell wall is an important factor for determining cell shape, function and response to the environment. Secondary cell walls, such as those found in xylem, are composed of cellulose, hemicelluloses and lignin and account for the bulk of plant biomass. The coordination between transcriptional regulation of synthesis for each polymer is complex and vital to cell function. A regulatory hierarchy of developmental switches has been proposed, although the full complement of regulators remains unknown. Here, we present a protein-DNA network between Arabidopsis transcription factors and secondary cell wall metabolic genes with gene expression regulated by a series of feed-forward loops. This model allowed us to develop and validate new hypotheses about secondary wall gene regulation under abiotic stress. Distinct stresses are able to perturb targeted genes to potentially promote functional adaptation. These interactions will serve as a foundation for understanding the regulation of a complex, integral plant component. PMID:25533953
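A feed-forward loop is a three-node motif in which a regulator X controls a second regulator Y, and both control a common target Z. The sketch below enumerates such motifs in a directed edge list. The regulator and gene names are hypothetical placeholders and not interactions reported in the paper.

```python
from itertools import permutations

def feed_forward_loops(edges):
    """Return all (X, Y, Z) with X -> Y, Y -> Z and X -> Z in a directed network."""
    targets = {}
    for src, dst in edges:
        targets.setdefault(src, set()).add(dst)
    nodes = set(targets) | {d for ds in targets.values() for d in ds}
    loops = []
    for x, y, z in permutations(sorted(nodes), 3):
        if (y in targets.get(x, ())
                and z in targets.get(y, ())
                and z in targets.get(x, ())):
            loops.append((x, y, z))
    return loops

# Hypothetical regulator and target names, for illustration only.
edges = [
    ("TF_A", "TF_B"),
    ("TF_A", "wall_gene_1"),
    ("TF_B", "wall_gene_1"),
    ("TF_B", "wall_gene_2"),
]
loops = feed_forward_loops(edges)
```

The exhaustive scan over ordered triples is cubic in the number of nodes, which is acceptable for a small example; a large protein-DNA network would call for a neighbour-based search instead.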
Genome Neighborhood Network Reveals Insights into Enediyne Biosynthesis and Facilitates Prediction and Prioritization for Discovery

PubMed Central

Rudolf, Jeffrey D.; Yan, Xiaohui; Shen, Ben

2015-01-01

The enediynes are one of the most fascinating families of bacterial natural products given their unprecedented molecular architecture and extraordinary cytotoxicity. Enediynes are rare with only 11 structurally characterized members and four additional members isolated in their cycloaromatized form. Recent advances in DNA sequencing have resulted in an explosion of microbial genomes. A virtual survey of the GenBank and JGI genome databases revealed 87 enediyne biosynthetic gene clusters from 78 bacteria strains, implying enediynes are more common than previously thought. Here we report the construction and analysis of an enediyne genome neighborhood network (GNN) as a high-throughput approach to analyze secondary metabolite gene clusters. Analysis of the enediyne GNN facilitated rapid gene cluster annotation, revealed genetic trends in enediyne biosynthetic gene clusters resulting in a simple prediction scheme to determine 9- vs 10-membered enediyne gene clusters, and supported a genomic-based strain prioritization method for enediyne discovery. PMID:26318027

Whole-exome sequencing in obsessive-compulsive disorder identifies rare mutations in immunological and neurodevelopmental pathways

PubMed Central

Cappi, C; Brentani, H; Lima, L; Sanders, S J; Zai, G; Diniz, B J; Reis, V N S; Hounie, A G; Conceição do Rosário, M; Mariani, D; Requena, G L; Puga, R; Souza-Duran, F L; Shavitt, R G; Pauls, D L; Miguel, E C; Fernandez, T V

2016-01-01

Studies of rare genetic variation have identified molecular pathways conferring risk for developmental neuropsychiatric disorders. To date, no published whole-exome sequencing studies have been reported in obsessive-compulsive disorder (OCD). We sequenced all the genome coding regions in 20 sporadic OCD cases and their unaffected parents to identify rare de novo (DN) single-nucleotide variants (SNVs). The primary aim of this pilot study was to determine whether DN variation contributes to OCD risk. To this aim, we evaluated whether there is an elevated rate of DN mutations in OCD, which would justify this approach toward gene discovery in larger studies of the disorder. Furthermore, to explore functional molecular correlations among genes with nonsynonymous DN SNVs in OCD probands, a protein–protein interaction (PPI) network was generated based on databases of direct molecular interactions. We applied Degree-Aware Disease Gene Prioritization (DADA) to rank the PPI network genes based on their relatedness to a set of OCD candidate genes from two OCD genome-wide association studies (Stewart et al., 2013; Mattheisen et al., 2014). In addition, we performed a pathway analysis with genes from the PPI network. The rate of DN SNVs in OCD was 2.51 × 10−8 per base per generation, significantly higher than a previous estimated rate in unaffected subjects using the same sequencing platform and analytic pipeline. Several genes harboring DN SNVs in OCD were highly interconnected in the PPI network and ranked high in the DADA analysis. Nearly all the DN SNVs in this study are in genes expressed in the human brain, and a pathway analysis revealed enrichment in immunological and central nervous system functioning and development. 
The results of this pilot study indicate that further investigation of DN variation in larger OCD cohorts is warranted to identify specific risk genes and to confirm our preliminary finding with regard to PPI network enrichment for particular biological pathways and functions. PMID:27023170
A transcription factor hierarchy defines an environmental stress response network.

PubMed

Song, Liang; Huang, Shao-Shan Carol; Wise, Aaron; Castanon, Rosa; Nery, Joseph R; Chen, Huaming; Watanabe, Marina; Thomas, Jerushah; Bar-Joseph, Ziv; Ecker, Joseph R

2016-11-04

Environmental stresses are universally encountered by microbes, plants, and animals. Yet systematic studies of stress-responsive transcription factor (TF) networks in multicellular organisms have been limited. The phytohormone abscisic acid (ABA) influences the expression of thousands of genes, allowing us to characterize complex stress-responsive regulatory networks. Using chromatin immunoprecipitation sequencing, we identified genome-wide targets of 21 ABA-related TFs to construct a comprehensive regulatory network in Arabidopsis thaliana. Determinants of dynamic TF binding and a hierarchy among TFs were defined, illuminating the relationship between differential gene expression patterns and ABA pathway feedback regulation. By extrapolating regulatory characteristics of observed canonical ABA pathway components, we identified a new family of transcriptional regulators modulating ABA and salt responsiveness and demonstrated their utility to modulate plant resilience to osmotic stress. Copyright © 2016, American Association for the Advancement of Science.
Construction and analysis of gene-gene dynamics influence networks based on a Boolean model.

PubMed

Mazaya, Maulida; Trinh, Hung-Cuong; Kwon, Yung-Keun

2017-12-21

Identification of novel gene-gene relations is crucial for understanding system-level biological phenomena. To this end, many methods based on correlation analysis of gene expression or structural analysis of molecular interaction networks have been proposed. They are limited, however, in identifying more complicated gene-gene dynamical relations. To overcome this limitation, we proposed a measure to quantify a gene-gene dynamical influence (GDI) using a Boolean network model and constructed a GDI network to indicate the existence of a dynamical influence for every ordered pair of genes. It represents how much a state trajectory of a target gene is changed by a knockout mutation of a source gene in a gene-gene molecular interaction (GMI) network. Through a topological comparison between GDI and GMI networks, we observed that the former network is denser than the latter, which implies that there exist many gene pairs that influence each other dynamically without interacting molecularly. In addition, a larger number of hub genes were generated in the GDI network. On the other hand, the two networks were correlated in that the degree values of a node in the GDI and GMI networks were positively correlated. We further investigated the relationships of the GDI value with structural properties and found that it is negatively correlated with the length of a shortest path and positively correlated with the number of paths. In addition, a GDI network could predict a set of genes whose steady-state expression is affected in E. coli gene-knockout experiments. More interestingly, we found that the drug targets with side effects have a larger number of outgoing links than the other genes in the GDI network, which implies that they are more likely to influence the dynamics of other genes.
Finally, we found biological evidence that gene pairs which are not molecularly interacting but are dynamically influential can be considered novel gene-gene relationships. Taken together, construction and analysis of the GDI network can be a useful approach to identifying novel gene-gene relationships in terms of dynamical influence.
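The knockout-based influence idea described in this record can be illustrated with a minimal sketch. The three-gene chain, the synchronous update scheme, and the scoring rule (fraction of trajectory points at which the target differs between wild type and knockout) are all assumptions made for illustration; the abstract does not give the authors' exact GDI formula.

```python
from itertools import product

# Hypothetical three-gene chain A -> B -> C (not taken from the paper).
# Each rule maps the current network state to that gene's next value.
RULES = {
    "A": lambda s: s["A"],  # A keeps its own value (acts as an input)
    "B": lambda s: s["A"],  # A activates B
    "C": lambda s: s["B"],  # B activates C; A and C share no direct link
}

def simulate(rules, init, steps, knockout=None):
    """Synchronously update all genes; a knocked-out gene is held at False."""
    state = dict(init)
    if knockout is not None:
        state[knockout] = False
    trajectory = [state]
    for _ in range(steps):
        state = {gene: bool(rule(state)) for gene, rule in rules.items()}
        if knockout is not None:
            state[knockout] = False
        trajectory.append(state)
    return trajectory

def influence(rules, source, target, steps=5):
    """Illustrative score: fraction of trajectory points, over all initial
    states, where the target differs between wild type and source knockout."""
    genes = sorted(rules)
    differing = 0
    total = 0
    for values in product([False, True], repeat=len(genes)):
        init = dict(zip(genes, values))
        wild_type = simulate(rules, init, steps)
        mutant = simulate(rules, init, steps, knockout=source)
        differing += sum(w[target] != m[target] for w, m in zip(wild_type, mutant))
        total += len(wild_type)
    return differing / total

print(influence(RULES, "A", "C"))  # nonzero: A influences C only through B
print(influence(RULES, "C", "A"))  # zero: nothing downstream of C reaches A
```

In this toy chain, A and C have no direct interaction yet A still receives a nonzero score against C, which is the kind of "dynamically influencing but molecularly non-interacting" pair the abstract refers to.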
Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

PubMed Central

Li, Xia; Rao, Shaoqi; Jiang, Wei; Li, Chuanxing; Xiao, Yun; Guo, Zheng; Zhang, Qingpu; Wang, Lihong; Du, Lei; Li, Jing; Li, Li; Zhang, Tianwen; Wang, Qing K

2006-01-01

Background It is one of the ultimate goals of modern biological research to fully elucidate the intricate interplays and regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, such as cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. A vast amount of large-scale, genome-wide time-resolved data is becoming increasingly available, which provides a golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox provides a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for cell cycling. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets.
In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex gene regulations related to the development, aging and progressive pathogenesis of a complex disease, where potential dependences between different experiment units might occur. PMID:16420705
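The notion of a delayed correlation, i.e. correlating one expression profile with a time-shifted copy of another, can be sketched as follows. This is only an illustration of the general idea: the function names and the toy profiles are hypothetical, and the sketch does not reproduce the TdGRN toolbox or its decision analysis.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def lagged_correlation(regulator, target, lag):
    """Correlate regulator[t] with target[t + lag] over the overlapping window."""
    if lag == 0:
        return pearson(regulator, target)
    return pearson(regulator[:-lag], target[lag:])

def best_lag(regulator, target, max_lag):
    """Return (lag, correlation) with the largest absolute correlation."""
    scored = [(lag, lagged_correlation(regulator, target, lag))
              for lag in range(max_lag + 1)]
    return max(scored, key=lambda item: abs(item[1]))

# Toy profiles (made up): the target repeats the regulator two time points later.
regulator = [1, 3, 2, 5, 4, 6, 2, 1, 4, 3]
target = [0, 0] + regulator[:-2]

lag, r = best_lag(regulator, target, max_lag=4)
print(lag, round(r, 3))
```

Scanning a small range of lags and keeping the strongest one is a simple design choice; with short time courses the overlap shrinks as the lag grows, so large lags rest on few points and should be interpreted cautiously.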
Molecular determinants of caste differentiation in the highly eusocial honeybee Apis mellifera.

PubMed

Barchuk, Angel R; Cristino, Alexandre S; Kucharski, Robert; Costa, Luciano F; Simões, Zilá L P; Maleszka, Ryszard

2007-06-18

In honeybees, differential feeding of female larvae promotes the occurrence of two different phenotypes, a queen and a worker, from identical genotypes, through incremental alterations, which affect general growth, and character state alterations that result in the presence or absence of specific structures. Although previous studies revealed a link between incremental alterations and differential expression of physiometabolic genes, the molecular changes accompanying character state alterations remain unknown. By using cDNA microarray analyses of >6,000 Apis mellifera ESTs, we found 240 differentially expressed genes (DEGs) between developing queens and workers. Many genes recorded as up-regulated in prospective workers appear to be unique to A. mellifera, suggesting that the workers' developmental pathway involves the participation of novel genes. Workers up-regulate more developmental genes than queens, whereas queens up-regulate a greater proportion of physiometabolic genes, including genes coding for metabolic enzymes and genes whose products are known to regulate the rate of mass-transforming processes and the general growth of the organism (e.g., tor). Many DEGs are likely to be involved in processes favoring the development of caste-biased structures, like brain, legs and ovaries, as well as genes that code for cytoskeleton constituents. Treatment of developing worker larvae with juvenile hormone (JH) revealed 52 JH responsive genes, specifically during the critical period of caste development. Using Gibbs sampling and Expectation Maximization algorithms, we discovered eight overrepresented cis-elements from four gene groups. Graph theory and complex networks concepts were adopted to attain powerful graphical representations of the interrelation between cis-elements and genes and objectively quantify the degree of relationship between these entities. 
We suggest that clusters of functionally related DEGs are co-regulated during caste development in honeybees. This network of interactions is activated by nutrition-driven stimuli in early larval stages. Our data are consistent with the hypothesis that JH is a key component of the developmental determination of queen-like characters. Finally, we propose a conceptual model of caste differentiation in A. mellifera based on gene-regulatory networks.
Molecular determinants of caste differentiation in the highly eusocial honeybee Apis mellifera

PubMed Central

Barchuk, Angel R; Cristino, Alexandre S; Kucharski, Robert; Costa, Luciano F; Simões, Zilá LP; Maleszka, Ryszard

2007-01-01

Background In honeybees, differential feeding of female larvae promotes the occurrence of two different phenotypes, a queen and a worker, from identical genotypes, through incremental alterations, which affect general growth, and character state alterations that result in the presence or absence of specific structures. Although previous studies revealed a link between incremental alterations and differential expression of physiometabolic genes, the molecular changes accompanying character state alterations remain unknown. Results By using cDNA microarray analyses of >6,000 Apis mellifera ESTs, we found 240 differentially expressed genes (DEGs) between developing queens and workers. Many genes recorded as up-regulated in prospective workers appear to be unique to A. mellifera, suggesting that the workers' developmental pathway involves the participation of novel genes. Workers up-regulate more developmental genes than queens, whereas queens up-regulate a greater proportion of physiometabolic genes, including genes coding for metabolic enzymes and genes whose products are known to regulate the rate of mass-transforming processes and the general growth of the organism (e.g., tor). Many DEGs are likely to be involved in processes favoring the development of caste-biased structures, like brain, legs and ovaries, as well as genes that code for cytoskeleton constituents. Treatment of developing worker larvae with juvenile hormone (JH) revealed 52 JH responsive genes, specifically during the critical period of caste development. Using Gibbs sampling and Expectation Maximization algorithms, we discovered eight overrepresented cis-elements from four gene groups. Graph theory and complex networks concepts were adopted to attain powerful graphical representations of the interrelation between cis-elements and genes and objectively quantify the degree of relationship between these entities. 
Conclusion We suggest that clusters of functionally related DEGs are co-regulated during caste development in honeybees. This network of interactions is activated by nutrition-driven stimuli in early larval stages. Our data are consistent with the hypothesis that JH is a key component of the developmental determination of queen-like characters. Finally, we propose a conceptual model of caste differentiation in A. mellifera based on gene-regulatory networks. PMID:17577409
Male sex determination: insights into molecular mechanisms

PubMed Central

McClelland, Kathryn; Bowles, Josephine; Koopman, Peter

2012-01-01

Disorders of sex development often arise from anomalies in the molecular or cellular networks that guide the differentiation of the embryonic gonad into either a testis or an ovary, two functionally distinct organs. The activation of the Y-linked gene Sry (sex-determining region Y) and its downstream target Sox9 (Sry box-containing gene 9) triggers testis differentiation by stimulating the differentiation of Sertoli cells, which then direct testis morphogenesis. Once engaged, a genetic pathway promotes the testis development while actively suppressing genes involved in ovarian development. This review focuses on the events of testis determination and the struggle to maintain male fate in the face of antagonistic pressure from the underlying female programme. PMID:22179516
A study of structural properties of gene network graphs for mathematical modeling of integrated mosaic gene networks.

PubMed

Petrovskaya, Olga V; Petrovskiy, Evgeny D; Lavrik, Inna N; Ivanisenko, Vladimir A

2017-04-01

Gene network modeling is one of the widely used approaches in systems biology. It allows for the study of complex genetic systems function, including so-called mosaic gene networks, which consist of functionally interacting subnetworks. We conducted a study of a mosaic gene networks modeling method based on integration of models of gene subnetworks by linear control functionals. An automatic modeling of 10,000 synthetic mosaic gene regulatory networks was carried out using computer experiments on gene knockdowns/knockouts. Structural analysis of graphs of generated mosaic gene regulatory networks has revealed that the most important factor for building accurate integrated mathematical models, among those analyzed in the study, is data on expression of genes corresponding to the vertices with high properties of centrality.
Chemical-Gene Interactions from ToxCast Bioactivity Data ...

EPA Pesticide Factsheets

Characterizing the effects of chemicals in biological systems is often summarized by chemical-gene interactions, which have sparse coverage in the literature. The ToxCast chemical screening program has produced bioactivity data for nearly 2000 chemicals and over 450 gene targets. To evaluate the information gained from the ToxCast project, a ToxCast bioactivity network was created comprising ToxCast chemical-gene interactions based on assay data and compared to a chemical-gene association network from literature. The literature network was compiled from PubMed articles, excluding ToxCast publications, mapped to genes and chemicals. Genes were identified by curated associations available from NCBI while chemicals were identified by PubChem submissions. The frequencies of chemical-gene associations from the literature network were log-scaled and then compared to the ToxCast bioactivity network. In total, 140 times more chemical-gene associations were present in the ToxCast network in comparison to the literature-derived network highlighting the vast increase in chemical-gene interactions putatively elucidated by the ToxCast research program. There were 165 associations found in the literature network that were reproduced by ToxCast bioactivity data, and 336 associations in the literature network were not reproduced by the ToxCast bioactivity network. The literature network relies on the assumption that chemical-gene associations represent a true chemical-gene inte
Competition among gene regulatory networks imposes order within the eye-antennal disc of Drosophila

PubMed Central

Weasner, Bonnie M.; Kumar, Justin P.

2013-01-01

The eye-antennal disc of Drosophila gives rise to numerous adult tissues, including the compound eyes, ocelli, antennae, maxillary palps and surrounding head capsule. The fate of each tissue is governed by the activity of unique gene regulatory networks (GRNs). The fate of the eye, for example, is controlled by a set of fourteen interlocking genes called the retinal determination (RD) network. Mutations within network members lead to replacement of the eyes with head capsule. Several studies have suggested that in these instances all retinal progenitor and precursor cells are eliminated via apoptosis and as a result the surrounding head capsule proliferates to compensate for retinal tissue loss. This model implies that the sole responsibility of the RD network is to promote the fate of the eye. We have re-analyzed eyes absent mutant discs and propose an alternative model. Our data suggests that in addition to promoting an eye fate the RD network simultaneously functions to actively repress GRNs that are responsible for directing antennal and head capsule fates. Compromising the RD network leads to the inappropriate expression of several head capsule selector genes such as cut, Lim1 and wingless. Instead of undergoing apoptosis, a population of mutant retinal progenitors and precursor cells adopt a head capsule fate. This transformation is accompanied by an adjustment of cell proliferation rates such that just enough head capsule is generated to produce an intact adult head. We propose that GRNs simultaneously promote primary fates, inhibit alternative fates and establish cell proliferation states. PMID:23222441
Epigenetic determinants of ovarian clear cell carcinoma biology

PubMed Central

Yamaguchi, Ken; Huang, Zhiqing; Matsumura, Noriomi; Mandai, Masaki; Okamoto, Takako; Baba, Tsukasa; Konishi, Ikuo; Berchuck, Andrew; Murphy, Susan K.

2015-01-01

Targeted approaches have revealed frequent epigenetic alterations in ovarian cancer, but the scope and relation of these changes to histologic subtype of disease is unclear. Genome-wide methylation and expression data for 14 clear cell carcinoma (CCC), 32 non-CCC, and 4 corresponding normal cell lines were generated to determine how methylation profiles differ between cells of different histological derivations of ovarian cancer. Consensus clustering showed that CCC is epigenetically distinct. Inverse relationships between expression and methylation in CCC were identified, suggesting functional regulation by methylation, and included 22 hypomethylated (UM) genes and 276 hypermethylated (HM) genes. Categorical and pathway analyses indicated that the CCC-specific UM genes were involved in response to stress and many contain hepatocyte nuclear factor (HNF) 1 binding sites, while the CCC-specific HM genes included members of the estrogen receptor alpha (ERalpha) network and genes involved in tumor development. We independently validated the methylation status of 17 of these pathway-specific genes, and confirmed increased expression of HNF1 network genes and repression of ERalpha pathway genes in CCC cell lines and primary cancer tissues relative to non-CCC specimens. Treatment of three CCC cell lines with the demethylating agent Decitabine significantly induced expression for all five genes analyzed. Coordinate changes in pathway expression were confirmed using two primary ovarian cancer datasets (p<0.0001 for both). Our results suggest that methylation regulates specific pathways and biological functions in CCC, with hypomethylation influencing the characteristic biology of the disease while hypermethylation contributes to the carcinogenic process. PMID:24382740
Inferring dynamic gene regulatory networks in cardiac differentiation through the integration of multi-dimensional data.

PubMed

Gong, Wuming; Koyano-Nakagawa, Naoko; Li, Tongbin; Garry, Daniel J

2015-03-07

Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoters or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p < 10⁻¹⁰⁰), suggesting that a high LR score is a reliable indicator for functional TF binding sites. Our network inference model identified a region with an elevated LR score approximately -9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (-9435 to -8922 bp).
TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP-CM transitions. We report a novel method to systematically integrate multi-dimensional -omics data and reconstruct the gene regulatory networks. This method will allow one to rapidly determine the cis-modules that regulate key genes during cardiac differentiation.
Identifying significant genetic regulatory networks in the prostate cancer from microarray data based on transcription factor analysis and conditional independency

PubMed Central

2009-01-01

Background Prostate cancer is a leading cancer worldwide and is characterized by aggressive metastasis. Reflecting its clinical heterogeneity, prostate cancer displays different stages and grades related to aggressive metastatic disease. Although numerous studies have used microarray analysis and traditional clustering methods to identify individual genes involved in the disease process, the important gene regulations remain unclear. We present a computational method for automatically inferring genetic regulatory networks from microarray data with transcription factor analysis and conditional independence testing, in order to explore the potentially significant gene regulatory networks that are correlated with cancer, tumor grade and stage in prostate cancer. Results To deal with missing values in microarray data, we used a K-nearest-neighbors (KNN) algorithm to determine the precise expression values. We applied web services technology to wrap the bioinformatics toolkits and databases to automatically extract the promoter regions of DNA sequences and predict the transcription factors that regulate gene expression. We adopted a microarray dataset consisting of 62 primary tumors and 41 normal prostate tissues from the Stanford Microarray Database (SMD) as the target dataset to evaluate our method. The predicted results identified possible biomarker genes related to cancer and indicated that androgen functions and processes may be involved in the development of prostate cancer and promote cell death in the cell cycle. Our predicted results showed that sub-networks of genes SREBF1, STAT6 and PBX1 are related to a high extent, while ETS transcription factors ELK1, JUN and EGR2 are related to a low extent. Gene SLC22A3 may explain clinically the differentiation associated with high grade cancer compared with low grade cancer. Enhancer of Zeste Homolog 2 (EZH2), regulated by RUNX1 and STAT3, is correlated with the pathological stage.
Conclusions We provide a computational framework to reconstruct the genetic regulatory network from microarray data using biological knowledge and constraint-based inferences. Our method is helpful in verifying possible interaction relations in gene regulatory networks and filtering out incorrect relations inferred by imperfect methods. We not only predicted individual genes related to cancer but also discovered significant gene regulatory networks. Our method was also validated against several published papers and databases, and the significant gene regulatory networks perform critical biological functions and processes including cell adhesion molecules, androgen and estrogen metabolism, smooth muscle contraction, and GO-annotated processes. These significant gene regulations and the critical concept of tumor progression are useful for understanding cancer biology and disease treatment. PMID:20025723
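The record above mentions K-nearest-neighbors handling of missing microarray values without giving details. One common scheme, sketched here under stated assumptions, fills a gene's missing entry with the average of that sample's values in the k most similar genes. The distance measure (root mean squared difference over jointly observed samples), the value of k, and the toy matrix are all hypothetical choices, not the authors' settings.

```python
from math import sqrt

def distance(row_a, row_b):
    """Root mean squared difference over columns observed in both rows.
    Returns None when the rows share no observed column."""
    shared = [(a, b) for a, b in zip(row_a, row_b)
              if a is not None and b is not None]
    if not shared:
        return None
    return sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))

def knn_impute(matrix, k=2):
    """Fill each None with the mean of that column in the k nearest rows.
    Rows are genes, columns are samples; the input matrix is not modified."""
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(matrix):
                if m == i or other[j] is None:
                    continue
                d = distance(row, other)
                if d is not None:
                    candidates.append((d, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:  # leave the gap if no usable neighbor exists
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled

# Toy expression matrix (made up): gene 1 lacks a value for sample 2.
expression = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [1.0, 2.0, 3.2],
    [9.0, 8.0, 7.0],
]
print(knn_impute(expression, k=2)[1])
```

Neighbors are chosen from the original matrix rather than from partially filled rows, so the result does not depend on the order in which gaps are visited.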
Inferring gene dependency network specific to phenotypic alteration based on gene expression data and clinical information of breast cancer.

PubMed

Zhou, Xionghui; Liu, Juan

2014-01-01

Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying the phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, while previous studies either neglected it or only used it to select the differentially expressed genes as the inputs to construct the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify the gene dependency pairs by using the method of conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed a gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been supported by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms those classification models with published signatures.
In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for phenotypic change.
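To make the conditional mutual information idea concrete, the sketch below estimates I(A; P | B) from discrete samples using plain frequency counts. It is an illustration only: real expression values are continuous and would need discretization or another estimator, and the authors' estimator, thresholds, and the exact way they turn scores into dependency pairs are not stated in the abstract. The toy data are made up.

```python
from collections import Counter
from math import log2

def conditional_mutual_information(a, p, b):
    """Plug-in estimate of I(A; P | B) in bits from three aligned
    sequences of discrete observations."""
    n = len(a)
    n_apb = Counter(zip(a, p, b))
    n_ab = Counter(zip(a, b))
    n_pb = Counter(zip(p, b))
    n_b = Counter(b)
    cmi = 0.0
    for (ai, pi, bi), count in n_apb.items():
        # p(b) p(a,p,b) / (p(a,b) p(p,b)) written with raw counts
        ratio = (n_b[bi] * count) / (n_ab[(ai, bi)] * n_pb[(pi, bi)])
        cmi += (count / n) * log2(ratio)
    return cmi

# Toy data (made up): the phenotype is the exclusive-or of genes A and B,
# so A says nothing about the phenotype until B is known.
gene_a = [0, 0, 1, 1]
gene_b = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]

constant = [0] * len(gene_a)  # conditioning on a constant gives plain I(A; P)
print(conditional_mutual_information(gene_a, phenotype, constant))
print(conditional_mutual_information(gene_a, phenotype, gene_b))
```

In this toy case the unconditional score is zero while the conditional score is one bit, which mirrors the abstract's reading of a pair (A,B): the influence of gene A on the phenotype depends on gene B.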
Global Survey of Protein Expression during Gonadal Sex Determination in Mice*

PubMed Central

Ewen, Katherine; Baker, Mark; Wilhelm, Dagmar; Aitken, R. John; Koopman, Peter

2009-01-01

The development of an embryo as male or female depends on differentiation of the gonads as either testes or ovaries. A number of genes are known to be important for gonadal differentiation, but our understanding of the regulatory networks underpinning sex determination remains fragmentary. To advance our understanding of sexual development beyond the transcriptome level, we performed the first global survey of the mouse gonad proteome at the time of sex determination by using two-dimensional nanoflow LC-MS/MS. The resulting data set contains a total of 1037 gene products (154 non-redundant and 883 redundant proteins) identified from 620 peptides. Functional classification and biological network construction suggested that the identified proteins primarily serve in RNA post-transcriptional modification and trafficking, protein synthesis and folding, and post-translational modification. The data set contains potential novel regulators of gonad development and sex determination not revealed previously by transcriptomics and proteomics studies and more than 60 proteins with potential links to human disorders of sexual development. PMID:19617587
Analysis of genetic association using hierarchical clustering and cluster validation indices.

PubMed

Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

2017-10-01

It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criterion of similarity, is an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
A Bayesian connectivity-based approach to constructing probabilistic gene regulatory networks.

PubMed

Zhou, Xiaobo; Wang, Xiaodong; Pal, Ranadip; Ivanov, Ivan; Bittner, Michael; Dougherty, Edward R

2004-11-22

We have hypothesized that the construction of transcriptional regulatory networks using a method that optimizes connectivity would lead to regulation consistent with biological expectations. A key expectation is that the hypothetical networks should produce a few, very strong attractors, highly similar to the original observations, mimicking biological state stability and determinism. Another central expectation is that, since it is expected that the biological control is distributed and mutually reinforcing, interpretation of the observations should lead to a very small number of connection schemes. We propose a fully Bayesian approach to constructing probabilistic gene regulatory networks (PGRNs) that emphasizes network topology. The method computes the possible parent sets of each gene, the corresponding predictors and the associated probabilities based on a nonlinear perceptron model, using a reversible jump Markov chain Monte Carlo (MCMC) technique, and an MCMC method is employed to search the network configurations to find those with the highest Bayesian scores to construct the PGRN. The Bayesian method has been used to construct a PGRN based on the observed behavior of a set of genes whose expression patterns vary across a set of melanoma samples exhibiting two very different phenotypes with respect to cell motility and invasiveness. Key biological features have been faithfully reflected in the model. Its steady-state distribution contains attractors that are either identical or very similar to the states observed in the data, and many of the attractors are singletons, which mimics the biological propensity to stably occupy a given state. Most interestingly, the connectivity rules for the most optimal generated networks constituting the PGRN are remarkably similar, as would be expected for a network operating on a distributed basis, with strong interactions between the components.
Evidence for Transcript Networks Composed of Chimeric RNAs in Human Cells

PubMed Central

Borel, Christelle; Mudge, Jonathan M.; Howald, Cédric; Foissac, Sylvain; Ucla, Catherine; Chrast, Jacqueline; Ribeca, Paolo; Martin, David; Murray, Ryan R.; Yang, Xinping; Ghamsari, Lila; Lin, Chenwei; Bell, Ian; Dumais, Erica; Drenkow, Jorg; Tress, Michael L.; Gelpí, Josep Lluís; Orozco, Modesto; Valencia, Alfonso; van Berkum, Nynke L.; Lajoie, Bryan R.; Vidal, Marc; Stamatoyannopoulos, John; Batut, Philippe; Dobin, Alex; Harrow, Jennifer; Hubbard, Tim; Dekker, Job; Frankish, Adam; Salehi-Ashtiani, Kourosh; Reymond, Alexandre; Antonarakis, Stylianos E.; Guigó, Roderic; Gingeras, Thomas R.

2012-01-01

The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found from yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network. PMID:22238572
IDPT: Insights into potential intrinsically disordered proteins through transcriptomic analysis of genes for prostate carcinoma epigenetic data.

PubMed

Mallik, Saurav; Sen, Sagnik; Maulik, Ujjwal

2016-07-15

The involvement of intrinsically disordered proteins (IDPs) in serious diseases such as cancer is an active research topic. To gain novel insights into the regulation of IDPs, in this article we perform a transcriptomic analysis of mRNAs (genes) for transcripts encoding IDPs on a human multi-omics prostate carcinoma dataset having both gene expression and methylation data. First, we select the genes that have both expression and methylation data and that correspond to the cancer-related, prostate-tissue-specific disordered proteins of the MobiDb database. We apply a standard t-test to determine differentially expressed genes as well as differentially methylated genes. A network comprising these genes, their targeting miRNAs from the Diana Tarbase v7.0 database, and the corresponding transcription factors from the TRANSFAC and ITFP databases is then built. Thereafter, we perform a literature search, and KEGG pathway and Gene Ontology analyses using the DAVID database. Finally, we report several significant potential gene markers (with the corresponding IDPs) that show an inverse relationship between differential expression and methylation patterns and that are hub genes of the TF-miRNA-gene network. Copyright © 2016 Elsevier B.V. All rights reserved.
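
As an illustration of the selection step described in the abstract above, the following minimal sketch computes a pooled-variance (Student's) two-sample t statistic per gene for expression and for methylation, then keeps genes whose two statistics are both large and of opposite sign. The gene names, the toy values, and the `cutoff` of 2.0 are assumptions made for illustration only; the abstract does not state the thresholds or the software the authors used, and a real analysis would convert the statistic to a p-value and correct for multiple testing.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Student's two-sample t statistic using the pooled sample variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def inverse_markers(expr, meth, cutoff=2.0):
    """Return genes whose expression and methylation shift in opposite directions.

    expr and meth map gene -> (tumour_values, normal_values).
    cutoff is a hypothetical threshold on |t|, not a value from the abstract.
    """
    hits = []
    for gene in expr:
        t_expr = t_statistic(*expr[gene])
        t_meth = t_statistic(*meth[gene])
        if abs(t_expr) > cutoff and abs(t_meth) > cutoff and t_expr * t_meth < 0:
            hits.append(gene)
    return sorted(hits)


# Toy data with hypothetical gene names: (tumour samples, normal samples).
expr = {
    "G1": ([5.0, 5.2, 5.1, 4.9], [2.0, 2.1, 1.9, 2.2]),
    "G2": ([3.0, 3.1, 2.9, 3.0], [3.0, 2.9, 3.1, 3.0]),
}
meth = {
    "G1": ([0.20, 0.25, 0.22, 0.21], [0.80, 0.82, 0.79, 0.81]),
    "G2": ([0.50, 0.52, 0.49, 0.50], [0.50, 0.51, 0.50, 0.49]),
}
markers = inverse_markers(expr, meth)
print(markers)
```

In this toy input only G1 is up in expression and down in methylation, so it is the only gene reported.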
Analysis of bHLH coding genes using gene co-expression network approach.

PubMed

Srivastava, Swati; Sanchita; Singh, Garima; Singh, Noopur; Srivastava, Gaurava; Sharma, Ashok

2016-07-01

Network analysis provides a powerful framework for the interpretation of data. It uses novel reference network-based metrics for module evolution. These can be used to identify modules of highly connected genes showing variation in a co-expression network. In this study, a co-expression network-based approach was used for analyzing genes from microarray data. Our approach consists of a simple but robust rank-based network construction. The publicly available gene expression data of Solanum tuberosum under cold and heat stresses were used to create and analyze a gene co-expression network. The analysis provides a highly co-expressed module of bHLH coding genes based on correlation values. Our approach was to analyze the variation of gene expression according to the time period of stress through a co-expression network approach. As a result, seed genes were identified that show multiple connections with other genes in the same cluster. The seed genes were found to vary across different time periods of stress. These seed genes may be utilized further as marker genes for developing stress-tolerant plant species.
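
One common way to build a rank-based co-expression network is to connect gene pairs whose Spearman rank correlation exceeds a threshold. The sketch below does this with the Python standard library. Using Spearman correlation, the 0.9 threshold, and the gene labels are assumptions for illustration; the abstract says only that the construction is rank-based and relies on correlation values.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(profiles, threshold=0.9):
    """Edges (gene1, gene2, rho) for pairs with |Spearman rho| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        rho = pearson(ranks(profiles[g1]), ranks(profiles[g2]))
        if abs(rho) >= threshold:
            edges.append((g1, g2, rho))
    return edges


# Hypothetical expression profiles over five stress time points.
profiles = {
    "bHLH_a": [1.0, 2.0, 3.5, 5.0, 8.0],
    "bHLH_b": [0.5, 1.1, 1.9, 4.2, 9.0],
    "other": [3.0, 1.0, 4.0, 1.5, 2.0],
}
edges = coexpression_edges(profiles)
print(edges)
```

Genes with many edges in such a network would be candidates for the highly connected "seed" genes the abstract refers to.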

Prioritizing chronic obstructive pulmonary disease (COPD) candidate genes in COPD-related networks

PubMed Central

Zhang, Yihua; Li, Wan; Feng, Yuyan; Guo, Shanshan; Zhao, Xilei; Wang, Yahui; He, Yuehan; He, Weiming; Chen, Lina

2017-01-01

Chronic obstructive pulmonary disease (COPD) is a multi-factor disease, which could be caused by many factors, including disturbances of metabolism and protein-protein interactions (PPIs). In this paper, a weighted COPD-related metabolic network and a weighted COPD-related PPI network were constructed based on COPD disease genes and functional information. Candidate genes in each of these weighted COPD-related networks were prioritized using a gene prioritization method. Literature review and functional enrichment analysis of the top 100 genes in these two networks suggested the correlation of COPD and these genes. The performance of our gene prioritization method was superior to that of ToppGene and ToppNet for genes from the COPD-related metabolic network or the COPD-related PPI network, as assessed using leave-one-out cross-validation, literature validation and functional enrichment analysis. The top-ranked genes prioritized from COPD-related metabolic and PPI networks could promote a better understanding of the molecular mechanism of this disease from different perspectives. The top 100 genes in the COPD-related metabolic network or COPD-related PPI network might be potential markers for the diagnosis and treatment of COPD. PMID:29262568
Prioritizing chronic obstructive pulmonary disease (COPD) candidate genes in COPD-related networks.

PubMed

Zhang, Yihua; Li, Wan; Feng, Yuyan; Guo, Shanshan; Zhao, Xilei; Wang, Yahui; He, Yuehan; He, Weiming; Chen, Lina

2017-11-28

Chronic obstructive pulmonary disease (COPD) is a multi-factor disease, which could be caused by many factors, including disturbances of metabolism and protein-protein interactions (PPIs). In this paper, a weighted COPD-related metabolic network and a weighted COPD-related PPI network were constructed based on COPD disease genes and functional information. Candidate genes in each of these weighted COPD-related networks were prioritized using a gene prioritization method. Literature review and functional enrichment analysis of the top 100 genes in these two networks suggested the correlation of COPD and these genes. The performance of our gene prioritization method was superior to that of ToppGene and ToppNet for genes from the COPD-related metabolic network or the COPD-related PPI network, as assessed using leave-one-out cross-validation, literature validation and functional enrichment analysis. The top-ranked genes prioritized from COPD-related metabolic and PPI networks could promote a better understanding of the molecular mechanism of this disease from different perspectives. The top 100 genes in the COPD-related metabolic network or COPD-related PPI network might be potential markers for the diagnosis and treatment of COPD.
Gene networks and developmental context: the importance of understanding complex gene expression patterns in evolution.

PubMed

Signor, Sarah A; Arbeitman, Michelle N; Nuzhdin, Sergey V

2016-05-01

Animal development is the product of distinct components and interactions (genes, regulatory networks, and cells), and it exhibits emergent properties that cannot be inferred from the components in isolation. Often the focus is on the genotype-to-phenotype map, overlooking the process of development that turns one into the other. We propose a move toward micro-evolutionary analysis of development, incorporating new tools that enable cell type resolution and single-cell microscopy. Using the sex determination pathway in Drosophila to illustrate potential avenues of research, we highlight some of the questions that these emerging technologies can address. For example, they provide an unprecedented opportunity to study heterogeneity within cell populations, and the potential to add the dimension of time to gene regulatory network analysis. Challenges still remain in developing methods to analyze this data and to increase the throughput. However, this line of research has the potential to bridge the gaps between previously more disparate fields, such as population genetics and development, opening up new avenues of research. © 2016 Wiley Periodicals, Inc.
Genomic analyses identify molecular subtypes of pancreatic cancer.

PubMed

Bailey, Peter; Chang, David K; Nones, Katia; Johns, Amber L; Patch, Ann-Marie; Gingras, Marie-Claude; Miller, David K; Christ, Angelika N; Bruxner, Tim J C; Quinn, Michael C; Nourse, Craig; Murtaugh, L Charles; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourbakhsh, Ehsan; Wani, Shivangi; Fink, Lynn; Holmes, Oliver; Chin, Venessa; Anderson, Matthew J; Kazakoff, Stephen; Leonard, Conrad; Newell, Felicity; Waddell, Nick; Wood, Scott; Xu, Qinying; Wilson, Peter J; Cloonan, Nicole; Kassahn, Karin S; Taylor, Darrin; Quek, Kelly; Robertson, Alan; Pantano, Lorena; Mincarelli, Laura; Sanchez, Luis N; Evers, Lisa; Wu, Jianmin; Pinese, Mark; Cowley, Mark J; Jones, Marc D; Colvin, Emily K; Nagrial, Adnan M; Humphrey, Emily S; Chantrill, Lorraine A; Mawson, Amanda; Humphris, Jeremy; Chou, Angela; Pajic, Marina; Scarlett, Christopher J; Pinho, Andreia V; Giry-Laterriere, Marc; Rooman, Ilse; Samra, Jaswinder S; Kench, James G; Lovell, Jessica A; Merrett, Neil D; Toon, Christopher W; Epari, Krishna; Nguyen, Nam Q; Barbour, Andrew; Zeps, Nikolajs; Moran-Jones, Kim; Jamieson, Nigel B; Graham, Janet S; Duthie, Fraser; Oien, Karin; Hair, Jane; Grützmann, Robert; Maitra, Anirban; Iacobuzio-Donahue, Christine A; Wolfgang, Christopher L; Morgan, Richard A; Lawlor, Rita T; Corbo, Vincenzo; Bassi, Claudio; Rusev, Borislav; Capelli, Paola; Salvia, Roberto; Tortora, Giampaolo; Mukhopadhyay, Debabrata; Petersen, Gloria M; Munzy, Donna M; Fisher, William E; Karim, Saadia A; Eshleman, James R; Hruban, Ralph H; Pilarsky, Christian; Morton, Jennifer P; Sansom, Owen J; Scarpa, Aldo; Musgrove, Elizabeth A; Bailey, Ulla-Maja Hagbo; Hofmann, Oliver; Sutherland, Robert L; Wheeler, David A; Gill, Anthony J; Gibbs, Richard A; Pearson, John V; Waddell, Nicola; Biankin, Andrew V; Grimmond, Sean M

2016-03-03

Integrated genomic analysis of 456 pancreatic ductal adenocarcinomas identified 32 recurrently mutated genes that aggregate into 10 pathways: KRAS, TGF-β, WNT, NOTCH, ROBO/SLIT signalling, G1/S transition, SWI-SNF, chromatin modification, DNA repair and RNA processing. Expression analysis defined 4 subtypes: (1) squamous; (2) pancreatic progenitor; (3) immunogenic; and (4) aberrantly differentiated endocrine exocrine (ADEX) that correlate with histopathological characteristics. Squamous tumours are enriched for TP53 and KDM6A mutations, upregulation of the TP63∆N transcriptional network, hypermethylation of pancreatic endodermal cell-fate determining genes and have a poor prognosis. Pancreatic progenitor tumours preferentially express genes involved in early pancreatic development (FOXA2/3, PDX1 and MNX1). ADEX tumours displayed upregulation of genes that regulate networks involved in KRAS activation, exocrine (NR5A2 and RBPJL), and endocrine differentiation (NEUROD1 and NKX2-2). Immunogenic tumours contained upregulated immune networks including pathways involved in acquired immune suppression. These data infer differences in the molecular evolution of pancreatic cancer subtypes and identify opportunities for therapeutic development.
Evolution of robustness to damage in artificial 3-dimensional development.

PubMed

Joachimczak, Michał; Wróbel, Borys

2012-09-01

GReaNs is an Artificial Life platform we have built to investigate the general principles that guide evolution of multicellular development and evolution of artificial gene regulatory networks. The embryos develop in GReaNs in a continuous 3-dimensional (3D) space with simple physics. The developmental trajectories are indirectly encoded in linear genomes. The genomes are not limited in size and determine the topology of gene regulatory networks that are not limited in the number of nodes. The expression of the genes is continuous and can be modified by adding environmental noise. In this paper we evolved development of structures with a specific shape (an ellipsoid) and asymmetrical patterning (a 3D pattern inspired by the French flag problem), and investigated emergence of the robustness to damage in development and the emergence of the robustness to noise. Our results indicate that both types of robustness are related, and that including noise during evolution promotes higher robustness to damage. Interestingly, we have observed that some evolved gene regulatory networks rely on noise for proper behaviour. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Dose and Time Dependencies in Stress Pathway Responses during Chemical Exposure: Novel Insights from Gene Regulatory Networks.

PubMed

Souza, Terezinha M; Kleinjans, Jos C S; Jennen, Danyel G J

2017-01-01

Perturbation of biological networks is often observed during exposure to xenobiotics, and the identification of disturbed processes, their dynamic traits, and dose-response relationships are some of the current challenges for elucidating the mechanisms determining adverse outcomes. In this scenario, reverse engineering of gene regulatory networks (GRNs) from expression data may provide a system-level snapshot embedded within accurate molecular events. Here, we investigate the composition of GRNs inferred from groups of chemicals with two distinct outcomes, namely carcinogenicity [azathioprine (AZA) and cyclophosphamide (CYC)] and drug-induced liver injury (DILI; diclofenac, nitrofurantoin, and propylthiouracil), and a non-carcinogenic/non-DILI group (aspirin, diazepam, and omeprazole). For this, we analyzed publicly available exposed in vitro human data, taking into account dose and time dependencies. Dose-Time Network Identification (DTNI) was applied to gene sets from exposed primary human hepatocytes using four stress pathways, namely endoplasmic reticulum (ER), NF-κB, NRF2, and TP53. Inferred GRNs suggested case specificity, varying in interactions, starting nodes, and target genes across groups. DILI and carcinogenic compounds were shown to directly affect all pathway-based GRNs, while non-DILI/non-carcinogenic chemicals only affected NF-κB. NF-κB-based GRNs clearly illustrated group-specific disturbances, with the cancer-related casein kinase CSNK2A1 being a target gene only in the carcinogenic group, and opposite regulation of NF-κB subunits being observed in DILI and non-DILI/non-carcinogenic groups. Target genes in NRF2-based GRNs shared by DILI and carcinogenic compounds suggested markers of hepatotoxicity. Finally, we indicate several of these group-specific interactions as potentially novel. In summary, our reverse-engineered GRNs are capable of revealing dose-dependent, chemical-specific mechanisms of action in stress-related biological networks.
Mining Gene Regulatory Networks by Neural Modeling of Expression Time-Series.

PubMed

Rubiolo, Mariano; Milone, Diego H; Stegmayer, Georgina

2015-01-01

Discovering gene regulatory networks from data has been one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as time series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. The networks are used for modeling each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the underlying relations among genes. The results obtained on artificial and real datasets confirm the method's effectiveness for discovering regulatory networks from a proper modeling of the temporal dynamics of gene expression profiles.
Constructing an integrated gene similarity network for the identification of disease genes.

PubMed

Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin

2017-09-20

Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and cover less than half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We propose a novel method, named RWRB, to infer causal genes of diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, the IGSN and the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB benefits from the IGSN, which has wider coverage and higher reliability than current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that are supported by the literature. RWRB is an effective and reliable algorithm for prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/ .
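
The random walk with restart step named in the abstract above can be sketched generically as the iteration p(t+1) = (1 - r) W p(t) + r p(0), where W spreads each node's probability evenly over its neighbours, p(0) is concentrated on the seed nodes, and r is the restart probability. The sketch below runs this on a single small unweighted graph. This is a simplification: RWRB itself walks on a phenotype-gene bilayer network, and the graph, node names, and restart value of 0.7 here are hypothetical.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds.

    adj maps each node to a list of its neighbours (undirected, unweighted).
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart mass goes back to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        # The remaining mass moves along edges, split evenly among neighbours.
        for u in nodes:
            neighbours = adj[u]
            if not neighbours:
                continue
            share = (1 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical path-like network A - B - C - D - E with A as the known disease gene.
adj = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
seeds = {"A"}
scores = random_walk_with_restart(adj, seeds)
ranking = sorted((n for n in scores if n not in seeds), key=lambda n: -scores[n])
print(ranking)
```

Candidates closer to the seed in the network receive higher scores, which is the property that prioritization methods of this kind rely on.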
Reverse engineering highlights potential principles of large gene regulatory network design and learning.

PubMed

Carré, Clément; Mas, André; Krouk, Gabriel

2017-01-01

Inferring transcriptional gene regulatory networks from transcriptomic datasets is a key challenge of systems biology, with potential impacts ranging from medicine to agronomy. There are several techniques used presently to experimentally assay transcription factor-to-target relationships, defining important information about real gene regulatory network connections. These techniques include classical ChIP-seq, yeast one-hybrid, or more recently, DAP-seq or target technologies. These techniques are usually used to validate algorithm predictions. Here, we developed a reverse engineering approach based on mathematical and computer simulation to evaluate the impact that this prior knowledge on gene regulatory networks may have on training machine learning algorithms. First, we developed a gene regulatory network-simulating engine called FRANK (Fast Randomizing Algorithm for Network Knowledge) that is able to simulate large gene regulatory networks (containing 10^4 genes) with characteristics of gene regulatory networks observed in vivo. FRANK also generates stable or oscillatory gene expression directly produced by the simulated gene regulatory networks. The development of FRANK leads to important general conclusions concerning the design of large and stable gene regulatory networks harboring scale-free properties (built ex nihilo). In combination with a supervised (accepting prior knowledge) support vector machine algorithm we (i) address biologically oriented questions concerning our capacity to accurately reconstruct gene regulatory networks, and in particular we demonstrate that prior-knowledge structure is crucial for accurate learning, and (ii) draw conclusions to inform experimental design to perform learning able to solve gene regulatory networks in the future.
By demonstrating that our predictions concerning the influence of the prior-knowledge structure on support vector machine learning capacity hold true on real data (Escherichia coli K14 network reconstruction using network and transcriptomic data), we show that the formalism used to build FRANK can to some extent be a reasonable model for gene regulatory networks in real cells.
Coexpression network based on natural variation in human gene expression reveals gene interactions and functions

PubMed Central

Nayak, Renuka R.; Kearns, Michael; Spielman, Richard S.; Cheung, Vivian G.

2009-01-01

Genes interact in networks to orchestrate cellular processes. Analysis of these networks provides insights into gene interactions and functions. Here, we took advantage of normal variation in human gene expression to infer gene networks, which we constructed using correlations in expression levels of more than 8.5 million gene pairs in immortalized B cells from three independent samples. The resulting networks allowed us to identify biological processes and gene functions. Among the biological pathways, we found processes such as translation and glycolysis that co-occur in the same subnetworks. We predicted the functions of poorly characterized genes, including CHCHD2 and TMEM111, and provided experimental evidence that TMEM111 is part of the endoplasmic reticulum-associated secretory pathway. We also found that IFIH1, a susceptibility gene of type 1 diabetes, interacts with YES1, which plays a role in glucose transport. Furthermore, genes that predispose to the same diseases are clustered nonrandomly in the coexpression network, suggesting that networks can provide candidate genes that influence disease susceptibility. Therefore, our analysis of gene coexpression networks offers information on the role of human genes in normal and disease processes. PMID:19797678
Tumor SHB gene expression affects disease characteristics in human acute myeloid leukemia.

PubMed

Jamalpour, Maria; Li, Xiujuan; Cavelier, Lucia; Gustafsson, Karin; Mostoslavsky, Gustavo; Höglund, Martin; Welsh, Michael

2017-10-01

The mouse Shb gene coding for the Src Homology 2-domain containing adapter protein B has recently been placed in context of BCRABL1-induced myeloid leukemia in mice and the current study was performed in order to relate SHB to human acute myeloid leukemia (AML). Publicly available AML databases were mined for SHB gene expression and patient survival. SHB gene expression was determined in the Uppsala cohort of AML patients by qPCR. Cell proliferation was determined after SHB gene knockdown in leukemic cell lines. Despite a low frequency of SHB gene mutations, many tumors overexpressed SHB mRNA compared with normal myeloid blood cells. AML patients with tumors expressing low SHB mRNA displayed longer survival times. A subgroup of AML exhibiting a favorable prognosis, acute promyelocytic leukemia (APL) with a PMLRARA translocation, expressed less SHB mRNA than AML tumors in general. When examining genes co-expressed with SHB in AML tumors, four other genes (PAX5, HDAC7, BCORL1, TET1) related to leukemia were identified. A network consisting of these genes plus SHB was identified that relates to certain phenotypic characteristics, such as immune cell, vascular and apoptotic features. SHB knockdown in the APL PMLRARA cell line NB4 and the monocyte/macrophage cell line MM6 adversely affected proliferation, linking SHB gene expression to tumor cell expansion and consequently to patient survival. It is concluded that tumor SHB gene expression relates to AML survival and its subgroup APL. Moreover, this gene is included in a network of genes that plays a role in an AML phenotype exhibiting certain immune cell, vascular and apoptotic characteristics.
Comparative de novo transcriptome analysis of male and female Sea buckthorn.

PubMed

Bansal, Ankush; Salaria, Mehul; Sharma, Tashil; Stobdan, Tsering; Kant, Anil

2018-02-01

Sea buckthorn is a dioecious medicinal plant found at high altitude. The plant has male and female reproductive organs in separate individuals. In this article, whole transcriptome de novo assemblies of male and female flower bud samples were carried out using the Illumina NextSeq 500 platform to determine the role of the genes involved in sex determination. Moreover, genes with differential expression in male and female transcriptomes were identified to understand the underlying sex determination mechanism. The current study showed 63,904 and 62,272 coding sequences (CDS) in female and male transcriptome data sets, respectively. 16,831 common CDS were screened out from both transcriptomes, out of which 625 were upregulated and 491 were found to be downregulated. To understand the potential regulatory roles of differentially expressed genes in metabolic networks and biosynthetic pathways, KEGG mapping, gene ontology, and co-expression network analysis were performed. Comparison with the Flowering Interactive Database (FLOR-ID) resulted in eight differentially expressed genes, viz. CHD3-type chromatin-remodeling factor PICKLE (PKL), phytochrome-associated serine/threonine-protein phosphatase (FYPP), protein TOPLESS (TPL), sensitive to freezing 6 (SFR6), lysine-specific histone demethylase 1 homolog 1 (LDL1), pre-mRNA-processing-splicing factor 8A (PRP8A), sucrose synthase 4 (SUS4), and ubiquitin carboxyl-terminal hydrolase 12 (UBP12), known to be broadly involved in flowering, photoperiodism, embryo development, and cold response pathways. Male and female flower bud transcriptome data of Sea buckthorn may provide comprehensive information at the genomic level for the identification of genetic regulation involved in sex determination.
Disease networks. Uncovering disease-disease relationships through the incomplete interactome.

PubMed

Menche, Jörg; Sharma, Amitabh; Kitsak, Maksim; Ghiassian, Susan Dina; Vidal, Marc; Loscalzo, Joseph; Barabási, Albert-László

2015-02-20

According to the disease module hypothesis, the cellular components associated with a disease segregate in the same neighborhood of the human interactome, the map of biologically relevant molecular interactions. Yet, given the incompleteness of the interactome and the limited knowledge of disease-associated genes, it is not obvious if the available data have sufficient coverage to map out modules associated with each disease. Here we derive mathematical conditions for the identifiability of disease modules and show that the network-based location of each disease module determines its pathobiological relationship to other diseases. For example, diseases with overlapping network modules show significant coexpression patterns, symptom similarity, and comorbidity, whereas diseases residing in separated network neighborhoods are phenotypically distinct. These tools represent an interactome-based platform to predict molecular commonalities between phenotypically related diseases, even if they do not share primary disease genes. Copyright © 2015, American Association for the Advancement of Science.
Topology association analysis in weighted protein interaction network for gene prioritization

NASA Astrophysics Data System (ADS)

Wu, Shunyao; Shao, Fengjing; Zhang, Qi; Ji, Jun; Xu, Shaojie; Sun, Rencheng; Sun, Gengxin; Du, Xiangjun; Sui, Yi

2016-11-01

Although many algorithms for disease gene prediction have been proposed, the weights of edges are rarely taken into account. In this paper, the strengths of topology associations between disease and essential genes are analyzed in a weighted protein interaction network. Empirical analysis demonstrates that, compared to other genes, disease genes are weakly connected with essential genes in the protein interaction network. Based on this finding, a novel global distance measurement for gene prioritization with a weighted protein interaction network is proposed in this paper. Positive and negative flow is allocated to disease and essential genes, respectively. Additionally, the network propagation model is extended to weighted networks. Experimental results on 110 diseases verify the effectiveness and potential of the proposed measurement. Moreover, weak links play a more important role than strong links in gene prioritization, which is meaningful for a deeper understanding of the protein interaction network.
Systems Biomedicine of Rabies Delineates the Affected Signaling Pathways.

PubMed

Azimzadeh Jamalkandi, Sadegh; Mozhgani, Sayed-Hamidreza; Gholami Pourbadie, Hamid; Mirzaie, Mehdi; Noorbakhsh, Farshid; Vaziri, Behrouz; Gholami, Alireza; Ansari-Pour, Naser; Jafari, Mohieddin

2016-01-01

The prototypical neurotropic virus, rabies, is a member of the Rhabdoviridae family that causes lethal encephalomyelitis. Although there have been a plethora of studies investigating the etiological mechanism of the rabies virus and many precautionary methods have been implemented to avert disease outbreaks over the last century, the disease surprisingly has no definite remedy at its late stages. The psychological symptoms and the underlying etiology, as well as the rare survival from rabies encephalitis, have remained a mystery. We therefore undertook a systems biomedicine approach to identify the network of gene products implicated in rabies. This was done by meta-analyzing whole-transcriptome microarray datasets of the CNS infected by strain CVS-11, and integrating them with interactome data using computational and statistical methods. We first determined the differentially expressed genes (DEGs) in each study and horizontally integrated the results at the mRNA and microRNA levels separately. A total of 61 seed genes involved in the signal propagation system were obtained by unifying the integrated DEGs detected at the mRNA and microRNA levels. We then reconstructed a refined protein-protein interaction network (PPIN) of infected cells to elucidate the rabies-implicated signal transduction network (RISN). To validate our findings, we confirmed differential expression of randomly selected genes in the network using real-time PCR. In conclusion, the identification of seed genes and their network neighborhood within the refined PPIN can be useful for demonstrating signaling pathways including interferon circumvent, toward proliferation and survival, and neuropathological clue, explaining the intricate underlying molecular neuropathology of rabies infection and thus rendering a molecular framework for predicting potential drug targets.
Uncovering disease mechanisms through network biology in the era of Next Generation Sequencing

NASA Astrophysics Data System (ADS)

Piñero, Janet; Berenstein, Ariel; Gonzalez-Perez, Abel; Chernomoretz, Ariel; Furlong, Laura I.

2016-04-01

Characterizing the behavior of disease genes in the context of biological networks has the potential to shed light on disease mechanisms, and to reveal both new candidate disease genes and therapeutic targets. Previous studies addressing the network properties of disease genes have produced contradictory results. Here we have explored the causes of these discrepancies and assessed the relationship between the network roles of disease genes and their tolerance to deleterious germline variants in human populations leveraging on: the abundance of interactome resources, a comprehensive catalog of disease genes and exome variation data. We found that the most salient network features of disease genes are driven by cancer genes and that genes related to different types of diseases play network roles whose centrality is inversely correlated to their tolerance to likely deleterious germline mutations. This proved to be a multiscale signature, including global, mesoscopic and local network centrality features. Cancer driver genes, the most sensitive to deleterious variants, occupy the most central positions, followed by dominant disease genes and then by recessive disease genes, which are tolerant to variants and isolated within their network modules.
Uncovering disease mechanisms through network biology in the era of Next Generation Sequencing

PubMed Central

Piñero, Janet; Berenstein, Ariel; Gonzalez-Perez, Abel; Chernomoretz, Ariel; Furlong, Laura I.

2016-01-01

Characterizing the behavior of disease genes in the context of biological networks has the potential to shed light on disease mechanisms, and to reveal both new candidate disease genes and therapeutic targets. Previous studies addressing the network properties of disease genes have produced contradictory results. Here we have explored the causes of these discrepancies and assessed the relationship between the network roles of disease genes and their tolerance to deleterious germline variants in human populations leveraging on: the abundance of interactome resources, a comprehensive catalog of disease genes and exome variation data. We found that the most salient network features of disease genes are driven by cancer genes and that genes related to different types of diseases play network roles whose centrality is inversely correlated to their tolerance to likely deleterious germline mutations. This proved to be a multiscale signature, including global, mesoscopic and local network centrality features. Cancer driver genes, the most sensitive to deleterious variants, occupy the most central positions, followed by dominant disease genes and then by recessive disease genes, which are tolerant to variants and isolated within their network modules. PMID:27080396
Topological analysis of metabolic networks integrating co-segregating transcriptomes and metabolomes in type 2 diabetic rat congenic series.

PubMed

Dumas, Marc-Emmanuel; Domange, Céline; Calderari, Sophie; Martínez, Andrea Rodríguez; Ayala, Rafael; Wilder, Steven P; Suárez-Zamorano, Nicolas; Collins, Stephan C; Wallis, Robert H; Gu, Quan; Wang, Yulan; Hue, Christophe; Otto, Georg W; Argoud, Karène; Navratil, Vincent; Mitchell, Steve C; Lindon, John C; Holmes, Elaine; Cazier, Jean-Baptiste; Nicholson, Jeremy K; Gauguier, Dominique

2016-09-30

The genetic regulation of metabolic phenotypes (i.e., metabotypes) in type 2 diabetes mellitus occurs through complex organ-specific cellular mechanisms and networks contributing to impaired insulin secretion and insulin resistance. Genome-wide gene expression profiling systems can dissect the genetic contributions to metabolome and transcriptome regulations. The integrative analysis of multiple gene expression traits and metabolic phenotypes (i.e., metabotypes) together with their underlying genetic regulation remains a challenge. Here, we introduce a systems genetics approach based on the topological analysis of a combined molecular network made of genes and metabolites identified through expression and metabotype quantitative trait locus mapping (i.e., eQTL and mQTL) to prioritise biological characterisation of candidate genes and traits. We used systematic metabotyping by 1H NMR spectroscopy and genome-wide gene expression in white adipose tissue to map molecular phenotypes to genomic blocks associated with obesity and insulin secretion in a series of rat congenic strains derived from spontaneously diabetic Goto-Kakizaki (GK) and normoglycemic Brown-Norway (BN) rats. We implemented a network biology strategy to visualize the shortest paths between metabolites and genes significantly associated with each genomic block. Despite strong genomic similarities (95-99%) among congenics, each strain exhibited specific patterns of gene expression and metabotypes, reflecting the metabolic consequences of series of linked genetic polymorphisms in the congenic intervals. We subsequently used the congenic panel to map quantitative trait loci underlying specific mQTLs and genome-wide eQTLs. Variation in key metabolites like glucose, succinate, lactate, or 3-hydroxybutyrate and second messenger precursors like inositol was associated with several independent genomic intervals, indicating functional redundancy in these regions. 
To navigate through the complexity of these association networks we mapped candidate genes and metabolites onto metabolic pathways and implemented a shortest path strategy to highlight potential mechanistic links between metabolites and transcripts at colocalized mQTLs and eQTLs. Minimizing the shortest path length drove prioritization of biological validations by gene silencing. These results underline the importance of network-based integration of multilevel systems genetics datasets to improve understanding of the genetic architecture of metabotype and transcriptomic regulation and to characterize novel functional roles for genes determining tissue-specific metabolism.
Systems biology approach to late-onset Alzheimer's disease genome-wide association study identifies novel candidate genes validated using brain expression data and Caenorhabditis elegans experiments.

PubMed

Mukherjee, Shubhabrata; Russell, Joshua C; Carr, Daniel T; Burgess, Jeremy D; Allen, Mariet; Serie, Daniel J; Boehme, Kevin L; Kauwe, John S K; Naj, Adam C; Fardo, David W; Dickson, Dennis W; Montine, Thomas J; Ertekin-Taner, Nilufer; Kaeberlein, Matt R; Crane, Paul K

2017-10-01

We sought to determine whether a systems biology approach may identify novel late-onset Alzheimer's disease (LOAD) loci. We performed gene-wide association analyses and integrated results with human protein-protein interaction data using network analyses. We performed functional validation on novel genes using a transgenic Caenorhabditis elegans Aβ proteotoxicity model and evaluated novel genes using brain expression data from people with LOAD and other neurodegenerative conditions. We identified 13 novel candidate LOAD genes outside chromosome 19. Of those, RNA interference knockdowns of the C. elegans orthologs of UBC, NDUFS3, EGR1, and ATP5H were associated with Aβ toxicity, and NDUFS3, SLC25A11, ATP5H, and APP were differentially expressed in the temporal cortex. Network analyses identified novel LOAD candidate genes. We demonstrated a functional role for four of these in a C. elegans model and found enrichment of differentially expressed genes in the temporal cortex. Copyright © 2017 the Alzheimer's Association. Published by Elsevier Inc. All rights reserved.
Genome co-amplification upregulates a mitotic gene network activity that predicts outcome and response to mitotic protein inhibitors in breast cancer

DOE PAGES

Hu, Zhi; Mao, Jian-Hua; Curtis, Christina; ...

2016-07-01

Background: High mitotic activity is associated with the genesis and progression of many cancers. Small molecule inhibitors of mitotic apparatus proteins are now being developed and evaluated clinically as anticancer agents. With clinical trials of several of these experimental compounds underway, it is important to understand the molecular mechanisms that determine high mitotic activity, identify tumor subtypes that carry molecular aberrations that confer high mitotic activity, and to develop molecular markers that distinguish which tumors will be most responsive to mitotic apparatus inhibitors. Methods: We identified a coordinately regulated mitotic apparatus network by analyzing gene expression profiles for 53 malignant and non-malignant human breast cancer cell lines and two separate primary breast tumor datasets. We defined the mitotic network activity index (MNAI) as the sum of the transcriptional levels of the 54 coordinately regulated mitotic apparatus genes. The effect of those genes on cell growth was evaluated by small interfering RNA (siRNA). Results: High MNAI was enriched in basal-like breast tumors and was associated with reduced survival duration and preferential sensitivity to inhibitors of the mitotic apparatus proteins, polo-like kinase, centromere associated protein E and aurora kinase designated GSK462364, GSK923295 and GSK1070916, respectively. Co-amplification of regions of chromosomes 8q24, 10p15-p12, 12p13, and 17q24-q25 was associated with the transcriptional upregulation of this network of 54 mitotic apparatus genes, and we identify transcription factors that localize to these regions and putatively regulate mitotic activity. Knockdown of the mitotic network by siRNA identified 22 genes that might be considered as additional therapeutic targets for this clinically relevant patient subgroup. Conclusions: We define a molecular signature which may guide therapeutic approaches for tumors with high mitotic network activity.
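The MNAI definition above (a sum of transcriptional levels over the network genes) can be illustrated with a minimal sketch. The gene identifiers and expression values below are hypothetical placeholders, not the 54 genes or the data from the study, and the function name is an assumption introduced only for illustration.

```python
from typing import Dict, Sequence


def mitotic_network_activity_index(
    expression: Dict[str, float], network_genes: Sequence[str]
) -> float:
    """Sum the expression levels of the network genes for one sample.

    `expression` maps gene identifier -> transcriptional level for a single
    sample; `network_genes` lists the genes that make up the network.
    """
    missing = [gene for gene in network_genes if gene not in expression]
    if missing:
        # Refuse to silently compute an index over an incomplete gene set.
        raise KeyError(f"expression values missing for: {missing}")
    return sum(expression[gene] for gene in network_genes)


# Hypothetical three-gene network and one toy sample (illustration only).
toy_network = ["G1", "G2", "G3"]
toy_sample = {"G1": 1.5, "G2": 2.0, "G3": 0.5, "OTHER": 9.0}
toy_index = mitotic_network_activity_index(toy_sample, toy_network)
```

Genes outside the network (here `OTHER`) do not contribute to the index. How expression levels are normalised before summation is not stated in the abstract and is left open in this sketch.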

Genome co-amplification upregulates a mitotic gene network activity that predicts outcome and response to mitotic protein inhibitors in breast cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hu, Zhi; Mao, Jian-Hua; Curtis, Christina

Background: High mitotic activity is associated with the genesis and progression of many cancers. Small molecule inhibitors of mitotic apparatus proteins are now being developed and evaluated clinically as anticancer agents. With clinical trials of several of these experimental compounds underway, it is important to understand the molecular mechanisms that determine high mitotic activity, identify tumor subtypes that carry molecular aberrations that confer high mitotic activity, and to develop molecular markers that distinguish which tumors will be most responsive to mitotic apparatus inhibitors. Methods: We identified a coordinately regulated mitotic apparatus network by analyzing gene expression profiles for 53 malignant and non-malignant human breast cancer cell lines and two separate primary breast tumor datasets. We defined the mitotic network activity index (MNAI) as the sum of the transcriptional levels of the 54 coordinately regulated mitotic apparatus genes. The effect of those genes on cell growth was evaluated by small interfering RNA (siRNA). Results: High MNAI was enriched in basal-like breast tumors and was associated with reduced survival duration and preferential sensitivity to inhibitors of the mitotic apparatus proteins, polo-like kinase, centromere associated protein E and aurora kinase designated GSK462364, GSK923295 and GSK1070916, respectively. Co-amplification of regions of chromosomes 8q24, 10p15-p12, 12p13, and 17q24-q25 was associated with the transcriptional upregulation of this network of 54 mitotic apparatus genes, and we identify transcription factors that localize to these regions and putatively regulate mitotic activity. Knockdown of the mitotic network by siRNA identified 22 genes that might be considered as additional therapeutic targets for this clinically relevant patient subgroup. Conclusions: We define a molecular signature which may guide therapeutic approaches for tumors with high mitotic network activity.
Identifying key genes in glaucoma based on a benchmarked dataset and the gene regulatory network.

PubMed

Chen, Xi; Wang, Qiao-Ling; Zhang, Meng-Hui

2017-10-01

The current study aimed to identify key genes in glaucoma based on a benchmarked dataset and gene regulatory network (GRN). Local and global noise was added to the gene expression dataset to produce a benchmarked dataset. Differentially-expressed genes (DEGs) between patients with glaucoma and normal controls were identified utilizing the Linear Models for Microarray Data (Limma) package based on the benchmarked dataset. A total of 5 GRN inference methods, including Zscore, GeneNet, context likelihood of relatedness (CLR) algorithm, Partial Correlation coefficient with Information Theory (PCIT) and GEne Network Inference with Ensemble of Trees (Genie3) were evaluated using receiver operating characteristic (ROC) and precision and recall (PR) curves. The inference method with the best performance was selected to construct the GRN. Subsequently, topological centrality analysis (degree, closeness and betweenness) was conducted to identify key genes in the GRN of glaucoma. Finally, the key genes were validated by performing reverse transcription-quantitative polymerase chain reaction (RT-qPCR). A total of 176 DEGs were detected from the benchmarked dataset. The ROC and PR curves of the 5 methods were analyzed and it was determined that Genie3 had a clear advantage over the other methods; thus, Genie3 was used to construct the GRN. Following topological centrality analysis, 14 key genes for glaucoma were identified, including IL6, EPHA2 and GSTT1, and 5 of these 14 key genes were validated by RT-qPCR. Therefore, the current study identified 14 key genes in glaucoma, which may be potential biomarkers to use in the diagnosis of glaucoma and aid in identifying the molecular mechanism of this disease.
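The three centrality measures named above (degree, closeness and betweenness) can be sketched for a small undirected, unweighted network. This is a generic illustration that uses Brandes' algorithm for betweenness; it is not the study's pipeline, the toy edges are invented, and all function names are assumptions.

```python
from collections import deque


def build_adjacency(edges):
    """Build a symmetric adjacency map from an undirected edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly linked to."""
    n = len(adj)
    return {u: len(adj[u]) / (n - 1) for u in adj}


def closeness_centrality(adj):
    """Reachable-node count divided by the summed BFS distances to them."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def betweenness_centrality(adj):
    """Unnormalised betweenness via Brandes' algorithm (unweighted graph)."""
    bc = {u: 0.0 for u in adj}
    for s in adj:
        order = []                      # nodes in non-decreasing distance
        preds = {u: [] for u in adj}    # shortest-path predecessors
        sigma = {u: 0 for u in adj}     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {u: 0.0 for u in adj}
        for w in reversed(order):       # accumulate dependencies backwards
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {u: value / 2 for u, value in bc.items()}


# Toy network (invented): hub H linked to A, B, C; C also linked to D.
toy_adj = build_adjacency([("H", "A"), ("H", "B"), ("H", "C"), ("C", "D")])
toy_betweenness = betweenness_centrality(toy_adj)
ranked_genes = sorted(toy_adj, key=lambda g: toy_betweenness[g], reverse=True)
```

In this toy network the hub `H` ranks first on all three measures. How the study combined the three measures into a single list of key genes is not described in the abstract, so the ranking by betweenness alone is only one possible choice.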
Trainable Gene Regulation Networks with Applications to Drosophila Pattern Formation

NASA Technical Reports Server (NTRS)

Mjolsness, Eric

2000-01-01

This chapter will very briefly introduce and review some computational experiments in using trainable gene regulation network models to simulate and understand selected episodes in the development of the fruit fly, Drosophila melanogaster. For details the reader is referred to the papers introduced below. It will then introduce a new gene regulation network model which can describe promoter-level substructure in gene regulation. As described in chapter 2, gene regulation may be thought of as a combination of cis-acting regulation by the extended promoter of a gene (including all regulatory sequences) by way of the transcription complex, and of trans-acting regulation by the transcription factor products of other genes. If we simplify the cis-action by using a phenomenological model which can be tuned to data, such as a unit or other small portion of an artificial neural network, then the full transacting interaction between multiple genes during development can be modelled as a larger network which can again be tuned or trained to data. The larger network will in general need to have recurrent (feedback) connections since at least some real gene regulation networks do. This is the basic modeling approach taken, which describes how a set of recurrent neural networks can be used as a modeling language for multiple developmental processes including gene regulation within a single cell, cell-cell communication, and cell division. Such network models have been called "gene circuits", "gene regulation networks", or "genetic regulatory networks", sometimes without distinguishing the models from the actual modeled systems.
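A recurrent network model of this kind can be sketched numerically. The abstract does not give the governing equation, so the form used below is an assumption: each gene product v_i follows dv_i/dt = g(sum_j T_ij v_j + h_i) - decay * v_i with a logistic g, integrated by forward Euler. The weights, biases and function names are illustrative and are not taken from the chapter.

```python
import math


def sigmoid(x):
    """Logistic squashing function, standing in for the tuned cis-action."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_gene_circuit(weights, bias, v0, decay=1.0, dt=0.01, steps=2000):
    """Forward-Euler integration of an assumed recurrent gene circuit.

    weights[i][j] is the regulatory effect of gene j on gene i (negative
    values repress), bias[i] is a constant input to gene i, and v0 is the
    initial expression state.
    """
    v = list(v0)
    n = len(v)
    for _ in range(steps):
        inputs = [
            sum(weights[i][j] * v[j] for j in range(n)) + bias[i]
            for i in range(n)
        ]
        v = [v[i] + dt * (sigmoid(inputs[i]) - decay * v[i]) for i in range(n)]
    return v


# Two mutually repressing genes (a toggle-like circuit, invented parameters).
toggle_weights = [[0.0, -10.0], [-10.0, 0.0]]
toggle_bias = [5.0, 5.0]
state_a = simulate_gene_circuit(toggle_weights, toggle_bias, [0.9, 0.1])
state_b = simulate_gene_circuit(toggle_weights, toggle_bias, [0.1, 0.9])
```

With these parameters the feedback connections make the circuit bistable: whichever gene starts high stays high and holds the other low. In a trainable setting, `weights` and `bias` would be fitted to expression data rather than set by hand.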
Robust Learning of High-dimensional Biological Networks with Bayesian Networks

NASA Astrophysics Data System (ADS)

Nägele, Andreas; Dejori, Mathäus; Stetter, Martin

Structure learning of Bayesian networks applied to gene expression data has become a potentially useful method to estimate interactions between genes. However, the NP-hardness of Bayesian network structure learning renders the reconstruction of the full genetic network with thousands of genes unfeasible. Consequently, the maximal network size is usually restricted dramatically to a small set of genes (corresponding with variables in the Bayesian network). Although this feature reduction step makes structure learning computationally tractable, on the downside, the learned structure might be adversely affected due to the introduction of missing genes. Additionally, gene expression data are usually very sparse with respect to the number of samples, i.e., the number of genes is much greater than the number of different observations. Given these problems, learning robust network features from microarray data is a challenging task. This chapter presents several approaches tackling the robustness issue in order to obtain a more reliable estimation of learned network features.
Population Connectivity Measures of Fishery-Targeted Coral Reef Species to Inform Marine Reserve Network Design in Fiji

PubMed Central

Eastwood, Erin K.; López, Elora H.; Drew, Joshua A.

2016-01-01

Coral reef fish serve as food sources to coastal communities worldwide, yet are vulnerable to mounting anthropogenic pressures like overfishing and climate change. Marine reserve networks have become important tools for mitigating these pressures, and one of the most critical factors in determining their spatial design is the degree of connectivity among different populations of species prioritized for protection. To help inform the spatial design of an expanded reserve network in Fiji, we used rapidly evolving mitochondrial genes to investigate connectivity patterns of three coral reef species targeted by fisheries in Fiji: Epinephelus merra (Serranidae), Halichoeres trimaculatus (Labridae), and Holothuria atra (Holothuriidae). The two fish species, E. merra and Ha. trimaculatus, exhibited low genetic structuring and high amounts of gene flow, whereas the sea cucumber Ho. atra displayed high genetic partitioning and predominantly westward gene flow. The idiosyncratic patterns observed among these species indicate that patterns of connectivity in Fiji are likely determined by a combination of oceanographic and ecological characteristics. Our data indicate that in the cases of species with high connectivity, other factors such as representation or political availability may dictate where reserves are placed. In low connectivity species, ensuring upstream and downstream connections is critical. PMID:26805954
A hybrid network-based method for the detection of disease-related genes

NASA Astrophysics Data System (ADS)

Cui, Ying; Cai, Meng; Dai, Yang; Stanley, H. Eugene

2018-02-01

Detecting disease-related genes is crucial in disease diagnosis and drug design. The accepted view is that neighbors of a disease-causing gene in a molecular network tend to cause the same or similar diseases, and network-based methods have been recently developed to identify novel hereditary disease-genes in available biomedical networks. Despite the steady increase in the discovery of disease-associated genes, there is still a large fraction of disease genes that remains under the tip of the iceberg. In this paper we exploit the topological properties of the protein-protein interaction (PPI) network to detect disease-related genes. We compute, analyze, and compare the topological properties of disease genes with non-disease genes in PPI networks. We also design an improved random forest classifier based on these network topological features, and a cross-validation test confirms that our method performs better than previous similar studies.
Differential DNA methylation marks and gene comethylation of COPD in African-Americans with COPD exacerbations.

PubMed

Busch, Robert; Qiu, Weiliang; Lasky-Su, Jessica; Morrow, Jarrett; Criner, Gerard; DeMeo, Dawn

2016-11-05

Chronic obstructive pulmonary disease (COPD) is the third-leading cause of death worldwide. Identifying COPD-associated DNA methylation marks in African-Americans may contribute to our understanding of racial disparities in COPD susceptibility. We determined differentially methylated genes and co-methylation network modules associated with COPD in African-Americans recruited during exacerbations of COPD and smoking controls from the Pennsylvania Study of Chronic Obstructive Pulmonary Exacerbations (PA-SCOPE) cohort. We assessed DNA methylation from whole blood samples in 362 African-American smokers in the PA-SCOPE cohort using the Illumina Infinium HumanMethylation27 BeadChip Array. Final analysis included 19302 CpG probes annotated to the nearest gene transcript after quality control. We tested methylation associations with COPD case-control status using mixed linear models. Weighted gene comethylation networks were constructed using weighted gene coexpression network analysis (WGCNA) and network modules were analyzed for association with COPD. There were five differentially methylated CpG probes significantly associated with COPD among African-Americans at an FDR less than 5%, and seven additional probes that approached significance at an FDR less than 10%. The top ranked gene association was MAML1, which has been shown to affect NOTCH-dependent angiogenesis in murine lung. Network modeling yielded the "yellow" and "blue" comethylation modules which were significantly associated with COPD (p-value 4 × 10^-10 and 4 × 10^-9, respectively). The yellow module was enriched for gene sets related to inflammatory pathways known to be relevant to COPD. 
The blue module contained the top ranked genes in the concurrent differential methylation analysis (FXYD1/LGI4, gene significance p-value 1.2 × 10^-26; MAML1, p-value 2.0 × 10^-26; CD72, p-value 2.1 × 10^-25; and LPO, p-value 7.2 × 10^-25), and was significantly associated with lung development processes in Gene Ontology gene-set enrichment analysis. We identified 12 differentially methylated CpG sites associated with COPD that mapped to biologically plausible genes. Network module comethylation patterns have identified candidate genes that may be contributing to racial differences in COPD susceptibility and severity. COPD-associated comethylation modules contained genes previously associated with lung disease and inflammation and recapitulated known COPD-associated genes. The genes implicated by differential methylation and WGCNA analysis may provide mechanistic targets contributing to COPD susceptibility, exacerbations, and outcomes among African-Americans. Trial Registration: NCT00774176, Registry: ClinicalTrials.gov, URL: www.clinicaltrials.gov, Date of Enrollment of First Participant: June 2004, Date Registered: 04 January 2008 (retrospectively registered).
Gene network biological validity based on gene-gene interaction relevance.

PubMed

Gómez-Vela, Francisco; Díaz-Díaz, Norberto

2014-01-01

In recent years, gene networks have become one of the most useful tools for modeling biological processes. Many inference gene network algorithms have been developed as techniques for extracting knowledge from gene expression data. Ensuring the reliability of the inferred gene relationships is a crucial task in any study in order to prove that the algorithms used are precise. Usually, this validation process can be carried out using prior biological knowledge. The metabolic pathways stored in KEGG are one of the most widely used knowledgeable sources for analyzing relationships between genes. This paper introduces a new methodology, GeneNetVal, to assess the biological validity of gene networks based on the relevance of the gene-gene interactions stored in KEGG metabolic pathways. Hence, a complete KEGG pathway conversion into a gene association network and a new matching distance based on gene-gene interaction relevance are proposed. The performance of GeneNetVal was established with three different experiments. Firstly, our proposal is tested in a comparative ROC analysis. Secondly, a randomness study is presented to show the behavior of GeneNetVal when the noise is increased in the input network. Finally, the ability of GeneNetVal to detect biological functionality of the network is shown.
APG: an Active Protein-Gene network model to quantify regulatory signals in complex biological systems.

PubMed

Wang, Jiguang; Sun, Yidan; Zheng, Si; Zhang, Xiang-Sun; Zhou, Huarong; Chen, Luonan

2013-01-01

Synergistic interactions among transcription factors (TFs) and their cofactors collectively determine gene expression in complex biological systems. In this work, we develop a novel graphical model, called Active Protein-Gene (APG) network model, to quantify regulatory signals of transcription in complex biomolecular networks through integrating both TF upstream-regulation and downstream-regulation high-throughput data. Firstly, we theoretically and computationally demonstrate the effectiveness of APG by comparing with the traditional strategy based only on TF downstream-regulation information. We then apply this model to study spontaneous type 2 diabetic Goto-Kakizaki (GK) and Wistar control rats. Our biological experiments validate the theoretical results. In particular, SP1 is found to be a hidden TF with changed regulatory activity, and the loss of SP1 activity contributes to the increased glucose production during diabetes development. APG model provides theoretical basis to quantitatively elucidate transcriptional regulation by modelling TF combinatorial interactions and exploiting multilevel high-throughput information.
APG: an Active Protein-Gene Network Model to Quantify Regulatory Signals in Complex Biological Systems

PubMed Central

Wang, Jiguang; Sun, Yidan; Zheng, Si; Zhang, Xiang-Sun; Zhou, Huarong; Chen, Luonan

2013-01-01

Synergistic interactions among transcription factors (TFs) and their cofactors collectively determine gene expression in complex biological systems. In this work, we develop a novel graphical model, called Active Protein-Gene (APG) network model, to quantify regulatory signals of transcription in complex biomolecular networks through integrating both TF upstream-regulation and downstream-regulation high-throughput data. Firstly, we theoretically and computationally demonstrate the effectiveness of APG by comparing with the traditional strategy based only on TF downstream-regulation information. We then apply this model to study spontaneous type 2 diabetic Goto-Kakizaki (GK) and Wistar control rats. Our biological experiments validate the theoretical results. In particular, SP1 is found to be a hidden TF with changed regulatory activity, and the loss of SP1 activity contributes to the increased glucose production during diabetes development. APG model provides theoretical basis to quantitatively elucidate transcriptional regulation by modelling TF combinatorial interactions and exploiting multilevel high-throughput information. PMID:23346354
METscout: a pathfinder exploring the landscape of metabolites, enzymes and transporters.

PubMed

Geffers, Lars; Tetzlaff, Benjamin; Cui, Xiao; Yan, Jun; Eichele, Gregor

2013-01-01

METscout (http://metscout.mpg.de) brings together metabolism and gene expression landscapes. It is a MySQL relational database linking biochemical pathway information with 3D patterns of gene expression determined by robotic in situ hybridization in the E14.5 mouse embryo. The sites of expression of ∼1500 metabolic enzymes and of ∼350 solute carriers (SLCs) were included and are accessible as single cell resolution images and in the form of semi-quantitative image abstractions. METscout provides several graphical web-interfaces allowing navigation through complex anatomical and metabolic information. Specifically, the database shows where in the organism each of the many metabolic reactions takes place and where SLCs transport metabolites. To link enzymatic reactions and transport, the KEGG metabolic reaction network was extended to include metabolite transport. This network, in conjunction with the spatial expression patterns of the network genes, allows metabolic reactions and transport processes to be traced across the entire body of the embryo.
Systematic Evaluation of Molecular Networks for Discovery of Disease Genes. | Office of Cancer Genomics

Cancer.gov

Gene networks are rapidly growing in size and number, raising the question of which networks are most appropriate for particular applications. Here, we evaluate 21 human genome-wide interaction networks for their ability to recover 446 disease gene sets identified through literature curation, gene expression profiling, or genome-wide association studies. While all networks have some ability to recover disease genes, we observe a wide range of performance with STRING, ConsensusPathDB, and GIANT networks having the best performance overall.
SiBIC: a web server for generating gene set networks based on biclusters obtained by maximal frequent itemset mining.

PubMed

Takahashi, Kei-ichiro; Takigawa, Ichigaku; Mamitsuka, Hiroshi

2013-01-01

Detecting biclusters from expression data is useful, since biclusters are sets of genes coexpressed under only part of all given experimental conditions. We present software called SiBIC, which, from a given expression dataset, first exhaustively enumerates biclusters. These are then merged into rather independent biclusters, which are finally used to generate gene set networks, in which a gene set assigned to one node has coexpressed genes. We evaluated each step of this procedure: 1) significance of the generated biclusters biologically and statistically, 2) biological quality of merged biclusters, and 3) biological significance of gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp.
A novel strategy of integrated microarray analysis identifies CENPA, CDK1 and CDC20 as a cluster of diagnostic biomarkers in lung adenocarcinoma.

PubMed

Liu, Wan-Ting; Wang, Yang; Zhang, Jing; Ye, Fei; Huang, Xiao-Hui; Li, Bin; He, Qing-Yu

2018-07-01

Lung adenocarcinoma (LAC) is the most lethal cancer and the leading cause of cancer-related death worldwide. The identification of meaningful clusters of co-expressed genes or representative biomarkers may help improve the accuracy of LAC diagnoses. Public databases, such as the Gene Expression Omnibus (GEO), provide rich resources of valuable information for clinics; however, the integration of multiple microarray datasets from various platforms and institutes has remained a challenge. To determine potential indicators of LAC, we performed genome-wide relative significance (GWRS), genome-wide global significance (GWGS) and support vector machine (SVM) analyses progressively to identify robust gene biomarker signatures from 5 different microarray datasets that included 330 samples. The top 200 genes with robust signatures were selected for integrative analysis according to "guilt-by-association" methods, including protein-protein interaction (PPI) analysis and gene co-expression analysis. Of these 200 genes, only 10 genes showed both an intensive PPI network and high gene co-expression correlation (r > 0.8). IPA analysis of these regulatory networks suggested that the cell cycle process is a crucial determinant of LAC. CENPA, as well as two linked hub genes CDK1 and CDC20, are determined to be potential indicators of LAC. Immunohistochemical staining showed that CENPA, CDK1 and CDC20 were highly expressed in LAC cancer tissue with co-expression patterns. A Cox regression model indicated that LAC patients with CENPA+/CDK1+ and CENPA+/CDC20+ were high-risk groups in terms of overall survival. In conclusion, our integrated microarray analysis demonstrated that CENPA, CDK1 and CDC20 might serve as a novel cluster of prognostic biomarkers for LAC, and the cooperative unit of three genes provides a technically simple approach for identification of LAC patients. Copyright © 2018 Elsevier B.V. All rights reserved.
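The co-expression criterion above (correlation r > 0.8) can be illustrated with a small sketch, assuming that r denotes the Pearson correlation between two genes' expression profiles across samples; the abstract does not name the correlation measure. The gene labels and profiles are invented, and the PPI side of the analysis is not modelled here.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpressed_pairs(profiles, threshold=0.8):
    """Return gene pairs whose profile correlation exceeds the threshold."""
    return [
        (g1, g2)
        for g1, g2 in combinations(sorted(profiles), 2)
        if pearson(profiles[g1], profiles[g2]) > threshold
    ]


# Invented profiles over four samples (illustration only).
toy_profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
toy_pairs = coexpressed_pairs(toy_profiles)
```

Only `g1` and `g2` rise together in the toy data, so they form the single retained pair; `g3` is anticorrelated with both and is excluded.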
Diurnal Transcriptome and Gene Network Represented through Sparse Modeling in Brachypodium distachyon.

PubMed

Koda, Satoru; Onda, Yoshihiko; Matsui, Hidetoshi; Takahagi, Kotaro; Yamaguchi-Uehara, Yukiko; Shimizu, Minami; Inoue, Komaki; Yoshida, Takuhiro; Sakurai, Tetsuya; Honda, Hiroshi; Eguchi, Shinto; Nishii, Ryuei; Mochida, Keiichi

2017-01-01

We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX) model with a group smoothly clipped absolute deviation (SCAD) method using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals using three biological replications, and identified 3,621 periodic genes through our wavelet analysis. The expression data are feasible to infer network sparsity based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, and post-transcriptional modification and photosynthesis are significantly enriched in the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified putative transcription factors encoding genes that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that the networks represent typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, demonstrating the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.
Mechanisms and Evolution of Control Logic in Prokaryotic Transcriptional Regulation

PubMed Central

van Hijum, Sacha A. F. T.; Medema, Marnix H.; Kuipers, Oscar P.

2009-01-01

Summary: A major part of organismal complexity and versatility of prokaryotes resides in their ability to fine-tune gene expression to adequately respond to internal and external stimuli. Evolution has been very innovative in creating intricate mechanisms by which different regulatory signals operate and interact at promoters to drive gene expression. The regulation of target gene expression by transcription factors (TFs) is governed by control logic brought about by the interaction of regulators with TF binding sites (TFBSs) in cis-regulatory regions. A factor that in large part determines the strength of the response of a target to a given TF is motif stringency, the extent to which the TFBS fits the optimal TFBS sequence for a given TF. Advances in high-throughput technologies and computational genomics allow reconstruction of transcriptional regulatory networks in silico. To optimize the prediction of transcriptional regulatory networks, i.e., to separate direct regulation from indirect regulation, a thorough understanding of the control logic underlying the regulation of gene expression is required. This review summarizes the state of the art of the elements that determine the functionality of TFBSs by focusing on the molecular biological mechanisms and evolutionary origins of cis-regulatory regions. PMID:19721087
Degrees of separation as a statistical tool for evaluating candidate genes.

PubMed

Nelson, Ronald M; Pettersson, Mats E

2014-12-01

Selection of candidate genes is an important step in the exploration of complex genetic architecture. The number of gene networks available is increasing and these can provide information to help with candidate gene selection. It is currently common to use the degree of connectedness in gene networks as validation in Genome Wide Association (GWA) and Quantitative Trait Locus (QTL) mapping studies. However, this can produce misleading results if not validated properly. Here we present a method and tool for validating the gene pairs from GWA studies given the context of the network they co-occur in. It ensures that proposed interactions and gene associations are not statistical artefacts inherent to the specific gene network architecture. The CandidateBacon package provides an easy and efficient method to calculate the average degree of separation (DoS) between pairs of genes in currently available gene networks. We show how these empirical estimates of average connectedness are used to validate candidate gene pairs. Validation of interacting genes by comparing their connectedness with the average connectedness in the gene network will provide support for said interactions by utilising the growing amount of gene network information available. Copyright © 2014 Elsevier Ltd. All rights reserved.
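The degree-of-separation idea in the abstract above can be illustrated with a breadth-first search. The following is only a minimal sketch under stated assumptions: it is not the CandidateBacon API, the function names `degrees_of_separation` and `average_dos` are hypothetical, and the five-gene network is invented toy data.

```python
from collections import deque
from itertools import combinations


def degrees_of_separation(adj, source):
    """Breadth-first search: shortest path length from `source` to every reachable gene."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def average_dos(adj, genes=None):
    """Mean shortest-path length over all connected pairs drawn from `genes`."""
    genes = sorted(adj) if genes is None else sorted(genes)
    lengths = []
    for a, b in combinations(genes, 2):
        d = degrees_of_separation(adj, a).get(b)
        if d is not None:  # skip pairs in different components
            lengths.append(d)
    return sum(lengths) / len(lengths) if lengths else float("nan")


# Hypothetical toy network (undirected), stored as adjacency sets.
toy = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

network_mean = average_dos(toy)                        # background connectedness
candidate_dos = degrees_of_separation(toy, "A")["B"]   # DoS of one candidate pair
```

A candidate pair whose DoS is well below the network-wide mean is more closely connected than a typical pair; how that difference is tested statistically is left to the tool described in the abstract.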
An Arabidopsis gene regulatory network for secondary cell wall synthesis

DOE PAGES

Taylor-Teeples, M.; Lin, L.; de Lucas, M.; ...

2014-12-24

The plant cell wall is an important factor for determining cell shape, function and response to the environment. Secondary cell walls, such as those found in xylem, are composed of cellulose, hemicelluloses and lignin and account for the bulk of plant biomass. The coordination between transcriptional regulation of synthesis for each polymer is complex and vital to cell function. A regulatory hierarchy of developmental switches has been proposed, although the full complement of regulators remains unknown. In this paper, we present a protein–DNA network between Arabidopsis thaliana transcription factors and secondary cell wall metabolic genes with gene expression regulated by a series of feed-forward loops. This model allowed us to develop and validate new hypotheses about secondary wall gene regulation under abiotic stress. Distinct stresses are able to perturb targeted genes to potentially promote functional adaptation. Finally, these interactions will serve as a foundation for understanding the regulation of a complex, integral plant component.
Incorporating networks in a probabilistic graphical model to find drivers for complex human diseases.

PubMed

Mezlini, Aziz M; Goldenberg, Anna

2017-10-01

Discovering genetic mechanisms driving complex diseases is a hard problem. Existing methods often lack power to identify the set of responsible genes. Protein-protein interaction networks have been shown to boost power when detecting gene-disease associations. We introduce a Bayesian framework, Conflux, to find disease associated genes from exome sequencing data using networks as a prior. There are two main advantages to using networks within a probabilistic graphical model. First, networks are noisy and incomplete, a substantial impediment to gene discovery. Incorporating networks into the structure of a probabilistic model for gene inference has less impact on the solution than relying on the noisy network structure directly. Second, using a Bayesian framework we can keep track of the uncertainty of each gene being associated with the phenotype rather than returning a fixed list of genes. We first show that using networks clearly improves gene detection compared to individual gene testing. We then show consistently improved performance of Conflux compared to the state-of-the-art diffusion network-based method Hotnet2 and a variety of other network and variant aggregation methods, using randomly generated and literature-reported gene sets. We test Hotnet2 and Conflux on several network configurations to reveal biases and patterns of false positives and false negatives in each case. Our experiments show that our novel Bayesian framework Conflux incorporates many of the advantages of the current state-of-the-art methods, while offering more flexibility and improved power in many gene-disease association scenarios.
Discovering disease-associated genes in weighted protein-protein interaction networks

NASA Astrophysics Data System (ADS)

Cui, Ying; Cai, Meng; Stanley, H. Eugene

2018-04-01

Although there have been many network-based attempts to discover disease-associated genes, most of them have not taken edge weight - which quantifies their relative strength - into consideration. We use connection weights in a protein-protein interaction (PPI) network to locate disease-related genes. We analyze the topological properties of both weighted and unweighted PPI networks and design an improved random forest classifier to distinguish disease genes from non-disease genes. We use a cross-validation test to confirm that weighted networks are better able to discover disease-associated genes than unweighted networks, which indicates that including link weight in the analysis of network properties provides a better model of complex genotype-phenotype associations.

The transfer and transformation of collective network information in gene-matched networks.

PubMed

Kitsukawa, Takashi; Yagi, Takeshi

2015-10-09

Networks, such as the human society network, social and professional networks, and biological system networks, contain vast amounts of information. Information signals in networks are distributed over nodes and transmitted through intricately wired links, making the transfer and transformation of such information difficult to follow. Here we introduce a novel method for describing network information and its transfer using a model network, the Gene-matched network (GMN), in which nodes (neurons) possess attributes (genes). In the GMN, nodes are connected according to their expression of common genes. Because neurons have multiple genes, the GMN is cluster-rich. We show that, in the GMN, information transfer and transformation were controlled systematically, according to the activity level of the network. Furthermore, information transfer and transformation could be traced numerically with a vector using genes expressed in the activated neurons, the active-gene array, which was used to assess the relative activity among overlapping neuronal groups. Interestingly, this coding style closely resembles the cell-assembly neural coding theory. The method introduced here could be applied to many real-world networks, since many systems, including human society and various biological systems, can be represented as a network of this type.
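The construction rule stated above (nodes are linked when they express common genes) can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: the names `build_gmn` and `active_gene_array`, the `min_shared` threshold, and the toy neurons are hypothetical, and reading the active-gene array as a per-gene tally over active neurons is one possible interpretation of the abstract.

```python
from itertools import combinations


def build_gmn(node_genes, min_shared=1):
    """Link two nodes when they express at least `min_shared` genes in common.

    Returns a dict mapping each linked pair to the set of shared genes.
    """
    edges = {}
    for a, b in combinations(sorted(node_genes), 2):
        shared = node_genes[a] & node_genes[b]
        if len(shared) >= min_shared:
            edges[(a, b)] = shared
    return edges


def active_gene_array(node_genes, active_nodes):
    """Tally how many of the currently active nodes express each gene
    (an assumed reading of the 'active-gene array')."""
    counts = {}
    for node in active_nodes:
        for gene in node_genes[node]:
            counts[gene] = counts.get(gene, 0) + 1
    return counts


# Hypothetical neurons and the genes each one expresses.
neurons = {
    "n1": {"g1", "g2"},
    "n2": {"g2", "g3"},
    "n3": {"g3", "g4"},
    "n4": {"g5"},
}

gmn_edges = build_gmn(neurons)
activity = active_gene_array(neurons, ["n1", "n2"])
```

Because every shared gene creates a clique among the nodes that express it, a network built this way tends to be cluster-rich, which matches the property noted in the abstract.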
Information-dependent enrichment analysis reveals time-dependent transcriptional regulation of the estrogen pathway of toxicity.

PubMed

Pendse, Salil N; Maertens, Alexandra; Rosenberg, Michael; Roy, Dipanwita; Fasani, Rick A; Vantangoli, Marguerite M; Madnick, Samantha J; Boekelheide, Kim; Fornace, Albert J; Odwin, Shelly-Ann; Yager, James D; Hartung, Thomas; Andersen, Melvin E; McMullen, Patrick D

2017-04-01

The twenty-first century vision for toxicology involves a transition away from high-dose animal studies to in vitro and computational models (NRC in Toxicity testing in the 21st century: a vision and a strategy, The National Academies Press, Washington, DC, 2007). This transition requires mapping pathways of toxicity by understanding how in vitro systems respond to chemical perturbation. Uncovering transcription factors/signaling networks responsible for gene expression patterns is essential for defining pathways of toxicity, and ultimately, for determining the chemical modes of action through which a toxicant acts. Traditionally, transcription factor identification is achieved via chromatin immunoprecipitation studies and summarized by calculating which transcription factors are statistically associated with up- and downregulated genes. These lists are commonly determined via statistical or fold-change cutoffs, a procedure that is sensitive to statistical power and may not be as useful for determining transcription factor associations. To move away from an arbitrary statistical or fold-change-based cutoff, we developed, in the context of the Mapping the Human Toxome project, an enrichment paradigm called information-dependent enrichment analysis (IDEA) to guide identification of the transcription factor network. We used a test case of activation in MCF-7 cells by 17β estradiol (E2). Using this new approach, we established a time course for transcriptional and functional responses to E2. ERα and ERβ were associated with short-term transcriptional changes in response to E2. Sustained exposure led to recruitment of additional transcription factors and alteration of cell cycle machinery. TFAP2C and SOX2 were the transcription factors most highly correlated with dose. E2F7, E2F1, and Foxm1, which are involved in cell proliferation, were enriched only at 24 h. IDEA should be useful for identifying candidate pathways of toxicity. 
IDEA outperforms gene set enrichment analysis (GSEA) and provides similar results to weighted gene correlation network analysis, a platform that helps to identify genes not annotated to pathways.
Characterizing gene sets using discriminative random walks with restart on heterogeneous biological networks.

PubMed

Blatti, Charles; Sinha, Saurabh

2016-07-15

Analysis of co-expressed gene sets typically involves testing for enrichment of different annotations or 'properties' such as biological processes, pathways, transcription factor binding sites, etc., one property at a time. This common approach ignores any known relationships among the properties or the genes themselves. It is believed that known biological relationships among genes and their many properties may be exploited to more accurately reveal commonalities of a gene set. Previous work has sought to achieve this by building biological networks that combine multiple types of gene-gene or gene-property relationships, and performing network analysis to identify other genes and properties most relevant to a given gene set. Most existing network-based approaches for recognizing genes or annotations relevant to a given gene set collapse information about different properties to simplify (homogenize) the networks. We present a network-based method for ranking genes or properties related to a given gene set. Such related genes or properties are identified from among the nodes of a large, heterogeneous network of biological information. Our method involves a random walk with restarts, performed on an initial network with multiple node and edge types that preserve more of the original, specific property information than current methods that operate on homogeneous networks. In this first stage of our algorithm, we find the properties that are the most relevant to the given gene set and extract a subnetwork of the original network, comprising only these relevant properties. We then re-rank genes by their similarity to the given gene set, based on a second random walk with restarts, performed on the above subnetwork. We demonstrate the effectiveness of this algorithm for ranking genes related to Drosophila embryonic development and aggressive responses in the brains of social animals. DRaWR was implemented as an R package available at veda.cs.illinois.edu/DRaWR. 
Contact: blatti@illinois.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
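The core operation named in the abstract above, a random walk with restart, can be sketched as a power iteration. The code below is a generic single-stage illustration and not the DRaWR package: the function name, the restart probability of 0.3, the handling of isolated nodes, and the toy gene and property nodes are all assumptions, and the second stage (re-ranking on an extracted subnetwork) is not shown.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Stationary visiting probabilities of a walker that, at every step,
    jumps back to a seed node with probability `restart`.

    `adj` maps each node to a dict {neighbour: edge weight}.
    """
    nodes = sorted(adj)
    seeds = sorted(seeds)
    p0 = {u: 0.0 for u in nodes}
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            total = sum(adj[u].values())
            if total == 0:
                # Assumption: a node without edges returns its mass to the seeds.
                for s in seeds:
                    nxt[s] += (1 - restart) * p[u] / len(seeds)
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / total
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical heterogeneous network: gene nodes and property nodes.
toy = {
    "geneA": {"prop:pathway1": 1.0},
    "geneB": {"prop:pathway1": 1.0, "prop:pathway2": 1.0},
    "geneC": {"prop:pathway2": 1.0},
    "prop:pathway1": {"geneA": 1.0, "geneB": 1.0},
    "prop:pathway2": {"geneB": 1.0, "geneC": 1.0},
}

scores = random_walk_with_restart(toy, {"geneA"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy network geneB shares a property node with the query gene and therefore scores higher than geneC, which is reachable only through a second property.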
Reverse engineering biological networks: applications in immune responses to bio-toxins.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Martino, Anthony A.; Sinclair, Michael B.; Davidson, George S.

Our aim is to determine the network of events, or the regulatory network, that defines an immune response to a bio-toxin. As a model system, we are studying the T cell regulatory network triggered through tyrosine kinase receptor activation using a combination of pathway stimulation and time-series microarray experiments. Our approach is composed of five steps: (1) microarray experiments and data error analysis, (2) data clustering, (3) data smoothing and discretization, (4) network reverse engineering, and (5) network dynamics analysis and fingerprint identification. The technological outcome of this study is a suite of experimental protocols and computational tools that reverse engineer regulatory networks given gene expression data. The practical biological outcome of this work is an immune response fingerprint in terms of gene expression levels. Inferring regulatory networks from microarray data is a new field of investigation that is no more than five years old. To the best of our knowledge, this work is the first attempt that integrates experiments, error analyses, data clustering, inference, and network analysis to solve a practical problem. Our systematic approach of counting, enumeration, and sampling networks matching experimental data is new to the field of network reverse engineering. The resulting mathematical analyses and computational tools lead to new results on their own and should be useful to others who analyze and infer networks.
Plasticity of gene-regulatory networks controlling sex determination: of masters, slaves, usual suspects, newcomers, and usurpators.

PubMed

Herpin, Amaury; Schartl, Manfred

2015-10-01

Sexual dimorphism is one of the most pervasive and diverse features of animal morphology, physiology, and behavior. Despite the generality of the phenomenon itself, the mechanisms controlling how sex is determined differ considerably among various organismic groups, have evolved repeatedly and independently, and the underlying molecular pathways can change quickly during evolution. Even within closely related groups of organisms for which the development of gonads on the morphological, histological, and cell biological level is undistinguishable, the molecular control and the regulation of the factors involved in sex determination and gonad differentiation can be substantially different. The biological meaning of the high molecular plasticity of an otherwise common developmental program is unknown. While comparative studies suggest that the downstream effectors of sex-determining pathways tend to be more stable than the triggering mechanisms at the top, it is still unclear how conserved the downstream networks are and how all components work together. After many years of stasis, when the molecular basis of sex determination was amenable only in the few classical model organisms (fly, worm, mouse), recently, sex-determining genes from several animal species have been identified and new studies have elucidated some novel regulatory interactions and biological functions of the downstream network, particularly in vertebrates. These data have considerably changed our classical perception of a simple linear developmental cascade that makes the decision for the embryo to develop as male or female, and how it evolves. © 2015 The Authors.
Mapping Gene Associations in Human Mitochondria using Clinical Disease Phenotypes

PubMed Central

Scharfe, Curt; Lu, Henry Horng-Shing; Neuenburg, Jutta K.; Allen, Edward A.; Li, Guan-Cheng; Klopstock, Thomas; Cowan, Tina M.; Enns, Gregory M.; Davis, Ronald W.

2009-01-01

Nuclear genes encode most mitochondrial proteins, and their mutations cause diverse and debilitating clinical disorders. To date, 1,200 of these mitochondrial genes have been recorded, while no standardized catalog exists of the associated clinical phenotypes. Such a catalog would be useful to develop methods to analyze human phenotypic data, to determine genotype-phenotype relations among many genes and diseases, and to support the clinical diagnosis of mitochondrial disorders. Here we establish a clinical phenotype catalog of 174 mitochondrial disease genes and study associations of diseases and genes. Phenotypic features such as clinical signs and symptoms were manually annotated from full-text medical articles and classified based on the hierarchical MeSH ontology. This classification of phenotypic features of each gene allowed for the comparison of diseases between different genes. In turn, we were then able to measure the phenotypic associations of disease genes for which we calculated a quantitative value that is based on their shared phenotypic features. The results showed that genes sharing more similar phenotypes have a stronger tendency for functional interactions, proving the usefulness of phenotype similarity values in disease gene network analysis. We then constructed a functional network of mitochondrial genes and discovered a higher connectivity for non-disease than for disease genes, and a tendency of disease genes to interact with each other. Utilizing these differences, we propose 168 candidate genes that resemble the characteristic interaction patterns of mitochondrial disease genes. Through their network associations, the candidates are further prioritized for the study of specific disorders such as optic neuropathies and Parkinson disease. 
Most mitochondrial disease phenotypes involve several clinical categories including neurologic, metabolic, and gastrointestinal disorders, which might indicate the effects of gene defects within the mitochondrial system. The accompanying knowledgebase (http://www.mitophenome.org/) supports the study of clinical diseases and associated genes. PMID:19390613
The Orphan Disease Networks

PubMed Central

Zhang, Minlu; Zhu, Cheng; Jacomy, Alexis; Lu, Long J.; Jegga, Anil G.

2011-01-01

The low prevalence rate of orphan diseases (OD) requires special combined efforts to improve diagnosis, prevention, and discovery of novel therapeutic strategies. To identify and investigate relationships based on shared genes or shared functional features, we have conducted a bioinformatic-based global analysis of all orphan diseases with known disease-causing mutant genes. Starting with a bipartite network of known OD and OD-causing mutant genes and using the human protein interactome, we first construct and topologically analyze three networks: the orphan disease network, the orphan disease-causing mutant gene network, and the orphan disease-causing mutant gene interactome. Our results demonstrate that in contrast to the common disease-causing mutant genes that are predominantly nonessential, a majority of orphan disease-causing mutant genes are essential. In confirmation of this finding, we found that OD-causing mutant genes are topologically important in the protein interactome and are ubiquitously expressed. Additionally, functional enrichment analysis of those genes in which mutations cause ODs shows that a majority result in premature death or are lethal in the orthologous mouse gene knockout models. To address the limitations of traditional gene-based disease networks, we also construct and analyze OD networks on the basis of shared enriched features (biological processes, cellular components, pathways, phenotypes, and literature citations). Analyzing these functionally-linked OD networks, we identified several additional OD-OD relations that are both phenotypically similar and phenotypically diverse. Surprisingly, we observed that the wiring of the gene-based and other feature-based OD networks is largely different; this suggests that the relationship between ODs cannot be fully captured by the gene-based network alone. PMID:21664998
Position Matters: Network Centrality Considerably Impacts Rates of Protein Evolution in the Human Protein-Protein Interaction Network.

PubMed

Alvarez-Ponce, David; Feyertag, Felix; Chakraborty, Sandip

2017-06-01

The proteins of any organism evolve at disparate rates. A long list of factors affecting rates of protein evolution has been identified. However, the relative importance of each factor in determining rates of protein evolution remains unresolved. The prevailing view is that evolutionary rates are dominantly determined by gene expression, and that other factors such as network centrality have only a marginal effect, if any. However, this view is largely based on analyses in yeasts, and accurately measuring the importance of the determinants of rates of protein evolution is complicated by the fact that the different factors are often correlated with each other, and by the relatively poor quality of available functional genomics data sets. Here, we use correlation, partial correlation and principal component regression analyses to measure the contributions of several factors to the variability of the rates of evolution of human proteins. For this purpose, we analyzed the entire human protein-protein interaction data set and the human signal transduction network, a network data set of exceptionally high quality, obtained by manual curation, which is expected to be virtually free from false positives. In contrast with the prevailing view, we observe that network centrality (measured as the number of physical and nonphysical interactions, betweenness, and closeness) has a considerable impact on rates of protein evolution. Surprisingly, the impact of centrality on rates of protein evolution seems to be comparable, or even superior according to some analyses, to that of gene expression. Our observations seem to be independent of potentially confounding factors and from the limitations (biases and errors) of interactomic data sets. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
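Partial correlation, one of the analyses named above, asks how strongly two variables are associated once a third, correlated variable is held fixed. The sketch below uses the standard first-order formula with Pearson coefficients; the choice of Pearson rather than a rank-based coefficient is an assumption, the function names are hypothetical, and the six data points are invented toy values, not results from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, controlling for z:
    (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Invented toy values for six hypothetical proteins.
centrality = [1, 2, 3, 4, 5, 6]   # e.g. number of interactions
expression = [2, 1, 4, 3, 6, 5]   # e.g. expression level
rate = [6, 5, 5, 3, 2, 1]         # e.g. evolutionary rate

raw = pearson(rate, centrality)
controlled = partial_correlation(rate, centrality, expression)
```

Comparing `raw` with `controlled` shows how much of the apparent centrality effect survives after removing the part that expression can account for, which is the kind of question the abstract addresses.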
Reconstructing genome-wide regulatory network of E. coli using transcriptome data and predicted transcription factor activities

PubMed Central

2011-01-01

Background: Gene regulatory networks play essential roles in living organisms to control growth, keep internal metabolism running and respond to external environmental changes. Understanding the connections and the activity levels of regulators is important for the research of gene regulatory networks. While relevance score based algorithms that reconstruct gene regulatory networks from transcriptome data can infer genome-wide gene regulatory networks, they are unfortunately prone to false positive results. Transcription factor activities (TFAs) quantitatively reflect the ability of the transcription factor to regulate target genes. However, classic relevance score based gene regulatory network reconstruction algorithms use models that do not include the TFA layer, thus missing a key regulatory element. Results: This work integrates TFA prediction algorithms with relevance score based network reconstruction algorithms to reconstruct gene regulatory networks with improved accuracy over classic relevance score based algorithms. This method is called Gene expression and Transcription factor activity based Relevance Network (GTRNetwork). Different combinations of TFA prediction algorithms and relevance score functions have been applied to find the most efficient combination. When the integrated GTRNetwork method was applied to E. coli data, the reconstructed genome-wide gene regulatory network predicted 381 new regulatory links. This reconstructed gene regulatory network, including the predicted new regulatory links, shows promising biological significance. Many of the new links are verified by known TF binding site information, and many other links can be verified from the literature and databases such as EcoCyc. The reconstructed gene regulatory network is applied to a recent transcriptome analysis of E. coli during isobutanol stress. In addition to the 16 significantly changed TFAs detected in the original paper, another 7 significantly changed TFAs have been detected by using our reconstructed network. Conclusions: The GTRNetwork algorithm introduces the hidden layer TFA into classic relevance score-based gene regulatory network reconstruction processes. Integrating the TFA biological information with regulatory network reconstruction algorithms significantly improves detection of new links and reduces the rate of false positives. The application of GTRNetwork on E. coli gene transcriptome data gives a set of potential regulatory links with promising biological significance for isobutanol stress and other conditions. PMID:21668997
Molecular ecological network analyses.

PubMed

Deng, Ye; Jiang, Yi-Huei; Yang, Yunfeng; He, Zhili; Luo, Feng; Zhou, Jizhong

2012-05-30

Understanding the interaction among different species within a community and their responses to environmental changes is a central goal in ecology. However, defining the network structure in a microbial community is very challenging due to their extremely high diversity and as-yet uncultivated status. Although recent advances in metagenomic technologies, such as high-throughput sequencing and functional gene arrays, provide revolutionary tools for analyzing microbial community structure, it is still difficult to examine network interactions in a microbial community based on high-throughput metagenomics data. Here, we describe a novel mathematical and bioinformatics framework to construct ecological association networks named molecular ecological networks (MENs) through Random Matrix Theory (RMT)-based methods. Compared to other network construction methods, this approach is remarkable in that the network is automatically defined and robust to noise, thus providing excellent solutions to several common issues associated with high-throughput metagenomics data. We applied it to determine the network structure of microbial communities subjected to long-term experimental warming based on pyrosequencing data of 16S rRNA genes. We showed that the constructed MENs under both warming and unwarming conditions exhibited topological features of scale free, small world and modularity, which were consistent with previously described molecular ecological networks. Eigengene analysis indicated that the eigengenes represented the module profiles relatively well. Consistent with many other studies, several major environmental traits including temperature and soil pH were found to be important in determining network interactions in the microbial communities examined. To facilitate its application by the scientific community, all these methods and statistical tools have been integrated into a comprehensive Molecular Ecological Network Analysis Pipeline (MENAP), which is now openly accessible (http://ieg2.ou.edu/MENA). The RMT-based molecular ecological network analysis provides powerful tools to elucidate network interactions in microbial communities and their responses to environmental changes, which are fundamentally important for research in microbial ecology and environmental microbiology.
Reverse Engineering of Modified Genes by Bayesian Network Analysis Defines Molecular Determinants Critical to the Development of Glioblastoma

PubMed Central

Kunkle, Brian W.; Yoo, Changwon; Roy, Deodutta

2013-01-01

In this study we have identified key genes that are critical in development of astrocytic tumors. Meta-analysis of microarray studies which compared normal tissue to astrocytoma revealed a set of 646 differentially expressed genes in the majority of astrocytoma. Reverse engineering of these 646 genes using Bayesian network analysis produced a gene network for each grade of astrocytoma (Grade I–IV), and ‘key genes’ within each grade were identified. Genes found to be most influential to development of the highest grade of astrocytoma, Glioblastoma multiforme were: COL4A1, EGFR, BTF3, MPP2, RAB31, CDK4, CD99, ANXA2, TOP2A, and SERBP1. All of these genes were up-regulated, except MPP2 (down regulated). These 10 genes were able to predict tumor status with 96–100% confidence when using logistic regression, cross validation, and the support vector machine analysis. Markov genes interact with NFkβ, ERK, MAPK, VEGF, growth hormone and collagen to produce a network whose top biological functions are cancer, neurological disease, and cellular movement. Three of the 10 genes - EGFR, COL4A1, and CDK4, in particular, seemed to be potential ‘hubs of activity’. Modified expression of these 10 Markov Blanket genes increases lifetime risk of developing glioblastoma compared to the normal population. The glioblastoma risk estimates were dramatically increased with joint effects of 4 or more than 4 Markov Blanket genes. Joint interaction effects of 4, 5, 6, 7, 8, 9 or 10 Markov Blanket genes produced 9, 13, 20.9, 26.7, 52.8, 53.2, 78.1 or 85.9%, respectively, increase in lifetime risk of developing glioblastoma compared to normal population. In summary, it appears that modified expression of several ‘key genes’ may be required for the development of glioblastoma. Further studies are needed to validate these ‘key genes’ as useful tools for early detection and novel therapeutic options for these tumors. PMID:23737970
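The abstract above refers to 'Markov Blanket genes'. In a Bayesian network, the Markov blanket of a node is its parents, its children, and the other parents of its children. The following sketch shows only that definition on an invented graph: the function name `markov_blanket` and the node names are hypothetical, and the edges are not taken from the study.

```python
def markov_blanket(parents, node):
    """Markov blanket of `node` in a directed acyclic graph.

    `parents` maps every node to the set of its parent nodes.
    The blanket is: parents of node, children of node, and the
    other parents of those children (co-parents).
    """
    children = {child for child, ps in parents.items() if node in ps}
    co_parents = set()
    for child in children:
        co_parents |= parents[child]
    blanket = set(parents.get(node, set())) | children | co_parents
    blanket.discard(node)
    return blanket


# Invented toy network; edges point from parent to child.
toy_parents = {
    "G1": set(),
    "G2": set(),
    "G4": set(),
    "tumor": {"G1", "G2"},   # G1 -> tumor, G2 -> tumor
    "G3": {"tumor", "G4"},   # tumor -> G3, G4 -> G3
    "G5": {"G1"},            # G1 -> G5 (outside the blanket of 'tumor')
}

blanket = markov_blanket(toy_parents, "tumor")
```

Given the values of its Markov blanket, a node is conditionally independent of every other node in the network, which is why the blanket of the tumor-status node is a natural place to look for a compact set of predictive genes.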
Detecting complexes from edge-weighted PPI networks via genes expression analysis.

PubMed

Zhang, Zehua; Song, Jian; Tang, Jijun; Xu, Xinying; Guo, Fei

2018-04-24

Identifying complexes from PPI networks has become a key problem to elucidate protein functions and identify signal and biological processes in a cell. Proteins that bind as complexes play important roles in life activity. Accurate determination of complexes in PPI networks is crucial for understanding principles of cellular organization. We propose a novel method to identify complexes on PPI networks, based on different co-expression information. First, we use Markov Cluster Algorithm with an edge-weighting scheme to calculate complexes on PPI networks. Then, we propose some significant features, such as graph information and gene expression analysis, to filter and modify complexes predicted by Markov Cluster Algorithm. To evaluate our method, we test on two experimental yeast PPI networks. On the DIP network, our method has Precision and F-Measure values of 0.6004 and 0.5528. On the MIPS network, our method has F-Measure and Sn values of 0.3774 and 0.3453. Compared with existing methods, our method improves the Precision value by at least 0.1752, the F-Measure value by at least 0.0448, and the Sn value by at least 0.0771. Experiments show that our method achieves better results than some state-of-the-art methods for identifying complexes on PPI networks, with the prediction quality improved in terms of evaluation criteria.
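The first step described above, Markov clustering on a weighted network, alternates two matrix operations: expansion (squaring the column-stochastic matrix) and inflation (raising entries to a power and renormalising columns). The code below is a bare-bones sketch of that generic algorithm, not the authors' pipeline: the function names, the inflation value of 2.0, the fixed iteration count, and the toy weights are assumptions, and the paper's edge-weighting scheme and filtering features are not reproduced.

```python
def normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def mcl(weights, inflation=2.0, iterations=50, eps=1e-6):
    """Markov clustering on a symmetric weight matrix (list of lists).

    Returns a set of clusters, each a frozenset of node indices.
    """
    n = len(weights)
    # Add self-loops, then make the matrix column-stochastic.
    m = [[weights[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        # Expansion: flow spreads along two-step paths.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: strong flows are boosted, weak flows fade.
        m = normalize_columns([[v ** inflation for v in row] for row in m])
    clusters = set()
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > eps)
        if members:
            clusters.add(members)
    return clusters


# Toy weights: two triangles joined by one weak (hypothetical) edge 2-3.
toy = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]

toy_clusters = mcl(toy)
```

Higher inflation values generally yield smaller, tighter clusters, so in practice this parameter is tuned against a reference set of known complexes.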
Functional Genomic Analysis of the let-7 Regulatory Network in Caenorhabditis elegans

PubMed Central

Zisoulis, Dimitrios G.; Lovci, Michael T.; Melnik-Martinez, Katya V.; Yeo, Gene W.; Pasquinelli, Amy E.

2013-01-01

The let-7 microRNA (miRNA) regulates cellular differentiation across many animal species. Loss of let-7 activity causes abnormal development in Caenorhabditis elegans and unchecked cellular proliferation in human cells, which contributes to tumorigenesis. These defects are due to improper expression of protein-coding genes normally under let-7 regulation. While some direct targets of let-7 have been identified, the genome-wide effect of let-7 insufficiency in a developing animal has not been fully investigated. Here we report the results of molecular and genetic assays aimed at determining the global network of genes regulated by let-7 in C. elegans. By screening for mis-regulated genes that also contribute to let-7 mutant phenotypes, we derived a list of physiologically relevant potential targets of let-7 regulation. Twenty new suppressors of the rupturing vulva or extra seam cell division phenotypes characteristic of let-7 mutants emerged. Three of these genes, opt-2, prmt-1, and T27D12.1, were found to associate with Argonaute in a let-7–dependent manner and are likely novel direct targets of this miRNA. Overall, a complex network of genes with various activities is subject to let-7 regulation to coordinate developmental timing across tissues during worm development. PMID:23516374
Synthetic biology: Novel approaches for microbiology.

PubMed

Padilla-Vaca, Felipe; Anaya-Velázquez, Fernando; Franco, Bernardo

2015-06-01

In the past twenty years, molecular genetics has created powerful tools for genetic manipulation of living organisms. Whole genome sequencing has provided necessary information to assess knowledge on gene function and protein networks. In addition, new tools make it possible to modify organisms to perform desired tasks. Gene function analysis is sped up by novel approaches that couple high-throughput data generation and mining. Synthetic biology is an emerging field that uses tools for generating novel gene networks, whole genome synthesis and engineering. New applications in biotechnological, pharmaceutical and biomedical research are envisioned for synthetic biology. In recent years these new strategies have opened up possibilities to study gene and genome editing and to create novel tools for functional studies in viruses, parasites and pathogenic bacteria. There is also the possibility to re-design organisms to generate vaccine subunits or produce new pharmaceuticals to combat multi-drug resistant pathogens. In this review we provide our opinion on the applicability of synthetic biology strategies for functional studies of pathogenic organisms and some applications such as genome editing and gene network studies to further comprehend virulence factors and determinants in pathogenic organisms. We also discuss what we consider important ethical issues for this field of molecular biology, especially the potential misuse of the new technologies. Copyright© by the Spanish Society for Microbiology and Institute for Catalan Studies.
Unsupervised, statistically-based systems biology approach for unraveling the genetics of complex traits: A demonstration with ethanol metabolism.

PubMed

Lusk, Ryan; Saba, Laura M; Vanderlinden, Lauren A; Zidek, Vaclav; Silhavy, Jan; Pravenec, Michal; Hoffman, Paula L; Tabakoff, Boris

2018-04-24

A statistical pipeline was developed and used for determining candidate genes and candidate gene co-expression networks involved in two alcohol (i.e., ethanol) metabolism phenotypes, namely alcohol clearance and acetate area under the curve (AUC) in a recombinant inbred (HXB/BXH) rat panel. The approach was also used to provide an indication of how ethanol metabolism can impact the normal function of the identified networks. RNA was extracted from alcohol-naïve liver tissue of 30 strains of HXB/BXH recombinant inbred rats. The reconstructed transcripts were quantitated and the data were used to construct gene co-expression modules and networks. A separate group of rats, comprising the same 30 strains, were injected with ethanol (2 g/kg) for measurement of blood ethanol and acetate levels. These data were used for QTL analysis of the rate of ethanol disappearance and circulating acetate levels. The analysis pipeline required calculation of the module eigengene values, the correlation of these values with ethanol metabolism rates and acetate levels across the rat strains and the determination of the eigengene QTLs. For a module to be considered a candidate for determining phenotype, the module eigengene values had to have significant correlation with the strain phenotypic values and the module eigengene QTLs had to overlap the phenotypic QTLs. Of the 658 transcript co-expression modules generated from liver RNA sequencing data, a single module satisfied all criteria for being a candidate for determining the alcohol clearance trait. This module contained two alcohol dehydrogenase genes, including the gene whose product was previously shown to be responsible for the majority of alcohol elimination in the rat. This module was also the only module identified as a candidate for influencing circulating acetate levels. This module was also linked to the process of generation and utilization of retinoic acid as related to the autonomous immune response.
We propose that our analytical pipeline can successfully identify genetic regions and transcripts which predispose a particular phenotype and our analysis provides functional context for co-expression module components. This article is protected by copyright. All rights reserved.
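One step of the pipeline described above, relating a module eigengene to a strain phenotype, can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes the common definition of a module eigengene as the first principal component of the standardized expression of the module's genes, computes it by power iteration, and uses invented expression and clearance values. The function names are hypothetical, and the QTL overlap criterion is not covered.

```python
import math


def standardize(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return [(x - mean) / sd for x in values]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def module_eigengene(expr, iters=200):
    """First principal component across strains for one module.

    expr: one list per gene, each holding that gene's expression per strain.
    Power iteration on X^T X, started from the mean standardized profile so
    the sign of the result follows the average expression of the module.
    Assumption: no gene is constant and the mean profile is not all zeros.
    """
    z = [standardize(gene) for gene in expr]
    n_strains = len(z[0])
    v = [sum(gene[j] for gene in z) / len(z) for j in range(n_strains)]
    for _ in range(iters):
        # Project every gene onto the current direction (X v) ...
        scores = [sum(g * x for g, x in zip(gene, v)) for gene in z]
        # ... and map back to strain space (X^T X v), then renormalize.
        w = [sum(s * gene[j] for s, gene in zip(scores, z))
             for j in range(n_strains)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Invented values: three genes of one module across six strains.
module_expr = [
    [5.1, 6.0, 7.2, 8.1, 9.0, 9.8],
    [2.0, 2.4, 3.1, 3.3, 4.2, 4.4],
    [7.9, 8.8, 9.1, 10.2, 10.9, 12.1],
]
clearance = [0.8, 1.0, 1.1, 1.3, 1.4, 1.7]  # hypothetical phenotype values

eigengene = module_eigengene(module_expr)
print("eigengene-phenotype correlation:", pearson(eigengene, clearance))
```

In a full analysis the correlation would be tested for significance across all modules with a multiple-testing correction before the QTL overlap is examined.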
Signal Correlations in Ecological Niches Can Shape the Organization and Evolution of Bacterial Gene Regulatory Networks

PubMed Central

Dufour, Yann S.; Donohue, Timothy J.

2015-01-01

Transcriptional regulation plays a significant role in the biological response of bacteria to changing environmental conditions. Therefore, mapping transcriptional regulatory networks is an important step not only in understanding how bacteria sense and interpret their environment but also in identifying the functions involved in biological responses to specific conditions. Recent experimental and computational developments have facilitated the characterization of regulatory networks on a genome-wide scale in model organisms. In addition, the multiplication of complete genome sequences has encouraged comparative analyses to detect conserved regulatory elements and infer regulatory networks in other less well-studied organisms. However, transcription regulation appears to evolve rapidly, thus creating challenges for the transfer of knowledge to nonmodel organisms. Nevertheless, the mechanisms and constraints driving the evolution of regulatory networks have been the subjects of numerous analyses, and several models have been proposed. Overall, the contributions of mutations, recombination, and horizontal gene transfer are complex. Finally, the rapid evolution of regulatory networks plays a significant role in the remarkable capacity of bacteria to adapt to new or changing environments. Conversely, the characteristics of environmental niches determine the selective pressures and can shape the structure of regulatory networks accordingly. PMID:23046950
Orthoscape: a cytoscape application for grouping and visualization KEGG based gene networks by taxonomy and homology principles.

PubMed

Mustafin, Zakhar Sergeevich; Lashin, Sergey Alexandrovich; Matushkin, Yury Georgievich; Gunbin, Konstantin Vladimirovich; Afonnikov, Dmitry Arkadievich

2017-01-27

There are many available software tools for visualization and analysis of biological networks. Among them, Cytoscape ( http://cytoscape.org/ ) is one of the most comprehensive packages, with many plugins and applications that extend its functionality by providing analysis of protein-protein interaction, gene regulatory and gene co-expression networks, metabolic, signaling, neural as well as ecological-type networks including food webs, community networks, etc. Nevertheless, only three plugins tagged 'network evolution' are found in the official Cytoscape app store and in the literature. We have developed a new Cytoscape 3.0 application, Orthoscape, that aims to facilitate evolutionary analysis of gene networks and visualize the results. Orthoscape aids in analysis of evolutionary information available for gene sets and networks by highlighting: (1) the orthology relationships between genes; (2) the evolutionary origin of gene network components; (3) the evolutionary pressure mode (diversifying or stabilizing, negative or positive selection) of orthologous groups in general and/or branch-oriented mode. The distinctive feature of Orthoscape is the ability to control all data analysis steps via a user-friendly interface. Orthoscape allows its users to analyze gene networks or separate gene sets in the context of evolution. At each step of data analysis, Orthoscape also provides convenient visualization and data manipulation.
The comprehensive liver transcriptome of two cattle breeds with different intramuscular fat content.

PubMed

Wang, Xi; Zhang, Yuanqing; Zhang, Xizhong; Wang, Dongcai; Jin, Guang; Li, Bo; Xu, Fang; Cheng, Jing; Zhang, Feng; Wu, Sujun; Rui, Su; He, Jiang; Zhang, Ronghua; Liu, Wenzhong

2017-08-26

Intramuscular fat (IMF) content is an important determinant of meat quality in cattle. There is a significant difference in IMF content between Jinnan and Simmental cattle. Here, to identify candidate genes and networks associated with IMF deposition, we deeply explored the transcriptome architecture of liver in these two cattle breeds. We sequenced the liver transcriptome of five Jinnan and three Simmental cattle, yielding about 413.9 million sequencing reads. 124 differentially expressed genes (DEGs) were detected, of which 53 were up-regulated and 71 were down-regulated in Jinnan cattle. 1282 potentially novel genes were also identified. Gene ontology analysis revealed these DEGs (including CYP21A2, PC, ACACB, APOA1, and FADS2) were significantly enriched in lipid biosynthetic process, regulation of cholesterol esterification, reverse cholesterol transport, and regulation of lipoprotein lipase activity. Genes involved in the pyruvate metabolism pathway were also significantly overrepresented. Moreover, we identified an interaction network related to lipid metabolism, which might contribute to IMF deposition in cattle. We concluded that the DEGs involved in the regulation of lipid metabolism could play an important role in IMF deposition. Overall, we proposed a new panel of candidate genes and interaction networks that can be associated with IMF deposition and used as biomarkers in cattle breeding. Copyright © 2017 Elsevier Inc. All rights reserved.
An Integrative data mining approach to identifying Adverse ...

EPA Pesticide Factsheets

The Adverse Outcome Pathway (AOP) framework is a tool for making biological connections and summarizing key information across different levels of biological organization to connect biological perturbations at the molecular level to adverse outcomes for an individual or population. Computational approaches to explore and determine these connections can accelerate the assembly of AOPs. By leveraging the wealth of publicly available data covering chemical effects on biological systems, computationally-predicted AOPs (cpAOPs) were assembled via data mining of high-throughput screening (HTS) in vitro data, in vivo data and other disease phenotype information. Frequent Itemset Mining (FIM) was used to find associations between the gene targets of ToxCast HTS assays and disease data from Comparative Toxicogenomics Database (CTD) by using the chemicals as the common aggregators between datasets. The method was also used to map gene expression data to disease data from CTD. A cpAOP network was defined by considering genes and diseases as nodes and FIM associations as edges. This network contained 18,283 gene to disease associations for the ToxCast data and 110,253 for CTD gene expression. Two case studies show the value of the cpAOP network by extracting subnetworks focused either on fatty liver disease or the Aryl Hydrocarbon Receptor (AHR). The subnetwork surrounding fatty liver disease included many genes known to play a role in this disease. When querying the cpAOP
Using protein-protein interactions for refining gene networks estimated from microarray data by Bayesian networks.

PubMed

Nariai, N; Kim, S; Imoto, S; Miyano, S

2004-01-01

We propose a statistical method to estimate gene networks from DNA microarray data and protein-protein interactions. Because physical interactions between proteins or multiprotein complexes are likely to regulate biological processes, using only mRNA expression data is not sufficient for estimating a gene network accurately. Our method adds knowledge about protein-protein interactions to the estimation method of gene networks under a Bayesian statistical framework. In the estimated gene network, a protein complex is modeled as a virtual node based on principal component analysis. We show the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae cell cycle data. The proposed method improves the accuracy of the estimated gene networks, and successfully identifies some biological facts.

Genes under weaker stabilizing selection increase network evolvability and rapid regulatory adaptation to an environmental shift.

PubMed

Laarits, T; Bordalo, P; Lemos, B

2016-08-01

Regulatory networks play a central role in the modulation of gene expression, the control of cellular differentiation, and the emergence of complex phenotypes. Regulatory networks could constrain or facilitate evolutionary adaptation in gene expression levels. Here, we model the adaptation of regulatory networks and gene expression levels to a shift in the environment that alters the optimal expression level of a single gene. Our analyses show signatures of natural selection on regulatory networks that both constrain and facilitate rapid evolution of gene expression level towards new optima. The analyses are interpreted from the standpoint of neutral expectations and illustrate the challenge to making inferences about network adaptation. Furthermore, we examine the consequence of variable stabilizing selection across genes on the strength and direction of interactions in regulatory networks and in their subsequent adaptation. We observe that directional selection on a highly constrained gene previously under strong stabilizing selection was more efficient when the gene was embedded within a network of partners under relaxed stabilizing selection pressure. The observation leads to the expectation that evolutionarily resilient regulatory networks will contain optimal ratios of genes whose expression is under weak and strong stabilizing selection. Altogether, our results suggest that the variable strengths of stabilizing selection across genes within regulatory networks might itself contribute to the long-term adaptation of complex phenotypes. © 2016 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.
Integration of a splicing regulatory network within the meiotic gene expression program of Saccharomyces cerevisiae

PubMed Central

Munding, Elizabeth M.; Igel, A. Haller; Shiue, Lily; Dorighi, Kristel M.; Treviño, Lisa R.; Ares, Manuel

2010-01-01

Splicing regulatory networks are essential components of eukaryotic gene expression programs, yet little is known about how they are integrated with transcriptional regulatory networks into coherent gene expression programs. Here we define the MER1 splicing regulatory network and examine its role in the gene expression program during meiosis in budding yeast. Mer1p splicing factor promotes splicing of just four pre-mRNAs. All four Mer1p-responsive genes also require Nam8p for splicing activation by Mer1p; however, other genes require Nam8p but not Mer1p, exposing an overlapping meiotic splicing network controlled by Nam8p. MER1 mRNA and three of the four Mer1p substrate pre-mRNAs are induced by the transcriptional regulator Ume6p. This unusual arrangement delays expression of Mer1p-responsive genes relative to other genes under Ume6p control. Products of Mer1p-responsive genes are required for initiating and completing recombination and for activation of Ndt80p, the activator of the transcriptional network required for subsequent steps in the program. Thus, the MER1 splicing regulatory network mediates the dependent relationship between the UME6 and NDT80 transcriptional regulatory networks in the meiotic gene expression program. This study reveals how splicing regulatory networks can be interlaced with transcriptional regulatory networks in eukaryotic gene expression programs. PMID:21123654
Association of variants in innate immune genes with asthma and eczema

PubMed Central

Sharma, Sunita; Poon, Audrey; Himes, Blanca E.; Lasky-Su, Jessica; Sordillo, Joanne E.; Belanger, Kathleen; Milton, Donald K.; Bracken, Michael B.; Triche, Elizabeth W.; Leaderer, Brian P.; Gold, Diane R.; Litonjua, Augusto A.

2012-01-01

Background: The innate immune pathway is important in the pathogenesis of asthma and eczema. However, only a few variants in these genes have been associated with either disease. We investigate the association between polymorphisms of genes in the innate immune pathway with childhood asthma and eczema. In addition, we compare individual associations with those discovered using a multivariate approach. Methods: Using a novel method, case control based association testing (C2BAT), 569 single nucleotide polymorphisms (SNPs) in 44 innate immune genes were tested for association with asthma and eczema in children from the Boston Home Allergens and Asthma Study and the Connecticut Childhood Asthma Study. The screening algorithm was used to identify the top SNPs associated with asthma and eczema. We next investigated the interaction of innate immune variants with asthma and eczema risk using Bayesian networks. Results: After correction for multiple comparisons, 7 SNPs in 6 genes (CARD25, TGFB1, LY96, ACAA1, DEFB1, and IFNG) were associated with asthma (adjusted p-value < 0.02), while 5 SNPs in 3 different genes (CD80, STAT4, and IRAKI) were significantly associated with eczema (adjusted p-value < 0.02). None of these SNPs were associated with both asthma and eczema. Bayesian network analysis identified 4 SNPs that were predictive of asthma and 10 SNPs that predicted eczema. Of the genes identified using Bayesian networks, only CD80 was associated with eczema in the single-SNP study. Using novel methodology that allows for screening and replication in the same population, we have identified associations of innate immune genes with asthma and eczema. Bayesian network analysis suggests that additional SNPs influence disease susceptibility via SNP interactions. Conclusion: Our findings suggest that innate immune genes contribute to the pathogenesis of asthma and eczema, and that these diseases likely have different genetic determinants. PMID:22192168
Construct and Compare Gene Coexpression Networks with DAPfinder and DAPview.

PubMed

Skinner, Jeff; Kotliarov, Yuri; Varma, Sudhir; Mine, Karina L; Yambartsev, Anatoly; Simon, Richard; Huyen, Yentram; Morgun, Andrey

2011-07-14

DAPfinder and DAPview are novel BRB-ArrayTools plug-ins to construct gene coexpression networks and identify significant differences in pairwise gene-gene coexpression between two phenotypes. Each significant difference in gene-gene association represents a Differentially Associated Pair (DAP). Our tools include several choices of filtering methods, gene-gene association metrics, statistical testing methods and multiple comparison adjustments. Network results are easily displayed in Cytoscape. Analyses of glioma experiments and microarray simulations demonstrate the utility of these tools. DAPfinder is a new user-friendly tool for reconstruction and comparison of biological networks.
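To make the idea of a Differentially Associated Pair concrete, the sketch below tests whether the correlation between two genes differs between two phenotypes. The abstract says the tools offer several association metrics and tests without naming them, so the choice made here (Pearson correlation compared with a Fisher z-transformation test) is an assumption, and the function names and expression values are hypothetical. It does not reproduce the DAPfinder interface.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def differential_correlation(r1, n1, r2, n2):
    """Two-sided Fisher z-test for a difference between two correlations.

    r1, r2: correlations of the same gene pair in phenotype 1 and 2.
    n1, n2: sample sizes (each must exceed 3; |r| must be below 1).
    Returns (z statistic, p-value under the standard normal).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Hypothetical expression of genes A and B in two phenotypes.
a_pheno1 = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8]
b_pheno1 = [1.2, 1.9, 3.2, 3.9, 5.3, 6.1]
a_pheno2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b_pheno2 = [3.1, 2.7, 3.4, 2.9, 3.3, 3.0]

r1 = pearson(a_pheno1, b_pheno1)
r2 = pearson(a_pheno2, b_pheno2)
z, p = differential_correlation(r1, len(a_pheno1), r2, len(a_pheno2))
print(f"r1={r1:.3f} r2={r2:.3f} z={z:.2f} p={p:.4f}")
```

Applied to every gene pair, such a test produces many p-values, so a multiple comparison adjustment of the kind the abstract mentions would be needed before calling any pair a DAP.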
Sry and SoxE genes: How they participate in mammalian sex determination and gonadal development?

PubMed

She, Zhen-Yu; Yang, Wan-Xi

2017-03-01

In mammals, sex determination defines the differentiation of the bipotential genital ridge into either testes or ovaries. Sry, the mammalian Y-chromosomal testis-determining gene, is a master regulator of male sex determination. It acts to switch the undifferentiated genital ridge towards testis development, triggering the adoption of a male fate. Sry initiates a cascade of gene networks through the direct regulation of Sox9 expression and promotes supporting cell differentiation, Leydig cell specification, vasculature formation and testis cord development. In the absence of Sry, alternative genetic cascades, including female sex-determining genes RSPO1, Wnt4/β-catenin and Foxl2, are involved in the formation of female genitalia and the maintenance of female ovarian development. The mutual antagonisms between male and female sex-determining pathways are crucial in not just the initiation but also the maintenance of the somatic sex of the gonad throughout the organism's lifetime. Any imbalances in the above sex-determining genes can cause disorders of sex development in humans and mice. In this review, we provide a detailed summary of the expression profiles, biochemical properties and developmental functions of Sry and SoxE genes in embryonic testis development and adult gonadal development. We also briefly summarize the delicate balances between male and female sex-determining genes in mammalian sex development, with particular highlights on the molecular actions of Sry and Sox9 transcription factors. Copyright © 2016 Elsevier Ltd. All rights reserved.
Systems Biomedicine of Rabies Delineates the Affected Signaling Pathways

PubMed Central

Azimzadeh Jamalkandi, Sadegh; Mozhgani, Sayed-Hamidreza; Gholami Pourbadie, Hamid; Mirzaie, Mehdi; Noorbakhsh, Farshid; Vaziri, Behrouz; Gholami, Alireza; Ansari-Pour, Naser; Jafari, Mohieddin

2016-01-01

The prototypical neurotropic virus, rabies, is a member of the Rhabdoviridae family that causes lethal encephalomyelitis. Although there have been a plethora of studies investigating the etiological mechanism of the rabies virus and many precautionary methods have been implemented to avert disease outbreaks over the last century, the disease surprisingly has no definite remedy at its late stages. The psychological symptoms and the underlying etiology, as well as the rare survival from rabies encephalitis, still remain a mystery. We, therefore, undertook a systems biomedicine approach to identify the network of gene products implicated in rabies. This was done by meta-analyzing whole-transcriptome microarray datasets of the CNS infected by strain CVS-11, and integrating them with interactome data using computational and statistical methods. We first determined the differentially expressed genes (DEGs) in each study and horizontally integrated the results at the mRNA and microRNA levels separately. A total of 61 seed genes involved in the signal propagation system were obtained by unifying the integrated DEGs detected at the mRNA and microRNA levels. We then reconstructed a refined protein–protein interaction network (PPIN) of infected cells to elucidate the rabies-implicated signal transduction network (RISN). To validate our findings, we confirmed differential expression of randomly selected genes in the network using Real-time PCR. In conclusion, the identification of seed genes and their network neighborhood within the refined PPIN can be useful for demonstrating signaling pathways, including interferon circumvention, a shift toward proliferation and survival, and neuropathological clues, explaining the intricate underlying molecular neuropathology of rabies infection and thus providing a molecular framework for predicting potential drug targets. PMID:27872612
On the Interplay between the Evolvability and Network Robustness in an Evolutionary Biological Network: A Systems Biology Approach

PubMed Central

Chen, Bor-Sen; Lin, Ying-Po

2011-01-01

In the evolutionary process, the random transmission and mutation of genes provide biological diversity for natural selection. In order to preserve functional phenotypes between generations, gene networks need to evolve robustly under the influence of random perturbations. Therefore, the robustness of the phenotype, in the evolutionary process, exerts a selection force on gene networks to keep network functions. However, gene networks need to adjust, by variations in genetic content, to generate phenotypes for new challenges in the network’s evolution, i.e., the evolvability. Hence, there should be some interplay between the evolvability and network robustness in evolutionary gene networks. In this study, the interplay between the evolvability and network robustness of a gene network and a biochemical network is discussed from a nonlinear stochastic system point of view. It was found that if the genetic robustness plus environmental robustness is less than the network robustness, the phenotype of the biological network is robust in evolution. The tradeoff between the genetic robustness and environmental robustness in evolution is discussed from the stochastic stability robustness and sensitivity of the nonlinear stochastic biological network, which may be relevant to the statistical tradeoff between bias and variance, the so-called bias/variance dilemma. Further, the tradeoff could be considered as an antagonistic pleiotropic action of a gene network and discussed from the systems biology perspective. PMID:22084563
Identification of potential crucial genes and construction of microRNA-mRNA negative regulatory networks in osteosarcoma.

PubMed

Pan, Yue; Lu, Lingyun; Chen, Junquan; Zhong, Yong; Dai, Zhehao

2018-01-01

This study aimed to identify potential crucial genes and construct microRNA-mRNA negative regulatory networks in osteosarcoma by comprehensive bioinformatics analysis. Gene expression profiles (GSE28424) and miRNA expression profiles (GSE28423) were downloaded from the GEO database. The differentially expressed genes (DEGs) and miRNAs (DEMIs) were obtained with R Bioconductor packages. Functional and enrichment analyses of selected genes were performed using the DAVID database. A protein-protein interaction (PPI) network was constructed with STRING and visualized in Cytoscape. The relationships among the DEGs and the modules in the PPI network were analyzed with the plug-ins NetworkAnalyzer and MCODE separately. Using TargetScan and comparing target genes with DEGs, the miRNA-mRNA regulation network was established. In total, 346 DEGs and 90 DEMIs were found to be differentially expressed. These DEGs were enriched in biological processes and KEGG pathways of inflammatory immune response. 25 genes in the PPI network were selected as hub genes. The top 10 hub genes were TYROBP, HLA-DRA, VWF, PPBP, SERPING1, HLA-DPA1, SERPINA1, KIF20A, FERMT3, HLA-E. The PPI network of DEGs followed the pattern of a power law network and met the characteristics of a small-world network. MCODE analysis identified 4 clusters and the most significant cluster consisted of 11 nodes and 55 edges. SEPP1, CKS2, TCAP, BPI were identified as the seed genes in their own clusters, respectively. The miRNA-mRNA regulation network, which was composed of 89 pairs, was established. MiR-210 had the highest connectivity with 12 target genes. Among the predicted targets of MiR-96, HLA-DPA1 and TYROBP were hub genes. Our study indicated possible differentially expressed genes and miRNAs, and microRNA-mRNA negative regulatory networks in osteosarcoma by bioinformatics analysis, which may provide novel insights for unraveling the pathogenesis of osteosarcoma.
Diversification of transcription factor-DNA interactions and the evolution of gene regulatory networks.

PubMed

Rogers, Julia M; Bulyk, Martha L

2018-04-25

Sequence-specific transcription factors (TFs) bind short DNA sequences in the genome to regulate the expression of target genes. In the last decade, numerous technical advances have enabled the determination of the DNA-binding specificities of many of these factors. Large-scale screens of many TFs enabled the creation of databases of TF DNA-binding specificities, typically represented as position weight matrices (PWMs). Although great progress has been made in determining and predicting binding specificities systematically, there are still many surprises to be found when studying a particular TF's interactions with DNA in detail. Paralogous TFs' binding specificities can differ in subtle ways, in a manner that is not immediately apparent from looking at their PWMs. These differences affect gene regulatory outputs and enable TFs to rewire transcriptional networks over evolutionary time. This review discusses recent observations made in the study of TF-DNA interactions that highlight the importance of continued in-depth analysis of TF-DNA interactions and their inherent complexity. This article is categorized under: Biological Mechanisms > Regulatory Biology. © 2018 Wiley Periodicals, Inc.
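As a brief illustration of the position weight matrix representation mentioned in this abstract, the following sketch builds a log-odds PWM from a handful of aligned binding sites and scans a sequence for the best-scoring window. The binding sites, the pseudocount of 0.5, the uniform background, and the function names are all assumptions for the example and are not taken from the review.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Log2-odds position weight matrix from equal-length aligned sites.

    Returns one dict per motif position mapping base -> log-odds score.
    """
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total)
                            / background)
            for base in BASES
        })
    return pwm


def score_window(pwm, window):
    """Sum of per-position log-odds scores for one candidate site."""
    return sum(col[base] for col, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (start index, score) of the highest-scoring window.

    Assumes an upper-case sequence containing only A, C, G and T that is
    at least as long as the motif.
    """
    width = len(pwm)
    best_pos, best_score = 0, score_window(pwm, sequence[:width])
    for start in range(1, len(sequence) - width + 1):
        s = score_window(pwm, sequence[start:start + width])
        if s > best_score:
            best_pos, best_score = start, s
    return best_pos, best_score


# Hypothetical aligned binding sites for a single factor.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = build_pwm(sites)
print(best_hit(pwm, "CCCTGACTCAGGG"))
```

Because a PWM scores each position independently, two paralogous factors with nearly identical matrices can still differ in ways the matrix cannot express, which is the kind of subtlety the review highlights.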
Topological and organizational properties of the products of house-keeping and tissue-specific genes in protein-protein interaction networks.

PubMed

Lin, Wen-Hsien; Liu, Wei-Chung; Hwang, Ming-Jing

2009-03-11

Human cells of various tissue types differ greatly in morphology despite having the same set of genetic information. Some genes are expressed in all cell types to perform house-keeping functions, while some are selectively expressed to perform tissue-specific functions. In this study, we wished to elucidate how proteins encoded by human house-keeping genes and tissue-specific genes are organized in human protein-protein interaction networks. We constructed protein-protein interaction networks for different tissue types using two gene expression datasets and one protein-protein interaction database. We then calculated three network indices of topological importance, the degree, closeness, and betweenness centralities, to measure the network position of proteins encoded by house-keeping and tissue-specific genes, and quantified their local connectivity structure. Compared to a random selection of proteins, house-keeping gene-encoded proteins tended to have a greater number of directly interacting neighbors and occupy network positions in several shortest paths of interaction between protein pairs, whereas tissue-specific gene-encoded proteins did not. In addition, house-keeping gene-encoded proteins tended to connect with other house-keeping gene-encoded proteins in all tissue types, whereas tissue-specific gene-encoded proteins also tended to connect with other tissue-specific gene-encoded proteins, but only in approximately half of the tissue types examined. Our analysis showed that house-keeping gene-encoded proteins tend to occupy important network positions, while those encoded by tissue-specific genes do not. The biological implications of our findings were discussed and we proposed a hypothesis regarding how cells organize their protein tools in protein-protein interaction networks. 
Our results led us to speculate that house-keeping gene-encoded proteins might form a core in human protein-protein interaction networks, while clusters of tissue-specific gene-encoded proteins are attached to the core at more peripheral positions of the networks.
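The three indices of topological importance named in this abstract (degree, closeness, and betweenness centrality) can be computed on a small unweighted network with the standard library alone. The sketch below is illustrative: the toy network and protein names are invented, closeness is computed over reachable nodes only, and betweenness uses Brandes' algorithm without normalization. The abstract does not state which software or normalization the authors used.

```python
from collections import deque


def degree_centrality(adj):
    """Number of directly interacting neighbors of each node."""
    return {u: len(adj[u]) for u in adj}


def bfs_distances(adj, source):
    """Shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_centrality(adj):
    """(reachable nodes - 1) / sum of shortest-path distances."""
    result = {}
    for u in adj:
        dist = bfs_distances(adj, u)
        total = sum(dist.values())
        result[u] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected graph (Brandes)."""
    bc = {u: 0.0 for u in adj}
    for s in adj:
        order, dist = [], {s: 0}
        preds = {u: [] for u in adj}
        sigma = {u: 0 for u in adj}   # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {u: 0.0 for u in adj}
        for w in reversed(order):      # accumulate dependencies backwards
            for p in preds[w]:
                delta[p] += sigma[p] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {u: b / 2 for u, b in bc.items()}


# Invented toy network: a connected core (HK*) with peripheral nodes (TS*).
network = {
    "HK1": ["HK2", "HK3", "TS1"],
    "HK2": ["HK1", "HK3", "TS2"],
    "HK3": ["HK1", "HK2"],
    "TS1": ["HK1"],
    "TS2": ["HK2"],
}
print(degree_centrality(network))
print(closeness_centrality(network))
print(betweenness_centrality(network))
```

Comparing these values for a gene set of interest against randomly drawn protein sets of the same size is one way to judge whether the set occupies unusually central positions.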
An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

PubMed

Valentini, Giorgio; Paccanaro, Alberto; Caniza, Horacio; Romero, Alfonso E; Re, Matteo

2014-06-01

In the context of "network medicine", gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic study has focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim at providing an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different "informativeness" embedded in different functional networks, outperforms unweighted integration at the 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. Network integration is necessary to boost the performance of gene prioritization methods. Moreover, the methods based on kernelized score functions can further enhance disease gene ranking results, by adopting both local and global learning strategies, able to exploit the overall topology of the network. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
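The random walk with restart algorithm named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy graph, the node names, the function name `random_walk_with_restart`, and the restart probability of 0.3 are all assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: dict mapping node -> {neighbour: weight}, assumed symmetric.
    seeds: set of known disease genes (hypothetical labels in this sketch).
    """
    nodes = sorted(adjacency)
    degree = {u: sum(adjacency[u].values()) for u in nodes}
    # The walker restarts uniformly over the seed nodes.
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            if degree[u] == 0:
                continue  # isolated node: nothing to propagate
            for v, w in adjacency[u].items():
                # Move from u to v with probability proportional to the edge weight.
                nxt[v] += (1.0 - restart) * p[u] * w / degree[u]
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy undirected network; "A" plays the role of a known disease gene.
toy_graph = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 1.0, "B": 1.0, "D": 1.0},
    "D": {"C": 1.0, "E": 1.0},
    "E": {"D": 1.0},
}
toy_scores = random_walk_with_restart(toy_graph, {"A"})
ranking = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

Candidate genes would then be read off `ranking` after dropping the seeds; nodes closer to the seed in the network receive higher scores.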
TNF-alpha -308G/A and -238G/A polymorphisms and its protein network associated with type 2 diabetes mellitus.

PubMed

Jamil, Kaiser; Jayaraman, Archana; Ahmad, Javeed; Joshi, Sindhu; Yerra, Shiva Kumar

2017-09-01

Several reports document the role of tumor necrosis factor alpha (TNF-α) and lipid metabolism in the context of acute inflammation as a causative factor in obesity-associated insulin resistance and as one of the causative parameters of type 2 diabetes mellitus (T2DM). Our aim was to investigate the association between the -308G/A and -238G/A polymorphisms located in the promoter region of the TNF-α gene and T2DM in the Indian population, together with a bioinformatics analysis of TNF-α protein networking, in order to find new target sites for the treatment of T2DM. Demographics of 100 diabetes patients and 100 healthy volunteers were collected in a structured proforma and 3 ml blood samples were obtained from the study group, after approval of the Institutional Ethics Committee (IEC) of the hospital. The information on clinical parameters was obtained from medical records. Genomic DNA was extracted; PCR-RFLP was performed using TNF-α-specific primers to detect the presence of SNPs. Various bioinformatics tools such as STRING software were used to determine its network with other associated genes. The PCR-RFLP studies showed that among the -238G/A types the GG genotype was 87%, GA genotype was 12% and AA genotype was 1%. An almost similar pattern of results was obtained with the TNF-α -308G/A polymorphism. The results obtained were evaluated statistically to determine the significance. By constructing the TNF-α protein interaction network we could analyze ontology and hubness of the network to identify the networking of this gene which may influence the functioning of other genes in promoting T2DM. We could identify new targets in T2DM which may function in association with TNF-α. Through hub analysis of the TNF-α protein network we have identified three novel proteins, RIPK1, BIRC2 and BIRC3, which may contribute to TNF-α-mediated T2DM pathogenesis. In conclusion, our study indicated that some of the genotypes of TNF-α -308G/A, -238G/A were not significantly associated with type 2 diabetes mellitus, but TNF-α -308G/A polymorphism was reported to be a potent risk factor for diabetes in higher age (>45) groups. Also, the novel hub proteins may serve as new targets against TNF-α T2DM pathogenesis.
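The genotype percentages reported for -238G/A (GG 87%, GA 12%, AA 1%) can be turned into allele frequencies with simple counting. The sketch below assumes, purely for illustration, that the percentages correspond to counts in a group of 100 individuals; the abstract does not state which group or denominator they refer to, and the function name is hypothetical.

```python
def allele_frequencies(n_gg, n_ga, n_aa):
    """Return (freq_G, freq_A) from genotype counts at a biallelic site."""
    total_alleles = 2 * (n_gg + n_ga + n_aa)  # each individual carries two alleles
    if total_alleles == 0:
        raise ValueError("at least one genotyped individual is required")
    freq_g = (2 * n_gg + n_ga) / total_alleles  # GG gives two G, GA gives one
    freq_a = (2 * n_aa + n_ga) / total_alleles  # AA gives two A, GA gives one
    return freq_g, freq_a


# Assumed counts: 87 GG, 12 GA, 1 AA (percentages read as counts out of 100).
freq_g, freq_a = allele_frequencies(87, 12, 1)
```

Under that assumption the G allele frequency is (2·87 + 12)/200 = 0.93 and the A allele frequency is 0.07.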
Enhancing gene regulatory network inference through data integration with Markov random fields

DOE PAGES

Banf, Michael; Rhee, Seung Y.

2017-02-01

Here, a gene regulatory network links transcription factors to their target genes and represents a map of transcriptional regulation. Much progress has been made in deciphering gene regulatory networks computationally. However, gene regulatory network inference for most eukaryotic organisms remains challenging. To improve the accuracy of gene regulatory network inference and facilitate candidate selection for experimentation, we developed an algorithm called GRACE (Gene Regulatory network inference ACcuracy Enhancement). GRACE exploits biological a priori and heterogeneous data integration to generate high-confidence network predictions for eukaryotic organisms using Markov Random Fields in a semi-supervised fashion. GRACE uses a novel optimization scheme to integrate regulatory evidence and biological relevance. It is particularly suited for model learning with sparse regulatory gold standard data. We show GRACE’s potential to produce high-confidence regulatory networks compared to state-of-the-art approaches using Drosophila melanogaster and Arabidopsis thaliana data. In an A. thaliana developmental gene regulatory network, GRACE recovers cell cycle related regulatory mechanisms and further hypothesizes several novel regulatory links, including a putative control mechanism of vascular structure formation due to modifications in cell proliferation.
Enhancing gene regulatory network inference through data integration with Markov random fields

DOE Office of Scientific and Technical Information (OSTI.GOV)

Banf, Michael; Rhee, Seung Y.

Here, a gene regulatory network links transcription factors to their target genes and represents a map of transcriptional regulation. Much progress has been made in deciphering gene regulatory networks computationally. However, gene regulatory network inference for most eukaryotic organisms remains challenging. To improve the accuracy of gene regulatory network inference and facilitate candidate selection for experimentation, we developed an algorithm called GRACE (Gene Regulatory network inference ACcuracy Enhancement). GRACE exploits biological a priori and heterogeneous data integration to generate high-confidence network predictions for eukaryotic organisms using Markov Random Fields in a semi-supervised fashion. GRACE uses a novel optimization scheme to integrate regulatory evidence and biological relevance. It is particularly suited for model learning with sparse regulatory gold standard data. We show GRACE’s potential to produce high-confidence regulatory networks compared to state-of-the-art approaches using Drosophila melanogaster and Arabidopsis thaliana data. In an A. thaliana developmental gene regulatory network, GRACE recovers cell cycle related regulatory mechanisms and further hypothesizes several novel regulatory links, including a putative control mechanism of vascular structure formation due to modifications in cell proliferation.
Differentially Coexpressed Disease Gene Identification Based on Gene Coexpression Network.

PubMed

Jiang, Xue; Zhang, Han; Quan, Xiongwen

2016-01-01

Screening disease-related genes by analyzing gene expression data has become a popular theme. Traditional disease-related gene selection methods focus on identifying differentially expressed genes between case samples and a control group. These traditional methods may not fully consider the changes of interactions between genes at different cell states and the dynamic processes of gene expression levels during disease progression. However, in order to understand the mechanism of disease, it is important to explore the dynamic changes of interactions between genes in biological networks at different cell states. In this study, we designed a novel framework to identify disease-related genes and developed a differentially coexpressed disease-related gene identification method based on gene coexpression network (DCGN) to screen differentially coexpressed genes. We first constructed phase-specific gene coexpression networks using time-series gene expression data and defined the concept of differential coexpression of genes in a coexpression network. Then, we designed two metrics to measure the value of gene differential coexpression according to the change of local topological structures between different phase-specific networks. Finally, we conducted a meta-analysis of gene differential coexpression based on the rank-product method. Experimental results on real-world gene expression data sets demonstrated the feasibility and effectiveness of DCGN and its superior performance over other popular disease-related gene selection methods.
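The rank-product step mentioned in the abstract above can be sketched generically: each gene is ranked within every score list, and the ranks are combined by their geometric mean. Which lists DCGN actually combines is not specified in the abstract, so the two score dictionaries and the gene labels below are assumptions, and this is the textbook rank product rather than the authors' code.

```python
import math


def rank_product(score_lists):
    """Combine several score lists into one rank product per gene.

    score_lists: list of dicts mapping gene -> score, where a higher score
    means stronger differential coexpression. All dicts share the same genes.
    Ties are broken by gene name because sorted() is stable.
    """
    genes = sorted(score_lists[0])
    log_sum = {g: 0.0 for g in genes}
    for scores in score_lists:
        ordered = sorted(genes, key=lambda g: scores[g], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            log_sum[g] += math.log(rank)
    k = len(score_lists)
    # Geometric mean of the ranks; smaller values indicate consistently top-ranked genes.
    return {g: math.exp(log_sum[g] / k) for g in genes}


# Two hypothetical differential-coexpression metrics for three genes.
metric_1 = {"g1": 0.9, "g2": 0.5, "g3": 0.1}
metric_2 = {"g1": 0.7, "g2": 0.8, "g3": 0.2}
rp = rank_product([metric_1, metric_2])
```

Here `g1` and `g2` each hold ranks 1 and 2 across the two lists, so both get a rank product of √2, while `g3` is ranked last in both and gets 3.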
Network diffusion-based analysis of high-throughput data for the detection of differentially enriched modules

PubMed Central

Bersanelli, Matteo; Mosca, Ettore; Remondini, Daniel; Castellani, Gastone; Milanesi, Luciano

2016-01-01

A relation exists between network proximity of molecular entities in interaction networks, functional similarity and association with diseases. The identification of network regions associated with biological functions and pathologies is a major goal in systems biology. We describe a network diffusion-based pipeline for the interpretation of different types of omics in the context of molecular interaction networks. We introduce the network smoothing index, a network-based quantity that jointly quantifies the amount of omics information in genes and in their network neighbourhood, using network diffusion to define network proximity. The approach is applicable to both descriptive and inferential statistics calculated on omics data. We also show that network resampling, applied to gene lists ranked by quantities derived from the network smoothing index, indicates the presence of significantly connected genes. As a proof of principle, we identified gene modules enriched in somatic mutations and transcriptional variations observed in samples of prostate adenocarcinoma (PRAD). In line with the local hypothesis, network smoothing index and network resampling underlined the existence of a connected component of genes harbouring molecular alterations in PRAD. PMID:27731320
Laplacian normalization and random walk on heterogeneous networks for disease-gene prioritization.

PubMed

Zhao, Zhi-Qin; Han, Guo-Sheng; Yu, Zu-Guo; Li, Jinyan

2015-08-01

Random walk on heterogeneous networks is a recently emerging approach to effective disease gene prioritization. Laplacian normalization is a technique capable of normalizing the weight of edges in a network. We use this technique to normalize the gene matrix and the phenotype matrix before the construction of the heterogeneous network, and also use this idea to define the transition matrices of the heterogeneous network. Our method has remarkably better performance than the existing methods for recovering known gene-phenotype relationships. The Shannon information entropy of the distribution of the transition probabilities in our networks is found to be smaller than that of the networks constructed by the existing methods, implying that a higher number of top-ranked genes can be verified as disease genes. In fact, the most probable gene-phenotype relationships ranked within the top 3 or top 5 in our gene lists can be confirmed by the OMIM database in many cases. Our algorithms have shown remarkably superior performance over the state-of-the-art algorithms for recovering gene-phenotype relationships. All Matlab code is available upon email request. Copyright © 2015 Elsevier Ltd. All rights reserved.
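Two quantities from the abstract above lend themselves to a small worked sketch: Laplacian normalization of an edge-weight matrix and the Shannon entropy of a probability distribution. The code assumes the common symmetric form, in which each weight w_ij is divided by sqrt(d_i · d_j) with d_i the weighted degree of node i; the abstract does not spell out the exact variant used, and the toy matrix and function names are hypothetical.

```python
import math


def laplacian_normalize(weights):
    """Return the matrix with entries w_ij / sqrt(d_i * d_j).

    weights: square list-of-lists of non-negative edge weights.
    Rows or columns with zero degree are left as zeros.
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    return [
        [
            weights[i][j] / math.sqrt(degree[i] * degree[j])
            if degree[i] > 0 and degree[j] > 0
            else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def shannon_entropy(probabilities):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Toy 3-node network: node 0 is linked to nodes 1 and 2.
toy_weights = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
normalized = laplacian_normalize(toy_weights)
uniform_entropy = shannon_entropy([0.5, 0.5])
peaked_entropy = shannon_entropy([1.0])
```

In the toy matrix the edge between node 0 (degree 2) and node 1 (degree 1) is rescaled to 1/√2. A more peaked distribution has lower entropy than a uniform one, which is the sense in which the abstract reads smaller entropy as more concentrated transition probabilities.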
Integration of multi-omics data for integrative gene regulatory network inference.

PubMed

Zarayeneh, Neda; Ko, Euiseong; Oh, Jung Hun; Suh, Sang; Liu, Chunyu; Gao, Jean; Kim, Donghyun; Kang, Mingon

2017-01-01

Gene regulatory networks provide comprehensive insights and in-depth understanding of complex biological processes. In most research, the molecular interactions of gene regulatory networks are inferred from a single type of genomic data, e.g., gene expression data. However, gene expression is a product of sequential interactions of multiple biological processes, such as DNA sequence variations, copy number variations, histone modifications, transcription factors, and DNA methylations. The recent rapid advances of high-throughput omics technologies enable one to measure multiple types of omics data, called 'multi-omics data', that represent the various biological processes. In this paper, we propose an Integrative Gene Regulatory Network inference method (iGRN) that incorporates multi-omics data and their interactions in gene regulatory networks. In addition to gene expressions, copy number variations and DNA methylations were considered for multi-omics data in this paper. Intensive experiments were carried out with simulation data, in which iGRN's capability to infer the integrative gene regulatory network was assessed. In these experiments, iGRN showed better performance on model representation and interpretation than other integrative methods for gene regulatory network inference. iGRN was also applied to a human brain dataset of psychiatric disorders, and the biological network of psychiatric disorders was analysed.
Integration of multi-omics data for integrative gene regulatory network inference

PubMed Central

Zarayeneh, Neda; Ko, Euiseong; Oh, Jung Hun; Suh, Sang; Liu, Chunyu; Gao, Jean; Kim, Donghyun

2017-01-01

Gene regulatory networks provide comprehensive insights and in-depth understanding of complex biological processes. In most research, the molecular interactions of gene regulatory networks are inferred from a single type of genomic data, e.g., gene expression data. However, gene expression is a product of sequential interactions of multiple biological processes, such as DNA sequence variations, copy number variations, histone modifications, transcription factors, and DNA methylations. The recent rapid advances of high-throughput omics technologies enable one to measure multiple types of omics data, called ‘multi-omics data’, that represent the various biological processes. In this paper, we propose an Integrative Gene Regulatory Network inference method (iGRN) that incorporates multi-omics data and their interactions in gene regulatory networks. In addition to gene expressions, copy number variations and DNA methylations were considered for multi-omics data in this paper. Intensive experiments were carried out with simulation data, in which iGRN’s capability to infer the integrative gene regulatory network was assessed. In these experiments, iGRN showed better performance on model representation and interpretation than other integrative methods for gene regulatory network inference. iGRN was also applied to a human brain dataset of psychiatric disorders, and the biological network of psychiatric disorders was analysed. PMID:29354189
Disease gene prioritization by integrating tissue-specific molecular networks using a robust multi-network model.

PubMed

Ni, Jingchao; Koyuturk, Mehmet; Tong, Hanghang; Haines, Jonathan; Xu, Rong; Zhang, Xiang

2016-11-10

Accurately prioritizing candidate disease genes is an important and challenging problem. Various network-based methods have been developed to predict potential disease genes by utilizing the disease similarity network and molecular networks such as protein interaction or gene co-expression networks. Although successful, a common limitation of the existing methods is that they assume all diseases share the same molecular network and a single generic molecular network is used to predict candidate genes for all diseases. However, different diseases tend to manifest in different tissues, and the molecular networks in different tissues are usually different. An ideal method should be able to incorporate tissue-specific molecular networks for different diseases. In this paper, we develop a robust and flexible method to integrate tissue-specific molecular networks for disease gene prioritization. Our method allows each disease to have its own tissue-specific network(s). We formulate the problem of candidate gene prioritization as an optimization problem based on network propagation. When there are multiple tissue-specific networks available for a disease, our method can automatically infer the relative importance of each tissue-specific network. Thus, it is robust to noisy and incomplete network data. To solve the optimization problem, we develop fast algorithms which have linear time complexities in the number of nodes in the molecular networks. We also provide rigorous theoretical foundations for our algorithms in terms of their optimality and convergence properties. Extensive experimental results show that our method can significantly improve the accuracy of candidate gene prioritization compared with the state-of-the-art methods. In our experiments, we compare our methods with 7 popular network-based disease gene prioritization algorithms on diseases from the Online Mendelian Inheritance in Man (OMIM) database. The experimental results demonstrate that our methods recover true associations more accurately than other methods in terms of AUC values, and the performance differences are significant (with paired t-test p-values less than 0.05). This validates the importance of integrating tissue-specific molecular networks for studying disease gene prioritization and shows the superiority of our network models and ranking algorithms for this purpose. The source code and datasets are available at http://nijingchao.github.io/CRstar/ .
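The AUC values used above to compare prioritization methods can be computed directly from a ranked gene list. The sketch below uses the pairwise (Mann-Whitney) definition: the fraction of (true disease gene, other gene) pairs in which the true gene receives the higher score, counting ties as one half. The scores and labels are made-up illustrative values, not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: prioritization scores, higher meaning more likely a disease gene.
    labels: truthy for known disease genes, falsy otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative gene")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for five candidate genes; two are true disease genes.
example_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
example_labels = [1, 0, 1, 0, 0]
example_auc = auc(example_scores, example_labels)
```

In this toy case five of the six positive-negative pairs are ordered correctly, giving an AUC of 5/6. The quadratic pairwise loop is chosen for clarity; a rank-based formula would be preferable for genome-scale lists.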

Gene regulation is governed by a core network in hepatocellular carcinoma.

PubMed

Gu, Zuguang; Zhang, Chenyu; Wang, Jin

2012-05-01

Hepatocellular carcinoma (HCC) is one of the most lethal cancers worldwide, and the mechanisms that lead to the disease are still relatively unclear. However, with the development of high-throughput technologies it is possible to gain a systematic view of biological systems to enhance the understanding of the roles of genes associated with HCC. Thus, analysis of the mechanism of molecule interactions in the context of gene regulatory networks can reveal specific sub-networks that lead to the development of HCC. In this study, we aimed to identify the most important gene regulations that are dysfunctional in HCC generation. Our method for constructing the gene regulatory network is based on predicted target interactions, experimentally supported interactions, and a co-expression model. Regulators in the network included both transcription factors and microRNAs to provide a complete view of gene regulation. Analysis of the gene regulatory network revealed that gene regulation in HCC is highly modular, in which different sets of regulators take charge of specific biological processes. We found that microRNAs mainly control biological functions related to mitochondria and oxidative reduction, while transcription factors control immune responses, extracellular activity and the cell cycle. On the higher level of gene regulation, there exists a core network that organizes regulations between different modules and maintains the robustness of the whole network. There is direct experimental evidence for most of the regulators in the core gene regulatory network relating to HCC. We infer it is the central controller of gene regulation. Finally, we explored the influence of the core gene regulatory network on biological pathways. Our analysis provides insights into the mechanism of transcriptional and post-transcriptional control in HCC. In particular, we highlight the importance of the core gene regulatory network; we propose that it is highly related to HCC and we believe further experimental validation is worthwhile.
Co-expression of mitosis-regulating genes contributes to malignant progression and prognosis in oligodendrogliomas

PubMed Central

Liu, Yanwei; Hu, Huimin; Zhang, Chuanbao; Wang, Haoyuan; Zhang, Wenlong; Wang, Zheng; Li, Mingyang; Zhang, Wei; Zhou, Dabiao; Jiang, Tao

2015-01-01

The clinical prognosis of patients with glioma is determined by tumor grades, but tumors of different subtypes with equal malignancy grade usually have different prognosis that is largely determined by genetic abnormalities. Oligodendrogliomas (ODs) are the second most common type of gliomas. In this study, integrative analyses found that distribution of TCGA transcriptomic subtypes was associated with grade progression in ODs. To identify critical gene(s) associated with tumor grades and TCGA subtypes, we analyzed 34 normal brain tissue (NBT) samples, 146 WHO grade II and 130 grade III ODs by microarray and RNA sequencing, and identified a co-expression network of six genes (AURKA, NDC80, CENPK, KIAA0101, TIMELESS and MELK) that was associated with tumor grades and TCGA subtypes as well as Ki-67 expression. Validation of the six genes was performed by qPCR in an additional 28 ODs. Importantly, these genes also were validated in four high-grade recurrent gliomas and the initial lower-grade gliomas resected from the same patients. Finally, the RNA data on two genes with the highest discrimination potential (AURKA and NDC80) and Ki-67 were validated on an independent cohort (5 NBTs and 86 ODs) by immunohistochemistry. Knockdown of AURKA and NDC80 by siRNAs suppressed Ki-67 expression and proliferation of glioma cells. Survival analysis showed that high expression of the six genes collectively indicated a poor survival outcome. Correlation and protein interaction analysis provided further evidence for this co-expression network. These data suggest that the co-expression of the six mitosis-regulating genes was associated with malignant progression and prognosis in ODs. PMID:26468983
Temporal transcriptional logic of dynamic regulatory networks underlying nitrogen signaling and use in plants.

PubMed

Varala, Kranthi; Marshall-Colón, Amy; Cirrone, Jacopo; Brooks, Matthew D; Pasquino, Angelo V; Léran, Sophie; Mittal, Shipra; Rock, Tara M; Edwards, Molly B; Kim, Grace J; Ruffel, Sandrine; McCombie, W Richard; Shasha, Dennis; Coruzzi, Gloria M

2018-06-19

This study exploits time, the relatively unexplored fourth dimension of gene regulatory networks (GRNs), to learn the temporal transcriptional logic underlying dynamic nitrogen (N) signaling in plants. Our "just-in-time" analysis of time-series transcriptome data uncovered a temporal cascade of cis elements underlying dynamic N signaling. To infer transcription factor (TF)-target edges in a GRN, we applied a time-based machine learning method to 2,174 dynamic N-responsive genes. We experimentally determined a network precision cutoff, using TF-regulated genome-wide targets of three TF hubs (CRF4, SNZ, and CDF1), used to "prune" the network to 155 TFs and 608 targets. This network precision was reconfirmed using genome-wide TF-target regulation data for four additional TFs (TGA1, HHO5/6, and PHL1) not used in network pruning. These higher-confidence edges in the GRN were further filtered by independent TF-target binding data, used to calculate a TF "N-specificity" index. This refined GRN identifies the temporal relationship of known/validated regulators of N signaling (NLP7/8, TGA1/4, NAC4, HRS1, and LBD37/38/39) and 146 additional regulators. Six TFs (CRF4, SNZ, CDF1, HHO5/6, and PHL1) validated herein regulate a significant number of genes in the dynamic N response, targeting 54% of N-uptake/assimilation pathway genes. Phenotypically, inducible overexpression of CRF4 in planta regulates genes resulting in altered biomass, root development, and ¹⁵NO₃⁻ uptake, specifically under low-N conditions. This dynamic N-signaling GRN now provides the temporal "transcriptional logic" for 155 candidate TFs to improve nitrogen use efficiency with potential agricultural applications. Broadly, these time-based approaches can uncover the temporal transcriptional logic for any biological response system in biology, agriculture, or medicine. Copyright © 2018 the Author(s). Published by PNAS.
Comparative Transcriptome Analysis between Gynoecious and Monoecious Plants Identifies Regulatory Networks Controlling Sex Determination in Jatropha curcas

PubMed Central

Chen, Mao-Sheng; Pan, Bang-Zhen; Fu, Qiantang; Tao, Yan-Bin; Martínez-Herrera, Jorge; Niu, Longjian; Ni, Jun; Dong, Yuling; Zhao, Mei-Li; Xu, Zeng-Fu

2017-01-01

Most germplasms of the biofuel plant Jatropha curcas are monoecious. A gynoecious genotype of J. curcas was found, whose male flowers are aborted at an early stage of inflorescence development. To investigate the regulatory mechanism of transition from monoecious to gynoecious plants, a comparative transcriptome analysis between gynoecious and monoecious inflorescences was performed. A total of 3,749 genes differentially expressed in two developmental stages of inflorescences were identified. Among them, 32 genes were involved in floral development, and 70 in phytohormone biosynthesis and signaling pathways. Six genes homologous to KNOTTED1-LIKE HOMEOBOX GENE 6 (KNAT6), MYC2, SHI-RELATED SEQUENCE 5 (SRS5), SHORT VEGETATIVE PHASE (SVP), TERMINAL FLOWER 1 (TFL1), and TASSELSEED2 (TS2), which control floral development, were considered as candidate regulators that may be involved in sex differentiation in J. curcas. Abscisic acid, auxin, gibberellin, and jasmonate biosynthesis were lower, whereas cytokinin biosynthesis was higher, in gynoecious than in monoecious inflorescences. Moreover, the exogenous application of gibberellic acid (GA3) promoted perianth development in male flowers and partly prevented pistil development in female flowers to generate neutral flowers in gynoecious inflorescences. The arrest of stamen primordium at an early development stage probably causes the abortion of male flowers to generate gynoecious individuals. These results suggest that some floral development genes and phytohormone signaling pathways orchestrate the process of sex determination in J. curcas. Our study provides a basic framework for the regulation networks of sex determination in J. curcas and will be helpful for elucidating the evolution of the plant reproductive system. PMID:28144243
Comparative Transcriptome Analysis between Gynoecious and Monoecious Plants Identifies Regulatory Networks Controlling Sex Determination in Jatropha curcas.

PubMed

Chen, Mao-Sheng; Pan, Bang-Zhen; Fu, Qiantang; Tao, Yan-Bin; Martínez-Herrera, Jorge; Niu, Longjian; Ni, Jun; Dong, Yuling; Zhao, Mei-Li; Xu, Zeng-Fu

2016-01-01

Most germplasms of the biofuel plant Jatropha curcas are monoecious. A gynoecious genotype of J. curcas was found, whose male flowers are aborted at an early stage of inflorescence development. To investigate the regulatory mechanism of transition from monoecious to gynoecious plants, a comparative transcriptome analysis between gynoecious and monoecious inflorescences was performed. A total of 3,749 genes differentially expressed in two developmental stages of inflorescences were identified. Among them, 32 genes were involved in floral development, and 70 in phytohormone biosynthesis and signaling pathways. Six genes homologous to KNOTTED1-LIKE HOMEOBOX GENE 6 (KNAT6), MYC2, SHI-RELATED SEQUENCE 5 (SRS5), SHORT VEGETATIVE PHASE (SVP), TERMINAL FLOWER 1 (TFL1), and TASSELSEED2 (TS2), which control floral development, were considered as candidate regulators that may be involved in sex differentiation in J. curcas. Abscisic acid, auxin, gibberellin, and jasmonate biosynthesis were lower, whereas cytokinin biosynthesis was higher, in gynoecious than in monoecious inflorescences. Moreover, the exogenous application of gibberellic acid (GA3) promoted perianth development in male flowers and partly prevented pistil development in female flowers to generate neutral flowers in gynoecious inflorescences. The arrest of stamen primordium at an early development stage probably causes the abortion of male flowers to generate gynoecious individuals. These results suggest that some floral development genes and phytohormone signaling pathways orchestrate the process of sex determination in J. curcas. Our study provides a basic framework for the regulation networks of sex determination in J. curcas and will be helpful for elucidating the evolution of the plant reproductive system.
Inference of cancer-specific gene regulatory networks using soft computing rules.

PubMed

Wang, Xiaosheng; Gotoh, Osamu

2010-03-24

Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer) using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.
Co-expression network analysis of duplicate genes in maize (Zea mays L.) reveals no subgenome bias.

PubMed

Li, Lin; Briskine, Roman; Schaefer, Robert; Schnable, Patrick S; Myers, Chad L; Flagel, Lex E; Springer, Nathan M; Muehlbauer, Gary J

2016-11-04

Gene duplication is prevalent in many species and can result in coding and regulatory divergence. Gene duplications can be classified as whole genome duplication (WGD), tandem and inserted (non-syntenic). In maize, WGD resulted in the subgenomes maize1 and maize2, of which maize1 is considered the dominant subgenome. However, the landscape of co-expression network divergence of duplicate genes in maize is still largely uncharacterized. To address the consequence of gene duplication on co-expression network divergence, we developed a gene co-expression network from RNA-seq data derived from 64 different tissues/stages of the maize reference inbred B73. WGD, tandem and inserted gene duplications exhibited distinct regulatory divergence. Inserted duplicate genes were more likely to be singletons in the co-expression networks, while WGD duplicate genes were likely to be co-expressed with other genes. Tandem duplicate genes were enriched in the co-expression pattern where co-expressed genes were nearly identical for the duplicates in the network. Older gene duplications exhibit more extensive co-expression variation than younger duplications. Overall, non-syntenic genes primarily from inserted duplications show more co-expression divergence. Also, such enlarged co-expression divergence is significantly related to duplication age. Moreover, subgenome dominance was not observed in the co-expression networks - maize1 and maize2 exhibit similar levels of intra-subgenome correlations. Intriguingly, the level of inter-subgenome co-expression was similar to the level of intra-subgenome correlations, genes from specific subgenomes were not likely to be enriched in co-expression network modules, and the hub genes were not predominantly from any specific subgenome in maize. Our work provides a comprehensive analysis of maize co-expression network divergence for three different types of gene duplications and identifies potential relationships between duplication types, duplication ages and co-expression consequences.
Global Mapping of the Yeast Genetic Interaction Network

NASA Astrophysics Data System (ADS)

Tong, Amy Hin Yan; Lesage, Guillaume; Bader, Gary D.; Ding, Huiming; Xu, Hong; Xin, Xiaofeng; Young, James; Berriz, Gabriel F.; Brost, Renee L.; Chang, Michael; Chen, YiQun; Cheng, Xin; Chua, Gordon; Friesen, Helena; Goldberg, Debra S.; Haynes, Jennifer; Humphries, Christine; He, Grace; Hussein, Shamiza; Ke, Lizhu; Krogan, Nevan; Li, Zhijian; Levinson, Joshua N.; Lu, Hong; Ménard, Patrice; Munyana, Christella; Parsons, Ainslie B.; Ryan, Owen; Tonikian, Raffi; Roberts, Tania; Sdicu, Anne-Marie; Shapiro, Jesse; Sheikh, Bilal; Suter, Bernhard; Wong, Sharyl L.; Zhang, Lan V.; Zhu, Hongwei; Burd, Christopher G.; Munro, Sean; Sander, Chris; Rine, Jasper; Greenblatt, Jack; Peter, Matthias; Bretscher, Anthony; Bell, Graham; Roth, Frederick P.; Brown, Grant W.; Andrews, Brenda; Bussey, Howard; Boone, Charles

2004-02-01

A genetic interaction network containing ~1000 genes and ~4000 interactions was mapped by crossing mutations in 132 different query genes into a set of ~4700 viable yeast gene deletion mutants and scoring the double mutant progeny for fitness defects. Network connectivity was predictive of function because interactions often occurred among functionally related genes, and similar patterns of interactions tended to identify components of the same pathway. The genetic network exhibited dense local neighborhoods; therefore, the position of a gene on a partially mapped network is predictive of other genetic interactions. Because digenic interactions are common in yeast, similar networks may underlie the complex genetics associated with inherited phenotypes in other organisms.
The Evolution of Gene Regulatory Networks that Define Arthropod Body Plans.

PubMed

Auman, Tzach; Chipman, Ariel D

2017-09-01

Our understanding of the genetics of arthropod body plan development originally stems from work on Drosophila melanogaster from the late 1970s and onward. In Drosophila, there is a relatively detailed model for the network of gene interactions that proceeds in a sequential-hierarchical fashion to define the main features of the body plan. Over the years, we have a growing understanding of the networks involved in defining the body plan in an increasing number of arthropod species. It is now becoming possible to tease out the conserved aspects of these networks and to try to reconstruct their evolution. In this contribution, we focus on several key nodes of these networks, starting from early patterning in which the main axes are determined and the broad morphological domains of the embryo are defined, and on to later stage wherein the growth zone network is active in sequential addition of posterior segments. The pattern of conservation of networks is very patchy, with some key aspects being highly conserved in all arthropods and others being very labile. Many aspects of early axis patterning are highly conserved, as are some aspects of sequential segment generation. In contrast, regional patterning varies among different taxa, and some networks, such as the terminal patterning network, are only found in a limited range of taxa. The growth zone segmentation network is ancient and is probably plesiomorphic to all arthropods. In some insects, it has undergone significant modification to give rise to a more hardwired network that generates individual segments separately. In other insects and in most arthropods, the sequential segmentation network has undergone a significant amount of systems drift, wherein many of the genes have changed. However, it maintains a conserved underlying logic and function. © The Author 2017. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. 
Transcriptome profiling analysis reveals biomarkers in colon cancer samples of various differentiation

PubMed Central

Yu, Tonghu; Zhang, Huaping; Qi, Hong

2018-01-01

The aim of the present study was to investigate more colon cancer-related genes in different stages. Gene expression profile E-GEOD-62932 was extracted for differentially expressed gene (DEG) screening. Series test of cluster analysis was used to obtain significant trending models. Based on the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes databases, functional and pathway enrichment analysis were processed and a pathway relation network was constructed. Gene co-expression network and gene signal network were constructed for common DEGs. The DEGs with the same trend were clustered and in total, 16 clusters with statistical significance were obtained. The screened DEGs were enriched into small molecule metabolic process and metabolic pathways. The pathway relation network was constructed with 57 nodes. A total of 328 common DEGs were obtained. Gene signal network was constructed with 71 nodes. Gene co-expression network was constructed with 161 nodes and 211 edges. ABCD3, CPT2, AGL and JAM2 are potential biomarkers for the diagnosis of colon cancer. PMID:29928385
Gene networks and the evolution of plant morphology.

PubMed

Das Gupta, Mainak; Tsiantis, Miltos

2018-06-06

Elaboration of morphology depends on the precise orchestration of gene expression by key regulatory genes. The hierarchy and relationship among the participating genes is commonly known as gene regulatory network (GRN). Therefore, the evolution of morphology ultimately occurs by the rewiring of gene network structures or by the co-option of gene networks to novel domains. The availability of high-resolution expression data combined with powerful statistical tools have opened up new avenues to formulate and test hypotheses on how diverse gene networks influence trait development and diversity. Here we summarize recent studies based on both big-data and genetics approaches to understand the evolution of plant form and physiology. We also discuss recent genome-wide investigations on how studying open-chromatin regions may help study the evolution of gene expression patterns. Copyright © 2018. Published by Elsevier Ltd.
The Role of Retinal Determination Gene Network (RDGN) in Hormone Signaling Transduction and Prostate Tumorigenes

DTIC Science & Technology

2014-10-01

McCue P, Lisanti MP, Wang C, Davis RJ, Mardon G, Pestell RG. The Endogenous Cell-Fate Factor Dachshund Restrains Prostate Epithelial Cell Migration via...Loro E, Pestell RG. “Inhibition of Breast Tumor Stem Cells Expansion by the Endogenous Cell Fate Determination Factor Dachshund.” Chapter in Volume
MINER: exploratory analysis of gene interaction networks by machine learning from expression data.

PubMed

Kadupitige, Sidath Randeni; Leung, Kin Chun; Sellmeier, Julia; Sivieng, Jane; Catchpoole, Daniel R; Bain, Michael E; Gaëta, Bruno A

2009-12-03

The reconstruction of gene regulatory networks from high-throughput "omics" data has become a major goal in the modelling of living systems. Numerous approaches have been proposed, most of which attempt only "one-shot" reconstruction of the whole network with no intervention from the user, or offer only simple correlation analysis to infer gene dependencies. We have developed MINER (Microarray Interactive Network Exploration and Representation), an application that combines multivariate non-linear tree learning of individual gene regulatory dependencies, visualisation of these dependencies as both trees and networks, and representation of known biological relationships based on common Gene Ontology annotations. MINER allows biologists to explore the dependencies influencing the expression of individual genes in a gene expression data set in the form of decision, model or regression trees, using their domain knowledge to guide the exploration and formulate hypotheses. Multiple trees can then be summarised in the form of a gene network diagram. MINER is being adopted by several of our collaborators and has already led to the discovery of a new significant regulatory relationship with subsequent experimental validation. Unlike most gene regulatory network inference methods, MINER allows the user to start from genes of interest and build the network gene-by-gene, incorporating domain expertise in the process. This approach has been used successfully with RNA microarray data but is applicable to other quantitative data produced by high-throughput technologies such as proteomics and "next generation" DNA sequencing.
Cancer-related marketing centrality motifs acting as pivot units in the human signaling network and mediating cross-talk between biological pathways.

PubMed

Li, Wan; Chen, Lina; Li, Xia; Jia, Xu; Feng, Chenchen; Zhang, Liangcai; He, Weiming; Lv, Junjie; He, Yuehan; Li, Weiguo; Qu, Xiaoli; Zhou, Yanyan; Shi, Yuchen

2013-12-01

Network motifs in central positions are considered to not only have more in-coming and out-going connections but are also localized in an area where more paths reach the networks. These central motifs have been extensively investigated to determine their consistent functions or associations with specific function categories. However, their functional potentials in the maintenance of cross-talk between different functional communities are unclear. In this paper, we constructed an integrated human signaling network from the Pathway Interaction Database. We identified 39 essential cancer-related motifs in central roles, which we called cancer-related marketing centrality motifs, using combined centrality indices on the system level. Our results demonstrated that these cancer-related marketing centrality motifs were pivotal units in the signaling network, and could mediate cross-talk between 61 biological pathways (25 could be mediated by one motif on average), most of which were cancer-related pathways. Further analysis showed that molecules of most marketing centrality motifs were in the same or adjacent subcellular localizations, such as the motif containing PI3K, PDK1 and AKT1 in the plasma membrane, to mediate signal transduction between 32 cancer-related pathways. Finally, we analyzed the pivotal roles of cancer genes in these marketing centrality motifs in the pathogenesis of cancers, and found that non-cancer genes were potential cancer-related genes.
A Temperature-Responsive Network Links Cell Shape and Virulence Traits in a Primary Fungal Pathogen

PubMed Central

Beyhan, Sinem; Gutierrez, Matias; Voorhies, Mark; Sil, Anita

2013-01-01

Survival at host temperature is a critical trait for pathogenic microbes of humans. Thermally dimorphic fungal pathogens, including Histoplasma capsulatum, are soil fungi that undergo dramatic changes in cell shape and virulence gene expression in response to host temperature. How these organisms link changes in temperature to both morphologic development and expression of virulence traits is unknown. Here we elucidate a temperature-responsive transcriptional network in H. capsulatum, which switches from a filamentous form in the environment to a pathogenic yeast form at body temperature. The circuit is driven by three highly conserved factors, Ryp1, Ryp2, and Ryp3, that are required for yeast-phase growth at 37°C. Ryp factors belong to distinct families of proteins that control developmental transitions in fungi: Ryp1 is a member of the WOPR family of transcription factors, and Ryp2 and Ryp3 are both members of the Velvet family of proteins whose molecular function is unknown. Here we provide the first evidence that these WOPR and Velvet proteins interact, and that Velvet proteins associate with DNA to drive gene expression. Using genome-wide chromatin immunoprecipitation studies, we determine that Ryp1, Ryp2, and Ryp3 associate with a large common set of genomic loci that includes known virulence genes, indicating that the Ryp factors directly control genes required for pathogenicity in addition to their role in regulating cell morphology. We further dissect the Ryp regulatory circuit by determining that a fourth transcription factor, which we name Ryp4, is required for yeast-phase growth and gene expression, associates with DNA, and displays interdependent regulation with Ryp1, Ryp2, and Ryp3. Finally, we define cis-acting motifs that recruit the Ryp factors to their interwoven network of temperature-responsive target genes. 
Taken together, our results reveal a positive feedback circuit that directs a broad transcriptional switch between environmental and pathogenic states in response to temperature. PMID:23935449
Comparative analysis of protein interactome networks prioritizes candidate genes with cancer signatures.

PubMed

Li, Yongsheng; Sahni, Nidhi; Yi, Song

2016-11-29

Comprehensive understanding of human cancer mechanisms requires the identification of a thorough list of cancer-associated genes, which could serve as biomarkers for diagnoses and therapies in various types of cancer. Although substantial progress has been made in functional studies to uncover genes involved in cancer, these efforts are often time-consuming and costly. Therefore, it remains challenging to comprehensively identify cancer candidate genes. Network-based methods have accelerated this process through the analysis of complex molecular interactions in the cell. However, the extent to which various interactome networks can contribute to prediction of candidate genes responsible for cancer is still enigmatic. In this study, we evaluated different human protein-protein interactome networks and compared their application to cancer gene prioritization. Our results indicate that network analyses can increase the power to identify novel cancer genes. In particular, such predictive power can be enhanced with the use of unbiased systematic protein interaction maps for cancer gene prioritization. Functional analysis reveals that the top ranked genes from network predictions co-occur often with cancer-related terms in literature, and further, these candidate genes are indeed frequently mutated across cancers. Finally, our study suggests that integrating interactome networks with other omics datasets could provide novel insights into cancer-associated genes and underlying molecular mechanisms.
Genome-Wide Responses of Female Fruit Flies Subjected to Divergent Mating Regimes

PubMed Central

Gerrard, Dave T.; Fricke, Claudia; Edward, Dominic A.; Edwards, Dylan R.; Chapman, Tracey

2013-01-01

Elevated rates of mating and reproduction cause decreased female survival and lifetime reproductive success across a wide range of taxa from flies to humans. These costs are fundamentally important to the evolution of life histories. Here we investigate the potential mechanistic basis of this classic life history component. We conducted 4 independent replicated experiments in which female Drosophila melanogaster were subjected to ‘high’ and ‘low’ mating regimes, resulting in highly significant differences in lifespan. We sampled females for transcriptomic analysis at day 10 of life, before the visible onset of ageing, and used Tiling expression arrays to detect differential gene expression in two body parts (abdomen versus head+thorax). The divergent mating regimes were associated with significant differential expression in a network of genes showing evidence for interactions with ecdysone receptor. Preliminary experimental manipulation of two genes in that network with roles in post-transcriptional modification (CG11486, eyegone) tended to enhance sensitivity to mating costs. However, the subtle nature of those effects suggests substantial functional redundancy or parallelism in this gene network, which could buffer females against excessive responses. There was also evidence for differential expression in genes involved in germline maintenance, cell proliferation and in gustation / odorant reception. Interestingly, we detected differential expression in three specific genes (EcR, keap1, lbk1) and one class of genes (gustation / odorant receptors) with previously reported roles in determining lifespan. Our results suggest that high and low mating regimes that lead to divergence in lifespan are associated with changes in the expression of genes such as reproductive hormones, that influence resource allocation to the germ line, and that may modify post-translational gene expression. 
This predicts that the correct signalling of nutrient levels to the reproductive system is important for maintaining organismal integrity. PMID:23826372
DiffSLC: A graph centrality method to detect essential proteins of a protein-protein interaction network.

PubMed

Mistry, Divya; Wise, Roger P; Dickerson, Julie A

2017-01-01

Identification of central genes and proteins in biomolecular networks provides credible candidates for pathway analysis, functional analysis, and essentiality prediction. The DiffSLC centrality measure predicts central and essential genes and proteins using a protein-protein interaction network. Network centrality measures prioritize nodes and edges based on their importance to the network topology. These measures helped identify critical genes and proteins in biomolecular networks. The proposed centrality measure, DiffSLC, combines the number of interactions of a protein and the gene coexpression values of genes from which those proteins were translated, as a weighting factor to bias the identification of essential proteins in a protein interaction network. Potentially essential proteins with low node degree are promoted through eigenvector centrality. Thus, the gene coexpression values are used in conjunction with the eigenvector of the network's adjacency matrix and edge clustering coefficient to improve essentiality prediction. The outcome of this prediction is shown using three variations: (1) inclusion or exclusion of gene co-expression data, (2) impact of different coexpression measures, and (3) impact of different gene expression data sets. For a total of seven networks, DiffSLC is compared to other centrality measures using Saccharomyces cerevisiae protein interaction networks and gene expression data. Comparisons are also performed for the top ranked proteins against the known essential genes from the Saccharomyces Gene Deletion Project, which show that DiffSLC detects more essential proteins and has a higher area under the ROC curve than other compared methods. This makes DiffSLC a stronger alternative to other centrality methods for detecting essential genes using a protein-protein interaction network that obeys centrality-lethality principle. DiffSLC is implemented using the igraph package in R, and networkx package in Python. 
The Python package can be obtained from git.io/diffslcpy. The R implementation and the code to reproduce the analysis are available via git.io/diffslc.
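The idea of biasing a degree-based score with co-expression and then combining it with eigenvector centrality can be sketched as follows. This is not the published DiffSLC formula: the mixing parameter `beta`, the normalisation, and the omission of the edge clustering coefficient are simplifying assumptions, and the protein names and weights are hypothetical.

```python
def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I; the shift avoids oscillation on bipartite graphs."""
    x = {n: 1.0 for n in adj}
    for _ in range(iterations):
        nxt = {n: x[n] + sum(x[m] for m in adj[n]) for n in adj}
        norm = max(nxt.values())
        x = {n: v / norm for n, v in nxt.items()}  # largest entry becomes 1.0
    return x

def combined_centrality(edges, beta=0.5):
    """Mix co-expression-weighted degree with eigenvector centrality.

    `edges` holds (protein_a, protein_b, coexpression) triples.
    `beta` is an illustrative mixing weight (assumption).
    """
    adj, wdeg = {}, {}
    for a, b, w in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
        wdeg[a] = wdeg.get(a, 0.0) + w
        wdeg[b] = wdeg.get(b, 0.0) + w
    ev = eigenvector_centrality(adj)
    top = max(wdeg.values())
    return {n: beta * wdeg[n] / top + (1 - beta) * ev[n] for n in adj}

# Hypothetical interaction list with co-expression values as edge weights.
ppi = [
    ("P1", "P2", 0.9), ("P1", "P3", 0.8), ("P1", "P4", 0.7),
    ("P2", "P3", 0.6), ("P4", "P5", 0.2),
]
scores = combined_centrality(ppi)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Here `ranking` places the highly connected, strongly co-expressed `P1` first and the peripheral `P5` last; in a real analysis the ranking would be compared against known essential genes.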
Machine Learning–Based Differential Network Analysis: A Study of Stress-Responsive Transcriptomes in Arabidopsis

PubMed Central

Ma, Chuang; Xin, Mingming; Feldmann, Kenneth A.; Wang, Xiangfeng

2014-01-01

Machine learning (ML) is an intelligent data mining technique that builds a prediction model based on the learning of prior knowledge to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning–based differential network analysis (mlDNA) and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. The mlDNA first used a ML-based filtering process to remove nonexpressed, constitutively expressed, or non-stress-responsive “noninformative” genes prior to network construction, through learning the patterns of 32 expression characteristics of known stress-related genes. The retained “informative” genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing–based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress–related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes. PMID:24520154
Regulatory network involving miRNAs and genes in serous ovarian carcinoma

PubMed Central

Zhao, Haiyan; Xu, Hao; Xue, Luchen

2017-01-01

Serous ovarian carcinoma (SOC) is one of the most life-threatening types of gynecological malignancy, but the pathogenesis of SOC remains unknown. Previous studies have indicated that differentially expressed genes and microRNAs (miRNAs) serve important functions in SOC. However, genes and miRNAs are identified in a disperse form, and limited information is known about the regulatory association between miRNAs and genes in SOC. In the present study, three regulatory networks were hierarchically constructed, including a differentially-expressed network, a related network and a global network to reveal associations between each factor. In each network, there were three types of factors, which were genes, miRNAs and transcription factors that interact with each other. Focus was placed on the differentially-expressed network, in which all genes and miRNAs were differentially expressed and therefore may have affected the development of SOC. Following the comparison and analysis between the three networks, a number of signaling pathways which demonstrated differentially expressed elements were highlighted. Subsequently, the upstream and downstream elements of differentially expressed miRNAs and genes were listed, and a number of key elements (differentially expressed miRNAs, genes and TFs predicted using the P-match method) were analyzed. The differentially expressed network partially illuminated the pathogenesis of SOC. It was hypothesized that if there was no differential expression of miRNAs and genes, SOC may be prevented and treatment may be identified. The present study provided a theoretical foundation for gene therapy for SOC. PMID:29113276

A human functional protein interaction network and its application to cancer data analysis

PubMed Central

2010-01-01

Background: One challenge facing biologists is to tease out useful information from massive data sets for further analysis. A pathway-based analysis may shed light by projecting candidate genes onto protein functional relationship networks. We are building such a pathway-based analysis system. Results: We have constructed a protein functional interaction network by extending curated pathways with non-curated sources of information, including protein-protein interactions, gene coexpression, protein domain interaction, Gene Ontology (GO) annotations and text-mined protein interactions, which cover close to 50% of the human proteome. By applying this network to two glioblastoma multiforme (GBM) data sets and projecting cancer candidate genes onto the network, we found that the majority of GBM candidate genes form a cluster and are closer than expected by chance, and the majority of GBM samples have sequence-altered genes in two network modules, one mainly comprising genes whose products are localized in the cytoplasm and plasma membrane, and another comprising gene products in the nucleus. Both modules are highly enriched in known oncogenes, tumor suppressors and genes involved in signal transduction. Similar network patterns were also found in breast, colorectal and pancreatic cancers. Conclusions: We have built a highly reliable functional interaction network upon expert-curated pathways and applied this network to the analysis of two genome-wide GBM and several other cancer data sets. The network patterns revealed from our results suggest common mechanisms in cancer biology. Our system should provide a foundation for a network or pathway-based analysis platform for cancer and other diseases. PMID:20482850
Optimal stabilization of Boolean networks through collective influence

NASA Astrophysics Data System (ADS)

Wang, Jiannan; Pei, Sen; Wei, Wei; Feng, Xiangnan; Zheng, Zhiming

2018-03-01

Boolean networks have attracted much attention due to their wide applications in describing dynamics of biological systems. During past decades, much effort has been invested in unveiling how network structure and update rules affect the stability of Boolean networks. In this paper, we aim to identify and control a minimal set of influential nodes that is capable of stabilizing an unstable Boolean network. For locally treelike Boolean networks with biased truth tables, we propose a greedy algorithm to identify influential nodes in Boolean networks by minimizing the largest eigenvalue of a modified nonbacktracking matrix. We test the performance of the proposed collective influence algorithm on four different networks. Results show that the collective influence algorithm can stabilize each network with a smaller set of nodes compared with other heuristic algorithms. Our work provides a new insight into the mechanism that determines the stability of Boolean networks, which may find applications in identifying virulence genes that lead to serious diseases.
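The notion of stabilizing an unstable Boolean network by controlling a few nodes can be illustrated with a small simulation. The sketch below is not the collective influence algorithm from the abstract (it does not use the nonbacktracking matrix); it only shows, on a hypothetical three-node negative feedback ring, how pinning one node turns an oscillation into a fixed point.

```python
def step(state, rules, pinned):
    """Synchronous update: every node applies its rule, then pins are enforced."""
    new_state = {node: rule(state) for node, rule in rules.items()}
    new_state.update(pinned)
    return new_state

def attractor_length(state, rules, pinned=None, max_steps=1000):
    """Follow the trajectory until a state repeats; return the cycle length.

    A length of 1 means the network settled into a fixed point.
    """
    pinned = pinned or {}
    state = {**state, **pinned}
    seen = {}
    for t in range(max_steps):
        key = tuple(sorted(state.items()))
        if key in seen:
            return t - seen[key]
        seen[key] = t
        state = step(state, rules, pinned)
    raise RuntimeError("no attractor found within max_steps")

# Hypothetical ring: A is repressed by C, B copies A, C copies B.
rules = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}
start = {"A": False, "B": False, "C": False}

free_cycle = attractor_length(start, rules)                  # oscillates
pinned_cycle = attractor_length(start, rules, {"A": False})  # fixed point
```

Without control the ring cycles through six states; holding `A` off drives `B` and `C` off as well, so the pinned network reaches a fixed point. A selection method such as the one in the abstract addresses which nodes to pin when the network is large.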
Environmental sex determination mechanisms in reptiles.

PubMed

Merchant-Larios, H; Díaz-Hernández, V

2013-01-01

Temperature-dependent sex determination (TSD) was first discovered in reptiles. Since then, a great diversity of sex-determining responses to temperature has been reported. Higher temperatures can produce either males or females, and the temperature ranges and lengths of exposure that influence TSD are remarkably variable among species. In addition, transitory gene regulatory networks leading to gonadal TSD have evolved. Although most genes involved in gonadal development are conserved in vertebrates, including TSD species, temporal and spatial gene expression patterns vary among species. Despite variation in TSD pattern and gene expression heterochrony, the structural framework, the medullary cords, and cortex of the bipotential gonad have been strongly conserved. Aromatase (CYP19), which regulates gonadal estrogen levels, is proposed to be the main target of a putative thermosensitive factor for TSD. However, manipulation of estrogen levels rarely mimics the precise timing of temperature effects on expression of gonadal genes, as occurs with TSD. Estrogen levels may influence sex determination or gonad differentiation depending on the species. Furthermore, the process leading to sex determination under the influence of temperature poses problems that are not encountered by species with genetic sex determination. Yolk steroids of maternal origin and steroids produced by the embryonic nervous system should also be considered as sources of hormones that may play a role in TSD. Copyright © 2012 S. Karger AG, Basel.
A Risk Stratification Model for Lung Cancer Based on Gene Coexpression Network and Deep Learning

PubMed Central

2018-01-01

Risk stratification models for lung cancer based on gene expression profiles are of great interest. Instead of previous models based on individual prognostic genes, we aimed to develop a novel system-level risk stratification model for lung adenocarcinoma based on gene coexpression networks. Using multiple microarray data sets, gene coexpression network analysis was performed to identify survival-related networks. A deep learning based risk stratification model was constructed with representative genes of these networks. The model was validated in two test sets. Survival analysis was performed using the output of the model to evaluate whether it could predict patients' survival independent of clinicopathological variables. Five networks were significantly associated with patients' survival. Considering prognostic significance and representativeness, genes of the two survival-related networks were selected for input of the model. The output of the model was significantly associated with patients' survival in two test sets and training set (p < 0.00001, p < 0.0001 and p = 0.02 for training and test sets 1 and 2, respectively). In multivariate analyses, the model was associated with patients' prognosis independent of other clinicopathological features. Our study presents a new perspective on incorporating gene coexpression networks into the gene expression signature and clinical application of deep learning in genomic data science for prognosis prediction. PMID:29581968
Identification of Human Disease Genes from Interactome Network Using Graphlet Interaction

PubMed Central

Yang, Lun; Wei, Dong-Qing; Qi, Ying-Xin; Jiang, Zong-Lai

2014-01-01

Identifying genes related to human diseases, such as cancer and cardiovascular disease, is an important task in biomedical research because of its applications in disease diagnosis and treatment. Interactome networks, especially protein-protein interaction networks, have been used for disease gene identification based on the hypothesis that strong candidate genes tend to be closely related to each other by some measure on the network. We proposed a new measure for analyzing the relationship between network nodes, called graphlet interaction. The graphlet interaction contained 28 different isomers. The results showed that the numbers of graphlet interaction isomers between disease genes in interactome networks were significantly larger than those between randomly picked genes, while graphlet signatures were not. We then designed a new type of score, based on the network properties, to identify disease genes using graphlet interaction. Genes with higher scores were more likely to be disease genes, and all candidate genes were ranked according to their scores. The approach was evaluated by leave-one-out cross-validation. Its precision reached 90% at about 10% recall, which was higher than that of three previous predominant algorithms: random walk, Endeavour and the neighborhood-based method. Finally, the approach was applied to predict new disease genes related to 4 common diseases, most of which were identified by other independent experimental studies. In conclusion, we demonstrate that graphlet interaction is an effective tool for analyzing the network properties of disease genes, and that the scores calculated by graphlet interaction are more precise in identifying disease genes. PMID:24465923
Genes2Networks: connecting lists of gene symbols using mammalian protein interactions databases.

PubMed

Berger, Seth I; Posner, Jeremy M; Ma'ayan, Avi

2007-10-04

In recent years, mammalian protein-protein interaction network databases have been developed. The interactions in these databases are either extracted manually from low-throughput experimental biomedical research literature, extracted automatically from literature using techniques such as natural language processing (NLP), generated experimentally using high-throughput methods such as yeast-2-hybrid screens, or interactions are predicted using an assortment of computational approaches. Genes or proteins identified as significantly changing in proteomic experiments, or identified as susceptibility disease genes in genomic studies, can be placed in the context of protein interaction networks in order to assign these genes and proteins to pathways and protein complexes. Genes2Networks is a software system that integrates the content of ten mammalian interaction network datasets. Filtering techniques to prune low-confidence interactions were implemented. Genes2Networks is delivered as a web-based service using AJAX. The system can be used to extract relevant subnetworks created from "seed" lists of human Entrez gene symbols. The output includes a dynamic linkable three color web-based network map, with a statistical analysis report that identifies significant intermediate nodes used to connect the seed list. Genes2Networks is powerful web-based software that can help experimental biologists to interpret lists of genes and proteins such as those commonly produced through genomic and proteomic experiments, as well as lists of genes and proteins associated with disease processes. This system can be used to find relationships between genes and proteins from seed lists, and predict additional genes or proteins that may play key roles in common pathways or protein complexes.
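The step of connecting a seed list through intermediate nodes can be sketched with a breadth-first search over an interaction list. This is an illustration only, not the Genes2Networks implementation: the `max_edges` cutoff, the use of a single shortest path per seed pair, and the gene names are all assumptions, and the statistical scoring of intermediates mentioned in the abstract is omitted.

```python
from collections import deque
from itertools import combinations

def shortest_path(adj, src, dst):
    """Breadth-first search; returns one shortest path as a list of nodes, or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nb in sorted(adj.get(node, ())):
            if nb not in prev:
                prev[nb] = node
                queue.append(nb)
    return None

def connect_seeds(interactions, seeds, max_edges=3):
    """Collect non-seed nodes lying on short paths between pairs of seeds."""
    adj = {}
    for a, b in interactions:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    intermediates = set()
    for s, t in combinations(sorted(seeds), 2):
        path = shortest_path(adj, s, t)
        if path and len(path) - 1 <= max_edges:
            intermediates.update(n for n in path[1:-1] if n not in seeds)
    return intermediates

# Hypothetical interaction list and seed list.
interactions = [
    ("SEED1", "MID1"), ("MID1", "SEED2"), ("SEED2", "SEED3"),
    ("SEED3", "FAR1"), ("FAR1", "FAR2"),
]
seeds = {"SEED1", "SEED2", "SEED3"}
found = connect_seeds(interactions, seeds)
```

Only `MID1` is reported, because it is the one non-seed node that lies on a path between seeds; `FAR1` and `FAR2` interact with a seed but do not help connect the list.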
A statistical method for measuring activation of gene regulatory networks.

PubMed

Esteves, Gustavo H; Reis, Luiz F L

2018-06-13

Gene expression data analysis is of great importance for modern molecular biology, given our ability to measure the expression profiles of thousands of genes and enabling studies rooted in systems biology. In this work, we propose a simple statistical model for the activation measuring of gene regulatory networks, instead of the traditional gene co-expression networks. We present the mathematical construction of a statistical procedure for testing hypothesis regarding gene regulatory network activation. The real probability distribution for the test statistic is evaluated by a permutation based study. To illustrate the functionality of the proposed methodology, we also present a simple example based on a small hypothetical network and the activation measuring of two KEGG networks, both based on gene expression data collected from gastric and esophageal samples. The two KEGG networks were also analyzed for a public database, available through NCBI-GEO, presented as Supplementary Material. This method was implemented in an R package that is available at the BioConductor project website under the name maigesPack.
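Evaluating the distribution of a test statistic by permutation can be sketched generically. The statistic below (absolute difference in group means) is an assumption chosen for brevity and is not the network activation statistic implemented in maigesPack; the expression values are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in group means.

    Sample labels are shuffled `n_perm` times to build the null distribution.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # The +1 terms count the observed labelling itself, so p is never zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical expression values of one gene in two tissue types.
tissue_1 = [5.1, 5.3, 4.9, 5.2, 5.0, 5.4]
tissue_2 = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8]
p_separated = permutation_p_value(tissue_1, tissue_2)
p_identical = permutation_p_value([1, 2, 3], [1, 2, 3])
```

Clearly separated groups give a small p-value, while identical groups give a p-value of 1 because every relabelling is at least as extreme as the observed difference of zero.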
Integrated analysis of microRNA and gene expression profiles reveals a functional regulatory module associated with liver fibrosis.

PubMed

Chen, Wei; Zhao, Wenshan; Yang, Aiting; Xu, Anjian; Wang, Huan; Cong, Min; Liu, Tianhui; Wang, Ping; You, Hong

2017-12-15

Liver fibrosis, characterized by the excessive accumulation of extracellular matrix (ECM) proteins, represents the final common pathway of chronic liver inflammation. Increasing evidence indicates that microRNA (miRNA) dysregulation has important implications in the different stages of liver fibrosis. However, our knowledge of the details of miRNA-gene regulation in this disease remains limited. Publicly available Gene Expression Omnibus (GEO) datasets from patients suffering from cirrhosis were extracted for integrated analysis. Differentially expressed miRNAs (DEMs) and genes (DEGs) were identified using the GEO2R web tool. Putative target gene prediction of DEMs was carried out using the intersection of five major algorithms: DIANA-microT, TargetScan, miRanda, PICTAR5 and miRWalk. A functional miRNA-gene regulatory network (FMGRN) was constructed based on the computational target predictions at the sequence level and the inverse expression relationships between DEMs and DEGs. The DAVID web server was used to perform KEGG pathway enrichment analysis. A functional miRNA-gene regulatory module was generated based on the biological interpretation. Internal connections among genes in the liver fibrosis-related module were determined using the String database. MiRNA-gene regulatory modules related to liver fibrosis were experimentally verified in LX-2 cells stimulated with recombinant human TGFβ1 and treated with specific miRNA inhibitors. In total, we identified 85 dysregulated miRNAs and 923 dysregulated genes in liver cirrhosis biopsy samples compared to their normal controls. All evident miRNA-gene pairs were identified and assembled into the FMGRN, which consisted of 990 regulations between 51 miRNAs and 275 genes, forming two large sub-networks that were defined as the down-network and the up-network, respectively. 
KEGG pathway enrichment analysis revealed that the up-network was prominently involved in several KEGG pathways, among which "Focal adhesion", "PI3K-Akt signaling pathway" and "ECM-receptor interaction" were markedly significant (adjusted p<0.001). Genes enriched in these pathways, coupled with their regulatory miRNAs, formed a functional miRNA-gene regulatory module that contains 7 miRNAs, 22 genes and 42 miRNA-gene connections. Gene interaction analysis based on the String database revealed that 8 out of 22 genes were highly clustered. Finally, we experimentally confirmed a functional regulatory module containing 5 miRNAs (miR-130b-3p, miR-148a-3p, miR-345-5p, miR-378a-3p, and miR-422a) and 6 genes (COL6A1, COL6A2, COL6A3, PIK3R3, COL1A1, CCND2) associated with liver fibrosis. Our integrated analysis of miRNA and gene expression profiles highlighted a functional miRNA-gene regulatory module associated with liver fibrosis, which, to some extent, may provide important clues to better understand the underlying pathogenesis of liver fibrosis. Copyright © 2017. Published by Elsevier B.V.
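The two filtering steps this abstract describes (intersecting the target predictions of several algorithms, then keeping only pairs with inverse expression changes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the input layout (a dict of predicted pairs per algorithm, plus dicts of log fold changes) and the sign-based definition of "inverse" are all hypothetical.

```python
def consensus_targets(predictions):
    """Keep only miRNA-gene pairs predicted by every algorithm.

    predictions: dict mapping an algorithm name to an iterable of
    (mirna, gene) pairs (hypothetical input layout).
    """
    sets = [set(pairs) for pairs in predictions.values()]
    return set.intersection(*sets) if sets else set()


def inverse_pairs(pairs, mirna_logfc, gene_logfc):
    """Keep pairs whose miRNA and gene change in opposite directions.

    mirna_logfc / gene_logfc: dicts of log fold changes for the
    differentially expressed miRNAs and genes (assumed inputs).
    """
    kept = set()
    for mirna, gene in pairs:
        m = mirna_logfc.get(mirna)
        g = gene_logfc.get(gene)
        # Opposite signs mean one is up-regulated and the other down.
        if m is not None and g is not None and m * g < 0:
            kept.add((mirna, gene))
    return kept
```

Pairs whose miRNA or gene is missing from the fold-change tables are dropped, which mirrors restricting the network to differentially expressed miRNAs and genes.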
Transcriptional responses in thyroid tissues from rats treated with a tumorigenic and a non-tumorigenic triazole conazole fungicide.

PubMed

Hester, Susan D; Nesnow, Stephen

2008-03-15

Conazoles are azole-containing fungicides that are used in agriculture and medicine. Conazoles can induce follicular cell adenomas of the thyroid in rats after chronic bioassay. The goal of this study was to identify pathways and networks of genes that were associated with thyroid tumorigenesis through transcriptional analyses. To this end, we compared transcriptional profiles from tissues of rats treated with a tumorigenic and a non-tumorigenic conazole. Triadimefon, a rat thyroid tumorigen, and myclobutanil, which was not tumorigenic in rats after a 2-year bioassay, were administered in the feed to male Wistar/Han rats for 30 or 90 days similar to the treatment conditions previously used in their chronic bioassays. Thyroid gene expression was determined using high density Affymetrix GeneChips (Rat 230_2). Gene expression was analyzed by the Gene Set Expression Analyses method which clearly separated the tumorigenic treatments (tumorigenic response group (TRG)) from the non-tumorigenic treatments (non-tumorigenic response group (NRG)). Core genes from these gene sets were mapped to canonical, metabolic, and GeneGo processes and these processes compared across group and treatment time. Extensive analyses were performed on the 30-day gene sets as they represented the major perturbations. Gene sets in the 30-day TRG group had over representation of fatty acid metabolism, oxidation, and degradation processes (including PPARgamma and CYP involvement), and of cell proliferation responses. Core genes from these gene sets were combined into networks and found to possess signaling interactions. In addition, the core genes in each gene set were compared with genes known to be associated with human thyroid cancer. Among the genes that appeared in both rat and human data sets were: Acaca, Asns, Cebpg, Crem, Ddit3, Gja1, Grn, Jun, Junb, and Vegf. These genes were major contributors in the previously developed network from triadimefon-treated rat thyroids. 
It is postulated that triadimefon induces oxidative response genes and activates the nuclear receptor, Ppargamma, initiating transcription of gene products and signaling to a series of genes involved in cell proliferation.
F-MAP: A Bayesian approach to infer the gene regulatory network using external hints

PubMed Central

Shahdoust, Maryam; Mahjub, Hossein; Sadeghi, Mehdi

2017-01-01

The common topological features of the gene regulatory networks of related species suggest that the network of one species can be reconstructed using additional information from the gene expression profiles of related species. We present an algorithm named F-MAP to reconstruct the gene regulatory network, which applies knowledge about gene interactions from related species. Our algorithm sets up a Bayesian framework to estimate the precision matrix of one species' microarray gene expression dataset in order to infer the Gaussian graphical model of the network. The conjugate Wishart prior is used, and the information from related species is applied to estimate the hyperparameters of the prior distribution by factor analysis. Applying the proposed algorithm to six related species of Drosophila shows that the precision of the reconstructed networks is improved considerably compared to the precision of networks constructed by other Bayesian approaches. PMID:28938012
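For reference, the standard conjugate update behind a Wishart prior on a Gaussian precision matrix is written out below. This is the textbook result, not the F-MAP derivation itself. The zero-mean sampling model and the symbols n, p, S, ν and V are assumptions made here for illustration. In the setting of the abstract, the hyperparameters ν and V are where the information from related species would enter.

```latex
\begin{align*}
  % Sampling model: n expression vectors of dimension p, assumed zero-mean
  x_1,\dots,x_n \mid \Omega &\sim \mathcal{N}_p\!\left(0,\ \Omega^{-1}\right),
  \qquad S = \sum_{i=1}^{n} x_i x_i^{\top} \\
  % Conjugate Wishart prior with degrees of freedom \nu and scale matrix V
  \Omega &\sim \mathcal{W}_p(\nu,\ V) \\
  % Likelihood times prior
  p(\Omega \mid x_{1:n}) &\propto |\Omega|^{(\nu + n - p - 1)/2}
  \exp\!\left\{-\tfrac{1}{2}\,\mathrm{tr}\!\left[\left(V^{-1} + S\right)\Omega\right]\right\} \\
  % Posterior distribution and posterior mean of the precision matrix
  \Omega \mid x_{1:n} &\sim \mathcal{W}_p\!\left(\nu + n,\ \left(V^{-1} + S\right)^{-1}\right),
  \qquad
  \mathbb{E}[\Omega \mid x_{1:n}] = (\nu + n)\left(V^{-1} + S\right)^{-1}
\end{align*}
```

Nonzero entries of the estimated precision matrix correspond to edges of the Gaussian graphical model, which is why the precision matrix is the quantity being estimated.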
HGPEC: a Cytoscape app for prediction of novel disease-gene and disease-disease associations and evidence collection based on a random walk on heterogeneous network.

PubMed

Le, Duc-Hau; Pham, Van-Huy

2017-06-15

Finding gene-disease and disease-disease associations plays an important role in biomedicine, and many prioritization methods have been proposed for this goal. Among them, approaches based on a heterogeneous network of genes and diseases are considered state of the art: they achieve high prediction performance and can be used for diseases with or without a known molecular basis. Here, we developed a Cytoscape app, HGPEC, based on a random walk with restart algorithm on a heterogeneous network of genes and diseases. This app can prioritize candidate genes and diseases by employing a heterogeneous network consisting of a network of genes/proteins and a phenotypic disease similarity network. Based on the rankings, novel disease-gene and disease-disease associations can be identified. These associations can be supported with network- and rank-based visualization as well as evidence and annotations from biomedical data. A case study on the prediction of novel breast cancer-associated genes and diseases shows the abilities of HGPEC. In addition, we showed the superior performance of HGPEC compared to other tools for prioritization of candidate disease genes. Taken together, our app is expected to effectively predict novel disease-gene and disease-disease associations and to support network- and rank-based visualization as well as biomedical evidence for such associations.
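A random walk with restart can be sketched in a few lines. The version below is a simplified illustration on a single undirected network given as an adjacency dict. It is not the HGPEC implementation, which operates on a heterogeneous gene and disease network. The function name, the default restart probability and the assumption that every node has at least one neighbour are choices made here for the example.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for each node.

    adj:   dict mapping node -> list of neighbours (undirected network,
           every node assumed to have at least one neighbour).
    seeds: set of nodes the walker restarts from (e.g. known disease genes).
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            nbrs = adj[u]
            if not nbrs:
                continue
            # Otherwise it moves to a uniformly chosen neighbour.
            share = (1.0 - restart) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p
```

Ranking non-seed nodes by their steady-state probability gives the prioritized candidate list; nodes closer to the seeds in the network receive higher scores.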
Machine Learning-Assisted Network Inference Approach to Identify a New Class of Genes that Coordinate the Functionality of Cancer Networks.

PubMed

Ghanat Bari, Mehrab; Ung, Choong Yong; Zhang, Cheng; Zhu, Shizhen; Li, Hu

2017-08-01

Emerging evidence indicates the existence of a new class of cancer genes that act as "signal linkers" coordinating oncogenic signals between mutated and differentially expressed genes. While frequently mutated oncogenes and differentially expressed genes, which we term Class I cancer genes, are readily detected by most analytical tools, the new class of cancer-related genes, i.e., Class II, escapes detection because these genes are neither mutated nor differentially expressed. Given this hypothesis, we developed a Machine Learning-Assisted Network Inference (MALANI) algorithm, which assesses all genes regardless of expression or mutational status in the context of cancer etiology. We used 8807 expression arrays, corresponding to 9 cancer types, to build more than 2 × 10^8 Support Vector Machine (SVM) models for reconstructing a cancer network. We found that ~3% of ~19,000 not differentially expressed genes are Class II cancer gene candidates. Some Class II genes that we found, such as SLC19A1 and ATAD3B, have been recently reported to associate with cancer outcomes. To our knowledge, this is the first study that utilizes both machine learning and network biology approaches to uncover Class II cancer genes that coordinate functionality in cancer networks, and it will illuminate our understanding of how genes modulated in a tissue-specific network contribute to tumorigenesis and therapy development.
Leveraging multiple gene networks to prioritize GWAS candidate genes via network representation learning.

PubMed

Wu, Mengmeng; Zeng, Wanwen; Liu, Wenqiang; Lv, Hairong; Chen, Ting; Jiang, Rui

2018-06-03

Genome-wide association studies (GWAS) have successfully discovered a number of disease-associated genetic variants in the past decade, providing an unprecedented opportunity for deciphering the genetic basis of human inherited diseases. However, it is still a challenging task to extract biological knowledge from GWAS data, due to such issues as missing heritability and weak interpretability. Indeed, the fact that the majority of discovered loci fall into noncoding regions without clear links to genes has prevented the characterization of their functions and calls for a sophisticated approach to bridge genetic and genomic studies. To address this problem, network-based prioritization of candidate genes, which performs integrated analysis of gene networks with GWAS data, has emerged as a promising direction and attracted much attention. However, most existing methods overlook the sparse and noisy properties of gene networks and thus may lead to suboptimal performance. Motivated by this understanding, we proposed a novel method called REGENT for integrating multiple gene networks with GWAS data to prioritize candidate genes for complex diseases. We leveraged a technique called network representation learning to embed a gene network into a compact and robust feature space, and then designed a hierarchical statistical model to integrate features of multiple gene networks with GWAS data for the effective inference of genes associated with a disease of interest. We applied our method to six complex diseases and demonstrated the superior performance of REGENT over existing approaches in recovering known disease-associated genes. We further conducted a pathway analysis and demonstrated the ability of REGENT to discover disease-associated pathways. We expect to see applications of our method to a broad spectrum of diseases for post-GWAS analysis. REGENT is freely available at https://github.com/wmmthu/REGENT. Copyright © 2018 Elsevier Inc. All rights reserved.
Integration of biological networks and gene expression data using Cytoscape

PubMed Central

Cline, Melissa S; Smoot, Michael; Cerami, Ethan; Kuchinsky, Allan; Landys, Nerius; Workman, Chris; Christmas, Rowan; Avila-Campilo, Iliana; Creech, Michael; Gross, Benjamin; Hanspers, Kristina; Isserlin, Ruth; Kelley, Ryan; Killcoyne, Sarah; Lotia, Samad; Maere, Steven; Morris, John; Ono, Keiichiro; Pavlovic, Vuk; Pico, Alexander R; Vailaya, Aditya; Wang, Peng-Liang; Adler, Annette; Conklin, Bruce R; Hood, Leroy; Kuiper, Martin; Sander, Chris; Schmulevich, Ilya; Schwikowski, Benno; Warner, Guy J; Ideker, Trey; Bader, Gary D

2013-01-01

Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context of an interaction network obtained for genes of interest. Five major steps are described: (i) obtaining a gene or protein network, (ii) displaying the network using layout algorithms, (iii) integrating with gene expression and other functional attributes, (iv) identifying putative complexes and functional modules and (v) identifying enriched Gene Ontology annotations in the network. These steps provide a broad sample of the types of analyses performed by Cytoscape. PMID:17947979
Convergent evolution of gene networks by single-gene duplications in higher eukaryotes.

PubMed

Amoutzias, Gregory D; Robertson, David L; Oliver, Stephen G; Bornberg-Bauer, Erich

2004-03-01

By combining phylogenetic, proteomic and structural information, we have elucidated the evolutionary driving forces for the gene-regulatory interaction networks of basic helix-loop-helix transcription factors. We infer that recurrent events of single-gene duplication and domain rearrangement repeatedly gave rise to distinct networks with almost identical hub-based topologies, and multiple activators and repressors. We thus provide the first empirical evidence for scale-free protein networks emerging through single-gene duplications, the dominant importance of molecular modularity in the bottom-up construction of complex biological entities, and the convergent evolution of networks.
Enhancing biological relevance of a weighted gene co-expression network for functional module identification.

PubMed

Prom-On, Santitham; Chanthaphan, Atthawut; Chan, Jonathan Hoyin; Meechai, Asawin

2011-02-01

Relationships among gene expression levels may be associated with the mechanisms of a disease. While identifying a direct association, such as a difference in expression levels between case and control groups, links genes to disease mechanisms, uncovering an indirect association in the form of a network structure may help reveal the underlying functional module associated with the disease under scrutiny. This paper presents a method to improve the biological relevance of functional module identification from gene expression microarray data by enhancing the structure of a weighted gene co-expression network using a minimum spanning tree. The enhanced network, which is called a backbone network, contains only the essential structural information needed to represent the gene co-expression network. The entire backbone network is decoupled into a number of coherent sub-networks, and then the functional modules are reconstructed from these sub-networks to ensure minimum redundancy. The method was tested with a simulated gene expression dataset and case-control expression datasets from autism spectrum disorder and colorectal cancer studies. The results indicate that the proposed method can accurately identify clusters in the simulated dataset, and the functional modules of the backbone network are more biologically relevant than those obtained from the original approach.
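The backbone idea in this abstract rests on a minimum spanning tree of the weighted co-expression network. The sketch below uses Kruskal's algorithm with a small union-find structure. It is an illustration only: the abstract does not say which algorithm or edge weighting the authors used, so the function name and the suggested conversion of a co-expression weight s into a distance 1 - s are assumptions.

```python
def minimum_spanning_tree(n_nodes, edges):
    """Kruskal's algorithm.

    n_nodes: number of genes, labelled 0 .. n_nodes - 1.
    edges:   iterable of (distance, u, v) tuples; a distance could be
             1 - s for a co-expression weight s (assumed conversion),
             so that the strongest links are kept.
    Returns the list of tree edges as (distance, u, v).
    """
    parent = list(range(n_nodes))

    def find(x):
        # Find the component root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            # The edge joins two components, so it cannot form a cycle.
            parent[ru] = rv
            tree.append((w, u, v))
    return tree
```

For a connected network of n genes the tree keeps exactly n - 1 edges, which is what makes it a compact backbone of the original co-expression structure.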
Position Matters: Network Centrality Considerably Impacts Rates of Protein Evolution in the Human Protein–Protein Interaction Network

PubMed Central

Feyertag, Felix; Chakraborty, Sandip

2017-01-01

The proteins of any organism evolve at disparate rates. A long list of factors affecting rates of protein evolution has been identified. However, the relative importance of each factor in determining rates of protein evolution remains unresolved. The prevailing view is that evolutionary rates are dominantly determined by gene expression, and that other factors such as network centrality have only a marginal effect, if any. However, this view is largely based on analyses in yeasts, and accurately measuring the importance of the determinants of rates of protein evolution is complicated by the fact that the different factors are often correlated with each other, and by the relatively poor quality of available functional genomics data sets. Here, we use correlation, partial correlation and principal component regression analyses to measure the contributions of several factors to the variability of the rates of evolution of human proteins. For this purpose, we analyzed the entire human protein–protein interaction data set and the human signal transduction network, a network data set of exceptionally high quality, obtained by manual curation, which is expected to be virtually free from false positives. In contrast with the prevailing view, we observe that network centrality (measured as the number of physical and nonphysical interactions, betweenness, and closeness) has a considerable impact on rates of protein evolution. Surprisingly, the impact of centrality on rates of protein evolution seems to be comparable, or even superior according to some analyses, to that of gene expression. Our observations seem to be independent of potentially confounding factors and of the limitations (biases and errors) of interactomic data sets. PMID:28854629
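Because the candidate determinants are correlated with each other, partial correlation is one of the tools this abstract names for separating their effects. The sketch below shows the first-order case, controlling for a single variable, using the standard formula based on pairwise Pearson correlations. The function names are illustrative, and the authors' actual analysis (which variables were controlled for, and which correlation coefficient was used) is not specified in the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z.

    For example, x could be evolutionary rate, y a centrality measure
    and z expression level (hypothetical variable assignment).
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

A raw correlation that shrinks towards zero once z is controlled for suggests that the association between x and y was largely driven by their shared dependence on z.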
Network-Based Integration of GWAS and Gene Expression Identifies a HOX-Centric Network Associated with Serous Ovarian Cancer Risk.

PubMed

Kar, Siddhartha P; Tyrer, Jonathan P; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K H; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V; Bean, Yukie T; Beckmann, Matthias W; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S; Cramer, Daniel; Cunningham, Julie M; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F; Edwards, Robert P; Ekici, Arif B; Fasching, Peter A; Fridley, Brooke L; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G; Glasspool, Rosalind; Goode, Ellen L; Goodman, Marc T; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A T; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K; Hosono, Satoyo; Iversen, Edwin S; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K; Kelemen, Linda E; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D; Lee, Alice W; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; McNeish, Iain A; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B; Narod, Steven A; Nedergaard, Lotte; Ness, Roberta B; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jennifer; Phelan, Catherine M; Pike, Malcolm C; Poole, Elizabeth M; Ramus, Susan J; Risch, Harvey A; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H; Rudolph, Anja; 
Runnebaum, Ingo B; Rzepecka, Iwona K; Salvesen, Helga B; Schildkraut, Joellen M; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C; Sucheston-Campbell, Lara E; Tangen, Ingvild L; Teo, Soo-Hwang; Terry, Kathryn L; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S; van Altena, Anne M; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S; Wicklund, Kristine G; Wilkens, Lynne R; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A; Monteiro, Alvaro N A; Freedman, Matthew L; Gayther, Simon A; Pharoah, Paul D P

2015-10-01

Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by coexpression may also be enriched for additional EOC risk associations. We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly coexpressed with each selected TF gene in the unified microarray dataset of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this dataset were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P < 0.05 and FDR < 0.05). These results were replicated (P < 0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Network analysis integrating large, context-specific datasets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. ©2015 American Association for Cancer Research.
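Mutual information between two expression profiles can be estimated in several ways, and the abstract does not say which estimator was used. The sketch below is one simple option, assumed here for illustration: each profile is discretised into equal-frequency bins by rank, and the mutual information of the two binned variables is computed from their joint counts. The function names and the default of three bins are hypothetical choices.

```python
import math
from collections import Counter


def discretize(values, n_bins=3):
    """Assign each value to an equal-frequency bin based on its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def mutual_information(x, y, n_bins=3):
    """Mutual information (in bits) between two expression profiles."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(bx)
    cx, cy, cxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    mi = 0.0
    for (a, b), c in cxy.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with counts.
        mi += (c / n) * math.log2(c * n / (cx[a] * cy[b]))
    return mi
```

Unlike a linear correlation coefficient, this measure also responds to non-linear dependence, although binned estimates are sensitive to the number of bins and the sample size.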
Network-based integration of GWAS and gene expression identifies a HOX-centric network associated with serous ovarian cancer risk

PubMed Central

Kar, Siddhartha P.; Tyrer, Jonathan P.; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K.H.; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V.; Bean, Yukie T.; Beckmann, Matthias W.; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S.; Cramer, Daniel; Cunningham, Julie M.; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A.; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F.; Edwards, Robert P.; Ekici, Arif B.; Fasching, Peter A.; Fridley, Brooke L.; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G.; Glasspool, Rosalind; Goode, Ellen L.; Goodman, Marc T.; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A.T.; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K.; Hosono, Satoyo; Iversen, Edwin S.; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K.; Kelemen, Linda E.; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A.; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D.; Lee, Alice W.; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A.; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R.; McNeish, Iain A.; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B.; Narod, Steven A.; Nedergaard, Lotte; Ness, Roberta B.; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H.; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M.; Permuth-Wey, Jennifer; Phelan, Catherine M.; Pike, Malcolm C.; Poole, Elizabeth M.; Ramus, Susan J.; Risch, Harvey A.; Rosen, Barry; Rossing, Mary 
Anne; Rothstein, Joseph H.; Rudolph, Anja; Runnebaum, Ingo B.; Rzepecka, Iwona K.; Salvesen, Helga B.; Schildkraut, Joellen M.; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C.; Sucheston-Campbell, Lara E.; Tangen, Ingvild L.; Teo, Soo-Hwang; Terry, Kathryn L.; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S.; van Altena, Anne M.; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A.; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S.; Wicklund, Kristine G.; Wilkens, Lynne R.; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A.; Monteiro, Alvaro N. A.; Freedman, Matthew L.; Gayther, Simon A.; Pharoah, Paul D. P.

2015-01-01

Background: Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by co-expression may also be enriched for additional EOC risk associations. Methods: We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly co-expressed with each selected TF gene in the unified microarray data set of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this data set were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Results: Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P<0.05 and FDR<0.05). These results were replicated (P<0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. Conclusion: We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Impact: Network analysis integrating large, context-specific data sets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. PMID:26209509
Systematic Evaluation of Molecular Networks for Discovery of Disease Genes.

PubMed

Huang, Justin K; Carlin, Daniel E; Yu, Michael Ku; Zhang, Wei; Kreisberg, Jason F; Tamayo, Pablo; Ideker, Trey

2018-04-25

Gene networks are rapidly growing in size and number, raising the question of which networks are most appropriate for particular applications. Here, we evaluate 21 human genome-wide interaction networks for their ability to recover 446 disease gene sets identified through literature curation, gene expression profiling, or genome-wide association studies. While all networks have some ability to recover disease genes, we observe a wide range of performance with STRING, ConsensusPathDB, and GIANT networks having the best performance overall. A general tendency is that performance scales with network size, suggesting that new interaction discovery currently outweighs the detrimental effects of false positives. Correcting for size, we find that the DIP network provides the highest efficiency (value per interaction). Based on these results, we create a parsimonious composite network with both high efficiency and performance. This work provides a benchmark for selection of molecular networks in human disease research. Copyright © 2018 Elsevier Inc. All rights reserved.

Molecular and functional definition of the developing human striatum.

PubMed

Onorati, Marco; Castiglioni, Valentina; Biasci, Daniele; Cesana, Elisabetta; Menon, Ramesh; Vuono, Romina; Talpo, Francesca; Laguna Goya, Rocio; Lyons, Paul A; Bulfamante, Gaetano P; Muzio, Luca; Martino, Gianvito; Toselli, Mauro; Farina, Cinthia; Barker, Roger A; Biella, Gerardo; Cattaneo, Elena

2014-12-01

The complexity of the human brain derives from the intricate interplay of molecular instructions during development. Here we systematically investigated gene expression changes in the prenatal human striatum and cerebral cortex during development from post-conception weeks 2 to 20. We identified tissue-specific gene coexpression networks, differentially expressed genes and a minimal set of bimodal genes, including those encoding transcription factors, that distinguished striatal from neocortical identities. Unexpected differences from mouse striatal development were discovered. We monitored 36 determinants at the protein level, revealing regional domains of expression and their refinement, during striatal development. We electrophysiologically profiled human striatal neurons differentiated in vitro and determined their refined molecular and functional properties. These results provide a resource and opportunity to gain global understanding of how transcriptional and functional processes converge to specify human striatal and neocortical neurons during development.
cGRNB: a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets.

PubMed

Xu, Huayong; Yu, Hui; Tu, Kang; Shi, Qianqian; Wei, Chaochun; Li, Yuan-Yuan; Li, Yi-Xue

2013-01-01

We are witnessing rapid progress in the development of methodologies for building combinatorial gene regulatory networks involving both TFs (Transcription Factors) and miRNAs (microRNAs). A few tools are available for this task, but most of them are not easy to use and are not accessible online. A web server is especially needed in order to allow users to upload experimental expression datasets and build combinatorial regulatory networks corresponding to their particular contexts. In this work, we compiled putative TF-gene, miRNA-gene and TF-miRNA regulatory relationships from forward-engineering pipelines and curated them as built-in data libraries. We streamlined the R code of our two separate forward-and-reverse engineering algorithms for combinatorial gene regulatory network construction and formalized them as two major functional modules. As a result, we released cGRNB (combinatorial Gene Regulatory Networks Builder): a web server for constructing combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. cGRNB provides two major network-building modules, one for MPGE (miRNA-perturbed gene expression) datasets and the other for parallel miRNA/mRNA expression datasets. A miRNA-centered two-layer combinatorial regulatory cascade is the output of the first module, and a comprehensive genome-wide network involving all three types of combinatorial regulations (TF-gene, TF-miRNA, and miRNA-gene) is the output of the second module. In this article we propose cGRNB, a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. Since parallel miRNA/mRNA expression datasets are rapidly accumulating with the advance of next-generation sequencing techniques, cGRNB will be a very useful tool for researchers to build combinatorial gene regulatory networks based on expression datasets. 
The cGRNB web-server is free and available online at http://www.scbit.org/cgrnb.
Towards systems genetic analyses in barley: Integration of phenotypic, expression and genotype data into GeneNetwork.

PubMed

Druka, Arnis; Druka, Ilze; Centeno, Arthur G; Li, Hongqiang; Sun, Zhaohui; Thomas, William T B; Bonar, Nicola; Steffenson, Brian J; Ullrich, Steven E; Kleinhofs, Andris; Wise, Roger P; Close, Timothy J; Potokina, Elena; Luo, Zewei; Wagner, Carola; Schweizer, Günther F; Marshall, David F; Kearsey, Michael J; Williams, Robert W; Waugh, Robbie

2008-11-18

A typical genetical genomics experiment results in four separate data sets: genotype, gene expression, higher-order phenotypic data, and metadata that describe the protocols, processing and the array platform. Used in concert, these data sets provide the opportunity to perform genetic analysis at a systems level. Their predictive power is largely determined by the gene expression dataset, where tens of millions of data points can be generated using currently available mRNA profiling technologies. Such large, multidimensional data sets often have value beyond that extracted during their initial analysis and interpretation, particularly if conducted on widely distributed reference genetic materials. Besides quality and scale, access to the data is of primary importance, as accessibility potentially allows the wider research community to extract considerable added value from the same primary dataset. Although the number of genetical genomics experiments in different plant species is rapidly increasing, none to date has been presented in a form that allows quick and efficient on-line testing for possible associations between genes, loci and traits of interest by an entire research community. Using a reference population of 150 recombinant doubled haploid barley lines, we generated novel phenotypic, mRNA abundance and SNP-based genotyping data sets, added them to a considerable volume of legacy trait data and entered them into GeneNetwork (http://www.genenetwork.org). GeneNetwork is a unified on-line analytical environment that enables the user to test genetic hypotheses about how component traits, such as mRNA abundance, may interact to condition more complex biological phenotypes (higher-order traits). Here we describe these barley data sets and demonstrate some of the functionalities GeneNetwork provides as an easily accessible and integrated analytical environment for exploring them. 
By integrating barley genotypic, phenotypic and mRNA abundance data sets directly within GeneNetwork's analytical environment we provide simple web access to the data for the research community. In this environment, a combination of correlation analysis and linkage mapping provides the potential to identify and substantiate gene targets for saturation mapping and positional cloning. By integrating datasets from an unsequenced crop plant (barley) in a database that has been designed for an animal model species (mouse) with a well established genome sequence, we prove the importance of the concept and practice of modular development and interoperability of software engineering for biological data sets.
Reveal genes functionally associated with ACADS by a network study.

PubMed

Chen, Yulong; Su, Zhiguang

2015-09-15

Establishing a systematic network aims to find essential human gene-gene and gene-disease pathways by means of network interconnection patterns and functional annotation analysis. In the present study, we have analyzed functional gene interactions of the short-chain acyl-coenzyme A dehydrogenase gene (ACADS). ACADS plays a vital role in free fatty acid β-oxidation and regulates energy homeostasis. Modules of highly interconnected genes in the disease-specific ACADS network are derived by integrating gene function and protein interaction data. Among the 8 genes in the ACADS network retrieved from both STRING and GeneMANIA, ACADS is closely connected with 4 genes: HADHA, HADHB, ECHS1 and ACAT1. The functional analysis is done via ontological annotation and candidate disease identification. Interestingly, the ontological aspect of genes in the ACADS network reveals that ACADS, HADHA and HADHB play equally vital roles in fatty acid metabolism. The gene ACAT1, together with ACADS, participates in ketone metabolism. Our computational gene network analysis also predicts potential candidate diseases, indicating the involvement of ACADS, HADHA, HADHB, ECHS1 and ACAT1 not only in lipid metabolism but also in infant death syndrome, skeletal myopathy, acute hepatic encephalopathy, Reye-like syndrome, episodic ketosis, and metabolic acidosis. The current study presents a comprehensible layout of the ACADS network, its functional strategies and the candidate disease approach associated with it. Copyright © 2015 Elsevier B.V. All rights reserved.
Core Promoter Functions in the Regulation of Gene Expression of Drosophila Dorsal Target Genes

PubMed Central

Zehavi, Yonathan; Kuznetsov, Olga; Ovadia-Shochat, Avital; Juven-Gershon, Tamar

2014-01-01

Developmental processes are highly dependent on transcriptional regulation by RNA polymerase II. The RNA polymerase II core promoter is the ultimate target of a multitude of transcription factors that control transcription initiation. Core promoters consist of core promoter motifs, e.g. the initiator, TATA box, and the downstream core promoter element (DPE), which confer specific properties to the core promoter. Here, we explored the importance of core promoter functions in the dorsal-ventral developmental gene regulatory network. This network includes multiple genes that are activated by different nuclear concentrations of Dorsal, an NFκB homolog transcription factor, along the dorsal-ventral axis. We show that over two-thirds of Dorsal target genes contain DPE sequence motifs, which is significantly higher than the proportion of DPE-containing promoters in Drosophila genes. We demonstrate that multiple Dorsal target genes are evolutionarily conserved and functionally dependent on the DPE. Furthermore, we have analyzed the activation of key Dorsal target genes by Dorsal, as well as by another Rel family transcription factor, Relish, and the dependence of their activation on the DPE motif. Using hybrid enhancer-promoter constructs in Drosophila cells and embryo extracts, we have demonstrated that the core promoter composition is an important determinant of transcriptional activity of Dorsal target genes. Taken together, our results provide evidence for the importance of core promoter composition in the regulation of Dorsal target genes. PMID:24634215
WGCNA: an R package for weighted correlation network analysis.

PubMed

Langfelder, Peter; Horvath, Steve

2008-12-29

Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. 
The R package along with its source code and additional material are freely available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA.
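The two core ideas named in the abstract above, a weighted (soft-thresholded) correlation network and a module eigengene, can be sketched in a few lines. This is a minimal NumPy illustration under stated assumptions, not the WGCNA R package API: the function names, the simulated data, and the choice of `beta=6` are all illustrative.

```python
import numpy as np

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency |cor(i, j)|**beta for a samples x genes matrix."""
    cor = np.corrcoef(expr, rowvar=False)
    adj = np.abs(cor) ** beta
    np.fill_diagonal(adj, 0.0)  # ignore self-connections
    return adj

def module_eigengene(expr, gene_idx):
    """First principal component of the standardized expression of one module."""
    sub = expr[:, gene_idx]
    z = (sub - sub.mean(axis=0)) / sub.std(axis=0, ddof=1)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    eig = u[:, 0]
    # Orient the eigengene so it correlates positively with mean module expression.
    if np.corrcoef(eig, z.mean(axis=1))[0, 1] < 0:
        eig = -eig
    return eig

# Simulated data (hypothetical): genes 0-4 share a common driver, genes 5-9 are noise.
rng = np.random.default_rng(0)
driver = rng.normal(size=50)
module = np.column_stack([driver + 0.3 * rng.normal(size=50) for _ in range(5)])
noise = rng.normal(size=(50, 5))
expr = np.hstack([module, noise])

adj = soft_threshold_adjacency(expr)
eigengene = module_eigengene(expr, list(range(5)))
```

Raising the absolute correlation to a power keeps strong correlations and pushes weak ones toward zero without a hard cutoff, which is why the module block of `adj` stands out from the noise block here.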
An intersection network based on combining SNP co-association and RNA co-expression networks for feed utilization traits in Japanese Black cattle.

PubMed

Okada, D; Endo, S; Matsuda, H; Ogawa, S; Taniguchi, Y; Katsuta, T; Watanabe, T; Iwaisaki, H

2018-05-12

Genome-wide association studies (GWAS) of quantitative traits have detected numerous genetic associations, but they encounter difficulties in pinpointing prominent candidate genes and inferring gene networks. The present study used a systems genetics approach integrating GWAS results with external RNA-expression data to detect candidate gene networks in feed utilization and growth traits of Japanese Black cattle, which are matters of concern. A SNP co-association network was derived from significant correlations between SNPs with effects estimated by GWAS across seven phenotypic traits. The resulting network genes contained significant numbers of annotations related to the traits. Using bovine transcriptome data from a public database, an RNA co-expression network was inferred based on the similarity of expression patterns across different tissues. An intersection network was then generated by superimposing the SNP and RNA networks and extracting shared interactions. This intersection network contained four tissue-specific modules: nervous system, reproductive system, muscular system, and glands. To characterize the structure (topological properties) of the three networks, their scale-free properties were evaluated, which revealed that the intersection network was the most scale-free. In the sub-network containing the most connected transcription factors (URI1, ROCK2 and ETV6), most genes were widely expressed across tissues, and genes previously shown to be involved in the traits were found. Results indicated that the current approach might be used to construct a gene network that better reflects biological information, providing encouragement for the genetic dissection of economically important quantitative traits.
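The superimposition step described above, keeping only interactions present in both networks, amounts to a set intersection over undirected edges. The sketch below assumes each network is given as a list of gene pairs; the gene labels `G1` to `G5` and the helper names are hypothetical and do not come from the study.

```python
def normalize_edges(edges):
    """Represent undirected edges as frozensets so (a, b) and (b, a) compare equal."""
    return {frozenset(e) for e in edges if e[0] != e[1]}

def intersection_network(snp_edges, rna_edges):
    """Edges shared by the SNP co-association and RNA co-expression networks."""
    return normalize_edges(snp_edges) & normalize_edges(rna_edges)

def degree(edges):
    """Number of shared interactions per gene."""
    deg = {}
    for e in edges:
        for node in e:
            deg[node] = deg.get(node, 0) + 1
    return deg

# Toy input with placeholder gene labels.
snp_net = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G1", "G4")]
rna_net = [("G2", "G1"), ("G3", "G4"), ("G4", "G5")]

shared = intersection_network(snp_net, rna_net)
shared_degree = degree(shared)
```

The degree dictionary is the starting point for the kind of degree-distribution check that a scale-free evaluation would need.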
Statistical identification of gene association by CID in application of constructing ER regulatory network

PubMed Central

Liu, Li-Yu D; Chen, Chien-Yu; Chen, Mei-Ju M; Tsai, Ming-Shian; Lee, Cho-Han S; Phang, Tzu L; Chang, Li-Yun; Kuo, Wen-Hung; Hwa, Hsiao-Lin; Lien, Huang-Chun; Jung, Shih-Ming; Lin, Yi-Shing; Chang, King-Jen; Hsieh, Fon-Jou

2009-01-01

Background: A variety of high-throughput techniques are now available for constructing comprehensive gene regulatory networks in systems biology. In this study, we report a new statistical approach for facilitating in silico inference of regulatory network structure. The new measure of association, coefficient of intrinsic dependence (CID), is model-free and can be applied to both continuous and categorical distributions. When given two variables X and Y, CID answers whether Y is dependent on X by examining the conditional distribution of Y given X. In this paper, we apply CID to analyze the regulatory relationships between transcription factors (TFs) (X) and their downstream genes (Y) based on clinical data. More specifically, we use estrogen receptor α (ERα) as the variable X, and the analyses are based on 48 clinical breast cancer gene expression arrays (48A). Results: The analytical utility of CID was evaluated in comparison with four commonly used statistical methods: Galton-Pearson's correlation coefficient (GPCC), Student's t-test (STT), coefficient of determination (CoD), and mutual information (MI). When compared to GPCC, CoD, and MI, CID shows a superior ability to discover regulatory associations where the distribution of the mRNA expression levels of X and Y does not fit linear models. On the other hand, when CID is used to measure the association of a continuous variable (Y) against a discrete variable (X), it shows performance similar to STT and appears to outperform CoD and MI. In addition, this study established a two-layer transcriptional regulatory network to exemplify the usage of CID, in combination with GPCC, in deciphering gene networks based on gene expression profiles from patient arrays. Conclusion: CID is shown to provide useful information for identifying associations between genes and transcription factors of interest in patient arrays. 
When coupled with the relationships detected by GPCC, the associations predicted by CID are applicable to the construction of transcriptional regulatory networks. This study shows how information from different data sources and learning algorithms can be integrated to investigate whether relevant regulatory mechanisms identified in cell models can also be partially re-identified in clinical samples of breast cancers. Availability: The implementation of CID in R code can be freely downloaded from . PMID:19292896
Gene Expression Network Reconstruction by Convex Feature Selection when Incorporating Genetic Perturbations

PubMed Central

Logsdon, Benjamin A.; Mezey, Jason

2010-01-01

Cellular gene expression measurements contain regulatory information that can be used to discover novel network relationships. Here, we present a new algorithm for network reconstruction powered by the adaptive lasso, a theoretically and empirically well-behaved method for selecting the regulatory features of a network. Any algorithms designed for network discovery that make use of directed probabilistic graphs require perturbations, produced by either experiments or naturally occurring genetic variation, to successfully infer unique regulatory relationships from gene expression data. Our approach makes use of appropriately selected cis-expression Quantitative Trait Loci (cis-eQTL), which provide a sufficient set of independent perturbations for maximum network resolution. We compare the performance of our network reconstruction algorithm to four other approaches: the PC-algorithm, QTLnet, the QDG algorithm, and the NEO algorithm, all of which have been used to reconstruct directed networks among phenotypes leveraging QTL. We show that the adaptive lasso can outperform these algorithms for networks of ten genes and ten cis-eQTL, and is competitive with the QDG algorithm for networks with thirty genes and thirty cis-eQTL, with rich topologies and hundreds of samples. Using this novel approach, we identify unique sets of directed relationships in Saccharomyces cerevisiae when analyzing genome-wide gene expression data for an intercross between a wild strain and a lab strain. We recover novel putative network relationships between a tyrosine biosynthesis gene (TYR1), and genes involved in endocytosis (RCY1), the spindle checkpoint (BUB2), sulfonate catabolism (JLP1), and cell-cell communication (PRM7). Our algorithm provides a synthesis of feature selection methods and graphical model theory that has the potential to reveal new directed regulatory relationships from the analysis of population level genetic and gene expression data. PMID:21152011
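To make the feature-selection idea concrete, here is a generic adaptive lasso sketch: ordinary least squares gives initial coefficients, their inverse magnitudes become per-feature penalty weights, and a plain coordinate-descent lasso is solved on rescaled columns. It is not the authors' network reconstruction algorithm; the function names, the simulated regulators, and the penalty `lam=1.0` are assumptions made for illustration.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j removed from the fit.
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r
            # Soft-thresholding update for coordinate j.
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

def adaptive_lasso(X, y, lam, gamma=1.0):
    """Lasso with per-feature weights w_j = 1 / |b_ols_j|**gamma."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    w = 1.0 / (np.abs(beta_ols) ** gamma + 1e-12)
    X_scaled = X / w                  # column j divided by w_j
    beta_scaled = lasso_cd(X_scaled, y, lam)
    return beta_scaled / w            # map back to the original scale

# Simulated target gene regulated by candidate features 0 and 3 only (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=100)

beta = adaptive_lasso(X, y, lam=1.0)
selected = np.flatnonzero(beta)       # indices of retained regulators
```

In a network setting, one such regression per target gene, with the other genes and the perturbing loci as candidate features, would yield a set of candidate directed edges.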
GFD-Net: A novel semantic similarity methodology for the analysis of gene networks.

PubMed

Díaz-Montaña, Juan J; Díaz-Díaz, Norberto; Gómez-Vela, Francisco

2017-04-01

Since the popularization of biological network inference methods, it has become crucial to create methods to validate the resulting models. Here we present GFD-Net, the first methodology that applies the concept of semantic similarity to gene network analysis. GFD-Net combines the concept of semantic similarity with the use of gene network topology to analyze the functional dissimilarity of gene networks based on Gene Ontology (GO). The main innovation of GFD-Net lies in the way that semantic similarity is used to analyze gene networks taking into account the network topology. GFD-Net selects a functionality for each gene (specified by a GO term), weights each edge according to the dissimilarity between the nodes at its ends and calculates a quantitative measure of the network functional dissimilarity, i.e. a quantitative value of the degree of dissimilarity between the connected genes. The robustness of GFD-Net as a gene network validation tool was demonstrated by performing a ROC analysis on several network repositories. Furthermore, a well-known network was analyzed showing that GFD-Net can also be used to infer knowledge. The relevance of GFD-Net becomes more evident in Section "GFD-Net applied to the study of human diseases" where an example of how GFD-Net can be applied to the study of human diseases is presented. GFD-Net is available as an open-source Cytoscape app which offers a user-friendly interface to configure and execute the algorithm as well as the ability to visualize and interact with the results (http://apps.cytoscape.org/apps/gfdnet). Copyright © 2017 Elsevier Inc. All rights reserved.
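The edge-weighting idea can be illustrated with a toy version. The abstract does not specify the dissimilarity measure or how edge weights are aggregated, so this sketch assumes a Jaccard distance between the ancestor sets of each gene's selected term and takes the mean edge weight as the network score. The term labels `t1` to `t4`, the gene labels, and the function names are placeholders, not real GO identifiers or the GFD-Net implementation.

```python
def jaccard_dissimilarity(a, b):
    """1 - |a & b| / |a | b| for two sets of ontology terms."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def network_dissimilarity(edges, term_ancestors):
    """Weight each edge by the dissimilarity of its end nodes, then average."""
    weights = {
        (u, v): jaccard_dissimilarity(term_ancestors[u], term_ancestors[v])
        for u, v in edges
    }
    score = sum(weights.values()) / len(weights) if weights else 0.0
    return weights, score

# Placeholder annotation: ancestor set of the term selected for each gene.
term_ancestors = {
    "A": {"t1", "t2", "t3"},
    "B": {"t1", "t2", "t3"},
    "C": {"t1", "t4"},
}
edges = [("A", "B"), ("B", "C")]

weights, score = network_dissimilarity(edges, term_ancestors)
```

Under this toy measure, a lower score means that connected genes are functionally more alike, which is the property a validation tool of this kind rewards.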
A novel method to identify pathways associated with renal cell carcinoma based on a gene co-expression network

PubMed Central

RUAN, XIYUN; LI, HONGYUN; LIU, BO; CHEN, JIE; ZHANG, SHIBAO; SUN, ZEQIANG; LIU, SHUANGQING; SUN, FAHAI; LIU, QINGYONG

2015-01-01

The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson’s correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842; 371; 2,883 and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson’s correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient to identify pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425
Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

PubMed Central

Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis

2012-01-01

Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606
Integrative Genomics Reveals Novel Molecular Pathways and Gene Networks for Coronary Artery Disease

PubMed Central

Mäkinen, Ville-Petteri; Civelek, Mete; Meng, Qingying; Zhang, Bin; Zhu, Jun; Levian, Candace; Huan, Tianxiao; Segrè, Ayellet V.; Ghosh, Sujoy; Vivar, Juan; Nikpay, Majid; Stewart, Alexandre F. R.; Nelson, Christopher P.; Willenborg, Christina; Erdmann, Jeanette; Blakenberg, Stefan; O'Donnell, Christopher J.; März, Winfried; Laaksonen, Reijo; Epstein, Stephen E.; Kathiresan, Sekar; Shah, Svati H.; Hazen, Stanley L.; Reilly, Muredach P.; Lusis, Aldons J.; Samani, Nilesh J.; Schunkert, Heribert; Quertermous, Thomas; McPherson, Ruth; Yang, Xia; Assimes, Themistocles L.

2014-01-01

The majority of the heritability of coronary artery disease (CAD) remains unexplained, despite recent successes of genome-wide association studies (GWAS) in identifying novel susceptibility loci. Integrating functional genomic data from a variety of sources with a large-scale meta-analysis of CAD GWAS may facilitate the identification of novel biological processes and genes involved in CAD, as well as clarify the causal relationships of established processes. Towards this end, we integrated 14 GWAS from the CARDIoGRAM Consortium and two additional GWAS from the Ottawa Heart Institute (25,491 cases and 66,819 controls) with 1) genetics of gene expression studies of CAD-relevant tissues in humans, 2) metabolic and signaling pathways from public databases, and 3) data-driven, tissue-specific gene networks from a multitude of human and mouse experiments. We not only detected CAD-associated gene networks of lipid metabolism, coagulation, immunity, and additional networks with no clear functional annotation, but also revealed key driver genes for each CAD network based on the topology of the gene regulatory networks. In particular, we found a gene network involved in antigen processing to be strongly associated with CAD. The key driver genes of this network included glyoxalase I (GLO1) and peptidylprolyl isomerase I (PPIL1), which we verified as regulatory by siRNA experiments in human aortic endothelial cells. Our results suggest genetic influences on a diverse set of both known and novel biological processes that contribute to CAD risk. The key driver genes for these networks highlight potential novel targets for further mechanistic studies and therapeutic interventions. PMID:25033284
Is My Network Module Preserved and Reproducible?

PubMed Central

Langfelder, Peter; Luo, Rui; Oldham, Michael C.; Horvath, Steve

2011-01-01

In many applications, one is interested in determining which of the properties of a network module change across conditions. For example, to validate the existence of a module, it is desirable to show that it is reproducible (or preserved) in an independent test network. Here we study several types of network preservation statistics that do not require a module assignment in the test network. We distinguish network preservation statistics by the type of the underlying network. Some preservation statistics are defined for a general network (defined by an adjacency matrix) while others are only defined for a correlation network (constructed on the basis of pairwise correlations between numeric variables). Our applications show that the correlation structure facilitates the definition of particularly powerful module preservation statistics. We illustrate that evaluating module preservation is in general different from evaluating cluster preservation. We find that it is advantageous to aggregate multiple preservation statistics into summary preservation statistics. We illustrate the use of these methods in six gene co-expression network applications including 1) preservation of cholesterol biosynthesis pathway in mouse tissues, 2) comparison of human and chimpanzee brain networks, 3) preservation of selected KEGG pathways between human and chimpanzee brain networks, 4) sex differences in human cortical networks, 5) sex differences in mouse liver networks. While we find no evidence for sex specific modules in human cortical networks, we find that several human cortical modules are less preserved in chimpanzees. In particular, apoptosis genes are differentially co-expressed between humans and chimpanzees. Our simulation studies and applications show that module preservation statistics are useful for studying differences between the modular structure of networks. 
Data, R software and accompanying tutorials can be downloaded from the following webpage: http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/ModulePreservation. PMID:21283776
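One simple statistic in the spirit of the abstract above, requiring module labels only in the reference data, is the correlation between each module gene's intramodular connectivity in the reference network and in the test network. The sketch below is an illustration under stated assumptions rather than the paper's software: function names, the soft-threshold power `beta=6`, and the simulated data are hypothetical.

```python
import numpy as np

def intramodular_connectivity(expr, module_idx, beta=6):
    """Row sums of the weighted adjacency restricted to the module's genes."""
    cor = np.corrcoef(expr[:, module_idx], rowvar=False)
    adj = np.abs(cor) ** beta
    np.fill_diagonal(adj, 0.0)
    return adj.sum(axis=1)

def connectivity_preservation(ref_expr, test_expr, module_idx, beta=6):
    """Correlation of intramodular connectivity between reference and test data."""
    k_ref = intramodular_connectivity(ref_expr, module_idx, beta)
    k_test = intramodular_connectivity(test_expr, module_idx, beta)
    return np.corrcoef(k_ref, k_test)[0, 1]

# Simulated module: six genes follow one driver with gene-specific noise levels,
# and the same structure is drawn independently for reference and test samples.
rng = np.random.default_rng(1)
noise_sd = np.array([0.2, 0.4, 0.6, 0.8, 1.0, 1.2])

def simulate(n_samples):
    driver = rng.normal(size=(n_samples, 1))
    return driver + noise_sd * rng.normal(size=(n_samples, noise_sd.size))

ref_expr = simulate(200)
test_expr = simulate(200)
module = np.arange(noise_sd.size)

preservation = connectivity_preservation(ref_expr, test_expr, module)
```

A value near 1 indicates that the same genes act as hubs in both data sets. Several such statistics could then be combined into a summary score, as the abstract recommends.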
Combinatorial influence of environmental parameters on transcription factor activity.

PubMed

Knijnenburg, T A; Wessels, L F A; Reinders, M J T

2008-07-01

Cells receive a wide variety of environmental signals, which are often processed combinatorially to generate specific genetic responses. Changes in transcript levels, as observed across different environmental conditions, can, to a large extent, be attributed to changes in the activity of transcription factors (TFs). However, in unraveling these transcription regulation networks, the actual environmental signals are often not incorporated into the model, simply because they have not been measured. The unquantified heterogeneity of the environmental parameters across microarray experiments frustrates regulatory network inference. We propose an inference algorithm that models the influence of environmental parameters on gene expression. The approach is based on a yeast microarray compendium of chemostat steady-state experiments. Chemostat cultivation enables the accurate control and measurement of many of the key cultivation parameters, such as nutrient concentrations, growth rate and temperature. The observed transcript levels are explained by inferring the activity of TFs in response to combinations of cultivation parameters. The interplay between activated enhancers and repressors that bind a gene promoter determines the possible up- or downregulation of the gene. The model is translated into a linear integer optimization problem. The resulting regulatory network identifies the combinatorial effects of environmental parameters on TF activity and gene expression. The Matlab code is available from the authors upon request. Supplementary data are available at Bioinformatics online.
Maximizing capture of gene co-expression relationships through pre-clustering of input expression samples: an Arabidopsis case study.

PubMed

Feltus, F Alex; Ficklin, Stephen P; Gibson, Scott M; Smith, Melissa C

2013-06-05

In genomics, highly relevant gene interaction (co-expression) networks have been constructed by finding significant pair-wise correlations between genes in expression datasets. These networks are then mined to elucidate biological function at the polygenic level. In some cases networks may be constructed from input samples that measure gene expression under a variety of different conditions, such as for different genotypes, environments, disease states and tissues. When large sets of samples are obtained from public repositories it is often unmanageable to associate samples into condition-specific groups, and combining samples from various conditions has a negative effect on network size. A fixed significance threshold is often applied, also limiting the size of the final network. Therefore, we propose pre-clustering of input expression samples to approximate condition-specific grouping of samples and individual network construction of each group as a means for dynamic significance thresholding. The net effect is increased sensitivity, thus maximizing the total co-expression relationships in the final co-expression network compendium. A total of 86 Arabidopsis thaliana co-expression networks were constructed after k-means partitioning of 7,105 publicly available ATH1 Affymetrix microarray samples. We term each pre-sorted network a Gene Interaction Layer (GIL). Random Matrix Theory (RMT), an un-supervised thresholding method, was used to threshold each of the 86 networks independently, effectively providing a dynamic (non-global) threshold for the network. The overall gene count across all GILs reached 19,588 genes (94.7% measured gene coverage) and 558,022 unique co-expression relationships. In comparison, network construction without pre-sorting of input samples yielded only 3,297 genes (15.9%) and 129,134 relationships in the global network. 
Here we show that pre-clustering of microarray samples helps approximate condition-specific networks and allows for dynamic thresholding using un-supervised methods. Because RMT ensures only highly significant interactions are kept, the GIL compendium consists of 558,022 unique high quality A. thaliana co-expression relationships across almost all of the measurable genes on the ATH1 array. For A. thaliana, these networks represent the largest compendium to date of significant gene co-expression relationships, and are a means to explore complex pathway, polygenic, and pleiotropic relationships for this focal model plant. The networks can be explored at sysbio.genome.clemson.edu. Finally, this method is applicable to any large expression profile collection for any organism and is best suited where a knowledge-independent network construction method is desired. PMID:23738693
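The effect of pre-clustering can be shown on a toy example: samples are partitioned with k-means, a correlation network is built per partition, and the layers are merged. This sketch swaps in a fixed correlation cutoff where the study uses Random Matrix Theory thresholding, and the k-means routine uses a simple farthest-point initialization. All names, the cutoff of 0.8, and the simulated data are assumptions.

```python
import numpy as np

def kmeans_labels(samples, k, n_iter=20):
    """Minimal Lloyd's k-means with farthest-point initialization."""
    centers = [samples[0]]
    for _ in range(1, k):
        d = np.min([((samples - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(samples[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = samples[labels == c].mean(axis=0)
    return labels

def coexpression_edges(expr, threshold):
    """Gene pairs whose absolute Pearson correlation reaches the cutoff."""
    cor = np.corrcoef(expr, rowvar=False)
    n = cor.shape[0]
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(cor[i, j]) >= threshold}

def layered_networks(expr, k, threshold=0.8, min_samples=5):
    """One network ("layer") per sample cluster; fixed cutoff stands in for RMT."""
    labels = kmeans_labels(expr, k)
    layers = {}
    for c in range(k):
        group = expr[labels == c]
        if len(group) >= min_samples:
            layers[c] = coexpression_edges(group, threshold)
    return layers

# Two simulated conditions: genes 0 and 1 are positively correlated in one and
# negatively correlated in the other; gene 2 separates the conditions.
rng = np.random.default_rng(0)
g0_a, g0_b = rng.normal(size=40), rng.normal(size=40)
cond_a = np.column_stack([g0_a, g0_a + 0.1 * rng.normal(size=40),
                          0.1 * rng.normal(size=40)])
cond_b = np.column_stack([g0_b, -g0_b + 0.1 * rng.normal(size=40),
                          20 + 0.1 * rng.normal(size=40)])
expr = np.vstack([cond_a, cond_b])

global_edges = coexpression_edges(expr, threshold=0.8)
layers = layered_networks(expr, k=2, threshold=0.8)
compendium = set().union(*layers.values())
```

In this toy case the pooled samples hide the relationship between genes 0 and 1 because the two conditions cancel out, while each layer recovers it.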
Maximizing capture of gene co-expression relationships through pre-clustering of input expression samples: an Arabidopsis case study

PubMed Central

2013-01-01

Background In genomics, highly relevant gene interaction (co-expression) networks have been constructed by finding significant pair-wise correlations between genes in expression datasets. These networks are then mined to elucidate biological function at the polygenic level. In some cases networks may be constructed from input samples that measure gene expression under a variety of different conditions, such as for different genotypes, environments, disease states and tissues. When large sets of samples are obtained from public repositories it is often unmanageable to associate samples into condition-specific groups, and combining samples from various conditions has a negative effect on network size. A fixed significance threshold is often applied also limiting the size of the final network. Therefore, we propose pre-clustering of input expression samples to approximate condition-specific grouping of samples and individual network construction of each group as a means for dynamic significance thresholding. The net effect is increase sensitivity thus maximizing the total co-expression relationships in the final co-expression network compendium. Results A total of 86 Arabidopsis thaliana co-expression networks were constructed after k-means partitioning of 7,105 publicly available ATH1 Affymetrix microarray samples. We term each pre-sorted network a Gene Interaction Layer (GIL). Random Matrix Theory (RMT), an un-supervised thresholding method, was used to threshold each of the 86 networks independently, effectively providing a dynamic (non-global) threshold for the network. The overall gene count across all GILs reached 19,588 genes (94.7% measured gene coverage) and 558,022 unique co-expression relationships. In comparison, network construction without pre-sorting of input samples yielded only 3,297 genes (15.9%) and 129,134 relationships. in the global network. 
Conclusions Here we show that pre-clustering of microarray samples helps approximate condition-specific networks and allows for dynamic thresholding using un-supervised methods. Because RMT ensures only highly significant interactions are kept, the GIL compendium consists of 558,022 unique high quality A. thaliana co-expression relationships across almost all of the measurable genes on the ATH1 array. For A. thaliana, these networks represent the largest compendium to date of significant gene co-expression relationships, and are a means to explore complex pathway, polygenic, and pleiotropic relationships for this focal model plant. The networks can be explored at sysbio.genome.clemson.edu. Finally, this method is applicable to any large expression profile collection for any organism and is best suited where a knowledge-independent network construction method is desired. PMID:23738693
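The effect of pre-clustering can be illustrated with a minimal Python sketch. It assumes the sample cluster labels are already available (the study derived them with k-means) and uses a fixed correlation cutoff in place of the RMT threshold. The gene names, the toy data, and the functions `coexpression_edges` and `layered_edges` are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def coexpression_edges(expr, sample_idx, threshold):
    """Gene pairs whose |correlation| over the chosen samples reaches the cutoff."""
    sample_idx = list(sample_idx)
    edges = set()
    for g1, g2 in combinations(sorted(expr), 2):
        x = [expr[g1][i] for i in sample_idx]
        y = [expr[g2][i] for i in sample_idx]
        if abs(pearson(x, y)) >= threshold:
            edges.add((g1, g2))
    return edges


def layered_edges(expr, labels, threshold):
    """Union of the edges found separately within each sample cluster."""
    edges = set()
    for label in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == label]
        edges |= coexpression_edges(expr, idx, threshold)
    return edges


# Toy data: two genes measured in eight samples from two conditions.
expr = {
    "geneA": [1, 2, 3, 4, 1, 2, 3, 4],
    "geneB": [1, 2, 3, 4, 4, 3, 2, 1],
}
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # assumed output of a sample clustering step

pooled = coexpression_edges(expr, range(8), threshold=0.9)
layered = layered_edges(expr, labels, threshold=0.9)
print(pooled, layered)
```

In this toy data the two genes are perfectly correlated within each sample group but with opposite sign, so the pooled correlation is zero and only the per-group construction recovers the edge.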
Tillering and panicle branching genes in rice.

PubMed

Liang, Wei-hong; Shang, Fei; Lin, Qun-ting; Lou, Chen; Zhang, Jing

2014-03-01

Rice (Oryza sativa L.) is one of the most important staple food crops in the world, and rice tillering and panicle branching are important traits determining grain yield. Since the gene MONOCULM 1 (MOC 1) was first characterized as a key regulator in controlling rice tillering and branching, great progress has been achieved in identifying important genes associated with grain yield, elucidating the genetic basis of yield-related traits. Some of these important genes were shown to be applicable for molecular breeding of high-yielding rice. This review focuses on recent advances, with emphasis on rice tillering and panicle branching genes, and their regulatory networks. Copyright © 2013 Elsevier B.V. All rights reserved.
A Novel Characterization of Amalgamated Networks in Natural Systems

PubMed Central

Barranca, Victor J.; Zhou, Douglas; Cai, David

2015-01-01

Densely-connected networks are prominent among natural systems, exhibiting structural characteristics often optimized for biological function. To reveal such features in highly-connected networks, we introduce a new network characterization determined by a decomposition of network-connectivity into low-rank and sparse components. Based on these components, we discover a new class of networks we define as amalgamated networks, which exhibit large functional groups and dense connectivity. Analyzing recent experimental findings on cerebral cortex, food-web, and gene regulatory networks, we establish the unique importance of amalgamated networks in fostering biologically advantageous properties, including rapid communication among nodes, structural stability under attacks, and separation of network activity into distinct functional modules. We further observe that our network characterization is scalable with network size and connectivity, thereby identifying robust features significant to diverse physical systems, which are typically undetectable by conventional characterizations of connectivity. We expect that studying the amalgamation properties of biological networks may offer new insights into understanding their structure-function relationships. PMID:26035066
Medium-throughput processing of whole mount in situ hybridisation experiments into gene expression domains.

PubMed

Crombach, Anton; Cicin-Sain, Damjan; Wotton, Karl R; Jaeger, Johannes

2012-01-01

Understanding the function and evolution of developmental regulatory networks requires the characterisation and quantification of spatio-temporal gene expression patterns across a range of systems and species. However, most high-throughput methods to measure the dynamics of gene expression do not preserve the detailed spatial information needed in this context. For this reason, quantification methods based on image bioinformatics have become increasingly important over the past few years. Most available approaches in this field either focus on the detailed and accurate quantification of a small set of gene expression patterns, or attempt high-throughput analysis of spatial expression through binary pattern extraction and large-scale analysis of the resulting datasets. Here we present a robust, "medium-throughput" pipeline to process in situ hybridisation patterns from embryos of different species of flies. It bridges the gap between high-resolution, and high-throughput image processing methods, enabling us to quantify graded expression patterns along the antero-posterior axis of the embryo in an efficient and straightforward manner. Our method is based on a robust enzymatic (colorimetric) in situ hybridisation protocol and rapid data acquisition through wide-field microscopy. Data processing consists of image segmentation, profile extraction, and determination of expression domain boundary positions using a spline approximation. It results in sets of measured boundaries sorted by gene and developmental time point, which are analysed in terms of expression variability or spatio-temporal dynamics. Our method yields integrated time series of spatial gene expression, which can be used to reverse-engineer developmental gene regulatory networks across species. It is easily adaptable to other processes and species, enabling the in silico reconstitution of gene regulatory networks in a wide range of developmental contexts.

Transcription Profiles Reveal Sugar and Hormone Signaling Pathways Mediating Flower Induction in Apple (Malus domestica Borkh.).

PubMed

Xing, Li-Bo; Zhang, Dong; Li, You-Mei; Shen, Ya-Wen; Zhao, Cai-Ping; Ma, Juan-Juan; An, Na; Han, Ming-Yu

2015-10-01

Flower induction in apple (Malus domestica Borkh.) is regulated by complex gene networks that involve multiple signal pathways to ensure flower bud formation in the next year, but the molecular determinants of apple flower induction are still unknown. In this research, transcriptomic profiles from differentiating buds allowed us to identify genes potentially involved in signaling pathways that mediate the regulatory mechanisms of flower induction. A hypothetical model for this regulatory mechanism was obtained by analysis of the available transcriptomic data, suggesting that sugar-, hormone- and flowering-related genes, as well as those involved in cell-cycle induction, participated in the apple flower induction process. Sugar levels and metabolism-related gene expression profiles revealed that sucrose is the initiation signal in flower induction. Complex hormone regulatory networks involved in cytokinin (CK), abscisic acid (ABA) and gibberellic acid pathways also induce apple flower formation. CK plays a key role in the regulation of cell formation and differentiation, and in affecting flowering-related gene expression levels during these processes. Meanwhile, ABA levels and ABA-related gene expression levels gradually increased, as did those of sugar metabolism-related genes, in developing buds, indicating that ABA signals regulate apple flower induction by participating in the sugar-mediated flowering pathway. Furthermore, changes in sugar and starch deposition levels in buds can be affected by ABA content and the expression of the genes involved in the ABA signaling pathway. Thus, multiple pathways, which are mainly mediated by crosstalk between sugar and hormone signals, regulate the molecular network involved in bud growth and flower induction in apple trees. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
Immunological network analysis in HPV associated head and neck squamous cancer and implications for disease prognosis.

PubMed

Chen, Xiaohang; Yan, Bingqing; Lou, Huihuang; Shen, Zhenji; Tong, Fangjia; Zhai, Aixia; Wei, Lanlan; Zhang, Fengmin

2018-04-01

Human papillomavirus-positive (HPV+) head and neck squamous cell cancer (HNSCC) exhibits a better prognosis than HPV-negative (HPV-) HNSCC. This difference may in part be due to enhanced immune activation in the HPV+ HNSCC tumor microenvironment. To characterize differences in immune activation between HPV+ and HPV- HNSCC tumors, we identified and annotated differentially expressed genes based upon mRNA expression data from The Cancer Genome Atlas (TCGA). An immune network between immune cells and cytokines was constructed by using single sample Gene Set Enrichment Analysis and conditional mutual information. Multivariate Cox regression analysis was used to determine the prognostic value of immune microenvironment characterization. A total of 1673 differentially expressed genes were functionally annotated. We found that genes upregulated in HPV+ HNSCC are enriched in immune-associated processes, and the up-regulated gene sets were validated by Gene Set Enrichment Analysis. The microenvironment of HPV+ HNSCC exhibited greater numbers of infiltrating B and T cells and fewer neutrophils than HPV- HNSCC. These findings were validated by two independent datasets in the Gene Expression Omnibus (GEO) database. Further analyses of T cell subtypes revealed that cytotoxic T cell subtypes predominated in HPV+ HNSCC. In addition, the ratio of M1/M2 macrophages was much higher in HPV+ HNSCC. The infiltration of these immune cells was correlated with differentially expressed cytokine-associated genes. Enhanced B cell and CD8+ T cell infiltration were identified as independent protective factors, while high neutrophil infiltration was a risk-enhancing factor for HPV+ HNSCC patients. A schematic model of the immunological network was established for HPV+ HNSCC to summarize our findings. Copyright © 2018 Elsevier Ltd. All rights reserved.
Transcriptional network inference from functional similarity and expression data: a global supervised approach.

PubMed

Ambroise, Jérôme; Robert, Annie; Macq, Benoit; Gala, Jean-Luc

2012-01-06

An important challenge in system biology is the inference of biological networks from postgenomic data. Among these biological networks, a gene transcriptional regulatory network focuses on interactions existing between transcription factors (TFs) and their corresponding target genes. A large number of reverse engineering algorithms were proposed to infer such networks from gene expression profiles, but most current methods have relatively low predictive performances. In this paper, we introduce the novel TNIFSED method (Transcriptional Network Inference from Functional Similarity and Expression Data), that infers a transcriptional network from the integration of correlations and partial correlations of gene expression profiles and gene functional similarities through a supervised classifier. In the current work, TNIFSED was applied to predict the transcriptional network in Escherichia coli and in Saccharomyces cerevisiae, using datasets of 445 and 170 affymetrix arrays, respectively. Using the area under the curve of the receiver operating characteristics and the F-measure as indicators, we showed the predictive performance of TNIFSED to be better than unsupervised state-of-the-art methods. TNIFSED performed slightly worse than the supervised SIRENE algorithm for the target genes identification of the TF having a wide range of yet identified target genes but better for TF having only few identified target genes. Our results indicate that TNIFSED is complementary to the SIRENE algorithm, and particularly suitable to discover target genes of "orphan" TFs.
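To make the distinction between correlation and partial correlation concrete, the following Python sketch computes a first-order partial correlation between two expression profiles while controlling for a third. It is only an illustration of the two feature types named in the abstract; the functional-similarity features and the supervised classifier of TNIFSED are not shown, and the function names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y given z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy profiles: x and y both follow z, but their residuals are opposed.
z = [1, 2, 3, 4]
x = [2, 1, 2, 5]
y = [0, 3, 4, 3]

r_xy = pearson(x, y)
r_xy_given_z = partial_correlation(x, y, z)
print(round(r_xy, 3), round(r_xy_given_z, 3))
```

Here the plain correlation between x and y is weakly positive because both track z, while the partial correlation exposes the strong negative relationship that remains once z is accounted for.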
EgoNet: identification of human disease ego-network modules

PubMed Central

2014-01-01

Background Mining novel biomarkers from gene expression profiles for accurate disease classification is challenging due to small sample size and high noise in gene expression measurements. Several studies have proposed integrated analyses of microarray data and protein-protein interaction (PPI) networks to find diagnostic subnetwork markers. However, the neighborhood relationship among network member genes has not been fully considered by those methods, leaving many potential gene markers unidentified. The main idea of this study is to take full advantage of the biological observation that genes associated with the same or similar diseases commonly reside in the same neighborhood of molecular networks. Results We present EgoNet, a novel method based on egocentric network-analysis techniques, to exhaustively search and prioritize disease subnetworks and gene markers from a large-scale biological network. When applied to a triple-negative breast cancer (TNBC) microarray dataset, the top selected modules contain both known gene markers in TNBC and novel candidates, such as RAD51 and DOK1, which play a central role in their respective ego-networks by connecting many differentially expressed genes. Conclusions Our results suggest that EgoNet, which is based on the ego network concept, allows the identification of novel biomarkers and provides a deeper understanding of their roles in complex diseases. PMID:24773628
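The ego-network idea can be sketched in a few lines of Python: for each gene (the ego), collect its direct neighbors and the edges among them, then rank egos by how many differentially expressed genes their neighborhood contains. The abstract does not give EgoNet's scoring function, so the count used here is an assumed stand-in, and the node names and toy network are hypothetical.

```python
def ego_network(adj, ego):
    """Return the ego, its direct neighbors, and all edges among those members."""
    members = {ego} | adj.get(ego, set())
    edges = {
        tuple(sorted((u, v)))
        for u in members
        for v in adj.get(u, set())
        if v in members
    }
    return members, edges


def rank_egos(adj, de_genes):
    """Rank genes by the number of differentially expressed genes in their ego network."""
    scores = []
    for gene in adj:
        members, _ = ego_network(adj, gene)
        hits = len(members & de_genes)
        scores.append((gene, hits, hits / len(members)))
    # Highest hit count first; ties broken alphabetically for reproducibility.
    return sorted(scores, key=lambda s: (-s[1], s[0]))


# Toy undirected interaction network stored as an adjacency dictionary.
adj = {
    "hub": {"a", "b", "c"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
    "d": {"e"},
    "e": {"d"},
}
de_genes = {"a", "b", "c"}  # assumed differentially expressed genes

members, edges = ego_network(adj, "hub")
ranking = rank_egos(adj, de_genes)
print(ranking[0])
```

Note that the top-ranked ego need not be differentially expressed itself: in the toy network "hub" ranks first only because it connects the three differentially expressed genes.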
Gene Regulatory Network Inferences Using a Maximum-Relevance and Maximum-Significance Strategy

PubMed Central

Liu, Wei; Zhu, Wen; Liao, Bo; Chen, Xiangtao

2016-01-01

Recovering gene regulatory networks from expression data is a challenging problem in systems biology that provides valuable information on the regulatory mechanisms of cells. A number of algorithms based on computational models are currently used to recover network topology. However, most of these algorithms have limitations. For example, many models tend to be complicated because of the “large p, small n” problem. In this paper, we propose a novel regulatory network inference method called the maximum-relevance and maximum-significance network (MRMSn) method, which converts the problem of recovering networks into a problem of how to select the regulator genes for each gene. To solve the latter problem, we present an algorithm that is based on information theory and selects the regulator genes for a specific gene by maximizing the relevance and significance. A first-order incremental search algorithm is used to search for regulator genes. Eventually, a strict constraint is adopted to adjust all of the regulatory relationships according to the obtained regulator genes and thus obtain the complete network structure. We performed our method on five different datasets and compared our method to five state-of-the-art methods for network inference based on information theory. The results confirm the effectiveness of our method. PMID:27829000
Reverse engineering and analysis of large genome-scale gene networks

PubMed Central

Aluru, Maneesha; Zola, Jaroslaw; Nettleton, Dan; Aluru, Srinivas

2013-01-01

Reverse engineering the whole-genome networks of complex multicellular organisms remains a challenge. While simpler models easily scale to large numbers of genes and gene expression datasets, more accurate models are compute-intensive, limiting their scale of applicability. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) development of parallel algorithms to reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software tool, Gene Network Analyzer (GeNA), for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we performed analysis of 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web. PMID:23042249
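A serial, much simplified Python sketch of MI-based edge testing follows. It swaps in plain equal-width binning for the B-spline estimator and a naive shuffle-based permutation test for the paper's direct permutation algorithm, and it is not parallel; the function names, bin count, and number of permutations are assumptions made for illustration only.

```python
import random
from collections import Counter
from math import log


def discretize(values, bins):
    """Map values to equal-width bin indices in the range 0..bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of mutual information (in nats) between two profiles."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(c / n * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def permutation_pvalue(x, y, n_perm=200, bins=4, seed=0):
    """Observed MI and the fraction of shuffled profiles with MI at least as large."""
    rng = random.Random(seed)
    observed = mutual_information(x, y, bins)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(x, shuffled, bins) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy profiles: y is a deterministic function of x.
x = list(range(20))
y = [2 * v + 1 for v in x]
mi, p_value = permutation_pvalue(x, y)
print(round(mi, 3), p_value)
```

An edge between two genes would be retained when the p-value falls below a chosen significance level; repeating this over all gene pairs is the step that the paper parallelizes.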
A co-expression gene network associated with developmental regulation of apple fruit acidity.

PubMed

Bai, Yang; Dougherty, Laura; Cheng, Lailiang; Xu, Kenong

2015-08-01

Apple fruit acidity, which affects the fruit's overall taste and flavor to a large extent, is primarily determined by the concentration of malic acid. Previous studies demonstrated that the major QTL malic acid (Ma) on chromosome 16 is largely responsible for fruit acidity variations in apple. Recent advances suggested that a natural mutation that gives rise to a premature stop codon in one of the two aluminum-activated malate transporter (ALMT)-like genes (called Ma1) is the genetic causal element underlying Ma. However, the natural mutation does not explain the developmental changes of fruit malate levels in a given genotype. Using RNA-seq data from the fruit of 'Golden Delicious' taken at 14 developmental stages from 1 week after full-bloom (WAF01) to harvest (WAF20), we characterized their transcriptomes in groups of high (12.2 ± 1.6 mg/g fw, WAF03-WAF08), mid (7.4 ± 0.5 mg/g fw, WAF01-WAF02 and WAF10-WAF14) and low (5.4 ± 0.4 mg/g fw, WAF16-WAF20) malate concentrations. Detailed analyses showed that a set of 3,066 genes (including Ma1) were expressed not only differentially (P FDR < 0.05) between the high and low malate groups (or between the early and late developmental stages) but also in significant (P < 0.05) correlation with malate concentrations. The 3,066 genes fell in 648 MapMan (sub-) bins or functional classes, and 19 of them were significantly (P FDR < 0.05) co-enriched or co-suppressed in a malate dependent manner. Network inferring using the 363 genes encompassed in the 19 (sub-) bins, identified a major co-expression network of 239 genes. Since the 239 genes were also differentially expressed between the early (WAF03-WAF08) and late (WAF16-WAF20) developmental stages, the major network was considered to be associated with developmental regulation of apple fruit acidity in 'Golden Delicious'.
An integrative data mining approach to identifying adverse outcome pathway signatures.

PubMed

Oki, Noffisat O; Edwards, Stephen W

2016-03-28

The Adverse Outcome Pathway (AOP) framework is a tool for making biological connections and summarizing key information across different levels of biological organization to connect biological perturbations at the molecular level to adverse outcomes for an individual or population. Computational approaches to explore and determine these connections can accelerate the assembly of AOPs. By leveraging the wealth of publicly available data covering chemical effects on biological systems, computationally-predicted AOPs (cpAOPs) were assembled via data mining of high-throughput screening (HTS) in vitro data, in vivo data and other disease phenotype information. Frequent Itemset Mining (FIM) was used to find associations between the gene targets of ToxCast HTS assays and disease data from Comparative Toxicogenomics Database (CTD) by using the chemicals as the common aggregators between datasets. The method was also used to map gene expression data to disease data from CTD. A cpAOP network was defined by considering genes and diseases as nodes and FIM associations as edges. This network contained 18,283 gene to disease associations for the ToxCast data and 110,253 for CTD gene expression. Two case studies show the value of the cpAOP network by extracting subnetworks focused either on fatty liver disease or the Aryl Hydrocarbon Receptor (AHR). The subnetwork surrounding fatty liver disease included many genes known to play a role in this disease. When querying the cpAOP network with the AHR gene, an interesting subnetwork including glaucoma was identified. While substantial literature exists to support the potential for AHR ligands to elicit glaucoma, it was not explicitly captured in the public annotation information in CTD. The subnetwork from this analysis suggests a cpAOP that includes changes in CYP1B1 expression, which has been previously established in the literature as a primary cause of glaucoma. 
These case studies highlight the value in integrating multiple data sources when defining cpAOPs for HTS data. Copyright © 2016. Published by Elsevier Ireland Ltd.
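The use of chemicals as common aggregators can be sketched as a pair-level support count: each chemical acts as a transaction that links its gene targets to its associated diseases, and gene-disease pairs supported by enough chemicals become edges. This is a reduced stand-in for full frequent itemset mining, and the identifiers, the data, and the `min_support` value below are placeholders rather than ToxCast or CTD content.

```python
from collections import Counter
from itertools import product


def gene_disease_associations(chem_genes, chem_diseases, min_support=2):
    """Count chemicals shared by each gene-disease pair and keep frequent pairs."""
    support = Counter()
    # Only chemicals present in both datasets can link a gene to a disease.
    for chem in chem_genes.keys() & chem_diseases.keys():
        for pair in product(sorted(chem_genes[chem]), sorted(chem_diseases[chem])):
            support[pair] += 1
    return {pair: n for pair, n in support.items() if n >= min_support}


# Placeholder inputs: chemical -> gene targets, chemical -> diseases.
chem_genes = {
    "chem1": {"GENE_X", "GENE_Y"},
    "chem2": {"GENE_X"},
    "chem3": {"GENE_Y"},
}
chem_diseases = {
    "chem1": {"disease_P"},
    "chem2": {"disease_P"},
    "chem3": {"disease_Q"},
    "chem4": {"disease_P"},
}

edges = gene_disease_associations(chem_genes, chem_diseases)
print(edges)
```

The retained pairs are the edges of a bipartite gene-disease network of the kind the abstract describes; a subnetwork for one disease or one gene is then a simple filter on these keys.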
Statistical indicators of collective behavior and functional clusters in gene networks of yeast

NASA Astrophysics Data System (ADS)

Živković, J.; Tadić, B.; Wick, N.; Thurner, S.

2006-03-01

We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
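The emergence of a giant component as the correlation threshold is lowered can be traced with a short Python sketch based on a union-find structure. The edge weights below are made-up correlation coefficients, not values from the yeast data, and the function name is hypothetical.

```python
def largest_component_size(n_nodes, weighted_edges, threshold):
    """Size of the largest connected component using edges with |weight| >= threshold."""
    parent = list(range(n_nodes))

    def find(i):
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for u, v, w in weighted_edges:
        if abs(w) >= threshold:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv

    sizes = {}
    for i in range(n_nodes):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


# Toy network: (gene index, gene index, correlation coefficient).
edges = [(0, 1, 0.95), (1, 2, 0.90), (2, 3, 0.60), (3, 4, 0.85), (4, 5, 0.30)]
thresholds = [0.9, 0.8, 0.5, 0.2]
sizes = [largest_component_size(6, edges, t) for t in thresholds]
print(dict(zip(thresholds, sizes)))
```

Plotting the largest component size against the threshold for a real correlation matrix is one way to look for the percolation-like transition mentioned in the abstract.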
Tuning stochastic transition rates in a bistable genetic network.

NASA Astrophysics Data System (ADS)

Chickarmane, Vijay; Peterson, Carsten

2009-03-01

We investigate the stochastic dynamics of a simple genetic network, a toggle switch, in which the system makes transitions between the two alternative states. Our interest is in exploring whether such stochastic transitions, which occur due to intrinsic noise such as transcriptional and degradation events, can be slowed down or sped up without changing the mean expression levels of the two genes that comprise the toggle network. Such tuning is achieved by linking a signaling network to the toggle switch. The signaling network consists of a protein that can exist in either an active (phosphorylated) or inactive (dephosphorylated) form, and whose state is determined by one of the genetic network components. The active form of the protein in turn feeds back on the dynamics of the genetic network. We find that the rate of stochastic transitions from one state to the other is determined essentially by the speed of phosphorylation, and hence the rate can be modulated by varying the phosphatase levels. We hypothesize that such a network architecture can be implemented as a general mechanism for controlling transition rates and discuss applications in population studies of two differentiated cell lineages, e.g., the myeloid/erythroid lineage in hematopoiesis.
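The intrinsic noise of a bare toggle switch can be simulated with the Gillespie algorithm, as in the Python sketch below. It models only two mutually repressing genes with Hill-type production and first-order degradation; the signaling (phosphorylation) layer studied in the paper is omitted, and all parameter values and function names are assumptions chosen for illustration.

```python
import random


def propensities(a, b, alpha=50.0, k=20.0, n=2, delta=1.0):
    """Reaction rates: production of A (repressed by B), decay of A, and vice versa."""
    return [
        alpha / (1.0 + (b / k) ** n),  # A is produced, repressed by B
        delta * a,                     # A is degraded
        alpha / (1.0 + (a / k) ** n),  # B is produced, repressed by A
        delta * b,                     # B is degraded
    ]


def gillespie_toggle(a0, b0, t_end, seed=0):
    """Simulate molecule counts of A and B up to time t_end; returns (t, a, b) tuples."""
    rng = random.Random(seed)
    t, a, b = 0.0, a0, b0
    trajectory = [(t, a, b)]
    changes = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    while True:
        rates = propensities(a, b)
        total = sum(rates)
        t += rng.expovariate(total)  # waiting time to the next reaction
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to its rate.
        r = rng.random() * total
        acc = 0.0
        for rate, (da, db) in zip(rates, changes):
            acc += rate
            if r < acc:
                a += da
                b += db
                break
        trajectory.append((t, a, b))
    return trajectory


trajectory = gillespie_toggle(a0=40, b0=0, t_end=5.0, seed=1)
print(len(trajectory), trajectory[-1])
```

Counting how often long runs switch between the A-high and B-high states gives an empirical transition rate, which is the quantity the paper seeks to tune.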
Investigating Cell Criticality

NASA Astrophysics Data System (ADS)

Serra, R.; Villani, M.; Damiani, C.; Graudenzi, A.; Ingrami, P.; Colacci, A.

Random Boolean networks provide a way to give a precise meaning to the notion that living beings are in a critical state. Some phenomena which are observed in real biological systems (distribution of "avalanches" in gene knock-out experiments) can be modeled using random Boolean networks, and the results can be analytically proven to depend upon the Derrida parameter, which also determines whether the network is critical. By comparing observed and simulated data one can then draw inferences about the criticality of biological cells, although with some care because of the limited number of experimental observations. The relationship between the criticality of a single network and that of a set of interacting networks, which simulate a tissue or a bacterial colony, is also analyzed by computer simulations.
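A numerical estimate of the Derrida parameter can be sketched in Python as the mean number of nodes that differ after one synchronous update when a single node of a random state is flipped. Values below 1 indicate ordered dynamics, values above 1 indicate chaotic dynamics, and a value of 1 marks the critical case. The network sizes, trial counts, and function names here are illustrative assumptions.

```python
import random


def random_network(n, k, rng):
    """Random Boolean network: each node gets k distinct inputs and a random truth table."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randrange(2) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronously update every node from the current values of its inputs."""
    new_state = []
    for ins, table in zip(inputs, tables):
        index = 0
        for j in ins:
            index = (index << 1) | state[j]
        new_state.append(table[index])
    return new_state


def derrida_estimate(inputs, tables, trials, rng):
    """Mean Hamming distance after one step between states that differed in one node."""
    n = len(inputs)
    total = 0
    for _ in range(trials):
        state = [rng.randrange(2) for _ in range(n)]
        flipped = list(state)
        flipped[rng.randrange(n)] ^= 1
        after = step(state, inputs, tables)
        after_flipped = step(flipped, inputs, tables)
        total += sum(x != y for x, y in zip(after, after_flipped))
    return total / trials


rng = random.Random(0)

# Hand-built ring in which each node copies its neighbor: exactly critical.
ring_inputs = [[(i + 1) % 5] for i in range(5)]
ring_tables = [[0, 1] for _ in range(5)]
ring_value = derrida_estimate(ring_inputs, ring_tables, trials=100, rng=rng)

# Random network with two inputs per node and unbiased truth tables.
inputs, tables = random_network(n=50, k=2, rng=rng)
random_value = derrida_estimate(inputs, tables, trials=500, rng=rng)
print(ring_value, random_value)
```

The ring gives exactly 1 because a single flipped node always changes exactly one downstream node; for the random network the estimate fluctuates around the critical value and depends on the particular network drawn.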
Reduced Synchronization Persistence in Neural Networks Derived from Atm-Deficient Mice

PubMed Central

Levine-Small, Noah; Yekutieli, Ziv; Aljadeff, Jonathan; Boccaletti, Stefano; Ben-Jacob, Eshel; Barzilai, Ari

2011-01-01

Many neurodegenerative diseases are characterized by malfunction of the DNA damage response. Therefore, it is important to understand the connection between system level neural network behavior and DNA. Neural networks drawn from genetically engineered animals, interfaced with micro-electrode arrays allowed us to unveil connections between networks’ system level activity properties and such genome instability. We discovered that Atm protein deficiency, which in humans leads to progressive motor impairment, leads to a reduced synchronization persistence compared to wild type synchronization, after chemically imposed DNA damage. Not only do these results suggest a role for DNA stability in neural network activity, they also establish an experimental paradigm for empirically determining the role a gene plays on the behavior of a neural network. PMID:21519382
Gene Coexpression Network Alignment and Conservation of Gene Modules between Two Grass Species: Maize and Rice

PubMed Central

Ficklin, Stephen P.; Feltus, F. Alex

2011-01-01

One major objective for plant biology is the discovery of molecular subsystems underlying complex traits. The use of genetic and genomic resources combined in a systems genetics approach offers a means for approaching this goal. This study describes a maize (Zea mays) gene coexpression network built from publicly available expression arrays. The maize network consisted of 2,071 loci that were divided into 34 distinct modules that contained 1,928 enriched functional annotation terms and 35 cofunctional gene clusters. Of note, 391 maize genes of unknown function were found to be coexpressed within modules along with genes of known function. A global network alignment was made between this maize network and a previously described rice (Oryza sativa) coexpression network. The IsoRankN tool was used, which incorporates both gene homology and network topology for the alignment. A total of 1,173 aligned loci were detected between the two grass networks, which condensed into 154 conserved subgraphs that preserved 4,758 coexpression edges in rice and 6,105 coexpression edges in maize. This study provides an early view into maize coexpression space and provides an initial network-based framework for the translation of functional genomic and genetic information between these two vital agricultural species. PMID:21606319
Gene Network Construction from Microarray Data Identifies a Key Network Module and Several Candidate Hub Genes in Age-Associated Spatial Learning Impairment

PubMed Central

Uddin, Raihan; Singh, Shiva M.

2017-01-01

As humans age, many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and identify novel ASLI related genes and their networks based on co-expression relationship of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in “learning and memory” related functions and pathways. Subsequent differential network analysis of this “learning and memory” module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known functions of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation.
Taken together, they provide a new insight and generate new hypotheses into the molecular mechanisms responsible for age associated learning impairment, including spatial learning. PMID:29066959
WGCNA: an R package for weighted correlation network analysis

PubMed Central

Langfelder, Peter; Horvath, Steve

2008-01-01

Background Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. Results The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. Conclusion The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. The R package along with its source code and additional material are freely available at . PMID:19114008
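The core idea of a weighted correlation network can be sketched in a few lines. The following is a minimal, dependency-free Python illustration, not the WGCNA R package itself: it builds an unsigned soft-threshold adjacency a_ij = |cor(x_i, x_j)|^beta and sums each gene's adjacencies to get its connectivity. The gene names, the expression values, and the choice beta = 6 are assumptions made up for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    genes = list(profiles)
    adj = {g: {} for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(profiles[gi], profiles[gj])) ** beta
            adj[gi][gj] = a
            adj[gj][gi] = a
    return adj

def connectivity(adj):
    """Connectivity of each gene: the sum of its adjacencies."""
    return {g: sum(nbrs.values()) for g, nbrs in adj.items()}

# Hypothetical expression profiles (one value per sample), for illustration only.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}
adj = soft_threshold_adjacency(profiles, beta=6)
k = connectivity(adj)
hub = max(k, key=k.get)  # the most connected gene in this toy network
```

Raising the absolute correlation to a power keeps strong correlations close to 1 while pushing weak ones toward 0, which is why the weakly correlated geneD ends up with the lowest connectivity here. Module detection, eigengenes, and the other steps named in the abstract are not covered by this sketch.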
Network topologies and convergent aetiologies arising from deletions and duplications observed in individuals with autism.

PubMed

Noh, Hyun Ji; Ponting, Chris P; Boulding, Hannah C; Meader, Stephen; Betancur, Catalina; Buxbaum, Joseph D; Pinto, Dalila; Marshall, Christian R; Lionel, Anath C; Scherer, Stephen W; Webber, Caleb

2013-06-01

Autism Spectrum Disorders (ASD) are highly heritable and characterised by impairments in social interaction and communication, and restricted and repetitive behaviours. Considering four sets of de novo copy number variants (CNVs) identified in 181 individuals with autism and exploiting mouse functional genomics and known protein-protein interactions, we identified a large and significantly interconnected interaction network. This network contains 187 genes affected by CNVs drawn from 45% of the patients we considered and 22 genes previously implicated in ASD, of which 192 form a single interconnected cluster. On average, those patients with copy number changed genes from this network possess changes in 3 network genes, suggesting that epistasis mediated through the network is extensive. Correspondingly, genes that are highly connected within the network, and thus whose copy number change is predicted by the network to be more phenotypically consequential, are significantly enriched among patients that possess only a single ASD-associated network copy number changed gene (p = 0.002). Strikingly, deleted or disrupted genes from the network are significantly enriched in GO-annotated positive regulators (2.3-fold enrichment, corrected p = 2×10⁻⁵), whereas duplicated genes are significantly enriched in GO-annotated negative regulators (2.2-fold enrichment, corrected p = 0.005). The direction of copy change is highly informative in the context of the network, providing the means through which perturbations arising from distinct deletions or duplications can yield a common outcome. These findings reveal an extensive ASD-associated molecular network, whose topology indicates ASD-relevant mutational deleteriousness and that mechanistically details how convergent aetiologies can result extensively from CNVs affecting pathways causally implicated in ASD.
The Role of mGluR Copy Number Variation in Genetic and Environmental Forms of Syndromic Autism Spectrum Disorder.

PubMed

Wenger, Tara L; Kao, Charlly; McDonald-McGinn, Donna M; Zackai, Elaine H; Bailey, Alice; Schultz, Robert T; Morrow, Bernice E; Emanuel, Beverly S; Hakonarson, Hakon

2016-01-19

While abnormal signaling mediated through metabotropic glutamate receptor 5 (mGluR5) is involved in the pathophysiology of Autism Spectrum Disorder (ASD), Fragile X Syndrome and Tuberous Sclerosis, the role of other mGluRs and their associated signaling network genes in syndromic ASD is unknown. This study sought to determine whether mGluR Copy Number Variants (CNVs) were overrepresented in children with syndromic ASD and whether an mGluR "second hit" confers additional risk for ASD in 22q11.2 Deletion Syndrome (22q11DS). To determine whether mGluR network CNVs are enriched in syndromic ASD, we examined microarrays from children with ASD (n = 539). Patient categorization (syndromic vs nonsyndromic) was done via blinded medical chart review in mGluR positive and randomly selected mGluR negative cases. mGluR CNVs were found in 11.5% of children with ASD vs. 3.2% of controls (p < 0.001). Syndromic ASD was more prevalent in children with mGluR CNVs (74% vs 16%, p < 0.001). A comparison cohort with 22q11DS (n = 25 with ASD, n = 50 without ASD), all haploinsufficient for mGluR network gene RANBP1, was evaluated for "second mGluR hits". 20% of those with 22q11.2DS + ASD had "second hits" in mGluR network genes vs 2% of those with 22q11.2DS without ASD (p < 0.014). We propose that altered RANBP1 expression may provide a mechanistic link for several seemingly unrelated genetic and environmental forms of ASD.
The Role of mGluR Copy Number Variation in Genetic and Environmental Forms of Syndromic Autism Spectrum Disorder

PubMed Central

Wenger, Tara L.; Kao, Charlly; McDonald-McGinn, Donna M.; Zackai, Elaine H.; Bailey, Alice; Schultz, Robert T.; Morrow, Bernice E.; Emanuel, Beverly S.; Hakonarson, Hakon

2016-01-01

While abnormal signaling mediated through metabotropic glutamate receptor 5 (mGluR5) is involved in the pathophysiology of Autism Spectrum Disorder (ASD), Fragile X Syndrome and Tuberous Sclerosis, the role of other mGluRs and their associated signaling network genes in syndromic ASD is unknown. This study sought to determine whether mGluR Copy Number Variants (CNVs) were overrepresented in children with syndromic ASD and whether an mGluR “second hit” confers additional risk for ASD in 22q11.2 Deletion Syndrome (22q11DS). To determine whether mGluR network CNVs are enriched in syndromic ASD, we examined microarrays from children with ASD (n = 539). Patient categorization (syndromic vs nonsyndromic) was done via blinded medical chart review in mGluR positive and randomly selected mGluR negative cases. mGluR CNVs were found in 11.5% of children with ASD vs. 3.2% of controls (p < 0.001). Syndromic ASD was more prevalent in children with mGluR CNVs (74% vs 16%, p < 0.001). A comparison cohort with 22q11DS (n = 25 with ASD, n = 50 without ASD), all haploinsufficient for mGluR network gene RANBP1, was evaluated for “second mGluR hits”. 20% of those with 22q11.2DS + ASD had “second hits” in mGluR network genes vs 2% of those with 22q11.2DS without ASD (p < 0.014). We propose that altered RANBP1 expression may provide a mechanistic link for several seemingly unrelated genetic and environmental forms of ASD. PMID:26781481

Gonad Transcriptome Analysis of the Pacific Oyster Crassostrea gigas Identifies Potential Genes Regulating the Sex Determination and Differentiation Process.

PubMed

Yue, Chenyang; Li, Qi; Yu, Hong

2018-04-01

The Pacific oyster Crassostrea gigas is a commercially important bivalve in aquaculture worldwide. C. gigas has a fascinating sexual reproduction system consisting of dioecism, sex change, and occasional hermaphroditism, but knowledge of the molecular mechanisms of sex determination and differentiation is still limited. In this study, the transcriptomes of male and female gonads at different gametogenesis stages were characterized by RNA-seq. Hierarchical clustering of differentially expressed genes revealed that 1269 genes were expressed specifically in female gonads and that the expression of 817 genes increased over the course of spermatogenesis. In addition, using weighted gene correlation network analysis (WGCNA), we identified two gene modules related to female gonad development and one related to male gonad development. Interestingly, GO and KEGG enrichment analysis showed that neurotransmitter-related terms were significantly enriched in genes related to ovary development, suggesting that neurotransmitters are likely to regulate female sex differentiation. In addition, two hub genes related to testis development, lncRNA LOC105321313 and Cg-Sh3kbp1, and one hub gene related to ovary development, Cg-Malrd1-like, were investigated for the first time. This study points to a role for neurotransmitter and non-coding RNA regulation during gonad development and produces lists of novel candidate genes for further study. Together, these results provide valuable information for understanding the molecular mechanisms of C. gigas sex determination and differentiation.
Construction of regulatory networks using expression time-series data of a genotyped population.

PubMed

Yeung, Ka Yee; Dombek, Kenneth M; Lo, Kenneth; Mittler, John E; Zhu, Jun; Schadt, Eric E; Bumgarner, Roger E; Raftery, Adrian E

2011-11-29

The inference of regulatory and biochemical networks from large-scale genomics data is a basic problem in molecular biology. The goal is to generate testable hypotheses of gene-to-gene influences and subsequently to design bench experiments to confirm these network predictions. Coexpression of genes in large-scale gene-expression data implies coregulation and potential gene-gene interactions, but provides little information about the direction of influences. Here, we use both time-series data and genetics data to infer directionality of edges in regulatory networks: time-series data contain information about the chronological order of regulatory events and genetics data allow us to map DNA variations to variations at the RNA level. We generate microarray data measuring time-dependent gene-expression levels in 95 genotyped yeast segregants subjected to a drug perturbation. We develop a Bayesian model averaging regression algorithm that incorporates external information from diverse data types to infer regulatory networks from the time-series and genetics data. Our algorithm is capable of generating feedback loops. We show that our inferred network recovers existing and novel regulatory relationships. Following network construction, we generate independent microarray data on selected deletion mutants to prospectively test network predictions. We demonstrate the potential of our network to discover de novo transcription-factor binding sites. Applying our construction method to previously published data demonstrates that our method is competitive with leading network construction algorithms in the literature.
Transcriptional dynamics of a conserved gene expression network associated with craniofacial divergence in Arctic charr.

PubMed

Ahi, Ehsan Pashay; Kapralova, Kalina Hristova; Pálsson, Arnar; Maier, Valerie Helene; Gudbrandsson, Jóhannes; Snorrason, Sigurdur S; Jónsson, Zophonías O; Franzdóttir, Sigrídur Rut

2014-01-01

Understanding the molecular basis of craniofacial variation can provide insights into key developmental mechanisms of adaptive changes and their role in trophic divergence and speciation. Arctic charr (Salvelinus alpinus) is a polymorphic fish species, and, in Lake Thingvallavatn in Iceland, four sympatric morphs have evolved distinct craniofacial structures. We conducted a gene expression study on candidates from a conserved gene coexpression network, focusing on the development of craniofacial elements in embryos of two contrasting Arctic charr morphotypes (benthic and limnetic). Four Arctic charr morphs were studied: one limnetic and two benthic morphs from Lake Thingvallavatn and a limnetic reference aquaculture morph. The presence of morphological differences at developmental stages before the onset of feeding was verified by morphometric analysis. Following up on our previous findings that Mmp2 and Sparc were differentially expressed between morphotypes, we identified a network of genes with conserved coexpression across diverse vertebrate species. A comparative expression study of candidates from this network in developing heads of the four Arctic charr morphs verified the coexpression relationship of these genes and revealed distinct transcriptional dynamics strongly correlated with contrasting craniofacial morphologies (benthic versus limnetic). A literature review and Gene Ontology analysis indicated that a significant proportion of the network genes play a role in extracellular matrix organization and skeletogenesis, and motif enrichment analysis of conserved noncoding regions of network candidates predicted a handful of transcription factors, including Ap1 and Ets2, as potential regulators of the gene network. The expression of Ets2 itself was also found to associate with network gene expression. Genes linked to glucocorticoid signalling were also studied, as both Mmp2 and Sparc are responsive to this pathway. 
Among those, several transcriptional targets and upstream regulators showed differential expression between the contrasting morphotypes. Interestingly, although selected network genes showed overlapping expression patterns in situ and no morph differences, Timp2 expression patterns differed between morphs. Our comparative study of transcriptional dynamics in divergent craniofacial morphologies of Arctic charr revealed a conserved network of coexpressed genes sharing functional roles in structural morphogenesis. We also implicate transcriptional regulators of the network as targets for future functional studies.
Construction of local gene network for revealing different liver function of rats fed deep-fried oil with or without resistant starch.

PubMed

Wang, Zhiwei; Liao, Tianqi; Zhou, Zhongkai; Wang, Yuyang; Diao, Yongjia; Strappe, Padraig; Prenzler, Paul; Ayton, Jamie; Blanchard, Chris

2016-09-06

To study the mechanism underlying the liver damage induced by deep-fried oil (DO) consumption and the beneficial effects of resistant starch (RS) supplementation, differential gene expression and pathway networks were analyzed based on RNA sequencing data from rats. The up- and down-regulated genes and the corresponding signaling pathways were used to construct a novel local gene network (LGN). The topology of the network showed the characteristics of a small-world network, with some pathways having a high degree. Some gene changes were associated with a higher probability of disease or infection with DO intake. More importantly, the main pathways were almost the same between the two LGNs (30 of the 48 pathways overlapped) based on the gene expression profiles. This finding may indicate that the RS supplement in a DO-containing diet mainly regulates the genes related to DO damage, and that RS in the diet may provide direct signals to the liver cells and modulate its effect through a network involving complex gene regulatory events. This is the first attempt to reveal the mechanism by which RS supplementation attenuates liver dysfunction in a DO-containing diet using differential gene expression and pathway networks. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
An ultrastructural study of calcitonin gene-related peptide-immunoreactive nerve fibers innervating the rat posterior longitudinal ligament. A morphologic basis for their possible efferent actions.

PubMed

Imai, S; Konttinen, Y T; Tokunaga, Y; Maeda, T; Hukuda, S; Santavirta, S

1997-09-01

The present study investigated ultrastructural characteristics of calcitonin gene-related peptide-immunoreactive nerve fibers in the posterior longitudinal ligament of the rat lumbar spine. To provide a morphologic basis for assessment of the afferent and, in particular, efferent functions of calcitonin gene-related peptide immunoreactive nerves in the posterior longitudinal ligament and their eventual role in degenerative spondylarthropathies and low back pain. Previous studies using light-microscopic localization of sensory neuronal markers such as calcitonin gene-related peptide have reported the presence of sensory fibers in the supporting structures of the vertebral column. Meanwhile, accumulating research data have suggested efferent properties for calcitonin gene-related peptide, i.e., a trophic action that alters the intrinsic properties of target cells not through transient action of synaptic transmission, but through long-lasting signal transmission by the secreted neuropeptides. To verify such trophic, paracrine actions of the calcitonin gene-related peptide-containing fibers in the posterior longitudinal ligament, however, ultrastructural details of the terminals and their spatial relationship to their eventual target structures have to be elucidated. Rat posterior longitudinal ligaments were stained immunohistochemically for calcitonin gene-related peptide. Light-microscopic analysis of the semithin sections facilitated subsequent electron microscopy of specific sites of the posterior longitudinal ligament to determine ultrastructural details and nerve fiber-target relationships. The rat lumbar posterior longitudinal ligament was found to be innervated by two distinctive calcitonin gene-related peptide immunoreactive nerve networks. 
In immunoelectron microscopy, the fibers of the deep network had numerous free nerve endings, whereas those of the superficial network showed spatial associations with other non-calcitonin gene-related peptide immunoreactive components of the network. In both systems, naked axons not covered by Schwann cells made close spatial contact with smooth muscle cells of blood vessels and with resident posterior longitudinal ligament fibroblasts. The ultrastructural characteristics of the innervation of the rat posterior longitudinal ligament would be compatible not only with a nociceptive function, but also with neuromodulatory, vasoregulatory, and trophic functions, as has already been established in some visceral organs.
Prediction of EST functional relationships via literature mining with user-specified parameters.

PubMed

Wang, Hei-Chia; Huang, Tian-Hsiang

2009-04-01

The massive amount of expressed sequence tags (ESTs) gathered over recent years has triggered great interest in efficient applications for genomic research. In particular, EST functional relationships can be used to determine a possible gene network for biological processes of interest. In recent years, many researchers have tried to determine EST functional relationships by analyzing the biological literature. However, it has been challenging to find efficient prediction methods. Moreover, an annotated EST is usually associated with many functions, so successful methods must be able to distinguish between relevant and irrelevant functions based on user specifications. This paper proposes a method to discover functional relationships between ESTs of interest by analyzing literature from the Medical Literature Analysis and Retrieval System Online, with user-specified parameters for selecting keywords. This method performs better than the multiple kernel documents method in setting up a specific threshold for gathering materials. The method is also able to uncover known functional relationships, as shown by a comparison with the Kyoto Encyclopedia of Genes and Genomes database. The reliable EST relationships predicted by the proposed method can help to construct gene networks for specific biological functions of interest.
Dynamic network reconstruction from gene expression data applied to immune response during bacterial infection.

PubMed

Guthke, Reinhard; Möller, Ulrich; Hoffmann, Martin; Thies, Frank; Töpfer, Susanne

2005-04-15

The immune response to bacterial infection represents a complex network of dynamic gene and protein interactions. We present an optimized reverse engineering strategy aimed at reconstructing this kind of interaction network. The proposed approach is based on both microarray data and available biological knowledge. The main kinetics of the immune response were identified by fuzzy clustering of gene expression profiles (time series). The number of clusters was optimized using various evaluation criteria. For each cluster, a representative gene with a high fuzzy membership was chosen in accordance with available physiological knowledge. Hypothetical network structures were then identified by seeking systems of ordinary differential equations whose simulated kinetics could fit the gene expression profiles of the cluster-representative genes. For the construction of hypothetical network structures, singular value decomposition (SVD)-based methods and a newly introduced heuristic Network Generation Method were compared. The proposed novel method found sparser networks and gave better fits to the experimental data. Contact: Reinhard.Guthke@hki-jena.de.
Genomewide Analysis of Aryl Hydrocarbon Receptor Binding Targets Reveals an Extensive Array of Gene Clusters that Control Morphogenetic and Developmental Programs

PubMed Central

Sartor, Maureen A.; Schnekenburger, Michael; Marlowe, Jennifer L.; Reichard, John F.; Wang, Ying; Fan, Yunxia; Ma, Ci; Karyala, Saikumar; Halbleib, Danielle; Liu, Xiangdong; Medvedovic, Mario; Puga, Alvaro

2009-01-01

Background The vertebrate aryl hydrocarbon receptor (AHR) is a ligand-activated transcription factor that regulates cellular responses to environmental polycyclic and halogenated compounds. The naive receptor is believed to reside in an inactive cytosolic complex that translocates to the nucleus and induces transcription of xenobiotic detoxification genes after activation by ligand. Objectives We conducted an integrative genomewide analysis of AHR gene targets in mouse hepatoma cells and determined whether AHR regulatory functions may take place in the absence of an exogenous ligand. Methods The network of AHR-binding targets in the mouse genome was mapped through a multipronged approach involving chromatin immunoprecipitation/chip and global gene expression signatures. The findings were integrated into a prior functional knowledge base from Gene Ontology, interaction networks, Kyoto Encyclopedia of Genes and Genomes pathways, sequence motif analysis, and literature molecular concepts. Results We found the naive receptor in unstimulated cells bound to an extensive array of gene clusters with functions in regulation of gene expression, differentiation, and pattern specification, connecting multiple morphogenetic and developmental programs. Activation by the ligand displaced the receptor from some of these targets toward sites in the promoters of xenobiotic metabolism genes. Conclusions The vertebrate AHR appears to possess unsuspected regulatory functions that may be potential targets of environmental injury. PMID:19654925
A comparison of honeybee (Apis mellifera) queen, worker and drone larvae by RNA-Seq.

PubMed

He, Xu-Jiang; Jiang, Wu-Jun; Zhou, Mi; Barron, Andrew B; Zeng, Zhi-Jiang

2017-11-06

Honeybees (Apis mellifera) have haplodiploid sex determination: males develop from unfertilized eggs and females develop from fertilized ones. Differences in larval food also determine the development of females. Here we compared the total somatic gene expression profiles of 2-day-old and 4-day-old drone, queen and worker larvae by RNA-Seq. The results from a co-expression network analysis on all expressed genes showed that 2-day-old drone and worker larvae were closer to each other in gene expression profile than either was to 2-day-old queen larvae. This indicated that for young larvae (2-day-old) environmental factors such as larval diet have a greater effect on gene expression profiles than ploidy or sex determination. Drones had the most distinct gene expression profiles at the 4-day larval stage, suggesting that haploidy or sex dramatically affects the gene expression of honeybee larvae. Drone larvae showed fewer differences in gene expression between the 2-day and 4-day time points than worker and queen larvae did (598 against 1190 and 1181), suggesting a different pattern of gene expression regulation during the larval development of haploid males compared to diploid females. This study indicates that early in development the queen caste has the most distinct gene expression profile, perhaps reflecting the very rapid growth and morphological specialization of this caste compared to workers and drones. Later in development the haploid male drones have the most distinct gene expression profile, perhaps reflecting the influence of ploidy or sex determination on gene expression. © 2017 Institute of Zoology, Chinese Academy of Sciences.
Reverse-engineering of gene networks for regulating early blood development from single-cell measurements.

PubMed

Wei, Jiangyong; Hu, Xiaohua; Zou, Xiufen; Tian, Tianhai

2017-12-28

Recent advances in omics technologies have raised great opportunities to study large-scale regulatory networks inside the cell. In addition, single-cell experiments have measured gene and protein activities in a large number of cells under the same experimental conditions. However, a significant challenge in computational biology and bioinformatics is how to derive quantitative information from single-cell observations and how to develop sophisticated mathematical models that use this information to describe the dynamic properties of regulatory networks. This work designs an integrated approach to reverse-engineer gene networks regulating early blood development based on single-cell experimental observations. The Wanderlust algorithm is initially used to develop the pseudo-trajectory for the activities of a number of genes. Since the gene expression data in the developed pseudo-trajectory show large fluctuations, we then use Gaussian process regression methods to smooth the gene expression data in order to obtain pseudo-trajectories with much smaller fluctuations. The proposed integrated framework consists of both bioinformatics algorithms to reconstruct the regulatory network and mathematical models using differential equations to describe the dynamics of gene expression. The developed approach is applied to study the network regulating early blood cell development. A graphic model is constructed for a regulatory network with forty genes, and a dynamic model using differential equations is developed for a network of nine genes. Numerical results suggest that the proposed model is able to match experimental data very well. We also examine networks with more regulatory relations, and numerical results show that more regulations may exist. We test the possibility of auto-regulation, but numerical simulations do not support positive auto-regulation. In addition, robustness is used as an important additional criterion to select candidate networks. The results of this work show that the developed approach is an efficient and effective method to reverse-engineer gene networks using single-cell experimental observations.
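The smoothing step described above can be illustrated with a small, dependency-free Python sketch of Gaussian process regression. It computes the posterior mean of a zero-mean GP with an RBF kernel at the observed pseudo-time points. The kernel choice, the length scale, the noise variance, and the synthetic "noisy" series are all assumptions for illustration; none of them are taken from the paper.

```python
import math

def rbf(a, b, length_scale):
    """Squared-exponential (RBF) kernel between two pseudo-time points."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_smooth(t, y, length_scale=3.0, noise_var=0.1):
    """GP posterior mean at the training points: K (K + noise_var * I)^-1 y."""
    n = len(t)
    mean_y = sum(y) / n
    yc = [v - mean_y for v in y]  # centre the data for the zero-mean prior
    K = [[rbf(t[i], t[j], length_scale) for j in range(n)] for i in range(n)]
    Kn = [[K[i][j] + (noise_var if i == j else 0.0) for j in range(n)]
          for i in range(n)]
    alpha = solve(Kn, yc)
    return [mean_y + sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]

def roughness(y):
    """Sum of squared second differences; smaller means smoother."""
    return sum((y[i - 1] - 2 * y[i] + y[i + 1]) ** 2 for i in range(1, len(y) - 1))

# Synthetic pseudo-trajectory: a linear trend plus alternating fluctuations.
t = [float(i) for i in range(10)]
noisy = [0.5 * ti + (0.3 if i % 2 == 0 else -0.3) for i, ti in enumerate(t)]
smoothed = gp_smooth(t, noisy)
```

Comparing roughness(noisy) with roughness(smoothed) gives a quick check that the fluctuations have been damped. In practice the hyperparameters would be fitted to the data rather than fixed by hand as they are here.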
Identification and Analyses of AUX-IAA target genes controlling multiple pathways in developing fiber cells of Gossypium hirsutum L

PubMed Central

Nigam, Deepti; Sawant, Samir V

2013-01-01

Technological development has led to increased interest in systems biology approaches in plants to characterize developmental mechanisms and candidate genes relevant to specific tissue or cell morphology. AUX-IAA proteins are important plant-specific putative transcription factors. There are several reports on the physiological response of this family in Arabidopsis, but in cotton fiber the transcriptional network through which AUX-IAA regulates its target genes is still unknown. In silico modelling of cotton fiber development-specific gene expression data (108 microarrays and 22,737 genes) using the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) reveals 3690 putative AUX-IAA target genes, of which 139 were known to be AUX-IAA co-regulated within Arabidopsis. Furthermore, the AUX-IAA targeted gene regulatory network (GRN) had a substantial impact on the transcriptional dynamics of cotton fiber, as shown by altered TF networks and by the Gene Ontology (GO) biological processes and metabolic pathways associated with its target genes. Analysis of the AUX-IAA-correlated gene network reveals multiple functions for AUX-IAA target genes such as unidimensional cell growth, cellular nitrogen compound metabolic process, nucleosome organization, DNA-protein complex and processes related to the cell wall. These candidate networks/pathways have a variety of profound impacts on such cellular functions as stress response, cell proliferation, and cell differentiation. While these functions are fairly broad, their underlying TF networks may provide a global view of AUX-IAA regulated gene expression and a GRN that guides future studies in understanding the role of AUX-IAA box protein and its targets in regulating fiber development. PMID:24497725
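ARACNe, as generally described, scores gene pairs by mutual information (MI) and then prunes likely indirect edges using the data processing inequality (DPI): in any fully connected triplet, the weakest edge is treated as indirect and removed. The Python sketch below shows only that pruning step, in simplified form (no DPI tolerance, no MI estimation, no significance testing). The gene labels and MI values are invented for illustration and do not come from the cotton fiber study.

```python
from itertools import combinations

def dpi_prune(mi, threshold=0.0):
    """Simplified DPI pruning.

    mi maps frozenset({gene_a, gene_b}) to a mutual information score.
    Edges at or below the threshold are dropped first; then, for every
    fully connected triplet, the edge with the smallest MI is removed.
    """
    edges = {e: v for e, v in mi.items() if v > threshold}
    nodes = sorted({n for e in edges for n in e})
    to_remove = set()
    for a, b, c in combinations(nodes, 3):
        triangle = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(e in edges for e in triangle):
            # Decide on the original scores so the result does not
            # depend on the order in which triplets are visited.
            to_remove.add(min(triangle, key=lambda e: edges[e]))
    return {e: v for e, v in edges.items() if e not in to_remove}

# Hypothetical MI scores for a transcription factor and three candidate targets.
mi = {
    frozenset(("TF", "target1")): 0.9,
    frozenset(("target1", "target2")): 0.8,
    frozenset(("TF", "target2")): 0.3,   # weakest edge in the triangle
    frozenset(("TF", "target3")): 0.05,  # below the MI threshold
}
network = dpi_prune(mi, threshold=0.1)
```

In this toy input the TF-target2 edge is removed as the weakest side of the triangle, leaving TF-target1 and target1-target2 as the retained interactions.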
LRH-1 and PTF1-L coregulate an exocrine pancreas-specific transcriptional network for digestive function.

PubMed

Holmstrom, Sam R; Deering, Tye; Swift, Galvin H; Poelwijk, Frank J; Mangelsdorf, David J; Kliewer, Steven A; MacDonald, Raymond J

2011-08-15

We have determined the cistrome and transcriptome for the nuclear receptor liver receptor homolog-1 (LRH-1) in exocrine pancreas. Chromatin immunoprecipitation (ChIP)-seq and RNA-seq analyses reveal that LRH-1 directly induces expression of genes encoding digestive enzymes and secretory and mitochondrial proteins. LRH-1 cooperates with the pancreas transcription factor 1-L complex (PTF1-L) in regulating exocrine pancreas-specific gene expression. Elimination of LRH-1 in adult mice reduced the concentration of several lipases and proteases in pancreatic fluid and impaired pancreatic fluid secretion in response to cholecystokinin. Thus, LRH-1 is a key regulator of the exocrine pancreas-specific transcriptional network required for the production and secretion of pancreatic fluid.
Text mining and network analysis to find functional associations of genes in high altitude diseases.

PubMed

Bhasuran, Balu; Subramanian, Devika; Natarajan, Jeyakumar

2018-05-02

Travel to elevations above 2500 m is associated with the risk of developing one or more forms of acute altitude illness such as acute mountain sickness (AMS), high altitude cerebral edema (HACE) or high altitude pulmonary edema (HAPE). Our work aims to identify the functional association of genes involved in high altitude diseases. In this work we identified the gene networks responsible for high altitude diseases by using the principle of gene co-occurrence statistics from literature and network analysis. First, we mined the literature data from PubMed on high-altitude diseases, and extracted the co-occurring gene pairs. Next, based on their co-occurrence frequency, gene pairs were ranked. Finally, a gene association network was created using statistical measures to explore potential relationships. Network analysis results revealed that EPO, ACE, IL6 and TNF are the top five genes that were found to co-occur with 20 or more genes, while the association between EPAS1 and EGLN1 genes is strongly substantiated. The network constructed from this study proposes a large number of genes that work in-toto in high altitude conditions. Overall, the result provides a good reference for further study of the genetic relationships in high altitude diseases. Copyright © 2018 Elsevier Ltd. All rights reserved.
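The pipeline outlined above (extract co-occurring gene pairs, rank them by frequency, count how many partners each gene has) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the three sentences are invented stand-ins for PubMed abstracts, gene mentions are matched by exact token, and no statistical measure is applied to the counts.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(abstracts, gene_symbols):
    """Count, for each gene pair, the number of abstracts mentioning both."""
    pair_counts = Counter()
    for text in abstracts:
        tokens = set(text.replace(",", " ").replace(".", " ").split())
        present = sorted(g for g in gene_symbols if g in tokens)
        pair_counts.update(combinations(present, 2))
    return pair_counts

def partner_counts(pair_counts):
    """Number of distinct genes each gene co-occurs with."""
    partners = {}
    for a, b in pair_counts:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return {g: len(p) for g, p in partners.items()}

# Invented example sentences, not real abstracts.
abstracts = [
    "EPAS1 and EGLN1 variants were studied in highlanders.",
    "EPAS1, EGLN1 and EPO levels differed between groups.",
    "IL6 and TNF rose with EPO during ascent.",
]
genes = {"EPAS1", "EGLN1", "EPO", "IL6", "TNF", "ACE"}

counts = cooccurrence_counts(abstracts, genes)
ranked = counts.most_common()      # gene pairs, most frequent first
degrees = partner_counts(counts)   # distinct co-occurrence partners per gene
```

With this toy input the top-ranked pair is EGLN1 and EPAS1 (two shared sentences), and EPO has the most partners. A real analysis would also need gene name normalization and a significance measure for the co-occurrence counts.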
Meta-review of protein network regulating obesity between validated obesity candidate genes in the white adipose tissue of high-fat diet-induced obese C57BL/6J mice.

PubMed

Kim, Eunjung; Kim, Eun Jung; Seo, Seung-Won; Hur, Cheol-Goo; McGregor, Robin A; Choi, Myung-Sook

2014-01-01

Worldwide obesity and related comorbidities are increasing, but identifying new therapeutic targets remains a challenge. A plethora of microarray studies in diet-induced obesity models has provided large datasets of obesity associated genes. In this review, we describe an approach to examine the underlying molecular network regulating obesity, and we discuss interactions between obesity candidate genes. We conducted network analysis on functional protein-protein interactions associated with 25 obesity candidate genes identified in a literature-driven approach based on published microarray studies of diet-induced obesity. The obesity candidate genes were closely associated with lipid metabolism and inflammation. Peroxisome proliferator activated receptor gamma (Pparg) appeared to be a core obesity gene, and obesity candidate genes were highly interconnected, suggesting a coordinately regulated molecular network in adipose tissue. In conclusion, the current network analysis approach may help elucidate the underlying molecular network regulating obesity and identify anti-obesity targets for therapeutic intervention.
Interfacing cellular networks of S. cerevisiae and E. coli: Connecting dynamic and genetic information

PubMed Central

2013-01-01

Background In recent years, various types of cellular networks have penetrated biology and are nowadays used omnipresently for studying eukaryote and prokaryote organisms. Still, the relation and the biological overlap among phenomenological and inferential gene networks, e.g., between the protein interaction network and the gene regulatory network inferred from large-scale transcriptomic data, is largely unexplored. Results We provide in this study an in-depth analysis of the structural, functional and chromosomal relationship between a protein-protein network, a transcriptional regulatory network and an inferred gene regulatory network, for S. cerevisiae and E. coli. Further, we study global and local aspects of these networks and their biological information overlap by comparing, e.g., the functional co-occurrence of Gene Ontology terms by exploiting the available interaction structure among the genes. Conclusions Although the individual networks represent different levels of cellular interactions with global structural and functional dissimilarities, we observe crucial functions of their network interfaces for the assembly of protein complexes, proteolysis, transcription, translation, metabolic and regulatory interactions. Overall, our results shed light on the integrability of these networks and their interfacing biological processes. PMID:23663484
Transcriptomics of mRNA and egg quality in farmed fish: Some recent developments and future directions.

PubMed

Sullivan, Craig V; Chapman, Robert W; Reading, Benjamin J; Anderson, Paul E

2015-09-15

Maternal mRNA transcripts deposited in growing oocytes regulate early development and are under intensive investigation as determinants of egg quality. The research has evolved from single gene studies to microarray and now RNA-Seq analyses in which mRNA expression by virtually every gene can be assessed and related to gamete quality. Such studies have mainly focused on genes changing two- to several-fold in expression between biological states, and have identified scores of candidate genes and a few gene networks whose functioning is related to successful development. However, ever-increasing yields of information from high throughput methods for detecting transcript abundance have far outpaced progress in methods for analyzing the massive quantities of gene expression data, and especially for meaningful relation of whole transcriptome profiles to gamete quality. We have developed a new approach to this problem employing artificial neural networks and supervised machine learning with other novel bioinformatics procedures to discover a previously unknown level of ovarian transcriptome function at which minute changes in expression of a few hundred genes is highly predictive of egg quality. In this paper, we briefly review the progress in transcriptomics of fish egg quality and discuss some future directions for this field of study. Copyright © 2015 Elsevier Inc. All rights reserved.
The western painted turtle genome, a model for the evolution of extreme physiological adaptations in a slowly evolving lineage

PubMed Central

2013-01-01

Background We describe the genome of the western painted turtle, Chrysemys picta bellii, one of the most widespread, abundant, and well-studied turtles. We place the genome into a comparative evolutionary context, and focus on genomic features associated with tooth loss, immune function, longevity, sex differentiation and determination, and the species' physiological capacities to withstand extreme anoxia and tissue freezing. Results Our phylogenetic analyses confirm that turtles are the sister group to living archosaurs, and demonstrate an extraordinarily slow rate of sequence evolution in the painted turtle. The ability of the painted turtle to withstand complete anoxia and partial freezing appears to be associated with common vertebrate gene networks, and we identify candidate genes for future functional analyses. Tooth loss shares a common pattern of pseudogenization and degradation of tooth-specific genes with birds, although the rate of accumulation of mutations is much slower in the painted turtle. Genes associated with sex differentiation generally reflect phylogeny rather than convergence in sex determination functionality. Among gene families that demonstrate exceptional expansions or show signatures of strong natural selection, immune function and musculoskeletal patterning genes are consistently over-represented. Conclusions Our comparative genomic analyses indicate that common vertebrate regulatory networks, some of which have analogs in human diseases, are often involved in the western painted turtle's extraordinary physiological capacities. As these regulatory pathways are analyzed at the functional level, the painted turtle may offer important insights into the management of a number of human health disorders. PMID:23537068
Ancient trade routes shaped the genetic structure of horses in eastern Eurasia.

PubMed

Warmuth, Vera M; Campana, Michael G; Eriksson, Anders; Bower, Mim; Barker, Graeme; Manica, Andrea

2013-11-01

Animal exchange networks have been shown to play an important role in determining gene flow among domestic animal populations. The Silk Road is one of the oldest continuous exchange networks in human history, yet its effectiveness in facilitating animal exchange across large geographical distances and topographically challenging landscapes has never been explicitly studied. Horses are known to have been traded along the Silk Roads; however, extensive movement of horses in connection with other human activities may have obscured the genetic signature of the Silk Roads. To investigate the role of the Silk Roads in shaping the genetic structure of horses in eastern Eurasia, we analysed microsatellite genotyping data from 455 village horses sampled from 17 locations. Using least-cost path methods, we compared the performance of models containing the Silk Roads as corridors for gene flow with models containing single landscape features. We also determined whether the recent isolation of former Soviet Union countries from the rest of Eurasia has affected the genetic structure of our samples. The overall level of genetic differentiation was low, consistent with historically high levels of gene flow across the study region. The spatial genetic structure was characterized by a significant, albeit weak, pattern of isolation by distance across the continent with no evidence for the presence of distinct genetic clusters. Incorporating landscape features considerably improved the fit of the data; however, when we controlled for geographical distance, only the correlation between genetic differentiation and the Silk Roads remained significant, supporting the effectiveness of this ancient trade network in facilitating gene flow across large geographical distances in a topographically complex landscape. © 2013 John Wiley & Sons Ltd.
Genome wide predictions of miRNA regulation by transcription factors.

PubMed

Ruffalo, Matthew; Bar-Joseph, Ziv

2016-09-01

Reconstructing regulatory networks from expression and interaction data is a major goal of systems biology. While much work has focused on trying to experimentally and computationally determine the set of transcription factors (TFs) and microRNAs (miRNAs) that regulate genes in these networks, relatively little work has focused on inferring the regulation of miRNAs by TFs. Such regulation can play an important role in several biological processes including development and disease. The main challenge for predicting such interactions is the very small positive training set currently available. Another challenge is the fact that a large fraction of miRNAs are encoded within genes, making it hard to determine the specific way in which they are regulated. To enable genome wide predictions of TF-miRNA interactions, we extended semi-supervised machine-learning approaches to integrate a large set of different types of data including sequence, expression, ChIP-seq and epigenetic data. As we show, the methods we develop achieve good performance on both a labeled test set, and when analyzing general co-expression networks. We next analyze mRNA and miRNA cancer expression data, demonstrating the advantage of using the predicted set of interactions for identifying more coherent and relevant modules, genes, and miRNAs. The complete set of predictions is available on the supporting website and can be used by any method that combines miRNAs, genes, and TFs. Availability: Code and full set of predictions are available from the supporting website: http://cs.cmu.edu/~mruffalo/tf-mirna/ Contact: zivbj@cs.cmu.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Thermodynamic Constraints Improve Metabolic Networks.

PubMed

Krumholz, Elias W; Libourel, Igor G L

2017-08-08

In pursuit of establishing a realistic metabolic phenotypic space, the reversibility of reactions is thermodynamically constrained in modern metabolic networks. The reversibility constraints follow from heuristic thermodynamic poise approximations that take anticipated cellular metabolite concentration ranges into account. Because constraints reduce the feasible space, draft metabolic network reconstructions may need more extensive reconciliation, and a larger number of genes may become essential. Notwithstanding ubiquitous application, the effect of reversibility constraints on the predictive capabilities of metabolic networks has not been investigated in detail. Instead, work has focused on the implementation and validation of the thermodynamic poise calculation itself. With the advance of fast linear programming-based network reconciliation, the effects of reversibility constraints on network reconciliation and gene essentiality predictions have become feasible and are the subject of this study. Networks with thermodynamically informed reversibility constraints outperformed gene essentiality predictions compared to networks that were constrained with randomly shuffled constraints. Unconstrained networks predicted gene essentiality as accurately as thermodynamically constrained networks, but predicted substantially fewer essential genes. Networks that were reconciled with sequence similarity data and strongly enforced reversibility constraints outperformed all other networks. We conclude that metabolic network analysis confirmed the validity of the thermodynamic constraints, and that thermodynamic poise information is actionable during network reconciliation. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.

STARNET 2: a web-based tool for accelerating discovery of gene regulatory networks using microarray co-expression data

PubMed Central

Jupiter, Daniel; Chen, Hailin; VanBuren, Vincent

2009-01-01

Background Although expression microarrays have become a standard tool used by biologists, analysis of data produced by microarray experiments may still present challenges. Comparison of data from different platforms, organisms, and labs may involve complicated data processing, and inferring relationships between genes remains difficult. Results STARNET 2 is a new web-based tool that allows post hoc visual analysis of correlations that are derived from expression microarray data. STARNET 2 facilitates user discovery of putative gene regulatory networks in a variety of species (human, rat, mouse, chicken, zebrafish, Drosophila, C. elegans, S. cerevisiae, Arabidopsis and rice) by graphing networks of genes that are closely co-expressed across a large heterogeneous set of preselected microarray experiments. For each of the represented organisms, raw microarray data were retrieved from NCBI's Gene Expression Omnibus for a selected Affymetrix platform. All pairwise Pearson correlation coefficients were computed for expression profiles measured on each platform, respectively. These precompiled results were stored in a MySQL database, and supplemented by additional data retrieved from NCBI. A web-based tool allows user-specified queries of the database, centered at a gene of interest. The result of a query includes graphs of correlation networks, graphs of known interactions involving genes and gene products that are present in the correlation networks, and initial statistical analyses. Two analyses may be performed in parallel to compare networks, which is facilitated by the new HEATSEEKER module. Conclusion STARNET 2 is a useful tool for developing new hypotheses about regulatory relationships between genes and gene products, and has coverage for 10 species. 
Interpretation of the correlation networks is supported with a database of previously documented interactions, a test for enrichment of Gene Ontology terms, and heat maps of correlation distances that may be used to compare two networks. The list of genes in a STARNET network may be useful in developing a list of candidate genes to use for the inference of causal networks. The tool is freely available and does not require user registration. PMID:19828039
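The core computation described above, pairwise Pearson correlation of expression profiles followed by a query centered on a gene of interest, might look roughly like the sketch below. It is not STARNET 2's code: the function names, the 0.8 threshold, the gene labels and the expression values are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_neighbors(profiles, center, threshold=0.8):
    """Genes whose |r| with the center gene meets the (assumed) threshold."""
    hits = []
    for gene, values in profiles.items():
        if gene == center:
            continue
        r = pearson(profiles[center], values)
        if abs(r) >= threshold:
            hits.append((gene, r))
    # strongest absolute correlation first
    return sorted(hits, key=lambda hit: -abs(hit[1]))


# Toy expression profiles across four arrays (invented values).
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],    # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],    # falls as geneA rises
    "geneD": [1.0, -1.0, -1.0, 1.0],  # unrelated to geneA
}

neighbors = correlation_neighbors(toy_profiles, "geneA")
```

Precomputing all pairwise coefficients once, as the abstract describes, trades storage for fast gene-centered queries; the sketch recomputes them on demand only to stay short.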
Application of artificial neural network model combined with four biomarkers in auxiliary diagnosis of lung cancer.

PubMed

Duan, Xiaoran; Yang, Yongli; Tan, Shanjuan; Wang, Sihua; Feng, Xiaolei; Cui, Liuxin; Feng, Feifei; Yu, Songcheng; Wang, Wei; Wu, Yongjun

2017-08-01

The purpose of the study was to explore the application of an artificial neural network model in the auxiliary diagnosis of lung cancer and to compare a back-propagation (BP) neural network with a Fisher discrimination model for lung cancer screening based on the combined detection of four biomarkers: p16, RASSF1A and FHIT gene promoter methylation levels and the relative telomere length. Real-time quantitative methylation-specific PCR was used to detect the promoter methylation levels of the three genes, and real-time PCR was applied to determine the relative telomere length. BP neural network and Fisher discrimination analysis were used to establish the discrimination diagnosis model. The promoter methylation levels of the three genes in patients with lung cancer were significantly higher than those of the normal controls. The values of Z(P) in two groups were 2.641 (0.008), 2.075 (0.038) and 3.044 (0.002), respectively. The relative telomere lengths of patients with lung cancer (0.93 ± 0.32) were significantly lower than those of the normal controls (1.16 ± 0.57), t = 4.072, P < 0.001. The areas under the ROC curve (AUC) and 95% CI of prediction set from Fisher discrimination analysis and BP neural network were 0.670 (0.569-0.761) and 0.760 (0.664-0.840). The AUC of BP neural network was higher than that of Fisher discrimination analysis, and Z(P) was 0.76. Four biomarkers are associated with lung cancer. BP neural network model for the prediction of lung cancer is better than Fisher discrimination analysis, and it can provide an excellent and intelligent diagnosis tool for lung cancer.
On construction of stochastic genetic networks based on gene expression sequences.

PubMed

Ching, Wai-Ki; Ng, Michael M; Fung, Eric S; Akutsu, Tatsuya

2005-08-01

Reconstruction of genetic regulatory networks from time series data of gene expression patterns is an important research topic in bioinformatics. Probabilistic Boolean Networks (PBNs) have been proposed as an effective model for gene regulatory networks. PBNs are able to cope with uncertainty, incorporate rule-based dependencies between genes and discover the sensitivity of genes in their interactions with other genes. However, PBNs are unlikely to be used directly in practice because of the huge computational cost of obtaining predictors and their corresponding probabilities. In this paper, we propose a multivariate Markov model for approximating PBNs and describing the dynamics of a genetic network for gene expression sequences. The main contribution of the new model is to preserve the strength of PBNs and reduce the complexity of the networks. The number of parameters of our proposed model is O(n^2) where n is the number of genes involved. We also develop efficient estimation methods for solving the model parameters. Numerical examples on synthetic data sets and practical yeast data sequences are given to demonstrate the effectiveness of the proposed model.
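One way to see where a parameter count that grows as n^2 can come from is to estimate one transition matrix per ordered pair of genes. The sketch below does this by simple frequency counting on binary expression sequences. The abstract does not give the estimation method, so the counting scheme, the function name and the toy sequences are assumptions, not the authors' procedure.

```python
def transition_matrix(source, target, n_states=2):
    """Estimate P[next_state][current_state]: the state of `target` at time
    t+1 given the state of `source` at time t, by frequency counting."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(source[:-1], target[1:]):
        counts[following][current] += 1
    col_totals = [
        sum(counts[row][col] for row in range(n_states))
        for col in range(n_states)
    ]
    # normalise each column; fall back to uniform if a state was never seen
    return [
        [
            counts[row][col] / col_totals[col] if col_totals[col] else 1.0 / n_states
            for col in range(n_states)
        ]
        for row in range(n_states)
    ]


# Toy binary expression sequences for two genes (invented values).
toy_sequences = {
    "g1": [0, 1, 0, 1, 0, 1],
    "g2": [0, 0, 1, 0, 1, 0],
}

# One matrix per ordered (target, source) pair: n genes give n * n matrices.
matrices = {
    (target, source): transition_matrix(toy_sequences[source], toy_sequences[target])
    for target in toy_sequences
    for source in toy_sequences
}
```

Here g2 simply copies g1 with a one-step delay, so the matrix for the pair ("g2", "g1") comes out as the identity.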
Systematic identification of an integrative network module during senescence from time-series gene expression.

PubMed

Park, Chihyun; Yun, So Jeong; Ryu, Sung Jin; Lee, Soyoung; Lee, Young-Sam; Yoon, Youngmi; Park, Sang Chul

2017-03-15

Cellular senescence irreversibly arrests growth of human diploid cells. In addition, recent studies have indicated that senescence is a multi-step evolving process related to important complex biological processes. Most studies analyzed only the genes and their functions representing each senescence phase without considering gene-level interactions and continuously perturbed genes. It is necessary to reveal the genotypic mechanism inferred by affected genes and their interaction underlying the senescence process. We suggested a novel computational approach to identify an integrative network which profiles an underlying genotypic signature from time-series gene expression data. The relatively perturbed genes were selected for each time point based on a proposed scoring measure, the perturbation score. Then, the selected genes were integrated with protein-protein interactions to construct time-point-specific networks. From these constructed networks, the edges conserved across time points were extracted to form the common network, and a statistical test was performed to demonstrate that the network could explain the phenotypic alteration. As a result, it was confirmed that the difference in average perturbation scores of the common network between two time points could explain the phenotypic alteration. We also performed functional enrichment on the common network and identified high association with phenotypic alteration. Remarkably, we observed that the identified cell cycle specific common network played an important role in replicative senescence as a key regulator. Heretofore, network analysis of time-series gene expression data has focused on which topological structures change over time. In contrast, we focused on the conserved structure whose context changes over time and showed that it can explain the phenotypic changes. 
We expect that the proposed method will help to elucidate the biological mechanism unrevealed by the existing approaches.
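The step of extracting edges conserved across time-point-specific networks amounts to a set intersection over undirected edges. A minimal sketch, assuming each network is given as a list of gene pairs; the function name and the placeholder gene labels are invented for illustration, and the perturbation score itself is not reproduced because the abstract does not define it.

```python
def conserved_edges(networks):
    """Return the undirected edges present in every time-point network."""
    # frozenset makes ("g1", "g2") and ("g2", "g1") the same edge
    edge_sets = [{frozenset(edge) for edge in edges} for edges in networks]
    return set.intersection(*edge_sets) if edge_sets else set()


# Toy time-point networks with placeholder gene labels (invented).
toy_networks = [
    [("g1", "g2"), ("g3", "g4"), ("g1", "g5")],
    [("g2", "g1"), ("g3", "g4")],
    [("g1", "g2"), ("g4", "g3"), ("g6", "g7")],
]

common = conserved_edges(toy_networks)
```

Only the g1-g2 and g3-g4 edges survive, since they are the only ones present at all three time points.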
A novel heat shock protein alpha 8 (Hspa8) molecular network mediating responses to stress- and ethanol-related behaviors.

PubMed

Urquhart, Kyle R; Zhao, Yinghong; Baker, Jessica A; Lu, Ye; Yan, Lei; Cook, Melloni N; Jones, Byron C; Hamre, Kristin M; Lu, Lu

2016-04-01

Genetic differences mediate individual differences in susceptibility and responses to stress and ethanol, although the specific molecular pathways that control these responses are not fully understood. Heat shock protein alpha 8 (Hspa8) is a molecular chaperone and member of the heat shock protein family that plays an integral role in the stress response and that has been implicated as an ethanol-responsive gene. Therefore, we assessed its role in mediating responses to stress and ethanol across varying genetic backgrounds. The hippocampus is an important mediator of these responses, and thus, was examined in the BXD family of mice in this study. We conducted bioinformatic analyses to dissect genetic factors modulating Hspa8 expression, identify downstream targets of Hspa8, and examine its role. Hspa8 is trans-regulated by a gene or genes on chromosome 14 and is part of a molecular network that regulates stress- and ethanol-related behaviors. To determine additional components of this network, we identified direct or indirect targets of Hspa8 and show that these genes, as predicted, participate in processes such as protein folding and organic substance metabolic processes. Two phenotypes that map to the Hspa8 locus are anxiety-related and numerous other anxiety- and/or ethanol-related behaviors significantly correlate with Hspa8 expression. To more directly assay this relationship, we examined differences in gene expression following exposure to stress or alcohol and showed treatment-related differential expression of Hspa8 and a subset of the members of its network. Our findings suggest that Hspa8 plays a vital role in genetic differences in responses to stress and ethanol and their interactions.
“Guilt by Association” Is the Exception Rather Than the Rule in Gene Networks

PubMed Central

Gillis, Jesse; Pavlidis, Paul

2012-01-01

Gene networks are commonly interpreted as encoding functional information in their connections. An extensively validated principle called guilt by association states that genes which are associated or interacting are more likely to share function. Guilt by association provides the central top-down principle for analyzing gene networks in functional terms or assessing their quality in encoding functional information. In this work, we show that functional information within gene networks is typically concentrated in only a very few interactions whose properties cannot be reliably related to the rest of the network. In effect, the apparent encoding of function within networks has been largely driven by outliers whose behaviour cannot even be generalized to individual genes, let alone to the network at large. While experimentalist-driven analysis of interactions may use prior expert knowledge to focus on the small fraction of critically important data, large-scale computational analyses have typically assumed that high-performance cross-validation in a network is due to a generalizable encoding of function. Because we find that gene function is not systemically encoded in networks, but dependent on specific and critical interactions, we conclude it is necessary to focus on the details of how networks encode function and what information computational analyses use to extract functional meaning. We explore a number of consequences of this and find that network structure itself provides clues as to which connections are critical and that systemic properties, such as scale-free-like behaviour, do not map onto the functional connectivity within networks. PMID:22479173
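For readers unfamiliar with how guilt by association is usually operationalised, a simple neighbor-voting score is sketched below: a gene is scored by the fraction of its network neighbors already annotated with the function. The abstract does not state which scoring scheme the authors evaluated, so this particular form, the function name and the toy network are assumptions.

```python
def neighbor_vote(adjacency, annotated, gene):
    """Fraction of a gene's neighbors that carry the function of interest."""
    neighbors = adjacency.get(gene, set())
    if not neighbors:
        return 0.0
    return sum(1 for n in neighbors if n in annotated) / len(neighbors)


# Toy undirected network with placeholder gene labels (invented).
toy_adjacency = {
    "g1": {"g2", "g3", "g4"},
    "g2": {"g1"},
    "g3": {"g1"},
    "g4": {"g1", "g5"},
    "g5": {"g4"},
}
toy_annotated = {"g2", "g3"}  # genes assumed to carry the function

score_g1 = neighbor_vote(toy_adjacency, toy_annotated, "g1")
score_g5 = neighbor_vote(toy_adjacency, toy_annotated, "g5")
```

A score like this depends entirely on the few edges that link a gene to annotated neighbors, which is the kind of sensitivity to individual interactions the abstract highlights.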
Functional signaling and gene regulatory networks between the oocyte and the surrounding cumulus cells.

PubMed

Biase, Fernando H; Kimble, Katelyn M

2018-05-10

The maturation and successful acquisition of developmental competence by an oocyte, the female gamete, during folliculogenesis is highly dependent on molecular interactions with somatic cells. Most of the cellular interactions identified, thus far, are modulated by growth factors, ions or metabolites. We hypothesized that this interaction is also modulated at the transcriptional level, which leads to the formation of gene regulatory networks between the oocyte and cumulus cells. We tested this hypothesis by analyzing transcriptome data from single oocytes and the surrounding cumulus cells collected from antral follicles employing an analytical framework to determine interdependencies at the transcript level. We overlapped our transcriptome data with putative protein-protein interactions and identified hundreds of ligand-receptor pairs that can transduce paracrine signaling between an oocyte and cumulus cells. We determined that 499 ligand-encoding genes expressed in oocytes and cumulus cells are functionally associated with transcription regulation (FDR < 0.05). Ligand-encoding genes with specific expression in oocytes or cumulus cells were enriched for biological functions that are likely associated with the coordinated formation of transzonal projections from cumulus cells that reach the oocyte's membrane. Thousands of gene pairs exhibit significant linear co-expression (absolute correlation > 0.85, FDR < 1.8 × 10^-5) patterns between oocytes and cumulus cells. Hundreds of co-expressing genes showed clustering patterns associated with biological functions (FDR < 0.5) necessary for a coordinated function between the oocyte and cumulus cells during folliculogenesis (i.e. regulation of transcription, translation, apoptosis, cell differentiation and transport). Our analyses revealed a complex and functional gene regulatory circuit between the oocyte and surrounding cumulus cells. 
The regulatory profile of each cumulus-oocyte complex is likely associated with the oocytes' developmental potential to derive an embryo.
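The FDR thresholds quoted in this abstract imply a multiple-testing correction over many gene pairs. The abstract does not name the procedure, so the sketch below uses the widely used Benjamini-Hochberg adjustment purely as an assumed stand-in; the function name and the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy p-values for four gene pairs (invented).
toy_p_values = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(toy_p_values)

# Keep the pairs whose adjusted p-value falls below an assumed 0.05 cutoff.
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
```

In a co-expression screen, a filter of this kind would be combined with the correlation cutoff (here, absolute correlation > 0.85) so that a pair must pass both criteria.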
Pluripotency and lineages in the mammalian blastocyst: an evolutionary view.

PubMed

Cañon, Susana; Fernandez-Tresguerres, Beatriz; Manzanares, Miguel

2011-06-01

Early mammalian development is characterized by a highly specific stage, the blastocyst, by which embryonic and extraembryonic lineages have been determined, but pattern formation has not yet begun. The blastocyst is also of interest because cell precursors of the embryo proper retain for a certain time the capability to generate all the cell types of the adult animal. This embryonic pluripotency is established and maintained by a regulatory network under the control of a small set of transcription factors, comprising Oct4, Sox2 and Nanog. This network is largely conserved in eutherian mammals, but there is scarce information about how it arose in vertebrates. We have analysed the conservation of gene regulatory networks controlling blastocyst lineages and pluripotency in the mouse by comparison with the chick. We found that few elements of the network are novel to mammals; rather, most of them were present before the separation of the mammalian lineage from other amniotes, but acquired novel expression domains during early mammalian development. Our results strongly support the hypothesis that mammalian blastocyst regulatory networks evolved through rewiring of pre-existing components, involving the co-option and duplication of existing genes and the establishment of new regulatory interactions among them.
From Gene Trees to a Dated Allopolyploid Network: Insights from the Angiosperm Genus Viola (Violaceae)

PubMed Central

Marcussen, Thomas; Heier, Lise; Brysting, Anne K.; Oxelman, Bengt; Jakobsen, Kjetill S.

2015-01-01

Allopolyploidization accounts for a significant fraction of speciation events in many eukaryotic lineages. However, existing phylogenetic and dating methods require tree-like topologies and are unable to handle the network-like phylogenetic relationships of lineages containing allopolyploids. No explicit framework has so far been established for evaluating competing network topologies, and few attempts have been made to date phylogenetic networks. We used a four-step approach to generate a dated polyploid species network for the cosmopolitan angiosperm genus Viola L. (Violaceae Batch.). The genus contains ca 600 species and both recent (neo-) and more ancient (meso-) polyploid lineages distributed over 16 sections. First, we obtained DNA sequences of three low-copy nuclear genes and one chloroplast region, from 42 species representing all 16 sections. Second, we obtained fossil-calibrated chronograms for each nuclear gene marker. Third, we determined the most parsimonious multilabeled genome tree and its corresponding network, resolved at the section (not the species) level. Reconstructing the “correct” network for a set of polyploids depends on recovering all homoeologs, i.e., all subgenomes, in these polyploids. Assuming the presence of Viola subgenome lineages that were not detected by the nuclear gene phylogenies (“ghost subgenome lineages”) significantly reduced the number of inferred polyploidization events. We identified the most parsimonious network topology from a set of five competing scenarios differing in the interpretation of homoeolog extinctions and lineage sorting, based on (i) fewest possible ghost subgenome lineages, (ii) fewest possible polyploidization events, and (iii) least possible deviation from expected ploidy as inferred from available chromosome counts of the involved polyploid taxa. Finally, we estimated the homoploid and polyploid speciation times of the most parsimonious network. 
Homoploid speciation times were estimated by coalescent analysis of gene tree node ages. Polyploid speciation times were estimated by comparing branch lengths and speciation rates of lineages with and without ploidy shifts. Our analyses recognize Viola as an old genus (crown age 31 Ma) whose evolutionary history has been profoundly affected by allopolyploidy. Between 16 and 21 allopolyploidizations are necessary to explain the diversification of the 16 major lineages (sections) of Viola, suggesting that allopolyploidy has accounted for a high percentage—between 67% and 88%—of the speciation events at this level. The theoretical and methodological approaches presented here for (i) constructing networks and (ii) dating speciation events within a network, have general applicability for phylogenetic studies of groups where allopolyploidization has occurred. They make explicit use of a hitherto underexplored source of ploidy information from chromosome counts to help resolve phylogenetic cases where incomplete sequence data hampers network inference. Importantly, the coalescent-based method used herein circumvents the assumption of tree-like evolution required by most techniques for dating speciation events. PMID:25281848
GENE EXPRESSION NETWORKS

EPA Science Inventory

"Gene expression network" is the term used to describe the interplay, simple or complex, between two or more gene products in performing a specific cellular function. Although the delineation of such networks is complicated by the existence of multiple and subtle types of intera...
Gene regulatory networks and the underlying biology of developmental toxicity

EPA Science Inventory

Embryonic cells are specified by large-scale networks of functionally linked regulatory genes. Knowledge of the relevant gene regulatory networks is essential for understanding phenotypic heterogeneity that emerges from disruption of molecular functions, cellular processes or sig...
Finding novel relationships with integrated gene-gene association network analysis of Synechocystis sp. PCC 6803 using species-independent text-mining.

PubMed

Kreula, Sanna M; Kaewphan, Suwisa; Ginter, Filip; Jones, Patrik R

2018-01-01

The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from 'reading the literature'. The utility of text-mining for well-studied species is obvious, though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already 'known', and (2) demanding multiple independent lines of evidence for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to (i) discover novel candidate associations between different genes or proteins in the network, and (ii) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource.
Gene expression complex networks: synthesis, identification, and analysis.

PubMed

Lopes, Fabrício M; Cesar, Roberto M; Costa, Luciano Da F

2011-10-01

Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem is how to validate such approaches and their results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdős-Rényi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabási-Albert (BA), and geographical networks (GG).
The experimental results indicate that the inference method was sensitive to variation in average degree: its network recovery rate decreased as the average degree increased. The signal size was important for the accuracy of the inference method's network identification rate, which presented very good results with small expression profiles. However, the adopted inference method was not sensitive enough to recognize distinct structures of interaction among genes, behaving similarly when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, and it can be extended to other inference methods.
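The validation idea in this abstract (generate an artificial network, infer a network from simulated data, then compare the two) can be illustrated with a minimal sketch. It rests on assumptions not taken from the record: the function names are hypothetical, the ER generator is a plain directed random graph, and precision and recall stand in for the "network recovery rate", whose exact definition is not given here.

```python
import random

def erdos_renyi_edges(n_genes, p, seed=0):
    """Directed Erdos-Renyi graph: each ordered pair (i, j) with i != j
    becomes an edge independently with probability p (illustrative only)."""
    rng = random.Random(seed)
    return {(i, j)
            for i in range(n_genes)
            for j in range(n_genes)
            if i != j and rng.random() < p}

def recovery_scores(true_edges, inferred_edges):
    """Return (precision, recall) of the inferred edge set against the original."""
    true_positives = len(true_edges & inferred_edges)
    precision = true_positives / len(inferred_edges) if inferred_edges else 0.0
    recall = true_positives / len(true_edges) if true_edges else 0.0
    return precision, recall

# Hand-built toy case so the scores are easy to check by eye.
original = {(0, 1), (1, 2), (2, 3), (3, 0)}
inferred = {(0, 1), (1, 2), (0, 2)}   # two correct edges, one spurious
precision, recall = recovery_scores(original, inferred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In a fuller experiment one would presumably replace the hand-built `inferred` set with the output of an actual inference method run on data simulated from `erdos_renyi_edges(...)`.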
Ontology-based literature mining of E. coli vaccine-associated gene interaction networks.

PubMed

Hur, Junguk; Özgür, Arzucan; He, Yongqun

2017-03-14

Pathogenic Escherichia coli infections cause various diseases in humans and many animal species. However, despite extensive E. coli vaccine research, we are still unable to fully protect ourselves against E. coli infections. To support more rational development of effective and safe E. coli vaccines, it is important to better understand E. coli vaccine-associated gene interaction networks. In this study, we first extended the Vaccine Ontology (VO) to semantically represent various E. coli vaccines and genes used in the vaccine development. We also normalized E. coli gene names compiled from the annotations of various E. coli strains using a pan-genome-based annotation strategy. The Interaction Network Ontology (INO) includes a hierarchy of various interaction-related keywords useful for literature mining. Using VO, INO, and normalized E. coli gene names, we applied an ontology-based SciMiner literature mining strategy to mine all PubMed abstracts and retrieve E. coli vaccine-associated E. coli gene interactions. Four centrality metrics (i.e., degree, eigenvector, closeness, and betweenness) were calculated for identifying highly ranked genes and interaction types. Using vaccine-related PubMed abstracts, our study identified 11,350 sentences that contain 88 unique INO interaction types and 1,781 unique E. coli genes. Each sentence contained at least one interaction type and two unique E. coli genes. An E. coli gene interaction network of genes and INO interaction types was created. From this large network, a sub-network consisting of 5 E. coli vaccine genes (carA, carB, fimH, fepA, and vat), 62 other E. coli genes, and 25 INO interaction types was identified. While many interaction types represent direct interactions between two indicated genes, our study has also shown that many of these retrieved interaction types are indirect, in that the two genes participated in the specified interaction in a required but indirect way.
Our centrality analysis of these gene interaction networks identified top-ranked E. coli genes and 6 INO interaction types (e.g., regulation and gene expression). A vaccine-related E. coli gene-gene interaction network was constructed using an ontology-based literature mining strategy, which identified important E. coli vaccine genes and their interactions with other genes through specific interaction types.
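Two of the four centrality metrics named in this abstract, degree and closeness, are simple enough to sketch directly. The example below is an illustration under stated assumptions: the graph is undirected and connected, the node names are placeholders rather than real E. coli genes, and the function names are hypothetical and not taken from the study's software.

```python
from collections import deque

def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}

def closeness_centrality(adj):
    """(reachable nodes) / (sum of shortest-path lengths), via breadth-first search."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in adj[current]:
                if neighbour not in dist:
                    dist[neighbour] = dist[current] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

# Placeholder star-shaped network: "hub" touches every other node.
toy_network = {
    "hub": {"g1", "g2", "g3"},
    "g1": {"hub"},
    "g2": {"hub"},
    "g3": {"hub"},
}
degree = degree_centrality(toy_network)
closeness = closeness_centrality(toy_network)
print(degree["hub"], closeness["hub"], closeness["g1"])
```

In the star graph the hub scores 1.0 on both metrics, while each leaf is one step from the hub and two steps from the other leaves, which lowers its closeness.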
Finding pathway-modulating genes from a novel Ontology Fingerprint-derived gene network.

PubMed

Qin, Tingting; Matmati, Nabil; Tsoi, Lam C; Mohanty, Bidyut K; Gao, Nan; Tang, Jijun; Lawson, Andrew B; Hannun, Yusuf A; Zheng, W Jim

2014-10-01

To enhance our knowledge regarding biological pathway regulation, we took an integrated approach, using the biomedical literature, ontologies, network analyses and experimental investigation to infer novel genes that could modulate biological pathways. We first constructed a novel gene network via a pairwise comparison of all yeast genes' Ontology Fingerprints--a set of Gene Ontology terms overrepresented in the PubMed abstracts linked to a gene along with those terms' corresponding enrichment P-values. The network was further refined using a Bayesian hierarchical model to identify novel genes that could potentially influence the pathway activities. We applied this method to the sphingolipid pathway in yeast and found that many top-ranked genes indeed displayed altered sphingolipid pathway functions, initially measured by their sensitivity to myriocin, an inhibitor of de novo sphingolipid biosynthesis. Further experiments confirmed the modulation of the sphingolipid pathway by one of these genes, PFA4, encoding a palmitoyl transferase. Comparative analysis showed that few of these novel genes could be discovered by other existing methods. Our novel gene network provides a unique and comprehensive resource to study pathway modulations and systems biology in general. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Discovering hidden relationships between renal diseases and regulated genes through 3D network visualizations

PubMed Central

2010-01-01

Background: In a recent study, two-dimensional (2D) network layouts were used to visualize and quantitatively analyze the relationship between chronic renal diseases and regulated genes. The results revealed complex relationships between disease type, gene specificity, and gene regulation type, which led to important insights about the underlying biological pathways. Here we describe an attempt to extend our understanding of these complex relationships by reanalyzing the data using three-dimensional (3D) network layouts, displayed through 2D and 3D viewing methods. Findings: The 3D network layout (displayed through the 3D viewing method) revealed that genes implicated in many diseases (non-specific genes) tended to be predominantly down-regulated, whereas genes regulated in a few diseases (disease-specific genes) tended to be up-regulated. This new global relationship was quantitatively validated through comparison to 1000 random permutations of networks of the same size and distribution. Our new finding appeared to be the result of using specific features of the 3D viewing method to analyze the 3D renal network. Conclusions: The global relationship between gene regulation and gene specificity is the first clue from human studies that there exist common mechanisms across several renal diseases, which suggest hypotheses for the underlying mechanisms. Furthermore, the study suggests hypotheses for why the 3D visualization helped to make salient a new regularity that was difficult to detect in 2D. Future research that tests these hypotheses should enable a more systematic understanding of when and how to use 3D network visualizations to reveal complex regularities in biological networks. PMID:21070623
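The validation against 1000 random permutations can be illustrated with a simplified label-permutation test. This is a stand-in rather than the study's procedure, which permuted whole networks: here each gene has a count of diseases it is implicated in and a regulation label, the statistic is the difference in mean disease count between down- and up-regulated genes, and all names and numbers are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(disease_counts, labels, n_perm=1000, seed=0):
    """One-sided empirical p-value for mean(down) - mean(up) under label shuffling.

    Assumes both labels, "down" and "up", occur at least once.
    """
    rng = random.Random(seed)

    def statistic(current_labels):
        down = [c for c, lab in zip(disease_counts, current_labels) if lab == "down"]
        up = [c for c, lab in zip(disease_counts, current_labels) if lab == "up"]
        return mean(down) - mean(up)

    observed = statistic(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)              # keeps the number of each label fixed
        if statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Made-up toy data: down-regulated genes appear in more diseases.
counts = [5, 6, 7, 8, 1, 1, 2, 2]
regulation = ["down"] * 4 + ["up"] * 4
observed, p_value = permutation_p_value(counts, regulation)
print(observed, p_value)
```

A small p-value indicates that the observed difference is rarely matched when regulation labels are assigned at random.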
The effects of graded levels of calorie restriction: VII. Topological rearrangement of hypothalamic aging networks.

PubMed

Derous, Davina; Mitchell, Sharon E; Green, Cara L; Wang, Yingchun; Han, Jing Dong J; Chen, Luonan; Promislow, Daniel E L; Lusseau, David; Speakman, John R; Douglas, Alex

2016-05-01

Connectivity in a gene-gene network declines with age, typically within gene clusters. We explored the effect of short-term (3 months) graded calorie restriction (CR) (up to 40%) on network structure of aging-associated genes in the murine hypothalamus by using conditional mutual information. The networks showed a topological rearrangement when exposed to graded CR with a higher relative within-cluster connectivity at 40CR. We observed changes in gene centrality concordant with changes in CR level, with Ppargc1a and Ppt1 having increased centrality and Etfdh, Traf3 and Abcc1 decreased centrality as CR increased. This change in gene centrality in a graded manner with CR occurred in the absence of parallel changes in gene expression levels. This study emphasizes the importance of augmenting traditional differential gene expression analyses to better understand structural changes in the transcriptome. Overall our results suggested that CR induced changes in centrality of biologically relevant genes that play an important role in preventing the age-associated loss of network integrity irrespective of their gene expression levels.
The effects of graded levels of calorie restriction: VII. Topological rearrangement of hypothalamic aging networks

PubMed Central

Derous, Davina; Mitchell, Sharon E.; Green, Cara L.; Wang, Yingchun; Han, Jing Dong J.; Chen, Luonan; Promislow, Daniel E.L.; Lusseau, David; Speakman, John R.; Douglas, Alex

2016-01-01

Connectivity in a gene-gene network declines with age, typically within gene clusters. We explored the effect of short-term (3 months) graded calorie restriction (CR) (up to 40%) on network structure of aging-associated genes in the murine hypothalamus by using conditional mutual information. The networks showed a topological rearrangement when exposed to graded CR with a higher relative within-cluster connectivity at 40CR. We observed changes in gene centrality concordant with changes in CR level, with Ppargc1a and Ppt1 having increased centrality and Etfdh, Traf3 and Abcc1 decreased centrality as CR increased. This change in gene centrality in a graded manner with CR occurred in the absence of parallel changes in gene expression levels. This study emphasizes the importance of augmenting traditional differential gene expression analyses to better understand structural changes in the transcriptome. Overall our results suggested that CR induced changes in centrality of biologically relevant genes that play an important role in preventing the age-associated loss of network integrity irrespective of their gene expression levels. PMID:27115072
Bipartite Network Analysis of the Archaeal Virosphere: Evolutionary Connections between Viruses and Capsidless Mobile Elements

PubMed Central

Prangishvili, David

2016-01-01

ABSTRACT Archaea and particularly hyperthermophilic crenarchaea are hosts to many unusual viruses with diverse virion shapes and distinct gene compositions. As is typical of viruses in general, there are no universal genes in the archaeal virosphere. Therefore, to obtain a comprehensive picture of the evolutionary relationships between viruses, network analysis methods are more productive than traditional phylogenetic approaches. Here we present a comprehensive comparative analysis of genomes and proteomes from all currently known taxonomically classified and unclassified, cultivated and uncultivated archaeal viruses. We constructed a bipartite network of archaeal viruses that includes two classes of nodes, the genomes and gene families that connect them. Dissection of this network using formal community detection methods reveals strong modularity, with 10 distinct modules and 3 putative supermodules. However, compared to similar previously analyzed networks of eukaryotic and bacterial viruses, the archaeal virus network is sparsely connected. With the exception of the tailed viruses related to bacteriophages of the order Caudovirales and the families Turriviridae and Sphaerolipoviridae that are linked to a distinct supermodule of eukaryotic and bacterial viruses, there are few connector genes shared by different archaeal virus modules. In contrast, most of these modules include, in addition to viruses, capsidless mobile elements, emphasizing tight evolutionary connections between the two types of entities in archaea. The relative contributions of distinct evolutionary origins, in particular from nonviral elements, and insufficient sampling to the sparsity of the archaeal virus network remain to be determined by further exploration of the archaeal virosphere. IMPORTANCE Viruses infecting archaea are among the most mysterious denizens of the virosphere. 
Many of these viruses display no genetic or even morphological relationship to viruses of bacteria and eukaryotes, raising questions regarding their origins and position in the global virosphere. Analysis of 5,740 protein sequences from 116 genomes allowed dissection of the archaeal virus network and showed that most groups of archaeal viruses are evolutionarily connected to capsidless mobile genetic elements, including various plasmids and transposons. This finding could reflect actual independent origins of the distinct groups of archaeal viruses from different nonviral elements, providing important insights into the emergence and evolution of the archaeal virome. PMID:27681128
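The bipartite construction described in this abstract, with genomes as one class of nodes and gene families as the other, can be sketched in a few lines. The genome and family names below are placeholders and the helper names are hypothetical; the sketch only shows how shared gene families link genomes, not the community detection used in the study.

```python
from itertools import combinations

def bipartite_edges(genome_to_families):
    """One edge per (genome, gene family) membership."""
    return {(genome, family)
            for genome, families in genome_to_families.items()
            for family in families}

def shared_family_counts(genome_to_families):
    """Project onto genomes: number of gene families shared by each genome pair."""
    counts = {}
    for a, b in combinations(sorted(genome_to_families), 2):
        shared = len(genome_to_families[a] & genome_to_families[b])
        if shared:
            counts[(a, b)] = shared
    return counts

# Placeholder data: a capsidless element shares a family with one virus.
toy_genomes = {
    "virus1": {"famA", "famB"},
    "virus2": {"famB", "famC"},
    "plasmid1": {"famC"},
    "virus3": {"famD"},
}
edges = bipartite_edges(toy_genomes)
links = shared_family_counts(toy_genomes)
print(len(edges), links)
```

Here `virus3` shares no family with any other genome and so stays unconnected in the projection, which is the kind of sparsity the abstract reports for the archaeal virus network.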
Gene Discovery of Characteristic Metabolic Pathways in the Tea Plant (Camellia sinensis) Using ‘Omics’-Based Network Approaches: A Future Perspective

PubMed Central

Zhang, Shihua; Zhang, Liang; Tai, Yuling; Wang, Xuewen; Ho, Chi-Tang; Wan, Xiaochun

2018-01-01

Characteristic secondary metabolites, including flavonoids, theanine and caffeine, in the tea plant (Camellia sinensis) are the primary sources of the rich flavors, fresh taste, and health benefits of tea. The decoding of genes involved in these characteristic components is still significantly lagging, which poses an obstacle for applied genetic improvement and metabolic engineering. With the popularity of high-throughput transcriptomics and metabolomics, ‘omics’-based network approaches, such as gene co-expression networks and gene-to-metabolite networks, have emerged as powerful tools for gene discovery in plant-specialized (secondary) metabolism. Thus, it is pivotal to summarize and introduce such system-based strategies to facilitate gene identification of characteristic metabolic pathways in the tea plant (or other plants). In this review, we describe recent advances in transcriptomics and metabolomics for transcript and metabolite profiling, and highlight ‘omics’-based network strategies using successful examples in model and non-model plants. Further, we summarize recent progress in ‘omics’ analysis for gene identification of characteristic metabolites in the tea plant. Limitations of the current strategies are discussed by comparison with ‘omics’-based network approaches. Finally, we demonstrate the potential of introducing such network strategies in the tea plant, and close with prospects for network-based discovery of characteristic metabolite genes in the tea plant. PMID:29915604

An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks.

PubMed

Botía, Juan A; Vandrovcova, Jana; Forabosco, Paola; Guelfi, Sebastian; D'Sa, Karishma; Hardy, John; Lewis, Cathryn M; Ryten, Mina; Weale, Michael E

2017-04-12

Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn ). We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (×3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. The results obtained from our investigations indicate that our k-means method, applied as an adjunct to standard WGCNA, results in better network partitions. These improved partitions enable more fruitful downstream analyses, as gene modules are more biologically meaningful.
Investigating the Effects of Imputation Methods for Modelling Gene Networks Using a Dynamic Bayesian Network from Gene Expression Data

PubMed Central

CHAI, Lian En; LAW, Chow Kuan; MOHAMAD, Mohd Saberi; CHONG, Chuii Khim; CHOON, Yee Wen; DERIS, Safaai; ILLIAS, Rosli Md

2014-01-01

Background: Gene expression data often contain missing expression values. Therefore, several imputation methods have been applied to fill in the missing values, including k-nearest neighbour (kNN), local least squares (LLS), and Bayesian principal component analysis (BPCA). However, the effects of these imputation methods on the modelling of gene regulatory networks from gene expression data have rarely been investigated and analysed using a dynamic Bayesian network (DBN). Methods: In the present study, we separately imputed datasets of the Escherichia coli S.O.S. DNA repair pathway and the Saccharomyces cerevisiae cell cycle pathway with kNN, LLS, and BPCA, and subsequently used these to generate gene regulatory networks (GRNs) using a discrete DBN. We made comparisons on the basis of previous studies in order to select the gene network with the least error. Results: We found that BPCA and LLS performed better on larger networks (based on the S. cerevisiae dataset), whereas kNN performed better on smaller networks (based on the E. coli dataset). Conclusion: The results suggest that the performance of each imputation method is dependent on the size of the dataset, and this subsequently affects the modelling of the resultant GRNs using a DBN. In addition, on the basis of these results, a DBN has the capacity to discover potential edges, as well as display interactions, between genes. PMID:24876803
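Of the three imputation methods compared in this abstract, k-nearest neighbour is the easiest to show in miniature. The sketch below is one plausible variant written for illustration, not the implementation used in the study: rows are genes, columns are samples, `None` marks a missing value, distance is the root-mean-square difference over columns observed in both genes, and the function name and data are hypothetical.

```python
import math

def knn_impute(matrix, k=2):
    """Fill each missing entry with the mean of that column over the k nearest genes."""
    imputed = [row[:] for row in matrix]            # leave the input untouched
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(matrix):
                if m == i or other[j] is None:
                    continue                        # neighbour must have column j
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed

# Made-up expression matrix: gene 0 is missing its third sample.
expression = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 5.0],
    [9.0, 9.0, 100.0],
]
filled = knn_impute(expression, k=2)
print(filled[0])
```

With k=2 the distant fourth gene is ignored, so the missing value is filled from the two genes whose observed profiles resemble gene 0.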
Architecture of the human regulatory network derived from ENCODE data.

PubMed

Gerstein, Mark B; Kundaje, Anshul; Hariharan, Manoj; Landt, Stephen G; Yan, Koon-Kiu; Cheng, Chao; Mu, Xinmeng Jasmine; Khurana, Ekta; Rozowsky, Joel; Alexander, Roger; Min, Renqiang; Alves, Pedro; Abyzov, Alexej; Addleman, Nick; Bhardwaj, Nitin; Boyle, Alan P; Cayting, Philip; Charos, Alexandra; Chen, David Z; Cheng, Yong; Clarke, Declan; Eastman, Catharine; Euskirchen, Ghia; Frietze, Seth; Fu, Yao; Gertz, Jason; Grubert, Fabian; Harmanci, Arif; Jain, Preti; Kasowski, Maya; Lacroute, Phil; Leng, Jing Jane; Lian, Jin; Monahan, Hannah; O'Geen, Henriette; Ouyang, Zhengqing; Partridge, E Christopher; Patacsil, Dorrelyn; Pauli, Florencia; Raha, Debasish; Ramirez, Lucia; Reddy, Timothy E; Reed, Brian; Shi, Minyi; Slifer, Teri; Wang, Jing; Wu, Linfeng; Yang, Xinqiong; Yip, Kevin Y; Zilberman-Schapira, Gili; Batzoglou, Serafim; Sidow, Arend; Farnham, Peggy J; Myers, Richard M; Weissman, Sherman M; Snyder, Michael

2012-09-06

Transcription factors bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 transcription-related factors in over 450 distinct experiments. We found the combinatorial, co-association of transcription factors to be highly context specific: distinct combinations of factors bind at specific genomic locations. In particular, there are significant differences in the binding proximal and distal to genes. We organized all the transcription factor binding into a hierarchy and integrated it with other genomic information (for example, microRNA regulation), forming a dense meta-network. Factors at different levels have different properties; for instance, top-level transcription factors more strongly influence expression and middle-level ones co-regulate targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched network motifs (for example, noise-buffering feed-forward loops). Finally, more connected network components are under stronger selection and exhibit a greater degree of allele-specific activity (that is, differential binding to the two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome sequences and understanding basic principles of human biology and disease.
Weighted gene co‑expression network analysis in identification of key genes and networks for ischemic‑reperfusion remodeling myocardium.

PubMed

Guo, Nan; Zhang, Nan; Yan, Liqiu; Lian, Zheng; Wang, Jiawang; Lv, Fengfeng; Wang, Yunfei; Cao, Xufen

2018-06-14

Acute myocardial infarction induces ventricular remodeling, which is implicated in dilated heart and heart failure. The pathogenic mechanism of myocardium remodeling remains to be elucidated. The aim of the present study was to identify key genes and networks for myocardium remodeling following ischemia‑reperfusion (IR). First, the mRNA expression data from the National Center for Biotechnology Information database were downloaded to identify differences in mRNA expression of the IR heart at days 2 and 7. Then, weighted gene co‑expression network analysis, hierarchical clustering, protein‑protein interaction (PPI) network, Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were used to identify key genes and networks for the heart remodeling process following IR. A total of 3,321 differentially expressed genes were identified during the heart remodeling process. A total of 6 modules were identified through gene co‑expression network analysis. GO and KEGG analysis results suggested that each module represented a different biological function and was associated with different pathways. Finally, hub genes of each module were identified by PPI network construction. The present study revealed that heart remodeling following IR is a complicated process, involving extracellular matrix organization, neural development, apoptosis and energy metabolism. The dysregulated genes, including SRC proto‑oncogene, non‑receptor tyrosine kinase; discs large MAGUK scaffold protein 1; ATP citrate lyase; RAN, member RAS oncogene family; tumor protein p53; and polo like kinase 2, may be essential for heart remodeling following IR and may be used as potential targets for the inhibition of heart remodeling following acute myocardial infarction.
Integration of Steady-State and Temporal Gene Expression Data for the Inference of Gene Regulatory Networks

PubMed Central

Wang, Yi Kan; Hurley, Daniel G.; Schnell, Santiago; Print, Cristin G.; Crampin, Edmund J.

2013-01-01

We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data. PMID:23967277
A network-based, integrative study to identify core biological pathways that drive breast cancer clinical subtypes

PubMed Central

Dutta, B; Pusztai, L; Qi, Y; André, F; Lazar, V; Bianchini, G; Ueno, N; Agarwal, R; Wang, B; Shiang, C Y; Hortobagyi, G N; Mills, G B; Symmans, W F; Balázsi, G

2012-01-01

Background: The rapid collection of diverse genome-scale data raises the urgent need to integrate and utilise these resources for biological discovery or biomedical applications. For example, diverse transcriptomic and gene copy number variation data are currently collected for various cancers, but relatively few current methods are capable of utilising the emerging information. Methods: We developed and tested a data-integration method to identify gene networks that drive the biology of breast cancer clinical subtypes. The method simultaneously overlays gene expression and gene copy number data on protein–protein interaction, transcriptional-regulatory and signalling networks by identifying coincident genomic and transcriptional disturbances in local network neighborhoods. Results: We identified distinct driver-networks for each of the three common clinical breast cancer subtypes: oestrogen receptor (ER)+, human epidermal growth factor receptor 2 (HER2)+, and triple receptor-negative breast cancers (TNBC) from patient and cell line data sets. Driver-networks inferred from independent datasets were significantly reproducible. We also confirmed the functional relevance of a subset of randomly selected driver-network members for TNBC in gene knockdown experiments in vitro. We found that TNBC driver-network member genes have increased functional specificity to TNBC cell lines and higher functional sensitivity compared with genes selected by differential expression alone. Conclusion: Clinical subtype-specific driver-networks identified through data integration are reproducible and functionally important. PMID:22343619
Development and use of the Cytoscape app GFD-Net for measuring semantic dissimilarity of gene networks

PubMed Central

Diaz-Montana, Juan J.; Diaz-Diaz, Norberto

2014-01-01

Gene networks are one of the main computational models used to study the interaction between different elements during biological processes, and are widely used to represent gene–gene or protein–protein interaction complexes. We present GFD-Net, a Cytoscape app for visualizing and analyzing the functional dissimilarity of gene networks. PMID:25400907
Microscopy and bioinformatic analyses of lipid metabolism implicate a sporophytic signaling network supporting pollen development in Arabidopsis.

PubMed

Wang, Yixing; Wu, Hong; Yang, Ming

2008-07-01

The Arabidopsis sporophytic tapetum undergoes a programmed degeneration process to secrete lipid and other materials to support pollen development. However, the molecular mechanism regulating the degeneration process is unknown. To gain insight into this molecular mechanism, we first determined that the most critical period for tapetal secretion to support pollen development is from the vacuolate microspore stage to the early binucleate pollen stage. We then analyzed the expression of enzymes responsible for lipid biosynthesis and degradation with available in-silico data. The genes for these enzymes that are expressed in the stamen but not in the concurrent uninucleate microspore and binucleate pollen are of particular interest, as they presumably hold the clues to unique molecular processes in the sporophytic tissues compared to the gametophytic tissue. No gene for lipid biosynthesis but a single gene encoding a patatin-like protein likely for lipid mobilization was identified based on the selection criterion. A search for genes co-expressed with this gene identified additional genes encoding typical signal transduction components such as a leucine-rich repeat receptor kinase, an extra-large G-protein, other protein kinases, and transcription factors. In addition, proteases, cell wall degradation enzymes, and other proteins were also identified. These proteins thus may be components of a signaling network leading to degradation of a broad range of cellular components. Since a broad range of degradation activities is expected to occur only in the tapetal degeneration process at this stage in the stamen, it is further hypothesized that the signaling network acts in the tapetal degeneration process.
Female mating preferences determine system-level evolution in a gene network model.

PubMed

Fierst, Janna L

2013-06-01

Environmental patterns of directional, stabilizing and fluctuating selection can influence the evolution of system-level properties like evolvability and mutational robustness. Intersexual selection produces strong phenotypic selection and these dynamics may also affect the response to mutation and the potential for future adaptation. In order to assess the influence of mating preferences on these evolutionary properties, I modeled a male trait and female preference determined by separate gene regulatory networks. I studied three sexual selection scenarios: sexual conflict, a Gaussian model of the Fisher process described in Lande (in Proc Natl Acad Sci 78(6):3721-3725, 1981) and a good genes model in which the male trait signalled his mutational condition. I measured the effects these mating preferences had on the potential for traits and preferences to evolve towards new states, and mutational robustness of both the phenotype and the individual's overall viability. All types of sexual selection increased male phenotypic robustness relative to a randomly mating population. The Fisher model also reduced male evolvability and mutational robustness for viability. Under good genes sexual selection, males evolved an increased mutational robustness for viability. Females choosing their mates is a scenario that is sufficient to create selective forces that impact genetic evolution and shape the evolutionary response to mutation and environmental selection. These dynamics will inevitably develop in any population where sexual selection is operating, and affect the potential for future adaptation.
Modeling coexistence of oscillation and Delta/Notch-mediated lateral inhibition in pancreas development and neurogenesis.

PubMed

Tiedemann, Hendrik B; Schneltzer, Elida; Beckers, Johannes; Przemeck, Gerhard K H; Hrabě de Angelis, Martin

2017-10-07

During pancreas development, Neurog3 positive endocrine progenitors are specified by Delta/Notch (D/N) mediated lateral inhibition in the growing ducts. During neurogenesis, genes that determine the transition from the proneural state to neuronal or glial lineages are oscillating before their expression is sustained. Although the basic gene regulatory network is very similar, cycling gene expression in pancreatic development has not yet been investigated, and previous simulations of lateral inhibition in pancreas development excluded by design the possibility of oscillations. To explore this possibility, we developed a dynamic model of a growing duct that results in an oscillatory phase before the determination of endocrine progenitors by lateral inhibition. The basic network (D/N + Hes1 + Neurog3) shows scattered, stable Neurog3 expression after displaying transient expression. Furthermore, we included the Hes1 negative feedback as previously discussed in neurogenesis and show the consequences for Neurog3 expression in pancreatic duct development. Interestingly, a weakened HES1 action on the Hes1 promoter allows the coexistence of stable patterning and oscillations. In conclusion, cycling gene expression and lateral inhibition are not mutually exclusive. In this way, we argue for a unified mode of D/N mediated lateral inhibition in neurogenic and pancreatic progenitor specification.
Gene Expression Correlated with Severe Asthma Characteristics Reveals Heterogeneous Mechanisms of Severe Disease.

PubMed

Modena, Brian D; Bleecker, Eugene R; Busse, William W; Erzurum, Serpil C; Gaston, Benjamin M; Jarjour, Nizar N; Meyers, Deborah A; Milosevic, Jadranka; Tedrow, John R; Wu, Wei; Kaminski, Naftali; Wenzel, Sally E

2017-06-01

Severe asthma (SA) is a heterogeneous disease with multiple molecular mechanisms. Gene expression studies of bronchial epithelial cells in individuals with asthma have provided biological insight and underscored possible mechanistic differences between individuals. Identify networks of genes reflective of underlying biological processes that define SA. Airway epithelial cell gene expression from 155 subjects with asthma and healthy control subjects in the Severe Asthma Research Program was analyzed by weighted gene coexpression network analysis to identify gene networks and profiles associated with SA and its specific characteristics (i.e., pulmonary function tests, quality of life scores, urgent healthcare use, and steroid use), which potentially identified underlying biological processes. A linear model analysis confirmed these findings while adjusting for potential confounders. Weighted gene coexpression network analysis constructed 64 gene network modules, including modules corresponding to T1 and T2 inflammation, neuronal function, cilia, epithelial growth, and repair mechanisms. Although no network selectively identified SA, genes in modules linked to epithelial growth and repair and neuronal function were markedly decreased in SA. Several hub genes of the epithelial growth and repair module were found located at the 17q12-21 locus, near a well-known asthma susceptibility locus. T2 genes increased with severity in those treated with corticosteroids but were also elevated in untreated, mild-to-moderate disease compared with healthy control subjects. T1 inflammation, especially when associated with increased T2 gene expression, was elevated in a subgroup of younger patients with SA. In this hypothesis-generating analysis, gene expression networks in relation to asthma severity provided potentially new insight into biological mechanisms associated with the development of SA and its phenotypes.
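The coexpression step named in this abstract can be illustrated with a minimal sketch. It builds an unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, which is the usual starting point of weighted gene coexpression network analysis. The gene names, expression values and beta = 6 are assumptions chosen for illustration, not data or settings from the study, and module detection is not shown.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(expression, beta=6):
    """Return {(gene_i, gene_j): |cor|**beta} for every unordered gene pair."""
    adjacency = {}
    for g1, g2 in combinations(sorted(expression), 2):
        cor = pearson(expression[g1], expression[g2])
        adjacency[(g1, g2)] = abs(cor) ** beta
    return adjacency


# Hypothetical expression values across five samples (illustration only).
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with geneA
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],   # only weakly related to geneA
}

adjacency = soft_threshold_adjacency(toy_expression)
for pair, weight in adjacency.items():
    print(pair, round(weight, 6))
```

Raising the correlation to a power pushes weak correlations towards zero while keeping strong ones close to one, so the network stays weighted rather than being cut at a hard threshold.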
Gene Expression Correlated with Severe Asthma Characteristics Reveals Heterogeneous Mechanisms of Severe Disease

PubMed Central

Modena, Brian D.; Bleecker, Eugene R.; Busse, William W.; Erzurum, Serpil C.; Gaston, Benjamin M.; Jarjour, Nizar N.; Meyers, Deborah A.; Milosevic, Jadranka; Tedrow, John R.; Wu, Wei; Kaminski, Naftali

2017-01-01

Rationale: Severe asthma (SA) is a heterogeneous disease with multiple molecular mechanisms. Gene expression studies of bronchial epithelial cells in individuals with asthma have provided biological insight and underscored possible mechanistic differences between individuals. Objectives: Identify networks of genes reflective of underlying biological processes that define SA. Methods: Airway epithelial cell gene expression from 155 subjects with asthma and healthy control subjects in the Severe Asthma Research Program was analyzed by weighted gene coexpression network analysis to identify gene networks and profiles associated with SA and its specific characteristics (i.e., pulmonary function tests, quality of life scores, urgent healthcare use, and steroid use), which potentially identified underlying biological processes. A linear model analysis confirmed these findings while adjusting for potential confounders. Measurements and Main Results: Weighted gene coexpression network analysis constructed 64 gene network modules, including modules corresponding to T1 and T2 inflammation, neuronal function, cilia, epithelial growth, and repair mechanisms. Although no network selectively identified SA, genes in modules linked to epithelial growth and repair and neuronal function were markedly decreased in SA. Several hub genes of the epithelial growth and repair module were found located at the 17q12–21 locus, near a well-known asthma susceptibility locus. T2 genes increased with severity in those treated with corticosteroids but were also elevated in untreated, mild-to-moderate disease compared with healthy control subjects. T1 inflammation, especially when associated with increased T2 gene expression, was elevated in a subgroup of younger patients with SA. 
Conclusions: In this hypothesis-generating analysis, gene expression networks in relation to asthma severity provided potentially new insight into biological mechanisms associated with the development of SA and its phenotypes. PMID:27984699
Gene Network Rewiring to Study Melanoma Stage Progression and Elements Essential for Driving Melanoma

PubMed Central

Kaushik, Abhinav; Bhatia, Yashuma; Ali, Shakir; Gupta, Dinesh

2015-01-01

Metastatic melanoma patients have a poor prognosis, mainly attributable to the underlying heterogeneity in melanoma driver genes and altered gene expression profiles. These characteristics of melanoma also make the development of drugs and identification of novel drug targets for metastatic melanoma a daunting task. Systems biology offers an alternative approach to re-explore the genes or gene sets that display dysregulated behaviour without being differentially expressed. In this study, we have performed systems biology studies to enhance our knowledge about the conserved property of disease genes or gene sets among mutually exclusive datasets representing melanoma progression. We meta-analysed 642 microarray samples to generate melanoma reconstructed networks representing four different stages of melanoma progression to extract genes with altered molecular circuitry wiring as compared to a normal cellular state. Intriguingly, a majority of the melanoma network-rewired genes are not differentially expressed and the disease genes involved in melanoma progression consistently modulate its activity by rewiring network connections. We found that the shortlisted disease genes in the study show strong and abnormal network connectivity, which enhances with the disease progression. Moreover, the deviated network properties of the disease gene sets allow ranking/prioritization of different enriched, dysregulated and conserved pathway terms in metastatic melanoma, in agreement with previous findings. Our analysis also reveals presence of distinct network hubs in different stages of metastasizing tumor for the same set of pathways in the statistically conserved gene sets. The study results are also presented as a freely available database at http://bioinfo.icgeb.res.in/m3db/. The web-based database resource consists of results from the analysis presented here, integrated with cytoscape web and user-friendly tools for visualization, retrieval and further analysis. 
PMID:26558755
System Analysis of LWDH Related Genes Based on Text Mining in Biological Networks

PubMed Central

Miao, Yingbo; Zhang, Liangcai; Wang, Yang; Feng, Rennan; Yang, Lei; Zhang, Shihua; Jiang, Yongshuai; Liu, Guiyou

2014-01-01

Liuwei-dihuang (LWDH) is widely used in traditional Chinese medicine (TCM), but the molecular mechanism underlying its gene interactions is unclear. LWDH genes were extracted from the existing literature using text mining. To simulate the complex molecular interactions that occur in the whole body, protein-protein interaction networks (PPINs) were constructed and the topological properties of LWDH genes were analyzed. LWDH genes have higher centrality and may play important roles in the complex biological network environment. It was also found that the distances between LWDH genes are smaller than expected, which suggests that communication among LWDH genes during biological processes is rapid and effective. Finally, a comprehensive network of LWDH genes, including the related drugs and regulatory pathways at both the transcriptional and posttranscriptional levels, was constructed and analyzed. The biological network analysis strategy used in this study may be helpful for understanding the molecular mechanisms of TCM. PMID:25243143
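The two topological measures mentioned in this abstract, centrality and within-set distance, can be sketched on a toy interaction network. The protein names, the edges and the choice of degree centrality are assumptions for illustration. The study's actual networks and statistics are not reproduced here, and a real analysis would compare against randomly sampled gene sets rather than the whole-network mean used below.

```python
from collections import deque
from itertools import combinations


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def shortest_path_length(graph, source, target):
    """Breadth-first search distance; returns None if target is unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def mean_pairwise_distance(graph, nodes):
    """Mean shortest-path length over all reachable pairs within `nodes`."""
    distances = [shortest_path_length(graph, a, b)
                 for a, b in combinations(sorted(nodes), 2)]
    reachable = [d for d in distances if d is not None]
    return sum(reachable) / len(reachable)


# Hypothetical undirected interaction network (adjacency sets).
toy_ppin = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2"},
    "P4": {"P1", "P5"},
    "P5": {"P4", "P6"},
    "P6": {"P5"},
}
gene_set = {"P1", "P2", "P3"}  # stand-in for a gene set of interest

centrality = degree_centrality(toy_ppin)
within_set = mean_pairwise_distance(toy_ppin, gene_set)
whole_network = mean_pairwise_distance(toy_ppin, toy_ppin.keys())
print(centrality["P1"], within_set, round(whole_network, 3))
```

In this toy graph the chosen set forms a triangle, so its mean distance is lower than the network-wide mean, which is the kind of contrast the abstract describes.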
Whole blood genome-wide expression profiling and network analysis suggest MELAS master regulators.

PubMed

Mende, Susanne; Royer, Loic; Herr, Alexander; Schmiedel, Janet; Deschauer, Marcus; Klopstock, Thomas; Kostic, Vladimir S; Schroeder, Michael; Reichmann, Heinz; Storch, Alexander

2011-07-01

The heteroplasmic mitochondrial DNA (mtDNA) mutation A3243G causes the mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes (MELAS) syndrome as one of the most frequent mitochondrial diseases. The process of reconfiguration of nuclear gene expression profile to accommodate cellular processes to the functional status of mitochondria might be a key to MELAS disease manifestation and could contribute to its diverse phenotypic presentation. To determine master regulatory protein networks and disease-modifying genes in MELAS syndrome. Analyses of whole blood transcriptomes from 10 MELAS patients using a novel strategy by combining classic Affymetrix oligonucleotide microarray profiling with regulatory and protein interaction network analyses. Hierarchical cluster analysis elucidated that the relative abundance of mutant mtDNA molecules is decisive for the nuclear gene expression response. Further analyses confirmed not only transcription factors already known to be involved in mitochondrial diseases (such as TFAM), but also detected the hypoxia-inducible factor 1 complex, nuclear factor Y and cAMP responsive element-binding protein-related transcription factors as novel master regulators for reconfiguration of nuclear gene expression in response to the MELAS mutation. Correlation analyses of gene alterations and clinico-genetic data detected significant correlations between A3243G-induced nuclear gene expression changes and mutant mtDNA load as well as disease characteristics. These potential disease-modifying genes influencing the expression of the MELAS phenotype are mainly related to clusters primarily unrelated to cellular energy metabolism, but important for nucleic acid and protein metabolism, and signal transduction. Our data thus provide a framework to search for new pathogenetic concepts and potential therapeutic approaches to treat the MELAS syndrome.
RegNetwork: an integrated database of transcriptional and post-transcriptional regulatory networks in human and mouse

PubMed Central

Liu, Zhi-Ping; Wu, Canglin; Miao, Hongyu; Wu, Hulin

2015-01-01

Transcriptional and post-transcriptional regulation of gene expression is of fundamental importance to numerous biological processes. Nowadays, an increasing number of gene regulatory relationships have been documented in various databases and literature. However, to more efficiently exploit such knowledge for biomedical research and applications, it is necessary to construct a genome-wide regulatory network database to integrate the information on gene regulatory relationships that are widely scattered in many different places. Therefore, in this work, we build a knowledge-based database, named ‘RegNetwork’, of gene regulatory networks for human and mouse by collecting and integrating the documented regulatory interactions among transcription factors (TFs), microRNAs (miRNAs) and target genes from 25 selected databases. Moreover, we also inferred and incorporated potential regulatory relationships based on transcription factor binding site (TFBS) motifs into RegNetwork. As a result, RegNetwork contains a comprehensive set of experimentally observed or predicted transcriptional and post-transcriptional regulatory relationships, and the database framework is flexibly designed for potential extensions to include gene regulatory networks for other organisms in the future. Based on RegNetwork, we characterized the statistical and topological properties of genome-wide regulatory networks for human and mouse. We also extracted and interpreted simple yet important network motifs that involve the interplays between TF-miRNA and their targets. In summary, RegNetwork provides an integrated resource on the prior information for gene regulatory relationships, and it enables us to further investigate context-specific transcriptional and post-transcriptional regulatory interactions based on domain-specific experimental data. Database URL: http://www.regnetworkweb.org PMID:26424082
The lineage-specific gene ponzr1 is essential for zebrafish pronephric and pharyngeal arch development.

PubMed

Bedell, Victoria M; Person, Anthony D; Larson, Jon D; McLoon, Anna; Balciunas, Darius; Clark, Karl J; Neff, Kevin I; Nelson, Katie E; Bill, Brent R; Schimmenti, Lisa A; Beiraghi, Soraya; Ekker, Stephen C

2012-02-01

The Homeobox (Hox) and Paired box (Pax) gene families are key determinants of animal body plans and organ structure. In particular, they function within regulatory networks that control organogenesis. How these conserved genes elicit differences in organ form and function in response to evolutionary pressures is incompletely understood. We molecularly and functionally characterized one member of an evolutionarily dynamic gene family, plac8 onzin related protein 1 (ponzr1), in the zebrafish. ponzr1 mRNA is expressed early in the developing kidney and pharyngeal arches. Using ponzr1-targeting morpholinos, we show that ponzr1 is required for formation of the glomerulus. Loss of ponzr1 results in a nonfunctional glomerulus but retention of a functional pronephros, an arrangement similar to the aglomerular kidneys found in a subset of marine fish. ponzr1 is integrated into the pax2a pathway, with ponzr1 expression requiring pax2a gene function, and proper pax2a expression requiring normal ponzr1 expression. In addition to pronephric function, ponzr1 is required for pharyngeal arch formation. We functionally demonstrate that ponzr1 can act as a transcription factor or co-factor, providing the first molecular mode of action for this newly described gene family. Together, this work provides experimental evidence of an additional mechanism that incorporates evolutionarily dynamic, lineage-specific gene families into conserved regulatory gene networks to create functional organ diversity.
A Network of HMG-box Transcription Factors Regulates Sexual Cycle in the Fungus Podospora anserina

PubMed Central

Ait Benkhali, Jinane; Coppin, Evelyne; Brun, Sylvain; Peraza-Reyes, Leonardo; Martin, Tom; Dixelius, Christina; Lazar, Noureddine; van Tilbeurgh, Herman; Debuchy, Robert

2013-01-01

High-mobility group (HMG) B proteins are eukaryotic DNA-binding proteins characterized by the HMG-box functional motif. These transcription factors play a pivotal role in global genomic functions and in the control of genes involved in specific developmental or metabolic pathways. The filamentous ascomycete Podospora anserina contains 12 HMG-box genes. Of these, four have been previously characterized; three are mating-type genes that control fertilization and development of the fruit-body, whereas the last one encodes a factor involved in mitochondrial DNA stability. Systematic deletion analysis of the eight remaining uncharacterized HMG-box genes indicated that none were essential for viability, but that seven were involved in the sexual cycle. Two HMG-box genes display striking features. PaHMG5, an ortholog of SpSte11 from Schizosaccharomyces pombe, is a pivotal activator of mating-type genes in P. anserina, whereas PaHMG9 is a repressor of several phenomena specific to the stationary phase, most notably hyphal anastomoses. Transcriptional analyses of HMG-box genes in HMG-box deletion strains indicated that PaHMG5 is at the hub of a network of several HMG-box factors that regulate mating-type genes and mating-type target genes. Genetic analyses revealed that this network also controls fertility genes that are not regulated by mating-type transcription factors. This study points to the critical role of HMG-box members in sexual reproduction in fungi, as 11 out of 12 members were involved in the sexual cycle in P. anserina. PaHMG5 and SpSte11 are conserved transcriptional regulators of mating-type genes, although P. anserina and S. pombe diverged 550 million years ago. Two HMG-box genes, SOX9 and its upstream regulator SRY, also play an important role in sex determination in mammals. The P. anserina and S. pombe mating-type genes and their upstream regulatory factor form a module of HMG-box genes analogous to the SRY/SOX9 module, revealing a commonality of sex regulation in animals and fungi. PMID:23935511
VTCdb: a gene co-expression database for the crop species Vitis vinifera (grapevine).

PubMed

Wong, Darren C J; Sweetman, Crystal; Drew, Damian P; Ford, Christopher M

2013-12-16

Gene expression datasets in model plants such as Arabidopsis have contributed to our understanding of gene function and how a single underlying biological process can be governed by a diverse network of genes. The accumulation of publicly available microarray data encompassing a wide range of biological and environmental conditions has enabled the development of additional capabilities including gene co-expression analysis (GCA). GCA is based on the understanding that genes encoding proteins involved in similar and/or related biological processes may exhibit comparable expression patterns over a range of experimental conditions, developmental stages and tissues. We present an open access database for the investigation of gene co-expression networks within the cultivated grapevine, Vitis vinifera. The new gene co-expression database, VTCdb (http://vtcdb.adelaide.edu.au/Home.aspx), offers an online platform for transcriptional regulatory inference in the cultivated grapevine. Using condition-independent and condition-dependent approaches, grapevine co-expression networks were constructed using the latest publicly available microarray datasets from diverse experimental series, utilising the Affymetrix Vitis vinifera GeneChip (16 K) and the NimbleGen Grape Whole-genome microarray chip (29 K), thus making it possible to profile approximately 29,000 genes (95% of the predicted grapevine transcriptome). Applications available with the online platform include the use of gene names, probesets, modules or biological processes to query the co-expression networks, with the option to choose between Affymetrix or Nimblegen datasets and between multiple co-expression measures. Alternatively, the user can browse existing network modules using interactive network visualisation and analysis via CytoscapeWeb. 
To demonstrate the utility of the database, we present examples from three fundamental biological processes (berry development, photosynthesis and flavonoid biosynthesis) whereby the recovered sub-networks reconfirm established plant gene functions and also identify novel associations. Together, we present valuable insights into grapevine transcriptional regulation by developing network models applicable to researchers in their prioritisation of gene candidates, for on-going study of biological processes related to grapevine development, metabolism and stress responses.
Inferring Alcoholism SNPs and Regulatory Chemical Compounds Based on Ensemble Bayesian Network.

PubMed

Chen, Huan; Sun, Jiatong; Jiang, Hong; Wang, Xianyue; Wu, Lingxiang; Wu, Wei; Wang, Qh

2017-01-01

The disturbance of consciousness is one of the most common symptoms in those who have alcoholism and may cause disability and mortality. Previous studies indicated that several single nucleotide polymorphisms (SNPs) increase susceptibility to alcoholism. In this study, we utilized the Ensemble Bayesian Network (EBN) method to identify causal SNPs of alcoholism based on the verified GAW14 data. We built a Bayesian network combining random process and greedy search by using the Genetic Analysis Workshop 14 (GAW14) dataset to establish an EBN of SNPs. Then we predicted the association between SNPs and alcoholism by determining Bayes' prior probability. Thirteen out of eighteen SNPs directly connected with alcoholism were found to be concordant with potential risk regions of alcoholism in the OMIM database. As many SNPs were found to contribute to alterations in gene expression, known as expression quantitative trait loci (eQTLs), we further sought to identify chemical compounds acting as regulators of alcoholism genes captured by causal SNPs. Chloroprene and valproic acid were identified as the expression regulators for genes C11orf66 and SALL3, respectively, which were captured by alcoholism SNPs.

Evaluating methods of inferring gene regulatory networks highlights their lack of performance for single cell gene expression data.

PubMed

Chen, Shuonan; Mar, Jessica C

2018-06-19

A fundamental fact in biology states that genes do not operate in isolation, and yet, methods that infer regulatory networks for single cell gene expression data have been slow to emerge. With single cell sequencing methods now becoming accessible, general network inference algorithms that were initially developed for data collected from bulk samples may not be suitable for single cells. Meanwhile, although methods that are specific for single cell data are now emerging, whether they have improved performance over general methods is unknown. In this study, we evaluate the applicability of five general methods and three single cell methods for inferring gene regulatory networks from both experimental single cell gene expression data and in silico simulated data. Standard evaluation metrics using ROC curves and Precision-Recall curves against reference sets sourced from the literature demonstrated that most of the methods performed poorly when they were applied to either experimental single cell data, or simulated single cell data, which demonstrates their lack of performance for this task. Using default settings, network methods were applied to the same datasets. Comparisons of the learned networks highlighted the uniqueness of some predicted edges for each method. The fact that different methods infer networks that vary substantially reflects the underlying mathematical rationale and assumptions that distinguish network methods from each other. This study provides a comprehensive evaluation of network modeling algorithms applied to experimental single cell gene expression data and in silico simulated datasets where the network structure is known. Comparisons demonstrate that most of these assessed network methods are not able to predict network structures from single cell expression data accurately, even if they are specifically developed for single cell methods. 
Also, single cell methods, which usually depend on more elaborate algorithms, in general have less similarity to each other in the sets of edges detected. The results from this study emphasize the importance of developing more accurate, optimized network modeling methods that are compatible with single cell data. Newly developed single cell methods may uniquely capture particular features of potential gene-gene relationships, and caution should be taken when interpreting these results.
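The two kinds of comparison this abstract describes, scoring inferred edges against a reference set and measuring how much two methods' edge sets overlap, can be sketched as follows. The edge lists and method names are hypothetical. Treating edges as undirected and scoring a single fixed edge list, rather than sweeping a threshold to trace a full ROC or precision-recall curve, are simplifying assumptions and not the study's protocol.

```python
def normalise(edges):
    """Treat each edge as an unordered gene pair (an assumption of this sketch)."""
    return {frozenset(edge) for edge in edges}


def precision_recall(predicted_edges, reference_edges):
    """Precision and recall of a predicted edge list against a reference set."""
    predicted = normalise(predicted_edges)
    reference = normalise(reference_edges)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


def jaccard_similarity(edges_a, edges_b):
    """Overlap between two methods' edge sets: |A & B| / |A | B|."""
    a, b = normalise(edges_a), normalise(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical reference network and two hypothetical methods' predictions.
reference = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g1", "g4")]
method_a = [("g2", "g1"), ("g2", "g3"), ("g1", "g3")]
method_b = [("g3", "g4"), ("g1", "g3")]

precision_a, recall_a = precision_recall(method_a, reference)
overlap_ab = jaccard_similarity(method_a, method_b)
print(round(precision_a, 3), recall_a, overlap_ab)
```

A low Jaccard value between two methods, as in this toy case, corresponds to the abstract's observation that different methods can recover largely distinct sets of edges.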
How Artificial Intelligence Can Improve Our Understanding of the Genes Associated with Endometriosis: Natural Language Processing of the PubMed Database

PubMed Central

Mashiach, R.; Cohen, S.; Kedem, A.; Baron, A.; Zajicek, M.; Feldman, I.; Seidman, D.; Soriano, D.

2018-01-01

Endometriosis is a disease characterized by the development of endometrial tissue outside the uterus, but its cause remains largely unknown. Numerous genes have been studied and proposed to help explain its pathogenesis. However, the large number of these candidate genes has made functional validation through experimental methodologies nearly impossible. Computational methods could provide a useful alternative for prioritizing those most likely to be susceptibility genes. Using artificial intelligence applied to text mining, this study analyzed the genes involved in the pathogenesis, development, and progression of endometriosis. The data extraction by text mining of the endometriosis-related genes in the PubMed database was based on natural language processing, and the data were filtered to remove false positives. Using data from the text mining and gene network information as input for the web-based tool, 15,207 endometriosis-related genes were ranked according to their score in the database. Characterization of the filtered gene set through gene ontology, pathway, and network analysis provided information about the numerous mechanisms hypothesized to be responsible for the establishment of ectopic endometrial tissue, as well as the migration, implantation, survival, and proliferation of ectopic endometrial cells. Finally, the human genome was scanned through various databases using filtered genes as a seed to determine novel genes that might also be involved in the pathogenesis of endometriosis but which have not yet been characterized. These genes could be promising candidates to serve as useful diagnostic biomarkers and therapeutic targets in the management of endometriosis. PMID:29750165
How Artificial Intelligence Can Improve Our Understanding of the Genes Associated with Endometriosis: Natural Language Processing of the PubMed Database.

PubMed

Bouaziz, J; Mashiach, R; Cohen, S; Kedem, A; Baron, A; Zajicek, M; Feldman, I; Seidman, D; Soriano, D

2018-01-01

Endometriosis is a disease characterized by the development of endometrial tissue outside the uterus, but its cause remains largely unknown. Numerous genes have been studied and proposed to help explain its pathogenesis. However, the large number of these candidate genes has made functional validation through experimental methodologies nearly impossible. Computational methods could provide a useful alternative for prioritizing those most likely to be susceptibility genes. Using artificial intelligence applied to text mining, this study analyzed the genes involved in the pathogenesis, development, and progression of endometriosis. The data extraction by text mining of the endometriosis-related genes in the PubMed database was based on natural language processing, and the data were filtered to remove false positives. Using data from the text mining and gene network information as input for the web-based tool, 15,207 endometriosis-related genes were ranked according to their score in the database. Characterization of the filtered gene set through gene ontology, pathway, and network analysis provided information about the numerous mechanisms hypothesized to be responsible for the establishment of ectopic endometrial tissue, as well as the migration, implantation, survival, and proliferation of ectopic endometrial cells. Finally, the human genome was scanned through various databases using filtered genes as a seed to determine novel genes that might also be involved in the pathogenesis of endometriosis but which have not yet been characterized. These genes could be promising candidates to serve as useful diagnostic biomarkers and therapeutic targets in the management of endometriosis.
Comparison of gene co-networks reveals the molecular mechanisms of the rice (Oryza sativa L.) response to Rhizoctonia solani AG1 IA infection.

PubMed

Zhang, Jinfeng; Zhao, Wenjuan; Fu, Rong; Fu, Chenglin; Wang, Lingxia; Liu, Huainian; Li, Shuangcheng; Deng, Qiming; Wang, Shiquan; Zhu, Jun; Liang, Yueyang; Li, Ping; Zheng, Aiping

2018-05-05

Rhizoctonia solani causes rice sheath blight, an important disease affecting the growth of rice (Oryza sativa L.). Attempts to control the disease have met with little success. Based on transcriptional profiling, we previously identified more than 11,947 common differentially expressed genes (TPM > 10) between the rice genotypes TeQing and Lemont. In the current study, we extended these findings by focusing on an analysis of gene co-expression in response to R. solani AG1 IA and identified gene modules within the networks through weighted gene co-expression network analysis (WGCNA). We compared the different genes assigned to each module and the biological interpretations of gene co-expression networks at early and later modules in the two rice genotypes to reveal differential responses to AG1 IA. Our results show that different changes occurred in the two rice genotypes and that the modules in the two groups contain a number of candidate genes possibly involved in pathogenesis, such as the VQ protein. Furthermore, these gene co-expression networks provide comprehensive transcriptional information regarding gene expression in rice in response to AG1 IA. The co-expression networks derived from our data offer ideas for follow-up experimentation that will help advance our understanding of the translational regulation of rice gene expression changes in response to AG1 IA.
Morphogenesis in Plants: Modeling the Shoot Apical Meristem, and Possible Applications

NASA Technical Reports Server (NTRS)

Mjolsness, Eric; Gor, Victoria; Meyerowitz, Elliot; Mann, Tobias

1998-01-01

A key determinant of overall morphogenesis in flowering plants such as Arabidopsis thaliana is the shoot apical meristem (growing tip of a shoot). Gene regulation networks can be used to model this system. We exhibit a very preliminary two-dimensional model including gene regulation and intercellular signaling, but omitting cell division and dynamical geometry. The model can be trained to have three stable regions of gene expression corresponding to the central zone, peripheral zone, and rib meristem. We also discuss a space-engineering motivation for studying and controlling the morphogenesis of plants using such computational models.
Finding pathway-modulating genes from a novel Ontology Fingerprint-derived gene network

PubMed Central

Qin, Tingting; Matmati, Nabil; Tsoi, Lam C.; Mohanty, Bidyut K.; Gao, Nan; Tang, Jijun; Lawson, Andrew B.; Hannun, Yusuf A.; Zheng, W. Jim

2014-01-01

To enhance our knowledge regarding biological pathway regulation, we took an integrated approach, using the biomedical literature, ontologies, network analyses and experimental investigation to infer novel genes that could modulate biological pathways. We first constructed a novel gene network via a pairwise comparison of all yeast genes’ Ontology Fingerprints—a set of Gene Ontology terms overrepresented in the PubMed abstracts linked to a gene along with those terms’ corresponding enrichment P-values. The network was further refined using a Bayesian hierarchical model to identify novel genes that could potentially influence the pathway activities. We applied this method to the sphingolipid pathway in yeast and found that many top-ranked genes indeed displayed altered sphingolipid pathway functions, initially measured by their sensitivity to myriocin, an inhibitor of de novo sphingolipid biosynthesis. Further experiments confirmed the modulation of the sphingolipid pathway by one of these genes, PFA4, encoding a palmitoyl transferase. Comparative analysis showed that few of these novel genes could be discovered by other existing methods. Our novel gene network provides a unique and comprehensive resource to study pathway modulations and systems biology in general. PMID:25063300
Improving the measurement of semantic similarity by combining gene ontology and co-functional network: a random walk based approach.

PubMed

Peng, Jiajie; Zhang, Xuanshuo; Hui, Weiwei; Lu, Junya; Li, Qianqian; Liu, Shuhui; Shang, Xuequn

2018-03-19

Gene Ontology (GO) is one of the most popular bioinformatics resources. In the past decade, Gene Ontology-based gene semantic similarity has been effectively used to model gene-to-gene interactions in multiple research areas. However, most existing semantic similarity approaches rely only on GO annotations and structure, or incorporate only local interactions in the co-functional network. This may lead to inaccurate GO-based similarity resulting from the incomplete GO topology structure and gene annotations. We present NETSIM2, a new network-based method that allows researchers to measure GO-based gene functional similarities by considering the global structure of the co-functional network with a random walk with restart (RWR)-based method, and by selecting the significant term pairs to decrease noise. Based on Enzyme Commission (EC) number-based groups of yeast and Arabidopsis, evaluation tests show that NETSIM2 can enhance the accuracy of Gene Ontology-based gene functional similarity. Using NETSIM2 as an example, we found that the accuracy of semantic similarities can be significantly improved by effectively incorporating the global gene-to-gene interactions in the co-functional network, especially for species whose gene annotations in GO are far from complete.
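The random walk with restart mentioned above can be sketched on a small network. This is a generic RWR iteration, not the NETSIM2 implementation: the network, the gene labels and the restart probability of 0.5 are assumptions made for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that jumps back to
    `seed` with probability `restart` at every step."""
    nodes = sorted(adj)
    p = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(max_iter):
        # Restart mass goes to the seed; the rest is spread over neighbours.
        nxt = {n: (restart if n == seed else 0.0) for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical undirected co-functional network stored as adjacency lists.
network = {
    "g1": ["g2", "g3"],
    "g2": ["g1", "g3"],
    "g3": ["g1", "g2", "g4"],
    "g4": ["g3"],
}
scores = random_walk_with_restart(network, seed="g1")
```

The resulting scores reflect the whole network topology rather than direct neighbours alone, so a gene two steps away from the seed (`g4` here) still receives a small, non-zero proximity score.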
Determining Regulatory Networks Governing the Differentiation of Embryonic Stem Cells to Pancreatic Lineage

NASA Astrophysics Data System (ADS)

Banerjee, Ipsita

2009-03-01

Knowledge of the pathways governing cellular differentiation to a specific phenotype will enable the generation of desired cell fates through careful alteration of the governing network by suitable manipulation of the cellular environment. With this aim, we have developed a novel method to reconstruct the underlying regulatory architecture of a differentiating cell population from discrete temporal gene expression data. We utilize an inherent feature of biological networks, that of sparsity, in formulating the network reconstruction problem as a bi-level mixed-integer programming problem. The formulation optimizes the network topology at the upper level and the network connectivity strength at the lower level. The method is first validated with in-silico data before being applied to the complex system of embryonic stem (ES) cell differentiation. This formulation enables efficient identification of the underlying network topology, which could accurately predict the steps necessary for directing differentiation to subsequent stages. Concurrent experimental verification demonstrated excellent agreement with model prediction.
Network pharmacology-based prediction of active compounds and molecular targets in Yijin-Tang acting on hyperlipidaemia and atherosclerosis.

PubMed

Lee, A Yeong; Park, Won; Kang, Tae-Wook; Cha, Min Ho; Chun, Jin Mi

2018-07-15

Yijin-Tang (YJT) is a traditional prescription for the treatment of hyperlipidaemia, atherosclerosis and other ailments related to dampness phlegm, a typical pathological symptom of abnormal body fluid metabolism in Traditional Korean Medicine. However, a holistic network pharmacology approach to understanding the therapeutic mechanisms underlying hyperlipidaemia and atherosclerosis has not been pursued. To examine the network pharmacological potential effects of YJT on hyperlipidaemia and atherosclerosis, we analysed components, performed target prediction and network analysis, and investigated interacting pathways using a network pharmacology approach. Information on compounds in herbal medicines was obtained from public databases, and oral bioavailability and drug-likeness were screened using absorption, distribution, metabolism, and excretion (ADME) criteria. Correlations between compounds and genes were linked using the STITCH database, and genes related to hyperlipidaemia and atherosclerosis were gathered using the GeneCards database. Human genes were identified and subjected to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Network analysis identified 447 compounds in five herbal medicines that were subjected to ADME screening, and 21 compounds and 57 genes formed the main pathways linked to hyperlipidaemia and atherosclerosis. Among them, 10 compounds (naringenin, nobiletin, hesperidin, galangin, glycyrrhizin, homogentisic acid, stigmasterol, 6-gingerol, quercetin and glabridin) were linked to more than four genes, and are bioactive compounds and key chemicals. Core genes in this network were CASP3, CYP1A1, CYP1A2, MMP2 and MMP9. The compound-target gene network revealed close interactions between multiple components and multiple targets, and facilitates a better understanding of the potential therapeutic effects of YJT.
Pharmacological network analysis can help to explain the potential effects of YJT for treating dampness phlegm-related diseases such as hyperlipidaemia and atherosclerosis. Copyright © 2018 Elsevier B.V. All rights reserved.
Flower Development

PubMed Central

Alvarez-Buylla, Elena R.; Benítez, Mariana; Corvera-Poiré, Adriana; Chaos Cador, Álvaro; de Folter, Stefan; Gamboa de Buen, Alicia; Garay-Arroyo, Adriana; García-Ponce, Berenice; Jaimes-Miranda, Fabiola; Pérez-Ruiz, Rigoberto V.; Piñeyro-Nelson, Alma; Sánchez-Corrales, Yara E.

2010-01-01

Flowers are the most complex structures of plants. Studies of Arabidopsis thaliana, which has typical eudicot flowers, have been fundamental in advancing the structural and molecular understanding of flower development. The main processes and stages of Arabidopsis flower development are summarized to provide a framework in which to interpret the detailed molecular genetic studies of genes assigned functions during flower development and is extended to recent genomics studies uncovering the key regulatory modules involved. Computational models have been used to study the concerted action and dynamics of the gene regulatory module that underlies patterning of the Arabidopsis inflorescence meristem and specification of the primordial cell types during early stages of flower development. This includes the gene combinations that specify sepal, petal, stamen and carpel identity, and genes that interact with them. As a dynamic gene regulatory network this module has been shown to converge to stable multigenic profiles that depend upon the overall network topology and are thus robust, which can explain the canalization of flower organ determination and the overall conservation of the basic flower plan among eudicots. Comparative and evolutionary approaches derived from Arabidopsis studies pave the way to studying the molecular basis of diverse floral morphologies. PMID:22303253
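The idea that a dynamic gene regulatory module converges to stable expression profiles can be illustrated with a toy Boolean network. This is a minimal sketch under stated assumptions: the three genes and their rules are hypothetical and are not the published floral organ network, and a synchronous update scheme is assumed.

```python
from itertools import product

# Hypothetical rules: A and B repress each other; C is switched on by A.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
    "C": lambda s: s["A"],
}
GENES = sorted(RULES)

def step(state):
    """Synchronously update every gene from the current state."""
    s = dict(zip(GENES, state))
    return tuple(bool(RULES[g](s)) for g in GENES)

def attractor_of(state):
    """Iterate until a state repeats; return the set of states in the cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return frozenset(seen[seen.index(state):])

all_states = list(product([False, True], repeat=len(GENES)))

# Fixed points are states that the update rule maps onto themselves.
fixed_points = [s for s in all_states if step(s) == s]

# Group every initial state by the attractor it eventually reaches.
basins = {}
for s in all_states:
    basins.setdefault(attractor_of(s), []).append(s)
```

In this toy case the two fixed points correspond to the two mutually exclusive expression profiles. The remaining initial states fall into a two-state oscillation, which is a known side effect of updating all genes at the same time and would need a different update scheme to remove.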
Incorporating interaction networks into the determination of functionally related hit genes in genomic experiments with Markov random fields

PubMed Central

Robinson, Sean; Nevalainen, Jaakko; Pinna, Guillaume; Campalans, Anna; Radicella, J. Pablo; Guyon, Laurent

2017-01-01

Motivation: Incorporating gene interaction data into the identification of ‘hit’ genes in genomic experiments is a well-established approach leveraging the ‘guilt by association’ assumption to obtain a network-based hit list of functionally related genes. We aim to develop a method to allow for multivariate gene scores and multiple hit labels in order to extend the analysis of genomic screening data within such an approach. Results: We propose a Markov random field-based method to achieve our aim and show that the particular advantages of our method compared with those currently used lead to new insights in previously analysed data as well as for our own motivating data. Our method additionally achieves the best performance in an independent simulation experiment. The real data applications we consider comprise a survival analysis and differential expression experiment and a cell-based RNA interference functional screen. Availability and implementation: We provide all of the data and code related to the results in the paper. Contact: sean.j.robinson@utu.fi or laurent.guyon@cea.fr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28881978
Simultaneous learning of instantaneous and time-delayed genetic interactions using novel information theoretic scoring technique

PubMed Central

2012-01-01

Background: Understanding gene interactions is a fundamental question in systems biology. Currently, modeling of gene regulations using the Bayesian Network (BN) formalism assumes that genes interact either instantaneously or with a certain amount of time delay. However, in reality, biological regulations, both instantaneous and time-delayed, occur simultaneously. A framework that can detect and model both types of interactions simultaneously would represent gene regulatory networks more accurately. Results: In this paper, we introduce a framework based on the Bayesian Network (BN) formalism that can represent both instantaneous and time-delayed interactions between genes simultaneously. A novel scoring metric having firm mathematical underpinnings is also proposed that, unlike other recent methods, can score both interactions concurrently and takes into account the reality that multiple regulators can regulate a gene jointly, rather than in an isolated pair-wise manner. Further, a gene regulatory network (GRN) inference method employing an evolutionary search that makes use of the framework and the scoring metric is also presented. Conclusion: By taking into consideration the biological fact that both instantaneous and time-delayed regulations can occur among genes, our approach models gene interactions with greater accuracy. The proposed framework is efficient and can be used to infer gene networks having multiple orders of instantaneous and time-delayed regulations simultaneously. Experiments are carried out using three different synthetic networks (with three different mechanisms for generating synthetic data) as well as real life networks of Saccharomyces cerevisiae, E. coli and cyanobacteria gene expression data. The results show the effectiveness of our approach. PMID:22691450
Pan- and core- network analysis of co-expression genes in a model plant

DOE PAGES

He, Fei; Maslov, Sergei

2016-12-16

Genome-wide gene expression experiments have been performed using the model plant Arabidopsis during the last decade. Some studies involved construction of coexpression networks, a popular technique used to identify groups of co-regulated genes, to infer unknown gene functions. One approach is to construct a single coexpression network by combining multiple expression datasets generated in different labs. We advocate a complementary approach in which we construct a large collection of 134 coexpression networks based on expression datasets reported in individual publications. To this end we reanalyzed public expression data. To describe this collection of networks we introduced concepts of ‘pan-network’ and ‘core-network’ representing union and intersection between sizeable fractions of individual networks, respectively. Here, we showed that these two types of networks are different both in terms of their topology and biological function of interacting genes. For example, the modules of the pan-network are enriched in regulatory and signaling functions, while the modules of the core-network tend to include components of large macromolecular complexes such as ribosomes and photosynthetic machinery. Our analysis is aimed to help the plant research community to better explore the information contained within the existing vast collection of gene expression data in Arabidopsis.
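The pan-network and core-network concepts can be sketched as edge sets selected by how often an edge recurs across the individual networks. This is an illustrative reading of the abstract, not the authors' procedure: the function name, the example networks and both fraction thresholds are assumptions.

```python
from collections import Counter

def edge(a, b):
    """Undirected edge stored with sorted endpoints."""
    return tuple(sorted((a, b)))

def pan_and_core(networks, pan_fraction, core_fraction):
    """Select edges by the fraction of individual networks that contain them."""
    counts = Counter(e for net in networks for e in net)
    n = len(networks)
    pan = {e for e, c in counts.items() if c / n >= pan_fraction}
    core = {e for e, c in counts.items() if c / n >= core_fraction}
    return pan, core

# Three hypothetical coexpression networks, each given as a set of edges.
networks = [
    {edge("A", "B"), edge("B", "C")},
    {edge("B", "A"), edge("C", "D")},
    {edge("A", "B"), edge("B", "C"), edge("D", "E")},
]

# Assumed thresholds: pan keeps edges seen at least once (a plain union),
# core keeps edges present in every network (a plain intersection).
pan, core = pan_and_core(networks, pan_fraction=1 / 3, core_fraction=1.0)
```

Relaxing `core_fraction` below 1.0 or raising `pan_fraction` gives the softer "sizeable fraction" versions of intersection and union; in every case the core edge set is contained in the pan edge set.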
Physiological Responses and Gene Co-Expression Network of Mycorrhizal Roots under K+ Deprivation

PubMed Central

Roy, Sushmita

2017-01-01

Arbuscular mycorrhizal (AM) associations enhance the phosphorous and nitrogen nutrition of host plants, but little is known about their role in potassium (K+) nutrition. Medicago truncatula plants were cocultured with the AM fungus Rhizophagus irregularis under high and low K+ regimes for 6 weeks. We determined how K+ deprivation affects plant development and mineral acquisition and how these negative effects are tempered by the AM colonization. The transcriptional response of AM roots under K+ deficiency was analyzed by whole-genome RNA sequencing. K+ deprivation decreased root biomass and external K+ uptake and modulated oxidative stress gene expression in M. truncatula roots. AM colonization induced specific transcriptional responses to K+ deprivation that seem to temper these negative effects. A gene network analysis revealed putative key regulators of these responses. This study confirmed that AM associations provide some tolerance to K+ deprivation to host plants, revealed that AM symbiosis modulates the expression of specific root genes to cope with this nutrient stress, and identified putative regulators participating in these tolerance mechanisms. PMID:28159827
Probabilistic representation of gene regulatory networks.

PubMed

Mao, Linyong; Resat, Haluk

2004-09-22

Recent experiments have established unambiguously that biological systems can have significant cell-to-cell variations in gene expression levels even in isogenic populations. Computational approaches to studying gene expression in cellular systems should capture such biological variations for a more realistic representation. In this paper, we present a new fully probabilistic approach to the modeling of gene regulatory networks that allows for fluctuations in the gene expression levels. The new algorithm uses a very simple representation for the genes, and accounts for the repression or induction of the genes and for the biological variations among isogenic populations simultaneously. Because of its simplicity, the introduced algorithm is a very promising approach to modeling large-scale gene regulatory networks. We have tested the new algorithm on the recently bioengineered synthetic gene network library. The good agreement between the computed and the experimental results for this library of networks, and additional tests, demonstrate that the new algorithm is robust and very successful in explaining the experimental data. The simulation software is available upon request. Supplementary material will be made available on the OUP server.
From experimental design to functional gene networks: DNA microarray contribution to skin ageing research.

PubMed

Benech, P D; Patatian, A

2014-12-01

There is no doubt that DNA microarray-based technology has increased our knowledge of a wide range of processes. However, integrating genes into functional networks, rather than into terms describing generic characteristics, remains an important challenge. The highly context-dependent function of a given gene and feedback mechanisms greatly complicate the interpretation of the data. Moreover, it is difficult to determine whether changes in gene expression are the result or the cause of pathologies or physiological events. In both cases, the difficulty lies in the involvement of processes that can be protective at an early stage and deleterious later on, once they run out of control. Each individual cell has its own transcription profile that determines its behaviour and its relationships with its neighbours. This is particularly true when a mechanism such as the cell cycle is concerned. Another issue concerns the analysis of samples from different donors. Whereas statistical tools help to determine common features among groups, they tend to smooth the overall data and, consequently, the selected values represent the 'tip of the iceberg'. There is a significant overlap in the set of genes identified in the different studies on skin ageing processes described in the present review. This overlap arises because most of these genes belong to the basic machinery controlling cell growth and arrest. To get a fuller picture of these processes, considerable work remains to be done to determine the precise mechanisms conferring the cell type specificity of ageing. Integrative biology applied to the huge amount of existing microarray data should fill gaps through the characterization of additional actors accounting for the activation of specific signalling pathways at crossing points.
Furthermore, computational tools have to be developed that take into account that expression values among similar groups may not vary 'by chance' but may reflect, along with other subtle changes, specific features of one given donor. Through better stratification, these tools will make it possible to recover genes from the 'bottom of the iceberg'. Identifying these genes should contribute to understanding how skin ages among individuals, thus paving the way for personalized skin care. © 2014 Society of Cosmetic Scientists and the Société Française de Cosmétologie.
Finding gene regulatory network candidates using the gene expression knowledge base.

PubMed

Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin

2014-12-10

Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.
Finding novel relationships with integrated gene-gene association network analysis of Synechocystis sp. PCC 6803 using species-independent text-mining

PubMed Central

Kreula, Sanna M.; Kaewphan, Suwisa; Ginter, Filip

2018-01-01

The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from ‘reading the literature’. The utility of text-mining for well-studied species is obvious though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already ‘known’, and (2) demanding multiple independent evidences for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to (i) discover novel candidate associations between different genes or proteins in the network, and (ii) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource. PMID:29844966
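The two rules of the filter described above can be sketched in a few lines. This is a hedged illustration only: the function name, the gene pairs, the evidence labels and the threshold of two independent sources are assumptions, not details taken from the published resource.

```python
def filter_candidates(associations, known_pairs, min_sources=2):
    """Keep novel gene pairs backed by several independent evidence sources."""
    kept = {}
    for pair, sources in associations.items():
        if pair in known_pairs:
            continue  # rule 1: skip relationships already established
        distinct = sorted(set(sources))  # repeated labels count only once
        if len(distinct) >= min_sources:
            kept[pair] = distinct  # rule 2: multiple independent evidences
    return kept

# Hypothetical gene pairs and evidence labels, for illustration only.
associations = {
    ("geneA", "geneB"): ["text_mining", "coexpression"],
    ("geneA", "geneC"): ["text_mining"],
    ("geneB", "geneD"): ["coexpression", "protein_interaction", "coexpression"],
    ("geneC", "geneD"): ["text_mining", "coexpression"],
}
known_pairs = {("geneC", "geneD")}
candidates = filter_candidates(associations, known_pairs)
```

Here `("geneA", "geneC")` is dropped for having a single evidence source and `("geneC", "geneD")` is dropped for being already known, leaving two candidate associations.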
Systems level analysis of the Chlamydomonas reinhardtii metabolic network reveals variability in evolutionary co-conservation.

PubMed

Chaiboonchoe, Amphun; Ghamsari, Lila; Dohai, Bushra; Ng, Patrick; Khraiwesh, Basel; Jaiswal, Ashish; Jijakli, Kenan; Koussa, Joseph; Nelson, David R; Cai, Hong; Yang, Xinping; Chang, Roger L; Papin, Jason; Yu, Haiyuan; Balaji, Santhanam; Salehi-Ashtiani, Kourosh

2016-07-19

Metabolic networks, which are mathematical representations of organismal metabolism, are reconstructed to provide computational platforms to guide metabolic engineering experiments and explore fundamental questions on metabolism. Systems level analyses, such as interrogation of phylogenetic relationships within the network, can provide further guidance on the modification of metabolic circuitries. Chlamydomonas reinhardtii, a biofuel relevant green alga that has retained key genes with plant, animal, and protist affinities, serves as an ideal model organism to investigate the interplay between gene function and phylogenetic affinities at multiple organizational levels. Here, using detailed topological and functional analyses, coupled with transcriptomics studies on a metabolic network that we have reconstructed for C. reinhardtii, we show that network connectivity has a significant concordance with the co-conservation of genes; however, a distinction between topological and functional relationships is observable within the network. Dynamic and static modes of co-conservation were defined and observed in a subset of gene-pairs across the network topologically. In contrast, genes with predicted synthetic interactions, or genes involved in coupled reactions, show significant enrichment for both shorter and longer phylogenetic distances. Based on our results, we propose that the metabolic network of C. reinhardtii is assembled with an architecture to minimize phylogenetic profile distances topologically, while it includes an expansion of such distances for functionally interacting genes. This arrangement may increase the robustness of C. reinhardtii's network in dealing with varied environmental challenges that the species may face. The defined evolutionary constraints within the network, which identify important pairings of genes in metabolism, may offer guidance on synthetic biology approaches to optimize the production of desirable metabolites.
Systems level analysis of the Chlamydomonas reinhardtii metabolic network reveals variability in evolutionary co-conservation

DOE PAGES

Chaiboonchoe, Amphun; Ghamsari, Lila; Dohai, Bushra; ...

2016-06-14

Metabolic networks, which are mathematical representations of organismal metabolism, are reconstructed to provide computational platforms to guide metabolic engineering experiments and explore fundamental questions on metabolism. Systems level analyses, such as interrogation of phylogenetic relationships within the network, can provide further guidance on the modification of metabolic circuitries. Chlamydomonas reinhardtii, a biofuel relevant green alga that has retained key genes with plant, animal, and protist affinities, serves as an ideal model organism to investigate the interplay between gene function and phylogenetic affinities at multiple organizational levels. Here, using detailed topological and functional analyses, coupled with transcriptomics studies on a metabolic network that we have reconstructed for C. reinhardtii, we show that network connectivity has a significant concordance with the co-conservation of genes; however, a distinction between topological and functional relationships is observable within the network. Dynamic and static modes of co-conservation were defined and observed in a subset of gene-pairs across the network topologically. In contrast, genes with predicted synthetic interactions, or genes involved in coupled reactions, show significant enrichment for both shorter and longer phylogenetic distances. Based on our results, we propose that the metabolic network of C. reinhardtii is assembled with an architecture to minimize phylogenetic profile distances topologically, while it includes an expansion of such distances for functionally interacting genes. This arrangement may increase the robustness of C. reinhardtii's network in dealing with varied environmental challenges that the species may face. As a result, the defined evolutionary constraints within the network, which identify important pairings of genes in metabolism, may offer guidance on synthetic biology approaches to optimize the production of desirable metabolites.
An integrated and comparative approach towards identification, characterization and functional annotation of candidate genes for drought tolerance in sorghum (Sorghum bicolor (L.) Moench).

PubMed

Woldesemayat, Adugna Abdi; Van Heusden, Peter; Ndimba, Bongani K; Christoffels, Alan

2017-12-22

Drought is the most disastrous abiotic stress that severely affects agricultural productivity worldwide. Understanding the biological basis of drought-regulated traits requires identification and in-depth characterization of genetic determinants using model organisms and high-throughput technologies. However, studies on drought tolerance have generally been limited to the traditional candidate gene approach, which targets only a single gene in a pathway that is related to a trait. In this study, we used sorghum, one of the model crops that is well adapted to arid regions, to mine genes and define determinants for drought tolerance using drought expression libraries and RNA-seq data. We provide an integrated and comparative in silico candidate gene identification, characterization and annotation approach, with an emphasis on genes playing a prominent role in conferring drought tolerance in sorghum. A total of 470 non-redundant functionally annotated drought responsive genes (DRGs) were identified using experimental data from drought responses by employing pairwise sequence similarity searches, pathway and InterPro domain analysis, expression profiling and orthology relation. Comparison of the genomic locations between these genes and sorghum quantitative trait loci (QTLs) showed that 40% of these genes were co-localized with QTLs known for drought tolerance. The genome reannotation conducted using the Program to Assemble Spliced Alignment (PASA) resulted in 9.6% of existing single gene models being updated. In addition, 210 putative novel genes were identified using AUGUSTUS and PASA based analysis on the expression dataset. Among these, 50% were single exonic, 69.5% represented drought responsive and 5.7% were complete gene structure models. Analysis of biochemical metabolism revealed 14 metabolic pathways that are related to drought tolerance and also had a strong biological network among the categories of genes involved. 
Identification of these pathways signifies the interplay of biochemical reactions that make up the metabolic network, constituting a fundamental interface for the sorghum defence mechanism against drought stress. This study suggests untapped natural variability in sorghum that could be used for developing drought tolerance. The data presented here may be regarded as an initial reference point in functional and comparative genomics in the Gramineae family.
Gene regulatory network inference using fused LASSO on multiple data sets

PubMed Central

Omranian, Nooshin; Eloundou-Mbebi, Jeanne M. O.; Mueller-Roeber, Bernd; Nikoloski, Zoran

2016-01-01

Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed in a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions. PMID:26864687
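The first two constraints described above (sparse regulators, similar networks across data sets) can be written as a penalized least-squares objective. The following is a minimal sketch of such an objective, not the authors' implementation: the function name, the penalty weights `lam_sparse` and `lam_fuse`, and the toy data are assumptions made for illustration, and the third constraint (favoring genes with similar differential behavior) is left out.

```python
def fused_lasso_objective(X_sets, y_sets, betas, lam_sparse, lam_fuse):
    """Squared-error loss + L1 sparsity + L1 fusion between consecutive data sets.

    X_sets[k] : rows of transcription-factor expression for data set k
    y_sets[k] : expression of the target gene in data set k
    betas[k]  : regulatory coefficients fitted for data set k
    """
    loss = 0.0
    for X, y, beta in zip(X_sets, y_sets, betas):
        for row, target in zip(X, y):
            pred = sum(x * b for x, b in zip(row, beta))
            loss += (target - pred) ** 2
    # Constraint (1): few regulators per gene (L1 norm of every coefficient vector).
    sparsity = sum(abs(b) for beta in betas for b in beta)
    # Constraint (2): coefficients should agree across data sets (fusion penalty).
    fusion = sum(
        abs(b1 - b2)
        for beta_a, beta_b in zip(betas, betas[1:])
        for b1, b2 in zip(beta_a, beta_b)
    )
    return loss + lam_sparse * sparsity + lam_fuse * fusion


# Toy example (hypothetical data): two data sets, two candidate regulators.
X = [[1, 0], [0, 1], [1, 1]]
y = [2, 0, 2]
shared = fused_lasso_objective([X, X], [y, y], [[2, 0], [2, 0]], 0.1, 0.5)
divergent = fused_lasso_objective([X, X], [y, y], [[2, 0], [1, 1]], 0.1, 0.5)
```

In this toy case the coefficient set that is both sparse and identical across the two data sets scores lower than the divergent one, which is the behavior the fusion term is meant to encourage. Minimizing the objective would require a dedicated solver, which is outside the scope of this sketch.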
Co-expression network analysis identified six hub genes in association with metastasis risk and prognosis in hepatocellular carcinoma

PubMed Central

Feng, Juerong; Zhou, Rui; Chang, Ying; Liu, Jing; Zhao, Qiu

2017-01-01

Hepatocellular carcinoma (HCC) has a high incidence and mortality worldwide, and its carcinogenesis and progression are influenced by a complex network of gene interactions. A weighted gene co-expression network was constructed to identify gene modules associated with the clinical traits in HCC (n = 214). Among the 13 modules, high correlation was only found between the red module and metastasis risk (classified by the HCC metastasis gene signature) (R2 = −0.74). Moreover, in the red module, 34 network hub genes for metastasis risk were identified, six of which (ABAT, AGXT, ALDH6A1, CYP4A11, DAO and EHHADH) were also hub nodes in the protein-protein interaction network of the module genes. Thus, a total of six hub genes were identified. In validation, all hub genes showed a negative correlation with the four-stage HCC progression (P for trend < 0.05) in the test set. Furthermore, in the training set, HCC samples with any hub gene lowly expressed demonstrated a higher recurrence rate and poorer survival rate (hazard ratios with 95% confidence intervals > 1). RNA-sequencing data of 142 HCC samples showed consistent results in the prognosis. Gene set enrichment analysis (GSEA) demonstrated that in the samples with any hub gene highly expressed, a total of 24 functional gene sets were enriched, most of which focused on amino acid metabolism and oxidation. In conclusion, co-expression network analysis identified six hub genes in association with HCC metastasis risk and prognosis, which might improve the prognosis by influencing amino acid metabolism and oxidation. PMID:28430663
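A weighted co-expression network of the kind described above is commonly built by raising absolute pairwise correlations to a power and then ranking genes by their summed edge weights. The sketch below illustrates that idea only; the soft-threshold power of 6, the gene labels, and the expression values are assumptions and are not taken from the study.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def soft_threshold_adjacency(expr, beta=6):
    """Edge weight |cor|^beta for every ordered pair of distinct genes."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


def connectivity(adjacency):
    """Sum of edge weights per gene; high values mark candidate hub genes."""
    totals = {}
    for (g, _), weight in adjacency.items():
        totals[g] = totals.get(g, 0.0) + weight
    return totals


# Hypothetical expression profiles across four samples.
expr = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 1, 3, 2]}
adjacency = soft_threshold_adjacency(expr)
hubs = connectivity(adjacency)
```

Here g1 and g2 are perfectly correlated and therefore end up with the highest connectivity, while g3 is only weakly connected. Module detection and the association with clinical traits are not covered by this sketch.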
In-Silico Integration Approach to Identify a Key miRNA Regulating a Gene Network in Aggressive Prostate Cancer

PubMed Central

Colaprico, Antonio; Bontempi, Gianluca; Castiglioni, Isabella

2018-01-01

Like other cancer diseases, prostate cancer (PC) is caused by the accumulation of genetic alterations in the cells that drives malignant growth. These alterations are revealed by gene profiling and copy number alteration (CNA) analysis. Moreover, recent evidence suggests that also microRNAs have an important role in PC development. Despite efforts to profile PC, the alterations (gene, CNA, and miRNA) and biological processes that correlate with disease development and progression remain partially elusive. Many gene signatures proposed as diagnostic or prognostic tools in cancer poorly overlap. The identification of co-expressed genes, that are functionally related, can identify a core network of genes associated with PC with a better reproducibility. By combining different approaches, including the integration of mRNA expression profiles, CNAs, and miRNA expression levels, we identified a gene signature of four genes overlapping with other published gene signatures and able to distinguish, in silico, high Gleason-scored PC from normal human tissue, which was further enriched to 19 genes by gene co-expression analysis. From the analysis of miRNAs possibly regulating this network, we found that hsa-miR-153 was highly connected to the genes in the network. Our results identify a four-gene signature with diagnostic and prognostic value in PC and suggest an interesting gene network that could play a key regulatory role in PC development and progression. Furthermore, hsa-miR-153, controlling this network, could be a potential biomarker for theranostics in high Gleason-scored PC. PMID:29562723
Remodeling of Sensorimotor Brain Connectivity in Gpr88-Deficient Mice.

PubMed

Arefin, Tanzil Mahmud; Mechling, Anna E; Meirsman, Aura Carole; Bienert, Thomas; Hübner, Neele Saskia; Lee, Hsu-Lei; Ben Hamida, Sami; Ehrlich, Aliza; Roquet, Dan; Hennig, Jürgen; von Elverfeldt, Dominik; Kieffer, Brigitte Lina; Harsan, Laura-Adela

2017-10-01

Recent studies have demonstrated that orchestrated gene activity and expression support synchronous activity of brain networks. However, there is a paucity of information on the consequences of single gene function on overall brain functional organization and connectivity and how this translates at the behavioral level. In this study, we combined mouse mutagenesis with functional and structural magnetic resonance imaging (MRI) to determine whether targeted inactivation of a single gene would modify whole-brain connectivity in live animals. The targeted gene encodes GPR88 (G protein-coupled receptor 88), an orphan G protein-coupled receptor enriched in the striatum and previously linked to behavioral traits relevant to neuropsychiatric disorders. Connectivity analysis of Gpr88-deficient mice revealed extensive remodeling of intracortical and cortico-subcortical networks. Most prominent modifications were observed at the level of retrosplenial cortex connectivity, central to the default mode network (DMN) whose alteration is considered a hallmark of many psychiatric conditions. Next, somatosensory and motor cortical networks were most affected. These modifications directly relate to sensorimotor gating deficiency reported in mutant animals and also likely underlie their hyperactivity phenotype. Finally, we identified alterations within hippocampal and dorsal striatum functional connectivity, most relevant to a specific learning deficit that we previously reported in Gpr88 -/- animals. In addition, amygdala connectivity with cortex and striatum was weakened, perhaps underlying the risk-taking behavior of these animals. This is the first evidence demonstrating that GPR88 activity shapes the mouse brain functional and structural connectome. The concordance between connectivity alterations and behavior deficits observed in Gpr88-deficient mice suggests a role for GPR88 in brain communication.
Fine-tuning gene networks using simple sequence repeats

PubMed Central

Egbert, Robert G.; Klavins, Eric

2012-01-01

The parameters in a complex synthetic gene network must be extensively tuned before the network functions as designed. Here, we introduce a simple and general approach to rapidly tune gene networks in Escherichia coli using hypermutable simple sequence repeats embedded in the spacer region of the ribosome binding site. By varying repeat length, we generated expression libraries that incrementally and predictably sample gene expression levels over a 1,000-fold range. We demonstrate the utility of the approach by creating a bistable switch library that programmatically samples the expression space to balance the two states of the switch, and we illustrate the need for tuning by showing that the switch’s behavior is sensitive to host context. Further, we show that mutation rates of the repeats are controllable in vivo for stability or for targeted mutagenesis—suggesting a new approach to optimizing gene networks via directed evolution. This tuning methodology should accelerate the process of engineering functionally complex gene networks. PMID:22927382
A Systems Approach Identifies Networks and Genes Linking Sleep and Stress: Implications for Neuropsychiatric Disorders

PubMed Central

Jiang, Peng; Scarpa, Joseph R.; Fitzpatrick, Karrie; Losic, Bojan; Gao, Vance D.; Hao, Ke; Summa, Keith C.; Yang, He S.; Zhang, Bin; Allada, Ravi; Vitaterna, Martha H.; Turek, Fred W.; Kasarskis, Andrew

2016-01-01

SUMMARY Sleep dysfunction and stress susceptibility are co-morbid complex traits, which often precede and predispose patients to a variety of neuropsychiatric diseases. Here, we demonstrate multi-level organizations of genetic landscape, candidate genes, and molecular networks associated with 328 stress and sleep traits in a chronically stressed population of 338 (C57BL/6J×A/J) F2 mice. We constructed striatal gene co-expression networks, revealing functionally and cell-type specific gene co-regulations important for stress and sleep. Using a composite ranking system, we identified network modules most relevant for 15 independent phenotypic categories, highlighting a mitochondria/synaptic module that links sleep and stress. The key network regulators of this module are overrepresented with genes implicated in neuropsychiatric diseases. Our work suggests the interplay between sleep, stress, and neuropathology emerge from genetic influences on gene expression and their collective organization through complex molecular networks, providing a framework to interrogate the mechanisms underlying sleep, stress susceptibility, and related neuropsychiatric disorders. PMID:25921536
An approach for reduction of false predictions in reverse engineering of gene regulatory networks.

PubMed

Khan, Abhinandan; Saha, Goutam; Pal, Rajat Kumar

2018-05-14

A gene regulatory network discloses the regulatory interactions amongst genes under a particular condition of the human body. The accurate reconstruction of such networks from time-series genetic expression data using computational tools poses a stiff challenge for contemporary computer scientists. This is crucial to facilitate the understanding of the proper functioning of a living organism. Unfortunately, computational methods produce many false predictions along with the correct ones. Investigations in the domain focus on the identification of as many correct regulations as possible in the reverse engineering of gene regulatory networks to make it more reliable and biologically relevant. One way to achieve this is to reduce the number of incorrect predictions in the reconstructed networks. In the present investigation, we have proposed a novel scheme to decrease the number of false predictions by suitably combining several metaheuristic techniques. We have also implemented it using a dataset ensemble approach (i.e., combining multiple datasets). We have employed the proposed methodology on real-world experimental datasets of the SOS DNA Repair network of Escherichia coli and the IRMA network of Saccharomyces cerevisiae. Subsequently, we have experimented upon somewhat larger, in silico networks, namely, DREAM3 and DREAM4 Challenge networks, and 15-gene and 20-gene networks extracted from the GeneNetWeaver database. To study the effect of multiple datasets on the quality of the inferred networks, we have used four datasets in each experiment. The results are encouraging, as the proposed methodology can significantly reduce the number of false predictions without using any supplementary prior biological information for larger gene regulatory networks. It is also observed that if a small amount of prior biological information is incorporated, the results improve further with respect to the prediction of true positives. 
The Prediction of Key Cytoskeleton Components Involved in Glomerular Diseases Based on a Protein-Protein Interaction Network.

PubMed

Ding, Fangrui; Tan, Aidi; Ju, Wenjun; Li, Xuejuan; Li, Shao; Ding, Jie

2016-01-01

Maintenance of the physiological morphologies of different types of cells and tissues is essential for the normal functioning of each system in the human body. Dynamic variations in cell and tissue morphologies depend on accurate adjustments of the cytoskeletal system. The cytoskeletal system in the glomerulus plays a key role in the normal process of kidney filtration. To enhance the understanding of the possible roles of the cytoskeleton in glomerular diseases, we constructed the Glomerular Cytoskeleton Network (GCNet), which shows the protein-protein interaction network in the glomerulus, and identified several possible key cytoskeletal components involved in glomerular diseases. In this study, genes/proteins annotated to the cytoskeleton were detected by Gene Ontology analysis, and glomerulus-enriched genes were selected from nine available glomerular expression datasets. Then, the GCNet was generated by combining these two sets of information. To predict the possible key cytoskeleton components in glomerular diseases, we then examined the common regulation of the genes in GCNet in the context of five glomerular diseases based on their transcriptomic data. As a result, twenty-one cytoskeleton components were highlighted as potential candidates because they were consistently down- or up-regulated in all five glomerular diseases. These candidates were then examined in relation to known glomerular diseases and genes to determine their possible functions and interactions. In addition, the mRNA levels of these candidates were validated in a puromycin aminonucleoside (PAN)-induced rat nephropathy model and were matched with existing Diabetic Nephropathy (DN) transcriptomic data. As a result, 15 of the 21 candidates in the PAN-induced nephropathy model were consistent with our prediction, and 12 of the 21 candidates matched differentially expressed genes in the DN transcriptomic data. 
By providing a novel interaction network and prediction, GCNet contributes to improving the understanding of normal glomerular function and will be useful for detecting target cytoskeleton molecules of interest that may be involved in glomerular diseases in future studies.
The Prediction of Key Cytoskeleton Components Involved in Glomerular Diseases Based on a Protein-Protein Interaction Network

PubMed Central

Ju, Wenjun; Li, Xuejuan; Li, Shao; Ding, Jie

2016-01-01

Maintenance of the physiological morphologies of different types of cells and tissues is essential for the normal functioning of each system in the human body. Dynamic variations in cell and tissue morphologies depend on accurate adjustments of the cytoskeletal system. The cytoskeletal system in the glomerulus plays a key role in the normal process of kidney filtration. To enhance the understanding of the possible roles of the cytoskeleton in glomerular diseases, we constructed the Glomerular Cytoskeleton Network (GCNet), which shows the protein-protein interaction network in the glomerulus, and identified several possible key cytoskeletal components involved in glomerular diseases. In this study, genes/proteins annotated to the cytoskeleton were detected by Gene Ontology analysis, and glomerulus-enriched genes were selected from nine available glomerular expression datasets. Then, the GCNet was generated by combining these two sets of information. To predict the possible key cytoskeleton components in glomerular diseases, we then examined the common regulation of the genes in GCNet in the context of five glomerular diseases based on their transcriptomic data. As a result, twenty-one cytoskeleton components were highlighted as potential candidates because they were consistently down- or up-regulated in all five glomerular diseases. These candidates were then examined in relation to known glomerular diseases and genes to determine their possible functions and interactions. In addition, the mRNA levels of these candidates were validated in a puromycin aminonucleoside (PAN)-induced rat nephropathy model and were matched with existing Diabetic Nephropathy (DN) transcriptomic data. As a result, 15 of the 21 candidates in the PAN-induced nephropathy model were consistent with our prediction, and 12 of the 21 candidates matched differentially expressed genes in the DN transcriptomic data. 
By providing a novel interaction network and prediction, GCNet contributes to improving the understanding of normal glomerular function and will be useful for detecting target cytoskeleton molecules of interest that may be involved in glomerular diseases in future studies. PMID:27227331
ICan: an integrated co-alteration network to identify ovarian cancer-related genes.

PubMed

Zhou, Yuanshuai; Liu, Yongjing; Li, Kening; Zhang, Rui; Qiu, Fujun; Zhao, Ning; Xu, Yan

2015-01-01

Over the last decade, an increasing number of integrative studies on cancer-related genes have been published. Integrative analyses aim to overcome the limitation of a single data type, and provide a more complete view of carcinogenesis. The vast majority of these studies used sample-matched data of gene expression and copy number to investigate the impact of copy number alteration on gene expression, and to predict and prioritize candidate oncogenes and tumor suppressor genes. However, correlations between genes were neglected in these studies. Our work aimed to evaluate the co-alteration of copy number, methylation and expression, allowing us to identify cancer-related genes and essential functional modules in cancer. We built the Integrated Co-alteration network (ICan) based on multi-omics data, and analyzed the network to uncover cancer-related genes. After comparison with random networks, we identified 155 ovarian cancer-related genes, including well-known (TP53, BRCA1, RB1 and PTEN) and also novel cancer-related genes, such as PDPN and EphA2. We compared the results with a conventional method: CNAmet, and obtained a significantly better area under the curve value (ICan: 0.8179, CNAmet: 0.5183). In this paper, we describe a framework to find cancer-related genes based on an Integrated Co-alteration network. Our results proved that ICan could precisely identify candidate cancer genes and provide increased mechanistic understanding of carcinogenesis. This work suggested a new research direction for biological network analyses involving multi-omics data.
ICan: An Integrated Co-Alteration Network to Identify Ovarian Cancer-Related Genes

PubMed Central

Zhou, Yuanshuai; Liu, Yongjing; Li, Kening; Zhang, Rui; Qiu, Fujun; Zhao, Ning; Xu, Yan

2015-01-01

Background Over the last decade, an increasing number of integrative studies on cancer-related genes have been published. Integrative analyses aim to overcome the limitation of a single data type, and provide a more complete view of carcinogenesis. The vast majority of these studies used sample-matched data of gene expression and copy number to investigate the impact of copy number alteration on gene expression, and to predict and prioritize candidate oncogenes and tumor suppressor genes. However, correlations between genes were neglected in these studies. Our work aimed to evaluate the co-alteration of copy number, methylation and expression, allowing us to identify cancer-related genes and essential functional modules in cancer. Results We built the Integrated Co-alteration network (ICan) based on multi-omics data, and analyzed the network to uncover cancer-related genes. After comparison with random networks, we identified 155 ovarian cancer-related genes, including well-known (TP53, BRCA1, RB1 and PTEN) and also novel cancer-related genes, such as PDPN and EphA2. We compared the results with a conventional method: CNAmet, and obtained a significantly better area under the curve value (ICan: 0.8179, CNAmet: 0.5183). Conclusion In this paper, we describe a framework to find cancer-related genes based on an Integrated Co-alteration network. Our results proved that ICan could precisely identify candidate cancer genes and provide increased mechanistic understanding of carcinogenesis. This work suggested a new research direction for biological network analyses involving multi-omics data. PMID:25803614
Regional and temporal differences in gene expression of LH(BETA)T(AG) retinoblastoma tumors.

PubMed

Houston, Samuel K; Pina, Yolanda; Clarke, Jennifer; Koru-Sengul, Tulay; Scott, William K; Nathanson, Lubov; Schefler, Amy C; Murray, Timothy G

2011-07-23

The purpose of this study was to evaluate by microarray the hypothesis that LH(BETA)T(AG) retinoblastoma tumors exhibit regional and temporal variations in gene expression. LH(BETA)T(AG) mice aged 12, 16, and 20 weeks were euthanatized (n = 9). Specimens were taken from five tumor areas (apex, anterior lateral, center, base, and posterior lateral). Samples were hybridized to gene microarrays. The data were preprocessed and analyzed, and genes with a P < 0.01, according to the ANOVA models, and a log(2)-fold change >2.5 were considered to be differentially expressed. Differentially expressed genes were analyzed for overlap with known networks by using pathway analysis tools. There were significant temporal (P < 10(-8)) and regional differences in gene expression for LH(BETA)T(AG) retinoblastoma tumors. At P < 0.01 and log(2)-fold change >2.5, there were significant changes in gene expression of 190 genes apically, 84 genes anterolaterally, 126 genes posteriorly, 56 genes centrally, and 134 genes at the base. Differentially expressed genes overlapped with known networks, with significant involvement in regulation of cellular proliferation and growth, response to oxygen levels and hypoxia, regulation of cellular processes, cellular signaling cascades, and angiogenesis. There are significant temporal and regional variations in the LH(BETA)T(AG) retinoblastoma model. Differentially expressed genes overlap with key pathways that may play pivotal roles in murine retinoblastoma development. These findings suggest the mechanisms involved in tumor growth and progression in murine retinoblastoma tumors and identify pathways for analysis at a functional level, to determine significance in human retinoblastoma. Microarray analysis of LH(BETA)T(AG) retinal tumors showed significant regional and temporal variations in gene expression, including dysregulation of genes involved in hypoxic responses and angiogenesis.
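The selection rule stated above (P < 0.01 and a log2-fold change greater than 2.5) amounts to a two-condition filter. The sketch below shows one way to apply it; the function name, the gene labels, and the values are hypothetical, and the use of the absolute log2-fold change (so that strongly down-regulated genes also pass) is an assumption, since the abstract does not say whether the threshold is two-sided.

```python
def differentially_expressed(results, p_cutoff=0.01, lfc_cutoff=2.5):
    """Return genes passing both the P-value and the log2-fold-change cutoffs.

    results maps gene name -> (p_value, log2_fold_change).
    Assumption: the fold-change cutoff is applied to the absolute value.
    """
    return [
        gene
        for gene, (p_value, log2_fc) in results.items()
        if p_value < p_cutoff and abs(log2_fc) > lfc_cutoff
    ]


# Hypothetical per-gene statistics for one tumor region.
region_results = {
    "gene_a": (0.001, 3.1),   # passes both cutoffs
    "gene_b": (0.2, 4.0),     # fails the P-value cutoff
    "gene_c": (0.005, -2.7),  # passes (strongly down-regulated)
    "gene_d": (0.004, 1.2),   # fails the fold-change cutoff
}
selected = differentially_expressed(region_results)
```

Running the filter once per tumor region would give region-specific gene lists of the kind counted in the abstract.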
Modeling gene regulatory network motifs using statecharts

PubMed Central

2012-01-01

Background Gene regulatory networks are widely used by biologists to describe the interactions among genes, proteins and other components at the intra-cellular level. Recently, a great effort has been devoted to give gene regulatory networks a formal semantics based on existing computational frameworks. For this purpose, we consider Statecharts, which are a modular, hierarchical and executable formal model widely used to represent software systems. We use Statecharts for modeling small and recurring patterns of interactions in gene regulatory networks, called motifs. Results We present an improved method for modeling gene regulatory network motifs using Statecharts and we describe the successful modeling of several motifs, including those which could not be modeled or whose models could not be distinguished using the method of a previous proposal. We model motifs in an easy and intuitive way by taking advantage of the visual features of Statecharts. Our modeling approach is able to simulate some interesting temporal properties of gene regulatory network motifs: the delay in the activation and the deactivation of the "output" gene in the coherent type-1 feedforward loop, the pulse in the incoherent type-1 feedforward loop, the bistability nature of double positive and double negative feedback loops, the oscillatory behavior of the negative feedback loop, and the "lock-in" effect of positive autoregulation. Conclusions We present a Statecharts-based approach for the modeling of gene regulatory network motifs in biological systems. The basic motifs used to build more complex networks (that is, simple regulation, reciprocal regulation, feedback loop, feedforward loop, and autoregulation) can be faithfully described and their temporal dynamics can be analyzed. PMID:22536967
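The delayed activation of the output gene in the coherent type-1 feedforward loop can be illustrated with a very small discrete-time model. The sketch below is a plain Boolean simulation, not a Statecharts model and not the authors' method: it assumes that X activates Y, that Z requires both X and Y (AND logic), and that Y lags X by one time step.

```python
def simulate_c1_ffl(x_inputs):
    """Boolean coherent type-1 feedforward loop with AND logic at the output.

    X activates Y with a one-step lag; Z is on only when X is on and Y was
    already on at the previous step. Returns the trace of Z over time.
    """
    y, z = 0, 0
    z_trace = []
    for x in x_inputs:
        # The right-hand side is evaluated first, so Z sees the previous value of Y.
        y, z = x, int(x and y)
        z_trace.append(z)
    return z_trace


# X switches on at step 1 and off at step 4.
x_signal = [0, 1, 1, 1, 0, 0]
z_signal = simulate_c1_ffl(x_signal)  # [0, 0, 1, 1, 0, 0]
```

Z turns on one step after X because it has to wait for Y, but it turns off as soon as X is removed. With OR logic at the output the asymmetry would be reversed, giving a delay in deactivation instead.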
Differential C3NET reveals disease networks of direct physical interactions

PubMed Central

2011-01-01

Background Genes might have different gene interactions in different cell conditions, which might be mapped into different networks. Differential analysis of gene networks allows spotting condition-specific interactions that, for instance, form disease networks if the conditions are a disease, such as cancer, and normal. This could potentially allow developing better and more precisely targeted drugs to cure cancer. Differential network analysis with direct physical gene interactions needs to be explored in this endeavour. Results C3NET is a recently introduced information theory based gene network inference algorithm that infers direct physical gene interactions from expression data, and it was shown to give consistently higher inference performance over various networks than its competitors. In this paper, we present DC3net, an approach to employ C3NET in inferring disease networks. We apply DC3net to synthetic and real prostate cancer datasets, with promising results. With loose cutoffs, we predicted 18583 interactions from tumor and normal samples in total. Although there are no reference interaction databases for the specific conditions of our samples in the literature, we found verifications for 54 of our predicted direct physical interactions from only four of the biological interaction databases. As an example, we predicted that RAD50 and TRF2 have a prostate cancer-specific interaction, which is supported by the literature. It is known that the RAD50 complex associates with TRF2 in the S phase of the cell cycle, which suggests that this predicted interaction may promote telomere maintenance in tumor cells in order to allow tumor cells to divide indefinitely. Our enrichment analysis suggests that the identified tumor specific gene interactions may be potentially important in driving the growth in prostate cancer. 
Additionally, we found that the highest connected subnetwork of our predicted tumor specific network is enriched for all proliferation genes, which further suggests that the genes in this network may serve in the process of oncogenesis. Conclusions Our approach reveals disease specific interactions. It may help to make experimental follow-up studies more cost and time efficient by prioritizing disease relevant parts of the global gene network. PMID:21777411
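The differential step described above, keeping interactions that appear in the disease network but not in the normal one, can be sketched as a set difference over undirected edges. This is only an illustration of that comparison: it assumes the two networks have already been inferred, it does not reproduce C3NET or DC3net, and the gene labels are placeholders.

```python
def edge_set(pairs):
    """Represent undirected edges so that (a, b) and (b, a) compare equal."""
    return {frozenset(pair) for pair in pairs}


def condition_specific(tumor_pairs, normal_pairs):
    """Edges present in the tumor network and absent from the normal network."""
    return edge_set(tumor_pairs) - edge_set(normal_pairs)


# Placeholder networks inferred separately from tumor and normal samples.
tumor_network = [("geneA", "geneB"), ("geneC", "geneD")]
normal_network = [("geneD", "geneC")]
tumor_specific = condition_specific(tumor_network, normal_network)
```

Using `frozenset` for each edge makes the comparison independent of the order in which the two genes are listed.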
Using PPI network autocorrelation in hierarchical multi-label classification trees for gene function prediction.

PubMed

Stojanova, Daniela; Ceci, Michelangelo; Malerba, Donato; Dzeroski, Saso

2013-09-26

Ontologies and catalogs of gene functions, such as the Gene Ontology (GO) and MIPS-FUN, assume that functional classes are organized hierarchically, that is, general functions include more specific ones. This has recently motivated the development of several machine learning algorithms for gene function prediction that leverage this hierarchical organization, where instances may belong to multiple classes. In addition, it is possible to exploit relationships among examples, since it is plausible that related genes tend to share functional annotations. Although these relationships have been identified and extensively studied in the area of protein-protein interaction (PPI) networks, they have not received much attention in hierarchical and multi-class gene function prediction. Relations between genes introduce autocorrelation in functional annotations and violate the assumption that instances are independently and identically distributed (i.i.d.), which underlies most machine learning algorithms. Although the explicit consideration of these relations brings additional complexity to the learning process, we expect substantial benefits in predictive accuracy of learned classifiers. This article demonstrates the benefits (in terms of predictive accuracy) of considering autocorrelation in multi-class gene function prediction. We develop a tree-based algorithm for considering network autocorrelation in the setting of Hierarchical Multi-label Classification (HMC). We empirically evaluate the proposed algorithm, called NHMC (Network Hierarchical Multi-label Classification), on 12 yeast datasets using each of the MIPS-FUN and GO annotation schemes and exploiting 2 different PPI networks. The results clearly show that taking autocorrelation into account improves the predictive performance of the learned models for predicting gene function. 
Our newly developed method for HMC takes into account network information in the learning phase: When used for gene function prediction in the context of PPI networks, the explicit consideration of network autocorrelation increases the predictive performance of the learned models. Overall, we found that this holds for different gene features/ descriptions, functional annotation schemes, and PPI networks: Best results are achieved when the PPI network is dense and contains a large proportion of function-relevant interactions.
Data Imputation in Epistatic MAPs by Network-Guided Matrix Completion

PubMed Central

Žitnik, Marinka; Zupan, Blaž

2015-01-01

Epistatic miniarray profile (E-MAP) is a popular large-scale genetic interaction discovery platform. E-MAPs benefit from quantitative output, which makes it possible to detect subtle interactions with greater precision. However, due to the limits of biotechnology, E-MAP studies fail to measure genetic interactions for up to 40% of gene pairs in an assay. Missing measurements can be recovered by computational techniques for data imputation, in this way completing the interaction profiles and enabling downstream analysis algorithms that could otherwise be sensitive to missing data values. We introduce a new interaction data imputation method called network-guided matrix completion (NG-MC). The core part of NG-MC is low-rank probabilistic matrix completion that incorporates prior knowledge presented as a collection of gene networks. NG-MC assumes that interactions are transitive, such that latent gene interaction profiles inferred by NG-MC depend on the profiles of their direct neighbors in gene networks. As the NG-MC inference algorithm progresses, it propagates latent interaction profiles through each of the networks and updates gene network weights toward improved prediction. In a study with four different E-MAP data assays and considered protein–protein interaction and gene ontology similarity networks, NG-MC significantly surpassed existing alternative techniques. Inclusion of information from gene networks also allowed NG-MC to predict interactions for genes that were not included in original E-MAP assays, a task that could not be considered by current imputation approaches. PMID:25658751
Prediction and Testing of Biological Networks Underlying Intestinal Cancer

PubMed Central

Mariadason, John M.; Wang, Donghai; Augenlicht, Leonard H.; Chance, Mark R.

2010-01-01

Colorectal cancer progresses through an accumulation of somatic mutations, some of which reside in so-called “driver” genes that provide a growth advantage to the tumor. To identify points of intersection between driver gene pathways, we implemented a network analysis framework using protein interactions to predict likely connections – both precedented and novel – between key driver genes in cancer. We applied the framework to find significant connections between two genes, Apc and Cdkn1a (p21), known to be synergistic in tumorigenesis in mouse models. We then assessed the functional coherence of the resulting Apc-Cdkn1a network by engineering in vivo single node perturbations of the network: mouse models mutated individually at Apc (Apc1638N+/−) or Cdkn1a (Cdkn1a−/−), followed by measurements of protein and gene expression changes in intestinal epithelial tissue. We hypothesized that if the predicted network is biologically coherent (functional), then the predicted nodes should associate more specifically with dysregulated genes and proteins than stochastically selected genes and proteins. The predicted Apc-Cdkn1a network was significantly perturbed at the mRNA-level by both single gene knockouts, and the predictions were also strongly supported based on physical proximity and mRNA coexpression of proteomic targets. These results support the functional coherence of the proposed Apc-Cdkn1a network and also demonstrate how network-based predictions can be statistically tested using high-throughput biological data. PMID:20824133

Network representations of immune system complexity

PubMed Central

Subramanian, Naeha; Torabi-Parizi, Parizad; Gottschalk, Rachel A.; Germain, Ronald N.; Dutta, Bhaskar

2015-01-01

The mammalian immune system is a dynamic multi-scale system composed of a hierarchically organized set of molecular, cellular and organismal networks that act in concert to promote effective host defense. These networks range from those involving gene regulatory and protein-protein interactions underlying intracellular signaling pathways and single cell responses to increasingly complex networks of in vivo cellular interaction, positioning and migration that determine the overall immune response of an organism. Immunity is thus not the product of simple signaling events but rather non-linear behaviors arising from dynamic, feedback-regulated interactions among many components. One of the major goals of systems immunology is to quantitatively measure these complex multi-scale spatial and temporal interactions, permitting development of computational models that can be used to predict responses to perturbation. Recent technological advances permit collection of comprehensive datasets at multiple molecular and cellular levels while advances in network biology support representation of the relationships of components at each level as physical or functional interaction networks. The latter facilitate effective visualization of patterns and recognition of emergent properties arising from the many interactions of genes, molecules, and cells of the immune system. We illustrate the power of integrating ‘omics’ and network modeling approaches for unbiased reconstruction of signaling and transcriptional networks with a focus on applications involving the innate immune system. We further discuss future possibilities for reconstruction of increasingly complex cellular and organism-level networks and development of sophisticated computational tools for prediction of emergent immune behavior arising from the concerted action of these networks. PMID:25625853
Mathematical inference and control of molecular networks from perturbation experiments

NASA Astrophysics Data System (ADS)

Mohammed-Rasheed, Mohammed

One of the main challenges facing biologists and mathematicians in the post-genomic era is to understand the behavior of molecular networks and harness this understanding into an educated intervention of the cell. The cell maintains its function via an elaborate network of interconnecting positive and negative feedback loops of genes, RNA and proteins that send different signals to a large number of pathways and molecules. These structures are referred to as genetic regulatory networks (GRNs) or molecular networks. GRNs can be viewed as dynamical systems with inherent properties and mechanisms, such as steady-state equilibria and stability, that determine the behavior of the cell. The biological relevance of the mathematical concepts is important, as they may predict the differentiation of a stem cell, the maintenance of a normal cell, the development of cancer and its aberrant behavior, and the design of drugs and response to therapy. Uncovering the underlying GRN structure from gene/protein expression data, e.g., microarrays or perturbation experiments, is called inference or reverse engineering of the molecular network. Because of the high cost and time-consuming nature of biological experiments, the number of available measurements or experiments is very small compared to the number of molecules (genes, RNA and proteins). In addition, the observations are noisy, where the noise is due to measurement imperfections as well as the inherent stochasticity of genetic expression levels. Intra-cellular activities and extra-cellular environmental attributes are another source of variability. Thus, the inference of GRNs is, in general, an under-determined problem with a highly noisy set of observations. The ultimate goal of GRN inference and analysis is to be able to intervene within the network, in order to force it away from undesirable cellular states and into desirable ones.
However, it remains a major challenge to design optimal intervention strategies in order to affect the time evolution of molecular activity in a desirable manner. In this proposal, we address both the inference and control problems of GRNs. In the first part of the thesis, we consider the control problem. We assume that we are given a general topology network structure, whose dynamics follow a discrete-time Markov chain model. We subsequently develop a comprehensive framework for optimal perturbation control of the network. The aim of the perturbation is to drive the network away from undesirable steady-states and to force it to converge to a unique desirable steady-state. The proposed framework does not make any assumptions about the topology of the initial network (e.g., ergodicity, weak and strong connectivity), and is thus applicable to general topology networks. We define the optimal perturbation as the minimum-energy perturbation measured in terms of the Frobenius norm between the initial and perturbed networks. We subsequently demonstrate that there exists at most one optimal perturbation that forces the network into the desirable steady-state. In the event where the optimal perturbation does not exist, we construct a family of sub-optimal perturbations that approximate the optimal solution arbitrarily closely. In the second part of the thesis, we address the inference problem of GRNs from time series data. We model the dynamics of the molecules using a system of ordinary differential equations corrupted by additive white noise. For large-scale networks, we formulate the inference problem as a constrained maximum likelihood estimation problem. We derive the molecular interactions that maximize the likelihood function while constraining the network to be sparse. We further propose a procedure to recover weak interactions based on the Bayesian information criterion. 
For small-size networks, we investigated the inference of a globally stable 7-gene melanoma genetic regulatory network from genetic perturbation experiments. We considered five melanoma cell lines, which exhibit different motility/invasion behavior under the same perturbation experiment of gene Wnt5a. The results of the simulations validate both the steady state levels and the experimental data of the perturbation experiments of all five cell lines. The goal of this study is to answer important questions that link the response of the network to perturbations, as measured by the experiments, to its structure, i.e., connectivity. Answers to these questions offer novel insights into the structure of networks and how they react to perturbations.
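The control setting described above (a discrete-time Markov chain, a desired steady state, and perturbation "energy" measured by the Frobenius norm) can be illustrated with a minimal sketch. This is not the thesis's optimal-perturbation construction; it only shows how one might compute the steady state of a given transition matrix and the Frobenius distance between an initial and a candidate perturbed network. The function names and the 2-state matrices are hypothetical, and power iteration assumes an ergodic chain, a simplification the thesis itself does not require.

```python
from math import sqrt


def stationary_distribution(P, iterations=10_000, tol=1e-12):
    """Steady state of a row-stochastic matrix P by power iteration.

    Assumes the chain is ergodic, so that the iteration converges
    to a unique distribution (an assumption of this sketch).
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def frobenius_distance(P, Q):
    """Frobenius norm of (Q - P): the 'energy' of moving from P to Q."""
    return sqrt(sum((p - q) ** 2
                    for row_p, row_q in zip(P, Q)
                    for p, q in zip(row_p, row_q)))


# Hypothetical 2-state example: the initial network favours state 0,
# the candidate perturbed network favours state 1.
P_initial = [[0.9, 0.1],
             [0.5, 0.5]]
P_perturbed = [[0.5, 0.5],
               [0.1, 0.9]]

print(stationary_distribution(P_initial))
print(stationary_distribution(P_perturbed))
print(frobenius_distance(P_initial, P_perturbed))
```

Comparing several candidate perturbed matrices that share the same desired steady state by their Frobenius distance to the initial matrix gives a rough feel for what "minimum-energy" means in this setting.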
An in silico assessment of gene function and organization of the phenylpropanoid pathway metabolic networks in Arabidopsis thaliana and limitations thereof

NASA Technical Reports Server (NTRS)

Costa, Michael A.; Collins, R. Eric; Anterola, Aldwin M.; Cochrane, Fiona C.; Davin, Laurence B.; Lewis, Norman G.

2003-01-01

The Arabidopsis genome sequencing in 2000 gave to science the first blueprint of a vascular plant. Its successful completion also prompted the US National Science Foundation to launch the Arabidopsis 2010 initiative, the goal of which is to identify the function of each gene by 2010. In this study, an exhaustive analysis of The Institute for Genomic Research (TIGR) and The Arabidopsis Information Resource (TAIR) databases, together with all currently compiled EST sequence data, was carried out in order to determine to what extent the various metabolic networks from phenylalanine ammonia lyase (PAL) to the monolignols were organized and/or could be predicted. In these databases, there are some 65 genes which have been annotated as encoding putative enzymatic steps in monolignol biosynthesis, although many of them have only very low homology to monolignol pathway genes of known function in other plant systems. Our detailed analysis revealed that presently only 13 genes (two PALs, a cinnamate-4-hydroxylase, a p-coumarate-3-hydroxylase, a ferulate-5-hydroxylase, three 4-coumarate-CoA ligases, a cinnamic acid O-methyl transferase, two cinnamoyl-CoA reductases and two cinnamyl alcohol dehydrogenases) can be classified as having a bona fide (definitive) function; the remaining 52 genes currently have undetermined physiological roles. The EST database entries for this particular set of genes also provided little new insight into how the monolignol pathway was organized in the different tissues and organs, this being perhaps a consequence of both limitations in how tissue samples were collected and in the incomplete nature of the EST collections.
This analysis thus underscores the fact that even with genomic sequencing, presumed to provide the entire suite of putative genes in the monolignol-forming pathway, a very large effort needs to be conducted to establish actual catalytic roles (including enzyme versatility), as well as the physiological function(s) for each member of the (multi)gene families present and the metabolic networks that are operative. Additionally, one key to identifying physiological functions for many of these (and other) unknown genes, and their corresponding metabolic networks, awaits the development of technologies to comprehensively study molecular processes at the single cell level in particular tissues and organs, in order to establish the actual metabolic context.
An integrated approach to infer dynamic protein-gene interactions - A case study of the human P53 protein.

PubMed

Wang, Junbai; Wu, Qianqian; Hu, Xiaohua Tony; Tian, Tianhai

2016-11-01

Investigating the dynamics of genetic regulatory networks through high throughput experimental data, such as microarray gene expression profiles, is a very important but challenging task. One of the major hindrances in building detailed mathematical models for genetic regulation is the large number of unknown model parameters. To tackle this challenge, a new integrated method is proposed by combining a top-down approach and a bottom-up approach. First, the top-down approach uses probabilistic graphical models to predict the network structure of DNA repair pathway that is regulated by the p53 protein. Two networks are predicted, namely a network of eight genes with eight inferred interactions and an extended network of 21 genes with 17 interactions. Then, the bottom-up approach using differential equation models is developed to study the detailed genetic regulations based on either a fully connected regulatory network or a gene network obtained by the top-down approach. Model simulation error, parameter identifiability and robustness property are used as criteria to select the optimal network. Simulation results together with permutation tests of input gene network structures indicate that the prediction accuracy and robustness property of the two predicted networks using the top-down approach are better than those of the corresponding fully connected networks. In particular, the proposed approach reduces computational cost significantly for inferring model parameters. Overall, the new integrated method is a promising approach for investigating the dynamics of genetic regulation. Copyright © 2016 Elsevier Inc. All rights reserved.
Characterizing mutation-expression network relationships in multiple cancers.

PubMed

Ghazanfar, Shila; Yang, Jean Yee Hwa

2016-08-01

Data made available through large cancer consortia like The Cancer Genome Atlas make for a rich source of information to be studied across and between cancers. In recent years, network approaches have been applied to such data in uncovering the complex interrelationships between mutational and expression profiles, but lack direct testing for expression changes via mutation. In this pan-cancer study we analyze mutation and gene expression information in an integrative manner by considering the networks generated by testing for differences in expression in direct association with specific mutations. We relate our findings among the 19 cancers examined to identify commonalities and differences as well as their characteristics. Using somatic mutation and gene expression information across 19 cancers, we generated mutation-expression networks per cancer. On evaluation we found that our generated networks were significantly enriched for known cancer-related genes, such as skin cutaneous melanoma (p<0.01 using Network of Cancer Genes 4.0). Our framework identified that while different cancers contained commonly mutated genes, there was little concordance between associated gene expression changes among cancers. Comparison between cancers showed a greater overlap of network nodes for cancers with higher overall non-silent mutation load, compared to those with a lower overall non-silent mutation load. This study offers a framework that explores network information through co-analysis of somatic mutations and gene expression profiles. Our pan-cancer application of this approach suggests that while mutations are frequently common among cancer types, the impact they have on the surrounding networks via gene expression changes varies. Despite this finding, there are some cancers for which mutation-associated network behaviour appears to be similar: suggesting a potential framework for uncovering related cancers for which similar therapeutic strategies may be applicable. 
Our framework for understanding relationships among cancers has been integrated into an interactive R Shiny application, PAn Cancer Mutation Expression Networks (PACMEN), containing dynamic and static network visualization of the mutation-expression networks. PACMEN also features tools for further examination of network topology characteristics among cancers. Copyright © 2016 Elsevier Ltd. All rights reserved.
Towards systems genetic analyses in barley: Integration of phenotypic, expression and genotype data into GeneNetwork

PubMed Central

Druka, Arnis; Druka, Ilze; Centeno, Arthur G; Li, Hongqiang; Sun, Zhaohui; Thomas, William TB; Bonar, Nicola; Steffenson, Brian J; Ullrich, Steven E; Kleinhofs, Andris; Wise, Roger P; Close, Timothy J; Potokina, Elena; Luo, Zewei; Wagner, Carola; Schweizer, Günther F; Marshall, David F; Kearsey, Michael J; Williams, Robert W; Waugh, Robbie

2008-01-01

Background: A typical genetical genomics experiment results in four separate data sets: genotype, gene expression, higher-order phenotypic data, and metadata that describe the protocols, processing and the array platform. Used in concert, these data sets provide the opportunity to perform genetic analysis at a systems level. Their predictive power is largely determined by the gene expression dataset where tens of millions of data points can be generated using currently available mRNA profiling technologies. Such large, multidimensional data sets often have value beyond that extracted during their initial analysis and interpretation, particularly if conducted on widely distributed reference genetic materials. Besides quality and scale, access to the data is of primary importance as accessibility potentially allows the extraction of considerable added value from the same primary dataset by the wider research community. Although the number of genetical genomics experiments in different plant species is rapidly increasing, none to date has been presented in a form that allows quick and efficient on-line testing for possible associations between genes, loci and traits of interest by an entire research community. Description: Using a reference population of 150 recombinant doubled haploid barley lines we generated novel phenotypic, mRNA abundance and SNP-based genotyping data sets, added them to a considerable volume of legacy trait data and entered them into GeneNetwork. GeneNetwork is a unified on-line analytical environment that enables the user to test genetic hypotheses about how component traits, such as mRNA abundance, may interact to condition more complex biological phenotypes (higher-order traits). Here we describe these barley data sets and demonstrate some of the functionalities GeneNetwork provides as an easily accessible and integrated analytical environment for exploring them.
Conclusion: By integrating barley genotypic, phenotypic and mRNA abundance data sets directly within GeneNetwork's analytical environment we provide simple web access to the data for the research community. In this environment, a combination of correlation analysis and linkage mapping provides the potential to identify and substantiate gene targets for saturation mapping and positional cloning. By integrating datasets from an unsequenced crop plant (barley) in a database that has been designed for an animal model species (mouse) with a well established genome sequence, we prove the importance of the concept and practice of modular development and interoperability of software engineering for biological data sets. PMID:19017390
Combinatorial influence of environmental parameters on transcription factor activity

PubMed Central

Knijnenburg, T.A.; Wessels, L.F.A.; Reinders, M.J.T.

2008-01-01

Motivation: Cells receive a wide variety of environmental signals, which are often processed combinatorially to generate specific genetic responses. Changes in transcript levels, as observed across different environmental conditions, can, to a large extent, be attributed to changes in the activity of transcription factors (TFs). However, in unraveling these transcription regulation networks, the actual environmental signals are often not incorporated into the model, simply because they have not been measured. The unquantified heterogeneity of the environmental parameters across microarray experiments frustrates regulatory network inference. Results: We propose an inference algorithm that models the influence of environmental parameters on gene expression. The approach is based on a yeast microarray compendium of chemostat steady-state experiments. Chemostat cultivation enables the accurate control and measurement of many of the key cultivation parameters, such as nutrient concentrations, growth rate and temperature. The observed transcript levels are explained by inferring the activity of TFs in response to combinations of cultivation parameters. The interplay between activated enhancers and repressors that bind a gene promoter determines the possible up- or downregulation of the gene. The model is translated into a linear integer optimization problem. The resulting regulatory network identifies the combinatorial effects of environmental parameters on TF activity and gene expression. Availability: The Matlab code is available from the authors upon request. Contact: t.a.knijnenburg@tudelft.nl Supplementary information: Supplementary data are available at Bioinformatics online. PMID:18586711
Gene regulatory and signaling networks exhibit distinct topological distributions of motifs

NASA Astrophysics Data System (ADS)

Ferreira, Gustavo Rodrigues; Nakaya, Helder Imoto; Costa, Luciano da Fontoura

2018-04-01

The biological processes of cellular decision making and differentiation involve a plethora of signaling pathways and gene regulatory circuits. These networks in turn exhibit a multitude of motifs playing crucial parts in regulating network activity. Here we compare the topological placement of motifs in gene regulatory and signaling networks and observe that it suggests different evolutionary strategies in motif distribution for distinct cellular subnetworks.
In Silico Enhancing M. tuberculosis Protein Interaction Networks in STRING To Predict Drug-Resistance Pathways and Pharmacological Risks.

PubMed

Mei, Suyu

2018-05-04

Bacterial protein-protein interaction (PPI) networks are important for revealing the machinery of signal transduction and drug resistance within bacterial cells. The database STRING has collected a large number of bacterial pathogen PPI networks, but most of the data are of low quality without being experimentally or computationally validated, thus restricting their further biomedical applications. We exploit the experimental data via four solutions to enhance the quality of M. tuberculosis H37Rv (MTB) PPI networks in STRING. Computational results show that the experimental data derived jointly by two-hybrid and copurification approaches are the most reliable to train an L2-regularized logistic regression model for MTB PPI network validation. On the basis of the validated MTB PPI networks, we further study three problems via a breadth-first graph search algorithm: (1) discovery of MTB drug-resistance pathways through searching for the paths between known drug-target genes and drug-resistance genes, (2) choosing potential cotarget genes via searching for the critical genes located on multiple pathways, and (3) choosing essential drug-target genes via analysis of network degree distribution. In addition, we further combine the validated MTB PPI networks with human PPI networks to analyze the potential pharmacological risks of known and candidate drug-target genes from the point of view of system pharmacology. The evidence from protein structure alignment demonstrates that the drugs that act on MTB target genes could also adversely act on human signaling pathways.
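Problems (1) and (2) above can be sketched with a plain breadth-first search over an undirected PPI graph. The following is a minimal illustration, not the author's implementation: the helper names and the toy gene labels (T1, T2, X, R1, R2) are hypothetical, and "critical genes" are approximated here as intermediate genes that appear on more than one shortest target-to-resistance path.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency map from a list of (gene_a, gene_b) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path(adjacency, source, target):
    """Breadth-first search; returns one shortest path as a list, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        # Sorted iteration keeps the result deterministic.
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def genes_on_multiple_paths(adjacency, target_genes, resistance_genes):
    """Count intermediate genes over all target-to-resistance shortest paths
    and keep those that occur on more than one path."""
    counts = {}
    for t in target_genes:
        for r in resistance_genes:
            path = shortest_path(adjacency, t, r)
            if path:
                for gene in path[1:-1]:  # skip the two endpoints
                    counts[gene] = counts.get(gene, 0) + 1
    return {gene: c for gene, c in counts.items() if c > 1}


# Toy network with hypothetical labels: two drug targets, two resistance genes.
toy = build_adjacency([("T1", "X"), ("T2", "X"), ("X", "R1"), ("X", "R2")])
print(shortest_path(toy, "T1", "R1"))
print(genes_on_multiple_paths(toy, ["T1", "T2"], ["R1", "R2"]))
```

Because breadth-first search returns only one shortest path per pair, a gene that lies on alternative equally short routes may be undercounted; enumerating all shortest paths would be a natural refinement.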
The Reconstruction and Analysis of Gene Regulatory Networks.

PubMed

Zheng, Guangyong; Huang, Tao

2018-01-01

In the post-genomic era, an important task is to explore the function of individual biological molecules (i.e., gene, noncoding RNA, protein, metabolite) and their organization in living cells. To this end, gene regulatory networks (GRNs) are constructed to show the relationships between biological molecules, in which the vertices of the network denote biological molecules and the edges represent connections between nodes (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). Biologists can understand not only the function of biological molecules but also the organization of components of living cells through interpreting the GRNs, since a gene regulatory network is a comprehensive physiological map of living cells and reflects the influence of genetic and epigenetic factors (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). In this paper, we review the inference methods for GRN reconstruction and the analysis approaches for network structure. Because the network method is a powerful tool for studying complex diseases and biological processes, its applications in pathway analysis and disease gene identification are also introduced.
Markov State Models of gene regulatory networks.

PubMed

Chu, Brian K; Tse, Margaret J; Sato, Royce R; Read, Elizabeth L

2017-02-06

Gene regulatory networks with dynamics characterized by multiple stable states underlie cell fate-decisions. Quantitative models that can link molecular-level knowledge of gene regulation to a global understanding of network dynamics have the potential to guide cell-reprogramming strategies. Networks are often modeled by the stochastic Chemical Master Equation, but methods for systematic identification of key properties of the global dynamics are currently lacking. The method identifies the number, phenotypes, and lifetimes of long-lived states for a set of common gene regulatory network models. Application of transition path theory to the constructed Markov State Model decomposes global dynamics into a set of dominant transition paths and associated relative probabilities for stochastic state-switching. In this proof-of-concept study, we found that the Markov State Model provides a general framework for analyzing and visualizing stochastic multistability and state-transitions in gene networks. Our results suggest that this framework, adopted from the field of atomistic Molecular Dynamics, can be a useful tool for quantitative Systems Biology at the network scale.
Unraveling the Tangled Skein: The Evolution of Transcriptional Regulatory Networks in Development.

PubMed

Rebeiz, Mark; Patel, Nipam H; Hinman, Veronica F

2015-01-01

The molecular and genetic basis for the evolution of anatomical diversity is a major question that has inspired evolutionary and developmental biologists for decades. Because morphology takes form during development, a true comprehension of how anatomical structures evolve requires an understanding of the evolutionary events that alter developmental genetic programs. Vast gene regulatory networks (GRNs) that connect transcription factors to their target regulatory sequences control gene expression in time and space and therefore determine the tissue-specific genetic programs that shape morphological structures. In recent years, many new examples have greatly advanced our understanding of the genetic alterations that modify GRNs to generate newly evolved morphologies. Here, we review several aspects of GRN evolution, including their deep preservation, their mechanisms of alteration, and how they originate to generate novel developmental programs.
Approximate geodesic distances reveal biologically relevant structures in microarray data.

PubMed

Nilsson, Jens; Fioretos, Thoas; Höglund, Mattias; Fontes, Magnus

2004-04-12

Genome-wide gene expression measurements, as currently determined by the microarray technology, can be represented mathematically as points in a high-dimensional gene expression space. Genes interact with each other in regulatory networks, restricting the cellular gene expression profiles to a certain manifold, or surface, in gene expression space. To obtain knowledge about this manifold, various dimensionality reduction methods and distance metrics are used. For data points distributed on curved manifolds, a sensible distance measure would be the geodesic distance along the manifold. In this work, we examine whether an approximate geodesic distance measure captures biological similarities better than the traditionally used Euclidean distance. We computed approximate geodesic distances, determined by the Isomap algorithm, for one set of lymphoma and one set of lung cancer microarray samples. Compared with the ordinary Euclidean distance metric, this distance measure produced more instructive, biologically relevant, visualizations when applying multidimensional scaling. This suggests the Isomap algorithm as a promising tool for the interpretation of microarray data. Furthermore, the results demonstrate the benefit and importance of taking nonlinearities in gene expression data into account.
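The distance step of the approach above can be sketched as follows: connect each sample to its k nearest neighbours by Euclidean distance, then take shortest-path lengths through that graph as approximate geodesic distances. This is a minimal stdlib-only illustration of the idea behind Isomap's first stages, not the authors' code; the function name, the choice of Floyd-Warshall for shortest paths, and the half-circle toy data are assumptions, and the subsequent multidimensional scaling step is omitted.

```python
from math import cos, dist, inf, pi, sin


def geodesic_distances(points, k):
    """Approximate geodesic distances via a symmetric k-nearest-neighbour graph.

    Entries stay infinite for pairs in different connected components.
    """
    n = len(points)
    d = [[inf] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0.0
        # The k nearest neighbours of point i by Euclidean distance.
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: dist(points[i], points[j]))[:k]
        for j in nearest:
            w = dist(points[i], points[j])
            d[i][j] = min(d[i][j], w)
            d[j][i] = min(d[j][i], w)  # keep the graph undirected
    # Floyd-Warshall all-pairs shortest paths (fine for small sample counts).
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][m] + d[m][j] < d[i][j]:
                    d[i][j] = d[i][m] + d[m][j]
    return d


# Toy data: nine points on a unit half-circle, a curved one-dimensional manifold.
half_circle = [(cos(i * pi / 8), sin(i * pi / 8)) for i in range(9)]
geo = geodesic_distances(half_circle, k=2)
print("Euclidean:", dist(half_circle[0], half_circle[-1]))
print("Approximate geodesic:", geo[0][-1])
```

For the two endpoints of the half-circle, the Euclidean distance cuts straight across the arc, while the graph distance follows the curve, which is the kind of nonlinearity the abstract argues should be taken into account.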
Two-Way Gene Interaction From Microarray Data Based on Correlation Methods.

PubMed

Alavi Majd, Hamid; Talebi, Atefeh; Gilany, Kambiz; Khayyer, Nasibeh

2016-06-01

Gene networks have generated a massive explosion in the development of high-throughput techniques for monitoring various aspects of gene activity. Networks offer a natural way to model interactions between genes, and extracting gene network information from high-throughput genomic data is an important and difficult task. The purpose of this study is to construct a two-way gene network based on parametric and nonparametric correlation coefficients. The first step in constructing a Gene Co-expression Network is to score all pairs of gene vectors. The second step is to select a score threshold and connect all gene pairs whose scores exceed this value. In the foundation-application study, we constructed two-way gene networks using nonparametric methods, such as Spearman's rank correlation coefficient and Blomqvist's measure, and compared them with Pearson's correlation coefficient. We surveyed six genes of venous thrombosis disease, made a matrix entry representing the score for the corresponding gene pair, and obtained two-way interactions using Pearson's correlation, Spearman's rank correlation, and Blomqvist's coefficient. Finally, these methods were compared with Cytoscape, based on BIND, and Gene Ontology, based on molecular function visual methods; R software version 3.2 and Bioconductor were used to perform these methods. Based on the Pearson and Spearman correlations, the results were the same and were confirmed by Cytoscape and GO visual methods; however, Blomqvist's coefficient was not confirmed by visual methods. Some results of the correlation coefficients did not agree with the visualization, possibly because of the small amount of data.
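The two-step construction described above (score every gene pair, then connect pairs whose score exceeds a threshold) can be sketched as follows. The study itself used R and Bioconductor; this Python version is only an illustration. The expression values and gene labels are made up, thresholding on the absolute score is an assumption, and the Blomqvist function uses one common sample convention in which points lying exactly on a median are ignored.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def median(values):
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2


def blomqvist(x, y):
    """Blomqvist's beta: sign concordance of deviations from the medians."""
    mx, my = median(x), median(y)
    products = [(a - mx) * (b - my) for a, b in zip(x, y)]
    concordant = sum(1 for p in products if p > 0)
    discordant = sum(1 for p in products if p < 0)
    total = concordant + discordant
    return (concordant - discordant) / total if total else 0.0


def coexpression_edges(expression, score, threshold):
    """Step 1: score all gene pairs. Step 2: keep pairs with |score| > threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        s = score(expression[g1], expression[g2])
        if abs(s) > threshold:
            edges.append((g1, g2, s))
    return edges


# Made-up expression vectors for three hypothetical genes.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0],
    "geneC": [5.0, 3.0, 4.0, 1.0, 2.0],
}
for name, fn in (("Pearson", pearson), ("Spearman", spearman), ("Blomqvist", blomqvist)):
    print(name, coexpression_edges(expression, fn, threshold=0.9))
```

Running the same threshold under each coefficient makes it easy to see where the resulting networks differ, which is the kind of comparison the study reports between Blomqvist's measure and the two correlation coefficients.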
Differential Gene Expression in Colon Tissue Associated With Diet, Lifestyle, and Related Oxidative Stress.

PubMed

Slattery, Martha L; Pellatt, Daniel F; Mullany, Lila E; Wolff, Roger K

2015-01-01

Several diet and lifestyle factors may impact health by influencing oxidative stress levels. We hypothesize that level of cigarette smoking, alcohol, anti-inflammatory drugs, and diet alter gene expression. We analyzed RNA-seq data from 144 colon cancer patients who had information on recent cigarette smoking, recent alcohol consumption, diet, and recent aspirin/non-steroidal anti-inflammatory use. Using a false discovery rate of 0.1, we evaluated gene differential expression between high and low levels of exposure using DESeq2. Ingenuity Pathway Analysis (IPA) was used to determine networks associated with de-regulated genes in our data. We identified 46 deregulated genes associated with recent cigarette use; these genes enriched causal networks regulated by TEK and MAP2K3. Different differentially expressed genes were associated with type of alcohol intake; five genes were associated with total alcohol, six were associated with beer intake, six were associated with wine intake, and four were associated with liquor consumption. Recent use of aspirin and/or ibuprofen was associated with differential expression of TMC06, ST8SIA4, and STEAP3 while a summary oxidative balance score (OBS) was associated with SYCP3, HDX, and NRG4 (all up-regulated with greater oxidative balance). Of the dietary antioxidants and carotenoids evaluated only intake of beta carotene (1 gene), Lutein/Zeaxanthine (5 genes), and Vitamin E (4 genes) were associated with differential gene expression. There were similarities in biological function of de-regulated genes associated with various dietary and lifestyle factors. Our data support the hypothesis that diet and lifestyle factors associated with oxidative stress can alter gene expression. However genes altered were unique to type of alcohol and type of antioxidant. Because of potential differences in associations observed between platforms these findings need replication in other populations.
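The false discovery rate cut-off of 0.1 mentioned above can be illustrated with a small sketch of the Benjamini-Hochberg adjustment, which is the adjustment DESeq2 reports by default; the abstract does not say which adjustment was applied, so its use here is an assumption. The p-values below are invented for illustration and are not results from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
padj = benjamini_hochberg(pvals)
# Genes whose adjusted p-value falls below the FDR cut-off of 0.1.
significant = [i for i, q in enumerate(padj) if q < 0.1]
```

With these made-up values the first four genes pass the cut-off and the fifth does not.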
Regulatory networks and connected components of the neutral space. A look at functional islands

NASA Astrophysics Data System (ADS)

Boldhaus, G.; Klemm, K.

2010-09-01

The functioning of a living cell is largely determined by the structure of its regulatory network, comprising non-linear interactions between regulatory genes. An important factor for the stability and evolvability of such regulatory systems is neutrality - typically a large number of alternative network structures give rise to the necessary dynamics. Here we study the discretized regulatory dynamics of the yeast cell cycle [Li et al., PNAS, 2004] and the set of networks capable of reproducing it, which we call functional. Among these, the empirical yeast wildtype network is close to optimal with respect to sparse wiring. Under point mutations, which establish or delete single interactions, the neutral space of functional networks is fragmented into ≈ 4.7 × 10^8 components. One of the smaller ones contains the wildtype network. On average, functional networks reachable from the wildtype by mutations are sparser, have higher noise resilience and fewer fixed point attractors as compared with networks outside of this wildtype component.
System-level insights into the cellular interactome of a non-model organism: inferring, modelling and analysing functional gene network of soybean (Glycine max).

PubMed

Xu, Yungang; Guo, Maozu; Zou, Quan; Liu, Xiaoyan; Wang, Chunyu; Liu, Yang

2014-01-01

The cellular interactome, in which genes and/or their products interact on several levels, forming transcriptional regulatory-, protein interaction-, metabolic-, signal transduction networks, etc., has attracted decades of research focus. However, any one specific type of network alone can hardly explain the various interactive activities among genes. These networks characterize different interaction relationships, implying their unique intrinsic properties and defects, and covering different slices of biological information. Functional gene networks (FGNs), consolidated interaction networks that model a fuzzy and more generalized notion of gene-gene relations, have been proposed to combine heterogeneous networks with the goal of identifying functional modules supported by multiple interaction types. There are as yet no successful precedents of FGNs on sparsely studied non-model organisms, such as soybean (Glycine max), due to the absence of sufficient heterogeneous interaction data. We present an alternative solution for inferring the FGNs of soybean (SoyFGNs), in a pioneering study on the soybean interactome, which is also applicable to other organisms. SoyFGNs exhibit the typical characteristics of biological networks: scale-free, small-world architecture and modularization. Verified by co-expression and KEGG pathways, SoyFGNs are more extensive and accurate than an orthology network derived from Arabidopsis. As a case study, network-guided disease-resistance gene discovery indicates that SoyFGNs can provide system-level studies on gene functions and interactions. This work suggests that inferring and modelling the interactome of a non-model plant are feasible. It will speed up the discovery and definition of the functions and interactions of other genes that control important functions, such as nitrogen fixation and protein or lipid synthesis.
The efforts of the study are the basis of our further comprehensive studies on the soybean functional interactome at the genome and microRNome levels. Additionally, a web tool for information retrieval and analysis of SoyFGNs can be accessed at SoyFN: http://nclab.hit.edu.cn/SoyFN.
System-Level Insights into the Cellular Interactome of a Non-Model Organism: Inferring, Modelling and Analysing Functional Gene Network of Soybean (Glycine max)

PubMed Central

Xu, Yungang; Guo, Maozu; Zou, Quan; Liu, Xiaoyan; Wang, Chunyu; Liu, Yang

2014-01-01

The cellular interactome, in which genes and/or their products interact on several levels, forming transcriptional regulatory-, protein interaction-, metabolic-, signal transduction networks, etc., has attracted decades of research focus. However, any one specific type of network alone can hardly explain the various interactive activities among genes. These networks characterize different interaction relationships, implying their unique intrinsic properties and defects, and covering different slices of biological information. Functional gene networks (FGNs), consolidated interaction networks that model a fuzzy and more generalized notion of gene-gene relations, have been proposed to combine heterogeneous networks with the goal of identifying functional modules supported by multiple interaction types. There are as yet no successful precedents of FGNs on sparsely studied non-model organisms, such as soybean (Glycine max), due to the absence of sufficient heterogeneous interaction data. We present an alternative solution for inferring the FGNs of soybean (SoyFGNs), in a pioneering study on the soybean interactome, which is also applicable to other organisms. SoyFGNs exhibit the typical characteristics of biological networks: scale-free, small-world architecture and modularization. Verified by co-expression and KEGG pathways, SoyFGNs are more extensive and accurate than an orthology network derived from Arabidopsis. As a case study, network-guided disease-resistance gene discovery indicates that SoyFGNs can provide system-level studies on gene functions and interactions. This work suggests that inferring and modelling the interactome of a non-model plant are feasible. It will speed up the discovery and definition of the functions and interactions of other genes that control important functions, such as nitrogen fixation and protein or lipid synthesis.
The efforts of the study are the basis of our further comprehensive studies on the soybean functional interactome at the genome and microRNome levels. Additionally, a web tool for information retrieval and analysis of SoyFGNs can be accessed at SoyFN: http://nclab.hit.edu.cn/SoyFN. PMID:25423109
BRAIN NETWORKS. Correlated gene expression supports synchronous activity in brain networks.

PubMed

Richiardi, Jonas; Altmann, Andre; Milazzo, Anna-Clare; Chang, Catie; Chakravarty, M Mallar; Banaschewski, Tobias; Barker, Gareth J; Bokde, Arun L W; Bromberg, Uli; Büchel, Christian; Conrod, Patricia; Fauth-Bühler, Mira; Flor, Herta; Frouin, Vincent; Gallinat, Jürgen; Garavan, Hugh; Gowland, Penny; Heinz, Andreas; Lemaître, Hervé; Mann, Karl F; Martinot, Jean-Luc; Nees, Frauke; Paus, Tomáš; Pausova, Zdenka; Rietschel, Marcella; Robbins, Trevor W; Smolka, Michael N; Spanagel, Rainer; Ströhle, Andreas; Schumann, Gunter; Hawrylycz, Mike; Poline, Jean-Baptiste; Greicius, Michael D

2015-06-12

During rest, brain activity is synchronized between different regions widely distributed throughout the brain, forming functional networks. However, the molecular mechanisms supporting functional connectivity remain undefined. We show that functional brain networks defined with resting-state functional magnetic resonance imaging can be recapitulated by using measures of correlated gene expression in a post mortem brain tissue data set. The set of 136 genes we identify is significantly enriched for ion channels. Polymorphisms in this set of genes significantly affect resting-state functional connectivity in a large sample of healthy adolescents. Expression levels of these genes are also significantly associated with axonal connectivity in the mouse. The results provide convergent, multimodal evidence that resting-state functional networks correlate with the orchestrated activity of dozens of genes linked to ion channel activity and synaptic function. Copyright © 2015, American Association for the Advancement of Science.
A parallel implementation of the network identification by multiple regression (NIR) algorithm to reverse-engineer regulatory gene networks.

PubMed

Gregoretti, Francesco; Belcastro, Vincenzo; di Bernardo, Diego; Oliva, Gennaro

2010-04-21

The reverse engineering of gene regulatory networks using gene expression profile data has become crucial to gain novel biological knowledge. Large amounts of data that need to be analyzed are currently being produced due to advances in microarray technologies. Using current reverse engineering algorithms to analyze large data sets can be very computationally intensive. These emerging computational requirements can be met using parallel computing techniques. It has been shown that the Network Identification by multiple Regression (NIR) algorithm performs better than the other ready-to-use reverse engineering software. However, it cannot be used with large networks with thousands of nodes--as is the case in biological networks--due to the high time and space complexity. In this work we overcome this limitation by designing and developing a parallel version of the NIR algorithm. The new implementation of the algorithm achieves very good accuracy even for large gene networks, improving our understanding of the gene regulatory networks that is crucial for a wide range of biomedical applications.

Gene expression patterns combined with network analysis identify hub genes associated with bladder cancer.

PubMed

Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia

2015-06-01

To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.
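The degree-based hub selection mentioned above can be sketched in a few lines. The edge list and node names below are placeholders invented for illustration; they are not interactions reported by the study, and stress and betweenness centrality are not covered by this sketch.

```python
from collections import Counter


def degree_hubs(edges, top_n):
    """Rank nodes of an undirected network by degree and return the top ones."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, breaking ties alphabetically for stable output.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical protein-protein interaction edges between placeholder nodes.
ppi_edges = [
    ("n1", "n2"), ("n1", "n3"), ("n1", "n4"),
    ("n2", "n3"), ("n4", "n5"),
]
hubs = degree_hubs(ppi_edges, top_n=2)
```

Here `n1` has the most interaction partners, so it comes first in `hubs`.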
Identification of interactive gene networks: a novel approach in gene array profiling of myometrial events during guinea pig pregnancy.

PubMed

Mason, Clifford W; Swaan, Peter W; Weiner, Carl P

2006-06-01

The transition from myometrial quiescence to activation is poorly understood, and the analysis of array data is limited by the available data mining tools. We applied functional analysis and logical operations along regulatory gene networks to identify molecular processes and pathways underlying quiescence and activation. We analyzed some 18,400 transcripts and variants in guinea pig myometrium at stages corresponding to quiescence and activation, and compared them to the nonpregnant (control) counterpart using a functional mapping tool, MetaCore (GeneGo, St Joseph, MI), to identify novel gene networks composed of biological pathways during mid (MP) and late (LP) pregnancy. Genes altered during quiescence and/or activation were identified following gene-specific comparisons with myometrium from nonpregnant animals, and then linked to curated pathways and formulated networks. The MP and LP networks were subtracted from each other to identify unique genomic events during those periods. For example, changes 2-fold or greater in genes mediating protein biosynthesis, programmed cell death, microtubule polymerization, and microtubule-based movement were noted during the transition to LP. We describe a novel approach combining microarrays and genetic data to identify networks associated with normal myometrial events. The resulting insights help identify potential biomarkers and permit future targeted investigations of these pathways or networks to confirm or refute their importance.
Competing endogenous RNA regulatory network in papillary thyroid carcinoma.

PubMed

Chen, Shouhua; Fan, Xiaobin; Gu, He; Zhang, Lili; Zhao, Wenhua

2018-05-11

The present study aimed to screen all types of RNAs involved in the development of papillary thyroid carcinoma (PTC). RNA‑sequencing data of PTC and normal samples were used for screening differentially expressed (DE) microRNAs (DE‑miRNAs), long non‑coding RNAs (DE‑lncRNAs) and genes (DEGs). Subsequently, lncRNA‑miRNA, miRNA‑gene (that is, miRNA‑mRNA) and gene‑gene interaction pairs were extracted and used to construct regulatory networks. Feature genes in the miRNA‑mRNA network were identified by topological analysis and recursive feature elimination analysis. A support vector machine (SVM) classifier was built using 15 feature genes, and its classification effect was validated using two microarray data sets that were downloaded from the Gene Expression Omnibus (GEO) database. In addition, Gene Ontology function and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses were conducted for genes identified in the ceRNA network. A total of 506 samples, including 447 tumor samples and 59 normal samples, were obtained from The Cancer Genome Atlas (TCGA); 16 DE‑lncRNAs, 917 DEGs and 30 DE‑miRNAs were screened. The miRNA‑mRNA regulatory network comprised 353 nodes and 577 interactions. From these data, 15 feature genes with high predictive precision (>95%) were extracted from the network and were used to form an SVM classifier with an accuracy of 96.05% (486/506) for PTC samples downloaded from TCGA, and accuracies of 96.81 and 98.46% for GEO downloaded data sets. The ceRNA regulatory network comprised 596 lines (or interactions) and 365 nodes. Genes in the ceRNA network were significantly enriched in 'neuron development', 'differentiation', 'neuroactive ligand‑receptor interaction', 'metabolism of xenobiotics by cytochrome P450', 'drug metabolism' and 'cytokine‑cytokine receptor interaction' pathways. 
Hox transcript antisense RNA, miRNA‑206 and kallikrein‑related peptidase 10 were nodes in the ceRNA regulatory network of the selected feature gene, and they may serve important roles in the development of PTC.
GARNET--gene set analysis with exploration of annotation relations.

PubMed

Rho, Kyoohyoung; Kim, Bumjin; Jang, Youngjun; Lee, Sanghyun; Bae, Taejeong; Seo, Jihae; Seo, Chaehwa; Lee, Jihyun; Kang, Hyunjung; Yu, Ungsik; Kim, Sunghoon; Lee, Sanghyuk; Kim, Wan Kyu

2011-02-15

Gene set analysis is a powerful method of deducing biological meaning for an a priori defined set of genes. Numerous tools have been developed to test statistical enrichment or depletion in specific pathways or gene ontology (GO) terms. Major difficulties towards biological interpretation are integrating diverse types of annotation categories and exploring the relationships between annotation terms of similar information. GARNET (Gene Annotation Relationship NEtwork Tools) is an integrative platform for gene set analysis with many novel features. It includes tools for retrieval of genes from annotation database, statistical analysis & visualization of annotation relationships, and managing gene sets. In an effort to allow access to a full spectrum of amassed biological knowledge, we have integrated a variety of annotation data that include the GO, domain, disease, drug, chromosomal location, and custom-defined annotations. Diverse types of molecular networks (pathways, transcription and microRNA regulations, protein-protein interaction) are also included. The pair-wise relationship between annotation gene sets was calculated using kappa statistics. GARNET consists of three modules--gene set manager, gene set analysis and gene set retrieval, which are tightly integrated to provide virtually automatic analysis for gene sets. A dedicated viewer for annotation network has been developed to facilitate exploration of the related annotations. GARNET (gene annotation relationship network tools) is an integrative platform for diverse types of gene set analysis, where complex relationships among gene annotations can be easily explored with an intuitive network visualization tool (http://garnet.isysbio.org/ or http://ercsb.ewha.ac.kr/garnet/).
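The pair-wise kappa statistic mentioned above can be sketched with Cohen's kappa, treating each annotation term as a yes/no label over a common universe of genes. Whether GARNET computes kappa in exactly this way is an assumption; the gene identifiers and set contents below are placeholders for illustration.

```python
def kappa(set_a, set_b, universe):
    """Cohen's kappa for the agreement of two gene sets over a gene universe."""
    n = len(universe)
    both = len(set_a & set_b & universe)
    only_a = len((set_a - set_b) & universe)
    only_b = len((set_b - set_a) & universe)
    neither = n - both - only_a - only_b
    observed = (both + neither) / n
    # Agreement expected by chance from the marginal set sizes.
    expected = (
        (both + only_a) * (both + only_b) + (neither + only_b) * (neither + only_a)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # Both sets are empty or both cover the whole universe.
    return (observed - expected) / (1.0 - expected)


# Hypothetical universe of ten genes and two overlapping annotation gene sets.
universe = {f"g{i}" for i in range(10)}
term_a = {"g0", "g1", "g2", "g3"}
term_b = {"g2", "g3", "g4", "g5"}
score = kappa(term_a, term_b, universe)
```

A kappa near 1 indicates that two annotation terms label almost the same genes, while a value near 0 indicates overlap no better than chance, which is one way related annotations could be linked in a network view.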
RNA-seq analysis of the gonadal transcriptome during Alligator mississippiensis temperature-dependent sex determination and differentiation.

PubMed

Yatsu, Ryohei; Miyagawa, Shinichi; Kohno, Satomi; Parrott, Benjamin B; Yamaguchi, Katsushi; Ogino, Yukiko; Miyakawa, Hitoshi; Lowers, Russell H; Shigenobu, Shuji; Guillette, Louis J; Iguchi, Taisen

2016-01-25

The American alligator (Alligator mississippiensis) displays temperature-dependent sex determination (TSD), in which incubation temperature during embryonic development determines the sexual fate of the individual. However, the molecular mechanisms governing this process remain a mystery, including the influence of initial environmental temperature on the comprehensive gonadal gene expression patterns occurring during TSD. Our characterization of transcriptomes during alligator TSD allowed us to identify novel candidate genes involved in TSD initiation. High-throughput RNA sequencing (RNA-seq) was performed on gonads collected from A. mississippiensis embryos incubated at both a male and a female producing temperature (33.5 °C and 30 °C, respectively) in a time series during sexual development. RNA-seq yielded 375.2 million paired-end reads, which were mapped and assembled, and used to characterize differential gene expression. Changes in the transcriptome occurring as a function of both development and sexual differentiation were extensively profiled. Forty-one differentially expressed genes were detected in response to incubation at male producing temperature, and included genes such as Wnt signaling factor WNT11, histone demethylase KDM6B, and transcription factor C/EBPA. Furthermore, comparative analysis of development- and sex-dependent differential gene expression revealed 230 candidate genes involved in alligator sex determination and differentiation, and early details of the suspected male-fate commitment were profiled. We also discovered sexually dimorphic expression of uncharacterized ncRNAs and other novel elements, such as unique expression patterns of HEMGN and ARX. Twenty-five of the differentially expressed genes identified in our analysis were putative transcriptional regulators, among which were MYBL2, MYCL, and HOXC10, in addition to conventional sex differentiation genes such as SOX9, and FOXL2. 
An inferred gene regulatory network was constructed, and the gene-gene and temperature-gene interactions were predicted. Gonadal global gene expression kinetics during sex determination have been extensively profiled for the first time in a TSD species. These findings provide insights into the genetic framework underlying TSD, and expand our current understanding of the developmental fate pathways during vertebrate sex determination.
Modulation of dynamic modes by interplay between positive and negative feedback loops in gene regulatory networks

NASA Astrophysics Data System (ADS)

Wang, Liu-Suo; Li, Ning-Xi; Chen, Jing-Jia; Zhang, Xiao-Peng; Liu, Feng; Wang, Wei

2018-04-01

A positive and a negative feedback loop can induce bistability and oscillation, respectively, in biological networks. Nevertheless, they are frequently interlinked to perform more elaborate functions in many gene regulatory networks. Coupled positive and negative feedback loops may exhibit either oscillation or bistability depending on the intensity of the stimulus in some particular networks. It is less understood how the transition between the two dynamic modes is modulated by the positive and negative feedback loops. We developed an abstract model of such systems, largely based on the core p53 pathway, to explore the mechanism for the transformation of dynamic behaviors. Our results show that enhancing the positive feedback may promote or suppress oscillations depending on the strength of both feedback loops. We found that the system oscillates with low amplitudes in response to a moderate stimulus and switches to the on state upon a strong stimulus. When the positive feedback is activated much later than the negative one in response to a strong stimulus, the system exhibits long-term oscillations before switching to the on state. We explain this intriguing phenomenon using quasistatic approximation. Moreover, early switching to the on state may occur when the system starts from a steady state in the absence of stimuli. The interplay between the positive and negative feedback plays a key role in the transitions between oscillation and bistability. Of note, our conclusions should be applicable only to some specific gene regulatory networks, especially the p53 network, in which both oscillation and bistability exist in response to a certain type of stimulus. Our work also underscores the significance of transient dynamics in determining cellular outcome.
Uncovering co-expression gene network regulating fruit acidity in diverse apples

USDA-ARS's Scientific Manuscript database

Acidity is a major contributor to fruit quality. Several organic acids are present in apple fruit, but malic acid is predominant and determines fruit acidity. The trait is largely controlled by the Malic acid (Ma) locus, underpinning which Ma1 that encodes an Aluminum-activated Malate Transporter1 (...
Maternal age influences folliculogenesis and gene networks in the ovaries of beef heifers

USDA-ARS's Scientific Manuscript database

The size of the ovarian reserve is an important component of fertility and reproductive longevity in bovine females. Ultrasonographic determination of antral follicle count is the best method for estimating the size of the ovarian reserve in heifers and cows in a production setting. Antral follicl...
Stability and structural properties of gene regulation networks with coregulation rules.

PubMed

Warrell, Jonathan; Mhlanga, Musa

2017-05-07

Coregulation of the expression of groups of genes has been extensively demonstrated empirically in bacterial and eukaryotic systems. Such coregulation can arise through the use of shared regulatory motifs, which allow the coordinated expression of modules (and module groups) of functionally related genes across the genome. Coregulation can also arise through the physical association of multi-gene complexes through chromosomal looping, which are then transcribed together. We present a general formalism for modeling coregulation rules in the framework of Random Boolean Networks (RBN), and develop specific models for transcription factor networks with modular structure (including module groups, and multi-input modules (MIM) with autoregulation) and multi-gene complexes (including hierarchical differentiation between multi-gene complex members). We develop a mean-field approach to analyse the dynamical stability of large networks incorporating coregulation, and show that autoregulated MIM and hierarchical gene-complex models can achieve greater stability than networks without coregulation whose rules have matching activation frequency. We provide further analysis of the stability of small networks of both kinds through simulations. We also characterize several general properties of the transients and attractors in the hierarchical coregulation model, and show using simulations that the steady-state distribution factorizes hierarchically as a Bayesian network in a Markov Jump Process analogue of the RBN model. Copyright © 2017. Published by Elsevier Ltd.
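As background for the Random Boolean Network framework named above, the following sketch builds a plain RBN with synchronous updates and enumerates its attractors by exhaustive search. It does not implement the coregulation rules, module structure, or mean-field analysis of the paper; the network size, the number of inputs per gene, and the random seed are arbitrary assumptions.

```python
import itertools
import random


def random_boolean_network(n_genes, k_inputs, seed):
    """Give every gene k random inputs and a random truth table."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n_genes), k_inputs) for _ in range(n_genes)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)] for _ in range(n_genes)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronous update: every gene reads its inputs and looks up its rule."""
    new_state = []
    for gene_inputs, table in zip(inputs, tables):
        index = 0
        for g in gene_inputs:
            index = index * 2 + state[g]
        new_state.append(table[index])
    return tuple(new_state)


def attractor_from(state, inputs, tables):
    """Iterate until a state repeats; return the cycle as a frozenset of states."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, inputs, tables)
    return frozenset(trajectory[seen[state]:])


def all_attractors(n_genes, inputs, tables):
    """Collect the distinct attractors reached from every possible start state."""
    starts = itertools.product((0, 1), repeat=n_genes)
    return {attractor_from(s, inputs, tables) for s in starts}


inputs, tables = random_boolean_network(n_genes=5, k_inputs=2, seed=0)
attractors = all_attractors(5, inputs, tables)
fixed_points = [a for a in attractors if len(a) == 1]
```

Exhaustive enumeration is only feasible for small networks because the state space doubles with every added gene, which is one reason analytical approaches such as mean-field approximations are used for large networks.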
Differential network entropy reveals cancer system hallmarks

PubMed Central

West, James; Bianconi, Ginestra; Severini, Simone; Teschendorff, Andrew E.

2012-01-01

The cellular phenotype is described by a complex network of molecular interactions. Elucidating network properties that distinguish disease from the healthy cellular state is therefore of critical importance for gaining systems-level insights into disease mechanisms and ultimately for developing improved therapies. By integrating gene expression data with a protein interaction network we here demonstrate that cancer cells are characterised by an increase in network entropy. In addition, we formally demonstrate that gene expression differences between normal and cancer tissue are anticorrelated with local network entropy changes, thus providing a systemic link between gene expression changes at the nodes and their local correlation patterns. In particular, we find that genes which drive cell-proliferation in cancer cells and which often encode oncogenes are associated with reductions in network entropy. These findings may have potential implications for identifying novel drug targets. PMID:23150773
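One way to make the notion of local network entropy concrete is the normalised Shannon entropy of a node's interaction weights, sketched below. This is an illustrative definition under the assumption that each node has non-negative weights on its edges (for example, derived from expression correlations); it is not claimed to be the exact formulation used in the paper, and the weights are invented.

```python
import math


def local_entropy(weights):
    """Normalised Shannon entropy of one node's non-negative edge weights."""
    if len(weights) < 2:
        return 0.0  # A node with a single neighbour has no uncertainty.
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    shannon = -sum(p * math.log(p) for p in probs)
    # Divide by log(degree) so the result lies between 0 and 1.
    return shannon / math.log(len(weights))


# Hypothetical edge weights for two nodes with four neighbours each.
promiscuous = local_entropy([1.0, 1.0, 1.0, 1.0])      # weights spread evenly
constrained = local_entropy([0.97, 0.01, 0.01, 0.01])  # one dominant partner
```

Under this definition, a node whose weight is concentrated on a single partner has low entropy, and a node that spreads its weight evenly has entropy close to 1.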
Inferring Time-Varying Network Topologies from Gene Expression Data

PubMed Central

2007-01-01

Most current methods for gene regulatory network identification lead to the inference of steady-state networks, that is, networks prevalent over all times, a hypothesis which has been challenged. There has been a need to infer and represent networks in a dynamic, that is, time-varying fashion, in order to account for different cellular states affecting the interactions amongst genes. In this work, we present an approach, regime-SSM, to understand gene regulatory networks within such a dynamic setting. The approach uses a clustering method based on these underlying dynamics, followed by system identification using a state-space model for each learnt cluster—to infer a network adjacency matrix. We finally indicate our results on the mouse embryonic kidney dataset as well as the T-cell activation-based expression dataset and demonstrate conformity with reported experimental evidence. PMID:18309363
Inferring time-varying network topologies from gene expression data.

PubMed

Rao, Arvind; Hero, Alfred O; States, David J; Engel, James Douglas

2007-01-01

Most current methods for gene regulatory network identification lead to the inference of steady-state networks, that is, networks prevalent over all times, a hypothesis which has been challenged. There has been a need to infer and represent networks in a dynamic, that is, time-varying fashion, in order to account for different cellular states affecting the interactions amongst genes. In this work, we present an approach, regime-SSM, to understand gene regulatory networks within such a dynamic setting. The approach uses a clustering method based on these underlying dynamics, followed by system identification using a state-space model for each learnt cluster--to infer a network adjacency matrix. We finally indicate our results on the mouse embryonic kidney dataset as well as the T-cell activation-based expression dataset and demonstrate conformity with reported experimental evidence.
Potential Regulators Driving the Transition in Nonalcoholic Fatty Liver Disease: a Stage-Based View.

PubMed

Lou, Yi; Chen, Yi-Dan; Sun, Fu-Rong; Shi, Jun-Ping; Song, Yu; Yang, Jin

2017-01-01

The incidence of nonalcoholic fatty liver disease (NAFLD), ranging from mild steatosis to hepatocellular injury and inflammation, increases with the rise of obesity. However, the implications of transcription factor networks in progressive NAFLD remain to be determined. A co-regulatory network approach combining gene expression and transcription influence was utilized to dissect transcriptional regulators in different NAFLD stages. In vivo, mouse models of NAFLD were used to investigate whether dysregulated expression is driven by transcriptional regulators. Through constructing a large-scale co-regulatory network, sample-specific regulator activity was estimated. The combinations of active regulators that drive the progression of NAFLD were identified. Next, top regulators in each stage of NAFLD were determined, and the results were validated using different experiments and bariatric surgical samples. In particular, Adipocyte enhancer-binding protein 1 (AEBP1) showed increased transcription activity in nonalcoholic steatohepatitis (NASH). Further characterization of the AEBP1-related transcription program defined its co-regulators, targeted genes, and functional organization. The dynamics of AEBP1 and its potential targets were verified in an animal model of NAFLD. This study identifies putative functions for several transcription factors in the pathogenesis of NAFLD and may thus point to potential targets for therapeutic interventions. © 2017 The Author(s) Published by S. Karger AG, Basel.
Disease Modeling via Large-Scale Network Analysis

DTIC Science & Technology

2015-05-20

A central goal of genetics is to learn how the genotype of an organism determines its phenotype. We address the implicit...guarantees for the methods. In the past, we have developed predictive methods general enough to apply to potentially any genetic trait, varying from... genetics is to learn how the genotype of an organism determines its phenotype. We address the implicit problem of predicting the association of genes with
Characterizing Transcriptional Networks in Male Rainbow Darter (Etheostoma caeruleum) that Regulate Testis Development over a Complete Reproductive Cycle

PubMed Central

McMaster, Mark E.; Servos, Mark R.; Martyniuk, Christopher J.; Munkittrick, Kelly R.

2016-01-01

Intersex is a condition that has been associated with exposure to sewage effluents in male rainbow darter (Etheostoma caeruleum). To better understand changes in the transcriptome that are associated with intersex, we characterized annual changes in the testis transcriptome in wild, unexposed fish. Rainbow darter males were collected from the Grand River (Ontario, Canada) in May (spawning), August (post-spawning), October (recrudescence), January (developing) and March (pre-spawning). Histology was used to determine the proportion of spermatogenic cell types that were present during each period of testicular maturation. Regression analysis determined that the proportion of spermatozoa versus spermatocytes in all stages of development (R2 ≥ 0.58) were inversely related; however this was not the case when males were in the post-spawning period. Gene networks that were specific to the transition from developing to pre-spawning stages included nitric oxide biosynthesis, response to wounding, sperm cell function, and stem cell maintenance. The pre-spawning to spawning transition included gene networks related to amino acid import, glycogenesis, Sertoli cell proliferation, sperm capacitation, and sperm motility. The spawning to post-spawning transition included unique gene networks associated with chromosome condensation, ribosome biogenesis and assembly, and mitotic spindle assembly. Lastly, the transition from post-spawning to recrudescence included gene networks associated with egg activation, epithelial to mesenchymal transition, membrane fluidity, and sperm cell adhesion. Noteworthy was that there were a significant number of gene networks related to immune system function that were differentially expressed throughout reproduction, suggesting that immune network signalling has a prominent role in the male testis. 
Transcripts in the testis of post-spawning individuals showed the most divergent expression patterns, for the majority of transcripts investigated, when compared to the other stages. Interestingly, many transcripts associated with female sex differentiation (i.e. esr1, sox9, cdca8 and survivin) were significantly higher in the testis during the post-spawning season compared to other testis stages. At post-spawning, there were higher levels of estrogen and androgen receptors (esr1, esr2, ar) in the testis, while there was a decrease in the levels of sperm associated antigen 1 (spag1) and spermatogenesis associated 4 (spata4) mRNA. Cyp17a was more abundant in the testis of fish in the pre-spawning, spawning, and post-spawning seasons compared to those individuals that were recrudescent, while aromatase (cyp19a) did not vary in expression over the year. This study identifies cell processes related to testis development in a seasonally spawning species and improves our understanding of the molecular signaling events that underlie testicular growth. This is significant because, while there are a number of studies characterizing molecular pathways in the ovary, there are comparatively fewer describing transcriptomic patterns in the testis in wild fish. PMID:27861489
NEAT: an efficient network enrichment analysis test.

PubMed

Signorelli, Mirko; Vinciotti, Veronica; Wit, Ernst C

2016-09-05

Network enrichment analysis is a powerful method that integrates gene enrichment analysis with the information on relationships between genes that is provided by gene networks. Existing tests for network enrichment analysis deal only with undirected networks, can be computationally slow, and are based on normality assumptions. We propose NEAT, a test for network enrichment analysis. The test is based on the hypergeometric distribution, which naturally arises as the null distribution in this context. NEAT can be applied not only to undirected, but to directed and partially directed networks as well. Our simulations indicate that NEAT is considerably faster than alternative resampling-based methods, and that its capacity to detect enrichments is at least as good as that of alternative tests. We discuss applications of NEAT to network analyses in yeast by testing for enrichment of the Environmental Stress Response target gene set with GO Slim and KEGG functional gene sets, and also by inspecting associations between functional sets themselves. NEAT is a flexible and efficient test for network enrichment analysis that aims to overcome some limitations of existing resampling-based tests. The method is implemented in the R package neat, which can be freely downloaded from CRAN ( https://cran.r-project.org/package=neat ).
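To illustrate the hypergeometric null distribution mentioned in the NEAT abstract, the following Python sketch computes an upper-tail p-value for the number of arcs observed from one gene set to another. The parameterization (total in-degree of the network, in-degree of the target set, out-degree of the source set) and all function names are assumptions made for illustration; this is not the API of the R package neat.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(X = k) for X ~ Hypergeometric(population, successes, draws)."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def enrichment_pvalue(n_ab, total_in, in_b, out_a):
    """Upper-tail p-value P(X >= n_ab).

    Assumed (hypothetical) parameterization:
      total_in : sum of in-degrees over all nodes in the network
      in_b     : sum of in-degrees of the nodes in target set B
      out_a    : sum of out-degrees of the nodes in source set A
      n_ab     : observed number of arcs going from A to B
    """
    upper = min(in_b, out_a)
    return sum(hypergeom_pmf(k, total_in, in_b, out_a)
               for k in range(n_ab, upper + 1))


if __name__ == "__main__":
    # Toy numbers, not taken from the paper.
    print(enrichment_pvalue(n_ab=3, total_in=10, in_b=3, out_a=4))
```

A small p-value would suggest more A-to-B arcs than expected by chance under this null; a lower-tail sum could be used analogously to look for depletion.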
A single determinant dominates the rate of yeast protein evolution.

PubMed

Drummond, D Allan; Raval, Alpan; Wilke, Claus O

2006-02-01

A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
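The principal component regression described in this abstract can be sketched in a few lines. The Python example below is a minimal, stdlib-only illustration that standardizes the predictors, finds only the first principal axis by power iteration, and regresses the response on the resulting scores. The function names and the restriction to a single component are assumptions for illustration, not the authors' implementation, which regresses on the full set of components.

```python
from math import sqrt


def standardize(col):
    """Center a column and scale it to unit (population) variance.
    Assumes the column is not constant."""
    m = sum(col) / len(col)
    sd = sqrt(sum((v - m) ** 2 for v in col) / len(col))
    return [(v - m) / sd for v in col]


def first_principal_axis(columns, iters=200):
    """Leading eigenvector of the covariance matrix via power iteration.
    Assumes the start vector is not orthogonal to that eigenvector."""
    p, n = len(columns), len(columns[0])
    cov = [[sum(a * b for a, b in zip(ci, cj)) / n for cj in columns]
           for ci in columns]
    v = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def pc_regression(predictors, response):
    """Regress the response on the first principal component score.
    Returns (slope, r_squared)."""
    cols = [standardize(c) for c in predictors]
    axis = first_principal_axis(cols)
    n = len(response)
    scores = [sum(axis[j] * cols[j][i] for j in range(len(cols)))
              for i in range(n)]
    y_mean = sum(response) / n
    slope = (sum(s * (y - y_mean) for s, y in zip(scores, response))
             / sum(s * s for s in scores))
    ss_tot = sum((y - y_mean) ** 2 for y in response)
    ss_res = sum((y - y_mean - slope * s) ** 2
                 for s, y in zip(scores, response))
    return slope, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Two perfectly collinear toy predictors (not data from the paper).
    x1 = [1, 2, 3, 4, 5]
    x2 = [2, 4, 6, 8, 10]
    y = [3, 6, 9, 12, 15]
    print(pc_regression([x1, x2], y))
```

Because the components are uncorrelated by construction, the variance explained by each one can be read off separately, which is what makes this approach attractive when the raw predictors are strongly correlated.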
CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks

PubMed Central

Baumbach, Jan

2007-01-01

Background: Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms, deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results: We now introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. Like the previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto those networks. Therefore, we added stimulon data directly into the database, but also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP-based Web Service server, which can easily be consumed by other bioinformatics software systems. 
Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known transcriptional regulatory networks to predict putative contradictions or further gene regulatory interactions. Furthermore, it integrates protein clusters by means of heuristically solving the weighted graph cluster editing problem. In addition, it provides Web Service based access to up to date gene annotation data from GenDB. Conclusion The release 4.0 of CoryneRegNet is a comprehensive system for the integrated analysis of procaryotic gene regulatory networks. It is a versatile systems biology platform to support the efficient and large-scale analysis of transcriptional regulation of gene expression in microorganisms. It is publicly available at . PMID:17986320
A group LASSO-based method for robustly inferring gene regulatory networks from multiple time-course datasets.

PubMed

Liu, Li-Zhi; Wu, Fang-Xiang; Zhang, Wen-Jun

2014-01-01

As an abstract mapping of the gene regulations in the cell, a gene regulatory network is important to both biological research and practical applications. The reverse engineering of gene regulatory networks from microarray gene expression data is a challenging research problem in systems biology. With the development of biological technologies, multiple time-course gene expression datasets might be collected for a specific gene network under different circumstances. The inference of a gene regulatory network can be improved by integrating these multiple datasets. It is also known that gene expression data may be contaminated with large errors or outliers, which may affect the inference results. A novel method, Huber group LASSO, is proposed to infer the same underlying network topology from multiple time-course gene expression datasets as well as to take robustness to large errors or outliers into account. To solve the optimization problem involved in the proposed method, an efficient algorithm that combines the ideas of auxiliary function minimization and block descent is developed. A stability selection method is adapted to our method to find a network topology consisting of edges with scores. The proposed method is applied to both simulation datasets and real experimental datasets. The results show that Huber group LASSO outperforms the group LASSO in terms of both areas under receiver operating characteristic curves and areas under the precision-recall curves. The convergence analysis of the algorithm theoretically shows that the sequence generated by the algorithm converges to the optimal solution of the problem. The simulation and real data examples demonstrate the effectiveness of the Huber group LASSO in integrating multiple time-course gene expression datasets and improving the resistance to large errors or outliers.
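The two ingredients named in this abstract, a Huber loss for robustness to outliers and a group penalty that ties the same edge together across datasets, can be sketched as follows. The exact objective used by the authors is not given in the abstract, so the form below (summed Huber loss plus lambda times the sum of group L2 norms) and all names and parameters are assumptions for illustration only.

```python
from math import sqrt


def huber(residual, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)


def group_penalty(coefs_by_dataset):
    """Sum of L2 norms, one group per candidate regulator.

    coefs_by_dataset[k][j] is the (hypothetical) weight of regulator j
    estimated from dataset k; a group collects regulator j across all k.
    """
    n_regulators = len(coefs_by_dataset[0])
    return sum(sqrt(sum(c[j] ** 2 for c in coefs_by_dataset))
               for j in range(n_regulators))


def objective(residuals_by_dataset, coefs_by_dataset, lam, delta=1.0):
    """Assumed objective: total Huber loss + lam * group penalty."""
    loss = sum(huber(r, delta)
               for residuals in residuals_by_dataset
               for r in residuals)
    return loss + lam * group_penalty(coefs_by_dataset)


if __name__ == "__main__":
    # Toy values, not from the paper.
    print(objective([[0.5], [3.0]], [[3.0, 0.0], [4.0, 0.0]], lam=2.0))
```

Because the penalty acts on whole groups, an edge tends to be either kept in all datasets or dropped in all of them, which matches the goal of inferring one shared topology.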
Discover mouse gene coexpression landscapes using dictionary learning and sparse coding.

PubMed

Li, Yujie; Chen, Hanbo; Jiang, Xi; Li, Xiang; Lv, Jinglei; Peng, Hanchuan; Tsien, Joe Z; Liu, Tianming

2017-12-01

Gene coexpression patterns carry rich information regarding enormously complex brain structures and functions. Characterization of these patterns in an unbiased, integrated, and anatomically comprehensive manner will illuminate the higher-order transcriptome organization and offer genetic foundations of functional circuitry. Here, using dictionary learning and sparse coding, we derived coexpression networks from the spatially resolved, anatomically comprehensive in situ hybridization data of the Allen Mouse Brain Atlas dataset. The key idea is that if two genes use the same dictionary to represent their original signals, then their gene expressions must share similar patterns, and the genes are therefore considered "coexpressed." For each network, we have simultaneous knowledge of spatial distributions, the genes in the network, and the extent to which a particular gene conforms to the coexpression pattern. Gene ontologies and comparisons with published gene lists reveal biologically identified coexpression networks, some of which correspond to major cell types, biological pathways, and/or anatomical regions.

Challenges of the information age: the impact of false discovery on pathway identification.

PubMed

Rog, Colin J; Chekuri, Srinivasa C; Edgerton, Mary E

2012-11-21

Pathways with members that have known relevance to a disease are used to support hypotheses generated from analyses of gene expression and proteomic studies. Using cancer as an example, the pitfalls of searching pathways databases as support for genes and proteins that could represent false discoveries are explored. The frequency with which networks could be generated from 100 instances each of randomly selected five- and ten-gene sets as input to MetaCore, a commercial pathways database, was measured. A PubMed search enumerated cancer-related literature published for any gene in the networks. Using three, two, and one maximum intervening steps between input genes to populate the network, networks were generated with frequencies of 97%, 77%, and 7% using ten-gene sets and 73%, 27%, and 1% using five-gene sets. PubMed reported an average of 4225 cancer-related articles per network gene. This can be attributed to the richly populated pathways databases and the interest in the molecular basis of cancer. As information sources become enriched, they are more likely to generate plausible mechanisms for false discoveries.
Genome-wide network of regulatory genes for construction of a chordate embryo.

PubMed

Shoguchi, Eiichi; Hamaguchi, Makoto; Satoh, Nori

2008-04-15

Animal development is controlled by gene regulation networks that are composed of sequence-specific transcription factors (TF) and cell signaling molecules (ST). Although housekeeping genes have been reported to show clustering in the animal genomes, whether the genes comprising a given regulatory network are physically clustered on a chromosome is uncertain. We examined this question in the present study. Ascidians are the closest living relatives of vertebrates, and their tadpole-type larva represents the basic body plan of chordates. The Ciona intestinalis genome contains 390 core TF genes and 119 major ST genes. Previous gene disruption assays led to the formulation of a basic chordate embryonic blueprint, based on over 3000 genetic interactions among 79 zygotic regulatory genes. Here, we mapped the regulatory genes, including all 79 regulatory genes, on the 14 pairs of Ciona chromosomes by fluorescent in situ hybridization (FISH). Chromosomal localization of upstream and downstream regulatory genes demonstrates that the components of coherent developmental gene networks are evenly distributed over the 14 chromosomes. Thus, this study provides the first comprehensive evidence that the physical clustering of regulatory genes, or their target genes, is not relevant for the genome-wide control of gene expression during development.
MicroRNA-Target Network Inference and Local Network Enrichment Analysis Identify Two microRNA Clusters with Distinct Functions in Head and Neck Squamous Cell Carcinoma

PubMed Central

Sass, Steffen; Pitea, Adriana; Unger, Kristian; Hess, Julia; Mueller, Nikola S.; Theis, Fabian J.

2015-01-01

MicroRNAs represent ~22 nt long endogenous small RNA molecules that have been experimentally shown to regulate gene expression post-transcriptionally. One main interest in miRNA research is the investigation of their functional roles, which can typically be accomplished by identification of mi-/mRNA interactions and functional annotation of target gene sets. We here present a novel method “miRlastic”, which infers miRNA-target interactions using transcriptomic data as well as prior knowledge and performs functional annotation of target genes by exploiting the local structure of the inferred network. For the network inference, we applied linear regression modeling with elastic net regularization on matched microRNA and messenger RNA expression profiling data to perform feature selection on prior knowledge from sequence-based target prediction resources. The novelty of miRlastic inference originates in predicting data-driven intra-transcriptome regulatory relationships through feature selection. With synthetic data, we showed that miRlastic outperformed commonly used methods and was suitable even for low sample sizes. To gain insight into the functional role of miRNAs and to determine joint functional properties of miRNA clusters, we introduced a local enrichment analysis procedure. The principle of this procedure lies in identifying regions of high functional similarity by evaluating the shortest paths between genes in the network. We can finally assign functional roles to the miRNAs by taking their regulatory relationships into account. We thoroughly evaluated miRlastic on a cohort of head and neck cancer (HNSCC) patients provided by The Cancer Genome Atlas. We inferred an mi-/mRNA regulatory network for human papilloma virus (HPV)-associated miRNAs in HNSCC. The resulting network best enriched for experimentally validated miRNA-target interaction, when compared to common methods. 
Finally, the local enrichment step identified two functional clusters of miRNAs that were predicted to mediate HPV-associated dysregulation in HNSCC. Our novel approach was able to characterize distinct pathway regulations from matched miRNA and mRNA data. An R package of miRlastic was made available through: http://icb.helmholtz-muenchen.de/mirlastic. PMID:26694379
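The local enrichment procedure is described as evaluating shortest paths between genes in the inferred network. As a rough illustration of that building block only, the Python sketch below computes unweighted shortest-path lengths by breadth-first search and averages them over a gene set. The adjacency format, the function names, and the use of a mean pairwise distance as a closeness summary are assumptions; the abstract does not specify how miRlastic scores functional similarity.

```python
from collections import deque
from itertools import combinations


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def mean_pairwise_distance(adj, genes):
    """Average shortest-path length over all connected pairs in `genes`.
    Returns None if no pair is connected (or fewer than two genes)."""
    lengths = []
    for a, b in combinations(genes, 2):
        d = shortest_path_lengths(adj, a).get(b)
        if d is not None:
            lengths.append(d)
    return sum(lengths) / len(lengths) if lengths else None


if __name__ == "__main__":
    # Hypothetical toy network: a path a - b - c plus an isolated node d.
    network = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
    print(mean_pairwise_distance(network, ["a", "b", "c"]))
```

A gene set whose members sit close together in the network yields a small average distance, which is one simple way to flag a candidate region of shared function.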
MicroRNA-Target Network Inference and Local Network Enrichment Analysis Identify Two microRNA Clusters with Distinct Functions in Head and Neck Squamous Cell Carcinoma.

PubMed

Sass, Steffen; Pitea, Adriana; Unger, Kristian; Hess, Julia; Mueller, Nikola S; Theis, Fabian J

2015-12-18

MicroRNAs represent ~22 nt long endogenous small RNA molecules that have been experimentally shown to regulate gene expression post-transcriptionally. One main interest in miRNA research is the investigation of their functional roles, which can typically be accomplished by identification of mi-/mRNA interactions and functional annotation of target gene sets. We here present a novel method "miRlastic", which infers miRNA-target interactions using transcriptomic data as well as prior knowledge and performs functional annotation of target genes by exploiting the local structure of the inferred network. For the network inference, we applied linear regression modeling with elastic net regularization on matched microRNA and messenger RNA expression profiling data to perform feature selection on prior knowledge from sequence-based target prediction resources. The novelty of miRlastic inference originates in predicting data-driven intra-transcriptome regulatory relationships through feature selection. With synthetic data, we showed that miRlastic outperformed commonly used methods and was suitable even for low sample sizes. To gain insight into the functional role of miRNAs and to determine joint functional properties of miRNA clusters, we introduced a local enrichment analysis procedure. The principle of this procedure lies in identifying regions of high functional similarity by evaluating the shortest paths between genes in the network. We can finally assign functional roles to the miRNAs by taking their regulatory relationships into account. We thoroughly evaluated miRlastic on a cohort of head and neck cancer (HNSCC) patients provided by The Cancer Genome Atlas. We inferred an mi-/mRNA regulatory network for human papilloma virus (HPV)-associated miRNAs in HNSCC. The resulting network best enriched for experimentally validated miRNA-target interaction, when compared to common methods. 
Finally, the local enrichment step identified two functional clusters of miRNAs that were predicted to mediate HPV-associated dysregulation in HNSCC. Our novel approach was able to characterize distinct pathway regulations from matched miRNA and mRNA data. An R package of miRlastic was made available through: http://icb.helmholtz-muenchen.de/mirlastic.
Systems Level Analysis of Systemic Sclerosis Shows a Network of Immune and Profibrotic Pathways Connected with Genetic Polymorphisms

PubMed Central

Mahoney, J. Matthew; Taroni, Jaclyn; Martyanov, Viktor; Wood, Tammara A.; Greene, Casey S.; Pioli, Patricia A.; Hinchcliff, Monique E.; Whitfield, Michael L.

2015-01-01

Systemic sclerosis (SSc) is a rare systemic autoimmune disease characterized by skin and organ fibrosis. The pathogenesis of SSc and its progression are poorly understood. The SSc intrinsic gene expression subsets (inflammatory, fibroproliferative, normal-like, and limited) are observed in multiple clinical cohorts of patients with SSc. Analysis of longitudinal skin biopsies suggests that a patient's subset assignment is stable over 6–12 months. Genetically, SSc is multi-factorial with many genetic risk loci for SSc generally and for specific clinical manifestations. Here we identify the genes consistently associated with the intrinsic subsets across three independent cohorts, show the relationship between these genes using a gene-gene interaction network, and place the genetic risk loci in the context of the intrinsic subsets. To identify gene expression modules common to three independent datasets from three different clinical centers, we developed a consensus clustering procedure based on mutual information of partitions, an information theory concept, and performed a meta-analysis of these genome-wide gene expression datasets. We created a gene-gene interaction network of the conserved molecular features across the intrinsic subsets and analyzed their connections with SSc-associated genetic polymorphisms. The network is composed of distinct, but interconnected, components related to interferon activation, M2 macrophages, adaptive immunity, extracellular matrix remodeling, and cell proliferation. The network shows extensive connections between the inflammatory- and fibroproliferative-specific genes. The network also shows connections between these subset-specific genes and 30 SSc-associated polymorphic genes including STAT4, BLK, IRF7, NOTCH4, PLAUR, CSK, IRAK1, and several human leukocyte antigen (HLA) genes. 
Our analyses suggest that the gene expression changes underlying the SSc subsets may be long-lived, but mechanistically interconnected and related to a patient's underlying genetic risk. PMID:25569146
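The consensus clustering procedure above is based on the mutual information of partitions. A minimal Python sketch of that quantity follows, computed from two cluster label vectors over the same genes. The function name and input format are assumptions for illustration, and how the authors turn this quantity into a consensus is not described in the abstract.

```python
from collections import Counter
from math import log


def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two partitions of the same items.

    labels_a[i] and labels_b[i] are the cluster labels assigned to item i
    by the two partitions.
    """
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    return sum((n_ij / n) * log(n_ij * n / (count_a[a] * count_b[b]))
               for (a, b), n_ij in joint.items())


if __name__ == "__main__":
    # Same grouping under different label names: maximal agreement.
    print(mutual_information([0, 0, 1, 1], [1, 1, 0, 0]))
    # Unrelated groupings: no shared information.
    print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

The measure depends only on how items are grouped, not on what the clusters are called, which makes it suitable for comparing module assignments obtained independently in different cohorts.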
Comparison of tumor related signaling pathways with known compounds to determine potential agents for lung adenocarcinoma.

PubMed

Xu, Song; Liu, Renwang; Da, Yurong

2018-06-05

This study compared tumor-related signaling pathways with known compounds to determine potential agents for lung adenocarcinoma (LUAD) treatment. Kyoto Encyclopedia of Genes and Genomes signaling pathway analyses were performed based on LUAD differentially expressed genes from The Cancer Genome Atlas (TCGA) project and genotype-tissue expression controls. These results were compared to various known compounds using the Connectivity Mapping dataset. The clinical significance of the hub genes identified by overlapping pathway enrichment analysis was further investigated using data mining from multiple sources. A drug-pathway network for LUAD was constructed, and molecular docking was carried out. After the integration of 57 LUAD-related pathways and 35 pathways affected by small molecules, five overlapping pathways were revealed. Among these five pathways, the p53 signaling pathway was the most significant, with CCNB1, CCNB2, CDK1, CDKN2A, and CHEK1 being identified as hub genes. The p53 signaling pathway is implicated as a risk factor for LUAD tumorigenesis and survival. A total of 88 molecules significantly inhibiting the five LUAD-related oncogenic pathways were involved in the LUAD drug-pathway network. Daunorubicin, mycophenolic acid, and pyrvinium could potentially target the hub gene CHEK1 directly. Our study highlights the critical pathways that should be targeted in the search for potential LUAD treatments, most importantly, the p53 signaling pathway. Some compounds, such as ciclopirox and AG-028671, may have potential roles for LUAD treatment but require further experimental verification. © 2018 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.
Construction of ontology augmented networks for protein complex prediction.

PubMed

Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian

2013-01-01

Protein complexes are of great importance in understanding the principles of cellular organization and function. The increase in available protein-protein interaction data, gene ontology and other resources makes it possible to develop computational methods for protein complex prediction. Most existing methods focus mainly on the topological structure of protein-protein interaction networks, and largely ignore the gene ontology annotation information. In this article, we constructed ontology augmented networks with protein-protein interaction data and gene ontology, which effectively unified the topological structure of protein-protein interaction networks and the similarity of gene ontology annotations into unified distance measures. After constructing ontology augmented networks, a novel method (clustering based on ontology augmented networks) was proposed to predict protein complexes, which was capable of taking into account the topological structure of the protein-protein interaction network, as well as the similarity of gene ontology annotations. Our method was applied to two different yeast protein-protein interaction datasets and predicted many well-known complexes. The experimental results showed that (i) ontology augmented networks and the unified distance measure can effectively combine the structure closeness and gene ontology annotation similarity; (ii) our method is valuable in predicting protein complexes and has higher F1 and accuracy compared to other competing methods.
Modular transcriptional repertoire and MicroRNA target analyses characterize genomic dysregulation in the thymus of Down syndrome infants

PubMed Central

Moreira-Filho, Carlos Alberto; Bando, Silvia Yumi; Bertonha, Fernanda Bernardi; Silva, Filipi Nascimento; da Fontoura Costa, Luciano; Ferreira, Leandro Rodrigues; Furlanetto, Glaucio; Chacur, Paulo; Zerbini, Maria Claudia Nogueira; Carneiro-Sampaio, Magda

2016-01-01

Trisomy 21-driven transcriptional alterations in human thymus were characterized through gene coexpression network (GCN) and miRNA-target analyses. We used whole thymic tissue - obtained at heart surgery from Down syndrome (DS) and karyotypically normal subjects (CT) - and a network-based approach for GCN analysis that allows the identification of modular transcriptional repertoires (communities) and the interactions between all the system's constituents through community detection. Changes in the degree of connections observed for hierarchically important hubs/genes in CT and DS networks corresponded to community changes. Distinct communities of highly interconnected genes were topologically identified in these networks. The role of miRNAs in modulating the expression of highly connected genes in CT and DS was revealed through miRNA-target analysis. Trisomy 21 gene dysregulation in thymus may be depicted as the breakdown and altered reorganization of transcriptional modules. Leading networks acting in normal or disease states were identified. CT networks would depict the “canonical” way of thymus functioning. Conversely, DS networks represent a “non-canonical” way, i.e., thymic tissue adaptation under trisomy 21 genomic dysregulation. This adaptation is probably driven by epigenetic mechanisms acting at chromatin level and through the miRNA control of transcriptional programs involving the networks' high-hierarchy genes. PMID:26848775
Co-expression network with protein-protein interaction and transcription regulation in malaria parasite Plasmodium falciparum.

PubMed

Yu, Fu-Dong; Yang, Shao-You; Li, Yuan-Yuan; Hu, Wei

2013-04-10

Malaria continues to be one of the most severe global infectious diseases and a major threat to human health and economic development. Network-based biological analysis is a promising approach to uncover key genes and biological processes that could not be recognized from individual gene-based signatures. We integrated gene co-expression profiles with protein-protein interaction and transcriptional regulation information to construct a comprehensive gene co-expression network of Plasmodium falciparum. Based on this network, we used the ICE (Iterative Clique Enumeration) algorithm to identify 10 core modules that were essential for malaria parasite development in intraerythrocytic developmental cycle (IDC) stages. In each module, all genes were highly correlated, probably due to co-regulation or formation of a protein complex. Some of these genes were found to be differentially coexpressed among three adjacent IDC stages. The gene prpf8 (PFD0265w), encoding pre-mRNA processing splicing factor 8, was identified as a differentially co-expressed gene (DCG) among IDC stages, although the function of this gene has seldom been reported in previous research. Integrating species-specific gene prediction and differential co-expression gene detection, we found that some modules, such as module 10, could perform species-specific functions because some of their genes were species-specific. Furthermore, in order to reveal the underlying mechanisms of erythrocyte invasion by P. falciparum, the Steiner Tree algorithm was employed to identify the invasion subnetwork from our gene co-expression network. The subnetwork-based analysis indicated that some important Plasmodium parasite-specific genes could cooperate with each other and be co-regulated during the parasite invasion process, including a head-to-head gene pair of PfRH2a (PF13_0198) and PfRH2b (MAL13P1.176). 
This study, based on a gene co-expression network, could provide new insights into the mechanisms of pathogenesis, and even virulence and P. falciparum development. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.
Mitochondrial Gene Expression Profiles and Metabolic Pathways in the Amygdala Associated with Exaggerated Fear in an Animal Model of PTSD.

PubMed

Li, He; Li, Xin; Smerin, Stanley E; Zhang, Lei; Jia, Min; Xing, Guoqiang; Su, Yan A; Wen, Jillian; Benedek, David; Ursano, Robert

2014-01-01

The metabolic mechanisms underlying the development of exaggerated fear in post-traumatic stress disorder (PTSD) are not well defined. In the present study, alteration in the expression of genes associated with mitochondrial function in the amygdala of an animal model of PTSD was determined. Amygdala tissue samples were excised from 10 non-stressed control rats and 10 stressed rats, 14 days post-stress treatment. Total RNA was isolated, cDNA was synthesized, and gene expression levels were determined using a cDNA microarray. During the development of the exaggerated fear associated with PTSD, 48 genes were found to be significantly upregulated and 37 were significantly downregulated in the amygdala complex based on stringent criteria (p < 0.01). Ingenuity pathway analysis revealed up- or downregulation in the amygdala complex of four signaling networks - one associated with inflammatory and apoptotic pathways, one with immune mediators and metabolism, one with transcriptional factors, and one with chromatin remodeling. Thus, informatics of a neuronal gene array allowed us to determine the expression profile of mitochondrial genes in the amygdala complex of an animal model of PTSD. The result is a further understanding of the metabolic and neuronal signaling mechanisms associated with delayed and exaggerated fear.
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

PubMed

Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan

2014-01-01

One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network, in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene, and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available.
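The rankwise averaging step described above can be sketched as follows. This is a minimal illustration and not the published NIMEFI implementation: the function names, edge labels and importance scores are hypothetical, and every method is assumed to score the same set of candidate regulator-target edges.

```python
def rank_edges(scores):
    """Map each edge to its rank under one method (1 = highest importance)."""
    ordered = sorted(scores, key=lambda edge: (-scores[edge], edge))
    return {edge: rank for rank, edge in enumerate(ordered, start=1)}


def rankwise_average(method_scores):
    """Combine several importance scorings by averaging per-edge ranks.

    Returns the edges sorted from most to least supported (lowest average
    rank first); ties are broken by edge label for reproducibility.
    """
    rankings = [rank_edges(scores) for scores in method_scores]
    edges = rankings[0].keys()
    average = {e: sum(r[e] for r in rankings) / len(rankings) for e in edges}
    return sorted(average, key=lambda e: (average[e], e))


# Hypothetical importance scores from three feature importance methods.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.6, ("g1", "g3"): 0.7, ("g2", "g3"): 0.1}
method_c = {("g1", "g2"): 0.5, ("g1", "g3"): 0.1, ("g2", "g3"): 0.9}

consensus = rankwise_average([method_a, method_b, method_c])
print(consensus)
```

Averaging ranks rather than raw scores avoids having to put importance values from different methods on a common scale.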
Pre-Clinical Drug Prioritization via Prognosis-Guided Genetic Interaction Networks

PubMed Central

Xiong, Jianghui; Liu, Juan; Rayner, Simon; Tian, Ze; Li, Yinghui; Chen, Shanguang

2010-01-01

The high rates of failure in oncology drug clinical trials highlight the problems of using pre-clinical data to predict the clinical effects of drugs. Patient population heterogeneity and unpredictable physiology complicate pre-clinical cancer modeling efforts. We hypothesize that gene networks associated with cancer outcome in heterogeneous patient populations could serve as a reference for identifying drug effects. Here we propose a novel in vivo genetic interaction which we call ‘synergistic outcome determination’ (SOD), a concept similar to ‘Synthetic Lethality’. SOD is defined as the synergy of a gene pair with respect to cancer patients' outcome, whose correlation with outcome is due to cooperative, rather than independent, contributions of genes. The method combines microarray gene expression data with cancer prognostic information to identify synergistic gene-gene interactions that are then used to construct interaction networks based on gene modules (a group of genes which share similar function). In this way, we identified a cluster of important epigenetically regulated gene modules. By projecting drug sensitivity-associated genes on to the cancer-specific inter-module network, we defined a perturbation index for each drug based upon its characteristic perturbation pattern on the inter-module network. Finally, by calculating this index for compounds in the NCI Standard Agent Database, we significantly discriminated successful drugs from a broad set of test compounds, and further revealed the mechanisms of drug combinations. Thus, prognosis-guided synergistic gene-gene interaction networks could serve as an efficient in silico tool for pre-clinical drug prioritization and rational design of combinatorial therapies. PMID:21085674
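The idea of a gene pair whose association with outcome is cooperative rather than independent can be illustrated with an information-theoretic synergy score. The abstract does not state which measure SOD uses, so the formula below (joint mutual information minus the two single-gene terms) is an assumption chosen for illustration, and the binarized expression values and outcomes are invented.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


def synergy(gene1, gene2, outcome):
    """Information the pair carries about outcome beyond the single genes."""
    pair = list(zip(gene1, gene2))
    return (
        mutual_information(pair, outcome)
        - mutual_information(gene1, outcome)
        - mutual_information(gene2, outcome)
    )


# Toy cohort: outcome depends on the two genes only in combination (XOR).
gene1 = [0, 0, 1, 1]
gene2 = [0, 1, 0, 1]
outcome = [0, 1, 1, 0]

score = synergy(gene1, gene2, outcome)
print(f"synergy = {score:.3f} bits")
```

In this toy case neither gene alone says anything about outcome, while the pair determines it completely, which is the purely cooperative extreme.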
PRODIGEN: visualizing the probability landscape of stochastic gene regulatory networks in state and time space.

PubMed

Ma, Chihua; Luciani, Timothy; Terebus, Anna; Liang, Jie; Marai, G Elisabeta

2017-02-15

Visualizing the complex probability landscape of stochastic gene regulatory networks can further biologists' understanding of phenotypic behavior associated with specific genes. We present PRODIGEN (PRObability DIstribution of GEne Networks), a web-based visual analysis tool for the systematic exploration of probability distributions over simulation time and state space in such networks. PRODIGEN was designed in collaboration with bioinformaticians who research stochastic gene networks. The analysis tool combines in a novel way existing, expanded, and new visual encodings to capture the time-varying characteristics of probability distributions: spaghetti plots over one dimensional projection, heatmaps of distributions over 2D projections, enhanced with overlaid time curves to display temporal changes, and novel individual glyphs of state information corresponding to particular peaks. We demonstrate the effectiveness of the tool through two case studies on the computed probabilistic landscape of a gene regulatory network and of a toggle-switch network. Domain expert feedback indicates that our visual approach can help biologists: 1) visualize probabilities of stable states, 2) explore the temporal probability distributions, and 3) discover small peaks in the probability landscape that have potential relation to specific diseases.
Gene function prediction with gene interaction networks: a context graph kernel approach.

PubMed

Li, Xin; Chen, Hsinchun; Li, Jiexun; Zhang, Zhu

2010-01-01

Predicting gene functions is a challenge for biologists in the postgenomic era. Interactions among genes and their products compose networks that can be used to infer gene functions. Most previous studies adopt a linkage assumption, i.e., they assume that gene interactions indicate functional similarities between connected genes. In this study, we propose to use a gene's context graph, i.e., the gene interaction network associated with the focal gene, to infer its functions. In a kernel-based machine-learning framework, we design a context graph kernel to capture the information in context graphs. Our experimental study on a testbed of p53-related genes demonstrates the advantage of using indirect gene interactions and shows the empirical superiority of the proposed approach over linkage-assumption-based methods, such as the algorithm to minimize inconsistent connected genes and diffusion kernels.
Exercise-associated DNA methylation change in skeletal muscle and the importance of imprinted genes: a bioinformatics meta-analysis.

PubMed

Brown, William M

2015-12-01

Epigenetics is the study of processes--beyond DNA sequence alteration--producing heritable characteristics. For example, DNA methylation modifies gene expression without altering the nucleotide sequence. A well-studied DNA methylation-based phenomenon is genomic imprinting (ie, genotype-independent parent-of-origin effects). We aimed to elucidate: (1) the effect of exercise on DNA methylation and (2) the role of imprinted genes in skeletal muscle gene networks (ie, gene group functional profiling analyses). Gene ontology (ie, gene product elucidation)/meta-analysis. 26 skeletal muscle and 86 imprinted genes were subjected to g:Profiler ontology analysis. Meta-analysis assessed exercise-associated DNA methylation change. g:Profiler found four muscle gene networks with imprinted loci. Meta-analysis identified 16 articles (387 genes/1580 individuals) associated with exercise. Age, method, sample size, sex and tissue variation could elevate effect size bias. Only skeletal muscle gene networks including imprinted genes were reported. Exercise-associated effect sizes were calculated by gene. Age, method, sample size, sex and tissue variation were moderators. Six imprinted loci (RB1, MEG3, UBE3A, PLAGL1, SGCE, INS) were important for muscle gene networks, while meta-analysis uncovered five exercise-associated imprinted loci (KCNQ1, MEG3, GRB10, L3MBTL1, PLAGL1). DNA methylation decreased with exercise (60% of loci). Exercise-associated DNA methylation change was stronger among older people (ie, age accounted for 30% of the variation). Among older people, genes exhibiting DNA methylation decreases were part of a microRNA-regulated gene network functioning to suppress cancer. Imprinted genes were identified in skeletal muscle gene networks and exercise-associated DNA methylation change. Exercise-associated DNA methylation modification could rewind the 'epigenetic clock' as we age. CRD42014009800. Published by the BMJ Publishing Group Limited. 
Sexually dimorphic transcriptomic responses in the teleostean hypothalamus: a case study with the organochlorine pesticide dieldrin.

PubMed

Martyniuk, Christopher J; Doperalski, Nicholas J; Kroll, Kevin J; Barber, David S; Denslow, Nancy D

2013-01-01

Organochlorine pesticides (OCPs) such as dieldrin are a persistent class of aquatic pollutants that cause adverse neurological and reproductive effects in vertebrates. In this study, female and male largemouth bass (Micropterus salmoides) (LMB) were exposed to 3 mg dieldrin/kg feed in a 2 month feeding exposure (August-October) to (1) determine if the hypothalamic transcript responses to dieldrin were conserved between the sexes; (2) characterize cell signaling cascades underlying dieldrin neurotoxicity; and (3) determine whether or not co-feeding with 17β-estradiol (E2), a hormone with neuroprotective roles, mitigates responses in males to dieldrin. Despite also being a weak estrogen, dieldrin treatments did not elicit changes in reproductive endpoints (e.g. gonadosomatic index, vitellogenin, or plasma E2). Sub-network (SNEA) and gene set enrichment analysis (GSEA) revealed that neuro-hormone networks, neurotransmitter and nuclear receptor signaling, and the activin signaling network were altered by dieldrin exposure. Most striking was that the majority of cell pathways identified by the gene set enrichment were significantly increased in females while the majority of cell pathways were significantly decreased in males fed dieldrin. These data suggest that (1) there are sexually dimorphic responses in the teleost hypothalamus; (2) neurotransmitter systems are a target of dieldrin at the transcriptomics level; and (3) males co-fed dieldrin and E2 had the fewest numbers of genes and cell pathways altered in the hypothalamus, suggesting that E2 may mitigate the effects of dieldrin in the central nervous system. Copyright © 2012 Elsevier Inc. All rights reserved.
Sexually dimorphic transcriptomic responses in the teleostean hypothalamus: A case study with the organochlorine pesticide dieldrin

PubMed Central

Martyniuk, Christopher J.; Doperalski, Nicholas J.; Kroll, Kevin J.; Barber, David S.; Denslow, Nancy D.

2013-01-01

Organochlorine pesticides (OCPs) such as dieldrin are a persistent class of aquatic pollutants that cause adverse neurological and reproductive effects in vertebrates. In this study, female and male largemouth bass (Micropterus salmoides) (LMB) were exposed to 3 mg dieldrin/kg feed in a 2 month feeding exposure (August–October) to (1) determine if the hypothalamic transcript responses to dieldrin were conserved between the sexes; (2) characterize cell signaling cascades underlying dieldrin neurotoxicity; and (3) determine whether or not co-feeding with 17β-estradiol (E2), a hormone with neuroprotective roles, mitigates responses in males to dieldrin. Despite also being a weak estrogen, dieldrin treatments did not elicit changes in reproductive endpoints (e.g. gonadosomatic index, vitellogenin, or plasma E2). Sub-network (SNEA) and gene set enrichment analysis (GSEA) revealed that neuro-hormone networks, neurotransmitter and nuclear receptor signaling, and the activin signaling network were altered by dieldrin exposure. Most striking was that the majority of cell pathways identified by the gene set enrichment were significantly increased in females while the majority of cell pathways were significantly decreased in males fed dieldrin. These data suggest that (1) there are sexually dimorphic responses in the teleost hypothalamus; (2) neurotransmitter systems are a target of dieldrin at the transcriptomics level; and (3) males co-fed dieldrin and E2 had the fewest numbers of genes and cell pathways altered in the hypothalamus, suggesting that E2 may mitigate the effects of dieldrin in the central nervous system. PMID:23041725
Tools and Models for Integrating Multiple Cellular Networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gerstein, Mark

2015-11-06

In this grant, we have systematically investigated the integrated networks that are responsible for coordinating activity between metabolic pathways in prokaryotes. We have developed several computational tools to analyze the topology of the integrated networks, which consist of metabolic, regulatory, and physical interaction networks. The tools are all open-source, are available to download from GitHub, and can be incorporated into the Knowledgebase. Here, we summarize our work as follows. Understanding the topology of the integrated networks is the first step toward understanding their dynamics and evolution. For Aim 1 of this grant, we have developed a novel algorithm to determine and measure the hierarchical structure of transcriptional regulatory networks [1]. The hierarchy captures the direction of information flow in the network. The algorithm is generally applicable to regulatory networks in prokaryotes, yeast and higher organisms. Integrated datasets are extremely beneficial for understanding the biology of a system in a compact manner because they bring together multiple layers of information. Therefore, for Aim 2 of this grant, we have developed several tools and carried out analyses for integrating system-wide genomic information. To make use of the structural data, we have developed DynaSIN for protein-protein interaction networks with various dynamical interfaces [2]. We then examined the association between network topology and phenotypic effects such as gene essentiality. In particular, we have organized E. coli and S. cerevisiae transcriptional regulatory networks into hierarchies. We then correlated gene phenotypic effects by tinkering with different layers to elucidate which layers were more tolerant to perturbations [3]. In the context of evolution, we also developed a workflow to guide the comparison between different types of biological networks across various species using the concept of rewiring [4]. Furthermore, we have developed CRIT for correlation analysis in systems biology [5]. For Aim 3, we have further investigated the scaling relationship in which the number of transcription factors (TFs) in a genome is proportional to the square of the total number of genes. We have extended the analysis from transcription factors to various classes of functional categories, and from individual categories to joint distributions [6]. By introducing a new analytical framework, we have generalized the original toolbox model to take into account metabolic networks with arbitrary topology [7].
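The quadratic scaling mentioned for Aim 3 (TF count proportional to the square of the gene count) corresponds to a slope of 2 on a log-log plot. The sketch below shows how such an exponent could be estimated by least squares; the genome sizes and the constant 1e-5 are synthetic values chosen for illustration, not measurements from the report.

```python
from math import log


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x), i.e. the scaling exponent."""
    log_x = [log(x) for x in xs]
    log_y = [log(y) for y in ys]
    n = len(log_x)
    mean_x, mean_y = sum(log_x) / n, sum(log_y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    den = sum((a - mean_x) ** 2 for a in log_x)
    return num / den


# Synthetic genomes obeying N_TF = c * N_genes**2 exactly (c is hypothetical).
gene_counts = [500, 1000, 2000, 4000]
tf_counts = [1e-5 * n ** 2 for n in gene_counts]

exponent = loglog_slope(gene_counts, tf_counts)
print(f"estimated scaling exponent: {exponent:.3f}")
```

With real genome annotations the points would scatter around the fitted line, and the same function would return an exponent near, rather than exactly, 2 if the quadratic relationship holds.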
Gene network inference and visualization tools for biologists: application to new human transcriptome datasets

PubMed Central

Hurley, Daniel; Araki, Hiromitsu; Tamada, Yoshinori; Dunmore, Ben; Sanders, Deborah; Humphreys, Sally; Affara, Muna; Imoto, Seiya; Yasuda, Kaori; Tomiyasu, Yuki; Tashiro, Kosuke; Savoie, Christopher; Cho, Vicky; Smith, Stephen; Kuhara, Satoru; Miyano, Satoru; Charnock-Jones, D. Stephen; Crampin, Edmund J.; Print, Cristin G.

2012-01-01

Gene regulatory networks inferred from RNA abundance data have generated significant interest, but despite this, gene network approaches are used infrequently and often require input from bioinformaticians. We have assembled a suite of tools for analysing regulatory networks, and we illustrate their use with microarray datasets generated in human endothelial cells. We infer a range of regulatory networks, and based on this analysis discuss the strengths and limitations of network inference from RNA abundance data. We welcome contact from researchers interested in using our inference and visualization tools to answer biological questions. PMID:22121215
A swarm intelligence framework for reconstructing gene networks: searching for biologically plausible architectures.

PubMed

Kentzoglanakis, Kyriakos; Poole, Matthew

2012-01-01

In this paper, we investigate the problem of reverse engineering the topology of gene regulatory networks from temporal gene expression data. We adopt a computational intelligence approach comprising swarm intelligence techniques, namely particle swarm optimization (PSO) and ant colony optimization (ACO). In addition, the recurrent neural network (RNN) formalism is employed for modeling the dynamical behavior of gene regulatory systems. More specifically, ACO is used for searching the discrete space of network architectures and PSO for searching the corresponding continuous space of RNN model parameters. We propose a novel solution construction process in the context of ACO for generating biologically plausible candidate architectures. The objective is to concentrate the search effort into areas of the structure space that contain architectures which are feasible in terms of their topological resemblance to real-world networks. The proposed framework is initially applied to the reconstruction of a small artificial network that has previously been studied in the context of gene network reverse engineering. Subsequently, we consider an artificial data set with added noise for reconstructing a subnetwork of the genetic interaction network of S. cerevisiae (yeast). Finally, the framework is applied to a real-world data set for reverse engineering the SOS response system of the bacterium Escherichia coli. Results demonstrate the relative advantage of utilizing problem-specific knowledge regarding biologically plausible structural properties of gene networks over conducting a problem-agnostic search in the vast space of network architectures.
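The continuous half of the search described above, PSO over model parameters for a fixed architecture, can be sketched as follows. This is a generic textbook particle swarm under stated assumptions, not the authors' framework: the objective is a toy stand-in for the mismatch between simulated and observed expression, the target weights are hypothetical, and the inertia and acceleration constants are common default choices.

```python
import random


def pso(objective, dim, n_particles=20, n_iters=100, seed=0,
        inertia=0.7, c_personal=1.5, c_global=1.5, bound=1.0):
    """Minimize `objective` over `dim` real parameters with a basic PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # per-particle best positions
    best_val = [objective(p) for p in pos]    # per-particle best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (gbest_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val


# Toy objective: squared distance to hypothetical "true" interaction weights.
target_weights = [0.5, -0.3]


def fit_error(weights):
    return sum((w - t) ** 2 for w, t in zip(weights, target_weights))


found_weights, error = pso(fit_error, dim=len(target_weights))
print(found_weights, error)
```

In a combined scheme of the kind the abstract describes, an outer discrete search would propose which connections exist and an inner continuous search like this one would score each proposal by the best fit it can reach.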
