Analysis of bHLH coding genes using gene co-expression network approach.
Srivastava, Swati; Sanchita; Singh, Garima; Singh, Noopur; Srivastava, Gaurava; Sharma, Ashok
2016-07-01
Network analysis provides a powerful framework for the interpretation of data. It uses novel reference network-based metrics for module evolution. These can be used to identify modules of highly connected genes showing variation in a co-expression network. In this study, a co-expression network-based approach was used for analyzing genes from microarray data. Our approach consists of a simple but robust rank-based network construction. The publicly available gene expression data of Solanum tuberosum under cold and heat stresses were used to create and analyze a gene co-expression network. The analysis provides a highly co-expressed module of bHLH coding genes based on correlation values. Our approach was to analyze the variation in gene expression according to the time period of stress through a co-expression network approach. As a result, seed genes were identified that show multiple connections with other genes in the same cluster. The seed genes were found to vary across different time periods of stress. These seed genes may be utilized further as marker genes for developing stress-tolerant plant species.
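The abstract above does not spell out its rank-based construction, so the following is only a minimal sketch of one common reading: Spearman rank correlation between gene profiles, a fixed cutoff to define edges, and "seed genes" taken as the genes with the most connections. The function names, the 0.85 cutoff, and the toy expression values are all assumptions made for illustration, not the authors' procedure.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based average ranks of the values (ties share a rank)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # average rank of the tie group
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 for a constant profile."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def coexpression_network(expr, threshold):
    """expr maps gene name -> expression values across samples."""
    edges = {}
    for g1, g2 in combinations(sorted(expr), 2):
        rho = spearman(expr[g1], expr[g2])
        if abs(rho) >= threshold:
            edges[(g1, g2)] = rho
    return edges


def seed_genes(edges, top=1):
    """Genes with the most connections (ties broken by name)."""
    degree = {}
    for g1, g2 in edges:
        degree[g1] = degree.get(g1, 0) + 1
        degree[g2] = degree.get(g2, 0) + 1
    return sorted(degree, key=lambda g: (-degree[g], g))[:top]


# Hypothetical expression values at five time points of stress.
expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [1.0, 2.0, 3.0, 5.0, 4.0],
    "gene_c": [2.0, 1.0, 3.0, 4.0, 5.0],
    "gene_d": [5.0, 3.0, 1.0, 4.0, 2.0],
}
edges = coexpression_network(expr, threshold=0.85)
hubs = seed_genes(edges)
print(sorted(edges), hubs)
```

Running the same steps separately on the samples of each time period would show whether the most connected gene changes over time, which is the kind of variation the abstract reports.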
Co-expression network analysis of duplicate genes in maize (Zea mays L.) reveals no subgenome bias.
Li, Lin; Briskine, Roman; Schaefer, Robert; Schnable, Patrick S; Myers, Chad L; Flagel, Lex E; Springer, Nathan M; Muehlbauer, Gary J
2016-11-04
Gene duplication is prevalent in many species and can result in coding and regulatory divergence. Gene duplications can be classified as whole genome duplication (WGD), tandem and inserted (non-syntenic). In maize, WGD resulted in the subgenomes maize1 and maize2, of which maize1 is considered the dominant subgenome. However, the landscape of co-expression network divergence of duplicate genes in maize is still largely uncharacterized. To address the consequence of gene duplication on co-expression network divergence, we developed a gene co-expression network from RNA-seq data derived from 64 different tissues/stages of the maize reference inbred-B73. WGD, tandem and inserted gene duplications exhibited distinct regulatory divergence. Inserted duplicate genes were more likely to be singletons in the co-expression networks, while WGD duplicate genes were likely to be co-expressed with other genes. Tandem duplicate genes were enriched in the co-expression pattern where co-expressed genes were nearly identical for the duplicates in the network. Older gene duplications exhibit more extensive co-expression variation than younger duplications. Overall, non-syntenic genes primarily from inserted duplications show more co-expression divergence. Also, such enlarged co-expression divergence is significantly related to duplication age. Moreover, subgenome dominance was not observed in the co-expression networks: maize1 and maize2 exhibit similar levels of intra-subgenome correlations. Intriguingly, the level of inter-subgenome co-expression was similar to the level of intra-subgenome correlations, genes from specific subgenomes were not likely to be enriched in co-expression network modules, and the hub genes were not predominantly from any specific subgenome in maize.
Our work provides a comprehensive analysis of maize co-expression network divergence for three different types of gene duplications and identifies potential relationships between duplication types, duplication ages and co-expression consequences.
RUAN, XIYUN; LI, HONGYUN; LIU, BO; CHEN, JIE; ZHANG, SHIBAO; SUN, ZEQIANG; LIU, SHUANGQING; SUN, FAHAI; LIU, QINGYONG
2015-01-01
The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson’s correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842, 371, 2,883 and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson’s correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify the feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient in identifying pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425
Mining Gene Regulatory Networks by Neural Modeling of Expression Time-Series.
Rubiolo, Mariano; Milone, Diego H; Stegmayer, Georgina
2015-01-01
Discovering gene regulatory networks from data is one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as time series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. The networks are used for modeling each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the underlying relations among genes. The results obtained on artificial and real datasets confirm the method's effectiveness for discovering regulatory networks from a proper modeling of the temporal dynamics of gene expression profiles.
Regulatory network involving miRNAs and genes in serous ovarian carcinoma
Zhao, Haiyan; Xu, Hao; Xue, Luchen
2017-01-01
Serous ovarian carcinoma (SOC) is one of the most life-threatening types of gynecological malignancy, but the pathogenesis of SOC remains unknown. Previous studies have indicated that differentially expressed genes and microRNAs (miRNAs) serve important functions in SOC. However, genes and miRNAs are identified in a dispersed form, and little is known about the regulatory association between miRNAs and genes in SOC. In the present study, three regulatory networks were hierarchically constructed, including a differentially-expressed network, a related network and a global network to reveal associations between each factor. In each network, there were three types of factors, which were genes, miRNAs and transcription factors that interact with each other. Focus was placed on the differentially-expressed network, in which all genes and miRNAs were differentially expressed and therefore may have affected the development of SOC. Following the comparison and analysis between the three networks, a number of signaling pathways which demonstrated differentially expressed elements were highlighted. Subsequently, the upstream and downstream elements of differentially expressed miRNAs and genes were listed, and a number of key elements (differentially expressed miRNAs, genes and TFs predicted using the P-match method) were analyzed. The differentially expressed network partially illuminated the pathogenesis of SOC. It was hypothesized that if there was no differential expression of miRNAs and genes, SOC may be prevented and treatment may be identified. The present study provided a theoretical foundation for gene therapy for SOC. PMID:29113276
Prom-On, Santitham; Chanthaphan, Atthawut; Chan, Jonathan Hoyin; Meechai, Asawin
2011-02-01
Relationships among gene expression levels may be associated with the mechanisms of the disease. While identifying a direct association such as a difference in expression levels between case and control groups links genes to disease mechanisms, uncovering an indirect association in the form of a network structure may help reveal the underlying functional module associated with the disease under scrutiny. This paper presents a method to improve the biological relevance in functional module identification from the gene expression microarray data by enhancing the structure of a weighted gene co-expression network using minimum spanning tree. The enhanced network, which is called a backbone network, contains only the essential structural information to represent the gene co-expression network. The entire backbone network is decoupled into a number of coherent sub-networks, and then the functional modules are reconstructed from these sub-networks to ensure minimum redundancy. The method was tested with a simulated gene expression dataset and case-control expression datasets of autism spectrum disorder and colorectal cancer studies. The results indicate that the proposed method can accurately identify clusters in the simulated dataset, and the functional modules of the backbone network are more biologically relevant than those obtained from the original approach.
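As an illustration of the minimum spanning tree idea mentioned above, here is a minimal sketch using Kruskal's algorithm on a weighted co-expression graph. It assumes that edge distance is taken as 1 minus the co-expression similarity, which is an assumption of this sketch; the gene names and similarity values are hypothetical, and the paper's later steps (decoupling the backbone into sub-networks and reconstructing modules) are not reproduced.

```python
def minimum_spanning_tree(nodes, edges):
    """Kruskal's algorithm; edges is a list of (u, v, distance) tuples."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the set representative, with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for u, v, dist in sorted(edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((u, v, dist))
    return tree


# Hypothetical co-expression similarities between four genes.
similarity = {
    ("g1", "g2"): 0.95,
    ("g1", "g3"): 0.60,
    ("g2", "g3"): 0.90,
    ("g3", "g4"): 0.85,
    ("g2", "g4"): 0.40,
}
nodes = ["g1", "g2", "g3", "g4"]
# Strong co-expression becomes a short distance, so the tree keeps strong links.
edges = [(u, v, 1.0 - s) for (u, v), s in similarity.items()]
backbone = minimum_spanning_tree(nodes, edges)
print([(u, v) for u, v, _ in backbone])
```

The tree keeps one fewer edge than there are genes while leaving every gene reachable, which is one way to read "only the essential structural information" in the abstract.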
Munding, Elizabeth M.; Igel, A. Haller; Shiue, Lily; Dorighi, Kristel M.; Treviño, Lisa R.; Ares, Manuel
2010-01-01
Splicing regulatory networks are essential components of eukaryotic gene expression programs, yet little is known about how they are integrated with transcriptional regulatory networks into coherent gene expression programs. Here we define the MER1 splicing regulatory network and examine its role in the gene expression program during meiosis in budding yeast. Mer1p splicing factor promotes splicing of just four pre-mRNAs. All four Mer1p-responsive genes also require Nam8p for splicing activation by Mer1p; however, other genes require Nam8p but not Mer1p, exposing an overlapping meiotic splicing network controlled by Nam8p. MER1 mRNA and three of the four Mer1p substrate pre-mRNAs are induced by the transcriptional regulator Ume6p. This unusual arrangement delays expression of Mer1p-responsive genes relative to other genes under Ume6p control. Products of Mer1p-responsive genes are required for initiating and completing recombination and for activation of Ndt80p, the activator of the transcriptional network required for subsequent steps in the program. Thus, the MER1 splicing regulatory network mediates the dependent relationship between the UME6 and NDT80 transcriptional regulatory networks in the meiotic gene expression program. This study reveals how splicing regulatory networks can be interlaced with transcriptional regulatory networks in eukaryotic gene expression programs. PMID:21123654
Feltus, F Alex; Ficklin, Stephen P; Gibson, Scott M; Smith, Melissa C
2013-06-05
In genomics, highly relevant gene interaction (co-expression) networks have been constructed by finding significant pair-wise correlations between genes in expression datasets. These networks are then mined to elucidate biological function at the polygenic level. In some cases networks may be constructed from input samples that measure gene expression under a variety of different conditions, such as for different genotypes, environments, disease states and tissues. When large sets of samples are obtained from public repositories it is often unmanageable to associate samples into condition-specific groups, and combining samples from various conditions has a negative effect on network size. A fixed significance threshold is often applied, also limiting the size of the final network. Therefore, we propose pre-clustering of input expression samples to approximate condition-specific grouping of samples and individual network construction of each group as a means for dynamic significance thresholding. The net effect is increased sensitivity, thus maximizing the total co-expression relationships in the final co-expression network compendium. A total of 86 Arabidopsis thaliana co-expression networks were constructed after k-means partitioning of 7,105 publicly available ATH1 Affymetrix microarray samples. We term each pre-sorted network a Gene Interaction Layer (GIL). Random Matrix Theory (RMT), an un-supervised thresholding method, was used to threshold each of the 86 networks independently, effectively providing a dynamic (non-global) threshold for the network. The overall gene count across all GILs reached 19,588 genes (94.7% measured gene coverage) and 558,022 unique co-expression relationships. In comparison, network construction without pre-sorting of input samples yielded only 3,297 genes (15.9%) and 129,134 relationships in the global network.
Here we show that pre-clustering of microarray samples helps approximate condition-specific networks and allows for dynamic thresholding using un-supervised methods. Because RMT ensures only highly significant interactions are kept, the GIL compendium consists of 558,022 unique high quality A. thaliana co-expression relationships across almost all of the measurable genes on the ATH1 array. For A. thaliana, these networks represent the largest compendium to date of significant gene co-expression relationships, and are a means to explore complex pathway, polygenic, and pleiotropic relationships for this focal model plant. The networks can be explored at sysbio.genome.clemson.edu. Finally, this method is applicable to any large expression profile collection for any organism and is best suited where a knowledge-independent network construction method is desired. PMID:23738693
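The pre-clustering idea described above can be sketched on toy data. The sketch below partitions samples with a plain k-means (Lloyd's algorithm) and builds one correlation network per sample group. Two simplifications are assumptions of this sketch and not the paper's method: the first k samples seed the centroids, and a fixed Pearson cutoff stands in for Random Matrix Theory thresholding. All names and values are hypothetical.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def pearson(x, y):
    """Pearson correlation; returns 0.0 for a constant profile."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def layered_networks(samples, gene_names, k, threshold):
    """samples: one expression vector per sample (one value per gene)."""
    labels = kmeans(samples, k)
    layers = {}
    for c in range(k):
        group = [s for s, lab in zip(samples, labels) if lab == c]
        if len(group) < 3:  # too few samples for a meaningful correlation
            continue
        profiles = list(zip(*group))  # per-gene values within this group
        edges = set()
        for i in range(len(gene_names)):
            for j in range(i + 1, len(gene_names)):
                if abs(pearson(profiles[i], profiles[j])) >= threshold:
                    edges.add((gene_names[i], gene_names[j]))
        layers[c] = edges
    return layers


genes = ["g1", "g2", "g3"]
# Two hypothetical conditions, interleaved: g3 is 0 in one and 10 in the other.
# g1 and g2 rise together in the first condition and oppose each other in the second.
samples = [
    (1, 1, 0), (1, 4, 10), (2, 2, 0), (2, 3, 10),
    (3, 3, 0), (3, 2, 10), (4, 4, 0), (4, 1, 10),
]
pooled = layered_networks(samples, genes, k=1, threshold=0.9)
layers = layered_networks(samples, genes, k=2, threshold=0.9)
print(pooled, layers)
```

In this toy case the pooled samples yield no edges because the two conditions cancel each other, while each pre-clustered layer recovers the g1-g2 relationship, which mirrors the abstract's point that mixing conditions shrinks the network.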
Reconstructing directed gene regulatory network by only gene expression data.
Zhang, Lu; Feng, Xi Kang; Ng, Yen Kaow; Li, Shuai Cheng
2016-08-18
Accurately identifying gene regulatory networks is an important task in understanding in vivo biological activities. The inference of such networks is often accomplished through the use of gene expression data. Many methods have been developed to evaluate gene expression dependencies between a transcription factor and its target genes, and some methods also eliminate transitive interactions. The regulatory (or edge) direction is undetermined if the target gene is also a transcription factor. Some methods predict the regulatory directions in the gene regulatory networks by locating the eQTL single nucleotide polymorphism, or by observing the gene expression changes when knocking out/down the candidate transcription factors; regrettably, these additional data are usually unavailable, especially for samples derived from human tissues. In this study, we propose the Context Based Dependency Network (CBDN), a method that is able to infer gene regulatory networks with the regulatory directions from gene expression data only. To determine the regulatory direction, CBDN computes the influence of source on target by evaluating the magnitude of the changes in expression dependencies between the target gene and the others when conditioning on the source gene. CBDN extends the data processing inequality by involving the dependency direction to distinguish between direct and transitive relationships between genes. We also define two types of important regulators which can influence a majority of the genes in the network directly or indirectly. CBDN can detect both types of important regulators by averaging the influence functions of a candidate regulator on the other genes. In our experiments with simulated and real data, even with the regulatory direction taken into account, CBDN outperforms the state-of-the-art approaches for inferring gene regulatory networks. CBDN identifies the important regulators in the predicted network: (1) TYROBP influences a batch of genes that are related to Alzheimer's disease; (2) ZNF329 and RB1 significantly regulate the 'mesenchymal' gene expression signature genes for brain tumors. By merely leveraging gene expression data, CBDN can efficiently infer the existence of gene-gene interactions as well as their regulatory directions. The constructed networks are helpful in the identification of important regulators for complex diseases.
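The abstract says CBDN extends the data processing inequality (DPI) with direction information. The sketch below shows only the plain, undirected DPI pruning step that methods such as ARACNE use to drop likely transitive edges: in every fully connected gene triplet, the weakest dependency is treated as indirect and removed. It is not CBDN itself; the dependency scores, gene names, and the `tolerance` parameter are assumptions made for illustration.

```python
from itertools import combinations


def dpi_prune(dependency, tolerance=0.0):
    """dependency maps frozenset({a, b}) -> non-negative dependency score."""
    genes = sorted({g for pair in dependency for g in pair})
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        pairs = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if not all(p in dependency for p in pairs):
            continue  # DPI only applies to fully connected triplets
        weakest = min(pairs, key=lambda p: dependency[p])
        others = [dependency[p] for p in pairs if p != weakest]
        # Mark the weakest edge as transitive if it is clearly below both others.
        if dependency[weakest] < min(others) * (1.0 - tolerance):
            to_remove.add(weakest)
    # Removal happens after all triplets are checked, so order does not matter.
    return {p: s for p, s in dependency.items() if p not in to_remove}


# Hypothetical chain: tf -> gene_a -> gene_b, with a weaker indirect tf-gene_b link.
dependency = {
    frozenset(("tf", "gene_a")): 0.8,
    frozenset(("gene_a", "gene_b")): 0.7,
    frozenset(("tf", "gene_b")): 0.3,
}
pruned = dpi_prune(dependency)
print(sorted(tuple(sorted(p)) for p in pruned))
```

Plain DPI leaves the surviving edges undirected; deciding which gene regulates which is the part that CBDN adds according to the abstract.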
Laarits, T; Bordalo, P; Lemos, B
2016-08-01
Regulatory networks play a central role in the modulation of gene expression, the control of cellular differentiation, and the emergence of complex phenotypes. Regulatory networks could constrain or facilitate evolutionary adaptation in gene expression levels. Here, we model the adaptation of regulatory networks and gene expression levels to a shift in the environment that alters the optimal expression level of a single gene. Our analyses show signatures of natural selection on regulatory networks that both constrain and facilitate rapid evolution of gene expression level towards new optima. The analyses are interpreted from the standpoint of neutral expectations and illustrate the challenge of making inferences about network adaptation. Furthermore, we examine the consequence of variable stabilizing selection across genes on the strength and direction of interactions in regulatory networks and in their subsequent adaptation. We observe that directional selection on a highly constrained gene previously under strong stabilizing selection was more efficient when the gene was embedded within a network of partners under relaxed stabilizing selection pressure. The observation leads to the expectation that evolutionarily resilient regulatory networks will contain optimal ratios of genes whose expression is under weak and strong stabilizing selection. Altogether, our results suggest that the variable strengths of stabilizing selection across genes within regulatory networks might themselves contribute to the long-term adaptation of complex phenotypes. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.
Conserved Non-Coding Regulatory Signatures in Arabidopsis Co-Expressed Gene Modules
Spangler, Jacob B.; Ficklin, Stephen P.; Luo, Feng; Freeling, Michael; Feltus, F. Alex
2012-01-01
Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome. PMID:23024789
Discovery and validation of a glioblastoma co-expressed gene module
Dunwoodie, Leland J.; Poehlman, William L.; Ficklin, Stephen P.; Feltus, Frank Alexander
2018-01-01
Tumors exhibit complex patterns of aberrant gene expression. Using a knowledge-independent, noise-reducing gene co-expression network construction software called KINC, we created multiple RNAseq-based gene co-expression networks relevant to brain and glioblastoma biology. In this report, we describe the discovery and validation of a glioblastoma-specific gene module that contains 22 co-expressed genes. The genes are upregulated in glioblastoma relative to normal brain and lower grade glioma samples; they are also hypo-methylated in glioblastoma relative to lower grade glioma tumors. Among the proneural, neural, mesenchymal, and classical glioblastoma subtypes, these genes are most-highly expressed in the mesenchymal subtype. Furthermore, high expression of these genes is associated with decreased survival across each glioblastoma subtype. These genes are of interest to glioblastoma biology and our gene interaction discovery and validation workflow can be used to discover and validate co-expressed gene modules derived from any co-expression network. PMID:29541392
Characterizing mutation-expression network relationships in multiple cancers.
Ghazanfar, Shila; Yang, Jean Yee Hwa
2016-08-01
Data made available through large cancer consortia like The Cancer Genome Atlas make for a rich source of information to be studied across and between cancers. In recent years, network approaches have been applied to such data to uncover the complex interrelationships between mutational and expression profiles, but they lack direct testing for expression changes via mutation. In this pan-cancer study we analyze mutation and gene expression information in an integrative manner by considering the networks generated by testing for differences in expression in direct association with specific mutations. We relate our findings among the 19 cancers examined to identify commonalities and differences as well as their characteristics. Using somatic mutation and gene expression information across 19 cancers, we generated mutation-expression networks per cancer. On evaluation we found that our generated networks were significantly enriched for known cancer-related genes, for example in skin cutaneous melanoma (p<0.01 using Network of Cancer Genes 4.0). Our framework identified that while different cancers contained commonly mutated genes, there was little concordance between associated gene expression changes among cancers. Comparison between cancers showed a greater overlap of network nodes for cancers with higher overall non-silent mutation load, compared to those with a lower overall non-silent mutation load. This study offers a framework that explores network information through co-analysis of somatic mutations and gene expression profiles. Our pan-cancer application of this approach suggests that while mutations are frequently common among cancer types, the impact they have on the surrounding networks via gene expression changes varies. Despite this finding, there are some cancers for which mutation-associated network behaviour appears to be similar, suggesting a potential framework for uncovering related cancers for which similar therapeutic strategies may be applicable.
Our framework for understanding relationships among cancers has been integrated into an interactive R Shiny application, PAn Cancer Mutation Expression Networks (PACMEN), containing dynamic and static network visualization of the mutation-expression networks. PACMEN also features tools for further examination of network topology characteristics among cancers. Copyright © 2016 Elsevier Ltd. All rights reserved.
Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis
2012-01-01
Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606
Zhang, Jinfeng; Zhao, Wenjuan; Fu, Rong; Fu, Chenglin; Wang, Lingxia; Liu, Huainian; Li, Shuangcheng; Deng, Qiming; Wang, Shiquan; Zhu, Jun; Liang, Yueyang; Li, Ping; Zheng, Aiping
2018-05-05
Rhizoctonia solani causes rice sheath blight, an important disease affecting the growth of rice (Oryza sativa L.). Attempts to control the disease have met with little success. Based on transcriptional profiling, we previously identified more than 11,947 common differentially expressed genes (TPM > 10) between the rice genotypes TeQing and Lemont. In the current study, we extended these findings by focusing on an analysis of gene co-expression in response to R. solani AG1 IA and identified gene modules within the networks through weighted gene co-expression network analysis (WGCNA). We compared the different genes assigned to each module and the biological interpretations of gene co-expression networks at early and later modules in the two rice genotypes to reveal differential responses to AG1 IA. Our results show that different changes occurred in the two rice genotypes and that the modules in the two groups contain a number of candidate genes possibly involved in pathogenesis, such as the VQ protein. Furthermore, these gene co-expression networks provide comprehensive transcriptional information regarding gene expression in rice in response to AG1 IA. The co-expression networks derived from our data offer ideas for follow-up experimentation that will help advance our understanding of the translational regulation of rice gene expression changes in response to AG1 IA.
Ghanat Bari, Mehrab; Ung, Choong Yong; Zhang, Cheng; Zhu, Shizhen; Li, Hu
2017-08-01
Emerging evidence indicates the existence of a new class of cancer genes that act as "signal linkers" coordinating oncogenic signals between mutated and differentially expressed genes. While frequently mutated oncogenes and differentially expressed genes, which we term Class I cancer genes, are readily detected by most analytical tools, the new class of cancer-related genes, i.e., Class II, escape detection because they are neither mutated nor differentially expressed. Given this hypothesis, we developed a Machine Learning-Assisted Network Inference (MALANI) algorithm, which assesses all genes regardless of expression or mutational status in the context of cancer etiology. We used 8807 expression arrays, corresponding to 9 cancer types, to build more than 2 × 10^8 Support Vector Machine (SVM) models for reconstructing a cancer network. We found that ~3% of ~19,000 not differentially expressed genes are Class II cancer gene candidates. Some Class II genes that we found, such as SLC19A1 and ATAD3B, have been recently reported to associate with cancer outcomes. To our knowledge, this is the first study that utilizes both machine learning and network biology approaches to uncover Class II cancer genes that coordinate functionality in cancer networks, and it will illuminate our understanding of how genes modulated in a tissue-specific network contribute to tumorigenesis and therapy development.
Multiscale Embedded Gene Co-expression Network Analysis
Song, Won-Min; Zhang, Bin
2015-01-01
Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778
Wang, Jianxin; Chen, Bo; Wang, Yaqun; Wang, Ningtao; Garbey, Marc; Tran-Son-Tay, Roger; Berceli, Scott A.; Wu, Rongling
2013-01-01
The capacity of an organism to respond to its environment is facilitated by the environmentally induced alteration of gene and protein expression, i.e. expression plasticity. The reconstruction of gene regulatory networks based on expression plasticity can yield new insights not only into the causality of transcriptional and cellular processes but also into the complex regulatory mechanisms that underlie biological function and adaptation. We describe an approach for network inference by integrating expression plasticity into Shannon’s mutual information. Beyond Pearson correlation, mutual information can capture non-linear dependencies and topology sparseness. The approach measures the network of dependencies of genes expressed in different environments, allowing the environment-induced plasticity of gene dependencies to be tested in unprecedented detail. The approach is also able to characterize the extent to which the same genes trigger different amounts of expression in response to environmental changes. We demonstrated the usefulness of this approach through analysing gene expression data from a rabbit vein graft study that includes two distinct blood flow environments. The proposed approach provides a powerful tool for the modelling and analysis of dynamic regulatory networks using gene expression data from distinct environments. PMID:23470995
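The contrast drawn above between Pearson correlation and mutual information can be illustrated with a small sketch. It uses a plain histogram (plug-in) estimate of Shannon mutual information, not the plasticity-aware estimator described in the abstract; the function names, the choice of three equal-width bins, and the toy expression values are all assumptions made for illustration.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def discretize(values, bins=3):
    """Assign each value to one of `bins` equal-width bins (assumed binning)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        return [0] * len(values)
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=3):
    """Plug-in estimate of Shannon mutual information, in bits."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    n = len(dx)
    px, py, pxy = Counter(dx), Counter(dy), Counter(zip(dx, dy))
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


# Toy profiles: gene_b depends on gene_a quadratically, not linearly.
gene_a = [-3, -2, -1, 0, 1, 2, 3]
gene_b = [v * v for v in gene_a]

r = pearson(gene_a, gene_b)
mi = mutual_information(gene_a, gene_b)
print(f"Pearson r = {r:.3f}, mutual information = {mi:.3f} bits")
```

In this toy case the linear correlation vanishes while the mutual information stays clearly positive, which is the kind of non-linear dependency the abstract refers to. Histogram estimates are sensitive to the number of bins and to sample size, so real analyses would need a more careful estimator.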
Ahi, Ehsan Pashay; Kapralova, Kalina Hristova; Pálsson, Arnar; Maier, Valerie Helene; Gudbrandsson, Jóhannes; Snorrason, Sigurdur S; Jónsson, Zophonías O; Franzdóttir, Sigrídur Rut
2014-01-01
Understanding the molecular basis of craniofacial variation can provide insights into key developmental mechanisms of adaptive changes and their role in trophic divergence and speciation. Arctic charr (Salvelinus alpinus) is a polymorphic fish species, and, in Lake Thingvallavatn in Iceland, four sympatric morphs have evolved distinct craniofacial structures. We conducted a gene expression study on candidates from a conserved gene coexpression network, focusing on the development of craniofacial elements in embryos of two contrasting Arctic charr morphotypes (benthic and limnetic). Four Arctic charr morphs were studied: one limnetic and two benthic morphs from Lake Thingvallavatn and a limnetic reference aquaculture morph. The presence of morphological differences at developmental stages before the onset of feeding was verified by morphometric analysis. Following up on our previous findings that Mmp2 and Sparc were differentially expressed between morphotypes, we identified a network of genes with conserved coexpression across diverse vertebrate species. A comparative expression study of candidates from this network in developing heads of the four Arctic charr morphs verified the coexpression relationship of these genes and revealed distinct transcriptional dynamics strongly correlated with contrasting craniofacial morphologies (benthic versus limnetic). A literature review and Gene Ontology analysis indicated that a significant proportion of the network genes play a role in extracellular matrix organization and skeletogenesis, and motif enrichment analysis of conserved noncoding regions of network candidates predicted a handful of transcription factors, including Ap1 and Ets2, as potential regulators of the gene network. The expression of Ets2 itself was also found to associate with network gene expression. Genes linked to glucocorticoid signalling were also studied, as both Mmp2 and Sparc are responsive to this pathway. 
Among those, several transcriptional targets and upstream regulators showed differential expression between the contrasting morphotypes. Interestingly, although selected network genes showed overlapping expression patterns in situ and no morph differences, Timp2 expression patterns differed between morphs. Our comparative study of transcriptional dynamics in divergent craniofacial morphologies of Arctic charr revealed a conserved network of coexpressed genes sharing functional roles in structural morphogenesis. We also implicate transcriptional regulators of the network as targets for future functional studies.
GEM-TREND: a web tool for gene expression data mining toward relevant network discovery.
Feng, Chunlai; Araki, Michihiro; Kunimoto, Ryo; Tamon, Akiko; Makiguchi, Hiroki; Niijima, Satoshi; Tsujimoto, Gozoh; Okuno, Yasushi
2009-09-03
DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which are available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). In order to support researchers to utilize the public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating its co-expression networks from a publicly available database. GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and retrieve gene expression data by comparing gene-expression pattern between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from the GEO and with in-house microarray data, respectively. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. 
For further analysis, a network visualization interface is also provided, whereby genes and gene annotations are dynamically linked to external data repositories. GEM-TREND was developed to retrieve gene expression data by comparing query gene-expression pattern with those of GEO gene expression data. It could be a very useful resource for finding similar gene expression profiles and constructing its gene co-expression networks from a publicly available database. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery. GEM-TREND is freely available at http://cgs.pharm.kyoto-u.ac.jp/services/network.
Okamura-Oho, Yuko; Shimokawa, Kazuro; Nishimura, Masaomi; Takemoto, Satoko; Sato, Akira; Furuichi, Teiichi; Yokota, Hideo
2014-01-01
Using a recently invented technique for gene expression mapping in the whole-anatomy context, termed transcriptome tomography, we have generated a dataset of 36,000 maps of overall gene expression in the adult-mouse brain. Here, using an informatics approach, we identified a broad co-expression network that follows an inverse power law and is rich in functional interaction and gene-ontology terms. Our framework for the integrated analysis of expression maps and graphs of co-expression networks revealed that groups of combinatorially expressed genes, which regulate cell differentiation during development, were present in the adult brain and each of these groups was associated with discrete cell types. These groups included non-coding genes of unknown function. We found that these genes specifically linked developmentally conserved groups in the network. A previously unrecognized robust expression pattern covering the whole brain was related to the molecular anatomy of key biological processes occurring in particular areas. PMID:25382412
Xu, Huayong; Yu, Hui; Tu, Kang; Shi, Qianqian; Wei, Chaochun; Li, Yuan-Yuan; Li, Yi-Xue
2013-01-01
We are witnessing rapid progress in the development of methodologies for building the combinatorial gene regulatory networks involving both TFs (Transcription Factors) and miRNAs (microRNAs). There are a few tools available to do these jobs but most of them are not easy to use and not accessible online. A web server is especially needed in order to allow users to upload experimental expression datasets and build combinatorial regulatory networks corresponding to their particular contexts. In this work, we compiled putative TF-gene, miRNA-gene and TF-miRNA regulatory relationships from forward-engineering pipelines and curated them as built-in data libraries. We streamlined the R codes of our two separate forward-and-reverse engineering algorithms for combinatorial gene regulatory network construction and formalized them as two major functional modules. As a result, we released the cGRNB (combinatorial Gene Regulatory Networks Builder): a web server for constructing combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. The cGRNB provides two major network-building modules, one for MPGE (miRNA-perturbed gene expression) datasets and the other for parallel miRNA/mRNA expression datasets. A miRNA-centered two-layer combinatorial regulatory cascade is the output of the first module and a comprehensive genome-wide network involving all three types of combinatorial regulations (TF-gene, TF-miRNA, and miRNA-gene) is the output of the second module. In this article we propose cGRNB, a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. Since parallel miRNA/mRNA expression datasets are rapidly accumulating with the advance of next-generation sequencing techniques, cGRNB will be a very useful tool for researchers to build combinatorial gene regulatory networks based on expression datasets. 
The cGRNB web-server is free and available online at http://www.scbit.org/cgrnb.
VTCdb: a gene co-expression database for the crop species Vitis vinifera (grapevine).
Wong, Darren C J; Sweetman, Crystal; Drew, Damian P; Ford, Christopher M
2013-12-16
Gene expression datasets in model plants such as Arabidopsis have contributed to our understanding of gene function and how a single underlying biological process can be governed by a diverse network of genes. The accumulation of publicly available microarray data encompassing a wide range of biological and environmental conditions has enabled the development of additional capabilities including gene co-expression analysis (GCA). GCA is based on the understanding that genes encoding proteins involved in similar and/or related biological processes may exhibit comparable expression patterns over a range of experimental conditions, developmental stages and tissues. We present an open access database for the investigation of gene co-expression networks within the cultivated grapevine, Vitis vinifera. The new gene co-expression database, VTCdb (http://vtcdb.adelaide.edu.au/Home.aspx), offers an online platform for transcriptional regulatory inference in the cultivated grapevine. Using condition-independent and condition-dependent approaches, grapevine co-expression networks were constructed using the latest publicly available microarray datasets from diverse experimental series, utilising the Affymetrix Vitis vinifera GeneChip (16 K) and the NimbleGen Grape Whole-genome microarray chip (29 K), thus making it possible to profile approximately 29,000 genes (95% of the predicted grapevine transcriptome). Applications available with the online platform include the use of gene names, probesets, modules or biological processes to query the co-expression networks, with the option to choose between Affymetrix or Nimblegen datasets and between multiple co-expression measures. Alternatively, the user can browse existing network modules using interactive network visualisation and analysis via CytoscapeWeb. 
To demonstrate the utility of the database, we present examples from three fundamental biological processes (berry development, photosynthesis and flavonoid biosynthesis) whereby the recovered sub-networks reconfirm established plant gene functions and also identify novel associations. Together, we present valuable insights into grapevine transcriptional regulation by developing network models applicable to researchers in their prioritisation of gene candidates, for on-going study of biological processes related to grapevine development, metabolism and stress responses.
Koda, Satoru; Onda, Yoshihiko; Matsui, Hidetoshi; Takahagi, Kotaro; Yamaguchi-Uehara, Yukiko; Shimizu, Minami; Inoue, Komaki; Yoshida, Takuhiro; Sakurai, Tetsuya; Honda, Hiroshi; Eguchi, Shinto; Nishii, Ryuei; Mochida, Keiichi
2017-01-01
We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX) model with a group smoothly clipped absolute deviation (SCAD) method using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals using three biological replications, and identified 3,621 periodic genes through our wavelet analysis. The expression data are feasible to infer network sparsity based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, and post-transcriptional modification and photosynthesis are significantly enriched in the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified putative transcription factor-encoding genes that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that the networks represent typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, demonstrating the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.
Differentially Coexpressed Disease Gene Identification Based on Gene Coexpression Network.
Jiang, Xue; Zhang, Han; Quan, Xiongwen
2016-01-01
Screening disease-related genes by analyzing gene expression data has become a popular theme. Traditional disease-related gene selection methods always focus on identifying differentially expressed genes between case samples and a control group. These traditional methods may not fully consider the changes of interactions between genes at different cell states and the dynamic processes of gene expression levels during the disease progression. However, in order to understand the mechanism of disease, it is important to explore the dynamic changes of interactions between genes in biological networks at different cell states. In this study, we designed a novel framework to identify disease-related genes and developed a differentially coexpressed disease-related gene identification method based on gene coexpression network (DCGN) to screen differentially coexpressed genes. We first constructed phase-specific gene coexpression networks using time-series gene expression data and defined the concept of differential coexpression of genes in a coexpression network. Then, we designed two metrics to measure the value of gene differential coexpression according to the change of local topological structures between different phase-specific networks. Finally, we conducted a meta-analysis of gene differential coexpression based on the rank-product method. Experimental results on real-world gene expression data sets demonstrated the feasibility and effectiveness of DCGN and its superior performance over other popular disease-related gene selection methods.
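The final step described above, combining several differential-coexpression scores by rank product, can be sketched as follows. This is a minimal illustration and not the DCGN implementation: the two score tables stand in for the paper's two topology-based metrics, and the gene names, score values, and tie-breaking rule are assumptions. It needs Python 3.8 or later for math.prod.

```python
import math


def ranks_descending(scores):
    """Map gene -> rank, where rank 1 is the largest score.

    Ties are broken alphabetically by gene name (an arbitrary assumption).
    """
    ordered = sorted(scores, key=lambda g: (-scores[g], g))
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def rank_product(score_tables):
    """Geometric mean of each gene's ranks across all score tables."""
    rank_tables = [ranks_descending(t) for t in score_tables]
    k = len(rank_tables)
    return {
        gene: math.prod(rt[gene] for rt in rank_tables) ** (1.0 / k)
        for gene in rank_tables[0]
    }


# Hypothetical differential-coexpression scores from two metrics.
metric_1 = {"geneA": 0.9, "geneB": 0.4, "geneC": 0.1, "geneD": 0.6}
metric_2 = {"geneA": 0.7, "geneB": 0.8, "geneC": 0.2, "geneD": 0.5}

rp = rank_product([metric_1, metric_2])
# A smaller rank product means a gene ranks consistently high.
ordered = sorted(rp, key=lambda g: (rp[g], g))
print(ordered)
```

A gene that ranks near the top under both metrics gets a small rank product, so the combined ordering rewards consistency rather than a single extreme score. Significance would normally be assessed by comparing against rank products of permuted rankings, which this sketch omits.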
Yu, Tonghu; Zhang, Huaping; Qi, Hong
2018-01-01
The aim of the present study was to identify additional colon cancer-related genes at different stages. Gene expression profile E-GEOD-62932 was extracted for differentially expressed gene (DEG) screening. Series test of cluster analysis was used to obtain significant trending models. Based on the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes databases, functional and pathway enrichment analyses were performed and a pathway relation network was constructed. Gene co-expression network and gene signal network were constructed for common DEGs. The DEGs with the same trend were clustered and in total, 16 clusters with statistical significance were obtained. The screened DEGs were enriched in small molecule metabolic process and metabolic pathways. The pathway relation network was constructed with 57 nodes. A total of 328 common DEGs were obtained. Gene signal network was constructed with 71 nodes. Gene co-expression network was constructed with 161 nodes and 211 edges. ABCD3, CPT2, AGL and JAM2 are potential biomarkers for the diagnosis of colon cancer. PMID:29928385
A statistical method for measuring activation of gene regulatory networks.
Esteves, Gustavo H; Reis, Luiz F L
2018-06-13
Gene expression data analysis is of great importance for modern molecular biology, given our ability to measure the expression profiles of thousands of genes, which enables studies rooted in systems biology. In this work, we propose a simple statistical model for measuring the activation of gene regulatory networks, instead of the traditional gene co-expression networks. We present the mathematical construction of a statistical procedure for testing hypotheses regarding gene regulatory network activation. The real probability distribution for the test statistic is evaluated by a permutation-based study. To illustrate the functionality of the proposed methodology, we also present a simple example based on a small hypothetical network and the measurement of the activation of two KEGG networks, both based on gene expression data collected from gastric and esophageal samples. The two KEGG networks were also analyzed for a public database, available through NCBI-GEO, presented as Supplementary Material. This method was implemented in an R package that is available at the BioConductor project website under the name maigesPack.
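The permutation-based evaluation of a test statistic mentioned above can be sketched in a few lines. The statistic used here, the absolute difference in mean expression between two sample groups, is a stand-in chosen for simplicity and is not the activation statistic implemented in maigesPack; the function names and expression values are hypothetical. Because the toy groups are small, every relabelling is enumerated exactly instead of being sampled at random.

```python
from itertools import combinations


def mean(values):
    return sum(values) / len(values)


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means.

    Every way of relabelling the pooled samples into groups of the
    original sizes is enumerated, so the result is deterministic.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    total = 0
    extreme = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point round-off.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical per-sample network scores for two tissue groups.
group_1 = [5.1, 5.4, 5.0, 5.3]
group_2 = [3.9, 4.2, 4.0, 4.1]

p_value = exact_permutation_pvalue(group_1, group_2)
print(f"exact permutation p-value = {p_value:.4f}")
```

With four samples per group there are 70 relabellings, and only the observed split and its mirror image are as extreme as the data, giving p = 2/70. For realistic sample sizes the enumeration becomes infeasible and a random subset of permutations is drawn instead.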
Ma, Chuang; Xin, Mingming; Feldmann, Kenneth A.; Wang, Xiangfeng
2014-01-01
Machine learning (ML) is an intelligent data mining technique that builds a prediction model based on the learning of prior knowledge to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning–based differential network analysis (mlDNA) and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. The mlDNA first used an ML-based filtering process to remove nonexpressed, constitutively expressed, or non-stress-responsive “noninformative” genes prior to network construction, through learning the patterns of 32 expression characteristics of known stress-related genes. The retained “informative” genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing–based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress–related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes. PMID:24520154
Pan- and core- network analysis of co-expression genes in a model plant
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Fei; Maslov, Sergei
2016-12-16
Genome-wide gene expression experiments have been performed using the model plant Arabidopsis during the last decade. Some studies involved construction of coexpression networks, a popular technique used to identify groups of co-regulated genes, to infer unknown gene functions. One approach is to construct a single coexpression network by combining multiple expression datasets generated in different labs. We advocate a complementary approach in which we construct a large collection of 134 coexpression networks based on expression datasets reported in individual publications. To this end we reanalyzed public expression data. To describe this collection of networks we introduced concepts of ‘pan-network’ and ‘core-network’ representing union and intersection between sizeable fractions of individual networks, respectively. Here, we showed that these two types of networks are different both in terms of their topology and biological function of interacting genes. For example, the modules of the pan-network are enriched in regulatory and signaling functions, while the modules of the core-network tend to include components of large macromolecular complexes such as ribosomes and photosynthetic machinery. Our analysis is aimed to help the plant research community to better explore the information contained within the existing vast collection of gene expression data in Arabidopsis.
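The pan-network and core-network idea described above can be sketched by counting how many individual networks contain each edge. The abstract does not state the exact fractions used, so the thresholds below (an edge in at least one network for the pan-network, in at least `core_min` networks for the core-network) are assumptions, as are the function names and the toy edge lists.

```python
from collections import Counter


def edge(u, v):
    """Canonical undirected edge, so (u, v) and (v, u) compare equal."""
    return tuple(sorted((u, v)))


def pan_and_core(networks, core_min):
    """Return (pan, core) edge sets from a list of edge sets.

    pan  : edges present in at least one network (union-like)
    core : edges present in at least `core_min` networks
           (intersection-like; the threshold is an assumed parameter)
    """
    counts = Counter(e for net in networks for e in net)
    pan = set(counts)
    core = {e for e, c in counts.items() if c >= core_min}
    return pan, core


# Three hypothetical coexpression networks over genes g1..g5.
networks = [
    {edge("g1", "g2"), edge("g2", "g3")},
    {edge("g2", "g1"), edge("g3", "g4")},
    {edge("g1", "g2"), edge("g2", "g3"), edge("g4", "g5")},
]

pan, core = pan_and_core(networks, core_min=2)
print(len(pan), sorted(core))
```

Here the pan-network has four edges, while only the two edges seen in at least two of the three networks survive into the core-network. Raising `core_min` towards the number of networks moves the core closer to a strict intersection.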
Statistical indicators of collective behavior and functional clusters in gene networks of yeast
NASA Astrophysics Data System (ADS)
Živković, J.; Tadić, B.; Wick, N.; Thurner, S.
2006-03-01
We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
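The construction described above, linking genes whose expression profiles are strongly correlated and following the growth of the giant component, can be sketched as below. The use of absolute Pearson correlation, the particular thresholds, and the toy profiles are assumptions for illustration and are not taken from the paper; the minimum spanning tree step is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def giant_component_size(profiles, threshold):
    """Size of the largest connected component when genes are linked
    if |Pearson r| >= threshold (union-find over all gene pairs)."""
    genes = list(profiles)
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if abs(pearson(profiles[a], profiles[b])) >= threshold:
                parent[find(a)] = find(b)

    sizes = {}
    for g in genes:
        root = find(g)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


# Hypothetical expression time series for five genes.
profiles = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 11],
    "g3": [5, 4, 3, 2, 1],
    "g4": [1, 3, 2, 5, 4],
    "g5": [2, 1, 2, 1, 2],
}

sizes = {t: giant_component_size(profiles, t) for t in (0.95, 0.75, 0.5)}
print(sizes)
```

Lowering the threshold admits weaker correlations as edges, so the largest component grows from three genes to all five in this toy example. On real data, the threshold at which the giant component emerges is what a percolation-style analysis would examine.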
Logsdon, Benjamin A.; Mezey, Jason
2010-01-01
Cellular gene expression measurements contain regulatory information that can be used to discover novel network relationships. Here, we present a new algorithm for network reconstruction powered by the adaptive lasso, a theoretically and empirically well-behaved method for selecting the regulatory features of a network. Any algorithms designed for network discovery that make use of directed probabilistic graphs require perturbations, produced by either experiments or naturally occurring genetic variation, to successfully infer unique regulatory relationships from gene expression data. Our approach makes use of appropriately selected cis-expression Quantitative Trait Loci (cis-eQTL), which provide a sufficient set of independent perturbations for maximum network resolution. We compare the performance of our network reconstruction algorithm to four other approaches: the PC-algorithm, QTLnet, the QDG algorithm, and the NEO algorithm, all of which have been used to reconstruct directed networks among phenotypes leveraging QTL. We show that the adaptive lasso can outperform these algorithms for networks of ten genes and ten cis-eQTL, and is competitive with the QDG algorithm for networks with thirty genes and thirty cis-eQTL, with rich topologies and hundreds of samples. Using this novel approach, we identify unique sets of directed relationships in Saccharomyces cerevisiae when analyzing genome-wide gene expression data for an intercross between a wild strain and a lab strain. We recover novel putative network relationships between a tyrosine biosynthesis gene (TYR1), and genes involved in endocytosis (RCY1), the spindle checkpoint (BUB2), sulfonate catabolism (JLP1), and cell-cell communication (PRM7). Our algorithm provides a synthesis of feature selection methods and graphical model theory that has the potential to reveal new directed regulatory relationships from the analysis of population level genetic and gene expression data. PMID:21152011
Integration of biological networks and gene expression data using Cytoscape
Cline, Melissa S; Smoot, Michael; Cerami, Ethan; Kuchinsky, Allan; Landys, Nerius; Workman, Chris; Christmas, Rowan; Avila-Campilo, Iliana; Creech, Michael; Gross, Benjamin; Hanspers, Kristina; Isserlin, Ruth; Kelley, Ryan; Killcoyne, Sarah; Lotia, Samad; Maere, Steven; Morris, John; Ono, Keiichiro; Pavlovic, Vuk; Pico, Alexander R; Vailaya, Aditya; Wang, Peng-Liang; Adler, Annette; Conklin, Bruce R; Hood, Leroy; Kuiper, Martin; Sander, Chris; Schmulevich, Ilya; Schwikowski, Benno; Warner, Guy J; Ideker, Trey; Bader, Gary D
2013-01-01
Cytoscape is a free software package for visualizing, modeling and analyzing molecular and genetic interaction networks. This protocol explains how to use Cytoscape to analyze the results of mRNA expression profiling, and other functional genomics and proteomics experiments, in the context of an interaction network obtained for genes of interest. Five major steps are described: (i) obtaining a gene or protein network, (ii) displaying the network using layout algorithms, (iii) integrating with gene expression and other functional attributes, (iv) identifying putative complexes and functional modules and (v) identifying enriched Gene Ontology annotations in the network. These steps provide a broad sample of the types of analyses performed by Cytoscape. PMID:17947979
Gibson, Scott M; Ficklin, Stephen P; Isaacson, Sven; Luo, Feng; Feltus, Frank A; Smith, Melissa C
2013-01-01
The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. For such a test, hundreds of networks are needed and hence a tool to rapidly construct these networks. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust.
"Gene expression network" is the term used to describe the interplay, simple or complex, between two or more gene products in performing a specific cellular function. Although the delineation of such networks is complicated by the existence of multiple and subtle types of intera...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pena-Castillo, Lourdes; Mercer, Ryan; Gurinovich, Anastasia
2014-08-28
The genus Rhodobacter contains purple nonsulfur bacteria found mostly in freshwater environments. Representative strains of two Rhodobacter species, R. capsulatus and R. sphaeroides, have had their genomes fully sequenced and both have been the subject of transcriptional profiling studies. Gene co-expression networks can be used to identify modules of genes with similar expression profiles. Functional analysis of gene modules can then associate co-expressed genes with biological pathways, and network statistics can determine the degree of module preservation in related networks. In this paper, we constructed an R. capsulatus gene co-expression network, performed functional analysis of identified gene modules, and investigated preservation of these modules in R. capsulatus proteomics data and in R. sphaeroides transcriptomics data. Results: The analysis identified 40 gene co-expression modules in R. capsulatus. Investigation of the module gene contents and expression profiles revealed patterns that were validated based on previous studies supporting the biological relevance of these modules. We identified two R. capsulatus gene modules preserved in the protein abundance data. We also identified several gene modules preserved between both Rhodobacter species, which indicate that these cellular processes are conserved between the species and are candidates for functional information transfer between species. Many gene modules were non-preserved, providing insight into processes that differentiate the two species. In addition, using Local Network Similarity (LNS), a recently proposed metric for expression divergence, we assessed the expression conservation of between-species pairs of orthologs, and within-species gene-protein expression profiles. Conclusions: Our analyses provide new sources of information for functional annotation in R. capsulatus because uncharacterized genes in modules are now connected with groups of genes that constitute a joint functional annotation. We identified R. capsulatus modules enriched with genes for ribosomal proteins, porphyrin and bacteriochlorophyll anabolism, and biosynthesis of secondary metabolites to be preserved in R. sphaeroides whereas modules related to RcGTA production and signalling showed lack of preservation in R. sphaeroides. In addition, we demonstrated that network statistics may also be applied within-species to identify congruence between mRNA expression and protein abundance data for which simple correlation measurements have previously had mixed results.
Reverse engineering gene regulatory networks from measurement with missing values.
Ogundijo, Oyetunji E; Elmas, Abdulkadir; Wang, Xiaodong
2016-12-01
Gene expression time series data are usually in the form of high-dimensional arrays. Unfortunately, the data may sometimes contain missing values: either the expression values of some genes at some time points, or the entire expression values of a single time point or of some sets of consecutive time points. This significantly affects the performance of many algorithms for gene expression analysis that take the complete matrix of gene expression measurements as input. For instance, previous works have shown that gene regulatory interactions can be estimated from the complete matrix of gene expression measurements. Yet, to date, few algorithms have been proposed for the inference of gene regulatory networks from gene expression data with missing values. We describe a nonlinear dynamic stochastic model for the evolution of gene expression. The model captures the structural, dynamical, and nonlinear natures of the underlying biomolecular systems. We present point-based Gaussian approximation (PBGA) filters for joint state and parameter estimation of the system with one-step or two-step missing measurements. The PBGA filters use Gaussian approximation and various quadrature rules, such as the unscented transform (UT), the third-degree cubature rule and the central difference rule, for computing the related posteriors. The proposed algorithm is evaluated with satisfactory results for synthetic networks, in silico networks released as a part of the DREAM project, and a real biological network, the in vivo reverse engineering and modeling assessment (IRMA) network of the yeast Saccharomyces cerevisiae. PBGA filters are proposed to elucidate the underlying gene regulatory network (GRN) from time series gene expression data that contain missing values. In our state-space model, we proposed a measurement model that incorporates the effect of the missing data points into the sequential algorithm. 
This approach produces a better inference of the model parameters and hence, more accurate prediction of the underlying GRN compared to when using the conventional Gaussian approximation (GA) filters ignoring the missing data points.
Nayak, Renuka R.; Kearns, Michael; Spielman, Richard S.; Cheung, Vivian G.
2009-01-01
Genes interact in networks to orchestrate cellular processes. Analysis of these networks provides insights into gene interactions and functions. Here, we took advantage of normal variation in human gene expression to infer gene networks, which we constructed using correlations in expression levels of more than 8.5 million gene pairs in immortalized B cells from three independent samples. The resulting networks allowed us to identify biological processes and gene functions. Among the biological pathways, we found processes such as translation and glycolysis that co-occur in the same subnetworks. We predicted the functions of poorly characterized genes, including CHCHD2 and TMEM111, and provided experimental evidence that TMEM111 is part of the endoplasmic reticulum-associated secretory pathway. We also found that IFIH1, a susceptibility gene of type 1 diabetes, interacts with YES1, which plays a role in glucose transport. Furthermore, genes that predispose to the same diseases are clustered nonrandomly in the coexpression network, suggesting that networks can provide candidate genes that influence disease susceptibility. Therefore, our analysis of gene coexpression networks offers information on the role of human genes in normal and disease processes. PMID:19797678
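The correlation-based network construction summarized above can be illustrated with a minimal sketch. It assumes a plain Pearson correlation and a fixed cutoff; the gene names, expression values, threshold of 0.9 and the function names `pearson` and `coexpression_edges` are all hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = sqrt(var_x * var_y)
    # A constant profile has no defined correlation; treat it as uncorrelated.
    return cov / denom if denom > 0 else 0.0


def coexpression_edges(expr, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy expression matrix (hypothetical values): gene -> profile across samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 2.5],
}
edges = coexpression_edges(expr, threshold=0.9)
for g1, g2, r in edges:
    print(g1, g2, round(r, 3))
```

In this toy input, geneA, geneB and geneC are perfectly (anti-)correlated and form the only edges, while geneD stays unconnected. A genome-scale analysis would need a vectorized implementation and a principled choice of cutoff.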
Zhou, Xionghui; Liu, Juan
2014-01-01
Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying a phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, whereas previous studies either neglected it or only used it to select the differentially expressed genes as inputs for network construction. Specifically, we integrate phenotype information with gene expression data to identify gene dependency pairs by using conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed a gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. Its validity is also supported by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build a classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms classification models built with published signatures. 
In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for phenotypic change.
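As a rough illustration of the conditional mutual information idea in the abstract above, the following sketch estimates I(X;Y|Z) from discrete samples with a plug-in (frequency-count) estimator. The binarized toy data, the function name and the mapping of X, Y, Z to gene A, phenotype and gene B are assumptions made for illustration only. The abstract does not specify the authors' discretization, estimator or significance test.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits from discrete samples."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    cmi = 0.0
    for (a, b, c), count in n_xyz.items():
        # p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), written with counts.
        cmi += (count / n) * log2(count * n_z[c] / (n_xz[(a, c)] * n_yz[(b, c)]))
    return cmi


# Hypothetical binarized data: the phenotype is the XOR of gene A and gene B,
# so gene A alone is uninformative but becomes informative once B is known.
gene_a = [0, 0, 1, 1]
gene_b = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]

constant = [0] * len(gene_a)  # conditioning on a constant gives plain MI
mi_a_pheno = conditional_mutual_information(gene_a, phenotype, constant)
cmi_a_pheno_given_b = conditional_mutual_information(gene_a, phenotype, gene_b)
print(mi_a_pheno, cmi_a_pheno_given_b)
```

In this toy case the marginal mutual information between gene A and the phenotype is zero, while conditioning on gene B raises it to one bit. That is the kind of dependency the pair (A,B) is meant to capture, under the assumptions stated above.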
Pan, Yue; Lu, Lingyun; Chen, Junquan; Zhong, Yong; Dai, Zhehao
2018-01-01
This study aimed to identify potential crucial genes and to construct microRNA-mRNA negative regulatory networks in osteosarcoma by comprehensive bioinformatics analysis. Gene expression profiles (GSE28424) and miRNA expression profiles (GSE28423) were downloaded from the GEO database. The differentially expressed genes (DEGs) and miRNAs (DEMIs) were obtained with R Bioconductor packages. Functional and enrichment analyses of selected genes were performed using the DAVID database. A protein-protein interaction (PPI) network was constructed with STRING and visualized in Cytoscape. The relationships among the DEGs and the modules in the PPI network were analyzed with the plug-ins NetworkAnalyzer and MCODE, respectively. Using TargetScan and comparing the predicted target genes with the DEGs, the miRNA-mRNA regulation network was established. In total, 346 DEGs and 90 DEMIs were found to be differentially expressed. These DEGs were enriched in biological processes and KEGG pathways of the inflammatory immune response. Twenty-five genes in the PPI network were selected as hub genes. The top 10 hub genes were TYROBP, HLA-DRA, VWF, PPBP, SERPING1, HLA-DPA1, SERPINA1, KIF20A, FERMT3 and HLA-E. The PPI network of DEGs followed a power-law pattern and met the characteristics of a small-world network. MCODE analysis identified 4 clusters, and the most significant cluster consisted of 11 nodes and 55 edges. SEPP1, CKS2, TCAP and BPI were identified as the seed genes in their respective clusters. The miRNA-mRNA regulation network, which was composed of 89 pairs, was established. miR-210 had the highest connectivity, with 12 target genes. Among the predicted targets of miR-96, HLA-DPA1 and TYROBP were hub genes. Our study indicated possible differentially expressed genes and miRNAs, and microRNA-mRNA negative regulatory networks in osteosarcoma by bioinformatics analysis, which may provide novel insights for unraveling the pathogenesis of osteosarcoma.
Gene networks and the evolution of plant morphology.
Das Gupta, Mainak; Tsiantis, Miltos
2018-06-06
Elaboration of morphology depends on the precise orchestration of gene expression by key regulatory genes. The hierarchy and relationships among the participating genes are commonly known as a gene regulatory network (GRN). Therefore, the evolution of morphology ultimately occurs by the rewiring of gene network structures or by the co-option of gene networks to novel domains. The availability of high-resolution expression data combined with powerful statistical tools has opened up new avenues to formulate and test hypotheses on how diverse gene networks influence trait development and diversity. Here we summarize recent studies based on both big-data and genetics approaches to understand the evolution of plant form and physiology. We also discuss recent genome-wide investigations on how studying open-chromatin regions may help study the evolution of gene expression patterns. Copyright © 2018. Published by Elsevier Ltd.
Wu, Shuang; Liu, Zhi-Ping; Qiu, Xing; Wu, Hulin
2014-01-01
The immune response to viral infection is regulated by an intricate network of many genes and their products. The reverse engineering of gene regulatory networks (GRNs) using mathematical models from time course gene expression data collected after influenza infection is key to our understanding of the mechanisms involved in controlling influenza infection within a host. A five-step pipeline: detection of temporally differentially expressed genes, clustering genes into co-expressed modules, identification of network structure, parameter estimate refinement, and functional enrichment analysis, is developed for reconstructing high-dimensional dynamic GRNs from genome-wide time course gene expression data. Applying the pipeline to the time course gene expression data from influenza-infected mouse lungs, we have identified 20 distinct temporal expression patterns in the differentially expressed genes and constructed a module-based dynamic network using a linear ODE model. Both intra-module and inter-module annotations and regulatory relationships of our inferred network show some interesting findings and are highly consistent with existing knowledge about the immune response in mice after influenza infection. The proposed method is a computationally efficient, data-driven pipeline bridging experimental data, mathematical modeling, and statistical analysis. The application to the influenza infection data elucidates the potentials of our pipeline in providing valuable insights into systematic modeling of complicated biological processes.
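The linear ODE model mentioned in the abstract above has the general form dx/dt = A x, where x holds module expression levels and A holds regulatory coefficients. The sketch below integrates such a system with a simple forward Euler scheme. The two-module coefficient matrix, initial state and step size are hypothetical, and the abstract does not describe the authors' actual solver or parameter estimation procedure.

```python
def simulate_linear_ode(A, x0, t_end, dt=0.001):
    """Integrate dx/dt = A x with forward Euler; return the state at t_end."""
    x = list(x0)
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Each derivative is a weighted sum of the current module levels.
        dx = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
        x = [x_i + dt * dx_i for x_i, dx_i in zip(x, dx)]
    return x


# Hypothetical two-module network: module 0 decays on its own and
# activates module 1, which also decays.
A = [[-0.5, 0.0],
     [0.8, -0.3]]
x0 = [1.0, 0.0]
state = simulate_linear_ode(A, x0, t_end=2.0)
print(state)
```

Forward Euler is used here only because it is short. A stiff or high-dimensional system would call for an adaptive solver.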
de Arruda, Henrique Ferraz; Comin, Cesar Henrique; Miazaki, Mauro; Viana, Matheus Palhares; Costa, Luciano da Fontoura
2015-04-30
A key point in developmental biology is to understand how gene expression influences the morphological and dynamical patterns that are observed in living beings. In this work we propose a methodology capable of addressing this problem that is based on estimating the mutual information and Pearson correlation between the intensity of gene expression and measurements of several morphological properties of the cells. A similar approach is applied in order to identify effects of gene expression over the system dynamics. Neuronal networks were artificially grown over a lattice by considering a reference model used to generate artificial neurons. The input parameters of the artificial neurons were determined according to two distinct patterns of gene expression and the dynamical response was assessed by considering the integrate-and-fire model. As far as single gene dependence is concerned, we found that the interaction between the gene expression and the network topology, as well as between the former and the dynamics response, is strongly affected by the gene expression pattern. In addition, we observed a high correlation between the gene expression and some topological measurements of the neuronal network for particular patterns of gene expression. To our best understanding, there are no similar analyses to compare with. A proper understanding of gene expression influence requires jointly studying the morphology, topology, and dynamics of neurons. The proposed framework represents a first step towards predicting gene expression patterns from morphology and connectivity. Copyright © 2015. Published by Elsevier B.V.
Guo, Nan; Zhang, Nan; Yan, Liqiu; Lian, Zheng; Wang, Jiawang; Lv, Fengfeng; Wang, Yunfei; Cao, Xufen
2018-06-14
Acute myocardial infarction induces ventricular remodeling, which is implicated in dilated heart and heart failure. The pathogenic mechanism of myocardium remodeling remains to be elucidated. The aim of the present study was to identify key genes and networks for myocardium remodeling following ischemia‑reperfusion (IR). First, the mRNA expression data from the National Center for Biotechnology Information database were downloaded to identify differences in mRNA expression of the IR heart at days 2 and 7. Then, weighted gene co‑expression network analysis, hierarchical clustering, protein‑protein interaction (PPI) network analysis, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were used to identify key genes and networks for the heart remodeling process following IR. A total of 3,321 differentially expressed genes were identified during the heart remodeling process. A total of 6 modules were identified through gene co‑expression network analysis. GO and KEGG analysis results suggested that each module represented a different biological function and was associated with different pathways. Finally, hub genes of each module were identified by PPI network construction. The present study revealed that heart remodeling following IR is a complicated process, involving extracellular matrix organization, neural development, apoptosis and energy metabolism. The dysregulated genes, including SRC proto‑oncogene, non‑receptor tyrosine kinase; discs large MAGUK scaffold protein 1; ATP citrate lyase; RAN, member RAS oncogene family; tumor protein p53; and polo like kinase 2, may be essential for heart remodeling following IR and may be used as potential targets for the inhibition of heart remodeling following acute myocardial infarction.
Okada, D; Endo, S; Matsuda, H; Ogawa, S; Taniguchi, Y; Katsuta, T; Watanabe, T; Iwaisaki, H
2018-05-12
Genome-wide association studies (GWAS) of quantitative traits have detected numerous genetic associations, but they encounter difficulties in pinpointing prominent candidate genes and inferring gene networks. The present study used a systems genetics approach integrating GWAS results with external RNA-expression data to detect candidate gene networks in feed utilization and growth traits of Japanese Black cattle, which are matters of concern. A SNP co-association network was derived from significant correlations between SNPs with effects estimated by GWAS across seven phenotypic traits. The resulting network genes contained significant numbers of annotations related to the traits. Using bovine transcriptome data from a public database, an RNA co-expression network was inferred based on the similarity of expression patterns across different tissues. An intersection network was then generated by superimposing the SNP and RNA networks and extracting shared interactions. This intersection network contained four tissue-specific modules: nervous system, reproductive system, muscular system, and glands. To characterize the structure (topographical properties) of the three networks, their scale-free properties were evaluated, which revealed that the intersection network was the most scale-free. In the sub-network containing the most connected transcription factors (URI1, ROCK2 and ETV6), most genes were widely expressed across tissues, and genes previously shown to be involved in the traits were found. Results indicated that the current approach might be used to construct a gene network that better reflects biological information, providing encouragement for the genetic dissection of economically important quantitative traits.
Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia
2015-06-01
To explore the molecular mechanisms of bladder cancer (BC), a network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using the empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed from differentially co-expressed genes and links. The regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed with the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING), and clusters were obtained through the molecular complex detection (MCODE) algorithm. Centrality analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were the top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and in-depth studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.
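Degree-based hub selection, as mentioned in the abstract above, amounts to counting each node's distinct neighbours and ranking the nodes. A minimal sketch follows. The edge list uses placeholder node names (g1 to g5) rather than real interactions, and the function names `degree_centrality` and `top_hubs` are hypothetical. Stress and betweenness centrality are not covered here.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Map each node of an undirected network to its number of distinct neighbours."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbours[u].add(v)
            neighbours[v].add(u)
    return {node: len(nbrs) for node, nbrs in neighbours.items()}


def top_hubs(edges, k=2):
    """Return the k highest-degree nodes; ties are broken alphabetically."""
    degree = degree_centrality(edges)
    return sorted(degree, key=lambda node: (-degree[node], node))[:k]


# Placeholder interaction list (not real PPI data).
toy_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g4", "g5")]
degree = degree_centrality(toy_edges)
hubs = top_hubs(toy_edges, k=2)
print(degree, hubs)
```

Here g1 has three neighbours and is ranked first. The alphabetical tie-break among g2, g3 and g4 is an arbitrary choice made only to keep the output deterministic.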
Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo
2011-01-01
Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
GEM-TREND: a web tool for gene expression data mining toward relevant network discovery
Feng, Chunlai; Araki, Michihiro; Kunimoto, Ryo; Tamon, Akiko; Makiguchi, Hiroki; Niijima, Satoshi; Tsujimoto, Gozoh; Okuno, Yasushi
2009-01-01
Background: DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which are available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). To help researchers use public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating co-expression networks from a publicly available database. Results: GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and to retrieve gene expression data by comparing gene-expression patterns between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from GEO and with in-house microarray data, respectively. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. 
For further analysis, a network visualization interface is also provided, whereby genes and gene annotations are dynamically linked to external data repositories. Conclusion: GEM-TREND was developed to retrieve gene expression data by comparing a query gene-expression pattern with those of GEO gene expression data. It could be a very useful resource for finding similar gene expression profiles and constructing their gene co-expression networks from a publicly available database. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery. GEM-TREND is freely available at . PMID:19728865
Finding gene regulatory network candidates using the gene expression knowledge base.
Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin
2014-12-10
Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.
Isaacson, Sven; Luo, Feng; Feltus, Frank A.; Smith, Melissa C.
2013-01-01
The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. For such a test, hundreds of networks are needed and hence a tool to rapidly construct these networks. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust. PMID:23409071
Taka, Hitomi; Asano, Shin-ichiro; Matsuura, Yoshiharu; Bando, Hisanori
2015-01-01
To infect their hosts, DNA viruses must successfully initiate the expression of viral genes that control subsequent viral gene expression and manipulate the host environment. Viral genes that are immediately expressed upon infection play critical roles in the early infection process. In this study, we investigated the expression and regulation of five canonical regulatory immediate-early (IE) genes of Autographa californica multicapsid nucleopolyhedrovirus: ie0, ie1, ie2, me53, and pe38. A systematic transient gene-expression analysis revealed that these IE genes are generally transactivators, suggesting the existence of a highly interactive regulatory network. A genetic analysis using gene knockout viruses demonstrated that the expression of these IE genes was tolerant to the single deletions of activator IE genes in the early stage of infection. A network graph analysis on the regulatory relationships observed in the transient expression analysis suggested that the robustness of IE gene expression is due to the organization of the IE gene regulatory network and how each IE gene is activated. However, some regulatory relationships detected by the genetic analysis were contradictory to those observed in the transient expression analysis, especially for IE0-mediated regulation. Statistical modeling, combined with genetic analysis using knockout alleles for ie0 and ie1, showed that the repressor function of ie0 was due to the interaction between ie0 and ie1, not ie0 itself. Taken together, these systematic approaches provided insight into the topology and nature of the IE gene regulatory network. PMID:25816136
System Biology Approach: Gene Network Analysis for Muscular Dystrophy.
Censi, Federica; Calcagnini, Giovanni; Mattei, Eugenio; Giuliani, Alessandro
2018-01-01
Phenotypic changes at different organization levels from cell to entire organism are associated with changes in the pattern of gene expression. These changes involve the entire genome expression pattern and heavily rely upon correlation patterns among genes. The classical approach used to analyze gene expression data builds upon the application of supervised statistical techniques to detect genes differentially expressed among two or more phenotypes (e.g., normal vs. disease). The use of an a posteriori, unsupervised approach based on principal component analysis (PCA) and the subsequent construction of gene correlation networks can shed light on unexpected behaviour of the gene regulation system while maintaining a more naturalistic view of the studied system. In this chapter we applied an unsupervised method to discriminate DMD patients and controls. The genes having the highest absolute scores in the discrimination between the groups were then analyzed in terms of gene expression networks, on the basis of their mutual correlation in the two groups. The correlation network structures suggest two different modes of gene regulation in the two groups, reminiscent of important aspects of DMD pathogenesis.
Jupiter, Daniel; Chen, Hailin; VanBuren, Vincent
2009-01-01
Background Although expression microarrays have become a standard tool used by biologists, analysis of data produced by microarray experiments may still present challenges. Comparison of data from different platforms, organisms, and labs may involve complicated data processing, and inferring relationships between genes remains difficult. Results STARNET 2 is a new web-based tool that allows post hoc visual analysis of correlations that are derived from expression microarray data. STARNET 2 facilitates user discovery of putative gene regulatory networks in a variety of species (human, rat, mouse, chicken, zebrafish, Drosophila, C. elegans, S. cerevisiae, Arabidopsis and rice) by graphing networks of genes that are closely co-expressed across a large heterogeneous set of preselected microarray experiments. For each of the represented organisms, raw microarray data were retrieved from NCBI's Gene Expression Omnibus for a selected Affymetrix platform. All pairwise Pearson correlation coefficients were computed for expression profiles measured on each platform, respectively. These precompiled results were stored in a MySQL database, and supplemented by additional data retrieved from NCBI. A web-based tool allows user-specified queries of the database, centered at a gene of interest. The result of a query includes graphs of correlation networks, graphs of known interactions involving genes and gene products that are present in the correlation networks, and initial statistical analyses. Two analyses may be performed in parallel to compare networks, which is facilitated by the new HEATSEEKER module. Conclusion STARNET 2 is a useful tool for developing new hypotheses about regulatory relationships between genes and gene products, and has coverage for 10 species. 
Interpretation of the correlation networks is supported with a database of previously documented interactions, a test for enrichment of Gene Ontology terms, and heat maps of correlation distances that may be used to compare two networks. The list of genes in a STARNET network may be useful in developing a list of candidate genes to use for the inference of causal networks. The tool is freely available at , and does not require user registration. PMID:19828039
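The pairwise-correlation step described in this abstract can be illustrated with a minimal Python sketch. It is not the STARNET 2 implementation: the gene names, the expression values, the 0.9 cutoff and the choice to keep strong negative correlations are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sx * sy)


def correlation_network(profiles, threshold=0.9):
    """Keep gene pairs whose |r| reaches the (assumed) threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


def neighbours(edges, gene):
    """Genes directly linked to a gene of interest."""
    return sorted(b if a == gene else a for (a, b) in edges if gene in (a, b))


# Hypothetical expression profiles (one value per array).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 2.5],
}
edges = correlation_network(profiles, threshold=0.9)
print(neighbours(edges, "geneA"))
```

Centering the query on one gene, as in `neighbours`, mirrors the gene-of-interest style of query the abstract mentions. A real tool would precompute and store the correlations rather than recompute them per query.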
A Functional and Regulatory Network Associated with PIP Expression in Human Breast Cancer
Debily, Marie-Anne; Marhomy, Sandrine El; Boulanger, Virginie; Eveno, Eric; Mariage-Samson, Régine; Camarca, Alessandra; Auffray, Charles; Piatier-Tonneau, Dominique; Imbeaud, Sandrine
2009-01-01
Background The PIP (prolactin-inducible protein) gene has been shown to be expressed in breast cancers, with contradictory results concerning its implication. As both the physiological role and the molecular pathways in which PIP is involved are poorly understood, we conducted combined gene expression profiling and network analysis studies on selected breast cancer cell lines presenting distinct PIP expression levels and hormonal receptor status, to explore the functional and regulatory network of PIP co-modulated genes. Principal Findings Microarray analysis allowed identification of genes co-modulated with PIP independently of modulations resulting from hormonal treatment or cell line heterogeneity. Relevant clusters of genes that can discriminate between [PIP+] and [PIP−] cells were identified. Functional and regulatory network analyses based on a knowledge database revealed a master network of PIP co-modulated genes, including many interconnecting oncogenes and tumor suppressor genes, half of which were detected as differentially expressed through high-precision measurements. The network identified appears associated with an inhibition of proliferation coupled with an increase of apoptosis and an enhancement of cell adhesion in breast cancer cell lines, and contains many genes with a STAT5 regulatory motif in their promoters. Conclusions Our global exploratory approach identified biological pathways modulated along with PIP expression, providing further support for its good prognostic value of disease-free survival in breast cancer. Moreover, our data pointed to the importance of a regulatory subnetwork associated with PIP expression in which STAT5 appears as a potential transcriptional regulator. PMID:19262752
Co-expression networks reveal the tissue-specific regulation of transcription and splicing
Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D.H.; Jo, Brian; Gao, Chuan; McDowell, Ian C.; Engelhardt, Barbara E.
2017-01-01
Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues. PMID:29021288
Gene expression links functional networks across cortex and striatum.
Anderson, Kevin M; Krienen, Fenna M; Choi, Eun Young; Reinen, Jenna M; Yeo, B T Thomas; Holmes, Avram J
2018-04-12
The human brain is comprised of a complex web of functional networks that link anatomically distinct regions. However, the biological mechanisms supporting network organization remain elusive, particularly across cortical and subcortical territories with vastly divergent cellular and molecular properties. Here, using human and primate brain transcriptional atlases, we demonstrate that spatial patterns of gene expression show strong correspondence with limbic and somato/motor cortico-striatal functional networks. Network-associated expression is consistent across independent human datasets and evolutionarily conserved in non-human primates. Genes preferentially expressed within the limbic network (encompassing nucleus accumbens, orbital/ventromedial prefrontal cortex, and temporal pole) relate to risk for psychiatric illness, chloride channel complexes, and markers of somatostatin neurons. Somato/motor associated genes are enriched for oligodendrocytes and markers of parvalbumin neurons. These analyses indicate that parallel cortico-striatal processing channels possess dissociable genetic signatures that recapitulate distributed functional networks, and nominate molecular mechanisms supporting cortico-striatal circuitry in health and disease.
Gene expression complex networks: synthesis, identification, and analysis.
Lopes, Fabrício M; Cesar, Roberto M; Costa, Luciano Da F
2011-10-01
Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem is how to validate such approaches and their results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdös-Rényi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabási-Albert (BA), and geographical networks (GG). 
The experimental results indicate that the inference method was sensitive to average degree
A transversal approach to predict gene product networks from ontology-based similarity
Chabalier, Julie; Mosser, Jean; Burgun, Anita
2007-01-01
Background Interpretation of transcriptomic data is usually made through a "standard" approach which consists in clustering the genes according to their expression patterns and exploiting Gene Ontology (GO) annotations within each expression cluster. This approach makes it difficult to underline functional relationships between gene products that belong to different expression clusters. To address this issue, we propose a transversal analysis that aims to predict functional networks based on a combination of GO processes and data expression. Results The transversal approach presented in this paper consists in computing the semantic similarity between gene products in a Vector Space Model. Through a weighting scheme over the annotations, we take into account the representativity of the terms that annotate a gene product. Comparing annotation vectors results in a matrix of gene product similarities. Combined with expression data, the matrix is displayed as a set of functional gene networks. The transversal approach was applied to 186 genes related to the enterocyte differentiation stages. This approach resulted in 18 functional networks proved to be biologically relevant. These results were compared with those obtained through a standard approach and with an approach based on information content similarity. Conclusion Complementary to the standard approach, the transversal approach offers new insight into the cellular mechanisms and reveals new research hypotheses by combining gene product networks based on semantic similarity, and data expression. PMID:17605807
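As a rough illustration of comparing annotation vectors in a Vector Space Model, the Python sketch below weights each annotation term and takes the cosine between gene-product vectors. The inverse-document-frequency weighting, the gene product names and the term labels are assumptions chosen for the example; the abstract does not specify the authors' weighting scheme.

```python
from math import log, sqrt


def idf_weights(annotations):
    """Assumed weighting: rarer annotation terms get larger weights."""
    n = len(annotations)
    counts = {}
    for terms in annotations.values():
        for term in set(terms):
            counts[term] = counts.get(term, 0) + 1
    return {term: log(n / c) for term, c in counts.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    nu = sqrt(sum(w * w for w in u.values()))
    nv = sqrt(sum(w * w for w in v.values()))
    if nu == 0 or nv == 0:
        return 0.0  # a vector with only zero-weight terms matches nothing
    return dot / (nu * nv)


def similarity_matrix(annotations):
    """Pairwise similarities between all annotated gene products."""
    weights = idf_weights(annotations)
    vectors = {g: {t: weights[t] for t in set(terms)}
               for g, terms in annotations.items()}
    names = sorted(vectors)
    return {(a, b): cosine(vectors[a], vectors[b])
            for a in names for b in names if a < b}


# Hypothetical gene products with illustrative annotation labels.
annotations = {
    "gp1": ["transport", "adhesion"],
    "gp2": ["transport", "adhesion"],
    "gp3": ["transport", "apoptosis"],
    "gp4": ["apoptosis"],
}
sim = similarity_matrix(annotations)
for pair, value in sorted(sim.items()):
    print(pair, round(value, 3))
```

In this toy data the widely shared term contributes little to the similarity, so gp3 ends up closer to gp4 than to gp1. That is the kind of effect a term-representativity weighting is meant to produce.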
Yu, Fu-Dong; Yang, Shao-You; Li, Yuan-Yuan; Hu, Wei
2013-04-10
Malaria continues to be one of the most severe global infectious diseases, as a major threat to human health and economic development. Network-based biological analysis is a promising approach to uncover key genes and biological processes from a network viewpoint, which could not be recognized from individual gene-based signatures. We integrated gene co-expression profiles with protein-protein interaction and transcriptional regulation information to construct a comprehensive gene co-expression network of Plasmodium falciparum. Based on this network, we identified 10 core modules by using the ICE (Iterative Clique Enumeration) algorithm, which were essential for malaria parasite development in the intraerythrocytic developmental cycle (IDC) stages. In each module, all genes were highly correlated, probably due to co-regulation or formation of a protein complex. Some of these genes were recognized to be differentially co-expressed among three close-by IDC stages. The gene prpf8 (PFD0265w), encoding pre-mRNA processing splicing factor 8, was identified as a DCG (differentially co-expressed gene) among IDC stages, although the function of this gene has seldom been reported in previous research. Integrating species-specific gene prediction and differential co-expression gene detection, we found that some modules, such as module 10, could perform species-specific functions because some of their genes were species-specific. Furthermore, in order to reveal the underlying mechanisms of erythrocyte invasion by P. falciparum, the Steiner Tree algorithm was employed to identify the invasion subnetwork from our gene co-expression network. The subnetwork-based analysis indicated that some important Plasmodium parasite specific genes could cooperate with each other and be co-regulated during the parasite invasion process, including a head-to-head gene pair of PfRH2a (PF13_0198) and PfRH2b (MAL13P1.176). 
This study based on a gene co-expression network could shed new light on the mechanisms of pathogenesis, virulence and P. falciparum development.
The structure of a gene co-expression network reveals biological functions underlying eQTLs.
Villa-Vialaneix, Nathalie; Liaubet, Laurence; Laurent, Thibault; Cherel, Pierre; Gamot, Adrien; SanCristobal, Magali
2013-01-01
What are the commonalities between genes whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations, and for inferring knowledge about as yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology.
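To make the idea of a partial-correlation network concrete, the Python sketch below computes a first-order partial correlation, that is, the correlation between two genes after removing the linear effect of a single third gene. This is a simplification for illustration only: the study presumably conditions on many genes at once, and the profiles here are constructed, hypothetical data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    denom = sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
    if denom == 0:
        raise ValueError("x or y is perfectly correlated with z")
    return (rxy - rxz * ryz) / denom


# Hypothetical data: x and y both follow z plus independent deviations,
# so their raw correlation is high but is explained entirely by z.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.5, 1.5, 3.0, 4.0, 4.5, 6.5]
y = [0.5, 1.5, 4.0, 5.0, 4.5, 5.5]

raw = pearson(x, y)
partial = partial_correlation(x, y, z)
print(round(raw, 3), round(partial, 3))
```

An edge between x and y would be drawn in a plain correlation network but not in a partial-correlation network, which is why the latter tends to retain direct rather than indirect relationships.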
Ambroise, Jérôme; Robert, Annie; Macq, Benoit; Gala, Jean-Luc
2012-01-06
An important challenge in systems biology is the inference of biological networks from postgenomic data. Among these biological networks, a gene transcriptional regulatory network focuses on interactions existing between transcription factors (TFs) and their corresponding target genes. A large number of reverse engineering algorithms have been proposed to infer such networks from gene expression profiles, but most current methods have relatively low predictive performance. In this paper, we introduce the novel TNIFSED method (Transcriptional Network Inference from Functional Similarity and Expression Data), which infers a transcriptional network from the integration of correlations and partial correlations of gene expression profiles and gene functional similarities through a supervised classifier. In the current work, TNIFSED was applied to predict the transcriptional network in Escherichia coli and in Saccharomyces cerevisiae, using datasets of 445 and 170 Affymetrix arrays, respectively. Using the area under the curve of the receiver operating characteristics and the F-measure as indicators, we showed the predictive performance of TNIFSED to be better than unsupervised state-of-the-art methods. TNIFSED performed slightly worse than the supervised SIRENE algorithm for identifying the target genes of TFs with a wide range of already identified target genes, but better for TFs having only a few identified target genes. Our results indicate that TNIFSED is complementary to the SIRENE algorithm, and particularly suitable to discover target genes of "orphan" TFs.
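The F-measure used above as a performance indicator can be sketched in a few lines of Python. The snippet scores a set of predicted TF-target edges against a reference set. The edge lists are invented for illustration, and the balanced F1 form is an assumption, since the abstract does not say which variant of the F-measure was used.

```python
def f_measure(predicted, known):
    """Balanced F-measure (F1) of predicted edges against known edges."""
    predicted, known = set(predicted), set(known)
    true_positives = len(predicted & known)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(known)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference and predicted (TF, target) interactions.
known_edges = {("tf1", "g1"), ("tf1", "g2"), ("tf2", "g3"), ("tf2", "g4")}
predicted_edges = {("tf1", "g1"), ("tf1", "g2"), ("tf2", "g5")}

score = f_measure(predicted_edges, known_edges)
print(round(score, 4))
```

Here two of three predictions are correct (precision 2/3) and two of four known edges are recovered (recall 1/2), giving an F-measure of 4/7.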
On the robustness of complex heterogeneous gene expression networks.
Gómez-Gardeñes, Jesús; Moreno, Yamir; Floría, Luis M
2005-04-01
We analyze a continuous gene expression model on the underlying topology of a complex heterogeneous network. Numerical simulations aimed at studying the chaotic and periodic dynamics of the model are performed. The results clearly indicate that there is a region in which the dynamical and structural complexity of the system avoid chaotic attractors. However, contrary to what has been reported for Random Boolean Networks, the chaotic phase cannot be completely suppressed, which has important bearings on network robustness and gene expression modeling.
Feng, Juerong; Zhou, Rui; Chang, Ying; Liu, Jing; Zhao, Qiu
2017-01-01
Hepatocellular carcinoma (HCC) has a high incidence and mortality worldwide, and its carcinogenesis and progression are influenced by a complex network of gene interactions. A weighted gene co-expression network was constructed to identify gene modules associated with the clinical traits in HCC (n = 214). Among the 13 modules, high correlation was only found between the red module and metastasis risk (classified by the HCC metastasis gene signature) (R2 = −0.74). Moreover, in the red module, 34 network hub genes for metastasis risk were identified, six of which (ABAT, AGXT, ALDH6A1, CYP4A11, DAO and EHHADH) were also hub nodes in the protein-protein interaction network of the module genes. Thus, a total of six hub genes were identified. In validation, all hub genes showed a negative correlation with the four-stage HCC progression (P for trend < 0.05) in the test set. Furthermore, in the training set, HCC samples with any hub gene lowly expressed demonstrated a higher recurrence rate and poorer survival rate (hazard ratios with 95% confidence intervals > 1). RNA-sequencing data of 142 HCC samples showed consistent results in the prognosis. Gene set enrichment analysis (GSEA) demonstrated that in the samples with any hub gene highly expressed, a total of 24 functional gene sets were enriched, most of which focused on amino acid metabolism and oxidation. In conclusion, co-expression network analysis identified six hub genes in association with HCC metastasis risk and prognosis, which might improve the prognosis by influencing amino acid metabolism and oxidation. PMID:28430663
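A weighted co-expression network of the kind used here is commonly built by raising absolute correlations to a soft-thresholding power, and hub genes are then those with high summed connection strength. The Python sketch below shows that construction on toy data. The power of 6, the unsigned network type and the profiles are assumptions for illustration, not values taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(i, j)| ** beta."""
    genes = sorted(profiles)
    adjacency = {g: {} for g in genes}
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            a = abs(pearson(profiles[g1], profiles[g2])) ** beta
            adjacency[g1][g2] = a
            adjacency[g2][g1] = a
    return adjacency


def connectivity(adjacency):
    """Sum of connection strengths per gene; high values suggest hubs."""
    return {g: sum(nbrs.values()) for g, nbrs in adjacency.items()}


# Hypothetical expression profiles across five samples.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "g3": [5.0, 4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, 3.0, 2.0, 5.0, 4.0],
}
adjacency = soft_threshold_adjacency(profiles, beta=6)
k = connectivity(adjacency)
print({g: round(v, 3) for g, v in k.items()})
```

Raising correlations to a power keeps the network fully weighted while shrinking moderate correlations toward zero: the 0.8 correlation between g1 and g4 becomes an adjacency of about 0.26, so g4 ends up the least connected gene.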
Uddin, Raihan; Singh, Shiva M.
2017-01-01
As humans age many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and identify novel ASLI related genes and their networks based on co-expression relationship of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in “learning and memory” related functions and pathways. Subsequent differential network analysis of this “learning and memory” module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known function of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation. 
Taken together, they provide a new insight and generate new hypotheses into the molecular mechanisms responsible for age associated learning impairment, including spatial learning. PMID:29066959
Modena, Brian D; Bleecker, Eugene R; Busse, William W; Erzurum, Serpil C; Gaston, Benjamin M; Jarjour, Nizar N; Meyers, Deborah A; Milosevic, Jadranka; Tedrow, John R; Wu, Wei; Kaminski, Naftali; Wenzel, Sally E
2017-06-01
Severe asthma (SA) is a heterogeneous disease with multiple molecular mechanisms. Gene expression studies of bronchial epithelial cells in individuals with asthma have provided biological insight and underscored possible mechanistic differences between individuals. Identify networks of genes reflective of underlying biological processes that define SA. Airway epithelial cell gene expression from 155 subjects with asthma and healthy control subjects in the Severe Asthma Research Program was analyzed by weighted gene coexpression network analysis to identify gene networks and profiles associated with SA and its specific characteristics (i.e., pulmonary function tests, quality of life scores, urgent healthcare use, and steroid use), which potentially identified underlying biological processes. A linear model analysis confirmed these findings while adjusting for potential confounders. Weighted gene coexpression network analysis constructed 64 gene network modules, including modules corresponding to T1 and T2 inflammation, neuronal function, cilia, epithelial growth, and repair mechanisms. Although no network selectively identified SA, genes in modules linked to epithelial growth and repair and neuronal function were markedly decreased in SA. Several hub genes of the epithelial growth and repair module were found located at the 17q12-21 locus, near a well-known asthma susceptibility locus. T2 genes increased with severity in those treated with corticosteroids but were also elevated in untreated, mild-to-moderate disease compared with healthy control subjects. T1 inflammation, especially when associated with increased T2 gene expression, was elevated in a subgroup of younger patients with SA. In this hypothesis-generating analysis, gene expression networks in relation to asthma severity provided potentially new insight into biological mechanisms associated with the development of SA and its phenotypes.
Modena, Brian D.; Bleecker, Eugene R.; Busse, William W.; Erzurum, Serpil C.; Gaston, Benjamin M.; Jarjour, Nizar N.; Meyers, Deborah A.; Milosevic, Jadranka; Tedrow, John R.; Wu, Wei; Kaminski, Naftali
2017-01-01
Rationale: Severe asthma (SA) is a heterogeneous disease with multiple molecular mechanisms. Gene expression studies of bronchial epithelial cells in individuals with asthma have provided biological insight and underscored possible mechanistic differences between individuals. Objectives: Identify networks of genes reflective of underlying biological processes that define SA. Methods: Airway epithelial cell gene expression from 155 subjects with asthma and healthy control subjects in the Severe Asthma Research Program was analyzed by weighted gene coexpression network analysis to identify gene networks and profiles associated with SA and its specific characteristics (i.e., pulmonary function tests, quality of life scores, urgent healthcare use, and steroid use), which potentially identified underlying biological processes. A linear model analysis confirmed these findings while adjusting for potential confounders. Measurements and Main Results: Weighted gene coexpression network analysis constructed 64 gene network modules, including modules corresponding to T1 and T2 inflammation, neuronal function, cilia, epithelial growth, and repair mechanisms. Although no network selectively identified SA, genes in modules linked to epithelial growth and repair and neuronal function were markedly decreased in SA. Several hub genes of the epithelial growth and repair module were found located at the 17q12–21 locus, near a well-known asthma susceptibility locus. T2 genes increased with severity in those treated with corticosteroids but were also elevated in untreated, mild-to-moderate disease compared with healthy control subjects. T1 inflammation, especially when associated with increased T2 gene expression, was elevated in a subgroup of younger patients with SA. 
Conclusions: In this hypothesis-generating analysis, gene expression networks in relation to asthma severity provided potentially new insight into biological mechanisms associated with the development of SA and its phenotypes. PMID:27984699
Wang, Yi Kan; Hurley, Daniel G.; Schnell, Santiago; Print, Cristin G.; Crampin, Edmund J.
2013-01-01
We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data. PMID:23967277
Ponsuksili, Siriluck; Du, Yang; Hadlich, Frieder; Siengdee, Puntita; Murani, Eduard; Schwerin, Manfred; Wimmers, Klaus
2013-08-05
Physiological processes aiding the conversion of muscle to meat involve many genes associated with muscle structure and metabolic processes. MicroRNAs regulate networks of genes to orchestrate cellular functions, in turn regulating phenotypes. We applied weighted gene co-expression network analysis to identify co-expression modules that correlated to meat quality phenotypes and were highly enriched for genes involved in glucose metabolism, response to wounding, mitochondrial ribosome, mitochondrion, and extracellular matrix. Negative correlation of miRNA with mRNA and target prediction were used to select transcripts out of the modules of trait-associated mRNAs to further identify those genes that are correlated with post mortem traits. Porcine muscle co-expression transcript networks that correlated to post mortem traits were identified. The integration of miRNA and mRNA expression analyses, as well as network analysis, enabled us to interpret the differentially-regulated genes from a systems perspective. Linking co-expression networks of transcripts and hierarchically organized pairs of miRNAs and mRNAs to meat properties yields new insight into several biological pathways underlying phenotype differences. These pathways may also be diagnostic for many myopathies, which are accompanied by deficient nutrient and oxygen supply of muscle fibers.
Lamara, Mebarek; Raherison, Elie; Lenz, Patrick; Beaulieu, Jean; Bousquet, Jean; MacKay, John
2016-04-01
Association studies are widely utilized to analyze complex traits but their ability to disclose genetic architectures is often limited by statistical constraints, and functional insights are usually minimal in nonmodel organisms like forest trees. We developed an approach to integrate association mapping results with co-expression networks. We tested single nucleotide polymorphisms (SNPs) in 2652 candidate genes for statistical associations with wood density, stiffness, microfibril angle and ring width in a population of 1694 white spruce trees (Picea glauca). Association mapping identified 229-292 genes per wood trait using a statistical significance level of P < 0.05 to maximize discovery. For nearly all traits, associated genes were over-represented in a xylem-preferential co-expression group developed in independent experiments. A xylem co-expression network was reconstructed with 180 wood-associated genes, and several known MYB and NAC regulators were identified as network hubs. The network revealed a link between the gene PgNAC8, wood stiffness and microfibril angle, as well as considerable within-season variation for both genetic control of wood traits and gene expression. Trait associations were distributed throughout the network, suggesting complex interactions and pleiotropic effects. Our findings indicate that integration of association mapping and co-expression networks enhances our understanding of complex wood traits. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Cell cycle gene expression networks discovered using systems biology: Significance in carcinogenesis
Scott, RE; Ghule, PN; Stein, JL; Stein, GS
2015-01-01
The early stages of carcinogenesis are linked to defects in the cell cycle. A series of cell cycle checkpoints are involved in this process. The G1/S checkpoint that serves to integrate the control of cell proliferation and differentiation is linked to carcinogenesis and the mitotic spindle checkpoint with the development of chromosomal instability. This paper presents the outcome of systems biology studies designed to evaluate if networks of covariate cell cycle gene transcripts exist in proliferative mammalian tissues including mice, rats and humans. The GeneNetwork website that contains numerous gene expression datasets from different species, sexes and tissues represents the foundational resource for these studies (www.genenetwork.org). In addition, WebGestalt, a gene ontology tool, facilitated the identification of expression networks of genes that co-vary with key cell cycle targets, especially Cdc20 and Plk1 (www.bioinfo.vanderbilt.edu/webgestalt). Cell cycle expression networks of such covariate mRNAs exist in multiple proliferative tissues including liver, lung, pituitary, adipose and lymphoid tissues among others but not in brain or retina that have low proliferative potential. Sixty-three covariate cell cycle gene transcripts (mRNAs) compose the average cell cycle network with p = e−13 to e−36. Cell cycle expression networks show species, sex and tissue variability and they are enriched in mRNA transcripts associated with mitosis many of which are associated with chromosomal instability. PMID:25808367
Guthke, Reinhard; Möller, Ulrich; Hoffmann, Martin; Thies, Frank; Töpfer, Susanne
2005-04-15
The immune response to bacterial infection represents a complex network of dynamic gene and protein interactions. We present an optimized reverse engineering strategy aimed at reconstructing this kind of interaction network. The proposed approach is based on both microarray data and available biological knowledge. The main kinetics of the immune response were identified by fuzzy clustering of gene expression profiles (time series). The number of clusters was optimized using various evaluation criteria. For each cluster, a representative gene with a high fuzzy membership was chosen in accordance with available physiological knowledge. Hypothetical network structures were then identified by seeking systems of ordinary differential equations whose simulated kinetics could fit the gene expression profiles of the cluster-representative genes. For the construction of hypothetical network structures, singular value decomposition (SVD)-based methods and a newly introduced heuristic Network Generation Method were compared. The proposed novel method found sparser networks and gave better fits to the experimental data. Contact: Reinhard.Guthke@hki-jena.de
Botía, Juan A; Vandrovcova, Jana; Forabosco, Paola; Guelfi, Sebastian; D'Sa, Karishma; Hardy, John; Lewis, Cathryn M; Ryten, Mina; Weale, Michael E
2017-04-12
Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn ). We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (×3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. Our results indicate that the k-means method, applied as an adjunct to standard WGCNA, yields better network partitions. These improved partitions enable more fruitful downstream analyses, as gene modules are more biologically meaningful.
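The idea of using k-means as a post-processing step on an existing module partition can be sketched as follows. This is a minimal illustration, not the km2gcn implementation: it assumes each module centroid is the mean expression profile of its members and that genes are reassigned by squared Euclidean distance, and the names `refine_modules`, `expr` and `modules` as well as the toy data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def refine_modules(expr, modules, max_iter=20):
    """Refine an initial gene-to-module assignment with k-means style updates.

    expr:    dict gene -> list of expression values (same samples for all genes)
    modules: dict gene -> initial module label (e.g. from a prior clustering)
    Assumption: centroids are plain mean profiles; a real pipeline may use a
    different module summary and a correlation-based distance.
    """
    assignment = dict(modules)
    for _ in range(max_iter):
        # Collect the profiles currently assigned to each module.
        members = {}
        for gene, module in assignment.items():
            members.setdefault(module, []).append(expr[gene])
        # Centroid = per-sample mean over the module's member profiles.
        centroids = {
            module: [sum(column) / len(profiles) for column in zip(*profiles)]
            for module, profiles in members.items()
        }
        # Reassign every gene to its nearest centroid (ties broken by label order).
        updated = {
            gene: min(sorted(centroids),
                      key=lambda m: squared_distance(profile, centroids[m]))
            for gene, profile in expr.items()
        }
        if updated == assignment:
            break  # partition is stable
        assignment = updated
    return assignment


# Toy data: g5 resembles g1/g2 but starts in the wrong module.
expr = {
    "g1": [1.0, 1.0, 1.0],
    "g2": [1.1, 0.9, 1.0],
    "g3": [5.0, 5.0, 5.0],
    "g4": [5.2, 4.8, 5.0],
    "g5": [1.0, 1.2, 0.9],
}
initial = {"g1": "turquoise", "g2": "turquoise",
           "g3": "blue", "g4": "blue", "g5": "blue"}
refined = refine_modules(expr, initial)
print(refined)
```

In this toy case the misplaced gene moves to the module whose centroid it matches, which is the kind of correction the abstract describes as reducing misplaced genes.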
CHAI, Lian En; LAW, Chow Kuan; MOHAMAD, Mohd Saberi; CHONG, Chuii Khim; CHOON, Yee Wen; DERIS, Safaai; ILLIAS, Rosli Md
2014-01-01
Background: Gene expression data often contain missing expression values. Therefore, several imputation methods have been applied to solve the missing values, which include k-nearest neighbour (kNN), local least squares (LLS), and Bayesian principal component analysis (BPCA). However, the effects of these imputation methods on the modelling of gene regulatory networks from gene expression data have rarely been investigated and analysed using a dynamic Bayesian network (DBN). Methods: In the present study, we separately imputed datasets of the Escherichia coli S.O.S. DNA repair pathway and the Saccharomyces cerevisiae cell cycle pathway with kNN, LLS, and BPCA, and subsequently used these to generate gene regulatory networks (GRNs) using a discrete DBN. We made comparisons on the basis of previous studies in order to select the gene network with the least error. Results: We found that BPCA and LLS performed better on larger networks (based on the S. cerevisiae dataset), whereas kNN performed better on smaller networks (based on the E. coli dataset). Conclusion: The results suggest that the performance of each imputation method is dependent on the size of the dataset, and this subsequently affects the modelling of the resultant GRNs using a DBN. In addition, on the basis of these results, a DBN has the capacity to discover potential edges, as well as display interactions, between genes. PMID:24876803
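As an illustration of the simplest of the three imputation methods, a k-nearest-neighbour fill can be sketched as below. This is a simplified sketch under stated assumptions, not the procedure used in the study: missing values are marked with `None`, neighbours are ranked by root-mean-square distance over the samples both genes have observed, and the imputed value is the unweighted mean of the neighbours' values. The function name `knn_impute` and the toy matrix are hypothetical.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries in a genes x samples matrix using k nearest genes.

    Assumptions: distance is the RMS difference over jointly observed samples,
    and the imputed value is the plain mean of the k neighbours' values.
    """
    imputed = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for h, other in enumerate(matrix):
                # A neighbour must be a different gene with a value in sample j.
                if h == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Toy matrix: the first gene is missing its third measurement.
data = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.2, 2.2, 3.4],
    [9.0, 9.0, 9.0],
]
filled = knn_impute(data, k=2)
print(filled[0])
```

The choice of `k` and of the distance measure affects which genes are treated as neighbours, which is one reason imputation can change the downstream network.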
The Role of Vitamin D in the Transcriptional Program of Human Pregnancy
Al-Garawi, Amal; Carey, Vincent J.; Chhabra, Divya; Morrow, Jarrett; Lasky-Su, Jessica; Qiu, Weiliang; Laranjo, Nancy; Litonjua, Augusto A.; Weiss, Scott T.
2016-01-01
Background: Patterns of gene expression of human pregnancy are poorly understood. In a trial of vitamin D supplementation in pregnant women, peripheral blood transcriptomes were measured longitudinally on 30 women and used to characterize gene co-expression networks. Objective: Studies suggest that increased maternal Vitamin D levels may reduce the risk of asthma in early life, yet the underlying mechanisms have not been examined. In this study, we used a network-based approach to examine changes in gene expression profiles during the course of normal pregnancy and evaluated their association with maternal Vitamin D levels. Design: The VDAART study is a randomized clinical trial of vitamin D supplementation in pregnancy for reduction of pediatric asthma risk. The trial enrolled 881 women at 10–18 weeks of gestation. Longitudinal gene expression measures were obtained on thirty pregnant women, using RNA isolated from peripheral blood samples obtained in the first and third trimesters. Differentially expressed genes were identified using significance analysis of microarrays (SAM), and clustered using a weighted gene co-expression network analysis (WGCNA). Gene-set enrichment was performed to identify major biological pathways. Results: Comparison of transcriptional profiles between first and third trimesters of pregnancy identified 5839 significantly differentially expressed genes (FDR<0.05). Weighted gene co-expression network analysis clustered these transcripts into 14 co-expression modules, of which two showed significant correlation with maternal vitamin D levels. Pathway analysis of these two modules revealed genes enriched in immune defense pathways and extracellular matrix reorganization as well as genes enriched in notch signaling and transcription factor networks. Conclusion: Our data show that gene expression profiles of healthy pregnant women change during the course of pregnancy and suggest that maternal Vitamin D levels influence transcriptional profiles. 
These alterations of the maternal transcriptome may contribute to fetal immune imprinting and reduce allergic sensitization in early life. Trial Registration: clinicaltrials.gov NCT00920621 PMID:27711190
Robust Learning of High-dimensional Biological Networks with Bayesian Networks
NASA Astrophysics Data System (ADS)
Nägele, Andreas; Dejori, Mathäus; Stetter, Martin
Structure learning of Bayesian networks applied to gene expression data has become a potentially useful method to estimate interactions between genes. However, the NP-hardness of Bayesian network structure learning renders the reconstruction of the full genetic network with thousands of genes unfeasible. Consequently, the maximal network size is usually restricted dramatically to a small set of genes (corresponding with variables in the Bayesian network). Although this feature reduction step makes structure learning computationally tractable, on the downside, the learned structure might be adversely affected due to the introduction of missing genes. Additionally, gene expression data are usually very sparse with respect to the number of samples, i.e., the number of genes is much greater than the number of different observations. Given these problems, learning robust network features from microarray data is a challenging task. This chapter presents several approaches tackling the robustness issue in order to obtain a more reliable estimation of learned network features.
Co-expression networks reveal the tissue-specific regulation of transcription and splicing.
Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D H; Jo, Brian; Gao, Chuan; McDowell, Ian C; Engelhardt, Barbara E; Battle, Alexis
2017-11-01
Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues. © 2017 Saha et al.; Published by Cold Spring Harbor Laboratory Press.
Gong, Bin-Sheng; Zhang, Qing-Pu; Zhang, Guang-Mei; Zhang, Shao-Jun; Zhang, Wei; Lv, Hong-Chao; Zhang, Fan; Lv, Sa-Li; Li, Chuan-Xing; Rao, Shao-Qi; Li, Xia
2007-01-01
Gene expression profiles and single-nucleotide polymorphism (SNP) profiles are modern data for genetic analysis. It is possible to use the two types of information to analyze the relationships among genes by some genetical genomics approaches. In this study, gene expression profiles were used as expression traits, and relationships among genes that were co-linked to one or more common SNPs were identified by integrating the two types of information. After the gene-SNP relationships were established using Haseman-Elston sib-pair regression, the co-expression among the co-linked genes was examined further. The results showed that co-expression among the co-linked genes was significantly higher when the number of connections between the genes and the SNP(s) was more than six. Then, the genes were interconnected via one or more SNP co-linkers to construct a gene-SNP intermixed network. Genes sharing more SNPs tended to have a stronger correlation. Finally, a gene-gene network was constructed with the intensity of each relationship (the number of SNP co-linkers shared) as the edge weight. PMID:18466544
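The final step described above, weighting each gene-gene edge by the number of shared SNP co-linkers, can be sketched in a few lines. This is only an illustration of that counting step: the gene-SNP links are assumed to be already established, and the function name `gene_gene_network` and all gene and SNP identifiers are hypothetical.

```python
from itertools import combinations


def gene_gene_network(gene_to_snps):
    """Build weighted gene-gene edges from gene -> set-of-linked-SNPs.

    The edge weight is the number of SNPs to which both genes are linked;
    gene pairs with no shared SNP get no edge.
    """
    edges = {}
    for g, h in combinations(sorted(gene_to_snps), 2):
        shared = gene_to_snps[g] & gene_to_snps[h]
        if shared:
            edges[(g, h)] = len(shared)
    return edges


# Hypothetical linkage results (gene -> SNPs it was linked to).
links = {
    "geneA": {"snp1", "snp2", "snp3"},
    "geneB": {"snp2", "snp3"},
    "geneC": {"snp3", "snp4"},
    "geneD": {"snp5"},
}
network = gene_gene_network(links)
for (g, h), weight in sorted(network.items()):
    print(g, h, weight)
```

Collapsing the intermixed gene-SNP network this way keeps only genes as nodes, at the cost of no longer showing which SNPs mediate each connection.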
Integration of multi-omics data for integrative gene regulatory network inference.
Zarayeneh, Neda; Ko, Euiseong; Oh, Jung Hun; Suh, Sang; Liu, Chunyu; Gao, Jean; Kim, Donghyun; Kang, Mingon
2017-01-01
Gene regulatory networks provide comprehensive insights and indepth understanding of complex biological processes. The molecular interactions of gene regulatory networks are inferred from a single type of genomic data, e.g., gene expression data in most research. However, gene expression is a product of sequential interactions of multiple biological processes, such as DNA sequence variations, copy number variations, histone modifications, transcription factors, and DNA methylations. The recent rapid advances of high-throughput omics technologies enable one to measure multiple types of omics data, called 'multi-omics data', that represent the various biological processes. In this paper, we propose an Integrative Gene Regulatory Network inference method (iGRN) that incorporates multi-omics data and their interactions in gene regulatory networks. In addition to gene expressions, copy number variations and DNA methylations were considered for multi-omics data in this paper. The intensive experiments were carried out with simulation data, where iGRN's capability that infers the integrative gene regulatory network is assessed. Through the experiments, iGRN shows its better performance on model representation and interpretation than other integrative methods in gene regulatory network inference. iGRN was also applied to a human brain dataset of psychiatric disorders, and the biological network of psychiatric disorders was analysed. PMID:29354189
Takahashi, Kei-ichiro; Takigawa, Ichigaku; Mamitsuka, Hiroshi
2013-01-01
Detecting biclusters from expression data is useful, since biclusters are sets of genes that are coexpressed under only part of all given experimental conditions. We present software called SiBIC which, from a given expression dataset, first exhaustively enumerates biclusters, then merges them into rather independent biclusters, and finally uses these to generate gene set networks, in which the gene set assigned to each node consists of coexpressed genes. We evaluated each step of this procedure: 1) the biological and statistical significance of the generated biclusters, 2) the biological quality of the merged biclusters, and 3) the biological significance of the gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp.
Kibinge, Nelson; Ono, Naoaki; Horie, Masafumi; Sato, Tetsuo; Sugiura, Tadao; Altaf-Ul-Amin, Md; Saito, Akira; Kanaya, Shigehiko
2016-06-01
Conventionally, workflows examining transcription regulation networks from gene expression data involve distinct analytical steps. There is a need for pipelines that unify data mining and inference deduction into a single framework to enhance interpretation and hypothesis generation. We propose a workflow that merges network construction with gene expression data mining, focusing on regulation processes in the context of transcription factor driven gene regulation. The pipeline implements pathway-based modularization of expression profiles into functional units to improve biological interpretation. The integrated workflow was implemented as a web application (TransReguloNet) with functions that enable pathway visualization and comparison of transcription factor activity between sample conditions defined in the experimental design. The pipeline merges differential expression, network construction, pathway-based abstraction, clustering and visualization. The framework was applied in analysis of actual expression datasets related to lung, breast and prostate cancer. Copyright © 2016 Elsevier Inc. All rights reserved.
WGCNA: an R package for weighted correlation network analysis.
Langfelder, Peter; Horvath, Steve
2008-12-29
Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. 
The R package along with its source code and additional material are freely available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA.
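To make the notion of a weighted correlation network concrete, the sketch below builds an unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, and sums each gene's edge weights to obtain its connectivity. It is a plain-Python illustration, not the WGCNA R package API: the function names, the toy expression values and the choice `beta=6` are assumptions made for the example, and in practice the power would be chosen from the data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            weight = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = weight
            adj[h][g] = weight
    return adj


def connectivity(adj):
    """Sum of edge weights per gene; highly connected genes are hub candidates."""
    return {g: sum(neighbours.values()) for g, neighbours in adj.items()}


# Hypothetical expression profiles across four samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
adj = soft_threshold_adjacency(expr)
k = connectivity(adj)
print(adj["geneA"])
print(k)
```

Raising the correlation to a power keeps all gene pairs connected while suppressing weak correlations, so a correlation of 0.8 contributes far less than a correlation near 1.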
FastGCN: A GPU Accelerated Tool for Fast Gene Co-Expression Networks
Liang, Meimei; Zhang, Futao; Jin, Gulei; Zhu, Jun
2015-01-01
Gene co-expression networks are one valuable type of biological network. Many methods and tools have been published to construct gene co-expression networks; however, most of these tools and methods are inconvenient and time consuming for large datasets. We have developed a user-friendly, accelerated and optimized tool for constructing gene co-expression networks that can fully harness the parallel nature of GPU (Graphic Processing Unit) architectures. Genetic entropies were exploited to filter out genes with no or small expression changes in the raw data preprocessing step. Pearson correlation coefficients were then calculated. We then normalized these coefficients and used the false discovery rate to control for multiple testing. Finally, module identification was conducted to construct the co-expression networks. All of these calculations were implemented on a GPU. We also compressed the coefficient matrix to save space. We compared the performance of the GPU implementation with those of multi-core CPU implementations with 16 CPU threads, a single-thread C/C++ implementation and a single-thread R implementation. Our results show that the GPU implementation largely outperforms the single-thread C/C++ and single-thread R implementations, and outperforms the multi-core CPU implementation when the number of genes increases. With the test dataset containing 16,000 genes and 590 individuals, we can achieve greater than 63 times the speed using a GPU implementation compared with a single-thread R implementation when 50 percent of genes were filtered out and about 80 times the speed when no genes were filtered out. PMID:25602758
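The multiple-testing step can be illustrated with the Benjamini-Hochberg step-up procedure applied to per-edge p-values. This is a CPU-only sketch under an explicit assumption: the abstract states only that the false discovery rate was used, so the choice of Benjamini-Hochberg, the function name `benjamini_hochberg` and the example p-values are illustrative rather than a description of FastGCN itself.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank r with p_(r) <= r / m * alpha and
    reject every hypothesis whose p-value has rank <= r.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four candidate co-expression edges.
edge_pvalues = [0.02, 0.03, 0.035, 0.9]
keep_edge = benjamini_hochberg(edge_pvalues)
print(keep_edge)
```

Note the step-up behaviour: the two smallest p-values exceed their own rank thresholds, but they are still retained because the third-ranked p-value passes its threshold.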
CoNekT: an open-source framework for comparative genomic and transcriptomic network analyses.
Proost, Sebastian; Mutwil, Marek
2018-05-01
The recent accumulation of gene expression data in the form of RNA sequencing creates unprecedented opportunities to study gene regulation and function. Furthermore, comparative analysis of the expression data from multiple species can elucidate which functional gene modules are conserved across species, allowing the study of the evolution of these modules. However, performing such comparative analyses on raw data is not feasible for many biologists. Here, we present CoNekT (Co-expression Network Toolkit), an open source web server, that contains user-friendly tools and interactive visualizations for comparative analyses of gene expression data and co-expression networks. These tools allow analysis and cross-species comparison of (i) gene expression profiles; (ii) co-expression networks; (iii) co-expressed clusters involved in specific biological processes; (iv) tissue-specific gene expression; and (v) expression profiles of gene families. To demonstrate these features, we constructed CoNekT-Plants for green alga, seed plants and flowering plants (Picea abies, Chlamydomonas reinhardtii, Vitis vinifera, Arabidopsis thaliana, Oryza sativa, Zea mays and Solanum lycopersicum) and thus provide a web-tool with the broadest available collection of plant phyla. CoNekT-Plants is freely available from http://conekt.plant.tools, while the CoNekT source code and documentation can be found at https://github.molgen.mpg.de/proost/CoNekT/.
Liu, Li-Zhi; Wu, Fang-Xiang; Zhang, Wen-Jun
2014-01-01
As an abstract mapping of gene regulation in the cell, the gene regulatory network is important to both biological research and practical applications. The reverse engineering of gene regulatory networks from microarray gene expression data is a challenging research problem in systems biology. With the development of biological technologies, multiple time-course gene expression datasets might be collected for a specific gene network under different circumstances. The inference of a gene regulatory network can be improved by integrating these multiple datasets. It is also known that gene expression data may be contaminated with large errors or outliers, which may affect the inference results. A novel method, Huber group LASSO, is proposed to infer the same underlying network topology from multiple time-course gene expression datasets while remaining robust to large errors or outliers. To solve the optimization problem involved in the proposed method, an efficient algorithm that combines the ideas of auxiliary function minimization and block descent is developed. A stability selection method is adapted to our method to find a network topology consisting of edges with scores. The proposed method is applied to both simulation datasets and real experimental datasets. The results show that Huber group LASSO outperforms the group LASSO in terms of both areas under receiver operating characteristic curves and areas under the precision-recall curves. The convergence analysis of the algorithm theoretically shows that the sequence generated by the algorithm converges to the optimal solution of the problem. The simulation and real data examples demonstrate the effectiveness of the Huber group LASSO in integrating multiple time-course gene expression datasets and improving the resistance to large errors or outliers.
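The robustness to outliers comes from replacing the squared-error loss with the Huber loss, which is quadratic for small residuals and linear for large ones. The sketch below shows only that loss in its standard form; it does not implement the group LASSO penalty or the optimization algorithm, and the threshold `delta=1.0` and the example residuals are assumptions for illustration.

```python
def squared_loss(residual):
    """Ordinary least-squares loss: grows quadratically with the residual."""
    return 0.5 * residual ** 2


def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic when |residual| <= delta, linear beyond it."""
    r = abs(residual)
    if r <= delta:
        return 0.5 * r ** 2
    return delta * (r - 0.5 * delta)


# Hypothetical residuals; the last one mimics an outlying measurement.
residuals = [0.1, -0.5, 1.0, 3.0, -10.0]
for r in residuals:
    print(r, squared_loss(r), huber_loss(r))
```

For the outlying residual of -10.0 the squared loss is 50.0 while the Huber loss is 9.5, so a single corrupted measurement has much less influence on the fitted model.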
Differential network entropy reveals cancer system hallmarks
West, James; Bianconi, Ginestra; Severini, Simone; Teschendorff, Andrew E.
2012-01-01
The cellular phenotype is described by a complex network of molecular interactions. Elucidating network properties that distinguish disease from the healthy cellular state is therefore of critical importance for gaining systems-level insights into disease mechanisms and ultimately for developing improved therapies. By integrating gene expression data with a protein interaction network we here demonstrate that cancer cells are characterised by an increase in network entropy. In addition, we formally demonstrate that gene expression differences between normal and cancer tissue are anticorrelated with local network entropy changes, thus providing a systemic link between gene expression changes at the nodes and their local correlation patterns. In particular, we find that genes which drive cell-proliferation in cancer cells and which often encode oncogenes are associated with reductions in network entropy. These findings may have potential implications for identifying novel drug targets. PMID:23150773
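One way to make "local network entropy" concrete is to turn the edge weights around a protein into a probability distribution over its neighbours and take the normalised Shannon entropy. The sketch below is a plausible construction under stated assumptions, not necessarily the authors' exact definition: the weights are assumed to be non-negative values derived from expression data, the normalisation by the logarithm of the node degree is a choice made here, and the function name and example weights are hypothetical.

```python
import math


def local_entropy(weights):
    """Normalised Shannon entropy of a node's outgoing edge weights.

    weights: dict neighbour -> non-negative weight.
    Returns a value in [0, 1]: near 1 when weight is spread evenly over the
    neighbours, near 0 when it is concentrated on one of them.
    Nodes with fewer than two neighbours, or with zero total weight,
    are assigned 0.0 by convention in this sketch.
    """
    degree = len(weights)
    total = sum(weights.values())
    if degree < 2 or total <= 0:
        return 0.0
    probs = [w / total for w in weights.values() if w > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(degree)


# Hypothetical neighbour weights for one protein in two conditions.
normal_weights = {"n1": 0.9, "n2": 0.05, "n3": 0.05}
cancer_weights = {"n1": 0.4, "n2": 0.3, "n3": 0.3}
print(local_entropy(normal_weights), local_entropy(cancer_weights))
```

In this toy example the second condition spreads the weight more evenly and therefore has the higher local entropy; the values carry no biological meaning.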
Carey, Michelle; Ramírez, Juan Camilo; Wu, Shuang; Wu, Hulin
2018-07-01
A biological host response to an external stimulus or intervention such as a disease or infection is a dynamic process, which is regulated by an intricate network of many genes and their products. Understanding the dynamics of this gene regulatory network allows us to infer the mechanisms involved in a host response to an external stimulus, and hence aids the discovery of biomarkers of phenotype and biological function. In this article, we propose a modeling/analysis pipeline for dynamic gene expression data, called Pipeline4DGEData, which consists of a series of statistical modeling techniques to construct dynamic gene regulatory networks from the large volumes of high-dimensional time-course gene expression data that are freely available in the Gene Expression Omnibus repository. This pipeline has a consistent and scalable structure that allows it to simultaneously analyze a large number of time-course gene expression data sets, and then integrate the results across different studies. We apply the proposed pipeline to influenza infection data from nine studies and demonstrate that interesting biological findings can be discovered with its implementation.
BRAIN NETWORKS. Correlated gene expression supports synchronous activity in brain networks.
Richiardi, Jonas; Altmann, Andre; Milazzo, Anna-Clare; Chang, Catie; Chakravarty, M Mallar; Banaschewski, Tobias; Barker, Gareth J; Bokde, Arun L W; Bromberg, Uli; Büchel, Christian; Conrod, Patricia; Fauth-Bühler, Mira; Flor, Herta; Frouin, Vincent; Gallinat, Jürgen; Garavan, Hugh; Gowland, Penny; Heinz, Andreas; Lemaître, Hervé; Mann, Karl F; Martinot, Jean-Luc; Nees, Frauke; Paus, Tomáš; Pausova, Zdenka; Rietschel, Marcella; Robbins, Trevor W; Smolka, Michael N; Spanagel, Rainer; Ströhle, Andreas; Schumann, Gunter; Hawrylycz, Mike; Poline, Jean-Baptiste; Greicius, Michael D
2015-06-12
During rest, brain activity is synchronized between different regions widely distributed throughout the brain, forming functional networks. However, the molecular mechanisms supporting functional connectivity remain undefined. We show that functional brain networks defined with resting-state functional magnetic resonance imaging can be recapitulated by using measures of correlated gene expression in a post mortem brain tissue data set. The set of 136 genes we identify is significantly enriched for ion channels. Polymorphisms in this set of genes significantly affect resting-state functional connectivity in a large sample of healthy adolescents. Expression levels of these genes are also significantly associated with axonal connectivity in the mouse. The results provide convergent, multimodal evidence that resting-state functional networks correlate with the orchestrated activity of dozens of genes linked to ion channel activity and synaptic function. Copyright © 2015, American Association for the Advancement of Science.
On construction of stochastic genetic networks based on gene expression sequences.
Ching, Wai-Ki; Ng, Michael M; Fung, Eric S; Akutsu, Tatsuya
2005-08-01
Reconstruction of genetic regulatory networks from time series data of gene expression patterns is an important research topic in bioinformatics. Probabilistic Boolean Networks (PBNs) have been proposed as an effective model for gene regulatory networks. PBNs are able to cope with uncertainty, incorporate rule-based dependencies between genes and discover the sensitivity of genes in their interactions with other genes. However, PBNs are unlikely to be used directly in practice because of the huge computational cost of obtaining predictors and their corresponding probabilities. In this paper, we propose a multivariate Markov model for approximating PBNs and describing the dynamics of a genetic network for gene expression sequences. The main contribution of the new model is to preserve the strength of PBNs and reduce the complexity of the networks. The number of parameters of our proposed model is O(n^2), where n is the number of genes involved. We also develop efficient estimation methods for solving the model parameters. Numerical examples on synthetic data sets and practical yeast data sequences are given to demonstrate the effectiveness of the proposed model.
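The quadratic growth in model size can be made concrete with a small sketch that estimates one transition matrix per ordered pair of genes from discretised expression sequences. The function names, the two-state discretisation, and the uniform fallback for unobserved states are assumptions for illustration. The paper's own estimation method, including the weights that combine these matrices, is not reproduced.

```python
def transition_matrix(src, dst, n_states=2):
    """Column-stochastic estimate of P[next state of dst | current state of src].

    src, dst: equal-length sequences of integer states in range(n_states).
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for t in range(len(src) - 1):
        counts[dst[t + 1]][src[t]] += 1
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for col in range(n_states):
        col_sum = sum(counts[r][col] for r in range(n_states))
        for r in range(n_states):
            # Source states never observed fall back to a uniform column.
            matrix[r][col] = counts[r][col] / col_sum if col_sum else 1.0 / n_states
    return matrix


def all_pairwise_matrices(sequences):
    """One matrix per ordered gene pair (j, k): n * n matrices for n genes."""
    n = len(sequences)
    return {
        (j, k): transition_matrix(sequences[k], sequences[j])
        for j in range(n)
        for k in range(n)
    }


if __name__ == "__main__":
    seqs = [[0, 1, 0, 1, 0], [1, 1, 0, 0, 1]]
    matrices = all_pairwise_matrices(seqs)
    print(len(matrices))
    print(matrices[(0, 0)])
```

With n genes the dictionary holds n × n matrices, so the number of pairwise components grows quadratically rather than exponentially in n.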
WGCNA: an R package for weighted correlation network analysis
Langfelder, Peter; Horvath, Steve
2008-01-01
Background Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. Results The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. Conclusion The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. The R package along with its source code and additional material are freely available at . PMID:19114008
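The idea of a weighted correlation network can be sketched outside R as follows. This is not the WGCNA package API. It is a minimal plain-Python illustration that raises absolute Pearson correlations to a power, with hypothetical function names and an illustrative exponent `beta`. The diagonal is left at zero for simplicity.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(x_i, x_j)| ** beta."""
    n = len(profiles)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(profiles[i], profiles[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj


if __name__ == "__main__":
    genes = [
        [1.0, 2.0, 3.0, 4.0],  # gene A
        [2.0, 4.0, 6.0, 8.0],  # gene B, perfectly correlated with A
        [1.0, 3.0, 2.0, 4.0],  # gene C, correlation 0.8 with A
    ]
    for row in soft_threshold_adjacency(genes):
        print(row)
```

Raising correlations to a power keeps strong correlations close to 1 while pushing moderate ones toward 0, so the network stays weighted without a hard cutoff.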
Smita, Shuchi; Katiyar, Amit; Pandey, Dev Mani; Chinnusamy, Viswanathan; Archak, Sunil; Bansal, Kailash Chander
2013-01-01
Identification of genes that are coexpressed across various tissues and environmental stresses is biologically interesting, since they may play coordinated roles in similar biological processes. Genes with correlated expression patterns can be best identified by using coexpression network analysis of transcriptome data. In the present study, we analyzed the temporal-spatial coordination of gene expression in root, leaf and panicle of rice under drought stress and constructed a network using WGCNA and Cytoscape. A total of 2199 differentially expressed genes (DEGs) were identified in three or more tissues, of which 88 genes have a coordinated expression profile among all six tissues under drought stress. These 88 highly coordinated genes were further subjected to module identification in the coexpression network. Based on chief topological properties we identified 18 hub genes such as ABC transporter, ATP-binding protein, dehydrin, protein phosphatase 2C, LTPL153 - Protease inhibitor, phosphatidylethanolamine-binding protein, lactose permease-related, NADP-dependent malic enzyme, etc. Motif enrichment analysis showed the presence of ABRE cis-elements in the promoters of > 62% of the coordinately expressed genes. Our results suggest that drought stress-mediated upregulation of gene expression was coordinated through an ABA-dependent signaling pathway across tissues, at least for the subset of genes identified in this study, while downregulation appears to be regulated by tissue-specific pathways in rice.
Modularity and evolutionary constraints in a baculovirus gene regulatory network
2013-01-01
Background The structure of regulatory networks remains an open question in our understanding of complex biological systems. Interactions during complete viral life cycles present unique opportunities to understand how host-parasite networks take shape and behave. The Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is a large double-stranded DNA virus, whose genome may encode 152 open reading frames (ORFs). Here we present the analysis of the ordered cascade of the AgMNPV gene expression. Results We observed an earlier onset of the expression than previously reported for other baculoviruses, especially for genes involved in DNA replication. Most ORFs were expressed at higher levels in a more permissive host cell line. Genes with more than one copy in the genome had distinct expression profiles, which could indicate the acquisition of new functionalities. The transcription gene regulatory network (GRN) for 149 ORFs had a modular topology comprising five communities of highly interconnected nodes that separated key genes that are functionally related into different communities, possibly maximizing redundancy and GRN robustness by compartmentalization of important functions. Core conserved functions showed expression synchronicity, distinct GRN features and significantly less genetic diversity, consistent with evolutionary constraints imposed on key elements of biological systems. This reduced genetic diversity also had a positive correlation with the importance of the gene in our estimated GRN, supporting a relationship between phylogenetic data of baculovirus genes and network features inferred from expression data. We also observed that gene arrangement in overlapping transcripts was conserved among related baculoviruses, suggesting a principle of genome organization.
Conclusions Albeit with a reduced number of nodes (149), the AgMNPV GRN had a topology and key characteristics similar to those observed in complex cellular organisms, which indicates that modularity may be a general feature of biological gene regulatory networks. PMID:24006890
Li, Jin; Wang, Limei; Guo, Maozu; Zhang, Ruijie; Dai, Qiguo; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Xuan, Ping; Zhang, Mingming
2015-01-01
In humans, despite the rapid increase in disease-associated gene discovery, a large proportion of disease-associated genes are still unknown. Many network-based approaches have been used to prioritize disease genes. Many networks, such as the protein-protein interaction (PPI), KEGG, and gene co-expression networks, have been used. Expression quantitative trait loci (eQTLs) have been successfully applied for the determination of genes associated with several diseases. In this study, we constructed an eQTL-based gene-gene co-regulation network (GGCRN) and used it to mine for disease genes. We adopted the random walk with restart (RWR) algorithm to mine for genes associated with Alzheimer disease. Compared to the Human Protein Reference Database (HPRD) PPI network alone, the integrated HPRD PPI and GGCRN networks provided faster convergence and revealed new disease-related genes. Therefore, using the RWR algorithm for integrated PPI and GGCRN is an effective method for disease-associated gene mining.
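The random walk with restart used for prioritization can be sketched in a few lines. The iteration below is the standard form p(t+1) = (1 - r) W p(t) + r p(0), with W the column-normalised adjacency matrix. The function name, the restart probability of 0.7, and the toy network are illustrative assumptions, not values from the study.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds.

    adj: symmetric adjacency matrix as a list of lists.
    seeds: set of node indices for known disease genes.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = list(p0)
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            # Probability mass flowing into node i from its neighbours.
            flow = sum(
                adj[i][j] / col_sums[j] * p[j] for j in range(n) if col_sums[j] > 0
            )
            nxt.append((1 - restart) * flow + restart * p0[i])
        change = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if change < tol:
            break
    return p


if __name__ == "__main__":
    # Toy path network 0 - 1 - 2 - 3 with node 0 as the only seed.
    path = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    scores = random_walk_with_restart(path, seeds={0})
    print(scores)
```

Candidate genes are then ranked by their steady-state score, so nodes that are close to the seeds in the network rank highest.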
Fine-tuning gene networks using simple sequence repeats
Egbert, Robert G.; Klavins, Eric
2012-01-01
The parameters in a complex synthetic gene network must be extensively tuned before the network functions as designed. Here, we introduce a simple and general approach to rapidly tune gene networks in Escherichia coli using hypermutable simple sequence repeats embedded in the spacer region of the ribosome binding site. By varying repeat length, we generated expression libraries that incrementally and predictably sample gene expression levels over a 1,000-fold range. We demonstrate the utility of the approach by creating a bistable switch library that programmatically samples the expression space to balance the two states of the switch, and we illustrate the need for tuning by showing that the switch’s behavior is sensitive to host context. Further, we show that mutation rates of the repeats are controllable in vivo for stability or for targeted mutagenesis—suggesting a new approach to optimizing gene networks via directed evolution. This tuning methodology should accelerate the process of engineering functionally complex gene networks. PMID:22927382
Derous, Davina; Mitchell, Sharon E; Green, Cara L; Wang, Yingchun; Han, Jing Dong J; Chen, Luonan; Promislow, Daniel E L; Lusseau, David; Speakman, John R; Douglas, Alex
2016-05-01
Connectivity in a gene-gene network declines with age, typically within gene clusters. We explored the effect of short-term (3 months) graded calorie restriction (CR) (up to 40%) on the network structure of aging-associated genes in the murine hypothalamus by using conditional mutual information. The networks showed a topological rearrangement when exposed to graded CR, with a higher relative within-cluster connectivity at 40CR. We observed changes in gene centrality concordant with changes in CR level, with Ppargc1a and Ppt1 having increased centrality and Etfdh, Traf3 and Abcc1 decreased centrality as CR increased. This change in gene centrality in a graded manner with CR occurred in the absence of parallel changes in gene expression levels. This study emphasizes the importance of augmenting traditional differential gene expression analyses to better understand structural changes in the transcriptome. Overall, our results suggested that CR induced changes in the centrality of biologically relevant genes that play an important role in preventing the age-associated loss of network integrity, irrespective of their gene expression levels. PMID:27115072
Mapping eQTL Networks with Mixed Graphical Markov Models
Tur, Inma; Roverato, Alberto; Castelo, Robert
2014-01-01
Expression quantitative trait loci (eQTL) mapping constitutes a challenging problem due to, among other reasons, the high-dimensional multivariate nature of gene-expression traits. Next to the expression heterogeneity produced by confounding factors and other sources of unwanted variation, indirect effects spread throughout genes as a result of genetic, molecular, and environmental perturbations. From a multivariate perspective one would like to adjust for the effect of all of these factors to end up with a network of direct associations connecting the path from genotype to phenotype. In this article we approach this challenge with mixed graphical Markov models, higher-order conditional independences, and q-order correlation graphs. These models show that additive genetic effects propagate through the network as a function of gene–gene correlations. Our estimation of the eQTL network underlying a well-studied yeast data set leads to a sparse structure with more direct genetic and regulatory associations that enable a straightforward comparison of the genetic control of gene expression across chromosomes. Interestingly, it also reveals that eQTLs explain most of the expression variability of network hub genes. PMID:25271303
miR-638 regulates gene expression networks associated with emphysematous lung destruction
2013-01-01
Background Chronic obstructive pulmonary disease (COPD) is a heterogeneous disease characterized by varying degrees of emphysematous lung destruction and small airway disease, each with distinct effects on clinical outcomes. There is little known about how microRNAs contribute specifically to the emphysema phenotype. We examined how genome-wide microRNA expression is altered with regional emphysema severity and how these microRNAs regulate disease-associated gene expression networks. Methods We profiled microRNAs in different regions of the lung with varying degrees of emphysema from 6 smokers with COPD and 2 controls (8 regions × 8 lungs = 64 samples). Regional emphysema severity was quantified by mean linear intercept. Whole genome microRNA and gene expression data were integrated in the same samples to build co-expression networks. Candidate microRNAs were perturbed in human lung fibroblasts in order to validate these networks. Results The expression levels of 63 microRNAs (P < 0.05) were altered with regional emphysema. A subset, including miR-638, miR-30c, and miR-181d, had expression levels that were associated with those of their predicted mRNA targets. Genes correlated with these microRNAs were enriched in pathways associated with emphysema pathophysiology (for example, oxidative stress and accelerated aging). Inhibition of miR-638 expression in lung fibroblasts led to modulation of these same emphysema-related pathways. Gene targets of miR-638 in these pathways were amongst those negatively correlated with miR-638 expression in emphysema. Conclusions Our findings demonstrate that microRNAs are altered with regional emphysema severity and modulate disease-associated gene expression networks. Furthermore, miR-638 may regulate gene expression pathways related to the oxidative stress response and aging in emphysematous lung tissue and lung fibroblasts. PMID:24380442
Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola
2014-12-01
We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named "fight-club hubs" characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named "switch genes" was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. © 2014 American Society of Plant Biologists. All rights reserved.
Exploring of the molecular mechanism of rhinitis via bioinformatics methods
Song, Yufen; Yan, Zhaohui
2018-01-01
The aim of this study was to analyze gene expression profiles to explore the function and regulatory network of differentially expressed genes (DEGs) in the pathogenesis of rhinitis by a bioinformatics method. The gene expression profile of GSE43523 was downloaded from the Gene Expression Omnibus database. The dataset contained 7 seasonal allergic rhinitis samples and 5 non-allergic normal samples. DEGs between rhinitis samples and normal samples were identified via the limma package of R. The WebGestalt database was used to identify enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the DEGs. The differentially co-expressed pairs of the DEGs were identified via the DCGL package in R, and the differential co-expression network was constructed based on these pairs. A protein-protein interaction (PPI) network of the DEGs was constructed based on the Search Tool for the Retrieval of Interacting Genes database. A total of 263 DEGs were identified in rhinitis samples compared with normal samples, including 125 downregulated ones and 138 upregulated ones. The DEGs were enriched in 7 KEGG pathways. A total of 308 differential co-expression gene pairs were obtained. A differential co-expression network was constructed, containing 212 nodes. In total, 148 PPI pairs of the DEGs were identified, and a PPI network was constructed based on these pairs. Bioinformatics methods could help us identify significant genes and pathways related to the pathogenesis of rhinitis. The steroid biosynthesis pathway and metabolic pathways might play important roles in the development of allergic rhinitis (AR). Genes such as CDC42 effector protein 5, solute carrier family 39 member A11 and PR/SET domain 10 might also be associated with the pathogenesis of AR, which provides references for the molecular mechanisms of AR. PMID:29257233
Chen, Shuonan; Mar, Jessica C
2018-06-19
A fundamental fact in biology states that genes do not operate in isolation, and yet, methods that infer regulatory networks for single cell gene expression data have been slow to emerge. With single cell sequencing methods now becoming accessible, general network inference algorithms that were initially developed for data collected from bulk samples may not be suitable for single cells. Meanwhile, although methods that are specific for single cell data are now emerging, whether they have improved performance over general methods is unknown. In this study, we evaluate the applicability of five general methods and three single cell methods for inferring gene regulatory networks from both experimental single cell gene expression data and in silico simulated data. Standard evaluation metrics using ROC curves and Precision-Recall curves against reference sets sourced from the literature demonstrated that most of the methods performed poorly when they were applied to either experimental single cell data, or simulated single cell data, which demonstrates their lack of performance for this task. Using default settings, network methods were applied to the same datasets. Comparisons of the learned networks highlighted the uniqueness of some predicted edges for each method. The fact that different methods infer networks that vary substantially reflects the underlying mathematical rationale and assumptions that distinguish network methods from each other. This study provides a comprehensive evaluation of network modeling algorithms applied to experimental single cell gene expression data and in silico simulated datasets where the network structure is known. Comparisons demonstrate that most of these assessed network methods are not able to predict network structures from single cell expression data accurately, even if they are specifically developed for single cell methods. 
Also, single cell methods, which usually depend on more elaborative algorithms, in general have less similarity to each other in the sets of edges detected. The results from this study emphasize the importance for developing more accurate optimized network modeling methods that are compatible for single cell data. Newly-developed single cell methods may uniquely capture particular features of potential gene-gene relationships, and caution should be taken when we interpret these results.
USDA-ARS's Scientific Manuscript database
A gene co-expression network was generated using a dual RNA-seq study with the fungal pathogen A. flavus and its plant host Z. mays during the initial 3 days of infection. The analysis deciphered novel pathways and mapped genes of interest in both organisms during the infection. This network reveal...
Shanley, Thomas P; Cvijanovich, Natalie; Lin, Richard; Allen, Geoffrey L; Thomas, Neal J; Doctor, Allan; Kalyanaraman, Meena; Tofil, Nancy M; Penfil, Scott; Monaco, Marie; Odoms, Kelli; Barnes, Michael; Sakthivel, Bhuvaneswari; Aronow, Bruce J; Wong, Hector R
2007-01-01
We have conducted longitudinal studies focused on the expression profiles of signaling pathways and gene networks in children with septic shock. Genome-level expression profiles were generated from whole blood-derived RNA of children with septic shock (n = 30) corresponding to day one and day three of septic shock, respectively. Based on sequential statistical and expression filters, day one and day three of septic shock were characterized by differential regulation of 2,142 and 2,504 gene probes, respectively, relative to controls (n = 15). Venn analysis demonstrated 239 unique genes in the day one dataset, 598 unique genes in the day three dataset, and 1,906 genes common to both datasets. Functional analyses demonstrated time-dependent, differential regulation of genes involved in multiple signaling pathways and gene networks primarily related to immunity and inflammation. Notably, multiple and distinct gene networks involving T cell- and MHC antigen-related biology were persistently downregulated on both day one and day three. Further analyses demonstrated large scale, persistent downregulation of genes corresponding to functional annotations related to zinc homeostasis. These data represent the largest reported cohort of patients with septic shock subjected to longitudinal genome-level expression profiling. The data further advance our genome-level understanding of pediatric septic shock and support novel hypotheses. PMID:17932561
Course 10: Three Lectures on Biological Networks
NASA Astrophysics Data System (ADS)
Magnasco, M. O.
1 Enzymatic networks. Proofreading knots: How DNA topoisomerases disentangle DNA 1.1 Length scales and energy scales 1.2 DNA topology 1.3 Topoisomerases 1.4 Knots and supercoils 1.5 Topological equilibrium 1.6 Can topoisomerases recognize topology? 1.7 Proposal: Kinetic proofreading 1.8 How to do it twice 1.9 The care and proofreading of knots 1.10 Suppression of supercoils 1.11 Problems and outlook 1.12 Disquisition 2 Gene expression networks. Methods for analysis of DNA chip experiments 2.1 The regulation of gene expression 2.2 Gene expression arrays 2.3 Analysis of array data 2.4 Some simplifying assumptions 2.5 Probeset analysis 2.6 Discussion 3 Neural and gene expression networks: Song-induced gene expression in the canary brain 3.1 The study of songbirds 3.2 Canary song 3.3 ZENK 3.4 The blush 3.5 Histological analysis 3.6 Natural vs. artificial 3.7 The Blush II: gAP 3.8 Meditation
Ishiwata, Ryosuke R; Morioka, Masaki S; Ogishima, Soichi; Tanaka, Hiroshi
2009-02-15
BioCichlid is a 3D visualization system of time-course microarray data on molecular networks, aiming at interpretation of gene expression data by transcriptional relationships based on the central dogma with physical and genetic interactions. BioCichlid visualizes both physical (protein) and genetic (regulatory) network layers, and provides animation of time-course gene expression data on the genetic network layer. Transcriptional regulations are represented to bridge the physical network (transcription factors) and genetic network (regulated genes) layers, thus integrating promoter analysis into the pathway mapping. BioCichlid enhances the interpretation of microarray data and allows for revealing the underlying mechanisms causing differential gene expressions. BioCichlid is freely available and can be accessed at http://newton.tmd.ac.jp/. Source codes for both biocichlid server and client are also available.
Xi, Jianing; Wang, Minghui; Li, Ao
2018-06-05
Discovery of mutated driver genes is one of the primary objectives in studying tumorigenesis. To discover driver genes that are mutated at relatively low frequency from somatic mutation data, many existing methods incorporate an interaction network as prior information. However, prior information from mRNA expression patterns, which has also been shown to be highly informative of cancer progression, is not exploited by these existing network-based methods. To incorporate prior information from both the interaction network and mRNA expression, we propose a robust and sparse co-regularized nonnegative matrix factorization to discover driver genes from mutation data. Furthermore, our framework also applies Frobenius norm regularization to counter overfitting. A sparsity-inducing penalty is employed to obtain sparse scores in gene representations, of which the top-scoring genes are selected as driver candidates. Evaluation experiments with known benchmarking genes indicate that the performance of our method benefits from the two types of prior information. Our method also outperforms the existing network-based methods and detects some driver genes that are not predicted by the competing methods. In summary, our proposed method can improve the performance of driver gene discovery by effectively incorporating prior information from the interaction network and mRNA expression patterns into a robust and sparse co-regularized matrix factorization framework.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Xiaohan; Ye, Chuyu; Bisaria, Anjali
2011-01-01
Populus is an important bioenergy crop for bioethanol production. A greater understanding of cell wall biosynthesis processes is critical in reducing biomass recalcitrance, a major hindrance in efficient generation of ethanol from lignocellulosic biomass. Here, we report the identification of candidate cell wall biosynthesis genes through the development and application of a novel bioinformatics pipeline. As a first step, via text-mining of PubMed publications, we obtained 121 Arabidopsis genes that had experimental evidence supporting their involvement in cell wall biosynthesis or remodeling. The 121 genes were then used as bait genes to query an Arabidopsis co-expression database, and additional genes were identified as neighbors of the bait genes in the network, increasing the number of genes to 548. The 548 Arabidopsis genes were then used to re-query the Arabidopsis co-expression database and re-construct a network that captured additional network neighbors, expanding to a total of 694 genes. The 694 Arabidopsis genes were computationally divided into 22 clusters. Queries of the Populus genome using the Arabidopsis genes revealed 817 Populus orthologs. Functional analysis of gene ontology and tissue-specific gene expression indicated that these Arabidopsis and Populus genes are high-likelihood candidates for functional genomics in relation to cell wall biosynthesis.
EgoNet: identification of human disease ego-network modules
2014-01-01
Background Mining novel biomarkers from gene expression profiles for accurate disease classification is challenging due to small sample size and high noise in gene expression measurements. Several studies have proposed integrated analyses of microarray data and protein-protein interaction (PPI) networks to find diagnostic subnetwork markers. However, the neighborhood relationship among network member genes has not been fully considered by those methods, leaving many potential gene markers unidentified. The main idea of this study is to take full advantage of the biological observation that genes associated with the same or similar diseases commonly reside in the same neighborhood of molecular networks. Results We present EgoNet, a novel method based on egocentric network-analysis techniques, to exhaustively search and prioritize disease subnetworks and gene markers from a large-scale biological network. When applied to a triple-negative breast cancer (TNBC) microarray dataset, the top selected modules contain both known gene markers in TNBC and novel candidates, such as RAD51 and DOK1, which play a central role in their respective ego-networks by connecting many differentially expressed genes. Conclusions Our results suggest that EgoNet, which is based on the ego network concept, allows the identification of novel biomarkers and provides a deeper understanding of their roles in complex diseases. PMID:24773628
Colaprico, Antonio; Bontempi, Gianluca; Castiglioni, Isabella
2018-01-01
Like other cancers, prostate cancer (PC) is caused by the accumulation of genetic alterations in the cells that drives malignant growth. These alterations are revealed by gene profiling and copy number alteration (CNA) analysis. Moreover, recent evidence suggests that microRNAs also have an important role in PC development. Despite efforts to profile PC, the alterations (gene, CNA, and miRNA) and biological processes that correlate with disease development and progression remain partially elusive. Many gene signatures proposed as diagnostic or prognostic tools in cancer overlap poorly. The identification of co-expressed genes that are functionally related can reveal a core network of genes associated with PC with better reproducibility. By combining different approaches, including the integration of mRNA expression profiles, CNAs, and miRNA expression levels, we identified a gene signature of four genes overlapping with other published gene signatures and able to distinguish, in silico, high Gleason-scored PC from normal human tissue, which was further enriched to 19 genes by gene co-expression analysis. From the analysis of miRNAs possibly regulating this network, we found that hsa-miR-153 was highly connected to the genes in the network. Our results identify a four-gene signature with diagnostic and prognostic value in PC and suggest an interesting gene network that could play a key regulatory role in PC development and progression. Furthermore, hsa-miR-153, controlling this network, could be a potential biomarker for theranostics in high Gleason-scored PC. PMID:29562723
Probabilistic representation of gene regulatory networks.
Mao, Linyong; Resat, Haluk
2004-09-22
Recent experiments have established unambiguously that biological systems can have significant cell-to-cell variations in gene expression levels even in isogenic populations. Computational approaches to studying gene expression in cellular systems should capture such biological variations for a more realistic representation. In this paper, we present a new fully probabilistic approach to the modeling of gene regulatory networks that allows for fluctuations in the gene expression levels. The new algorithm uses a very simple representation for the genes, and accounts for the repression or induction of the genes and for the biological variations among isogenic populations simultaneously. Because of its simplicity, the introduced algorithm is a very promising approach to modeling large-scale gene regulatory networks. We have tested the new algorithm on the recently bioengineered synthetic gene network library. The good agreement between the computed and the experimental results for this library of networks, and additional tests, demonstrate that the new algorithm is robust and very successful in explaining the experimental data. The simulation software is available upon request. Supplementary material will be made available on the OUP server.
Efficient Reverse-Engineering of a Developmental Gene Regulatory Network
Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes
2012-01-01
Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. 
Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to discover whether there are rules or regularities governing development and evolution of complex multi-cellular organisms. PMID:22807664
TP53 mutations, expression and interaction networks in human cancers
Wang, Xiaosheng; Sun, Qingrong
2017-01-01
Although the associations of p53 dysfunction, p53 interaction networks and oncogenesis have been widely explored, a systematic analysis of TP53 mutations and its related interaction networks in various types of human cancers is lacking. Our study explored the associations of TP53 mutations, gene expression, clinical outcomes, and TP53 interaction networks across 33 cancer types using data from The Cancer Genome Atlas (TCGA). We show that TP53 is the most frequently mutated gene in a number of cancers, and its mutations appear to be early events in cancer initiation. We identified genes potentially repressed by p53, and genes whose expression correlates significantly with TP53 expression. These gene products may be especially important nodes in p53 interaction networks in human cancers. This study shows that while TP53-truncating mutations often result in decreased TP53 expression, other non-truncating TP53 mutations result in increased TP53 expression in some cancers. Survival analyses in a number of cancers show that patients with TP53 mutations are more likely to have worse prognoses than TP53-wildtype patients, and that elevated TP53 expression often leads to poor clinical outcomes. We identified a set of candidate synthetic lethal (SL) genes for TP53, and validated some of these SL interactions using data from the Cancer Cell Line Project. These predicted SL genes are promising candidates for experimental validation and the development of personalized therapeutics for patients with TP53-mutated cancers. PMID:27880943
TP53 mutations, expression and interaction networks in human cancers.
Wang, Xiaosheng; Sun, Qingrong
2017-01-03
Although the associations of p53 dysfunction, p53 interaction networks and oncogenesis have been widely explored, a systematic analysis of TP53 mutations and its related interaction networks in various types of human cancers is lacking. Our study explored the associations of TP53 mutations, gene expression, clinical outcomes, and TP53 interaction networks across 33 cancer types using data from The Cancer Genome Atlas (TCGA). We show that TP53 is the most frequently mutated gene in a number of cancers, and its mutations appear to be early events in cancer initiation. We identified genes potentially repressed by p53, and genes whose expression correlates significantly with TP53 expression. These gene products may be especially important nodes in p53 interaction networks in human cancers. This study shows that while TP53-truncating mutations often result in decreased TP53 expression, other non-truncating TP53 mutations result in increased TP53 expression in some cancers. Survival analyses in a number of cancers show that patients with TP53 mutations are more likely to have worse prognoses than TP53-wildtype patients, and that elevated TP53 expression often leads to poor clinical outcomes. We identified a set of candidate synthetic lethal (SL) genes for TP53, and validated some of these SL interactions using data from the Cancer Cell Line Project. These predicted SL genes are promising candidates for experimental validation and the development of personalized therapeutics for patients with TP53-mutated cancers.
Dynamic modelling of microRNA regulation during mesenchymal stem cell differentiation.
Weber, Michael; Sotoca, Ana M; Kupfer, Peter; Guthke, Reinhard; van Zoelen, Everardus J
2013-11-12
Network inference from gene expression data is a typical approach to reconstruct gene regulatory networks. During chondrogenic differentiation of human mesenchymal stem cells (hMSCs), a complex transcriptional network is active and regulates the temporal differentiation progress. As modulators of transcriptional regulation, microRNAs (miRNAs) play a critical role in stem cell differentiation. Integrated network inference aims at determining interrelations between miRNAs and mRNAs on the basis of expression data as well as miRNA target predictions. We applied the NetGenerator tool in order to infer an integrated gene regulatory network. Time series experiments were performed to measure mRNA and miRNA abundances of TGF-beta1+BMP2 stimulated hMSCs. Network nodes were identified by analysing temporal expression changes, miRNA target gene predictions, time series correlation and literature knowledge. Network inference was performed using NetGenerator to reconstruct a dynamical regulatory model based on the measured data and prior knowledge. The resulting model is robust against noise and shows an optimal trade-off between fitting precision and inclusion of prior knowledge. It predicts the influence of miRNAs on the expression of chondrogenic marker genes and therefore proposes novel regulatory relations in differentiation control. By analysing the inferred network, we identified a previously unknown regulatory effect of miR-524-5p on the expression of the transcription factor SOX9 and the chondrogenic marker genes COL2A1, ACAN and COL10A1. Genome-wide exploration of miRNA-mRNA regulatory relationships is a reasonable approach to identify miRNAs which have so far not been associated with the investigated differentiation process. The NetGenerator tool is able to identify valid gene regulatory networks on the basis of miRNA and mRNA time series data.
NASA Astrophysics Data System (ADS)
Tripathi, Shubham; Deem, Michael W.
2015-02-01
Cancer progresses with a change in the structure of the gene network in normal cells. We define a measure of organizational hierarchy in gene networks of affected cells in adult acute myeloid leukemia (AML) patients. With a retrospective cohort analysis based on the gene expression profiles of 116 AML patients, we find that the likelihood of future cancer relapse and the level of clinical risk are directly correlated with the level of organization in the cancer related gene network. We also explore the variation of the level of organization in the gene network with cancer progression. We find that this variation is non-monotonic, which implies the fitness landscape in the evolution of AML cancer cells is non-trivial. We further find that the hierarchy in gene expression at the time of diagnosis may be a useful biomarker in AML prognosis.
NASA Astrophysics Data System (ADS)
Jia, Chen; Qian, Hong; Chen, Min; Zhang, Michael Q.
2018-03-01
The transient response to a stimulus and subsequent recovery to a steady state are the fundamental characteristics of a living organism. Here we study the relaxation kinetics of autoregulatory gene networks based on the chemical master equation model of single-cell stochastic gene expression with nonlinear feedback regulation. We report a novel relation between the rate of relaxation, characterized by the spectral gap of the Markov model, and the feedback sign of the underlying gene circuit. When a network has no feedback, the relaxation rate is exactly the decaying rate of the protein. We further show that positive feedback always slows down the relaxation kinetics while negative feedback always speeds it up. Numerical simulations demonstrate that this relation provides a possible method to infer the feedback topology of autoregulatory gene networks by using time-series data of gene expression.
MINER: exploratory analysis of gene interaction networks by machine learning from expression data.
Kadupitige, Sidath Randeni; Leung, Kin Chun; Sellmeier, Julia; Sivieng, Jane; Catchpoole, Daniel R; Bain, Michael E; Gaëta, Bruno A
2009-12-03
The reconstruction of gene regulatory networks from high-throughput "omics" data has become a major goal in the modelling of living systems. Numerous approaches have been proposed, most of which attempt only "one-shot" reconstruction of the whole network with no intervention from the user, or offer only simple correlation analysis to infer gene dependencies. We have developed MINER (Microarray Interactive Network Exploration and Representation), an application that combines multivariate non-linear tree learning of individual gene regulatory dependencies, visualisation of these dependencies as both trees and networks, and representation of known biological relationships based on common Gene Ontology annotations. MINER allows biologists to explore the dependencies influencing the expression of individual genes in a gene expression data set in the form of decision, model or regression trees, using their domain knowledge to guide the exploration and formulate hypotheses. Multiple trees can then be summarised in the form of a gene network diagram. MINER is being adopted by several of our collaborators and has already led to the discovery of a new significant regulatory relationship with subsequent experimental validation. Unlike most gene regulatory network inference methods, MINER allows the user to start from genes of interest and build the network gene-by-gene, incorporating domain expertise in the process. This approach has been used successfully with RNA microarray data but is applicable to other quantitative data produced by high-throughput technologies such as proteomics and "next generation" DNA sequencing.
Wang, Zhiwei; Liao, Tianqi; Zhou, Zhongkai; Wang, Yuyang; Diao, Yongjia; Strappe, Padraig; Prenzler, Paul; Ayton, Jamie; Blanchard, Chris
2016-09-06
To study the mechanism underlying the liver damage induced by deep-fried oil (DO) consumption and the beneficial effects of resistant starch (RS) supplementation, differential gene expression and pathway networks were analyzed based on RNA sequencing data from rats. The up/down regulated genes and corresponding signaling pathways were used to construct a novel local gene network (LGN). The topology of the network showed characteristics of a small-world network, with some pathways demonstrating a high degree. Some changes in genes led to a higher probability of disease or infection with DO intake. More importantly, the main pathways were found to be almost the same between the two LGNs (30 of the 48 pathways overlapped) with gene expression profile. This finding may indicate that RS supplementation in a DO-containing diet may mainly regulate the genes related to DO damage, and RS in the diet may provide direct signals to the liver cells and modulate its effect through a network involving complex gene regulatory events. It is the first attempt to reveal the mechanism of the attenuation of liver dysfunction by RS supplementation in a DO-containing diet using differential gene expression and pathway networks. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Xiang, Ruidong; McNally, Jody; Rowe, Suzanne; Jonker, Arjan; Pinares-Patino, Cesar S.; Oddy, V. Hutton; Vercoe, Phil E.; McEwan, John C.; Dalrymple, Brian P.
2016-01-01
Ruminants obtain nutrients from microbial fermentation of plant material, primarily in their rumen, a multilayered forestomach. How the different layers of the rumen wall respond to diet and influence microbial fermentation, and how these processes are regulated, are not well understood. Gene expression correlation networks were constructed from full thickness rumen wall transcriptomes of 24 sheep fed two different amounts and qualities of a forage and measured for methane production. The network contained two major negatively correlated gene sub-networks predominantly representing the epithelial and muscle layers of the rumen wall. Within the epithelium sub-network, gene clusters representing lipid/oxo-acid metabolism, general metabolism and proliferating and differentiating cells were identified. The expression of cell cycle and metabolic genes was positively correlated with dry matter intake, ruminal short chain fatty acid concentrations and methane production. A weak correlation between lipid/oxo-acid metabolism genes and methane yield was observed. Feed consumption level explained the majority of gene expression variation, particularly for the cell cycle genes. Many known stratified epithelium transcription factors had significantly enriched targets in the epithelial gene clusters. The expression patterns of the transcription factors and their targets in proliferating and differentiating skin are mirrored in the rumen, suggesting conservation of regulatory systems. PMID:27966600
Annotation of gene function in citrus using gene expression information and co-expression networks
2014-01-01
Background The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. Results We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. 
Conclusions Integration of citrus gene co-expression networks, functional enrichment analysis and gene expression information provide opportunities to infer gene function in citrus. We present a publicly accessible tool, Network Inference for Citrus Co-Expression (NICCE, http://citrus.adelaide.edu.au/nicce/home.aspx), for the gene co-expression analysis in citrus. PMID:25023870
High-resolution gene expression data from blastoderm embryos of the scuttle fly Megaselia abdita
Wotton, Karl R; Jiménez-Guri, Eva; Crombach, Anton; Cicin-Sain, Damjan; Jaeger, Johannes
2015-01-01
Gap genes are involved in segment determination during early development in dipteran insects (flies, midges, and mosquitoes). We carried out a systematic quantitative comparative analysis of the gap gene network across different dipteran species. Our work provides mechanistic insights into the evolution of this pattern-forming network. As a central component of our project, we created a high-resolution quantitative spatio-temporal data set of gap and maternal co-ordinate gene expression in the blastoderm embryo of the non-drosophilid scuttle fly, Megaselia abdita. Our data include expression patterns in both wild-type and RNAi-treated embryos. The data—covering 10 genes, 10 time points, and over 1,000 individual embryos—consist of original embryo images, quantified expression profiles, extracted positions of expression boundaries, and integrated expression patterns, plus metadata and intermediate processing steps. These data provide a valuable resource for researchers interested in the comparative study of gene regulatory networks and pattern formation, an essential step towards a more quantitative and mechanistic understanding of developmental evolution. PMID:25977812
Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola
2014-01-01
We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named “fight-club hubs” characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named “switch genes” was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. PMID:25490918
Identifying key genes in rheumatoid arthritis by weighted gene co-expression network analysis.
Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin
2017-08-01
This study aimed to identify rheumatoid arthritis (RA) related genes based on microarray data using the WGCNA (weighted gene co-expression network analysis) method. Two gene expression profile datasets, GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls), were downloaded from the Gene Expression Omnibus database. Characteristic genes were identified using the metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance, defined as the average gene significance of all genes in a module, was used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subjected to functional annotation and pathway enrichment analysis using the Database for Annotation, Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in a total of 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs, sanguinarine and papaverine, were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
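The abstract above defines module significance as the average gene significance over a module. A minimal sketch of that calculation follows, assuming gene significance is the absolute Pearson correlation between a gene's expression and a 0/1 disease status, which is a common convention but is not stated in the abstract. The expression values and gene names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def module_significance(expression, module_genes, trait):
    """Average gene significance (|correlation with trait|) over a module."""
    gene_significance = [abs(pearson(expression[g], trait)) for g in module_genes]
    return sum(gene_significance) / len(gene_significance)


# Hypothetical toy data: four samples, status 0 = control, 1 = disease.
trait = [0, 0, 1, 1]
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],    # rises with disease status
    "geneB": [4.0, 3.0, 2.0, 1.0],    # falls with disease status
    "geneC": [1.0, -1.0, 1.0, -1.0],  # unrelated to disease status
}

ms_ab = module_significance(expression, ["geneA", "geneB"], trait)
ms_c = module_significance(expression, ["geneC"], trait)
print(round(ms_ab, 3), round(ms_c, 3))
```

Taking the absolute value means that up- and down-regulated genes both raise a module's significance; a signed average would let them cancel.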
From Saccharomyces cerevisiae to human: The important gene co-expression modules.
Liu, Wei; Li, Li; Ye, Hua; Chen, Haiwei; Shen, Weibiao; Zhong, Yuexian; Tian, Tian; He, Huaqin
2017-08-01
Network-based systems biology has become an important method for analyzing high-throughput gene expression data and gene function mining. Yeast has long been a popular model organism for biomedical research. In the current study, a weighted gene co-expression network analysis algorithm was applied to construct a gene co-expression network in Saccharomyces cerevisiae. Seventeen stable gene co-expression modules were detected from 2,814 S. cerevisiae microarray data. Further characterization of these modules with the Database for Annotation, Visualization and Integrated Discovery tool indicated that these modules were associated with certain biological processes, such as heat response, cell cycle, translational regulation, mitochondrion oxidative phosphorylation, amino acid metabolism and autophagy. Hub genes were also screened by intra-modular connectivity. Finally, module conservation was evaluated in a human disease microarray dataset. Functional modules were identified in budding yeast, some of which are associated with patient survival. The current study provides a paradigm for single-cell microorganisms and potentially other organisms.
Wang, Li-Xin; Li, Yang; Chen, Guan-Zhi
2018-01-01
Metastatic melanoma is an aggressive skin cancer and is one of the global malignancies with high mortality and morbidity. It is essential to identify and verify diagnostic biomarkers of early metastatic melanoma. Previous studies have systematically assessed protein biomarkers and mRNA-based expression characteristics. However, molecular markers for the early diagnosis of metastatic melanoma have not been identified. To explore potential regulatory targets, we have analyzed the gene microarray expression profiles of malignant melanoma samples by co-expression analysis based on the network approach. The differentially expressed genes (DEGs) were screened by the EdgeR package of R software. A weighted gene co-expression network analysis (WGCNA) was used for the identification of DEGs in the special gene modules and hub genes. Subsequently, a protein-protein interaction network was constructed to extract hub genes associated with gene modules. Finally, twenty-four important hub genes (RASGRP2, IKZF1, CXCR5, LTB, BLK, LINGO3, CCR6, P2RY10, RHOH, JUP, KRT14, PLA2G3, SPRR1A, KRT78, SFN, CLDN4, IL1RN, PKP3, CBLC, KRT16, TMEM79, KLK8, LYPD3 and LYPD5) were treated as valuable factors involved in the immune response and tumor cell development in tumorigenesis. In addition, a transcriptional regulatory network was constructed for these specific modules or hub genes, and a few core transcriptional regulators were found to be mostly associated with our hub genes, including GATA1, STAT1, SP1, and PSG1. In summary, our findings enhance our understanding of the biological process of malignant melanoma metastasis, enabling us to identify specific genes to use for diagnostic and prognostic markers and possibly for targeted therapy.
Gene network analysis: from heart development to cardiac therapy.
Ferrazzi, Fulvia; Bellazzi, Riccardo; Engel, Felix B
2015-03-01
Networks offer a flexible framework to represent and analyse the complex interactions between components of cellular systems. In particular gene networks inferred from expression data can support the identification of novel hypotheses on regulatory processes. In this review we focus on the use of gene network analysis in the study of heart development. Understanding heart development will promote the elucidation of the aetiology of congenital heart disease and thus possibly improve diagnostics. Moreover, it will help to establish cardiac therapies. For example, understanding cardiac differentiation during development will help to guide stem cell differentiation required for cardiac tissue engineering or to enhance endogenous repair mechanisms. We introduce different methodological frameworks to infer networks from expression data such as Boolean and Bayesian networks. Then we present currently available temporal expression data in heart development and discuss the use of network-based approaches in published studies. Collectively, our literature-based analysis indicates that gene network analysis constitutes a promising opportunity to infer therapy-relevant regulatory processes in heart development. However, the use of network-based approaches has so far been limited by the small amount of samples in available datasets. Thus, we propose to acquire high-resolution temporal expression data to improve the mathematical descriptions of regulatory processes obtained with gene network inference methodologies. Especially probabilistic methods that accommodate the intrinsic variability of biological systems have the potential to contribute to a deeper understanding of heart development.
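Boolean networks, one of the frameworks named in the review above, can be illustrated with a minimal sketch. It assumes synchronous updating, where every gene is recomputed from the previous state at once. The three genes and their rules are hypothetical and do not come from any cardiac dataset.

```python
def step(state, rules):
    """Synchronously update every gene from the previous state."""
    return {gene: rule(state) for gene, rule in rules.items()}


# Hypothetical regulatory rules: each gene is either on (True) or off (False).
rules = {
    "A": lambda s: s["A"],                 # constant input signal
    "B": lambda s: s["A"] and not s["C"],  # activated by A, repressed by C
    "C": lambda s: s["B"],                 # activated by B
}

state = {"A": True, "B": False, "C": False}
trajectory = [state]
for _ in range(4):
    state = step(state, rules)
    trajectory.append(state)

for t, s in enumerate(trajectory):
    print(t, s)
```

With the input A held on, the negative feedback loop between B and C makes the toy network oscillate rather than settle, which is the kind of qualitative behaviour such models are used to explore.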
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Siqi; Joseph, Antony; Hammonds, Ann S.
Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set of Drosophila early embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identified 21 principal patterns (PP). Providing a compact yet biologically interpretable representation of Drosophila expression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. In conclusion, the performance of PP with the Drosophila data suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.
Wu, Siqi; Joseph, Antony; Hammonds, Ann S.; ...
2016-04-06
Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set of Drosophila early embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identified 21 principal patterns (PP). Providing a compact yet biologically interpretable representation of Drosophila expression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. In conclusion, the performance of PP with the Drosophila data suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.
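One way to read the "two-tail 5% cutoff on correlation" mentioned above is to keep only the gene pairs whose correlation falls in the lowest 2.5% or highest 2.5% of all pairwise correlations. The sketch below implements that reading; it is an assumption about the procedure, not the authors' code, and the pair names and correlation values are hypothetical.

```python
def two_tail_links(correlations, tail=0.05):
    """Keep pairs whose correlation lies in the extreme `tail` fraction.

    Half of `tail` is taken from the negative end and half from the
    positive end of the sorted correlation values.
    """
    values = sorted(correlations.values())
    per_tail = int(len(values) * tail / 2)  # pairs kept at each end
    if per_tail == 0:
        return {}
    low_cut = values[per_tail - 1]
    high_cut = values[-per_tail]
    return {
        pair: r
        for pair, r in correlations.items()
        if r <= low_cut or r >= high_cut
    }


# Hypothetical toy correlations for 40 gene pairs, evenly spread in [-0.8, 0.76].
toy_correlations = {
    ("tf%d" % i, "target%d" % i): (i - 20) / 25 for i in range(40)
}

links = two_tail_links(toy_correlations)
print(links)
```

Keeping both tails retains strongly negative correlations, which matter when the links of interest include repression as well as activation.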
Protein-protein interaction network of gene expression in the hydrocortisone-treated keloid.
Chen, Rui; Zhang, Zhiliang; Xue, Zhujia; Wang, Lin; Fu, Mingang; Lu, Yi; Bai, Ling; Zhang, Ping; Fan, Zhihong
2015-01-01
In order to explore the molecular mechanism of hydrocortisone in keloid tissue, the gene expression profiles of keloid samples treated with hydrocortisone were subjected to bioinformatics analysis. Firstly, the gene expression profiles (GSE7890) of five samples of keloid treated with hydrocortisone and five untreated keloid samples were downloaded from the Gene Expression Omnibus (GEO) database. Secondly, data were preprocessed using packages in R language and differentially expressed genes (DEGs) were screened using a significance analysis of microarrays (SAM) protocol. Thirdly, the DEGs were subjected to gene ontology (GO) function and KEGG pathway enrichment analysis. Finally, the interactions of DEGs in samples of keloid treated with hydrocortisone were explored in a human protein-protein interaction (PPI) network, and sub-modules of the DEGs interaction network were analyzed using Cytoscape software. Based on the analysis, 572 DEGs in the hydrocortisone-treated samples were screened; most of these were involved in the signal transduction and cell cycle. Furthermore, three critical genes in the module, including COL1A1, NID1, and PRELP, were screened in the PPI network analysis. These findings enhance understanding of the pathogenesis of the keloid and provide references for keloid therapy. © 2015 The International Society of Dermatology.
Wang, Anping; Zhang, Guibin
2017-11-01
The differentially expressed genes between glioblastoma (GBM) cells and normal human brain cells were investigated, and pathway analysis and protein interaction network analysis were performed for these genes. The GSE12657 and GSE42656 gene chips, which contain gene expression profiles of GBM, were obtained from the Gene Expression Omnibus (GEO) database of the National Center for Biotechnology Information (NCBI). The 'limma' package in 'R' was used to analyze the differentially expressed genes in the two gene chips, and gene integration was performed using the 'RobustRankAggreg' package. Finally, pheatmap software was used for heatmap analysis and Cytoscape, DAVID, STRING and KOBAS were used for protein-protein interaction, Gene Ontology (GO) and KEGG analyses. The results were as follows: i) 702 differentially expressed genes were identified in GSE12657, of which 548 were significantly upregulated and 154 were significantly downregulated (p<0.01, fold-change >1), and 1,854 differentially expressed genes were identified in GSE42656, of which 1,068 were significantly upregulated and 786 were significantly downregulated (p<0.01, fold-change >1). A total of 167 differentially expressed genes, including 100 upregulated genes and 67 downregulated genes, were identified after gene integration, and the genes showed significantly different expression levels in GBM compared with normal human brain cells (p<0.05). ii) Interactions between the protein products of 101 differentially expressed genes were identified using STRING and an expression network was established. A key gene, CALM3, was identified using Cytoscape software. iii) GO enrichment analysis showed that differentially expressed genes were mainly enriched in 'neurotransmitter:sodium symporter activity' and 'neurotransmitter transporter activity', which can affect the activity of neurotransmitter transportation. 
KEGG pathway analysis showed that the differentially expressed genes were mainly enriched in 'protein processing in endoplasmic reticulum', which can affect protein processing in the endoplasmic reticulum. The results showed that: i) 167 differentially expressed genes were identified from two gene chips after integration; and ii) a protein interaction network was established, and GO and KEGG pathway analyses were successfully performed to identify and annotate the key gene, which provides new insights for studies on GBM at the gene level.
The Immunological Genome Project: networks of gene expression in immune cells.
Heng, Tracy S P; Painter, Michio W
2008-10-01
The Immunological Genome Project combines immunology and computational biology laboratories in an effort to establish a complete 'road map' of gene-expression and regulatory networks in all immune cells.
Gene regulatory network inference using fused LASSO on multiple data sets
Omranian, Nooshin; Eloundou-Mbebi, Jeanne M. O.; Mueller-Roeber, Bernd; Nikoloski, Zoran
2016-01-01
Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed in a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions. PMID:26864687
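The first two constraints listed in the abstract above (few regulators per gene, similar networks across data sets) map naturally onto the two penalty terms of a generic fused LASSO objective. The sketch below only evaluates such an objective for one target gene; it is a textbook-style illustration, not the authors' exact formulation, and the function name, the parameter names `lam_sparse` and `lam_fuse`, and the toy data are all assumptions. The third constraint from the abstract is not modeled here.

```python
def fused_lasso_objective(datasets, coefs, lam_sparse, lam_fuse):
    """Evaluate a generic fused LASSO objective (illustrative assumption).

    datasets: list of (X, y) pairs, one per data set; X is a list of rows
              (TF expression values), y the target gene's expression.
    coefs:    one coefficient vector per data set.
    """
    # Squared-error fit of each data set with its own coefficients.
    loss = 0.0
    for (X, y), beta in zip(datasets, coefs):
        for row, target in zip(X, y):
            pred = sum(x * b for x, b in zip(row, beta))
            loss += (target - pred) ** 2

    # L1 penalty: favors a small number of regulators per gene.
    sparsity = sum(abs(b) for beta in coefs for b in beta)

    # Fusion penalty: favors similar coefficients across data sets.
    fusion = 0.0
    for i in range(len(coefs)):
        for j in range(i + 1, len(coefs)):
            fusion += sum(abs(a - b) for a, b in zip(coefs[i], coefs[j]))

    return loss + lam_sparse * sparsity + lam_fuse * fusion


# Toy example: two data sets, two candidate TFs, one target gene.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 0.0]
data = [(X, y), (X, y)]

shared = fused_lasso_objective(data, [[1.0, 0.0], [1.0, 0.0]], 0.5, 1.0)
split = fused_lasso_objective(data, [[1.0, 0.0], [0.0, 0.0]], 0.5, 1.0)
```

In this toy case the solution that uses the same regulator in both data sets scores lower (better) than the one where the data sets disagree, which is the behavior the fusion term is meant to encourage.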
Construction of regulatory networks using expression time-series data of a genotyped population.
Yeung, Ka Yee; Dombek, Kenneth M; Lo, Kenneth; Mittler, John E; Zhu, Jun; Schadt, Eric E; Bumgarner, Roger E; Raftery, Adrian E
2011-11-29
The inference of regulatory and biochemical networks from large-scale genomics data is a basic problem in molecular biology. The goal is to generate testable hypotheses of gene-to-gene influences and subsequently to design bench experiments to confirm these network predictions. Coexpression of genes in large-scale gene-expression data implies coregulation and potential gene-gene interactions, but provides little information about the direction of influences. Here, we use both time-series data and genetics data to infer directionality of edges in regulatory networks: time-series data contain information about the chronological order of regulatory events and genetics data allow us to map DNA variations to variations at the RNA level. We generate microarray data measuring time-dependent gene-expression levels in 95 genotyped yeast segregants subjected to a drug perturbation. We develop a Bayesian model averaging regression algorithm that incorporates external information from diverse data types to infer regulatory networks from the time-series and genetics data. Our algorithm is capable of generating feedback loops. We show that our inferred network recovers existing and novel regulatory relationships. Following network construction, we generate independent microarray data on selected deletion mutants to prospectively test network predictions. We demonstrate the potential of our network to discover de novo transcription-factor binding sites. Applying our construction method to previously published data demonstrates that our method is competitive with leading network construction algorithms in the literature.
Huang, Shi-Ming; Zhao, Xia; Zhao, Xue-Mei; Wang, Xiao-Ying; Li, Shan-Shan; Zhu, Yu-Hui
2014-01-01
Renal transplantation is the preferred method for most patients with end-stage renal disease; however, acute renal allograft rejection is still a major risk factor for recipients, leading to renal injury. To improve the early diagnosis and treatment of acute rejection, study of its molecular mechanism is urgently needed. MicroRNA (miRNA) expression profiles and mRNA expression profiles of acute renal allograft rejection and well-functioning allografts, downloaded from the ArrayExpress database, were used to identify differentially expressed (DE) miRNAs and DE mRNAs. DE miRNA targets were predicted by combining five algorithms. By overlapping the DE mRNAs and DE miRNA targets, common genes were obtained. Differentially co-expressed genes (DCGs) were identified by the differential co-expression profile (DCp) and differential co-expression enrichment (DCe) methods in the Differentially Co-expressed Genes and Links (DCGL) package. Then, a co-expression network of DCGs was constructed and cluster analysis was performed. Functional enrichment analysis of the DCGs was carried out. A total of 1270 miRNA targets were predicted and 698 DE mRNAs were obtained. By overlapping miRNA targets and DE mRNAs, 59 common genes were obtained. We obtained 103 DCGs and 5 transcription factors (TFs) based on regulatory impact factors (RIF), then built the regulation network of miRNA targets and DE mRNAs. By clustering the co-expression network, 5 modules were obtained. Among these, module 1 had the highest degree and module 2 contained the largest number of DCGs and common genes. TF CEBPB and several common genes, such as RXRA, BASP1 and AKAP10, were mapped on the co-expression network. C1R showed the highest degree in the network. These genes might be associated with human acute renal allograft rejection. 
We conducted a biological analysis integrating DE mRNAs and DE miRNAs in acute renal allograft rejection, displayed gene expression patterns, and screened out genes and TFs that may be related to acute renal allograft rejection.
Huang, Shi-Ming; Zhao, Xia; Zhao, Xue-Mei; Wang, Xiao-Ying; Li, Shan-Shan; Zhu, Yu-Hui
2014-01-01
Objectives: Renal transplantation is the preferred method for most patients with end-stage renal disease; however, acute renal allograft rejection is still a major risk factor for recipients, leading to renal injury. To improve the early diagnosis and treatment of acute rejection, study of its molecular mechanism is urgently needed. Methods: MicroRNA (miRNA) expression profiles and mRNA expression profiles of acute renal allograft rejection and well-functioning allografts, downloaded from the ArrayExpress database, were used to identify differentially expressed (DE) miRNAs and DE mRNAs. DE miRNA targets were predicted by combining five algorithms. By overlapping the DE mRNAs and DE miRNA targets, common genes were obtained. Differentially co-expressed genes (DCGs) were identified by the differential co-expression profile (DCp) and differential co-expression enrichment (DCe) methods in the Differentially Co-expressed Genes and Links (DCGL) package. Then, a co-expression network of DCGs was constructed and cluster analysis was performed. Functional enrichment analysis of the DCGs was carried out. Results: A total of 1270 miRNA targets were predicted and 698 DE mRNAs were obtained. By overlapping miRNA targets and DE mRNAs, 59 common genes were obtained. We obtained 103 DCGs and 5 transcription factors (TFs) based on regulatory impact factors (RIF), then built the regulation network of miRNA targets and DE mRNAs. By clustering the co-expression network, 5 modules were obtained. Among these, module 1 had the highest degree and module 2 contained the largest number of DCGs and common genes. TF CEBPB and several common genes, such as RXRA, BASP1 and AKAP10, were mapped on the co-expression network. C1R showed the highest degree in the network. These genes might be associated with human acute renal allograft rejection. 
Conclusions: We conducted a biological analysis integrating DE mRNAs and DE miRNAs in acute renal allograft rejection, displayed gene expression patterns, and screened out genes and TFs that may be related to acute renal allograft rejection. PMID:25664019
Biophysical Constraints Arising from Compositional Context in Synthetic Gene Networks.
Yeung, Enoch; Dy, Aaron J; Martin, Kyle B; Ng, Andrew H; Del Vecchio, Domitilla; Beck, James L; Collins, James J; Murray, Richard M
2017-07-26
Synthetic gene expression is highly sensitive to intragenic compositional context (promoter structure, spacing regions between promoter and coding sequences, and ribosome binding sites). However, much less is known about the effects of intergenic compositional context (spatial arrangement and orientation of entire genes on DNA) on expression levels in synthetic gene networks. We compare expression of induced genes arranged in convergent, divergent, or tandem orientations. Induction of convergent genes yielded up to 400% higher expression, greater ultrasensitivity, and dynamic range than divergent- or tandem-oriented genes. Orientation affects gene expression whether one or both genes are induced. We postulate that transcriptional interference in divergent and tandem genes, mediated by supercoiling, can explain differences in expression and validate this hypothesis through modeling and in vitro supercoiling relaxation experiments. Treatment with gyrase abrogated intergenic context effects, bringing expression levels within 30% of each other. We rebuilt the toggle switch with convergent genes, taking advantage of supercoiling effects to improve threshold detection and switch stability. Copyright © 2017 Elsevier Inc. All rights reserved.
F-MAP: A Bayesian approach to infer the gene regulatory network using external hints
Shahdoust, Maryam; Mahjub, Hossein; Sadeghi, Mehdi
2017-01-01
The common topological features of the gene regulatory networks of related species suggest that the network of one species can be reconstructed by using further information from the gene expression profiles of related species. We present an algorithm to reconstruct the gene regulatory network, named F-MAP, which applies knowledge about gene interactions from related species. Our algorithm sets up a Bayesian framework to estimate the precision matrix of one species' microarray gene expression dataset in order to infer the Gaussian graphical model of the network. The conjugate Wishart prior is used, and the information from related species is applied to estimate the hyperparameters of the prior distribution by using factor analysis. Applying the proposed algorithm to six related species of Drosophila shows that the precision of the reconstructed networks is improved considerably compared to the precision of networks constructed by other Bayesian approaches. PMID:28938012
Mason, Mike J; Fan, Guoping; Plath, Kathrin; Zhou, Qing; Horvath, Steve
2009-01-01
Background: Recent work has revealed that a core group of transcription factors (TFs) regulates the key characteristics of embryonic stem (ES) cells: pluripotency and self-renewal. Current efforts focus on identifying genes that play important roles in maintaining pluripotency and self-renewal in ES cells and aim to understand the interactions among these genes. To that end, we investigated the use of unsigned and signed network analysis to identify pluripotency and differentiation related genes. Results: We show that signed networks provide a better systems level understanding of the regulatory mechanisms of ES cells than unsigned networks, using two independent murine ES cell expression data sets. Specifically, using signed weighted gene co-expression network analysis (WGCNA), we found a pluripotency module and a differentiation module, which are not identified in unsigned networks. We confirmed the importance of these modules by incorporating genome-wide TF binding data for key ES cell regulators. Interestingly, we find that the pluripotency module is enriched with genes related to DNA damage repair and mitochondrial function in addition to transcriptional regulation. Using a connectivity measure of module membership, we not only identify known regulators of ES cells but also show that Mrpl15, Msh6, Nrf1, Nup133, Ppif, Rbpj, Sh3gl2, and Zfp39, among other genes, have important roles in maintaining ES cell pluripotency and self-renewal. We also report highly significant relationships between module membership and epigenetic modifications (histone modifications and promoter CpG methylation status), which are known to play a role in controlling gene expression during ES cell self-renewal and differentiation. Conclusion: Our systems biologic re-analysis of gene expression, transcription factor binding, epigenetic and gene ontology data provides a novel integrative view of ES cell biology. PMID:19619308
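The difference between unsigned and signed networks in the abstract above can be illustrated with the standard WGCNA adjacency definitions: an unsigned network uses |cor|^beta, while a signed network uses ((1 + cor) / 2)^beta, so strongly anti-correlated genes are connected in the former but not in the latter. The sketch below is a plain-Python illustration, not the WGCNA R package; the soft-threshold powers (6 and 12) are commonly used defaults assumed here, and the toy expression profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def unsigned_adjacency(r, beta=6):
    """Unsigned network: the sign of the correlation is discarded."""
    return abs(r) ** beta


def signed_adjacency(r, beta=12):
    """Signed network: negative correlations map to adjacency near 0."""
    return ((1 + r) / 2) ** beta


# Hypothetical profiles: gene_b rises with gene_a, gene_c falls.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.1, 3.9, 6.2, 8.0, 9.8]
gene_c = [9.9, 8.1, 6.0, 4.2, 1.8]

r_ab = pearson(gene_a, gene_b)
r_ac = pearson(gene_a, gene_c)
```

With these profiles, `unsigned_adjacency(r_ac)` stays close to 1 while `signed_adjacency(r_ac)` is close to 0, which is why a signed network can keep oppositely regulated groups (for example pluripotency versus differentiation genes) in separate modules.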
Wu, Siqi; Joseph, Antony; Hammonds, Ann S; Celniker, Susan E; Yu, Bin; Frise, Erwin
2016-04-19
Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set of Drosophila early embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identified 21 principal patterns (PP). Providing a compact yet biologically interpretable representation of Drosophila expression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. The performance of PP with the Drosophila data suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.
Omony, Jimmy; de Jong, Anne; Krawczyk, Antonina O.; Eijlander, Robyn T.; Kuipers, Oscar P.
2018-01-01
Sporulation is a survival strategy adopted by bacterial cells in response to harsh environmental adversities. The adaptation potential differs between strains and the variations may arise from differences in gene regulation. Gene networks are a valuable way of studying such regulation processes and establishing associations between genes. We reconstructed and compared sporulation gene co-expression networks (GCNs) of the model laboratory strain Bacillus subtilis 168 and the food-borne industrial isolate Bacillus amyloliquefaciens. Transcriptome data obtained from samples of six stages during the sporulation process were used for network inference. Subsequently, a gene set enrichment analysis was performed to compare the reconstructed GCNs of B. subtilis 168 and B. amyloliquefaciens with respect to biological functions, which showed enriched modules with coherent functional groups associated with sporulation. On the basis of the GCNs and the time-evolution of differentially expressed genes, we could identify novel candidate genes strongly associated with sporulation in B. subtilis 168 and B. amyloliquefaciens. The GCNs offer a framework for exploring transcription factors, their targets, and co-expressed genes during sporulation. Furthermore, the methodology described here can conveniently be applied to other species or biological processes. PMID:29424683
Omony, Jimmy; de Jong, Anne; Krawczyk, Antonina O; Eijlander, Robyn T; Kuipers, Oscar P
2018-02-09
Sporulation is a survival strategy adopted by bacterial cells in response to harsh environmental adversities. The adaptation potential differs between strains and the variations may arise from differences in gene regulation. Gene networks are a valuable way of studying such regulation processes and establishing associations between genes. We reconstructed and compared sporulation gene co-expression networks (GCNs) of the model laboratory strain Bacillus subtilis 168 and the food-borne industrial isolate Bacillus amyloliquefaciens. Transcriptome data obtained from samples of six stages during the sporulation process were used for network inference. Subsequently, a gene set enrichment analysis was performed to compare the reconstructed GCNs of B. subtilis 168 and B. amyloliquefaciens with respect to biological functions, which showed enriched modules with coherent functional groups associated with sporulation. On the basis of the GCNs and the time-evolution of differentially expressed genes, we could identify novel candidate genes strongly associated with sporulation in B. subtilis 168 and B. amyloliquefaciens. The GCNs offer a framework for exploring transcription factors, their targets, and co-expressed genes during sporulation. Furthermore, the methodology described here can conveniently be applied to other species or biological processes.
Hi-C Chromatin Interaction Networks Predict Co-expression in the Mouse Cortex
Hulsman, Marc; Lelieveldt, Boudewijn P. F.; de Ridder, Jeroen; Reinders, Marcel
2015-01-01
The three dimensional conformation of the genome in the cell nucleus influences important biological processes such as gene expression regulation. Recent studies have shown a strong correlation between chromatin interactions and gene co-expression. However, predicting gene co-expression from frequent long-range chromatin interactions remains challenging. We address this by characterizing the topology of the cortical chromatin interaction network using scale-aware topological measures. We demonstrate that based on these characterizations it is possible to accurately predict spatial co-expression between genes in the mouse cortex. Consistent with previous findings, we find that the chromatin interaction profile of a gene-pair is a good predictor of their spatial co-expression. However, the accuracy of the prediction can be substantially improved when chromatin interactions are described using scale-aware topological measures of the multi-resolution chromatin interaction network. We conclude that, for co-expression prediction, it is necessary to take into account different levels of chromatin interactions ranging from direct interaction between genes (i.e. small-scale) to chromatin compartment interactions (i.e. large-scale). PMID:25965262
Priest, Henry D; Fox, Samuel E; Rowley, Erik R; Murray, Jessica R; Michael, Todd P; Mockler, Todd C
2014-01-01
Brachypodium distachyon is a close relative of many important cereal crops. Abiotic stress tolerance has a significant impact on productivity of agriculturally important food and feedstock crops. Analysis of the transcriptome of Brachypodium after chilling, high-salinity, drought, and heat stresses revealed diverse differential expression of many transcripts. Weighted Gene Co-Expression Network Analysis revealed 22 distinct gene modules with specific profiles of expression under each stress. Promoter analysis implicated short DNA sequences directly upstream of module members in the regulation of 21 of 22 modules. Functional analysis of module members revealed enrichment in functional terms for 10 of 22 network modules. Analysis of condition-specific correlations between differentially expressed gene pairs revealed extensive plasticity in the expression relationships of gene pairs. Photosynthesis, cell cycle, and cell wall expression modules were down-regulated by all abiotic stresses. Modules which were up-regulated by each abiotic stress fell into diverse and unique gene ontology (GO) categories. This study provides genomics resources and improves our understanding of abiotic stress responses of Brachypodium.
A transcriptional dynamic network during Arabidopsis thaliana pollen development.
Wang, Jigang; Qiu, Xiaojie; Li, Yuhua; Deng, Youping; Shi, Tieliu
2011-01-01
To understand transcriptional regulatory networks (TRNs), especially the coordinated dynamic regulation between transcription factors (TFs) and their corresponding target genes during development, computational approaches represent a significant advance in genome-wide expression analysis. The major experimental challenges include monitoring time-specific TF activities and identifying the dynamic regulatory relationships between TFs and their target genes, neither of which is yet feasible at large scale. However, various methods have been proposed to computationally estimate those activities and regulations. During the past decade, significant progress has been made towards understanding pollen development at each developmental stage at the molecular level, yet the regulatory mechanisms that control the dynamic pollen development processes remain largely unknown. Here, we adopt Network Component Analysis (NCA) to identify TF activities over the time course, and infer their regulatory relationships based on the coexpression of TFs and their target genes during pollen development. We carried out a meta-analysis by integrating several sets of gene expression data related to Arabidopsis thaliana pollen development (stages range from UNM, BCP, TCP, HP to 0.5 hr pollen tube and 4 hr pollen tube). We constructed a regulatory network, including 19 TFs, 101 target genes and 319 regulatory interactions. The computationally estimated TF activities were well correlated with the expression of their coordinated genes during the development process. We clustered the expression of their target genes in the context of regulatory influences, and inferred new regulatory relationships between those TFs and their target genes; for example, the transcription factor WRKY34, which was found to be specifically expressed in pollen, regulated several new target genes. 
Our findings facilitate a more biologically relevant interpretation of the expression patterns, since the clusters corresponding to the activity of a specific TF or a combination of TFs suggest the coordinated regulation of target genes by TFs. Through integrating different resources, we constructed a dynamic regulatory network of Arabidopsis thaliana during pollen development with gene coexpression and NCA. The network illustrated the relationships between the TFs' activities and their target genes' expression, as well as the interactions between TFs, which provide new insight into the molecular mechanisms that control pollen development.
Functional modules by relating protein interaction networks and gene expression.
Tornow, Sabine; Mewes, H W
2003-11-01
Genes and proteins are organized on the basis of their particular mutual relations or according to their interactions in cellular and genetic networks. These include metabolic or signaling pathways and protein interaction, regulatory or co-expression networks. Integrating the information from the different types of networks may lead to the notion of a functional network and functional modules. To find these modules, we propose a new technique which is based on collective, multi-body correlations in a genetic network. We calculated the correlation strength of a group of genes (e.g. in the co-expression network) which were identified as members of a module in a different network (e.g. in the protein interaction network) and estimated the probability that this correlation strength was found by chance. Groups of genes with a significant correlation strength in different networks have a high probability that they perform the same function. Here, we propose evaluating the multi-body correlations by applying the superparamagnetic approach. We compare our method to the presently applied mean Pearson correlations and show that our method is more sensitive in revealing functional relationships.
Functional modules by relating protein interaction networks and gene expression
Tornow, Sabine; Mewes, H. W.
2003-01-01
Genes and proteins are organized on the basis of their particular mutual relations or according to their interactions in cellular and genetic networks. These include metabolic or signaling pathways and protein interaction, regulatory or co-expression networks. Integrating the information from the different types of networks may lead to the notion of a functional network and functional modules. To find these modules, we propose a new technique which is based on collective, multi-body correlations in a genetic network. We calculated the correlation strength of a group of genes (e.g. in the co-expression network) which were identified as members of a module in a different network (e.g. in the protein interaction network) and estimated the probability that this correlation strength was found by chance. Groups of genes with a significant correlation strength in different networks have a high probability that they perform the same function. Here, we propose evaluating the multi-body correlations by applying the superparamagnetic approach. We compare our method to the presently applied mean Pearson correlations and show that our method is more sensitive in revealing functional relationships. PMID:14576317
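The baseline that the abstract above compares against, the mean Pearson correlation of a gene group together with the probability of reaching that strength by chance, can be sketched as follows. This is only an illustration of the baseline, not the superparamagnetic approach the authors propose. The chance probability is estimated here by enumerating all gene groups of the same size, which is an assumption made for this small example; the gene names and expression values are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_pairwise_correlation(expr, genes):
    """Average Pearson correlation over all gene pairs in the group."""
    pairs = list(combinations(genes, 2))
    return sum(pearson(expr[a], expr[b]) for a, b in pairs) / len(pairs)


def chance_probability(expr, module):
    """Fraction of same-size gene groups at least as correlated as module."""
    observed = mean_pairwise_correlation(expr, module)
    groups = list(combinations(sorted(expr), len(module)))
    hits = sum(
        1 for g in groups if mean_pairwise_correlation(expr, g) >= observed
    )
    return observed, hits / len(groups)


# Hypothetical data: g1-g3 form a module (e.g. found in a protein
# interaction network); g4-g7 are unrelated background genes.
expr = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 11],
    "g3": [1, 3, 4, 6, 7],
    "g4": [5, 1, 4, 2, 3],
    "g5": [3, 3, 1, 5, 2],
    "g6": [2, 5, 1, 4, 3],
    "g7": [4, 2, 5, 1, 3],
}
module = ("g1", "g2", "g3")
observed, p_value = chance_probability(expr, module)
```

For genome-scale data, exhaustive enumeration is not feasible and the null distribution would normally be estimated from randomly sampled gene groups instead.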
Mahoney, J. Matthew; Taroni, Jaclyn; Martyanov, Viktor; Wood, Tammara A.; Greene, Casey S.; Pioli, Patricia A.; Hinchcliff, Monique E.; Whitfield, Michael L.
2015-01-01
Systemic sclerosis (SSc) is a rare systemic autoimmune disease characterized by skin and organ fibrosis. The pathogenesis of SSc and its progression are poorly understood. The SSc intrinsic gene expression subsets (inflammatory, fibroproliferative, normal-like, and limited) are observed in multiple clinical cohorts of patients with SSc. Analysis of longitudinal skin biopsies suggests that a patient's subset assignment is stable over 6–12 months. Genetically, SSc is multi-factorial with many genetic risk loci for SSc generally and for specific clinical manifestations. Here we identify the genes consistently associated with the intrinsic subsets across three independent cohorts, show the relationship between these genes using a gene-gene interaction network, and place the genetic risk loci in the context of the intrinsic subsets. To identify gene expression modules common to three independent datasets from three different clinical centers, we developed a consensus clustering procedure based on mutual information of partitions, an information theory concept, and performed a meta-analysis of these genome-wide gene expression datasets. We created a gene-gene interaction network of the conserved molecular features across the intrinsic subsets and analyzed their connections with SSc-associated genetic polymorphisms. The network is composed of distinct, but interconnected, components related to interferon activation, M2 macrophages, adaptive immunity, extracellular matrix remodeling, and cell proliferation. The network shows extensive connections between the inflammatory- and fibroproliferative-specific genes. The network also shows connections between these subset-specific genes and 30 SSc-associated polymorphic genes including STAT4, BLK, IRF7, NOTCH4, PLAUR, CSK, IRAK1, and several human leukocyte antigen (HLA) genes. 
Our analyses suggest that the gene expression changes underlying the SSc subsets may be long-lived, but mechanistically interconnected and related to a patient's underlying genetic risk. PMID:25569146
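The consensus clustering procedure described above is based on the mutual information of partitions. A minimal sketch of that quantity for two clusterings of the same genes is given below; it shows only the information-theoretic measure, not the authors' consensus procedure, and the function name and toy labels are hypothetical.

```python
import math
from collections import Counter


def partition_mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two partitions of the same items."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))

    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


# Module assignments of four genes in two hypothetical cohorts.
cohort_1 = [0, 0, 1, 1]
cohort_2 = ["x", "x", "y", "y"]   # same grouping, different label names
cohort_3 = [0, 1, 0, 1]           # grouping unrelated to cohort_1

same = partition_mutual_information(cohort_1, cohort_2)
unrelated = partition_mutual_information(cohort_1, cohort_3)
```

Because the measure depends only on how items are grouped and not on the label names, it can compare module assignments produced independently in different datasets.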
TRACING CO-REGULATORY NETWORK DYNAMICS IN NOISY, SINGLE-CELL TRANSCRIPTOME TRAJECTORIES.
Cordero, Pablo; Stuart, Joshua M
2017-01-01
The availability of gene expression data at the single cell level makes it possible to probe the molecular underpinnings of complex biological processes such as differentiation and oncogenesis. Promising new methods have emerged for reconstructing a progression 'trajectory' from static single-cell transcriptome measurements. However, it remains unclear how to adequately model the appreciable level of noise in these data to elucidate gene regulatory network rewiring. Here, we present a framework called Single Cell Inference of MorphIng Trajectories and their Associated Regulation (SCIMITAR) that infers progressions from static single-cell transcriptomes by employing a continuous parametrization of Gaussian mixtures in high-dimensional curves. SCIMITAR yields rich models from the data that highlight genes with expression and co-expression patterns that are associated with the inferred progression. Further, SCIMITAR extracts regulatory states from the implicated trajectory-evolving co-expression networks. We benchmark the method on simulated data to show that it yields accurate cell ordering and gene network inferences. Applied to the interpretation of a single-cell human fetal neuron dataset, SCIMITAR finds progression-associated genes in cornerstone neural differentiation pathways missed by standard differential expression tests. Finally, by leveraging the rewiring of gene-gene co-expression relations across the progression, the method reveals the rise and fall of co-regulatory states and trajectory-dependent gene modules. These analyses implicate new transcription factors in neural differentiation including putative co-factors for the multi-functional NFAT pathway.
The transfer and transformation of collective network information in gene-matched networks.
Kitsukawa, Takashi; Yagi, Takeshi
2015-10-09
Networks, such as the human society network, social and professional networks, and biological system networks, contain vast amounts of information. Information signals in networks are distributed over nodes and transmitted through intricately wired links, making the transfer and transformation of such information difficult to follow. Here we introduce a novel method for describing network information and its transfer using a model network, the Gene-matched network (GMN), in which nodes (neurons) possess attributes (genes). In the GMN, nodes are connected according to their expression of common genes. Because neurons have multiple genes, the GMN is cluster-rich. We show that, in the GMN, information transfer and transformation were controlled systematically, according to the activity level of the network. Furthermore, information transfer and transformation could be traced numerically with a vector using genes expressed in the activated neurons, the active-gene array, which was used to assess the relative activity among overlapping neuronal groups. Interestingly, this coding style closely resembles the cell-assembly neural coding theory. The method introduced here could be applied to many real-world networks, since many systems, including human society and various biological systems, can be represented as a network of this type.
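The wiring rule described in the abstract above (nodes are linked when they express a common gene) can be illustrated with a minimal sketch. The neuron names, gene names, and the count-based form of the active-gene array are assumptions made for illustration only; the abstract does not specify how the authors compute the array.

```python
from itertools import combinations


def build_gene_matched_network(node_genes):
    """Link every pair of nodes that express at least one common gene.

    node_genes maps a node name to the set of genes it expresses.
    Returns a dict mapping each linked pair to the genes they share.
    """
    edges = {}
    for a, b in combinations(sorted(node_genes), 2):
        shared = node_genes[a] & node_genes[b]
        if shared:
            edges[(a, b)] = shared
    return edges


def active_gene_array(node_genes, active_nodes):
    """Count, for each gene, how many active nodes express it.

    This count vector is an assumed stand-in for the abstract's
    "active-gene array", not the authors' definition.
    """
    counts = {}
    for node in active_nodes:
        for gene in node_genes[node]:
            counts[gene] = counts.get(gene, 0) + 1
    return counts


# Hypothetical toy network: four neurons, five genes.
neurons = {
    "n1": {"g1", "g2"},
    "n2": {"g2", "g3"},
    "n3": {"g3", "g4"},
    "n4": {"g5"},
}
edges = build_gene_matched_network(neurons)
array = active_gene_array(neurons, ["n1", "n2"])
```

In this toy case n1 and n2 are linked through g2, n2 and n3 through g3, and n4 stays isolated because it shares no gene with any other node.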
Yu, Hua; Jiao, Bingke; Lu, Lu; Wang, Pengfei; Chen, Shuangcheng; Liang, Chengzhi; Liu, Wei
2018-01-01
Accurately reconstructing gene co-expression networks is of great importance for uncovering the genetic architecture underlying complex and varied phenotypes. The recent availability of high-throughput RNA-seq has made genome-wide detection and quantification of novel, rare and low-abundance transcripts practical. However, its potential merits in reconstructing gene co-expression networks have still not been well explored. Using massive-scale RNA-seq samples, we have designed an ensemble pipeline, called NetMiner, for building genome-scale, high-quality Gene Co-expression Networks (GCNs) by integrating three frequently used inference algorithms. We constructed an RNA-seq-based GCN in rice, a monocot species. The quality of the network obtained by our method was verified and evaluated against curated gene functional association data sets, and it clearly outperformed each single method. In addition, the powerful capability of the network for associating genes with functions and agronomic traits was shown by enrichment analysis and case studies. In particular, we demonstrated the potential value of our proposed method to predict the biological roles of unknown protein-coding genes, long non-coding RNA (lncRNA) genes and circular RNA (circRNA) genes. Our results provided a valuable and highly reliable data source to select key candidate genes for subsequent experimental validation. To facilitate identification of novel genes regulating important biological processes and phenotypes in other plants or animals, we have published the source code of NetMiner, making it freely available at https://github.com/czllab/NetMiner.
Hiraishi, Kunihiko
2014-01-01
One of the significant topics in systems biology is the development of a control theory of gene regulatory networks (GRNs). In typical control of GRNs, the expression of some genes is inhibited (or activated) by manipulating external stimuli and the expression of other genes. Control theory of GRNs is expected to be applied to gene therapy technologies in the future. In this paper, a control method using a Boolean network (BN) is studied. A BN is widely used as a model of GRNs, and gene expression is expressed by a binary value (ON or OFF). In particular, a context-sensitive probabilistic Boolean network (CS-PBN), which is one of the extended models of BNs, is used. For CS-PBNs, the verification problem and the optimal control problem are considered. For the verification problem, a solution method using the probabilistic model checker PRISM is proposed. For the optimal control problem, a solution method using polynomial optimization is proposed. Finally, a numerical example on the WNT5A network, which is related to melanoma, is presented. The proposed methods provide useful tools for the control theory of GRNs. PMID:24587766
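To make the Boolean network idea in the abstract above concrete, the following is a minimal sketch of a plain, deterministic BN with synchronous updates. It is not the context-sensitive probabilistic variant (CS-PBN) the paper studies, and the three genes and their update rules are hypothetical.

```python
def step(state, rules):
    """Synchronously update every gene from the current state."""
    return {gene: rule(state) for gene, rule in rules.items()}


def trajectory(state, rules, n_steps):
    """Return the list of states visited over n_steps updates."""
    states = [state]
    for _ in range(n_steps):
        state = step(state, rules)
        states.append(state)
    return states


# Hypothetical three-gene network; each value is ON (True) or OFF (False).
rules = {
    "A": lambda s: not s["C"],         # C represses A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A and B jointly activate C
}
start = {"A": True, "B": False, "C": False}
path = trajectory(start, rules, 4)
```

Starting from A ON only, the toy network switches on B, then C, after which C represses A and the network falls to the all-OFF state after four updates. A probabilistic variant would choose among several rule sets at each step, which this sketch leaves out.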
Lin, Wen-Hsien; Liu, Wei-Chung; Hwang, Ming-Jing
2009-03-11
Human cells of various tissue types differ greatly in morphology despite having the same set of genetic information. Some genes are expressed in all cell types to perform house-keeping functions, while some are selectively expressed to perform tissue-specific functions. In this study, we wished to elucidate how proteins encoded by human house-keeping genes and tissue-specific genes are organized in human protein-protein interaction networks. We constructed protein-protein interaction networks for different tissue types using two gene expression datasets and one protein-protein interaction database. We then calculated three network indices of topological importance, the degree, closeness, and betweenness centralities, to measure the network position of proteins encoded by house-keeping and tissue-specific genes, and quantified their local connectivity structure. Compared to a random selection of proteins, house-keeping gene-encoded proteins tended to have a greater number of directly interacting neighbors and occupy network positions in several shortest paths of interaction between protein pairs, whereas tissue-specific gene-encoded proteins did not. In addition, house-keeping gene-encoded proteins tended to connect with other house-keeping gene-encoded proteins in all tissue types, whereas tissue-specific gene-encoded proteins also tended to connect with other tissue-specific gene-encoded proteins, but only in approximately half of the tissue types examined. Our analysis showed that house-keeping gene-encoded proteins tend to occupy important network positions, while those encoded by tissue-specific genes do not. The biological implications of our findings were discussed and we proposed a hypothesis regarding how cells organize their protein tools in protein-protein interaction networks. 
Our results led us to speculate that house-keeping gene-encoded proteins might form a core in human protein-protein interaction networks, while clusters of tissue-specific gene-encoded proteins are attached to the core at more peripheral positions of the networks.
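Two of the three network indices named in the abstract above, degree and closeness centrality, can be sketched in a few lines for an unweighted, connected interaction graph. The normalisations used here are common textbook choices and are assumptions, since the abstract does not state them; betweenness is omitted for brevity, and the toy graph is hypothetical.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node interacts with directly."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def shortest_path_lengths(graph, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness_centrality(graph):
    """Number of reachable nodes divided by the sum of distances to them."""
    result = {}
    for v in graph:
        dist = shortest_path_lengths(graph, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


# Hypothetical star-shaped network: one core protein, three peripheral ones.
ppi = {
    "HK": {"T1", "T2", "T3"},
    "T1": {"HK"},
    "T2": {"HK"},
    "T3": {"HK"},
}
deg = degree_centrality(ppi)
clo = closeness_centrality(ppi)
```

In this toy star, the core node "HK" scores 1.0 on both indices while each peripheral node scores lower, which mirrors the core-and-periphery picture the abstract speculates about.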
Network-Induced Classification Kernels for Gene Expression Profile Analysis
Dror, Gideon; Shamir, Ron
2012-01-01
Computational classification of gene expression profiles into distinct disease phenotypes has been highly successful to date. Still, robustness, accuracy, and biological interpretation of the results have been limited, and it was suggested that use of protein interaction information jointly with the expression profiles can improve the results. Here, we study three aspects of this problem. First, we show that interactions are indeed relevant by showing that co-expressed genes tend to be closer in the network of interactions. Second, we show that the improved performance of one extant method utilizing expression and interactions is not really due to the biological information in the network, while in another method this is not the case. Finally, we develop a new kernel method, called NICK, that integrates network and expression data for SVM classification, and demonstrate that overall it achieves better results than extant methods while running two orders of magnitude faster. PMID:22697242
Shi, Rui; Wang, Jack P; Lin, Ying-Chung; Li, Quanzi; Sun, Ying-Hsuan; Chen, Hao; Sederoff, Ronald R; Chiang, Vincent L
2017-05-01
Co-expression networks based on transcriptomes of Populus trichocarpa major tissues and specific cell types suggest redundant control of cell wall component biosynthetic genes by transcription factors in wood formation. We analyzed the transcriptomes of five tissues (xylem, phloem, shoot, leaf, and root) and two wood forming cell types (fiber and vessel) of Populus trichocarpa to assemble gene co-expression subnetworks associated with wood formation. We identified 165 transcription factors (TFs) that showed xylem-, fiber-, and vessel-specific expression. Of these 165 TFs, 101 co-expressed (correlation coefficient, r > 0.7) with the 45 secondary cell wall cellulose, hemicellulose, and lignin biosynthetic genes. Each cell wall component gene co-expressed on average with 34 TFs, suggesting redundant control of the cell wall component gene expression. Co-expression analysis showed that the 101 TFs and the 45 cell wall component genes each has two distinct groups (groups 1 and 2), based on their co-expression patterns. The group 1 TFs (44 members) are predominantly xylem and fiber specific, and are all highly positively co-expressed with the group 1 cell wall component genes (30 members), suggesting their roles as major wood formation regulators. Group 1 TFs include a lateral organ boundary domain gene (LBD) that has the highest number of positively correlated cell wall component genes (36) and TFs (47). The group 2 TFs have 57 members, including 14 vessel-specific TFs, and are generally less correlated with the cell wall component genes. An exception is a vessel-specific basic helix-loop-helix (bHLH) gene that negatively correlates with 20 cell wall component genes, and may function as a key transcriptional suppressor. The co-expression networks revealed here suggest a well-structured transcriptional homeostasis for cell wall component biosynthesis during wood formation.
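The co-expression criterion quoted in the abstract above (correlation coefficient r > 0.7) can be sketched as a simple thresholding step. The use of Pearson correlation, the profile names, and the expression values are assumptions made for illustration; the abstract does not say which correlation measure was used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def coexpressed_pairs(tf_profiles, gene_profiles, threshold=0.7):
    """List (TF, gene, r) for every pair whose correlation exceeds threshold."""
    pairs = []
    for tf, tf_expr in tf_profiles.items():
        for gene, gene_expr in gene_profiles.items():
            r = pearson(tf_expr, gene_expr)
            if r > threshold:
                pairs.append((tf, gene, r))
    return pairs


# Hypothetical expression values across five samples.
tfs = {"TF_a": [1, 2, 3, 4, 5], "TF_b": [5, 4, 3, 2, 1]}
wall_genes = {"gene_x": [2, 4, 6, 8, 10]}
pairs = coexpressed_pairs(tfs, wall_genes)
```

Only TF_a passes the threshold here. TF_b is perfectly anti-correlated with gene_x, so a one-sided cutoff drops it; negative relationships such as the bHLH gene mentioned in the abstract would need a separate test on r < -threshold.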
Xu, Haoming; Moni, Mohammad Ali; Liò, Pietro
2015-12-01
In cancer genomics, gene expression levels provide important molecular signatures for all types of cancer, and this could be very useful for predicting the survival of cancer patients. However, the main challenge of gene expression data analysis is high dimensionality, and microarray data are characterised by a small number of samples and a large number of genes. To overcome this problem, a variety of penalised Cox proportional hazard models have been proposed. We introduce a novel network regularised Cox proportional hazard model and a novel multiplex network model to measure the disease comorbidities and to predict survival of the cancer patient. Our methods are applied to analyse seven microarray cancer gene expression datasets: breast cancer, ovarian cancer, lung cancer, liver cancer, renal cancer and osteosarcoma. Firstly, we applied a principal component analysis to reduce the dimensionality of original gene expression data. Secondly, we applied a network regularised Cox regression model on the reduced gene expression datasets. By using normalised mutual information method and multiplex network model, we predict the comorbidities for the liver cancer based on the integration of a diverse set of omics and clinical data, and we find the diseasome associations (disease-gene association) among different cancers based on the identified common significant genes. Finally, we evaluated the precision of the approach with respect to the accuracy of survival prediction using ROC curves. We report that colon cancer, liver cancer and renal cancer share the CXCL5 gene, and breast cancer, ovarian cancer and renal cancer share the CCND2 gene. Our methods are useful to predict survival of the patient and disease comorbidities more accurately and helpful for improvement of the care of patients with comorbidity. Software in Matlab and R is available on our GitHub page: https://github.com/ssnhcom/NetworkRegularisedCox.git. Copyright © 2015. Published by Elsevier Ltd.
Kar, Siddhartha P.; Tyrer, Jonathan P.; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K.H.; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V.; Bean, Yukie T.; Beckmann, Matthias W.; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S.; Cramer, Daniel; Cunningham, Julie M.; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A.; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F.; Edwards, Robert P.; Ekici, Arif B.; Fasching, Peter A.; Fridley, Brooke L.; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G.; Glasspool, Rosalind; Goode, Ellen L.; Goodman, Marc T.; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A.T.; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K.; Hosono, Satoyo; Iversen, Edwin S.; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K.; Kelemen, Linda E.; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A.; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D.; Lee, Alice W.; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A.; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R.; McNeish, Iain A.; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B.; Narod, Steven A.; Nedergaard, Lotte; Ness, Roberta B.; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H.; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M.; Permuth-Wey, Jennifer; Phelan, Catherine M.; Pike, Malcolm C.; Poole, Elizabeth M.; Ramus, Susan J.; Risch, Harvey A.; Rosen, Barry; Rossing, Mary 
Anne; Rothstein, Joseph H.; Rudolph, Anja; Runnebaum, Ingo B.; Rzepecka, Iwona K.; Salvesen, Helga B.; Schildkraut, Joellen M.; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C.; Sucheston-Campbell, Lara E.; Tangen, Ingvild L.; Teo, Soo-Hwang; Terry, Kathryn L.; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S.; van Altena, Anne M.; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A.; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S.; Wicklund, Kristine G.; Wilkens, Lynne R.; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A.; Monteiro, Alvaro N. A.; Freedman, Matthew L.; Gayther, Simon A.; Pharoah, Paul D. P.
2015-01-01
Background Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by co-expression may also be enriched for additional EOC risk associations. Methods We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly co-expressed with each selected TF gene in the unified microarray data set of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this data set were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Results Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P<0.05 and FDR<0.05). These results were replicated (P<0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. Conclusion We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Impact Network analysis integrating large, context-specific data sets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. PMID:26209509
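The abstract above uses mutual information as the co-expression measure. As a minimal sketch, mutual information between two genes can be estimated from expression values that have already been discretised into bins. The binning and the plug-in estimator are assumptions made for illustration; the abstract does not describe the estimator the authors used.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Plug-in estimate of mutual information (in bits) of two discrete series."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), count in pxy.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi


# Hypothetical binned expression (0 = low, 1 = high) across four tumors.
tf_gene = [0, 0, 1, 1]
target = [0, 0, 1, 1]     # tracks the TF gene exactly
unrelated = [0, 1, 0, 1]  # statistically independent of the TF gene

mi_target = mutual_information(tf_gene, target)
mi_unrelated = mutual_information(tf_gene, unrelated)
```

The tracking gene reaches the maximum of 1 bit for a balanced binary profile, while the independent gene scores 0. A network around a TF gene would keep the genes whose score exceeds some chosen cutoff.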
Selection Shapes Transcriptional Logic and Regulatory Specialization in Genetic Networks.
Fogelmark, Karl; Peterson, Carsten; Troein, Carl
2016-01-01
Living organisms need to regulate their gene expression in response to environmental signals and internal cues. This is a computational task where genes act as logic gates that connect to form transcriptional networks, which are shaped at all scales by evolution. Large-scale mutations such as gene duplications and deletions add and remove network components, whereas smaller mutations alter the connections between them. Selection determines what mutations are accepted, but its importance for shaping the resulting networks has been debated. To investigate the effects of selection in the shaping of transcriptional networks, we derive transcriptional logic from a combinatorially powerful yet tractable model of the binding between DNA and transcription factors. By evolving the resulting networks based on their ability to function as either a simple decision system or a circadian clock, we obtain information on the regulation and logic rules encoded in functional transcriptional networks. Comparisons are made between networks evolved for different functions, as well as with structurally equivalent but non-functional (neutrally evolved) networks, and predictions are validated against the transcriptional network of E. coli. We find that the logic rules governing gene expression depend on the function performed by the network. Unlike the decision systems, the circadian clocks show strong cooperative binding and negative regulation, which achieves tight temporal control of gene expression. Furthermore, we find that transcription factors act preferentially as either activators or repressors, both when binding multiple sites for a single target gene and globally in the transcriptional networks. This separation into positive and negative regulators requires gene duplications, which highlights the interplay between mutation and selection in shaping the transcriptional networks.
Cooperative Adaptive Responses in Gene Regulatory Networks with Many Degrees of Freedom
Inoue, Masayo; Kaneko, Kunihiko
2013-01-01
Cells generally adapt to environmental changes by first exhibiting an immediate response and then gradually returning to their original state to achieve homeostasis. Although simple network motifs consisting of a few genes have been shown to exhibit such adaptive dynamics, they do not reflect the complexity of real cells, where the expression of a large number of genes activates or represses other genes, permitting adaptive behaviors. Here, we investigated the responses of gene regulatory networks containing many genes that have undergone numerical evolution to achieve high fitness due to the adaptive response of only a single target gene; this single target gene responds to changes in external inputs and later returns to basal levels. Despite setting a single target, most genes showed adaptive responses after evolution. Such adaptive dynamics were not due to common motifs within a few genes; even without such motifs, almost all genes showed adaptation, albeit sometimes partial adaptation, in the sense that expression levels did not always return to original levels. The genes split into two groups: genes in the first group exhibited an initial increase in expression and then returned to basal levels, while genes in the second group exhibited the opposite changes in expression. From this model, genes in the first group received positive input from other genes within the first group, but negative input from genes in the second group, and vice versa. Thus, the adaptation dynamics of genes from both groups were consolidated. This cooperative adaptive behavior was commonly observed if the number of genes involved was larger than the order of ten. These results have implications in the collective responses of gene expression networks in microarray measurements of yeast Saccharomyces cerevisiae and the significance to the biological homeostasis of systems with many components. PMID:23592959
Genomic survey, expression profile and co-expression network analysis of OsWD40 family in rice
2012-01-01
Background WD40 proteins represent a large family in eukaryotes, which have been involved in a broad spectrum of crucial functions. Systematic characterization and co-expression analysis of OsWD40 genes enable us to understand the networks of the WD40 proteins and their biological processes and gene functions in rice. Results In this study, we identify and analyze 200 potential OsWD40 genes in rice, describing their gene structures, genome localizations, and evolutionary relationship of each member. Expression profiles covering the whole life cycle in rice have revealed that transcripts of OsWD40 genes accumulated differentially during vegetative and reproductive development and were preferentially up- or down-regulated in different tissues. Under phytohormone treatments, 25 OsWD40 genes were differentially expressed with treatments of one or more of the phytohormones NAA, KT, or GA3 in rice seedlings. We also used a combined analysis of expression correlation and Gene Ontology annotation to infer the biological role of the OsWD40 genes in rice. The results suggested that OsWD40 genes may perform their diverse functions through a complex network, and were thus informative for understanding their biological pathways. The analysis also revealed that OsWD40 genes might interact with each other to take part in metabolic pathways, suggesting a more complex feedback network. Conclusions All of these analyses suggest that the functions of OsWD40 genes are diversified, providing useful references for selecting candidate genes for further functional studies. PMID:22429805
Fyn-Dependent Gene Networks in Acute Ethanol Sensitivity
Farris, Sean P.; Miles, Michael F.
2013-01-01
Studies in humans and animal models document that acute behavioral responses to ethanol are a predisposing factor for the risk of long-term drinking behavior. Prior microarray data from our laboratory document strain- and brain region-specific variation in gene expression profile responses to acute ethanol that may be underlying regulators of ethanol behavioral phenotypes. The non-receptor tyrosine kinase Fyn has previously been mechanistically implicated in the sedative-hypnotic response to acute ethanol. To further understand how Fyn may modulate ethanol behaviors, we used whole-genome expression profiling. We characterized basal and acute ethanol-evoked (3 g/kg) gene expression patterns in nucleus accumbens (NAC), prefrontal cortex (PFC), and ventral midbrain (VMB) of control and Fyn knockout mice. Bioinformatics analysis identified a set of Fyn-related gene networks differently regulated by acute ethanol across the three brain regions. In particular, our analysis suggested a coordinate basal decrease in myelin-associated gene expression within NAC and PFC as an underlying factor in sensitivity of Fyn null animals to ethanol sedation. An in silico analysis across the BXD recombinant inbred (RI) strains of mice identified a significant correlation between Fyn expression and a previously published ethanol loss-of-righting-reflex (LORR) phenotype. By combining PFC gene expression correlates to Fyn and LORR across multiple genomic datasets, we identified robust Fyn-centric gene networks related to LORR. Our results thus suggest that multiple system-wide changes exist within specific brain regions of Fyn knockout mice, and that distinct Fyn-dependent expression networks within PFC may be important determinants of the LORR due to acute ethanol. These results add to the interpretation of acute ethanol behavioral sensitivity in Fyn kinase null animals, and identify Fyn-centric gene networks influencing variance in ethanol LORR. 
Such networks may also inform future design of pharmacotherapies for the treatment and prevention of alcohol use disorders. PMID:24312422
Loohuis, Nikkie FM Olde; Kasri, Nael Nadif; Glennon, Jeffrey C; van Bokhoven, Hans; Hébert, Sébastien S; Kaplan, Barry B.; Martens, Gerard JM; Aschrafi, Armaz
2016-01-01
MicroRNAs (miRs) are small regulatory molecules, which orchestrate neuronal development and plasticity through modulation of complex gene networks. microRNA-137 (miR-137) is a brain-enriched RNA with a critical role in regulating brain development and in mediating synaptic plasticity. Importantly, mutations in this miR are associated with the pathoetiology of schizophrenia (SZ), and there is a widespread assumption that disruptions in miR-137 expression lead to aberrant expression of gene regulatory networks associated with SZ. To systematically identify the mRNA targets for this miR, we performed miR-137 gain- and loss-of-function experiments in primary rat hippocampal neurons and profiled differentially expressed mRNAs through next-generation sequencing. We identified 500 genes that were bidirectionally activated or repressed in their expression by the modulation of miR-137 levels. Gene ontology analysis using two independent software resources suggested functions for these miR-137-regulated genes in neurodevelopmental processes, neuronal maturation processes and cell maintenance, all of which are known to be critical for proper brain circuitry formation. Since many of the putative miR-137 targets identified here also have been previously shown to be associated with SZ, we propose that this miR acts as a critical gene network hub contributing to the pathophysiology of this neurodevelopmental disorder. PMID:26925706
Gene co-expression networks shed light into diseases of brain iron accumulation
Bettencourt, Conceição; Forabosco, Paola; Wiethoff, Sarah; Heidari, Moones; Johnstone, Daniel M.; Botía, Juan A.; Collingwood, Joanna F.; Hardy, John; Milward, Elizabeth A.; Ryten, Mina; Houlden, Henry
2016-01-01
Aberrant brain iron deposition is observed in both common and rare neurodegenerative disorders, including those categorized as Neurodegeneration with Brain Iron Accumulation (NBIA), which are characterized by focal iron accumulation in the basal ganglia. Two NBIA genes are directly involved in iron metabolism, but whether other NBIA-related genes also regulate iron homeostasis in the human brain, and whether aberrant iron deposition contributes to neurodegenerative processes remains largely unknown. This study aims to expand our understanding of these iron overload diseases and identify relationships between known NBIA genes and their main interacting partners by using a systems biology approach. We used whole-transcriptome gene expression data from human brain samples originating from 101 neuropathologically normal individuals (10 brain regions) to generate weighted gene co-expression networks and cluster the 10 known NBIA genes in an unsupervised manner. We investigated NBIA-enriched networks for relevant cell types and pathways, and whether they are disrupted by iron loading in NBIA diseased tissue and in an in vivo mouse model. We identified two basal ganglia gene co-expression modules significantly enriched for NBIA genes, which resemble neuronal and oligodendrocytic signatures. These NBIA gene networks are enriched for iron-related genes, and implicate synapse and lipid metabolism related pathways. Our data also indicates that these networks are disrupted by excessive brain iron loading. We identified multiple cell types in the origin of NBIA disorders. We also found unforeseen links between NBIA networks and iron-related processes, and demonstrate convergent pathways connecting NBIAs and phenotypically overlapping diseases. Our results are of further relevance for these diseases by providing candidates for new causative genes and possible points for therapeutic intervention. PMID:26707700
Gene co-expression networks shed light into diseases of brain iron accumulation.
Bettencourt, Conceição; Forabosco, Paola; Wiethoff, Sarah; Heidari, Moones; Johnstone, Daniel M; Botía, Juan A; Collingwood, Joanna F; Hardy, John; Milward, Elizabeth A; Ryten, Mina; Houlden, Henry
2016-03-01
Aberrant brain iron deposition is observed in both common and rare neurodegenerative disorders, including those categorized as Neurodegeneration with Brain Iron Accumulation (NBIA), which are characterized by focal iron accumulation in the basal ganglia. Two NBIA genes are directly involved in iron metabolism, but whether other NBIA-related genes also regulate iron homeostasis in the human brain, and whether aberrant iron deposition contributes to neurodegenerative processes remains largely unknown. This study aims to expand our understanding of these iron overload diseases and identify relationships between known NBIA genes and their main interacting partners by using a systems biology approach. We used whole-transcriptome gene expression data from human brain samples originating from 101 neuropathologically normal individuals (10 brain regions) to generate weighted gene co-expression networks and cluster the 10 known NBIA genes in an unsupervised manner. We investigated NBIA-enriched networks for relevant cell types and pathways, and whether they are disrupted by iron loading in NBIA diseased tissue and in an in vivo mouse model. We identified two basal ganglia gene co-expression modules significantly enriched for NBIA genes, which resemble neuronal and oligodendrocytic signatures. These NBIA gene networks are enriched for iron-related genes, and implicate synapse and lipid metabolism related pathways. Our data also indicates that these networks are disrupted by excessive brain iron loading. We identified multiple cell types in the origin of NBIA disorders. We also found unforeseen links between NBIA networks and iron-related processes, and demonstrate convergent pathways connecting NBIAs and phenotypically overlapping diseases. Our results are of further relevance for these diseases by providing candidates for new causative genes and possible points for therapeutic intervention. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Briggs, Christine E; Wang, Yulei; Kong, Benjamin; Woo, Tsung-Ung W; Iyer, Lakshmanan K; Sonntag, Kai C
2015-08-27
The degeneration of substantia nigra (SN) dopamine (DA) neurons in sporadic Parkinson's disease (PD) is characterized by disturbed gene expression networks. Micro(mi)RNAs are post-transcriptional regulators of gene expression and we recently provided evidence that these molecules may play a functional role in the pathogenesis of PD. Here, we document a comprehensive analysis of miRNAs in SN DA neurons and PD, including sex differences. Our data show that miRNAs are dysregulated in disease-affected neurons and differentially expressed between male and female samples with a trend of more up-regulated miRNAs in males and more down-regulated miRNAs in females. Unbiased Ingenuity Pathway Analysis (IPA) revealed a network of miRNA/target-gene associations that is consistent with dysfunctional gene and signaling pathways in PD pathology. Our study provides evidence for a general association of miRNAs with the cellular function and identity of SN DA neurons, and with deregulated gene expression networks and signaling pathways related to PD pathogenesis that may be sex-specific. Copyright © 2015 Elsevier B.V. All rights reserved.
Using expression genetics to study the neurobiology of ethanol and alcoholism.
Farris, Sean P; Wolen, Aaron R; Miles, Michael F
2010-01-01
Recent simultaneous progress in human and animal model genetics and the advent of microarray whole genome expression profiling have produced prodigious data sets on genetic loci, potential candidate genes, and differential gene expression related to alcoholism and ethanol behaviors. Validated target genes or gene networks functioning in alcoholism are still of meager proportions. Genetical genomics, which combines genetic analysis of both traditional phenotypes and whole genome expression data, offers a potential methodology for characterizing brain gene networks functioning in alcoholism. This chapter will describe concepts, approaches, and recent findings in the field of genetical genomics as it applies to alcohol research. Copyright 2010 Elsevier Inc. All rights reserved.
2011-01-01
Background Global transcriptional analysis of loblolly pine (Pinus taeda L.) is challenging due to limited molecular tools. PtGen2, a 26,496 feature cDNA microarray, was fabricated and used to assess drought-induced gene expression in loblolly pine propagule roots. Statistical analysis of differential expression and weighted gene correlation network analysis were used to identify drought-responsive genes and further characterize the molecular basis of drought tolerance in loblolly pine. Results Microarrays were used to interrogate root cDNA populations obtained from 12 genotype × treatment combinations (four genotypes, three watering regimes). Comparison of drought-stressed roots with roots from the control treatment identified 2445 genes displaying at least a 1.5-fold expression difference (false discovery rate = 0.01). Genes commonly associated with drought response in pine and other plant species, as well as a number of abiotic and biotic stress-related genes, were up-regulated in drought-stressed roots. Only 76 genes were identified as differentially expressed in drought-recovered roots, indicating that the transcript population can return to the pre-drought state within 48 hours. Gene correlation analysis predicts a scale-free network topology and identifies eleven co-expression modules that ranged in size from 34 to 938 members. Network topological parameters identified a number of central nodes (hubs) including those with significant homology (E-values ≤ 2 × 10⁻³⁰) to 9-cis-epoxycarotenoid dioxygenase, zeatin O-glucosyltransferase, and ABA-responsive protein. Identified hubs also include genes that have been associated previously with osmotic stress, phytohormones, enzymes that detoxify reactive oxygen species, and several genes of unknown function. Conclusion PtGen2 was used to evaluate transcriptome responses in loblolly pine and was leveraged to identify 2445 differentially expressed genes responding to severe drought stress in roots. 
Many of the genes identified are known to be up-regulated in response to osmotic stress in pine and other plant species and encode proteins involved in both signal transduction and stress tolerance. Gene expression levels returned to control values within a 48-hour recovery period in all but 76 transcripts. Correlation network analysis indicates a scale-free network topology for the pine root transcriptome and identifies central nodes that may serve as drivers of drought-responsive transcriptome dynamics in the roots of loblolly pine. PMID:21609476
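The weighted correlation network step in the abstract above can be illustrated with a minimal sketch. It assumes a soft-threshold adjacency of the form |r|^beta, which is a common choice in weighted correlation network analysis but is not confirmed as the exact setting used in the study. The gene names, expression values, and beta = 6 are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def soft_threshold_adjacency(expr, beta=6):
    """Weighted adjacency a_ij = |cor(i, j)| ** beta (beta is an assumed value)."""
    genes = sorted(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            w = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = w
            adj[h][g] = w
    return adj

def connectivity(adj):
    """Connectivity of each gene: the sum of its edge weights."""
    return {g: sum(nbrs.values()) for g, nbrs in adj.items()}

# Hypothetical expression profiles across five samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}
adj = soft_threshold_adjacency(expr)
k = connectivity(adj)
hub = max(k, key=k.get)  # the most connected gene is a hub candidate
print(hub, k)
```

Raising the correlation to a power suppresses weak correlations while keeping strong ones, so highly connected genes stand out as hub candidates. Taking the absolute value treats strong negative correlation as a connection, which is one possible design choice (an unsigned network).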
Dynamics of Bacterial Gene Regulatory Networks.
Shis, David L; Bennett, Matthew R; Igoshin, Oleg A
2018-05-20
The ability of bacterial cells to adjust their gene expression program in response to environmental perturbation is often critical for their survival. Recent experimental advances that allow us to quantitatively record gene expression dynamics in single cells and in populations, coupled with mathematical modeling, enable a mechanistic understanding of how these responses are shaped by the underlying regulatory networks. Here, we review how the combination of local and global factors affects dynamical responses of gene regulatory networks. Our goal is to discuss the general principles that allow extrapolation from a few model bacteria to less understood microbes. We emphasize that, in addition to well-studied effects of network architecture, network dynamics are shaped by global pleiotropic effects and cell physiology.
Henríquez-Valencia, Carlos; Arenas-M, Anita; Medina, Joaquín; Canales, Javier
2018-01-01
Sulfur is an essential nutrient for plant growth and development. Sulfur is a constituent of proteins, the plasma membrane and cell walls, among other important cellular components. To obtain new insights into the gene regulatory networks underlying the sulfate response, we performed an integrative meta-analysis of transcriptomic data from five different sulfate experiments available in public databases. This bioinformatic approach allowed us to identify a robust set of genes whose expression depends only on sulfate availability, indicating that those genes play an important role in the sulfate response. In relation to sulfate metabolism, the biological function of approximately 45% of these genes is currently unknown. Moreover, we found several consistent Gene Ontology terms related to biological processes that have not been extensively studied in the context of the sulfate response; these processes include cell wall organization, carbohydrate metabolism, nitrogen compound transport, and the regulation of proteolysis. Gene co-expression network analyses revealed relationships between the sulfate-responsive genes that were distributed among seven function-specific co-expression modules. The most connected genes in the sulfate co-expression network belong to a module related to the carbon response, suggesting that this biological function plays an important role in the control of the sulfate response. Temporal analyses of the network suggest that sulfate starvation generates a biphasic response, in which major changes in gene expression occur during both the early and late responses. Network analyses predicted that the sulfate response is regulated by a limited number of transcription factors, including MYBs, bZIPs, and NF-YAs. In conclusion, our analysis identified new candidate genes and provided new hypotheses to advance our understanding of the transcriptional regulation of sulfate metabolism in plants. PMID:29692794
Genomic connectivity networks based on the BrainSpan atlas of the developing human brain
NASA Astrophysics Data System (ADS)
Mahfouz, Ahmed; Ziats, Mark N.; Rennert, Owen M.; Lelieveldt, Boudewijn P. F.; Reinders, Marcel J. T.
2014-03-01
The human brain comprises systems of networks that span the molecular, cellular, anatomic and functional levels. Molecular studies of the developing brain have focused on elucidating networks among gene products that may drive cellular brain development by functioning together in biological pathways. On the other hand, studies of the brain connectome attempt to determine how anatomically distinct brain regions are connected to each other, either anatomically (diffusion tensor imaging) or functionally (functional MRI and EEG), and how they change over development. A global examination of the relationship between gene expression and connectivity in the developing human brain is necessary to understand how the genetic signature of different brain regions instructs connections to other regions. Furthermore, analyzing the development of connectivity networks based on the spatio-temporal dynamics of gene expression provides a new insight into the effect of neurodevelopmental disease genes on brain networks. In this work, we construct connectivity networks between brain regions based on the similarity of their gene expression signature, termed "Genomic Connectivity Networks" (GCNs). Genomic connectivity networks were constructed using data from the BrainSpan Transcriptional Atlas of the Developing Human Brain. Our goal was to understand how the genetic signatures of anatomically distinct brain regions relate to each other across development. We assessed the neurodevelopmental changes in connectivity patterns of brain regions when networks were constructed with genes implicated in the neurodevelopmental disorder autism (autism spectrum disorder; ASD). Using graph theory metrics to characterize the GCNs, we show that ASD-GCNs are relatively less connected later in development with the cerebellum showing a very distinct expression of ASD-associated genes compared to other brain regions.
dynGENIE3: dynamical GENIE3 for the inference of gene networks from time series expression data.
Huynh-Thu, Vân Anh; Geurts, Pierre
2018-02-21
The elucidation of gene regulatory networks is one of the major challenges of systems biology. Measurements about genes that are exploited by network inference methods are typically available either in the form of steady-state expression vectors or time series expression data. In our previous work, we proposed the GENIE3 method that exploits variable importance scores derived from Random forests to identify the regulators of each target gene. This method provided state-of-the-art performance on several benchmark datasets, but it could however not specifically be applied to time series expression data. We propose here an adaptation of the GENIE3 method, called dynamical GENIE3 (dynGENIE3), for handling both time series and steady-state expression data. The proposed method is evaluated extensively on the artificial DREAM4 benchmarks and on three real time series expression datasets. Although dynGENIE3 does not systematically yield the best performance on each and every network, it is competitive with diverse methods from the literature, while preserving the main advantages of GENIE3 in terms of scalability.
Porcine Tissue-Specific Regulatory Networks Derived from Meta-Analysis of the Transcriptome
Pérez-Montarelo, Dafne; Hudson, Nicholas J.; Fernández, Ana I.; Ramayo-Caldas, Yuliaxis; Dalrymple, Brian P.; Reverter, Antonio
2012-01-01
The processes that drive tissue identity and differentiation remain unclear for most tissue types, as do the gene networks and transcription factors (TF) responsible for the differential structure and function of each particular tissue; this is particularly true for non-model species with incomplete genomic resources. To better understand the regulation of genes responsible for tissue identity in pigs, we have inferred regulatory networks from a meta-analysis of 20 gene expression studies spanning 480 Porcine Affymetrix chips for 134 experimental conditions on 27 distinct tissues. We developed a mixed-model normalization approach with a covariance structure that accommodated the disparity in the origin of the individual studies, and obtained the normalized expression of 12,320 genes across the 27 tissues. Using this resource, we constructed a network based on the co-expression patterns of 1,072 TF and 1,232 tissue-specific genes. The resulting network is consistent with the known biology of tissue development. Within the network, genes clustered by tissue and tissues clustered by site of embryonic origin. These clusters were significantly enriched for genes annotated in key relevant biological processes and confirm gene functions and interactions from the literature. We implemented a Regulatory Impact Factor (RIF) metric to identify the key regulators in skeletal muscle and tissues from the central nervous system. The normalization of the meta-analysis, the inference of the gene co-expression network and the RIF metric operated synergistically towards a successful search for tissue-specific regulators. Notable among these findings is evidence suggesting a novel key role of ERCC3 as a muscle regulator. Together, our results recapitulate the known biology behind tissue specificity and provide valuable new insights in a less studied but valuable model species. PMID:23049964
Mistry, Divya; Wise, Roger P; Dickerson, Julie A
2017-01-01
Identification of central genes and proteins in biomolecular networks provides credible candidates for pathway analysis, functional analysis, and essentiality prediction. The DiffSLC centrality measure predicts central and essential genes and proteins using a protein-protein interaction network. Network centrality measures prioritize nodes and edges based on their importance to the network topology. These measures helped identify critical genes and proteins in biomolecular networks. The proposed centrality measure, DiffSLC, combines the number of interactions of a protein and the gene coexpression values of genes from which those proteins were translated, as a weighting factor to bias the identification of essential proteins in a protein interaction network. Potentially essential proteins with low node degree are promoted through eigenvector centrality. Thus, the gene coexpression values are used in conjunction with the eigenvector of the network's adjacency matrix and edge clustering coefficient to improve essentiality prediction. The outcome of this prediction is shown using three variations: (1) inclusion or exclusion of gene co-expression data, (2) impact of different coexpression measures, and (3) impact of different gene expression data sets. For a total of seven networks, DiffSLC is compared to other centrality measures using Saccharomyces cerevisiae protein interaction networks and gene expression data. Comparisons are also performed for the top ranked proteins against the known essential genes from the Saccharomyces Gene Deletion Project, which show that DiffSLC detects more essential proteins and has a higher area under the ROC curve than other compared methods. This makes DiffSLC a stronger alternative to other centrality methods for detecting essential genes using a protein-protein interaction network that obeys centrality-lethality principle. DiffSLC is implemented using the igraph package in R, and networkx package in Python. 
The python package can be obtained from git.io/diffslcpy. The R implementation and code to reproduce the analysis is available via git.io/diffslc.
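As a rough illustration of one ingredient named in the abstract above, the following sketch computes eigenvector centrality on a small weighted interaction network by power iteration. It is not the DiffSLC formula, which also involves node degree and the edge clustering coefficient; the protein names and the co-expression weights on the edges are hypothetical.

```python
def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Eigenvector centrality of a weighted undirected graph by power iteration.

    adj maps each node to a dict {neighbour: weight}. The update x <- x + A x
    iterates with (A + I), which has the same eigenvectors as A and avoids
    oscillation on bipartite graphs.
    """
    nodes = sorted(adj)
    x = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = {n: x[n] + sum(w * x[m] for m, w in adj[n].items()) for n in nodes}
        norm = sum(v * v for v in new.values()) ** 0.5
        new = {n: v / norm for n, v in new.items()}
        if sum(abs(new[n] - x[n]) for n in nodes) < tol:
            return new
        x = new
    return x

# Hypothetical protein-protein interactions, each weighted by an assumed
# co-expression value between the corresponding genes.
ppi = [("P1", "P2", 0.9), ("P1", "P3", 0.8), ("P1", "P4", 0.7), ("P4", "P5", 0.2)]
adj = {}
for a, b, w in ppi:
    adj.setdefault(a, {})[b] = w
    adj.setdefault(b, {})[a] = w

centrality = eigenvector_centrality(adj)
ranking = sorted(centrality, key=centrality.get, reverse=True)
print(ranking)
```

Because a node's score depends on the scores of its neighbours, a protein with few interactions can still rank well if it is attached to a central partner through a strongly co-expressed edge.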
NASA Technical Reports Server (NTRS)
Mjolsness, Eric; Castano, Rebecca; Mann, Tobias; Wold, Barbara
2000-01-01
We provide preliminary evidence that existing algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (1) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continuous-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determining transcriptional regulatory relationships such as coregulation.
Networking of differentially expressed genes in human cancer cells resistant to methotrexate
2009-01-01
Background The need for an integrated view of data obtained from high-throughput technologies gave rise to network analyses. These are especially useful to rationalize how external perturbations propagate through the expression of genes. To address this issue in the case of drug resistance, we constructed biological association networks of genes differentially expressed in cell lines resistant to methotrexate (MTX). Methods Seven cell lines representative of different types of cancer, including colon cancer (HT29 and Caco2), breast cancer (MCF-7 and MDA-MB-468), pancreatic cancer (MIA PaCa-2), erythroblastic leukemia (K562) and osteosarcoma (Saos-2), were used. The differential expression pattern between sensitive and MTX-resistant cells was determined by whole human genome microarrays and analyzed with the GeneSpring GX software package. Genes deregulated in common between the different cancer cell lines served to generate biological association networks using the Pathway Architect software. Results Dickkopf homolog-1 (DKK1) is a highly interconnected node in the network generated with genes in common between the two colon cancer cell lines, and functional validations of this target using small interfering RNAs (siRNAs) showed a chemosensitization toward MTX. Members of the UDP-glucuronosyltransferase 1A (UGT1A) family formed a network of genes differentially expressed in the two breast cancer cell lines. siRNA treatment against UGT1A also showed an increase in MTX sensitivity. Eukaryotic translation elongation factor 1 alpha 1 (EEF1A1) was overexpressed among the pancreatic cancer, leukemia and osteosarcoma cell lines, and siRNA treatment against EEF1A1 produced a chemosensitization toward MTX. Conclusions Biological association networks identified DKK1, UGT1As and EEF1A1 as important gene nodes in MTX-resistance. Treatments using siRNA technology against these three genes showed chemosensitization toward MTX. PMID:19732436
ICan: an integrated co-alteration network to identify ovarian cancer-related genes.
Zhou, Yuanshuai; Liu, Yongjing; Li, Kening; Zhang, Rui; Qiu, Fujun; Zhao, Ning; Xu, Yan
2015-01-01
Over the last decade, an increasing number of integrative studies on cancer-related genes have been published. Integrative analyses aim to overcome the limitation of a single data type, and provide a more complete view of carcinogenesis. The vast majority of these studies used sample-matched data of gene expression and copy number to investigate the impact of copy number alteration on gene expression, and to predict and prioritize candidate oncogenes and tumor suppressor genes. However, correlations between genes were neglected in these studies. Our work aimed to evaluate the co-alteration of copy number, methylation and expression, allowing us to identify cancer-related genes and essential functional modules in cancer. We built the Integrated Co-alteration network (ICan) based on multi-omics data, and analyzed the network to uncover cancer-related genes. After comparison with random networks, we identified 155 ovarian cancer-related genes, including well-known (TP53, BRCA1, RB1 and PTEN) and also novel cancer-related genes, such as PDPN and EphA2. We compared the results with a conventional method: CNAmet, and obtained a significantly better area under the curve value (ICan: 0.8179, CNAmet: 0.5183). In this paper, we describe a framework to find cancer-related genes based on an Integrated Co-alteration network. Our results proved that ICan could precisely identify candidate cancer genes and provide increased mechanistic understanding of carcinogenesis. This work suggested a new research direction for biological network analyses involving multi-omics data.
ICan: An Integrated Co-Alteration Network to Identify Ovarian Cancer-Related Genes
Zhou, Yuanshuai; Liu, Yongjing; Li, Kening; Zhang, Rui; Qiu, Fujun; Zhao, Ning; Xu, Yan
2015-01-01
Background Over the last decade, an increasing number of integrative studies on cancer-related genes have been published. Integrative analyses aim to overcome the limitation of a single data type, and provide a more complete view of carcinogenesis. The vast majority of these studies used sample-matched data of gene expression and copy number to investigate the impact of copy number alteration on gene expression, and to predict and prioritize candidate oncogenes and tumor suppressor genes. However, correlations between genes were neglected in these studies. Our work aimed to evaluate the co-alteration of copy number, methylation and expression, allowing us to identify cancer-related genes and essential functional modules in cancer. Results We built the Integrated Co-alteration network (ICan) based on multi-omics data, and analyzed the network to uncover cancer-related genes. After comparison with random networks, we identified 155 ovarian cancer-related genes, including well-known (TP53, BRCA1, RB1 and PTEN) and also novel cancer-related genes, such as PDPN and EphA2. We compared the results with a conventional method: CNAmet, and obtained a significantly better area under the curve value (ICan: 0.8179, CNAmet: 0.5183). Conclusion In this paper, we describe a framework to find cancer-related genes based on an Integrated Co-alteration network. Our results proved that ICan could precisely identify candidate cancer genes and provide increased mechanistic understanding of carcinogenesis. This work suggested a new research direction for biological network analyses involving multi-omics data. PMID:25803614
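The area under the curve values quoted above (ICan: 0.8179, CNAmet: 0.5183) summarise how well a score ranks known cancer genes above other genes. A minimal sketch of that calculation follows, using the rank-based definition of the area under the ROC curve. The scores and labels are hypothetical and do not reproduce the study's data or its evaluation protocol.

```python
def auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive item scores higher; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical network-derived scores; label 1 marks a known cancer gene.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 1, 0, 1, 0]
print(auc(scores, labels))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the gap between the two reported values is read as a substantial difference in ranking quality.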
Faraji, Farhoud; Hu, Ying; Wu, Gang; Goldberger, Natalie E.; Walker, Renard C.; Zhang, Jinghui; Hunter, Kent W.
2014-01-01
Metastasis is the result of stochastic genomic and epigenetic events leading to gene expression profiles that drive tumor dissemination. Here we exploit the principle that metastatic propensity is modified by the genetic background to generate prognostic gene expression signatures that illuminate regulators of metastasis. We also identify multiple microRNAs whose germline variation is causally linked to tumor progression and metastasis. We employ network analysis of global gene expression profiles in tumors derived from a panel of recombinant inbred mice to identify a network of co-expressed genes centered on Cnot2 that predicts metastasis-free survival. Modulating Cnot2 expression changes tumor cell metastatic potential in vivo, supporting a functional role for Cnot2 in metastasis. Small RNA sequencing of the same tumor set revealed a negative correlation between expression of the Mir216/217 cluster and tumor progression. Expression quantitative trait locus analysis (eQTL) identified cis-eQTLs at the Mir216/217 locus, indicating that differences in expression may be inherited. Ectopic expression of Mir216/217 in tumor cells suppressed metastasis in vivo. Finally, small RNA sequencing and mRNA expression profiling data were integrated to reveal that miR-3470a/b target a high proportion of network transcripts. In vivo analysis of Mir3470a/b demonstrated that both promote metastasis. Moreover, Mir3470b is a likely regulator of the Cnot2 network as its overexpression down-regulated expression of network hub genes and enhanced metastasis in vivo, phenocopying Cnot2 knockdown. The resulting data from this strategy identify Cnot2 as a novel regulator of metastasis and demonstrate the power of our systems-level approach in identifying modifiers of metastasis. PMID:24322557
Gene networks are rapidly growing in size and number, raising the question of which networks are most appropriate for particular applications. Here, we evaluate 21 human genome-wide interaction networks for their ability to recover 446 disease gene sets identified through literature curation, gene expression profiling, or genome-wide association studies. While all networks have some ability to recover disease genes, we observe a wide range of performance with STRING, ConsensusPathDB, and GIANT networks having the best performance overall.
Park, Chihyun; Yun, So Jeong; Ryu, Sung Jin; Lee, Soyoung; Lee, Young-Sam; Yoon, Youngmi; Park, Sang Chul
2017-03-15
Cellular senescence irreversibly arrests growth of human diploid cells. In addition, recent studies have indicated that senescence is a multi-step evolving process related to important complex biological processes. Most studies analyzed only the genes and their functions representing each senescence phase without considering gene-level interactions and continuously perturbed genes. It is necessary to reveal the genotypic mechanism inferred from affected genes and their interactions underlying the senescence process. We propose a novel computational approach to identify an integrative network that profiles an underlying genotypic signature from time-series gene expression data. The relatively perturbed genes were selected for each time point based on a proposed scoring measure, termed the perturbation score. The selected genes were then integrated with protein-protein interactions to construct time-point-specific networks. From these networks, the edges conserved across time points were extracted to form the common network, and a statistical test was performed to demonstrate that the network could explain the phenotypic alteration. As a result, it was confirmed that the difference in average perturbation scores of the common networks between two time points could explain the phenotypic alteration. We also performed functional enrichment analysis on the common network and identified a high association with phenotypic alteration. Remarkably, we observed that the identified cell cycle-specific common network played an important role in replicative senescence as a key regulator. Until now, network analysis of time-series gene expression data has focused on how topological structure changes over time. In contrast, we focused on structure that is conserved while its context changes over time, and showed that it can explain the phenotypic changes. 
We expect that the proposed method will help to elucidate biological mechanisms not revealed by existing approaches.
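The extraction of edges conserved across time points can be sketched as a set intersection over time-point-specific networks. The abstract does not give the perturbation score formula, so scores are supplied here as hypothetical inputs; the gene pairs, time labels, and values are likewise illustrative only.

```python
def common_network(networks):
    """Edges present in every time-point-specific network.

    Each edge is stored as a frozenset so that (a, b) and (b, a) match.
    """
    edge_sets = [{frozenset(e) for e in edges} for edges in networks.values()]
    return set.intersection(*edge_sets)

def mean_score(edges, scores):
    """Average score over all genes that appear in the given edges."""
    genes = set()
    for edge in edges:
        genes |= edge
    if not genes:
        return 0.0
    return sum(scores[g] for g in genes) / len(genes)

# Hypothetical networks at two time points (undirected gene pairs).
networks = {
    "t1": [("CDK1", "CCNB1"), ("TP53", "MDM2"), ("CDK1", "CDC20")],
    "t2": [("CCNB1", "CDK1"), ("TP53", "MDM2"), ("E2F1", "RB1")],
}
common = common_network(networks)

# Hypothetical perturbation scores for the same genes at each time point.
scores_t1 = {"CDK1": 0.2, "CCNB1": 0.4, "TP53": 0.6, "MDM2": 0.8}
scores_t2 = {"CDK1": 1.0, "CCNB1": 1.0, "TP53": 1.0, "MDM2": 1.0}
difference = mean_score(common, scores_t2) - mean_score(common, scores_t1)
print(sorted(sorted(e) for e in common), difference)
```

The topology of the common network is fixed by construction, so any difference between time points comes only from the scores attached to its genes, which matches the abstract's emphasis on conserved structure with changing context.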
Davin, Nicolas; Edger, Patrick P; Hefer, Charles A; Mizrachi, Eshchar; Schuetz, Mathias; Smets, Erik; Myburg, Alexander A; Douglas, Carl J; Schranz, Michael E; Lens, Frederic
2016-06-01
Many plant genes are known to be involved in the development of cambium and wood, but how the expression and functional interaction of these genes determine the unique biology of wood remains largely unknown. We used the soc1ful loss of function mutant - the woodiest genotype known in the otherwise herbaceous model plant Arabidopsis - to investigate the expression and interactions of genes involved in secondary growth (wood formation). Detailed anatomical observations of the stem in combination with mRNA sequencing were used to assess transcriptome remodeling during xylogenesis in wild-type and woody soc1ful plants. To interpret the transcriptome changes, we constructed functional gene association networks of differentially expressed genes using the STRING database. This analysis revealed functionally enriched gene association hubs that are differentially expressed in herbaceous and woody tissues. In particular, we observed the differential expression of genes related to mechanical stress and jasmonate biosynthesis/signaling during wood formation in soc1ful plants that may be an effect of greater tension within woody tissues. Our results suggest that habit shifts from herbaceous to woody life forms observed in many angiosperm lineages could have evolved convergently by genetic changes that modulate the gene expression and interaction network, and thereby redeploy the conserved wood developmental program. © 2016 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
2014-01-01
Background Our current knowledge of tooth development derives mainly from studies in mice, which have only one set of non-replaced teeth, compared with the diphyodont dentition in humans. The miniature pig is also diphyodont, making it a valuable alternative model for understanding human tooth development and replacement. However, little is known about gene expression and function during swine odontogenesis. The goal of this study is to undertake the survey of differential gene expression profiling and functional network analysis during morphogenesis of diphyodont dentition in miniature pigs. The identification of genes related to diphyodont development should lead to a better understanding of morphogenetic patterns and the mechanisms of diphyodont replacement in large animal models and humans. Results The temporal gene expression profiles during early diphyodont development in miniature pigs were detected with the Affymetrix Porcine GeneChip. The gene expression data were further evaluated by ANOVA as well as pathway and STC analyses. A total of 2,053 genes were detected with differential expression. Several signal pathways and 151 genes were then identified through the construction of pathway and signal networks. Conclusions The gene expression profiles indicated that spatio-temporal down-regulation patterns of gene expression were predominant; while, both dynamic activation and inhibition of pathways occurred during the morphogenesis of diphyodont dentition. Our study offers a mechanistic framework for understanding dynamic gene regulation of early diphyodont development and provides a molecular basis for studying teeth development, replacement, and regeneration in miniature pigs. PMID:24498892
Yang, Jun; Hou, Ziming; Wang, Changjiang; Wang, Hao; Zhang, Hongbing
2018-04-23
Adamantinomatous craniopharyngioma (ACP) is an aggressive brain tumor that occurs predominantly in the pediatric population. Conventional diagnostic methods and standard therapy do not manage ACP effectively. In this paper, we aimed to identify key genes for early diagnosis and treatment of ACP. Datasets GSE94349 and GSE68015 were obtained from the Gene Expression Omnibus database. Consensus clustering was applied to discover gene clusters in the expression data of GSE94349, and functional enrichment analysis was performed on the gene set in each cluster. The protein-protein interaction (PPI) network was built with the Search Tool for the Retrieval of Interacting Genes, and hubs were selected. A support vector machine (SVM) model was built based on the signature genes identified from enrichment analysis and the PPI network. Dataset GSE94349 was used for training and testing, and GSE68015 was used for validation. In addition, RT-qPCR analysis was performed to analyze the expression of signature genes in ACP samples compared with normal controls. Seven gene clusters were discovered among the differentially expressed genes identified from the GSE94349 dataset. Enrichment analysis of each cluster identified 25 pathways that were highly associated with ACP. The PPI network was built and 46 hubs were determined. Twenty-five pathway-related genes that overlapped with the hubs in the PPI network were used as signatures to establish the SVM diagnosis model for ACP. The prediction accuracies of the SVM model on the training, testing, and validation data were 94%, 85%, and 74%, respectively. The expression of CDH1, CCL2, ITGA2, COL8A1, COL6A2, and COL6A3 was significantly upregulated in ACP tumor samples, while CAMK2A, RIMS1, NEFL, SYT1, and STX1A were significantly downregulated, consistent with the differentially expressed gene analysis. The SVM model is a promising classification tool for screening and early diagnosis of ACP. 
The ACP-related pathways and signature genes will advance our knowledge of ACP pathogenesis and support improvements in therapy.
Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data.
Zhu, Mingzhu; Dahmen, Jeremy L; Stacey, Gary; Cheng, Jianlin
2013-09-22
High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common for all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. 8 of 10 common regulatory modules were validated by at least two kinds of validations, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two hybrid experiments.
Jia, Peilin; Chen, Xiangning; Fanous, Ayman H; Zhao, Zhongming
2018-05-24
Genetic susceptibility to complex diseases such as schizophrenia involves a wide spectrum of variants, including common variants (CVs) and de novo mutations (DNMs). Although CVs and DNMs differ by origin, it remains unclear whether and how they interact at the gene, pathway, and network levels in ways that lead to the disease. In this work, we characterized the genes harboring schizophrenia-associated CVs (CVgenes) and the genes harboring DNMs (DNMgenes) using measures from network, tissue-specific expression profile, and spatiotemporal brain expression profile. We developed an algorithm to link the DNMgenes and CVgenes in spatiotemporal brain co-expression networks. DNMgenes tended to have central roles in the human protein-protein interaction (PPI) network, as evidenced by their high degree and high betweenness values. DNMgenes and CVgenes connected with each other significantly more often than with other genes in the networks. However, only CVgenes remained significantly connected after adjusting for their degree. In our gene co-expression PPI network, we found that DNMgenes and CVgenes connected in a tissue-specific fashion, and this pattern was similar to that in GTEx brain but not in other GTEx tissues. Importantly, DNMgene-CVgene subnetworks were enriched with pathways of chromatin remodeling, MHC protein complex binding, and neurotransmitter activities. In summary, our results showed that both DNMgenes and CVgenes contributed to a core set of biologically important pathways and networks, and their interactions may contribute to the risk for schizophrenia. Our results also suggested a stronger biological effect of DNMgenes than CVgenes in schizophrenia.
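The degree and betweenness values mentioned in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: the toy network and node names below are hypothetical, and betweenness is computed with Brandes' algorithm on an undirected, unweighted graph, which is one common way to obtain it.

```python
from collections import deque

def degree(adj):
    """Number of neighbours of each node in an undirected graph."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def betweenness(adj):
    """Unnormalised betweenness (Brandes' algorithm), undirected and unweighted."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: b / 2 for v, b in score.items()}

# Hypothetical toy network: a hub with three neighbours, one of which has a leaf.
toy = {
    "HUB": {"A", "B", "C"},
    "A": {"HUB"},
    "B": {"HUB"},
    "C": {"HUB", "D"},
    "D": {"C"},
}
toy_degree = degree(toy)
toy_betweenness = betweenness(toy)
```

In this toy graph the hub has both the highest degree and the highest betweenness, which is the kind of "central role" the abstract attributes to DNMgenes.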
Investigation of candidate genes for osteoarthritis based on gene expression profiles.
Dong, Shuanghai; Xia, Tian; Wang, Lei; Zhao, Qinghua; Tian, Jiwei
2016-12-01
To explore the mechanism of osteoarthritis (OA) and provide valid biological information for further investigation. Gene expression profile of GSE46750 was downloaded from Gene Expression Omnibus database. The Linear Models for Microarray Data (limma) package (Bioconductor project, http://www.bioconductor.org/packages/release/bioc/html/limma.html) was used to identify differentially expressed genes (DEGs) in inflamed OA samples. Gene Ontology function enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways enrichment analysis of DEGs were performed based on Database for Annotation, Visualization and Integrated Discovery data, and protein-protein interaction (PPI) network was constructed based on the Search Tool for the Retrieval of Interacting Genes/Proteins database. Regulatory network was screened based on Encyclopedia of DNA Elements. Molecular Complex Detection was used for sub-network screening. Two sub-networks with highest node degree were integrated with transcriptional regulatory network and KEGG functional enrichment analysis was processed for 2 modules. In total, 401 up- and 196 down-regulated DEGs were obtained. Up-regulated DEGs were involved in inflammatory response, while down-regulated DEGs were involved in cell cycle. PPI network with 2392 protein interactions was constructed. Moreover, 10 genes including Interleukin 6 (IL6) and Aurora B kinase (AURKB) were found to be outstanding in PPI network. There are 214 up- and 8 down-regulated transcription factor (TF)-target pairs in the TF regulatory network. Module 1 had TFs including SPI1, PRDM1, and FOS, while module 2 contained FOSL1. The nodes in module 1 were enriched in chemokine signaling pathway, while the nodes in module 2 were mainly enriched in cell cycle. 
The screened DEGs including IL6, AGT, and AURKB might be potential biomarkers for gene therapy for OA by being regulated by TFs such as FOS and SPI1, and participating in the cell cycle and cytokine-cytokine receptor interaction pathway.
Thakar, Manjusha; Howard, Jason D.; Kagohara, Luciane T.; Krigsfeld, Gabriel; Ranaweera, Ruchira S.; Hughes, Robert M.; Perez, Jimena; Jones, Siân; Favorov, Alexander V.; Carey, Jacob; Stein-O'Brien, Genevieve; Gaykalova, Daria A.; Ochs, Michael F.; Chung, Christine H.
2016-01-01
Patients with oncogene driven tumors are treated with targeted therapeutics including EGFR inhibitors. Genomic data from The Cancer Genome Atlas (TCGA) demonstrates molecular alterations to EGFR, MAPK, and PI3K pathways in previously untreated tumors. Therefore, this study uses bioinformatics algorithms to delineate interactions resulting from EGFR inhibitor use in cancer cells with these genetic alterations. We modify the HaCaT keratinocyte cell line model to simulate cancer cells with constitutive activation of EGFR, HRAS, and PI3K in a controlled genetic background. We then measure gene expression after treating modified HaCaT cells with gefitinib, afatinib, and cetuximab. The CoGAPS algorithm distinguishes a gene expression signature associated with the anticipated silencing of the EGFR network. It also infers a feedback signature with EGFR gene expression itself increasing in cells that are responsive to EGFR inhibitors. This feedback signature has increased expression of several growth factor receptors regulated by the AP-2 family of transcription factors. The gene expression signatures for AP-2alpha are further correlated with sensitivity to cetuximab treatment in HNSCC cell lines and changes in EGFR expression in HNSCC tumors with low CDKN2A gene expression. In addition, the AP-2alpha gene expression signatures are also associated with inhibition of MEK, PI3K, and mTOR pathways in the Library of Integrated Network-Based Cellular Signatures (LINCS) data. These results suggest that AP-2 transcription factors are activated as feedback from EGFR network inhibition and may mediate EGFR inhibitor resistance. PMID:27650546
Bidkhori, Gholamreza; Narimani, Zahra; Hosseini Ashtiani, Saman; Moeini, Ali; Nowzari-Dalini, Abbas; Masoudi-Nejad, Ali
2013-01-01
The goal of this study was to reconstruct a "genome-scale co-expression network" and find important modules in lung adenocarcinoma so that we could identify the genes involved in lung adenocarcinoma. We integrated gene mutation, GWAS, CGH, array-CGH and SNP array data in order to identify important genes and loci at genome scale. Afterwards, on the basis of the identified genes, a co-expression network was reconstructed from the co-expression data. The reconstructed network was named the "genome-scale co-expression network". As the next step, 23 key modules were disclosed through clustering. By analyzing the modules, this study identified a number of genes implicated in lung adenocarcinoma for the first time. The genes EGFR, PIK3CA, TAF15, XIAP, VAPB, Appl1, Rab5a, ARF4, CLPTM1L, SP4, ZNF124, LPP, FOXP1, SOX18, MSX2, NFE2L2, SMARCC1, TRA2B, CBX3, PRPF6, ATP6V1C1, MYBBP1A, MACF1, GRM2, TBXA2R, PRKAR2A, PTK2, PGF and MYO10 are among the genes that belong to modules 1 and 22. All these genes, being implicated in at least one of cell survival, proliferation and metastasis, have an over-expression pattern similar to that of EGFR. A few modules contain genes that play a crucial role in cell cycle progression, such as CCNA2 (Cyclin A2), CCNB2 (Cyclin B2), CDK1, CDK5, CDC27, CDCA5, CDCA8, ASPM, BUB1, KIF15, KIF2C, NEK2, NUSAP1, PRC1, SMC4, SYCE2, TFDP1, CDC42 and ARHGEF9. In addition to the genes mentioned, the modules contain other genes (e.g. DLGAP5, BIRC5, PSMD2, Src, TTK, SENP2, DOK2 and FUS).
Selection Shapes Transcriptional Logic and Regulatory Specialization in Genetic Networks
Fogelmark, Karl; Peterson, Carsten; Troein, Carl
2016-01-01
Background Living organisms need to regulate their gene expression in response to environmental signals and internal cues. This is a computational task where genes act as logic gates that connect to form transcriptional networks, which are shaped at all scales by evolution. Large-scale mutations such as gene duplications and deletions add and remove network components, whereas smaller mutations alter the connections between them. Selection determines what mutations are accepted, but its importance for shaping the resulting networks has been debated. Methodology To investigate the effects of selection in the shaping of transcriptional networks, we derive transcriptional logic from a combinatorially powerful yet tractable model of the binding between DNA and transcription factors. By evolving the resulting networks based on their ability to function as either a simple decision system or a circadian clock, we obtain information on the regulation and logic rules encoded in functional transcriptional networks. Comparisons are made between networks evolved for different functions, as well as with structurally equivalent but non-functional (neutrally evolved) networks, and predictions are validated against the transcriptional network of E. coli. Principal Findings We find that the logic rules governing gene expression depend on the function performed by the network. Unlike the decision systems, the circadian clocks show strong cooperative binding and negative regulation, which achieves tight temporal control of gene expression. Furthermore, we find that transcription factors act preferentially as either activators or repressors, both when binding multiple sites for a single target gene and globally in the transcriptional networks. This separation into positive and negative regulators requires gene duplications, which highlights the interplay between mutation and selection in shaping the transcriptional networks. PMID:26927540
Gene Circuit Analysis of the Terminal Gap Gene huckebein
Ashyraliyev, Maksat; Siggens, Ken; Janssens, Hilde; Blom, Joke; Akam, Michael; Jaeger, Johannes
2009-01-01
The early embryo of Drosophila melanogaster provides a powerful model system to study the role of genes in pattern formation. The gap gene network constitutes the first zygotic regulatory tier in the hierarchy of the segmentation genes involved in specifying the position of body segments. Here, we use an integrative, systems-level approach to investigate the regulatory effect of the terminal gap gene huckebein (hkb) on gap gene expression. We present quantitative expression data for the Hkb protein, which enable us to include hkb in gap gene circuit models. Gap gene circuits are mathematical models of gene networks used as computational tools to extract regulatory information from spatial expression data. This is achieved by fitting the model to gap gene expression patterns, in order to obtain estimates for regulatory parameters which predict a specific network topology. We show how considering variability in the data combined with analysis of parameter determinability significantly improves the biological relevance and consistency of the approach. Our models are in agreement with earlier results, which they extend in two important respects: First, we show that Hkb is involved in the regulation of the posterior hunchback (hb) domain, but does not have any other essential function. Specifically, Hkb is required for the anterior shift in the posterior border of this domain, which is now reproduced correctly in our models. Second, gap gene circuits presented here are able to reproduce mutants of terminal gap genes, while previously published models were unable to reproduce any null mutants correctly. As a consequence, our models now capture the expression dynamics of all posterior gap genes and some variational properties of the system correctly. This is an important step towards a better, quantitative understanding of the developmental and evolutionary dynamics of the gap gene network. PMID:19876378
Gene circuit analysis of the terminal gap gene huckebein.
Ashyraliyev, Maksat; Siggens, Ken; Janssens, Hilde; Blom, Joke; Akam, Michael; Jaeger, Johannes
2009-10-01
The early embryo of Drosophila melanogaster provides a powerful model system to study the role of genes in pattern formation. The gap gene network constitutes the first zygotic regulatory tier in the hierarchy of the segmentation genes involved in specifying the position of body segments. Here, we use an integrative, systems-level approach to investigate the regulatory effect of the terminal gap gene huckebein (hkb) on gap gene expression. We present quantitative expression data for the Hkb protein, which enable us to include hkb in gap gene circuit models. Gap gene circuits are mathematical models of gene networks used as computational tools to extract regulatory information from spatial expression data. This is achieved by fitting the model to gap gene expression patterns, in order to obtain estimates for regulatory parameters which predict a specific network topology. We show how considering variability in the data combined with analysis of parameter determinability significantly improves the biological relevance and consistency of the approach. Our models are in agreement with earlier results, which they extend in two important respects: First, we show that Hkb is involved in the regulation of the posterior hunchback (hb) domain, but does not have any other essential function. Specifically, Hkb is required for the anterior shift in the posterior border of this domain, which is now reproduced correctly in our models. Second, gap gene circuits presented here are able to reproduce mutants of terminal gap genes, while previously published models were unable to reproduce any null mutants correctly. As a consequence, our models now capture the expression dynamics of all posterior gap genes and some variational properties of the system correctly. This is an important step towards a better, quantitative understanding of the developmental and evolutionary dynamics of the gap gene network.
Inferring Time-Varying Network Topologies from Gene Expression Data
2007-01-01
Most current methods for gene regulatory network identification lead to the inference of steady-state networks, that is, networks prevalent over all times, a hypothesis which has been challenged. There has been a need to infer and represent networks in a dynamic, that is, time-varying fashion, in order to account for different cellular states affecting the interactions amongst genes. In this work, we present an approach, regime-SSM, to understand gene regulatory networks within such a dynamic setting. The approach uses a clustering method based on these underlying dynamics, followed by system identification using a state-space model for each learnt cluster—to infer a network adjacency matrix. We finally indicate our results on the mouse embryonic kidney dataset as well as the T-cell activation-based expression dataset and demonstrate conformity with reported experimental evidence. PMID:18309363
Inferring time-varying network topologies from gene expression data.
Rao, Arvind; Hero, Alfred O; States, David J; Engel, James Douglas
2007-01-01
Most current methods for gene regulatory network identification lead to the inference of steady-state networks, that is, networks prevalent over all times, a hypothesis which has been challenged. There has been a need to infer and represent networks in a dynamic, that is, time-varying fashion, in order to account for different cellular states affecting the interactions amongst genes. In this work, we present an approach, regime-SSM, to understand gene regulatory networks within such a dynamic setting. The approach uses a clustering method based on these underlying dynamics, followed by system identification using a state-space model for each learnt cluster--to infer a network adjacency matrix. We finally indicate our results on the mouse embryonic kidney dataset as well as the T-cell activation-based expression dataset and demonstrate conformity with reported experimental evidence.
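The idea of inferring a network adjacency matrix from a state-space model can be sketched in a much simplified form. The sketch below is not regime-SSM: it assumes a single regime, fully observed noise-free states, and linear dynamics x(t+1) = A x(t), and it estimates A by ordinary least squares. The matrix `A_true`, the initial state, and the edge threshold of 0.1 are hypothetical values chosen for illustration.

```python
import numpy as np

def fit_transition_matrix(X):
    """Least-squares estimate of A in x(t+1) = A x(t).

    X has shape (timepoints, genes); row t is the expression vector at time t.
    """
    past, future = X[:-1], X[1:]
    # Solve past @ A.T ~= future for A.T in the least-squares sense.
    A_T, *_ = np.linalg.lstsq(past, future, rcond=None)
    return A_T.T

def simulate(A, x0, steps):
    """Noise-free trajectory of x(t+1) = A x(t), shape (steps + 1, genes)."""
    X = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        X.append(A @ X[-1])
    return np.array(X)

# Hypothetical 3-gene interaction matrix, used only to generate toy data.
A_true = np.array([[0.9, -0.3, 0.0],
                   [0.3,  0.9, 0.0],
                   [0.5,  0.0, 0.5]])
X = simulate(A_true, [1.0, 0.0, 1.0], steps=15)
A_hat = fit_transition_matrix(X)

# Assumed threshold for calling an edge; entry (i, j) means gene j influences gene i.
adjacency = np.abs(A_hat) > 0.1
```

With real, noisy expression data the fit would need regularization or a full state-space treatment with hidden states; the point here is only how a fitted transition matrix is read as an adjacency matrix.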
Sass, Hjalte C R; Borup, Rehannah; Alanin, Mikkel; Nielsen, Finn Cilius; Cayé-Thomasen, Per
2017-01-01
The objective of this study was to determine global gene expression in relation to vestibular schwannoma (VS) growth rate and to identify signal transduction pathways and functional molecular networks associated with growth. Repeated magnetic resonance imaging (MRI) prior to surgery determined tumor growth rate. Following tissue sampling during surgery, mRNA was extracted from 16 sporadic VS. Double stranded cDNA was synthesized from the mRNA and used as template for an in vitro transcription reaction to synthesize biotin-labeled antisense cRNA, which was hybridized to Affymetrix HG-U133A arrays and analyzed by dChip software. Differential gene expression was defined as a 1.5-fold difference between fast and slow growing tumors (growth above vs. below 0.5 ccm/year), employing a p-value <0.01. Deregulated transcripts were matched against established gene ontology. Ingenuity Pathway Analysis was used for identification of signal transduction pathways and functional molecular networks associated with tumor growth. In total 109 genes were deregulated in relation to tumor growth rate. Genes associated with apoptosis, growth and cell proliferation were deregulated. Gene ontology included regulation of the cell cycle, cell differentiation and proliferation, among other functions. Fourteen pathways were associated with tumor growth. Five functional molecular networks were generated. This first study on global gene expression in relation to vestibular schwannoma growth rate identified several genes, signal transduction pathways and functional networks associated with tumor progression. Specific genes involved in apoptosis, cell growth and proliferation were deregulated in fast growing tumors. The functional networks generated underlined the importance of the PI3K family, among others.
Co-option of the polarity gene network shapes filament morphology in angiosperms
de Almeida, Ana Maria Rocha; Yockteng, Roxana; Schnable, James; Alvarez-Buylla, Elena R.; Freeling, Michael; Specht, Chelsea D.
2014-01-01
The molecular genetic mechanisms underlying abaxial-adaxial polarity in plants have been studied as a property of lateral and flattened organs, such as leaves. In leaves, laminar expansion occurs as a result of balanced abaxial-adaxial gene expression. Over- or under- expression of either abaxializing or adaxializing genes inhibits laminar growth, resulting in a mutant radialized phenotype. Here, we show that co-option of the abaxial-adaxial polarity gene network plays a role in the evolution of stamen filament morphology in angiosperms. RNA-Seq data from species bearing laminar (flattened) or radial (cylindrical) filaments demonstrates that species with laminar filaments exhibit balanced expression of abaxial-adaxial (ab-ad) genes, while overexpression of a YABBY gene is found in species with radial filaments. This result suggests that unbalanced expression of ab-ad genes results in inhibition of laminar outgrowth, leading to a radially symmetric structure as found in many angiosperm filaments. We anticipate that co-option of the polarity gene network is a fundamental mechanism shaping many aspects of plant morphology during angiosperm evolution. PMID:25168962
Co-option of the polarity gene network shapes filament morphology in angiosperms.
de Almeida, Ana Maria Rocha; Yockteng, Roxana; Schnable, James; Alvarez-Buylla, Elena R; Freeling, Michael; Specht, Chelsea D
2014-08-29
The molecular genetic mechanisms underlying abaxial-adaxial polarity in plants have been studied as a property of lateral and flattened organs, such as leaves. In leaves, laminar expansion occurs as a result of balanced abaxial-adaxial gene expression. Over- or under- expression of either abaxializing or adaxializing genes inhibits laminar growth, resulting in a mutant radialized phenotype. Here, we show that co-option of the abaxial-adaxial polarity gene network plays a role in the evolution of stamen filament morphology in angiosperms. RNA-Seq data from species bearing laminar (flattened) or radial (cylindrical) filaments demonstrates that species with laminar filaments exhibit balanced expression of abaxial-adaxial (ab-ad) genes, while overexpression of a YABBY gene is found in species with radial filaments. This result suggests that unbalanced expression of ab-ad genes results in inhibition of laminar outgrowth, leading to a radially symmetric structure as found in many angiosperm filaments. We anticipate that co-option of the polarity gene network is a fundamental mechanism shaping many aspects of plant morphology during angiosperm evolution.
Safari-Alighiarloo, Nahid; Taghizadeh, Mohammad; Tabatabaei, Seyyed Mohammad; Namaki, Saeed
2016-01-01
Background The involvement of multiple genes and missing heritability, which are dominant in complex diseases such as multiple sclerosis (MS), call for network biology to better elucidate their molecular basis and genetic factors. We therefore aimed to integrate interactome (protein–protein interaction (PPI)) and transcriptome data to construct and analyze PPI networks for MS. Methods Gene expression profiles in paired cerebrospinal fluid (CSF) and peripheral blood mononuclear cell (PBMC) samples from MS patients, sampled in relapse or remission, and from controls were analyzed. Differentially expressed genes that were identified only in CSF (MS vs. control) and only in PBMCs (relapse vs. remission) were separately integrated with PPI data to construct the Query-Query PPI (QQPPI) networks. The networks were further analyzed to investigate the more central genes, functional modules and complexes involved in MS progression. Results The networks were analyzed and high centrality genes were identified. Exploration of functional modules and complexes showed that the majority of high centrality genes were involved in biological pathways driving MS pathogenesis. Proteasome and spliceosome were also prominent among the enriched pathways in PBMCs (relapse vs. remission), which were identified by both modularity and clique analyses. Finally, the STK4, RB1, CDKN1A, CDK1, RAC1, EZH2 and SDCBP genes in CSF (MS vs. control) and the CDC37, MAP3K3 and MYC genes in PBMCs (relapse vs. remission) were identified as potential candidate genes for MS; these were the more central genes involved in biological pathways. Discussion This study showed that network-based analysis could explicate the complex interplay between biological processes underlying MS. Furthermore, experimental validation of the candidate genes can lead to identification of potential therapeutic targets. PMID:28028462
Kaushik, Abhinav; Bhatia, Yashuma; Ali, Shakir; Gupta, Dinesh
2015-01-01
Metastatic melanoma patients have a poor prognosis, mainly attributable to the underlying heterogeneity in melanoma driver genes and altered gene expression profiles. These characteristics of melanoma also make the development of drugs and identification of novel drug targets for metastatic melanoma a daunting task. Systems biology offers an alternative approach to re-explore the genes or gene sets that display dysregulated behaviour without being differentially expressed. In this study, we have performed systems biology studies to enhance our knowledge about the conserved property of disease genes or gene sets among mutually exclusive datasets representing melanoma progression. We meta-analysed 642 microarray samples to generate melanoma reconstructed networks representing four different stages of melanoma progression to extract genes with altered molecular circuitry wiring as compared to a normal cellular state. Intriguingly, a majority of the melanoma network-rewired genes are not differentially expressed and the disease genes involved in melanoma progression consistently modulate its activity by rewiring network connections. We found that the shortlisted disease genes in the study show strong and abnormal network connectivity, which enhances with the disease progression. Moreover, the deviated network properties of the disease gene sets allow ranking/prioritization of different enriched, dysregulated and conserved pathway terms in metastatic melanoma, in agreement with previous findings. Our analysis also reveals presence of distinct network hubs in different stages of metastasizing tumor for the same set of pathways in the statistically conserved gene sets. The study results are also presented as a freely available database at http://bioinfo.icgeb.res.in/m3db/. The web-based database resource consists of results from the analysis presented here, integrated with cytoscape web and user-friendly tools for visualization, retrieval and further analysis. 
PMID:26558755
Analysis of gene network robustness based on saturated fixed point attractors
2014-01-01
The analysis of gene network robustness to noise and mutation is important for fundamental and practical reasons. Robustness refers to the stability of the equilibrium expression state of a gene network to variations of the initial expression state and network topology. Numerical simulation of these variations is commonly used for the assessment of robustness. Since there exists a great number of possible gene network topologies and initial states, even millions of simulations may be still too small to give reliable results. When the initial and equilibrium expression states are restricted to being saturated (i.e., their elements can only take values 1 or −1 corresponding to maximum activation and maximum repression of genes), an analytical gene network robustness assessment is possible. We present this analytical treatment based on determination of the saturated fixed point attractors for sigmoidal function models. The analysis can determine (a) for a given network, which and how many saturated equilibrium states exist and which and how many saturated initial states converge to each of these saturated equilibrium states and (b) for a given saturated equilibrium state or a given pair of saturated equilibrium and initial states, which and how many gene networks, referred to as viable, share this saturated equilibrium state or the pair of saturated equilibrium and initial states. We also show that the viable networks sharing a given saturated equilibrium state must follow certain patterns. These capabilities of the analytical treatment make it possible to properly define and accurately determine robustness to noise and mutation for gene networks. Previous network research conclusions drawn from performing millions of simulations follow directly from the results of our analytical treatment. Furthermore, the analytical results provide criteria for the identification of model validity and suggest modified models of gene network dynamics. 
The yeast cell-cycle network is used as an illustration of the practical application of this analytical treatment. PMID:24650364
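For very small networks, the quantities described in this abstract (which saturated equilibrium states exist, and how many saturated initial states converge to each) can be found by brute force. The sketch below does only that; it is not the analytical treatment the abstract describes. It assumes the update rule s(t+1) = sign(W s(t)), a steep-sigmoid limit that the abstract does not spell out, and the interaction matrix `W_toy` is hypothetical.

```python
from itertools import product
import numpy as np

def saturated_fixed_points(W):
    """All states s in {-1, +1}^n satisfying sign(W s) == s."""
    n = W.shape[0]
    fixed = []
    for bits in product((-1, 1), repeat=n):
        s = np.array(bits)
        if np.array_equal(np.sign(W @ s), s):
            fixed.append(bits)
    return fixed

def basin_sizes(W, max_steps=50):
    """Count the saturated initial states that converge to each fixed point."""
    n = W.shape[0]
    counts = {fp: 0 for fp in saturated_fixed_points(W)}
    for bits in product((-1, 1), repeat=n):
        s = np.array(bits)
        for _ in range(max_steps):
            nxt = np.sign(W @ s)
            if np.array_equal(nxt, s):
                counts[tuple(int(x) for x in s)] += 1
                break
            if np.any(nxt == 0):
                break  # a zero input means the state is no longer saturated
            s = nxt
        # States that cycle without converging are not counted.
    return counts

# Hypothetical 3-gene network: gene 0 activates itself and gene 1, represses gene 2.
W_toy = np.array([[ 1, 0, 0],
                  [ 1, 0, 0],
                  [-1, 0, 0]])
toy_fixed = saturated_fixed_points(W_toy)
toy_basins = basin_sizes(W_toy)
```

Enumeration visits all 2^n saturated states, so it is only practical for a handful of genes, which is exactly the limitation that motivates an analytical approach.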
A prior-based integrative framework for functional transcriptional regulatory network inference
Siahpirani, Alireza F.
2017-01-01
Transcriptional regulatory networks specify regulatory proteins controlling the context-specific expression levels of genes. Inference of genome-wide regulatory networks is central to understanding gene regulation, but remains an open challenge. Expression-based network inference is among the most popular methods to infer regulatory networks; however, networks inferred from such methods have low overlap with experimentally derived (e.g. ChIP-chip and transcription factor (TF) knockouts) networks. Currently we have a limited understanding of this discrepancy. To address this gap, we first develop a regulatory network inference algorithm, based on probabilistic graphical models, to integrate expression with auxiliary datasets supporting a regulatory edge. Second, we comprehensively analyze our and other state-of-the-art methods on different expression perturbation datasets. Networks inferred by integrating sequence-specific motifs with expression have substantially greater agreement with experimentally derived networks, while remaining more predictive of expression than motif-based networks. Our analysis suggests natural genetic variation as the most informative perturbation for network inference, and identifies core TFs whose targets are predictable from expression. Multiple reasons make the identification of targets of other TFs difficult, including network architecture and insufficient variation of TF mRNA level. Finally, we demonstrate the utility of our inference algorithm for inferring stress-specific regulatory networks and for regulator prioritization. PMID:27794550
Kadarmideen, Haja N; Watson-Haigh, Nathan S
2012-01-01
Gene co-expression networks (GCN), built using high-throughput gene expression data, are a fundamental part of systems biology. The main aim of this study was to compare two popular approaches to building and analysing GCN. We use real ovine microarray transcriptomics datasets representing four different treatments with Metyrapone, an inhibitor of cortisol biosynthesis. We conducted several microarray quality control checks before applying GCN methods to filtered datasets. Then we compared the outputs of the two methods using connectivity as a criterion, as it measures how well a node (gene) is connected within a network. The two GCN construction methods used were Weighted Gene Co-expression Network Analysis (WGCNA) and Partial Correlation and Information Theory (PCIT). Nodes were ranked based on their connectivity measures in each of the four different networks created by WGCNA and PCIT, and node ranks from the two methods were compared to identify those nodes which are highly differentially ranked (HDR). A total of 1,017 HDR nodes were identified across one or more of the four networks. We investigated HDR nodes by gene enrichment analyses in relation to their biological relevance to phenotypes. We observed that, in contrast to the WGCNA method, the PCIT algorithm removes many of the edges of the most highly interconnected nodes. Removal of edges of the most highly connected nodes, or hub genes, will have consequences for downstream analyses and biological interpretations. In general, for large GCN construction (with > 20000 genes), access to large computer clusters, particularly those with larger amounts of shared memory, is recommended. PMID:23144540
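The connectivity criterion used in this comparison can be sketched as follows. The sketch assumes the unsigned WGCNA-style adjacency a_ij = |cor(x_i, x_j)|^beta with connectivity k_i equal to the sum of a_ij over j != i; the soft-threshold power beta = 6 is an assumed default, the helper `rank_shift` is a hypothetical stand-in for the paper's rank comparison, and PCIT itself is not implemented here.

```python
import numpy as np

def connectivity(expr, beta=6):
    """Connectivity per gene from a soft-thresholded correlation network.

    expr has shape (samples, genes). Returns k_i = sum_{j != i} |cor(i, j)| ** beta.
    """
    corr = np.corrcoef(expr, rowvar=False)
    adjacency = np.abs(corr) ** beta
    np.fill_diagonal(adjacency, 0.0)  # a gene does not count as its own neighbour
    return adjacency.sum(axis=1)

def rank_shift(k_a, k_b):
    """Difference in connectivity rank (0 = most connected) between two networks."""
    rank_a = np.argsort(np.argsort(-k_a))
    rank_b = np.argsort(np.argsort(-k_b))
    return rank_a - rank_b

# Hypothetical toy data: 4 samples x 3 genes; genes 0 and 1 are perfectly correlated.
expr_toy = np.array([[1.0, 2.0, 1.0],
                     [2.0, 4.0, 0.0],
                     [3.0, 6.0, 5.0],
                     [4.0, 8.0, 2.0]])
k_toy = connectivity(expr_toy)
shift_toy = rank_shift(np.array([3.0, 2.0, 1.0]), np.array([1.0, 2.0, 3.0]))
```

Genes with a large absolute rank shift between two construction methods would be candidates for the "highly differentially ranked" nodes the abstract describes, under whatever cutoff the analyst chooses.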
Song, Hyun-Seob; McClure, Ryan S.; Bernstein, Hans C.; ...
2015-03-27
Cyanobacteria dynamically relay environmental inputs to intracellular adaptations through a coordinated adjustment of photosynthetic efficiency and carbon processing rates. The output of such adaptations is reflected through changes in transcriptional patterns and metabolic flux distributions that ultimately define growth strategy. To address interrelationships between metabolism and regulation, we performed integrative analyses of metabolic and gene co-expression networks in a model cyanobacterium, Synechococcus sp. PCC 7002. Centrality analyses using the gene co-expression network identified a set of key genes, which were defined here as ‘topologically important.’ Parallel in silico gene knock-out simulations, using the genome-scale metabolic network, classified what we termed as ‘functionally important’ genes, deletion of which affected growth or metabolism. A strong positive correlation was observed between topologically and functionally important genes. Functionally important genes exhibited variable levels of topological centrality; however, the majority of topologically central genes were found to be functionally essential for growth. Subsequent functional enrichment analysis revealed that both functionally and topologically important genes in Synechococcus sp. PCC 7002 are predominantly associated with translation and energy metabolism, two cellular processes critical for growth. This research demonstrates how synergistic network-level analyses can be used for reconciliation of metabolic and gene expression data to uncover fundamental biological principles.
Duan, Fenghai; Xu, Ye
2017-01-01
This study analyzed a microarray experiment to identify genes whose expression varies after the diagnosis of breast cancer. A total of 44 928 probe sets in an Affymetrix microarray dataset, publicly available on Gene Expression Omnibus, from 249 patients with breast cancer were analyzed by nonparametric multivariate adaptive splines. Then, the identified genes with turning points were grouped by K-means clustering, and their network relationships were subsequently analyzed by Ingenuity Pathway Analysis. In total, 1640 probe sets (genes) were reliably identified as having turning points along the age at diagnosis in their expression profiles, of which 927 were expressed at lower levels after the turning points and 713 at higher levels. K-means clustered them into 3 groups with turning points centering at 54, 62.5, and 72, respectively. The pathway analysis showed that the identified genes were actively involved in various cancer-related functions or networks. In this article, we applied the nonparametric multivariate adaptive splines method to a publicly available gene expression dataset and successfully identified genes with expression varying before and after breast cancer diagnosis.
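The grouping step in the abstract above (K-means on turning points) can be illustrated with a one-dimensional sketch. The turning-point ages and the starting centers below are invented for the example and are not the study's data; the spline fitting that would produce real turning points is not shown.

```python
def kmeans_1d(values, centers, iterations=100):
    """Plain Lloyd's algorithm on scalar values with user-supplied starting centers."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: each center moves to the mean of its group (unchanged if empty).
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return centers, groups

# Hypothetical turning-point ages (years) for nine genes.
turning_points = [52, 54, 56, 61, 62, 64, 70, 72, 74]
centers, groups = kmeans_1d(turning_points, centers=[50, 60, 75])
print(centers)
print(groups)
```

Fixed starting centers keep the sketch deterministic; practical K-means implementations usually use several random restarts instead.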
Rubel, Cory A; Wu, San-Pin; Lin, Lin; Wang, Tianyuan; Lanz, Rainer B; Li, Xilong; Kommagani, Ramakrishna; Franco, Heather L; Camper, Sally A; Tong, Qiang; Jeong, Jae-Wook; Lydon, John P; DeMayo, Francesco J
2016-10-25
Altered progesterone responsiveness leads to female infertility and cancer, but underlying mechanisms remain unclear. Mice with uterine-specific ablation of GATA binding protein 2 (Gata2) are infertile, showing failures in embryo implantation and endometrial decidualization as well as uninhibited estrogen signaling. Gata2 deficiency results in reduced progesterone receptor (PGR) expression and attenuated progesterone signaling, as evidenced by genome-wide expression profiling and chromatin immunoprecipitation. GATA2 not only occupies and promotes expression of the Pgr gene but also regulates downstream progesterone responsive genes in conjunction with the PGR. Additionally, Gata2 knockout uteri exhibit abnormal luminal epithelia with ectopic TRP63 expressing squamous cells and a cancer-related molecular profile in a progesterone-independent manner. Lastly, we found a conserved GATA2-PGR regulatory network in both humans and mice based on gene signature and path analyses using gene expression profiles of human endometrial tissues. In conclusion, uterine Gata2 regulates a key regulatory network of gene expression for progesterone signaling at the early pregnancy stage. Published by Elsevier Inc.
Nonlinear Dynamics in Gene Regulation Promote Robustness and Evolvability of Gene Expression Levels.
Steinacher, Arno; Bates, Declan G; Akman, Ozgur E; Soyer, Orkun S
2016-01-01
Cellular phenotypes underpinned by regulatory networks need to respond to evolutionary pressures to allow adaptation, but at the same time be robust to perturbations. This creates a conflict in which mutations affecting regulatory networks must both generate variance and be tolerated at the phenotype level. Here, we perform mathematical analyses and simulations of regulatory networks to better understand the potential trade-off between robustness and evolvability. Examining the phenotypic effects of mutations, we find an inverse correlation between robustness and evolvability that breaks only with nonlinearity in the network dynamics, through the creation of regions presenting sudden changes in phenotype with small changes in genotype. For genotypes embedding low levels of nonlinearity, robustness and evolvability correlate negatively and almost perfectly. By contrast, genotypes embedding nonlinear dynamics allow expression levels to be robust to small perturbations, while generating high diversity (evolvability) under larger perturbations. Thus, nonlinearity breaks the robustness-evolvability trade-off in gene expression levels by allowing disparate responses to different mutations. Using analytical derivations of robustness and system sensitivity, we show that these findings extend to a large class of gene regulatory network architectures and also hold for experimentally observed parameter regimes. Further, the effect of nonlinearity on the robustness-evolvability trade-off is ensured as long as key parameters of the system display specific relations irrespective of their absolute values. We find that within this parameter regime genotypes display low and noisy expression levels.
Our results provide a possible solution to the robustness-evolvability trade-off, suggest an explanation for the ubiquity of nonlinear dynamics in gene expression networks, and generate useful guidelines for the design of synthetic gene circuits.
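The qualitative claim in the abstract above (nonlinear regulation is robust to small perturbations yet responsive to large ones) can be illustrated with a toy calculation. The abstract does not state the model equations, so the Hill function, the threshold `k`, the exponents, and the perturbation sizes used here are all assumptions chosen for illustration only.

```python
def hill(x, k=1.0, n=1.0):
    """Hill activation: expression level as a function of regulator input x."""
    return x**n / (k**n + x**n)

def response_change(x, factor, n):
    """Absolute change in expression when the input x is multiplied by factor."""
    return abs(hill(x * factor, n=n) - hill(x, n=n))

x0 = 2.0                         # operating point above the threshold k = 1
small, large = 0.9, 0.25         # hypothetical small and large perturbations

for n in (1, 8):                 # n = 1: weakly nonlinear; n = 8: strongly nonlinear
    print(n, response_change(x0, small, n), response_change(x0, large, n))
```

With the steeper response (n = 8), the small perturbation changes the output less than in the graded case (n = 1), while the large perturbation, which crosses the threshold, changes it more. This is only a single-gene caricature of the trade-off discussed in the abstract.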
Wang, Ning; Xu, Zhi-Wen; Wang, Kun-Hao
2014-01-01
MicroRNAs (miRNAs) are small non-coding RNA molecules found in multicellular eukaryotes that are implicated in the development of cancer, including cutaneous squamous cell carcinoma (cSCC). Expression is controlled by transcription factors (TFs) that bind to specific DNA sequences, thereby controlling the flow (or transcription) of genetic information from DNA to messenger RNA. These interactions form biological signal control networks. Molecular components involved in cSCC were assembled here at three levels: abnormally expressed, related and global. Networks at these three levels were constructed from the corresponding biological factors in terms of interactions between miRNAs and target genes, TFs and miRNAs, and host genes and miRNAs. Up- or down-regulation or mutation of the factors was considered in the context of this regulation, and significant patterns were extracted. Participants in the networks were evaluated based on their expression and their regulation of other factors. Sub-networks centered on two core TFs, TP53 and EIF2C2, were identified. These share self-adapting feedback regulation in which a mutual restraint exists. Up- or down-regulation of certain genes and miRNAs is discussed. Some findings, for example the expression of MMP13, were in line with expectation, while others, including FGFR3, showed unexpected behavior that needs further investigation. The present research suggests that dozens of components, including miRNAs, TFs, target genes and host genes, unite as networks through their regulation to function systematically in human cSCC. Networks built from the currently available sources provide critical signal controlling pathways and frequent patterns. Inappropriate controlling signal flow from abnormal expression of key TFs may push the system into an uncontrollable situation and therefore contribute to cSCC development.
Wei, Jiangyong; Hu, Xiaohua; Zou, Xiufen; Tian, Tianhai
2017-12-28
Recent advances in omics technologies have raised great opportunities to study large-scale regulatory networks inside the cell. In addition, single-cell experiments have measured the gene and protein activities in a large number of cells under the same experimental conditions. However, a significant challenge in computational biology and bioinformatics is how to derive quantitative information from the single-cell observations and how to develop sophisticated mathematical models to describe the dynamic properties of regulatory networks using the derived quantitative information. This work designs an integrated approach to reverse-engineer gene networks regulating early blood development based on single-cell experimental observations. The wanderlust algorithm is initially used to develop the pseudo-trajectory for the activities of a number of genes. Since the gene expression data in the developed pseudo-trajectory show large fluctuations, we then use Gaussian process regression methods to smooth the gene expression data in order to obtain pseudo-trajectories with far smaller fluctuations. The proposed integrated framework consists of both bioinformatics algorithms to reconstruct the regulatory network and mathematical models using differential equations to describe the dynamics of gene expression. The developed approach is applied to study the network regulating early blood cell development. A graphic model is constructed for a regulatory network with forty genes and a dynamic model using differential equations is developed for a network of nine genes. Numerical results suggest that the proposed model is able to match experimental data very well. We also examine the networks with more regulatory relations, and numerical results show that more regulations may exist. We test the possibility of auto-regulation, but numerical simulations do not support positive auto-regulation.
In addition, robustness is used as an important additional criterion to select candidate networks. The results of this work show that the developed approach is an efficient and effective method to reverse-engineer gene networks using single-cell experimental observations.
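The smoothing step mentioned in the abstract above (Gaussian process regression applied to a noisy pseudo-trajectory) can be sketched as follows. This is a bare-bones illustration under stated assumptions: a squared-exponential kernel, fixed hyperparameters (`length_scale`, `noise`) rather than fitted ones, and a synthetic trajectory. It does not reproduce the wanderlust ordering or the authors' actual settings.

```python
import math

def rbf(a, b, length_scale):
    """Squared-exponential covariance between two pseudo-time points."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def gp_smooth(times, values, length_scale=2.0, noise=0.5):
    """GP posterior mean at the observed times, after centering the data."""
    n = len(times)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    k = [[rbf(times[i], times[j], length_scale) for j in range(n)] for i in range(n)]
    # Add the observation-noise variance on the diagonal.
    ky = [[k[i][j] + (noise**2 if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = solve(ky, centered)
    return [mean + sum(k[i][j] * alpha[j] for j in range(n)) for i in range(n)]

def roughness(series):
    """Sum of squared second differences; smaller means smoother."""
    return sum((series[i + 1] - 2 * series[i] + series[i - 1]) ** 2
               for i in range(1, len(series) - 1))

# Synthetic trajectory: a linear trend plus alternating fluctuations.
pseudo_time = list(range(10))
noisy = [0.5 * t + (0.4 if t % 2 == 0 else -0.4) for t in pseudo_time]
smoothed = gp_smooth(pseudo_time, noisy)
print(roughness(noisy), roughness(smoothed))
```

In practice the kernel hyperparameters would normally be estimated from the data, for example by maximizing the marginal likelihood, rather than fixed by hand as here.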
Gene Expression Profiling of Gastric Cancer
Marimuthu, Arivusudar; Jacob, Harrys K.C.; Jakharia, Aniruddha; Subbannayya, Yashwanth; Keerthikumar, Shivakumar; Kashyap, Manoj Kumar; Goel, Renu; Balakrishnan, Lavanya; Dwivedi, Sutopa; Pathare, Swapnali; Dikshit, Jyoti Bajpai; Maharudraiah, Jagadeesha; Singh, Sujay; Sameer Kumar, Ghantasala S; Vijayakumar, M.; Veerendra Kumar, Kariyanakatte Veeraiah; Premalatha, Chennagiri Shrinivasamurthy; Tata, Pramila; Hariharan, Ramesh; Roa, Juan Carlos; Prasad, T.S.K; Chaerkady, Raghothama; Kumar, Rekha Vijay; Pandey, Akhilesh
2015-01-01
Gastric cancer is the second leading cause of cancer death worldwide, both in men and women. A genome-wide gene expression analysis was carried out to identify differentially expressed genes in gastric adenocarcinoma tissues as compared to adjacent normal tissues. We used Agilent’s whole human genome oligonucleotide microarray platform representing ~41,000 genes to carry out gene expression analysis. Two-color microarray analysis was employed to directly compare the expression of genes between tumor and normal tissues. Through this approach, we identified several previously known candidate genes along with a number of novel candidate genes in gastric cancer. Testican-1 (SPOCK1) was one of the novel molecules that was 10-fold upregulated in tumors. Using tissue microarrays, we validated the expression of testican-1 by immunohistochemical staining. It was overexpressed in 56% (160/282) of the cases tested. Pathway analysis identified several networks of interacting genes, with SPOCK1 appearing in one of the topmost networks. By gene enrichment analysis, we identified several genes involved in cell adhesion and cell proliferation to be significantly upregulated while those corresponding to metabolic pathways were significantly downregulated. The differentially expressed genes identified in this study are candidate biomarkers for gastric adenocarcinoma. PMID:27030788
Calabrese, Gina; Mesner, Larry D.; Foley, Patricia L.; Rosen, Clifford J.; Farber, Charles R.
2016-01-01
The postmenopausal period in women is associated with decreased circulating estrogen levels, which accelerate bone loss and increase the risk of fracture. Here, we gained novel insight into the molecular mechanisms mediating bone loss in ovariectomized (OVX) mice, a model of human menopause, using co-expression network analysis. Specifically, we generated a co-expression network consisting of 53 gene modules using expression profiles from intact and OVX mice from a panel of inbred strains. The expression of four modules was altered by OVX, including module 23 whose expression was decreased by OVX across all strains. Module 23 was enriched for genes involved in the response to oxidative stress, a process known to be involved in OVX-induced bone loss. Additionally, module 23 homologs were co-expressed in human bone marrow. Alpha synuclein (Snca) was one of the most highly connected “hub” genes in module 23. We characterized mice deficient in Snca and observed a 40% reduction in OVX-induced bone loss. Furthermore, protection was associated with the altered expression of specific network modules, including module 23. In summary, the results of this study suggest that Snca regulates bone network homeostasis and ovariectomy-induced bone loss. PMID:27378017
Jiang, Peng; Scarpa, Joseph R.; Fitzpatrick, Karrie; Losic, Bojan; Gao, Vance D.; Hao, Ke; Summa, Keith C.; Yang, He S.; Zhang, Bin; Allada, Ravi; Vitaterna, Martha H.; Turek, Fred W.; Kasarskis, Andrew
2016-01-01
SUMMARY Sleep dysfunction and stress susceptibility are co-morbid complex traits, which often precede and predispose patients to a variety of neuropsychiatric diseases. Here, we demonstrate multi-level organizations of genetic landscape, candidate genes, and molecular networks associated with 328 stress and sleep traits in a chronically stressed population of 338 (C57BL/6J×A/J) F2 mice. We constructed striatal gene co-expression networks, revealing functionally and cell-type specific gene co-regulations important for stress and sleep. Using a composite ranking system, we identified network modules most relevant for 15 independent phenotypic categories, highlighting a mitochondria/synaptic module that links sleep and stress. The key network regulators of this module are overrepresented with genes implicated in neuropsychiatric diseases. Our work suggests that the interplay between sleep, stress, and neuropathology emerges from genetic influences on gene expression and their collective organization through complex molecular networks, providing a framework to interrogate the mechanisms underlying sleep, stress susceptibility, and related neuropsychiatric disorders. PMID:25921536
Visual gene-network analysis reveals the cancer gene co-expression in human endometrial cancer
2014-01-01
Background Endometrial cancers (ECs) are the most common form of gynecologic malignancy. Recent studies have reported that ECs reveal distinct markers for molecular pathogenesis, which in turn is linked to the various histological types of ECs. To understand further the molecular events contributing to ECs and endometrial tumorigenesis in general, a more precise identification of cancer-associated molecules and signaling networks would be useful for the detection and monitoring of malignancy, improving clinical cancer therapy, and personalization of treatments. Results ECs-specific gene co-expression networks were constructed by differential expression analysis and weighted gene co-expression network analysis (WGCNA). Important pathways and putative cancer hub genes contributing to tumorigenesis of ECs were identified. An elastic-net regularized classification model was built using the cancer hub gene signatures to predict the phenotypic characteristics of ECs. The 19 cancer hub gene signatures had high predictive power to distinguish among three key principal features of ECs: grade, type, and stage. Intriguingly, these hub gene networks seem to contribute to ECs progression and malignancy via cell-cycle regulation, antigen processing and the citric acid (TCA) cycle. Conclusions The results of this study provide a powerful biomarker discovery platform to better understand the progression of ECs and to uncover potential therapeutic targets in the treatment of ECs. This information might lead to improved monitoring of ECs and a resulting improvement in the treatment of ECs, the 4th most common cancer in women. PMID:24758163
IL-32 is a molecular marker of a host defense network in human tuberculosis
Montoya, Dennis; Inkeles, Megan S.; Liu, Phillip T.; Realegeno, Susan; Teles, Rosane M. B.; Vaidya, Poorva; Munoz, Marcos A.; Schenk, Mirjam; Swindell, William R.; Chun, Rene; Zavala, Kathryn; Hewison, Martin; Adams, John S.; Horvath, Steve; Pellegrini, Matteo; Bloom, Barry R.; Modlin, Robert L.
2014-01-01
Tuberculosis is a leading cause of infectious disease–related death worldwide; however, only 10% of people infected with Mycobacterium tuberculosis develop disease. Factors that contribute to protection could prove to be promising targets for M. tuberculosis therapies. Analysis of peripheral blood gene expression profiles of active tuberculosis patients has identified correlates of risk for disease or pathogenesis. We sought to identify potential human candidate markers of host defense by studying gene expression profiles of macrophages, cells that, upon infection by M. tuberculosis, can mount an antimicrobial response. Weighted gene coexpression network analysis revealed an association between the cytokine interleukin-32 (IL-32) and the vitamin D antimicrobial pathway in a network of interferon-γ– and IL-15–induced “defense response” genes. IL-32 induced the vitamin D–dependent antimicrobial peptides cathelicidin and DEFB4 and generated antimicrobial activity in vitro, dependent on the presence of adequate 25-hydroxyvitamin D. In addition, the IL-15–induced defense response macrophage gene network was integrated with ranked pairwise comparisons of gene expression from five different clinical data sets of latent compared with active tuberculosis or healthy controls and a coexpression network derived from gene expression in patients with tuberculosis undergoing chemotherapy. Together, these analyses identified eight common genes, including IL-32, as molecular markers of latent tuberculosis and the IL-15–induced gene network. As maintaining M. tuberculosis in a latent state and preventing transition to active disease may represent a form of host resistance, these results identify IL-32 as one functional marker and potential correlate of protection against active tuberculosis. PMID:25143364
A Risk Stratification Model for Lung Cancer Based on Gene Coexpression Network and Deep Learning
2018-01-01
Risk stratification models for lung cancer based on gene expression profiles are of great interest. Instead of previous models based on individual prognostic genes, we aimed to develop a novel system-level risk stratification model for lung adenocarcinoma based on gene coexpression networks. Using multiple microarray datasets, gene coexpression network analysis was performed to identify survival-related networks. A deep learning based risk stratification model was constructed with representative genes of these networks. The model was validated in two test sets. Survival analysis was performed using the output of the model to evaluate whether it could predict patients' survival independent of clinicopathological variables. Five networks were significantly associated with patients' survival. Considering prognostic significance and representativeness, genes of the two survival-related networks were selected for input of the model. The output of the model was significantly associated with patients' survival in the training set and the two test sets (p < 0.00001, p < 0.0001 and p = 0.02 for training and test sets 1 and 2, resp.). In multivariate analyses, the model was associated with patients' prognosis independent of other clinicopathological features. Our study presents a new perspective on incorporating gene coexpression networks into the gene expression signature and clinical application of deep learning in genomic data science for prognosis prediction. PMID:29581968
Using genetic markers to orient the edges in quantitative trait networks: the NEO software.
Aten, Jason E; Fuller, Tova F; Lusis, Aldons J; Horvath, Steve
2008-04-15
Systems genetic studies have been used to identify genetic loci that affect transcript abundances and clinical traits such as body weight. The pairwise correlations between gene expression traits and/or clinical traits can be used to define undirected trait networks. Several authors have argued that genetic markers (e.g., expression quantitative trait loci, eQTLs) can serve as causal anchors for orienting the edges of a trait network. The availability of hundreds of thousands of genetic markers poses new challenges: how to relate (anchor) traits to multiple genetic markers, how to score the genetic evidence in favor of an edge orientation, and how to weigh the information from multiple markers. We develop and implement Network Edge Orienting (NEO) methods and software that address the challenges of inferring unconfounded and directed gene networks from microarray-derived gene expression data by integrating mRNA levels with genetic marker data and Structural Equation Model (SEM) comparisons. The NEO software implements several manual and automatic methods for incorporating genetic information to anchor traits. The networks are oriented by considering each edge separately, thus reducing error propagation. To summarize the genetic evidence in favor of a given edge orientation, we propose Local SEM-based Edge Orienting (LEO) scores that compare the fit of several competing causal graphs. SEM fitting indices allow the user to assess local and overall model fit. The NEO software allows the user to carry out a robustness analysis with regard to genetic marker selection. We demonstrate the utility of NEO by recovering known causal relationships in the sterol homeostasis pathway using liver gene expression data from an F2 mouse cross. Further, we use NEO to study the relationship between a disease gene and a biologically important gene co-expression module in liver tissue.
The NEO software can be used to orient the edges of gene co-expression networks or quantitative trait networks if the edges can be anchored to genetic marker data. R software tutorials, data, and supplementary material can be downloaded from: http://www.genetics.ucla.edu/labs/horvath/aten/NEO.
Govender, Nisha; Senan, Siju; Mohamed-Hussein, Zeti-Azura; Wickneswari, Ratnam
2018-06-15
The plant shoot system consists of reproductive organs such as inflorescences, buds and fruits, and the vegetative leaves and stems. In this study, the reproductive part of the Jatropha curcas shoot system, which includes the aerial shoots, shoots bearing the inflorescence and the inflorescence, was investigated with regard to gene-to-gene interactions underpinning yield-related biological processes. RNA-seq of shoot tissues performed on an Illumina HiSeq 2500 platform generated 18 transcriptomes. Using a reference genome-based mapping approach, a total of 64 361 genes were identified in all samples and the data were annotated against the non-redundant database with the BLAST2GO Pro suite. After removing outlier genes and samples, a total of 12 734 genes across 17 samples were subjected to gene co-expression network construction using petal, an R library. A gene co-expression network model built with scale-free and small-world properties extracted four vicinity networks (VNs) with putative involvement in yield-related biological processes as follows: heat stress tolerance, floral and shoot meristem differentiation, biosynthesis of chlorophyll molecules and laticifers, cell wall metabolism and epigenetic regulation. Our VNs revealed putative key players that could be adopted in breeding strategies for J. curcas shoot system improvement.
Construction of diagnosis system and gene regulatory networks based on microarray analysis.
Hong, Chun-Fu; Chen, Ying-Chen; Chen, Wei-Chun; Tu, Keng-Chang; Tsai, Meng-Hsiun; Chan, Yung-Kuan; Yu, Shyr Shen
2018-05-01
A microarray analysis generally contains expression data for thousands of genes, but most of them are irrelevant to the disease of interest, which complicates the analysis of genes related to specific diseases. Therefore, filtering out a few essential genes as well as their regulatory networks is critical, so that a disease can be diagnosed from the expression profiles of a few critical genes. In this study, a target gene screening (TGS) system, which is a microarray-based information system that integrates F-statistics, pattern recognition matching, a two-layer K-means classifier, a Parameter Detection Genetic Algorithm (PDGA), a genetic-based gene selector (GBG selector) and the association rule, was developed to screen out a small subset of genes that can discriminate malignant stages of cancers. During the first stage, F-statistics, pattern recognition matching, and a two-layer K-means classifier were applied in the system to filter out the 20 critical genes most relevant to ovarian cancer from 9600 genes, and the PDGA was used to decide the fittest values of the parameters for these critical genes. Among the 20 critical genes, 15 are associated with cancer progression. In the second stage, we further employed a GBG selector and the association rule to screen out seven target gene sets, each with only four to six genes, and each of which can precisely identify the malignancy stage of ovarian cancer based on their expression profiles. We further deduced the gene regulatory networks of the 20 critical genes by applying the Pearson correlation coefficient to evaluate the correlation between the expression of each gene at the same stages and at different stages. Correlations between gene pairs were calculated, and then three regulatory networks were deduced. These correlations were further confirmed by Ingenuity Pathway Analysis.
The prognostic significances of the genes identified via regulatory networks were examined using online tools, and most represented biomarker candidates. In summary, our proposed system provides a new strategy to identify critical genes or biomarkers, as well as their regulatory networks, from microarray data. Copyright © 2018. Published by Elsevier Inc.
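The correlation step in the abstract above (Pearson correlation between gene pairs, used to deduce network edges) can be sketched as follows. The expression profiles, gene names, and the cutoff of 0.9 are hypothetical; the abstract does not report the threshold the authors used, and none of the other TGS components are represented here.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_edges(profiles, threshold):
    """Return (gene, gene, r) for every pair with |r| at or above the threshold."""
    genes = sorted(profiles)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(profiles[g], profiles[h])
            if abs(r) >= threshold:
                edges.append((g, h, round(r, 3)))
    return edges

# Invented expression values across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 3.0],
}
edges = correlation_edges(profiles, threshold=0.9)
for edge in edges:
    print(edge)
```

Correlation-derived edges are undirected and do not by themselves indicate regulation, which is presumably why the abstract reports a separate confirmation step.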
Bickel, David R.; Montazeri, Zahra; Hsieh, Pei-Chun; Beatty, Mary; Lawit, Shai J.; Bate, Nicholas J.
2009-01-01
Motivation: Measurements of gene expression over time enable the reconstruction of transcriptional networks. However, Bayesian networks and many other current reconstruction methods rely on assumptions that conflict with the differential equations that describe transcriptional kinetics. Practical approximations of kinetic models would enable inferring causal relationships between genes from expression data of microarray, tag-based and conventional platforms, but conclusions are sensitive to the assumptions made. Results: The representation of a sufficiently large portion of genome enables computation of an upper bound on how much confidence one may place in influences between genes on the basis of expression data. Information about which genes encode transcription factors is not necessary but may be incorporated if available. The methodology is generalized to cover cases in which expression measurements are missing for many of the genes that might control the transcription of the genes of interest. The assumption that the gene expression level is roughly proportional to the rate of translation led to better empirical performance than did either the assumption that the gene expression level is roughly proportional to the protein level or the Bayesian model average of both assumptions. Availability: http://www.oisb.ca points to R code implementing the methods (R Development Core Team 2004). Contact: dbickel@uottawa.ca Supplementary information: http://www.davidbickel.com PMID:19218351
Xiang, Bo; Yu, Minglan; Liang, Xuemei; Lei, Wei; Huang, Chaohua; Chen, Jing; He, Wenying; Zhang, Tao; Li, Tao; Liu, Kezhi
2017-12-10
To explore common biological pathways for attention deficit hyperactivity disorder (ADHD) and low birth weight (LBW). The i-Gsea4GwasV2 software was used to analyze the results of genome-wide association analysis (GWAS) for LBW (pathways were derived from Reactome), and nominally significant (P < 0.05, FDR < 0.25) pathways were tested for replication in ADHD. Significant pathways were analyzed with DAPPLE and Reactome FI software to identify genes involved in such pathways, with each cluster enriched with the gene ontology (GO). The Centiscape2.0 software was used to calculate the degree of genetic networks and the betweenness value to explore the core node (gene). Weighted gene co-expression network analysis (WGCNA) was then used to explore the co-expression of genes in these pathways. With gene expression data derived from BrainSpan, GO enrichment was carried out for each gene module. Eleven significant biological pathways were identified in association with LBW, among which two (Selenoamino acid metabolism and Diseases associated with glycosaminoglycan metabolism) were replicated during subsequent ADHD analysis. Network analysis of 130 genes in these pathways revealed that some of the sub-networks are related to morphology of cerebellum, development of hippocampus, and plasticity of synaptic structure. Upon co-expression network analysis, 120 genes passed the quality control and were found to express in 3 gene modules. These modules are mainly related to the regulation of synaptic structure and activity regulation. ADHD and LBW share some biological regulation processes. Anomalies of such processes may predispose to ADHD.
Systems Genetic Analysis of Osteoblast-Lineage Cells
Calabrese, Gina; Bennett, Brian J.; Orozco, Luz; Kang, Hyun M.; Eskin, Eleazar; Dombret, Carlos; De Backer, Olivier; Lusis, Aldons J.; Farber, Charles R.
2012-01-01
The osteoblast-lineage consists of cells at various stages of maturation that are essential for skeletal development, growth, and maintenance. Over the past decade, many of the signaling cascades that regulate this lineage have been elucidated; however, little is known of the networks that coordinate, modulate, and transmit these signals. Here, we identify a gene network specific to the osteoblast-lineage through the reconstruction of a bone co-expression network using microarray profiles collected on 96 Hybrid Mouse Diversity Panel (HMDP) inbred strains. Of the 21 modules that comprised the bone network, module 9 (M9) contained genes that were highly correlated with prototypical osteoblast marker genes and were more highly expressed in osteoblasts relative to other bone cells. In addition, the M9 contained many of the key genes that define the osteoblast-lineage, which together suggested that it was specific to this lineage. To use the M9 to identify novel osteoblast genes and highlight its biological relevance, we knocked-down the expression of its two most connected “hub” genes, Maged1 and Pard6g. Their perturbation altered both osteoblast proliferation and differentiation. Furthermore, we demonstrated that mice deficient in Maged1 had decreased bone mineral density (BMD). It was also discovered that a local expression quantitative trait locus (eQTL) regulating the Wnt signaling antagonist Sfrp1 was a key driver of the M9. We also show that the M9 is associated with BMD in the HMDP and is enriched for genes implicated in the regulation of human BMD through genome-wide association studies. In conclusion, we have identified a physiologically relevant gene network and used it to discover novel genes and regulatory mechanisms involved in the function of osteoblast-lineage cells. Our results highlight the power of harnessing natural genetic variation to generate co-expression networks that can be used to gain insight into the function of specific cell-types.
PMID:23300464
2010-01-01
Background Discovering novel disease genes is still challenging for diseases for which no prior knowledge - such as known disease genes or disease-related pathways - is available. Performing genetic studies frequently results in large lists of candidate genes of which only few can be followed up for further investigation. We have recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge by experimental data of differential gene expression between affected and healthy individuals. To improve the performance of our prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network. Results We have proposed three strategies scoring disease candidate genes relying on network-based machine learning approaches, such as kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison purposes, a local measure based on the expression of the direct neighbors is also computed. We have benchmarked these strategies on 40 publicly available knockout experiments in mice, and performance was assessed against results obtained using a standard procedure in genetics that ranks candidate genes based solely on their differential expression levels (Simple Expression Ranking). Our results showed that our four strategies could outperform this standard procedure and that the best results were obtained using the Heat Kernel Diffusion Ranking leading to an average ranking position of 8 out of 100 genes, an AUC value of 92.3% and an error reduction of 52.8% relative to the standard procedure approach which ranked the knockout gene on average at position 17 with an AUC value of 83.7%. 
Conclusion In this study we could identify promising candidate genes using network based machine learning approaches even if no knowledge is available about the disease or phenotype. PMID:20840752
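The idea of scoring a candidate gene by the differential expression of its network neighbourhood can be sketched with a heat kernel on a toy graph. This is only a hedged illustration, not the authors' Heat Kernel Diffusion Ranking implementation: the function name `heat_kernel_scores`, the diffusion parameter `beta = 0.5`, the path-shaped network and the expression values are all assumptions. The kernel exp(-beta L), with L the graph Laplacian, spreads each gene's differential-expression signal to nearby genes.

```python
import numpy as np

def heat_kernel_scores(adj, diff_expr, beta=0.5):
    """Diffuse differential-expression values over a network.

    adj: symmetric adjacency matrix; diff_expr: one value per gene.
    beta is a hypothetical diffusion strength chosen for illustration.
    """
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj                  # graph Laplacian L = D - A
    evals, evecs = np.linalg.eigh(lap)                    # valid because L is symmetric
    kernel = (evecs * np.exp(-beta * evals)) @ evecs.T    # matrix exponential exp(-beta L)
    return kernel @ np.asarray(diff_expr, dtype=float)

# Toy path network 0 - 1 - 2 - 3 - 4.
adj = np.zeros((5, 5))
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    adj[a, b] = adj[b, a] = 1.0

# Genes 0 and 2 are strongly differentially expressed; candidates 1 and 4 are not.
diff_expr = [3.0, 0.0, 3.0, 0.0, 0.0]
scores = heat_kernel_scores(adj, diff_expr)
ranking = sorted(range(5), key=lambda g: -scores[g])      # best-scoring gene first
```

In this toy case candidate gene 1, which sits between the two differentially expressed genes, receives a higher score than candidate gene 4 at the far end of the path, even though neither shows differential expression itself.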
Detection of Significant Pneumococcal Meningitis Biomarkers by Ego Network.
Wang, Qian; Lou, Zhifeng; Zhai, Liansuo; Zhao, Haibin
2017-06-01
To identify significant biomarkers for detection of pneumococcal meningitis based on ego network. Based on the gene expression data of pneumococcal meningitis and global protein-protein interactions (PPIs) data recruited from open access databases, the authors constructed a differential co-expression network (DCN) to identify pneumococcal meningitis biomarkers in a network view. Here EgoNet algorithm was employed to screen the significant ego networks that could accurately distinguish pneumococcal meningitis from healthy controls, by sequentially seeking ego genes, searching candidate ego networks, refinement of candidate ego networks and significance analysis to identify ego networks. Finally, the functional inference of the ego networks was performed to identify significant pathways for pneumococcal meningitis. By differential co-expression analysis, the authors constructed the DCN that covered 1809 genes and 3689 interactions. From the DCN, a total of 90 ego genes were identified. Starting from these ego genes, three significant ego networks (Module 19, Module 70 and Module 71) that could predict clinical outcomes for pneumococcal meningitis were identified by EgoNet algorithm, and the corresponding ego genes were GMNN, MAD2L1 and TPX2, respectively. Pathway analysis showed that these three ego networks were related to CDT1 association with the CDC6:ORC:origin complex, inactivation of APC/C via direct inhibition of the APC/C complex pathway, and DNA strand elongation, respectively. The authors successfully screened three significant ego modules which could accurately predict the clinical outcomes for pneumococcal meningitis and might play important roles in host response to pathogen infection in pneumococcal meningitis.
2012-01-01
Background We have shown previously that pan-HDAC inhibitors (HDACIs) m-carboxycinnamic acid bis-hydroxamide (CBHA) and trichostatin A (TSA) attenuated cardiac hypertrophy in BALB/c mice by inducing hyper-acetylation of cardiac chromatin that was accompanied by suppression of pro-inflammatory gene networks. However, it was not feasible to determine the precise contribution of the myocytes- and non-myocytes to HDACI-induced gene expression in the intact heart. Therefore, the current study was undertaken with a primary goal of elucidating temporal changes in the transcriptomes of cardiac myocytes exposed to CBHA and TSA. Results We incubated H9c2 cardiac myocytes in growth medium containing either of the two HDACIs for 6h and 24h and analyzed changes in gene expression using Illumina microarrays. H9c2 cells exposed to TSA for 6h and 24h led to differential expression of 468 and 231 genes, respectively. In contrast, cardiac myocytes incubated with CBHA for 6h and 24h elicited differential expression of 768 and 999 genes, respectively. We analyzed CBHA- and TSA-induced differentially expressed genes by Ingenuity Pathway (IPA), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Core_TF programs and discovered that CBHA and TSA impinged on several common gene networks. Thus, both HDACIs induced a repertoire of signaling kinases (PTEN-PI3K-AKT and MAPK) and transcription factors (Myc, p53, NFkB and HNF4A) representing canonical TGFβ, TNF-α, IFNγ and IL-6 specific networks. An overrepresentation of E2F, AP2, EGR1 and SP1 specific motifs was also found in the promoters of the differentially expressed genes. Apparently, TSA elicited predominantly TGFβ- and TNF-α-intensive gene networks regardless of the duration of treatment. In contrast, CBHA elicited TNF-α and IFNγ specific networks at 6 h, followed by elicitation of IL-6 and IFNγ-centered gene networks at 24h. 
Conclusions Our data show that both CBHA and TSA induced similar, but not identical, time-dependent, gene networks in H9c2 cardiac myocytes. Initially, both HDACIs impinged on numerous genes associated with adipokine signaling, intracellular metabolism and energetics, and cell cycle. A continued exposure to either CBHA or TSA led to the emergence of a number of apoptosis- and inflammation-specific gene networks that were apparently suppressed by both HDACIs. Based on these data we posit that the anti-inflammatory and anti-proliferative actions of HDACIs are myocyte-intrinsic. These findings advance our understanding of the mechanisms of actions of HDACIs on cardiac myocytes and reveal potential signaling pathways that may be targeted therapeutically. PMID:23249388
Specht, Alicia T; Li, Jun
2017-03-01
To construct gene co-expression networks based on single-cell RNA-Sequencing data, we present an algorithm called LEAP, which utilizes the estimated pseudotime of the cells to find gene co-expression that involves time delay. R package LEAP available on CRAN. jun.li@nd.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
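A minimal sketch of co-expression with a time delay along pseudotime follows. It does not reproduce the LEAP package or its API: the function `max_lagged_correlation`, the lag range and the synthetic sine-shaped expression profiles are assumptions used only to show the general idea of sorting cells by pseudotime and searching for the lag that maximizes the absolute correlation.

```python
import numpy as np

def max_lagged_correlation(x, y, max_lag):
    """Return (r, lag) with the largest |Pearson r| when y trails x by 0..max_lag steps."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_r, best_lag = 0.0, 0
    for lag in range(max_lag + 1):
        a = x[: len(x) - lag]          # earlier part of gene x
        b = y[lag:]                    # gene y shifted later by `lag` cells
        r = np.corrcoef(a, b)[0, 1]
        if abs(r) > abs(best_r):
            best_r, best_lag = float(r), lag
    return best_r, best_lag

# Synthetic cells in arbitrary order, each with an estimated pseudotime.
rng = np.random.default_rng(0)
pseudotime = rng.permutation(40).astype(float)
gene_x = np.sin(0.3 * pseudotime)
gene_y = np.sin(0.3 * (pseudotime - 3.0))   # gene y follows gene x with a delay of 3 steps

order = np.argsort(pseudotime)              # sort cells along pseudotime first
best_r, best_lag = max_lagged_correlation(gene_x[order], gene_y[order], max_lag=6)
```

Because the lag search is directional, swapping the two genes generally gives a different result, which is what makes a lagged co-expression network directed rather than symmetric.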
Dai, Jiajuan; Wang, Xusheng; Chen, Ying; Wang, Xiaodong; Zhu, Jun; Lu, Lu
2009-11-01
Previous studies have revealed that the subunit alpha 2 (Gabra2) of the gamma-aminobutyric acid receptor plays a critical role in the stress response. However, little is known about the genetic regulatory network for Gabra2 and the stress response. We combined gene expression microarray analysis and quantitative trait loci (QTL) mapping to characterize the genetic regulatory network for Gabra2 expression in the hippocampus of BXD recombinant inbred (RI) mice. Our analysis found that the expression level of Gabra2 exhibited much variation in the hippocampus across the BXD RI strains and between the parental strains, C57BL/6J, and DBA/2J. Expression QTL (eQTL) mapping showed three microarray probe sets of Gabra2 to have highly significant linkage likelihood ratio statistic (LRS) scores. Gene co-regulatory network analysis showed that 10 genes, including Gria3, Chka, Drd3, Homer1, Grik2, Odz4, Prkag2, Grm5, Gabrb1, and Nlgn1 are directly or indirectly associated with stress responses. Eleven genes were implicated as Gabra2 downstream genes through mapping joint modulation. The genetical genomics approach demonstrates the importance and the potential power of the eQTL studies in identifying genetic regulatory networks that contribute to complex traits, such as stress responses.
Hodgins-Davis, Andrea; Adomas, Aleksandra B.; Warringer, Jonas; Townsend, Jeffrey P.
2012-01-01
Genetic variation for plastic phenotypes potentially contributes phenotypic variation to populations that can be selected during adaptation to novel ecological contexts. However, the basis and extent of plastic variation that manifests in diverse environments remains elusive. Here, we characterize copper reaction norms for mRNA abundance among five Saccharomyces cerevisiae strains to 1) describe population variation across the full range of ecologically relevant copper concentrations, from starvation to toxicity, and 2) to test the hypothesis that plastic networks exhibit increased population variation for gene expression. We find that although the vast majority of the variation is small in magnitude (considerably <2-fold), not just some, but most genes demonstrate variable expression across environments, across genetic backgrounds, or both. Plastically expressed genes included both genes regulated directly by copper-binding transcription factors Mac1 and Ace1 and genes indirectly responding to the downstream metabolic consequences of the copper gradient, particularly genes involved in copper, iron, and sulfur homeostasis. Copper-regulated gene networks exhibited more similar behavior within the population in environments where those networks have a large impact on fitness. Nevertheless, expression variation in genes like Cup1, important to surviving copper stress, was linked with variation in mitotic fitness and in the breadth of differential expression across the genome. By revealing a broader and deeper range of population variation, our results provide further evidence for the interconnectedness of genome-wide mRNA levels, their dependence on environmental context and genetic background, and the abundance of variation in gene expression that can contribute to future evolution. PMID:23019066
Canales, Javier; Moyano, Tomás C.; Villarroel, Eva; Gutiérrez, Rodrigo A.
2014-01-01
Nitrogen (N) is an essential macronutrient for plant growth and development. Plants adapt to changes in N availability partly by changes in global gene expression. We integrated publicly available root microarray data under contrasting nitrate conditions to identify new genes and functions important for adaptive nitrate responses in Arabidopsis thaliana roots. Overall, more than 2000 genes exhibited changes in expression in response to nitrate treatments in Arabidopsis thaliana root organs. Global regulation of gene expression by nitrate depends largely on the experimental context. However, despite significant differences from experiment to experiment in the identity of regulated genes, there is a robust nitrate response of specific biological functions. Integrative gene network analysis uncovered relationships between nitrate-responsive genes and 11 highly co-expressed gene clusters (modules). Four of these gene network modules have robust nitrate responsive functions such as transport, signaling, and metabolism. Network analysis hypothesized that G2-like transcription factors are key regulatory factors controlling transport and signaling functions. Our meta-analysis highlights the role of biological processes not studied before in the context of the nitrate response, such as root hair development, and provides testable hypotheses to advance our understanding of nitrate responses in plants. PMID:24570678
Differential Connectivity in Colorectal Cancer Gene Expression Network
Izadi, Fereshteh
2018-05-30
Colorectal cancer (CRC) is one of the challenging types of cancers; thus, exploring effective biomarkers related to colorectal cancer could lead to significant progress toward the treatment of this disease. In the present study, CRC gene expression datasets have been reanalyzed. Mutual differentially expressed genes across 294 normal mucosa and adjacent tumoral samples were then utilized in order to build two independent transcriptional regulatory networks. By analyzing the networks topologically, genes with differential global connectivity related to cancer state were determined for which the potential transcriptional regulators including transcription factors were identified. The majority of differentially connected genes (DCGs) were up-regulated in colorectal transcriptome experiments. Moreover, a number of these genes have been experimentally validated as cancer or CRC-associated genes. The DCGs, including GART, TGFB1, ITGA2, SLC16A5, SOX9, and MMP7, were investigated across 12 cancer types. Functional enrichment analysis followed by detailed data mining showed that these candidate genes could be related to CRC by mediating the metastatic cascade, in addition to sharing pathways with 12 cancer types by triggering inflammatory events. Our study uncovered correlated alterations in gene expression related to CRC susceptibility and progression, for which the potent candidate biomarkers could provide a link to disease.
Origins of extrinsic variability in eukaryotic gene expression
NASA Astrophysics Data System (ADS)
Volfson, Dmitri; Marciniak, Jennifer; Blake, William J.; Ostroff, Natalie; Tsimring, Lev S.; Hasty, Jeff
2006-02-01
Variable gene expression within a clonal population of cells has been implicated in a number of important processes including mutation and evolution, determination of cell fates and the development of genetic disease. Recent studies have demonstrated that a significant component of expression variability arises from extrinsic factors thought to influence multiple genes simultaneously, yet the biological origins of this extrinsic variability have received little attention. Here we combine computational modelling with fluorescence data generated from multiple promoter-gene inserts in Saccharomyces cerevisiae to identify two major sources of extrinsic variability. One unavoidable source arising from the coupling of gene expression with population dynamics leads to a ubiquitous lower limit for expression variability. A second source, which is modelled as originating from a common upstream transcription factor, exemplifies how regulatory networks can convert noise in upstream regulator expression into extrinsic noise at the output of a target gene. Our results highlight the importance of the interplay of gene regulatory networks with population heterogeneity for understanding the origins of cellular diversity.
Origins of extrinsic variability in eukaryotic gene expression
NASA Astrophysics Data System (ADS)
Volfson, Dmitri; Marciniak, Jennifer; Blake, William J.; Ostroff, Natalie; Tsimring, Lev S.; Hasty, Jeff
2006-03-01
Variable gene expression within a clonal population of cells has been implicated in a number of important processes including mutation and evolution, determination of cell fates and the development of genetic disease. Recent studies have demonstrated that a significant component of expression variability arises from extrinsic factors thought to influence multiple genes in concert, yet the biological origins of this extrinsic variability have received little attention. Here we combine computational modeling with fluorescence data generated from multiple promoter-gene inserts in Saccharomyces cerevisiae to identify two major sources of extrinsic variability. One unavoidable source arising from the coupling of gene expression with population dynamics leads to a ubiquitous noise floor in expression variability. A second source which is modeled as originating from a common upstream transcription factor exemplifies how regulatory networks can convert noise in upstream regulator expression into extrinsic noise at the output of a target gene. Our results highlight the importance of the interplay of gene regulatory networks with population heterogeneity for understanding the origins of cellular diversity.
Differential transcriptome expression in human nucleus accumbens as a function of loneliness
Canli, Turhan; Wen, Ruofeng; Wang, Xuefeng; Mikhailik, Anatoly; Yu, Lei; Fleischman, Debra; Wilson, Robert S.; Bennett, David A.
2017-01-01
Loneliness is associated with impaired mental and physical health. Studies of lonely individuals reported differential expression of inflammatory genes in peripheral leukocytes and diminished activation in brain reward regions such as nucleus accumbens, but could not address gene expression in the human brain. Here, we examined genome-wide RNA expression in postmortem nucleus accumbens from donors (N = 26) with known loneliness measures. Loneliness was associated with 1 710 differentially expressed transcripts from 1 599 genes (DEGs; FDR p < 0.05, fold-change ≥ |2|, controlling for confounds) previously associated with behavioral processes, neurological disease, psychological disorders, cancer, organismal injury, and skeletal and muscular disorders, as well as networks of upstream RNA regulators. Furthermore, a number of DEGs were associated with Alzheimer’s disease genes (which was correlated with loneliness in this sample, although gene expression analyses controlled for AD diagnosis). These results identify novel targets for future mechanistic studies of gene networks in nucleus accumbens and gene regulatory mechanisms across a variety of diseases exacerbated by loneliness. PMID:27801889
Inferring Gene Regulatory Networks by Singular Value Decomposition and Gravitation Field Algorithm
Zheng, Ming; Wu, Jia-nan; Huang, Yan-xin; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang
2012-01-01
Reconstruction of gene regulatory networks (GRNs) is of utmost interest and has become a challenging computational problem in systems biology. However, every existing algorithm for inference from gene expression profiles has its own advantages and disadvantages. In particular, the effectiveness and efficiency of previous algorithms are not high enough. In this work, we proposed a novel algorithm for inference from gene expression data based on a differential equation model. In this algorithm, two methods were included for inferring GRNs. Before reconstructing GRNs, the singular value decomposition method was used to decompose gene expression data, determine the algorithm solution space, and get all candidate solutions of GRNs. Within this generated family of candidate solutions, a modified gravitation field algorithm was used to infer GRNs, optimize the criteria of the differential equation model, and search for the best network structure. The proposed algorithm is validated on both the simulated scale-free network and real benchmark gene regulatory network in networks database. Both the Bayesian method and the traditional differential equation model were also used to infer GRNs, and the results were compared with those of the proposed algorithm. Genetic algorithm and simulated annealing were also used to evaluate the gravitation field algorithm. The cross-validation results confirmed the effectiveness of our algorithm, which significantly outperforms previous algorithms. PMID:23226565
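The role of the singular value decomposition in the abstract above can be sketched under a simplifying assumption that is not stated in the source: a linear model dX/dt = J X with fewer time points than genes. In that setting the data do not determine J uniquely, and the SVD of the expression matrix X describes the whole family of matrices consistent with the data. The function `solution_family`, the toy network `J_true` and the matrix sizes are hypothetical, and the gravitation field search over the family is not implemented here.

```python
import numpy as np

def solution_family(X, dX, tol=1e-10):
    """Return (J0, U0) such that every J0 + C @ U0.T satisfies J @ X == dX.

    X, dX: genes x time-points arrays of expression levels and their derivatives.
    """
    U, s, _ = np.linalg.svd(X, full_matrices=True)
    rank = int((s > tol).sum())
    U0 = U[:, rank:]                  # left singular vectors with U0.T @ X == 0
    J0 = dX @ np.linalg.pinv(X)       # minimum-norm particular solution
    return J0, U0

rng = np.random.default_rng(0)
n_genes, n_times = 5, 3               # under-determined: fewer time points than genes

# Hypothetical sparse "true" network used only to generate toy data.
J_true = np.zeros((n_genes, n_genes))
J_true[0, 1] = 0.8
J_true[2, 0] = -0.5
J_true[4, 3] = 1.2

X = rng.normal(size=(n_genes, n_times))
dX = J_true @ X

J0, U0 = solution_family(X, dX)
C = rng.normal(size=(n_genes, U0.shape[1]))   # any coefficients give another candidate
J_alt = J0 + C @ U0.T
```

Both `J0` and `J_alt` reproduce the data exactly, so a second criterion (for example sparsity) is needed to pick one candidate; that selection is the step the abstract assigns to the gravitation field algorithm.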
Tian, Honglai; Guan, Donghui; Li, Jianmin
2018-06-01
Osteosarcoma (OS), the most common malignant bone tumor, poses a serious health threat to children and adolescents. OS occurrence usually correlates with early metastasis and a high death rate. This study aimed to better understand the mechanism of OS metastasis. Based on the Gene Expression Omnibus (GEO) database, we downloaded 4 expression profile data sets associated with OS metastasis, and selected differentially expressed genes. The weighted gene co-expression network analysis (WGCNA) approach allowed us to investigate the module most correlated with OS metastasis. Gene Ontology functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were used to annotate the selected OS metastasis-associated genes. We selected 897 differentially expressed genes between the OS metastasis and OS non-metastasis groups. Based on these selected genes, WGCNA further identified 142 genes included in the module most correlated with OS metastasis. Gene Ontology functional and KEGG pathway enrichment analyses showed that significantly OS metastasis-associated genes were involved in pathways correlated with insulin-like growth factor binding. Our research identified several potential molecules participating in the metastasis process and factors acting as biomarkers. With this study, we could better explore the mechanism of OS metastasis and further discover more therapy targets.
Quigley, David A; Kandyba, Eve; Huang, Phillips; Halliwill, Kyle D; Sjölund, Jonas; Pelorosso, Facundo; Wong, Christine E; Hirst, Gillian L; Wu, Di; Delrosario, Reyno; Kumar, Atul; Balmain, Allan
2016-07-26
Inherited germline polymorphisms can cause gene expression levels in normal tissues to differ substantially between individuals. We present an analysis of the genetic architecture of normal adult skin from 470 genetically unique mice, demonstrating the effect of germline variants, skin tissue location, and perturbation by exogenous inflammation or tumorigenesis on gene signaling pathways. Gene networks related to specific cell types and signaling pathways, including sonic hedgehog (Shh), Wnt, Lgr family stem cell markers, and keratins, differed at these tissue sites, suggesting mechanisms for the differential susceptibility of dorsal and tail skin to development of skin diseases and tumorigenesis. The Pten tumor suppressor gene network is rewired in premalignant tumors compared to normal tissue, but this response to perturbation is lost during malignant progression. We present a software package for expression quantitative trait loci (eQTL) network analysis and demonstrate how network analysis of whole tissues provides insights into interactions between cell compartments and signaling molecules. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Synthetic incoherent feedforward circuits show adaptation to the amount of their genetic template
Bleris, Leonidas; Xie, Zhen; Glass, David; Adadey, Asa; Sontag, Eduardo; Benenson, Yaakov
2011-01-01
Natural and synthetic biological networks must function reliably in the face of fluctuating stoichiometry of their molecular components. These fluctuations are caused in part by changes in relative expression efficiency and the DNA template amount of the network-coding genes. Gene product levels could potentially be decoupled from these changes via built-in adaptation mechanisms, thereby boosting network reliability. Here, we show that a mechanism based on an incoherent feedforward motif enables adaptive gene expression in mammalian cells. We modeled, synthesized, and tested transcriptional and post-transcriptional incoherent loops and found that in all cases the gene product adapts to changes in DNA template abundance. We also observed that the post-transcriptional form results in superior adaptation behavior, higher absolute expression levels, and lower intrinsic fluctuations. Our results support a previously hypothesized endogenous role in gene dosage compensation for such motifs and suggest that their incorporation in synthetic networks will improve their robustness and reliability. PMID:21811230
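Why an incoherent feedforward motif can buffer output against the amount of DNA template can be seen in a toy steady-state calculation. This is not the model from the paper: the functional form, the function names and the parameters `k_out`, `k_rep` and `K` are assumptions chosen for illustration. Both the output gene and its repressor are assumed to be produced in proportion to the template copy number, so the repressor grows together with the production capacity it damps.

```python
def iffl_steady_state(copies, k_out=1.0, k_rep=1.0, K=1.0):
    """Steady-state output of a toy incoherent feedforward loop.

    copies: template copy number; k_out, k_rep, K are hypothetical constants.
    """
    repressor = k_rep * copies                        # repressor level scales with template amount
    return k_out * copies / (1.0 + repressor / K)     # output production damped by the repressor

def unregulated(copies, k_out=1.0):
    """Output of the same gene without the repressing branch."""
    return k_out * copies

low, high = 10, 100
fold_iffl = iffl_steady_state(high) / iffl_steady_state(low)   # response of the motif
fold_open = unregulated(high) / unregulated(low)               # response without the motif
```

In this sketch a tenfold increase in template changes the regulated output by less than 10 percent, and the output approaches the plateau k_out * K / k_rep as the copy number grows, whereas the unregulated output scales linearly with copy number.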
van Dam, Jesse C J; Schaap, Peter J; Martins dos Santos, Vitor A P; Suárez-Diez, María
2014-09-26
Different methods have been developed to infer regulatory networks from heterogeneous omics datasets and to construct co-expression networks. Each algorithm produces different networks and efforts have been devoted to automatically integrate them into consensus sets. However each separate set has an intrinsic value that is diluted and partly lost when building a consensus network. Here we present a methodology to generate co-expression networks and, instead of a consensus network, we propose an integration framework where the different networks are kept and analysed with additional tools to efficiently combine the information extracted from each network. We developed a workflow to efficiently analyse information generated by different inference and prediction methods. Our methodology relies on providing the user the means to simultaneously visualise and analyse the coexisting networks generated by different algorithms, heterogeneous datasets, and a suite of analysis tools. As a show case, we have analysed the gene co-expression networks of Mycobacterium tuberculosis generated using over 600 expression experiments. Regarding DNA damage repair, we identified SigC as a key control element, 12 new targets for LexA, an updated LexA binding motif, and a potential mismatch repair system. We expanded the DevR regulon with 27 genes while identifying 9 targets wrongly assigned to this regulon. We discovered 10 new genes linked to zinc uptake and a new regulatory mechanism for ZuR. The use of co-expression networks to perform system level analysis allows the development of custom made methodologies. As show cases we implemented a pipeline to integrate ChIP-seq data and another method to uncover multiple regulatory layers. 
Our workflow is based on representing the multiple types of information as network representations and presenting these networks in a synchronous framework that allows their simultaneous visualization while keeping specific associations from the different networks. By simultaneously exploring these networks and metadata, we gained insights into regulatory mechanisms in M. tuberculosis that could not be obtained through the separate analysis of each data type.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Faria, Jose P.; Overbeek, Ross; Taylor, Ronald C.
Here, we introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of B. subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches and small regulatory RNAs. Overall, regulatory information is included in the model for approximately 2500 of the ~4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same “ON” and “OFF” gene expression profiles across multiple samples of experimental data. We show how atomic regulons for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how atomic regulons can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking.
During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata relating to experimental conditions, gaining insights into novel biology.
Faria, Jose P.; Overbeek, Ross; Taylor, Ronald C.; ...
2016-03-18
Here, we introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of B. subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches and small regulatory RNAs. Overall, regulatory information is included in the model for approximately 2500 of the ~4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same “ON” and “OFF” gene expression profiles across multiple samples of experimental data. We show how atomic regulons for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how atomic regulons can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking. During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata relating to experimental conditions, gaining insights into novel biology.
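The idea of grouping genes that share the same "ON" and "OFF" profile can be illustrated with a minimal sketch. It assumes that a single fixed expression threshold decides ON versus OFF, which is a simplification for illustration and not the procedure used by the authors. The function name `on_off_groups`, the threshold value and the toy expression values are all hypothetical.

```python
from collections import defaultdict

def on_off_groups(expression, threshold):
    """Group genes whose thresholded ON/OFF profiles are identical.

    `expression` maps a gene name to its expression values across samples.
    The single fixed threshold is an assumption made for this sketch.
    """
    groups = defaultdict(list)
    for gene, values in expression.items():
        # Binarise the profile: True means ON, False means OFF.
        profile = tuple(v >= threshold for v in values)
        groups[profile].append(gene)
    # Keep only profiles shared by at least two genes.
    return [sorted(genes) for genes in groups.values() if len(genes) > 1]

# Toy data (invented values, three samples per gene).
toy = {
    "geneA": [9.1, 0.3, 8.7],
    "geneB": [7.5, 0.1, 9.9],
    "geneC": [0.2, 6.4, 0.5],
}
regulons = on_off_groups(toy, threshold=5.0)
```

In this toy input, geneA and geneB are ON in the first and third samples and OFF in the second, so they fall into one group, while geneC has the opposite profile and stays ungrouped.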
Parker, Hugo J; Bronner, Marianne E; Krumlauf, Robb
2016-06-01
Hindbrain development is orchestrated by a vertebrate gene regulatory network that generates segmental patterning along the anterior-posterior axis via Hox genes. Here, we review analyses of vertebrate and invertebrate chordate models that inform upon the evolutionary origin and diversification of this network. Evidence from the sea lamprey reveals that the hindbrain regulatory network generates rhombomeric compartments with segmental Hox expression and an underlying Hox code. We infer that this basal feature was present in ancestral vertebrates and, as an evolutionarily constrained developmental state, is fundamentally important for patterning of the vertebrate hindbrain across diverse lineages. Despite the common ground plan, vertebrates exhibit neuroanatomical diversity in lineage-specific patterns, with different vertebrates revealing variations of Hox expression in the hindbrain that could underlie this diversification. Invertebrate chordates lack hindbrain segmentation but exhibit some conserved aspects of this network, with retinoic acid signaling playing a role in establishing nested domains of Hox expression. © 2016 WILEY Periodicals, Inc.
An atlas of gene expression and gene co-regulation in the human retina.
Pinelli, Michele; Carissimo, Annamaria; Cutillo, Luisa; Lai, Ching-Hung; Mutarelli, Margherita; Moretti, Maria Nicoletta; Singh, Marwah Veer; Karali, Marianthi; Carrella, Diego; Pizzo, Mariateresa; Russo, Francesco; Ferrari, Stefano; Ponzin, Diego; Angelini, Claudia; Banfi, Sandro; di Bernardo, Diego
2016-07-08
The human retina is a specialized tissue involved in light stimulus transduction. Despite its unique biology, an accurate reference transcriptome is still missing. Here, we performed gene expression analysis (RNA-seq) of 50 retinal samples from non-visually impaired post-mortem donors. We identified novel transcripts with high confidence (Observed Transcriptome (ObsT)) and quantified the expression level of known transcripts (Reference Transcriptome (RefT)). The ObsT included 77 623 transcripts (23 960 genes) covering 137 Mb (35 Mb new transcribed genome). Most of the transcripts (92%) were multi-exonic: 81% with known isoforms, 16% with new isoforms and 3% belonging to new genes. The RefT included 13 792 genes across 94 521 known transcripts. Mitochondrial genes were among the most highly expressed, accounting for about 10% of the reads. Of all the protein-coding genes in Gencode, 65% are expressed in the retina. We exploited inter-individual variability in gene expression to infer a gene co-expression network and to identify genes specifically expressed in photoreceptor cells. We experimentally validated the photoreceptors localization of three genes in human retina that had not been previously reported. RNA-seq data and the gene co-expression network are available online (http://retina.tigem.it). © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Increased entropy of signal transduction in the cancer metastasis phenotype.
Teschendorff, Andrew E; Severini, Simone
2010-07-30
The statistical study of biological networks has led to important novel biological insights, such as the presence of hubs and hierarchical modularity. There is also a growing interest in studying the statistical properties of networks in the context of cancer genomics. However, relatively little is known as to what network features differ between the cancer and normal cell physiologies, or between different cancer cell phenotypes. Based on the observation that frequent genomic alterations underlie a more aggressive cancer phenotype, we asked if such an effect could be detectable as an increase in the randomness of local gene expression patterns. Using a breast cancer gene expression data set and a model network of protein interactions we derive constrained weighted networks defined by a stochastic information flux matrix reflecting expression correlations between interacting proteins. Based on this stochastic matrix we propose and compute an entropy measure that quantifies the degree of randomness in the local pattern of information flux around single genes. By comparing the local entropies in the non-metastatic versus metastatic breast cancer networks, we here show that breast cancers that metastasize are characterised by a small yet significant increase in the degree of randomness of local expression patterns. We validate this result in three additional breast cancer expression data sets and demonstrate that local entropy better characterises the metastatic phenotype than other non-entropy based measures. We show that increases in entropy can be used to identify genes and signalling pathways implicated in breast cancer metastasis and provide examples of de-novo discoveries of gene modules with known roles in apoptosis, immune-mediated tumour suppression, cell-cycle and tumour invasion. Importantly, we also identify a novel gene module within the insulin growth factor signalling pathway, alteration of which may predispose the tumour to metastasize. 
These results demonstrate that a metastatic cancer phenotype is characterised by an increase in the randomness of the local information flux patterns. Measures of local randomness in integrated protein interaction mRNA expression networks may therefore be useful for identifying genes and signalling pathways disrupted in one phenotype relative to another. Further exploration of the statistical properties of such integrated cancer expression and protein interaction networks will be a fruitful endeavour.
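A local entropy of this kind can be sketched as follows. The sketch assumes that a gene's row of the stochastic matrix is obtained by normalising non-negative edge weights to sum to one, and that the Shannon entropy is divided by the logarithm of the number of neighbours so that it lies between 0 and 1. Both choices are assumptions for illustration, and the abstract does not specify the exact definition used. The function name `local_entropy` and the example weights are hypothetical.

```python
import math

def local_entropy(weights):
    """Normalised Shannon entropy of one gene's outgoing edge weights.

    The weights are row-normalised into probabilities (an assumption of
    this sketch), and the entropy is divided by log(number of neighbours).
    """
    total = sum(weights)
    if total <= 0 or len(weights) < 2:
        return 0.0  # entropy is not informative for fewer than two neighbours
    probs = [w / total for w in weights if w > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(weights))

# Flux spread evenly over four neighbours versus concentrated on one.
uniform = local_entropy([1.0, 1.0, 1.0, 1.0])
skewed = local_entropy([0.97, 0.01, 0.01, 0.01])
```

Evenly spread weights give the maximum value, and weights concentrated on one neighbour give a value close to zero, which is the sense in which higher local entropy corresponds to a more random local pattern.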
Petrovskaya, Olga V; Petrovskiy, Evgeny D; Lavrik, Inna N; Ivanisenko, Vladimir A
2017-04-01
Gene network modeling is one of the widely used approaches in systems biology. It allows for the study of complex genetic systems function, including so-called mosaic gene networks, which consist of functionally interacting subnetworks. We conducted a study of a mosaic gene networks modeling method based on integration of models of gene subnetworks by linear control functionals. An automatic modeling of 10,000 synthetic mosaic gene regulatory networks was carried out using computer experiments on gene knockdowns/knockouts. Structural analysis of graphs of generated mosaic gene regulatory networks has revealed that the most important factor for building accurate integrated mathematical models, among those analyzed in the study, is data on expression of genes corresponding to the vertices with high properties of centrality.
Construction and analysis of gene-gene dynamics influence networks based on a Boolean model.
Mazaya, Maulida; Trinh, Hung-Cuong; Kwon, Yung-Keun
2017-12-21
Identification of novel gene-gene relations is a crucial issue to understand system-level biological phenomena. To this end, many methods based on a correlation analysis of gene expressions or structural analysis of molecular interaction networks have been proposed. They have a limitation in identifying more complicated gene-gene dynamical relations, though. To overcome this limitation, we proposed a measure to quantify a gene-gene dynamical influence (GDI) using a Boolean network model and constructed a GDI network to indicate existence of a dynamical influence for every ordered pair of genes. It represents how much a state trajectory of a target gene is changed by a knockout mutation subject to a source gene in a gene-gene molecular interaction (GMI) network. Through a topological comparison between GDI and GMI networks, we observed that the former network is denser than the latter network, which implies that there exist many gene pairs of dynamically influencing but molecularly non-interacting relations. In addition, a larger number of hub genes were generated in the GDI network. On the other hand, there was a correlation between these networks such that the degree value of a node was positively correlated to each other. We further investigated the relationships of the GDI value with structural properties and found that there are negative and positive correlations with the length of a shortest path and the number of paths, respectively. In addition, a GDI network could predict a set of genes whose steady-state expression is affected in E. coli gene-knockout experiments. More interestingly, we found that the drug-targets with side-effects have a larger number of outgoing links than the other genes in the GDI network, which implies that they are more likely to influence the dynamics of other genes. 
Finally, we found biological evidence showing that gene pairs that are not molecularly interacting but are dynamically influential can be considered as novel gene-gene relationships. Taken together, construction and analysis of the GDI network can be a useful approach to identify novel gene-gene relationships in terms of dynamical influence.
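The notion of a dynamical influence without a direct molecular interaction can be illustrated with a toy Boolean network. This is a simplified sketch and not the GDI measure defined by the authors. It assumes synchronous updates, a knockout that fixes the source gene to OFF, and an influence score equal to the fraction of (initial state, time step) pairs at which the target gene differs between wild type and mutant. The three-gene chain, the rule table `RULES` and the function names are hypothetical.

```python
from itertools import product

# Hypothetical chain A -> B -> C; A keeps its own value.
RULES = {
    "A": lambda s: s["A"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}

def trajectory(init, steps, knocked_out=None):
    """Synchronous state trajectory; a knocked-out gene is held OFF."""
    state = dict(init)
    if knocked_out is not None:
        state[knocked_out] = False
    states = [state]
    for _ in range(steps):
        state = {gene: bool(rule(state)) for gene, rule in RULES.items()}
        if knocked_out is not None:
            state[knocked_out] = False
        states.append(state)
    return states

def influence(source, target, steps=4):
    """Fraction of (initial state, time) pairs where the target differs."""
    genes = sorted(RULES)
    diffs = total = 0
    for values in product([False, True], repeat=len(genes)):
        init = dict(zip(genes, values))
        wild = trajectory(init, steps)
        mutant = trajectory(init, steps, knocked_out=source)
        for w, m in zip(wild, mutant):
            diffs += w[target] != m[target]
            total += 1
    return diffs / total

a_on_c = influence("A", "C")  # A has no direct edge to C
c_on_a = influence("C", "A")  # C is downstream and cannot affect A
```

In this toy chain, knocking out A changes the trajectory of C even though A and C share no direct edge, while knocking out C leaves A unchanged. This mirrors the observation that an influence network can be denser than the underlying interaction network.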
Interplay of Noisy Gene Expression and Dynamics Explains Patterns of Bacterial Operon Organization
NASA Astrophysics Data System (ADS)
Igoshin, Oleg
2011-03-01
Bacterial chromosomes are organized into operons, sets of genes co-transcribed into polycistronic messenger RNA. Hypotheses explaining the emergence and maintenance of operons include proportional co-regulation, horizontal transfer of intact "selfish" operons, emergence via gene duplication, and co-production of physically interacting proteins to speed their association. We hypothesized an alternative: operons can reduce or increase intrinsic gene expression noise in a manner dependent on the post-translational interactions, thereby resulting in selection for or against operons depending on the network architecture. We devised five classes of two-gene network modules and show that the effects of operons on intrinsic noise depend on class membership. Two classes exhibit decreased noise with co-transcription, two others reveal increased noise, and the remaining one does not show a significant difference. To test our modeling predictions we employed bioinformatic analysis to determine the relationship between gene expression noise and operon organization. The results confirm the overrepresentation of noise-minimizing operon architectures and provide evidence against other hypotheses. Our results thereby suggest a central role for gene expression noise in selecting for or maintaining operons in bacterial chromosomes. This demonstrates how post-translational network dynamics may provide selective pressure for organizing bacterial chromosomes, and has practical consequences for designing synthetic gene networks. This work is supported by National Institutes of Health grant 1R01GM096189-01.
Deregulation of an imprinted gene network in prostate cancer
Ribarska, Teodora; Goering, Wolfgang; Droop, Johanna; Bastian, Klaus-Marius; Ingenwerth, Marc; Schulz, Wolfgang A
2014-01-01
Multiple epigenetic alterations contribute to prostate cancer progression by deregulating gene expression. Epigenetic mechanisms, especially differential DNA methylation at imprinting control regions (termed DMRs), normally ensure the exclusive expression of imprinted genes from one specific parental allele. We therefore wondered to which extent imprinted genes become deregulated in prostate cancer and, if so, whether deregulation is due to altered DNA methylation at DMRs. Therefore, we selected presumptive deregulated imprinted genes from a previously conducted in silico analysis and from the literature and analyzed their expression in prostate cancer tissues by qRT-PCR. We found significantly diminished expression of PLAGL1/ZAC1, MEG3, NDN, CDKN1C, IGF2, and H19, while LIT1 was significantly overexpressed. The PPP1R9A gene, which is imprinted in selected tissues only, was strongly overexpressed, but was expressed biallelically in benign and cancerous prostatic tissues. Expression of many of these genes was strongly correlated, suggesting co-regulation, as in an imprinted gene network (IGN) reported in mice. Deregulation of the network genes also correlated with EZH2 and HOXC6 overexpression. Pyrosequencing analysis of all relevant DMRs revealed generally stable DNA methylation between benign and cancerous prostatic tissues, but frequent hypo- and hyper-methylation was observed at the H19 DMR in both benign and cancerous tissues. Re-expression of the ZAC1 transcription factor induced H19, CDKN1C and IGF2, supporting its function as a nodal regulator of the IGN. Our results indicate that a group of imprinted genes are coordinately deregulated in prostate cancers, independently of DNA methylation changes. PMID:24513574
Deregulation of an imprinted gene network in prostate cancer.
Ribarska, Teodora; Goering, Wolfgang; Droop, Johanna; Bastian, Klaus-Marius; Ingenwerth, Marc; Schulz, Wolfgang A
2014-05-01
Multiple epigenetic alterations contribute to prostate cancer progression by deregulating gene expression. Epigenetic mechanisms, especially differential DNA methylation at imprinting control regions (termed DMRs), normally ensure the exclusive expression of imprinted genes from one specific parental allele. We therefore wondered to which extent imprinted genes become deregulated in prostate cancer and, if so, whether deregulation is due to altered DNA methylation at DMRs. Therefore, we selected presumptive deregulated imprinted genes from a previously conducted in silico analysis and from the literature and analyzed their expression in prostate cancer tissues by qRT-PCR. We found significantly diminished expression of PLAGL1/ZAC1, MEG3, NDN, CDKN1C, IGF2, and H19, while LIT1 was significantly overexpressed. The PPP1R9A gene, which is imprinted in selected tissues only, was strongly overexpressed, but was expressed biallelically in benign and cancerous prostatic tissues. Expression of many of these genes was strongly correlated, suggesting co-regulation, as in an imprinted gene network (IGN) reported in mice. Deregulation of the network genes also correlated with EZH2 and HOXC6 overexpression. Pyrosequencing analysis of all relevant DMRs revealed generally stable DNA methylation between benign and cancerous prostatic tissues, but frequent hypo- and hyper-methylation was observed at the H19 DMR in both benign and cancerous tissues. Re-expression of the ZAC1 transcription factor induced H19, CDKN1C and IGF2, supporting its function as a nodal regulator of the IGN. Our results indicate that a group of imprinted genes are coordinately deregulated in prostate cancers, independently of DNA methylation changes.
Miyamoto, Tadashi; Furusawa, Chikara; Kaneko, Kunihiko
2015-01-01
Embryonic stem cells exhibit pluripotency: they can differentiate into all types of somatic cells. Pluripotent genes such as Oct4 and Nanog are activated in the pluripotent state, and their expression decreases during cell differentiation. Inversely, expression of differentiation genes such as Gata6 and Gata4 is promoted during differentiation. The gene regulatory network controlling the expression of these genes has been described, and slower-scale epigenetic modifications have been uncovered. Although the differentiation of pluripotent stem cells is normally irreversible, reprogramming of cells can be experimentally manipulated to regain pluripotency via overexpression of certain genes. Despite these experimental advances, the dynamics and mechanisms of differentiation and reprogramming are not yet fully understood. Based on recent experimental findings, we constructed a simple gene regulatory network including pluripotent and differentiation genes, and we demonstrated the existence of pluripotent and differentiated states from the resultant dynamical-systems model. Two differentiation mechanisms, interaction-induced switching from an expression oscillatory state and noise-assisted transition between bistable stationary states, were tested in the model. The former was found to be relevant to the differentiation process. We also introduced variables representing epigenetic modifications, which controlled the threshold for gene expression. By assuming positive feedback between expression levels and the epigenetic variables, we observed differentiation in expression dynamics. Additionally, with numerical reprogramming experiments for differentiated cells, we showed that pluripotency was recovered in cells by imposing overexpression of two pluripotent genes and external factors to control expression of differentiation genes. 
Interestingly, these factors were consistent with the four Yamanaka factors, Oct4, Sox2, Klf4, and Myc, which were necessary for the establishment of induced pluripotent stem cells. These results, based on a gene regulatory network and expression dynamics, contribute to our wider understanding of pluripotency, differentiation, and reprogramming of cells, and they provide a fresh viewpoint on robustness and control during development. PMID:26308610
Lin, Huapeng; Zhang, Qian; Li, Xiaocheng; Wu, Yushen; Liu, Ye; Hu, Yingchun
2018-01-01
Hepatitis B virus-associated acute liver failure (HBV-ALF) is a rare but life-threatening syndrome that carries high morbidity and mortality. Our study aimed to explore the possible molecular mechanisms of HBV-ALF by means of bioinformatics analysis. In this study, gene expression microarray datasets of HBV-ALF from Gene Expression Omnibus were collected, and then we identified differentially expressed genes (DEGs) with the limma package in R. After functional enrichment analysis, we constructed the protein–protein interaction (PPI) network with the Search Tool for the Retrieval of Interacting Genes online database and a weighted gene coexpression network with the WGCNA package in R. Subsequently, we picked out the hub genes among the DEGs. A total of 423 DEGs with 198 upregulated genes and 225 downregulated genes were identified between HBV-ALF and normal samples. The upregulated genes were mainly enriched in immune response, and the downregulated genes were mainly enriched in complement and coagulation cascades. Orosomucoid 1 (ORM1), orosomucoid 2 (ORM2), plasminogen (PLG), and aldehyde oxidase 1 (AOX1) were picked out as the hub genes with a high degree in both the PPI network and the weighted gene coexpression network. The weighted gene coexpression network analysis found that 3 of the 5 modules in which upregulated genes were enriched were closely related to the immune system. The downregulated genes were enriched in only one module, and the genes in this module were mainly enriched in the complement and coagulation cascades pathway. In conclusion, 4 genes (ORM1, ORM2, PLG, and AOX1), together with the immune response and the complement and coagulation cascades pathway, may take part in the pathogenesis of HBV-ALF, and these candidate genes and pathways could be therapeutic targets for HBV-ALF. PMID:29384847
Estimation of the proteomic cancer co-expression sub networks by using association estimators.
Erdoğan, Cihat; Kurt, Zeyneb; Diri, Banu
2017-01-01
In this study, the association estimators, which have significant influences on the gene network inference methods and used for determining the molecular interactions, were examined within the co-expression network inference concept. By using the proteomic data from five different cancer types, the hub genes/proteins within the disease-associated gene-gene/protein-protein interaction sub networks were identified. Proteomic data from various cancer types is collected from The Cancer Proteome Atlas (TCPA). Correlation and mutual information (MI) based nine association estimators that are commonly used in the literature, were compared in this study. As the gold standard to measure the association estimators' performance, a multi-layer data integration platform on gene-disease associations (DisGeNET) and the Molecular Signatures Database (MSigDB) was used. Fisher's exact test was used to evaluate the performance of the association estimators by comparing the created co-expression networks with the disease-associated pathways. It was observed that the MI based estimators provided more successful results than the Pearson and Spearman correlation approaches, which are used in the estimation of biological networks in the weighted correlation network analysis (WGCNA) package. In correlation-based methods, the best average success rate for five cancer types was 60%, while in MI-based methods the average success ratio was 71% for James-Stein Shrinkage (Shrink) and 64% for Schurmann-Grassberger (SG) association estimator, respectively. Moreover, the hub genes and the inferred sub networks are presented for the consideration of researchers and experimentalists.
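The use of Fisher's exact test to compare an inferred sub network with a disease-associated gene set can be sketched as a one-sided enrichment test, which is equivalent to the upper tail of a hypergeometric distribution. The abstract does not say whether a one-sided or two-sided test was used, so the one-sided form is an assumption here. The function name `enrichment_p` and the example counts are hypothetical.

```python
from math import comb

def enrichment_p(overlap, n_selected, n_annotated, n_total):
    """One-sided Fisher's exact test p-value for over-representation.

    Returns P(X >= overlap), where X is the number of annotated genes
    found among `n_selected` genes drawn without replacement from
    `n_total` genes, of which `n_annotated` carry the annotation.
    """
    max_k = min(n_selected, n_annotated)
    tail = sum(
        comb(n_annotated, k) * comb(n_total - n_annotated, n_selected - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / comb(n_total, n_selected)

# Invented example: 8 of 20 inferred hub genes lie in a 50-gene
# disease set, out of 1000 measured genes.
p_value = enrichment_p(overlap=8, n_selected=20, n_annotated=50, n_total=1000)
```

A small p-value indicates that the overlap between the inferred genes and the reference set is larger than expected by chance, which is one way to score an association estimator against a gold standard.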
Estimation of the proteomic cancer co-expression sub networks by using association estimators
Kurt, Zeyneb; Diri, Banu
2017-01-01
In this study, the association estimators, which have significant influences on the gene network inference methods and used for determining the molecular interactions, were examined within the co-expression network inference concept. By using the proteomic data from five different cancer types, the hub genes/proteins within the disease-associated gene-gene/protein-protein interaction sub networks were identified. Proteomic data from various cancer types is collected from The Cancer Proteome Atlas (TCPA). Correlation and mutual information (MI) based nine association estimators that are commonly used in the literature, were compared in this study. As the gold standard to measure the association estimators’ performance, a multi-layer data integration platform on gene-disease associations (DisGeNET) and the Molecular Signatures Database (MSigDB) was used. Fisher's exact test was used to evaluate the performance of the association estimators by comparing the created co-expression networks with the disease-associated pathways. It was observed that the MI based estimators provided more successful results than the Pearson and Spearman correlation approaches, which are used in the estimation of biological networks in the weighted correlation network analysis (WGCNA) package. In correlation-based methods, the best average success rate for five cancer types was 60%, while in MI-based methods the average success ratio was 71% for James-Stein Shrinkage (Shrink) and 64% for Schurmann-Grassberger (SG) association estimator, respectively. Moreover, the hub genes and the inferred sub networks are presented for the consideration of researchers and experimentalists. PMID:29145449
Lin, Zhe; Lin, Yongsheng
2017-09-05
The aim of this study was to explore potential crucial genes associated with steroid-induced necrosis of the femoral head (SINFH) and to provide valid biological information for further investigation of SINFH. The gene expression profile GSE26316, generated from 3 SINFH rat samples and 3 normal rat samples, was downloaded from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) were identified using the LIMMA package. After functional enrichment analyses of DEGs, protein-protein interaction (PPI) network and sub-PPI network analyses were conducted based on the STRING database and Cytoscape. In total, 59 up-regulated DEGs and 156 downregulated DEGs were identified. The up-regulated DEGs were mainly involved in immunity-related functions (e.g. Fcer1A and Il7R), and the downregulated DEGs were mainly enriched in muscle system process (e.g. Tnni2, Mylpf and Myl1). The PPI network of DEGs consisted of 123 nodes and 300 interactions. Tnni2, Mylpf, and Myl1 were the top 3 genes based on both subgraph centrality and degree centrality evaluation. These three genes interacted with each other in the network. Furthermore, the significant network module was composed of 22 downregulated genes (e.g. Tnni2, Mylpf and Myl1). These genes were mainly enriched in functions such as muscle system process. The DEGs related to the regulation of immune system process (e.g. Fcer1A and Il7R) and the DEGs correlated with muscle system process (e.g. Tnni2, Mylpf and Myl1) may be closely associated with the progress of SINFH, which still needs to be confirmed by experiments. Copyright © 2017 Elsevier B.V. All rights reserved.
Bekiaris, Pavlos Stephanos; Tekath, Tobias; Staiger, Dorothee; Danisman, Selahattin
2018-01-01
Understanding the effect of cis-regulatory elements (CRE) and clusters of CREs, which are called cis-regulatory modules (CRM), in eukaryotic gene expression is a challenge of computational biology. We developed two programs that allow simple, fast and reliable analysis of candidate CREs and CRMs that may affect specific gene expression and that determine positional features between individual CREs within a CRM. The first program, "Exploration of Distinctive CREs and CRMs" (EDCC), correlates candidate CREs and CRMs with specific gene expression patterns. For pairs of CREs, EDCC also determines positional preferences of the single CREs in relation to each other and to the transcriptional start site. The second program, "CRM Network Generator" (CNG), prioritizes these positional preferences using a neural network and thus allows unbiased rating of the positional preferences that were determined by EDCC. We tested these programs with data from a microarray study of circadian gene expression in Arabidopsis thaliana. Analyzing more than 1.5 million pairwise CRE combinations, we found 22 candidate combinations, of which several contained known clock promoter elements together with elements that had not been identified as relevant to circadian gene expression before. CNG analysis further identified positional preferences of these CRE pairs, hinting at positional information that may be relevant for circadian gene expression. Future wet lab experiments will have to determine which of these combinations confer daytime specific circadian gene expression.
Staiger, Dorothee
2018-01-01
Understanding the effect of cis-regulatory elements (CRE) and clusters of CREs, which are called cis-regulatory modules (CRM), in eukaryotic gene expression is a challenge of computational biology. We developed two programs that allow simple, fast and reliable analysis of candidate CREs and CRMs that may affect specific gene expression and that determine positional features between individual CREs within a CRM. The first program, “Exploration of Distinctive CREs and CRMs” (EDCC), correlates candidate CREs and CRMs with specific gene expression patterns. For pairs of CREs, EDCC also determines positional preferences of the single CREs in relation to each other and to the transcriptional start site. The second program, “CRM Network Generator” (CNG), prioritizes these positional preferences using a neural network and thus allows unbiased rating of the positional preferences that were determined by EDCC. We tested these programs with data from a microarray study of circadian gene expression in Arabidopsis thaliana. Analyzing more than 1.5 million pairwise CRE combinations, we found 22 candidate combinations, of which several contained known clock promoter elements together with elements that had not been identified as relevant to circadian gene expression before. CNG analysis further identified positional preferences of these CRE pairs, hinting at positional information that may be relevant for circadian gene expression. Future wet lab experiments will have to determine which of these combinations confer daytime specific circadian gene expression. PMID:29298348
Reverse Engineering of Genome-wide Gene Regulatory Networks from Gene Expression Data
Liu, Zhi-Ping
2015-01-01
Transcriptional regulation plays vital roles in many fundamental biological processes. Reverse engineering of genome-wide regulatory networks from high-throughput transcriptomic data provides a promising way to characterize the global scenario of regulatory relationships between regulators and their targets. In this review, we summarize and categorize the main frameworks and methods currently available for inferring transcriptional regulatory networks from microarray gene expression profiling data. We give an overview of each strategy and introduce representative methods for each. Their assumptions, advantages, shortcomings, and possible improvements and extensions are also discussed. PMID:25937810
Fuentes, Nathalie; Roy, Arpan; Mishra, Vikas; Cabello, Noe; Silveyra, Patricia
2018-05-08
Sex differences in the incidence and prognosis of respiratory diseases have been reported. Studies have shown that women are at greater risk of adverse health outcomes from air pollution than men, but sex-specific immune gene expression patterns and regulatory networks have not been well studied in the lung. MicroRNAs (miRNAs) are environmentally sensitive posttranscriptional regulators of gene expression that may mediate the damaging effects of inhaled pollutants in the lung by altering the expression of innate immunity molecules. Male and female mice of the C57BL/6 background were exposed to 2 ppm of ozone or filtered air (control) for 3 h. Female mice were also exposed at different stages of the estrous cycle. Following exposure, lungs were harvested and total RNA was extracted. We used PCR arrays to study sex differences in the expression of 84 miRNAs predicted to target inflammatory and immune genes. We identified differentially expressed miRNA signatures in the lungs of male vs. female mice exposed to ozone. In silico pathway analyses identified sex-specific biological networks affected by exposure to ozone that ranged from direct predicted gene targeting to complex interactions with multiple intermediates. We also identified differences in miRNA expression and predicted regulatory networks in females exposed to ozone at different estrous cycle stages. Our results indicate that both sex and hormonal status can influence lung miRNA expression in response to ozone exposure, indicating that sex-specific miRNA regulation of inflammatory gene expression could mediate differential pollution-induced health outcomes in men and women.
Reyes-Bermudez, Alejandro; Villar-Briones, Alejandro; Ramirez-Portilla, Catalina; Hidaka, Michio; Mikheyev, Alexander S.
2016-01-01
Corals belong to the most basal class of the Phylum Cnidaria, which is considered the sister group of bilaterian animals, and thus have become an emerging model to study the evolution of developmental mechanisms. Although cell renewal, differentiation, and maintenance of pluripotency are cellular events shared by multicellular animals, the cellular basis of these fundamental biological processes is still poorly understood. To understand how changes in gene expression regulate morphogenetic transitions at the base of the eumetazoa, we performed quantitative RNA-seq analysis during Acropora digitifera’s development. We collected embryonic, larval, and adult samples to characterize stage-specific transcription profiles, as well as broad expression patterns. Transcription profiles reconstructed development, revealing two main expression clusters. The first cluster grouped blastula and gastrula and the second grouped subsequent developmental time points. Consistently, we observed clear differences in gene expression between early and late developmental transitions, with higher numbers of differentially expressed genes and fold changes around gastrulation. Furthermore, we identified three coexpression clusters that represented discrete gene expression patterns. During early transitions, transcriptional networks seemed to regulate cellular fate and morphogenesis of the larval body. In late transitions, these networks seemed to play important roles in preparing planulae for a switch in lifestyle and in the regulation of adult processes. Although developmental progression in A. digitifera is regulated to some extent by differential coexpression of well-defined gene networks, stage-specific transcription profiles appear to be independent entities. While negative regulation of transcription is predominant in early development, cell differentiation was upregulated in larval and adult stages. PMID:26941230
Ficklin, Stephen P.; Feltus, Frank Alex
2013-01-01
Many traits of biological and agronomic significance in plants are controlled in a complex manner where multiple genes and environmental signals affect the expression of the phenotype. In Oryza sativa (rice), thousands of quantitative genetic signals have been mapped to the rice genome. In parallel, thousands of gene expression profiles have been generated across many experimental conditions. Through the discovery of networks with real gene co-expression relationships, it is possible to identify co-localized genetic and gene expression signals that implicate complex genotype-phenotype relationships. In this work, we used a knowledge-independent, systems genetics approach to discover a high-quality set of co-expression networks, termed Gene Interaction Layers (GILs). Twenty-two GILs were constructed from 1,306 Affymetrix microarray rice expression profiles that were pre-clustered to allow for improved capture of gene co-expression relationships. Functional genomic and genetic data, including over 8,000 QTLs and 766 phenotype-tagged SNPs (p-value ≤ 0.001) from genome-wide association studies, both covering over 230 different rice traits, were integrated with the GILs. An online systems genetics data-mining resource, the GeneNet Engine, was constructed to enable dynamic discovery of gene sets (i.e. network modules) that overlap with genetic traits. GeneNet Engine does not provide the exact set of genes underlying a given complex trait, but through the evidence of gene-marker correspondence, co-expression, and functional enrichment, site visitors can identify genes with potential shared causality for a trait, which could then be used for experimental validation. A set of 2 million SNPs was incorporated into the database and serves as a potential set of testable biomarkers for genes in modules that overlap with genetic traits. Herein, we describe two modules found using GeneNet Engine, one with significant overlap with the trait amylose content and another with significant overlap with blast disease resistance. PMID:23874666
Regulation of a transcription factor network by Cdk1 coordinates late cell cycle gene expression
Landry, Benjamin D; Mapa, Claudine E; Arsenault, Heather E; Poti, Kristin E; Benanti, Jennifer A
2014-01-01
To maintain genome stability, regulators of chromosome segregation must be expressed in coordination with mitotic events. Expression of these late cell cycle genes is regulated by cyclin-dependent kinase (Cdk1), which phosphorylates a network of conserved transcription factors (TFs). However, the effects of Cdk1 phosphorylation on many key TFs are not known. We find that elimination of Cdk1-mediated phosphorylation of four S-phase TFs decreases expression of many late cell cycle genes, delays mitotic progression, and reduces fitness in budding yeast. Blocking phosphorylation impairs degradation of all four TFs. Consequently, phosphorylation-deficient mutants of the repressors Yox1 and Yhp1 exhibit increased promoter occupancy and decreased expression of their target genes. Interestingly, although phosphorylation of the transcriptional activator Hcm1 on its N-terminus promotes its degradation, phosphorylation on its C-terminus is required for its activity, indicating that Cdk1 both activates and inhibits a single TF. We conclude that Cdk1 promotes gene expression by both activating transcriptional activators and inactivating transcriptional repressors. Furthermore, our data suggest that coordinated regulation of the TF network by Cdk1 is necessary for faithful cell division. PMID:24714560
Rund, Samuel S C; Yoo, Boyoung; Alam, Camille; Green, Taryn; Stephens, Melissa T; Zeng, Erliang; George, Gary F; Sheppard, Aaron D; Duffield, Giles E; Milenković, Tijana; Pfrender, Michael E
2016-08-18
Marine and freshwater zooplankton exhibit daily rhythmic patterns of behavior and physiology which may be regulated directly by the light:dark (LD) cycle and/or a molecular circadian clock. One of the best-studied zooplankton taxa, the freshwater crustacean Daphnia, has a 24 h diel vertical migration (DVM) behavior whereby the organism travels up and down through the water column daily. DVM plays a critical role in resource tracking and the behavioral avoidance of predators and damaging ultraviolet radiation. However, there is little information at the transcriptional level linking the expression patterns of genes to the rhythmic physiology/behavior of Daphnia. Here we analyzed genome-wide temporal transcriptional patterns from Daphnia pulex collected over a 44 h time period under a 12:12 LD cycle (diel) conditions using a cosine-fitting algorithm. We used a comprehensive network modeling and analysis approach to identify novel co-regulated rhythmic genes that have similar network topological properties and functional annotations as rhythmic genes identified by the cosine-fitting analyses. Furthermore, we used the network approach to predict with high accuracy novel gene-function associations, thus enhancing current functional annotations available for genes in this ecologically relevant model species. Our results reveal that genes in many functional groupings exhibit 24 h rhythms in their expression patterns under diel conditions. We highlight the rhythmic expression of immunity, oxidative detoxification, and sensory process genes. We discuss differences in the chronobiology of D. pulex from other well-characterized terrestrial arthropods. This research adds to a growing body of literature suggesting the genetic mechanisms governing rhythmicity in crustaceans may be divergent from other arthropod lineages including insects. Lastly, these results highlight the power of using a network analysis approach to identify differential gene expression and provide novel functional annotation.
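The abstract above mentions a cosine-fitting algorithm without giving its details. As a rough illustration only, the sketch below fits a single 24 h cosine to one expression time series by ordinary least squares. The function names (solve_linear, fit_cosinor), the fixed period, and the synthetic sampling scheme in the usage note are assumptions made for this example and are not taken from the study.

```python
import math

def solve_linear(a, b):
    """Solve a small dense system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_cosinor(times, values, period=24.0):
    """Least-squares fit of y = mesor + a*cos(w*t) + b*sin(w*t), with w = 2*pi/period.

    Returns (mesor, amplitude, peak_time), where peak_time is the time of
    maximum expression within one period (hypothetical helper, for illustration).
    """
    w = 2.0 * math.pi / period
    rows = [(1.0, math.cos(w * t), math.sin(w * t)) for t in times]
    # Normal equations (X^T X) beta = X^T y for the three coefficients.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, a, b = solve_linear(xtx, xty)
    amplitude = math.hypot(a, b)
    peak_time = (math.atan2(b, a) / w) % period
    return mesor, amplitude, peak_time
```

For example, sampling every 4 h over 44 h from a noise-free curve with mean 10, amplitude 3 and a peak at 6 h, fit_cosinor should recover those three values. A real analysis would also need a significance test for rhythmicity, which this sketch omits.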
Osterndorff-Kahanek, Elizabeth A.; Becker, Howard C.; Lopez, Marcelo F.; Farris, Sean P.; Tiwari, Gayatri R.; Nunez, Yury O.; Harris, R. Adron; Mayfield, R. Dayne
2015-01-01
Repeated ethanol exposure and withdrawal in mice increases voluntary drinking and represents an animal model of physical dependence. We examined time- and brain region-dependent changes in gene coexpression networks in amygdala (AMY), nucleus accumbens (NAC), prefrontal cortex (PFC), and liver after four weekly cycles of chronic intermittent ethanol (CIE) vapor exposure in C57BL/6J mice. Microarrays were used to compare gene expression profiles at 0, 8, and 120 hours following the last ethanol exposure. Each brain region exhibited a large number of differentially expressed genes (2,000-3,000) at the 0- and 8-hour time points, but fewer changes were detected at the 120-hour time point (400-600). Within each region, there was little gene overlap across time (~20%). All brain regions were significantly enriched with differentially expressed immune-related genes at the 8-hour time point. Weighted gene correlation network analysis identified modules that were highly enriched with differentially expressed genes at the 0- and 8-hour time points with virtually no enrichment at 120 hours. Modules enriched for both ethanol-responsive and cell-specific genes were identified in each brain region. These results indicate that chronic alcohol exposure causes global ‘rewiring’ of coexpression systems involving glial and immune signaling as well as neuronal genes. PMID:25803291
Integrated Module and Gene-Specific Regulatory Inference Implicates Upstream Signaling Networks
Roy, Sushmita; Lagree, Stephen; Hou, Zhonggang; Thomson, James A.; Stewart, Ron; Gasch, Audrey P.
2013-01-01
Regulatory networks that control gene expression are important in diverse biological contexts including stress response and development. Each gene's regulatory program is determined by module-level regulation (e.g. co-regulation via the same signaling system), as well as gene-specific determinants that can fine-tune expression. We present a novel approach, Modular regulatory network learning with per gene information (MERLIN), that infers regulatory programs for individual genes while probabilistically constraining these programs to reveal module-level organization of regulatory networks. Using edge-, regulator- and module-based comparisons of simulated networks of known ground truth, we find MERLIN reconstructs regulatory programs of individual genes as well as or better than existing approaches of network reconstruction, while additionally identifying modular organization of the regulatory networks. We use MERLIN to dissect global transcriptional behavior in two biological contexts: yeast stress response and human embryonic stem cell differentiation. Regulatory modules inferred by MERLIN capture co-regulatory relationships between signaling proteins and downstream transcription factors, thereby revealing the upstream signaling systems controlling transcriptional responses. The inferred networks are enriched for regulators with genetic or physical interactions, supporting the inference, and identify modules of functionally related genes bound by the same transcriptional regulators. Our method combines the strengths of per-gene and per-module methods to reveal new insights into transcriptional regulation in stress and development. PMID:24146602
Martyniuk, Christopher J; Prucha, Melinda S; Doperalski, Nicholas J; Antczak, Philipp; Kroll, Kevin J; Falciani, Francesco; Barber, David S; Denslow, Nancy D
2013-01-01
Oocyte maturation in fish involves numerous cell signaling cascades that are activated or inhibited during specific stages of oocyte development. The objectives of this study were to characterize molecular pathways and temporal gene expression patterns throughout a complete breeding cycle in wild female largemouth bass to improve understanding of the molecular sequence of events underlying oocyte maturation. Transcriptomic analysis was performed on eight morphologically diverse stages of the ovary, including primary and secondary stages of oocyte growth, ovulation, and atresia. Ovary histology, plasma vitellogenin, 17β-estradiol, and testosterone were also measured to correlate with gene networks. Global expression patterns revealed dramatic differences across ovarian development, with 552 and 2070 genes being differentially expressed during ovulation and atresia, respectively. Gene set enrichment analysis (GSEA) revealed that early primary stages of oocyte growth involved increases in expression of genes involved in pathways of B-cell and T-cell receptor-mediated signaling cascades and fibronectin regulation. These pathways, as well as pathways that included adrenergic receptor signaling, sphingolipid metabolism and natural killer cell activation, were down-regulated at ovulation. At atresia, down-regulated pathways included gap junction and actin cytoskeleton regulation, gonadotrope and mast cell activation, and vasopressin receptor signaling, and up-regulated pathways included oxidative phosphorylation and reactive oxygen species metabolism. Expression targets for luteinizing hormone signaling were low during vitellogenesis but increased 150% at ovulation. Other networks found to play a significant role in oocyte maturation included those with genes regulated by members of the TGF-beta superfamily (activins, inhibins, bone morphogenetic protein 7 and growth differentiation factor 9), neuregulin 1, retinoid X receptor, and nerve growth factor family. This study offers novel insight into the gene networks underlying vitellogenesis, ovulation and atresia and generates new hypotheses about the cellular pathways regulating oocyte maturation.
Jenkins, Dafyd J; Stekel, Dov J
2010-02-01
Gene regulation is one important mechanism in producing observed phenotypes and heterogeneity. Consequently, the study of gene regulatory network (GRN) architecture, function and evolution now forms a major part of modern biology. However, it is impossible to experimentally observe the evolution of GRNs on the timescales on which living species evolve. In silico evolution provides an approach to studying the long-term evolution of GRNs, but many models have either considered network architecture from non-adaptive evolution, or evolution to non-biological objectives. Here, we address a number of important modelling and biological questions about the evolution of GRNs to the realistic goal of biomass production. Can different commonly used simulation paradigms, in particular deterministic and stochastic Boolean networks, with and without basal gene expression, be used to compare adaptive with non-adaptive evolution of GRNs? Are these paradigms together with this goal sufficient to generate a range of solutions? Will the interaction between a biological goal and evolutionary dynamics produce trade-offs between growth and mutational robustness? We show that stochastic basal gene expression forces shrinkage of genomes due to energetic constraints and is a prerequisite for some solutions. In systems that are able to evolve rates of basal expression, two optima, one with and one without basal expression, are observed. Simulation paradigms without basal expression generate bloated networks with non-functional elements. Further, a range of functional solutions was observed under identical conditions only in stochastic networks. Moreover, there are trade-offs between efficiency and yield, indicating an inherent intertwining of fitness and evolutionary dynamics.
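To make the terms in the abstract above concrete, the following is a minimal sketch of one synchronous update of a Boolean gene network, with an optional stochastic basal expression term. It is a generic textbook-style construction, not the authors' simulation model (which also involves energetics and biomass production); the function name step, the rule encoding, and the basal_rate parameter are assumptions made for illustration.

```python
import random

def step(state, rules, basal_rate=0.0, rng=random):
    """Advance a Boolean gene network by one synchronous update.

    state: dict mapping gene name -> bool (expressed or not).
    rules: dict mapping gene name -> function(state) -> bool.
    basal_rate: probability that a gene whose rule evaluates to False is
        expressed anyway (stochastic basal expression); 0.0 gives a purely
        deterministic network.
    """
    new_state = {}
    for gene, rule in rules.items():
        on = bool(rule(state))
        # Basal expression: an inactive gene may still fire by chance.
        if not on and rng.random() < basal_rate:
            on = True
        new_state[gene] = on
    return new_state
```

With rules such as {"a": lambda s: not s["b"], "b": lambda s: s["a"]}, repeated calls with basal_rate=0.0 trace a deterministic trajectory, whereas a positive basal_rate lets the network occasionally visit states the rules alone would never reach.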
Connahs, Heidi; Rhen, Turk; Simmons, Rebecca B
2016-03-31
Butterfly wing color patterns are an important model system for understanding the evolution and development of morphological diversity and animal pigmentation. Wing color patterns develop from a complex network composed of highly conserved patterning genes and pigmentation pathways. Patterning genes are involved in regulating pigment synthesis; however, the temporal expression dynamics of these interacting networks are poorly understood. Here, we employ next generation sequencing to examine expression patterns of the gene network underlying wing development in the nymphalid butterfly, Vanessa cardui. We identified 9,376 differentially expressed transcripts during wing color pattern development, including genes involved in patterning, pigmentation and gene regulation. Differential expression of these genes was highest at the pre-ommochrome stage compared to early pupal and late melanin stages. Overall, an increasing number of genes were down-regulated during the progression of wing development. We observed dynamic expression patterns of a large number of pigment genes from the ommochrome, melanin and also pteridine pathways, including contrasting patterns of expression for paralogs of the yellow gene family. Surprisingly, many patterning genes previously associated with butterfly pattern elements were not significantly up-regulated at any time during pupation, although many other transcription factors were differentially expressed. Several genes involved in Notch signaling were significantly up-regulated during the pre-ommochrome stage including slow border cells, bunched and pebbles; the function of these genes in the development of butterfly wings is currently unknown. Many genes involved in ecdysone signaling were also significantly up-regulated during early pupal and late melanin stages and exhibited opposing patterns of expression relative to the ecdysone receptor. Finally, a comparison across four butterfly transcriptomes revealed 28 transcripts common to all four species that have no known homologs in other metazoans. This study provides a comprehensive list of differentially expressed transcripts during wing development, revealing potential candidate genes that may be involved in regulating butterfly wing patterns. Some differentially expressed genes have no known homologs, possibly representing genes unique to butterflies. Results from this study also indicate that development of nymphalid wing patterns may arise not only from melanin and ommochrome pigments but also from the pteridine pigment pathway.
Li, Chen; Shen, Weixing; Shen, Sheng; Ai, Zhilong
2013-12-01
To explore the molecular mechanisms of cholangiocarcinoma (CC), microarray technology was used to find biomarkers for early detection and diagnosis. The gene expression profiles from 6 patients with CC and 5 normal controls were downloaded from Gene Expression Omnibus and compared. As a result, 204 differentially co-expressed genes (DCGs) in CC patients compared to normal controls were identified using a computational bioinformatics analysis. These genes were mainly involved in coenzyme metabolic process, peptidase activity and oxidation reduction. A regulatory network was constructed by mapping the DCGs to known regulation data. Four transcription factors, FOXC1, ZIC2, NKX2-2 and GCGR, were hub nodes in the network. In conclusion, this study provides a set of targets useful for future molecular biomarker studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
An incoherent feedforward loop facilitates adaptive tuning of gene expression.
Hong, Jungeui; Brandt, Nathan; Abdul-Rahman, Farah; Yang, Ally; Hughes, Tim; Gresham, David
2018-04-05
We studied adaptive evolution of gene expression using long-term experimental evolution of Saccharomyces cerevisiae in ammonium-limited chemostats. We found repeated selection for non-synonymous variation in the DNA binding domain of the transcriptional activator, GAT1, which functions with the repressor, DAL80 in an incoherent type-1 feedforward loop (I1-FFL) to control expression of the high affinity ammonium transporter gene, MEP2. Missense mutations in the DNA binding domain of GAT1 reduce its binding to the GATAA consensus sequence. However, we show experimentally, and using mathematical modeling, that decreases in GAT1 binding result in increased expression of MEP2 as a consequence of properties of I1-FFLs. Our results show that I1-FFLs, one of the most commonly occurring network motifs in transcriptional networks, can facilitate adaptive tuning of gene expression through modulation of transcription factor binding affinities. Our findings highlight the importance of gene regulatory architectures in the evolution of gene expression. © 2018, Hong et al.
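The counterintuitive result in the abstract above (weaker activator binding, higher target expression) can be illustrated with a toy steady-state model of an incoherent type-1 feedforward loop. This is not the authors' mathematical model: the Hill-function forms, the parameter values, and the function names below are all assumptions chosen only to show that such behavior is possible. The activator (standing in for GAT1) drives both the target (standing in for MEP2) and a repressor (standing in for DAL80), and an affinity_loss factor scales both activator dissociation constants.

```python
def hill_activation(x, k, n=1.0):
    """Fraction of maximal activation at activator level x with constant k."""
    return x**n / (k**n + x**n)

def hill_repression(y, k, n=2.0):
    """Fraction of expression remaining at repressor level y with constant k."""
    return 1.0 / (1.0 + (y / k) ** n)

def target_steady_state(activator, affinity_loss=1.0,
                        k_target=0.01, k_repressor=1.0, k_rep=0.1):
    """Toy I1-FFL steady state (all parameter values are illustrative).

    affinity_loss > 1 weakens activator binding at both of its promoters by
    scaling both dissociation constants by the same factor.
    """
    repressor = hill_activation(activator, k_repressor * affinity_loss)
    activation = hill_activation(activator, k_target * affinity_loss)
    return activation * hill_repression(repressor, k_rep)

wild_type = target_steady_state(1.0)
mutant = target_steady_state(1.0, affinity_loss=10.0)
print(f"wild type: {wild_type:.3f}, reduced-affinity mutant: {mutant:.3f}")
```

With these particular numbers the target promoter is nearly saturated in the wild type, so a tenfold affinity loss barely reduces direct activation but sharply lowers repressor levels, and net target expression rises. The effect is parameter dependent: a much larger affinity loss in the same toy model reduces target expression again.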
Emmert-Streib, Frank; Glazko, Galina V.; Altay, Gökmen; de Matos Simoes, Ricardo
2012-01-01
In this paper, we present a systematic and conceptual overview of methods for inferring gene regulatory networks from observational gene expression data. Further, we discuss two classic approaches to infer causal structures and compare them with contemporary methods by providing a conceptual categorization thereof. We complement the above by surveying global and local evaluation measures for assessing the performance of inference algorithms. PMID:22408642
Strakova, Eva; Zikova, Alice; Vohradsky, Jiri
2014-01-01
A computational model of gene expression was applied to a novel test set of microarray time series measurements to reveal regulatory interactions between transcriptional regulators represented by 45 sigma factors and the genes expressed during germination of the prokaryote Streptomyces coelicolor. Using microarrays, the first 5.5 h of the process was recorded in 13 time points, which provided a database of gene expression time series on a genome-wide scale. The computational modeling of the kinetic relations between the sigma factors, individual genes and genes clustered according to the similarity of their expression kinetics identified kinetically plausible sigma factor-controlled networks. Using genome sequence annotations, functional groups of genes that were predominantly controlled by specific sigma factors were identified. Using external binding data complementing the modeling approach, specific genes involved in the control of the studied process were identified and their function suggested.
PINTA: a web server for network-based gene prioritization from expression data
Nitsch, Daniela; Tranchevent, Léon-Charles; Gonçalves, Joana P.; Vogt, Josef Korbinian; Madeira, Sara C.; Moreau, Yves
2011-01-01
PINTA (available at http://www.esat.kuleuven.be/pinta/; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes based on the differential expression of their neighborhood in a genome-wide protein–protein interaction network. Our strategy is meant for biological and medical researchers aiming at identifying novel disease genes using disease-specific expression data. PINTA supports both candidate gene prioritization (starting from a user-defined set of candidate genes) and genome-wide gene prioritization, and is available for five species (human, mouse, rat, worm and yeast). As input data, PINTA requires only disease-specific expression data, and various platforms (e.g. Affymetrix) are supported. As a result, PINTA computes a gene ranking and presents the results as a table that can easily be browsed and downloaded by the user. PMID:21602267
Reverse engineering the gap gene network of Drosophila melanogaster.
Perkins, Theodore J; Jaeger, Johannes; Reinitz, John; Glass, Leon
2006-05-01
A fundamental problem in functional genomics is to determine the structure and dynamics of genetic networks based on expression data. We describe a new strategy for solving this problem and apply it to recently published data on early Drosophila melanogaster development. Our method is orders of magnitude faster than current fitting methods and allows us to fit different types of rules for expressing regulatory relationships. Specifically, we use our approach to fit models using a smooth nonlinear formalism for modeling gene regulation (gene circuits) as well as models using logical rules based on activation and repression thresholds for transcription factors. Our technique also allows us to infer regulatory relationships de novo or to test network structures suggested by the literature. We fit a series of models to test several outstanding questions about gap gene regulation, including regulation of and by hunchback and the role of autoactivation. Based on our modeling results and validation against the experimental literature, we propose a revised network structure for the gap gene system. Interestingly, some relationships in standard textbook models of gap gene regulation appear to be unnecessary for or even inconsistent with the details of gap gene expression during wild-type development.
Relevance of phenotypic noise to adaptation and evolution.
Kaneko, K; Furusawa, C
2008-09-01
Biological processes are inherently noisy, as highlighted in recent measurements of stochasticity in gene expression. Here, the authors show that such phenotypic noise is essential to the adaptation of organisms to a variety of environments and also to the evolution of robustness against mutations. First, the authors show that for any growing cell showing stochastic gene expression, the adaptive cellular state is inevitably selected by noise, without the use of a specific signal transduction network. In general, changes in any protein concentration in a cell are products of its synthesis minus dilution and degradation, both of which are proportional to the rate of cell growth. In an adaptive state, both the synthesis and dilution terms of proteins are large, and so the adaptive state is less affected by stochasticity in gene expression, whereas for a non-adaptive state, both terms are smaller, and so cells are easily knocked out of their original state by noise. This leads to a novel, generic mechanism for the selection of adaptive states. The authors have confirmed this selection by model simulations. Secondly, the authors consider the evolution of gene networks to acquire robustness of the phenotype against noise and mutation. Through simulations using a simple stochastic gene expression network that undergoes mutation and selection, the authors show that a threshold level of noise in gene expression is required for the network to acquire both types of robustness. The results reveal how the noise that cells encounter during growth and development shapes any network's robustness, not only to noise but also to mutations. The authors also establish a relationship between developmental and mutational robustness.
Systems genetics identifies a convergent gene network for cognition and neurodevelopmental disease.
Johnson, Michael R; Shkura, Kirill; Langley, Sarah R; Delahaye-Duriez, Andree; Srivastava, Prashant; Hill, W David; Rackham, Owen J L; Davies, Gail; Harris, Sarah E; Moreno-Moral, Aida; Rotival, Maxime; Speed, Doug; Petrovski, Slavé; Katz, Anaïs; Hayward, Caroline; Porteous, David J; Smith, Blair H; Padmanabhan, Sandosh; Hocking, Lynne J; Starr, John M; Liewald, David C; Visconti, Alessia; Falchi, Mario; Bottolo, Leonardo; Rossetti, Tiziana; Danis, Bénédicte; Mazzuferi, Manuela; Foerch, Patrik; Grote, Alexander; Helmstaedter, Christoph; Becker, Albert J; Kaminski, Rafal M; Deary, Ian J; Petretto, Enrico
2016-02-01
Genetic determinants of cognition are poorly characterized, and their relationship to genes that confer risk for neurodevelopmental disease is unclear. Here we performed a systems-level analysis of genome-wide gene expression data to infer gene-regulatory networks conserved across species and brain regions. Two of these networks, M1 and M3, showed replicable enrichment for common genetic variants underlying healthy human cognitive abilities, including memory. Using exome sequence data from 6,871 trios, we found that M3 genes were also enriched for mutations ascertained from patients with neurodevelopmental disease generally, and intellectual disability and epileptic encephalopathy in particular. M3 consists of 150 genes whose expression is tightly developmentally regulated, but which are collectively poorly annotated for known functional pathways. These results illustrate how systems-level analyses can reveal previously unappreciated relationships between neurodevelopmental disease-associated genes in the developed human brain, and provide empirical support for a convergent gene-regulatory network influencing cognition and neurodevelopmental disease.
Li, Yongxin; Kikuchi, Mani; Li, Xueyan; Gao, Qionghua; Xiong, Zijun; Ren, Yandong; Zhao, Ruoping; Mao, Bingyu; Kondo, Mariko; Irie, Naoki; Wang, Wen
2018-01-01
Sea cucumbers, a major class of echinoderms, undergo a very fast and drastic metamorphosis during their development. However, the molecular basis underlying this process remains largely unknown. Here we systematically examined the gene expression profiles of Japanese common sea cucumber (Apostichopus japonicus) for the first time by RNA sequencing across 16 developmental time points from fertilized egg to juvenile stage. Based on the weighted gene co-expression network analysis (WGCNA), we identified 21 modules. Among them, MEdarkmagenta was highly expressed and correlated with the early metamorphosis process from late auricularia to doliolaria larva. Furthermore, gene enrichment and differentially expressed gene analysis identified several genes in the module that may play key roles in the metamorphosis process. Our results not only provide a molecular basis for experimentally studying the development and morphological complexity of sea cucumber, but also lay a foundation for improving its emergence rate.
Jiang, Peng; Scarpa, Joseph R; Fitzpatrick, Karrie; Losic, Bojan; Gao, Vance D; Hao, Ke; Summa, Keith C; Yang, He S; Zhang, Bin; Allada, Ravi; Vitaterna, Martha H; Turek, Fred W; Kasarskis, Andrew
2015-05-05
Sleep dysfunction and stress susceptibility are comorbid complex traits that often precede and predispose patients to a variety of neuropsychiatric diseases. Here, we demonstrate multilevel organizations of genetic landscape, candidate genes, and molecular networks associated with 328 stress and sleep traits in a chronically stressed population of 338 (C57BL/6J × A/J) F2 mice. We constructed striatal gene co-expression networks, revealing functionally and cell-type-specific gene co-regulations important for stress and sleep. Using a composite ranking system, we identified network modules most relevant for 15 independent phenotypic categories, highlighting a mitochondria/synaptic module that links sleep and stress. The key network regulators of this module are overrepresented with genes implicated in neuropsychiatric diseases. Our work suggests that the interplay among sleep, stress, and neuropathology emerges from genetic influences on gene expression and their collective organization through complex molecular networks, providing a framework for interrogating the mechanisms underlying sleep, stress susceptibility, and related neuropsychiatric disorders.
Transcriptional Regulatory Network Analysis of MYB Transcription Factor Family Genes in Rice.
Smita, Shuchi; Katiyar, Amit; Chinnusamy, Viswanathan; Pandey, Dev M; Bansal, Kailash C
2015-01-01
The MYB transcription factor (TF) family is one of the largest TF families and regulates defense responses to various stresses, hormone signaling as well as many metabolic and developmental processes in plants. Understanding these regulatory hierarchies of gene expression networks in response to developmental and environmental cues is a major challenge due to the complex interactions between the genetic elements. Correlation analyses are useful to unravel co-regulated gene pairs governing biological processes as well as to identify new candidate hub genes in response to these complex processes. High throughput expression profiling data are highly useful for construction of co-expression networks. In the present study, we utilized transcriptome data for comprehensive regulatory network studies of MYB TFs by "top-down" and "guide-gene" approaches. More than 50% of OsMYBs were strongly correlated under 50 experimental conditions with 51 hub genes via the "top-down" approach. Further, clusters were identified using Markov Clustering (MCL). To maximize the clustering performance, parameter evaluation of the MCL inflation score (I) was performed in terms of enriched GO categories by measuring F-score. Comparison of co-expressed clusters and clades from the phylogenetic analysis signifies their evolutionarily conserved co-regulatory role. We utilized a compendium of known interactions and biological roles with Gene Ontology enrichment analysis to hypothesize the function of coexpressed OsMYBs. In the other part, the transcriptional regulatory network analysis by the "guide-gene" approach revealed 40 putative targets of 26 OsMYB TF hubs with high correlation value utilizing 815 microarray data. The enrichment of MYB-binding cis-elements in the promoter regions of the putative targets, together with their functional co-occurrence and nuclear localization, supports our finding. 
In particular, MYB-binding regions involved in drought inducibility were enriched, implying a regulatory role in the drought response in rice. Thus, the co-regulatory network analysis facilitated the identification of complex OsMYB regulatory networks, and candidate target regulon genes of selected guide MYB genes. The results contribute to candidate gene screening and provide experimentally testable hypotheses for potential regulatory MYB TFs and their targets under stress conditions.
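The Markov Clustering step mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy adjacency matrix, the function names and the convergence settings are all assumptions made for illustration. The sketch only shows the two alternating MCL operations, expansion (matrix squaring) and inflation (element-wise power followed by column normalisation), and where the inflation parameter I enters.

```python
def normalize_columns(m):
    """Rescale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / total if total else 0.0
    return out


def matmul(a, b):
    """Plain square matrix product, sufficient for a toy example."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, iterations=50, tol=1e-6):
    """Cluster an undirected graph given as a 0/1 (or weighted) adjacency matrix."""
    n = len(adjacency)
    # Self-loops are added before normalisation, as is conventional for MCL.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        expanded = matmul(m, m)                                  # expansion
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded])  # inflation
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Rows that keep non-zero mass belong to attractors; the columns they
    # cover are the members of that cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-3)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by edge 2-3.
toy = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(toy, inflation=2.0)
```

Raising the inflation value generally yields more, smaller clusters, which is why the abstract describes tuning it against an external criterion (here, GO enrichment scored by F-score).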
Li, Yunhai; Lee, Kee Khoon; Walsh, Sean; Smith, Caroline; Hadingham, Sophie; Sorefan, Karim; Cawley, Gavin; Bevan, Michael W
2006-03-01
Establishing transcriptional regulatory networks by analysis of gene expression data and promoter sequences shows great promise. We developed a novel promoter classification method using a Relevance Vector Machine (RVM) and Bayesian statistical principles to identify discriminatory features in the promoter sequences of genes that can correctly classify transcriptional responses. The method was applied to microarray data obtained from Arabidopsis seedlings treated with glucose or abscisic acid (ABA). Of those genes showing >2.5-fold changes in expression level, approximately 70% were correctly predicted as being up- or down-regulated (under 10-fold cross-validation), based on the presence or absence of a small set of discriminative promoter motifs. Many of these motifs have known regulatory functions in sugar- and ABA-mediated gene expression. One promoter motif that was not known to be involved in glucose-responsive gene expression was identified as the strongest classifier of glucose-up-regulated gene expression. We show it confers glucose-responsive gene expression in conjunction with another promoter motif, thus validating the classification method. We were able to establish a detailed model of glucose and ABA transcriptional regulatory networks and their interactions, which will help us to understand the mechanisms linking metabolism with growth in Arabidopsis. This study shows that machine learning strategies coupled to Bayesian statistical methods hold significant promise for identifying functionally significant promoter sequences.
2014-01-01
Background Plant secondary metabolites are critical to various biological processes. However, the regulation of these metabolites is complex because of regulatory rewiring or crosstalk. To unveil how regulatory behaviors on secondary metabolism reshape biological processes, we constructed and analyzed a dynamic regulatory network of secondary metabolic pathways in Arabidopsis. Results The dynamic regulatory network was constructed through integrating co-expressed gene pairs and regulatory interactions. Regulatory interactions were either predicted by conserved transcription factor binding sites (TFBSs) or proved by experiments. We found that integrating the two data types (co-expression and predicted regulatory interactions) enhanced the number of highly confident regulatory interactions by over 10% compared with using a single data type. The dynamic changes of the regulatory network systematically manifested regulatory rewiring to explain the mechanism of regulation, such as, in terpenoid metabolism, the regulatory crosstalk of RAV1 (AT1G13260) and ATHB1 (AT3G01470) on HMG1 (hydroxymethylglutaryl-CoA reductase, AT1G76490); and regulation of RAV1 on epoxysqualene biosynthesis and sterol biosynthesis. Besides, we investigated regulatory rewiring with expression, network topology and upstream signaling pathways. Regulatory rewiring was revealed by the variability of genes’ expression: pathway genes and transcription factors (TFs) were significantly differentially expressed under different conditions (such as terpenoid biosynthetic genes in tissue experiments and E2F/DP family members in genotype experiments). Both network topology and signaling pathways supported regulatory rewiring. For example, we discovered correlation among the numbers of pathway genes, TFs and network topology: one-gene pathways (such as δ-carotene biosynthesis) were regulated by fewer TFs, and were not critical to the metabolic network because of their low degrees in topology. 
Upstream signaling pathways of 50 TFs were identified to comprehend the underlying mechanism of TFs’ regulatory rewiring. Conclusion Overall, this dynamic regulatory network largely improves the understanding of complex regulatory rewiring in secondary metabolism in Arabidopsis. PMID:24993737
2014-01-01
Background Genome-wide microarrays have been useful for predicting chemical-genetic interactions at the gene level. However, interpreting genome-wide microarray results can be overwhelming due to the vast output of gene expression data combined with off-target transcriptional responses many times induced by a drug treatment. This study demonstrates how experimental and computational methods can interact with each other, to arrive at more accurate predictions of drug-induced perturbations. We present a two-stage strategy that links microarray experimental testing and network training conditions to predict gene perturbations for a drug with a known mechanism of action in a well-studied organism. Results S. cerevisiae cells were treated with the antifungal, fluconazole, and expression profiling was conducted under different biological conditions using Affymetrix genome-wide microarrays. Transcripts were filtered with a formal network-based method, sparse simultaneous equation models and Lasso regression (SSEM-Lasso), under different network training conditions. Gene expression results were evaluated using both gene set and single gene target analyses, and the drug’s transcriptional effects were narrowed first by pathway and then by individual genes. Variables included: (i) Testing conditions – exposure time and concentration and (ii) Network training conditions – training compendium modifications. Two analyses of SSEM-Lasso output – gene set and single gene – were conducted to gain a better understanding of how SSEM-Lasso predicts perturbation targets. Conclusions This study demonstrates that genome-wide microarrays can be optimized using a two-stage strategy for a more in-depth understanding of how a cell manifests biological reactions to a drug treatment at the transcription level. Additionally, a more detailed understanding of how the statistical model, SSEM-Lasso, propagates perturbations through a network of gene regulatory interactions is achieved. PMID:24444313
Farber, Charles R
2010-11-01
Bone mineral density (BMD) is influenced by a complex network of gene interactions; therefore, elucidating the relationships between genes and how those genes, in turn, influence BMD is critical for developing a comprehensive understanding of osteoporosis. To investigate the role of transcriptional networks in the regulation of BMD, we performed a weighted gene coexpression network analysis (WGCNA) using microarray expression data on monocytes from young individuals with low or high BMD. WGCNA groups genes into modules based on patterns of gene coexpression, and our analysis identified 11 gene modules. We observed that the overall expression of one module (referred to as module 9) was significantly higher in the low-BMD group (p = .03). Module 9 was highly enriched for genes belonging to the immune system-related gene ontology (GO) category "response to virus" (p = 7.6 × 10^-11). Using publicly available genome-wide association study data, we independently validated the importance of module 9 by demonstrating that highly connected module 9 hubs were more likely, relative to less highly connected genes, to be genetically associated with BMD. This study highlights the advantages of systems-level analyses to uncover coexpression modules associated with bone mass and suggests that particular monocyte expression patterns may mediate differences in BMD.
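To make the notions of a "weighted" coexpression network and a "highly connected hub" concrete, the following sketch computes the unsigned WGCNA-style adjacency a_ij = |cor(x_i, x_j)|^β and the connectivity k_i = Σ_j a_ij. The expression values and gene names are invented for illustration, and β = 6 is only a commonly used choice, not a value reported in the abstract. Module detection itself (hierarchical clustering of a topological-overlap matrix in WGCNA) is deliberately left out.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta."""
    genes = sorted(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = a
            adj[h][g] = a
    return adj


def connectivity(adj):
    """Sum of adjacencies to all other genes; large values indicate hubs."""
    return {g: sum(neighbours.values()) for g, neighbours in adj.items()}


# Hypothetical expression profiles over five samples (not data from the study).
expr = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],   # perfectly correlated with geneA
    "geneC": [5, 3, 4, 1, 2],    # correlation of -0.8 with geneA
    "geneD": [1, -1, 0, -1, 1],  # uncorrelated with geneA
}
adj = soft_threshold_adjacency(expr, beta=6)
k = connectivity(adj)
```

Raising the correlation to a power suppresses weak correlations much more than strong ones, so connectivity is dominated by a gene's strongest coexpression partners.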
Modrák, Martin; Vohradský, Jiří
2018-04-13
Identifying regulons of sigma factors is a vital subtask of gene network inference. Integrating multiple sources of data is essential for correct identification of regulons and complete gene regulatory networks. Time series of expression data measured with microarrays or RNA-seq combined with static binding experiments (e.g., ChIP-seq) or literature mining may be used for inference of sigma factor regulatory networks. We introduce Genexpi: a tool to identify sigma factors by combining candidates obtained from ChIP experiments or literature mining with time-course gene expression data. While Genexpi can be used to infer other types of regulatory interactions, it was designed and validated on real biological data from bacterial regulons. In this paper, we put primary focus on CyGenexpi: a plugin integrating Genexpi with the Cytoscape software for ease of use. As a part of this effort, a plugin for handling time series data in Cytoscape called CyDataseries has been developed and made available. Genexpi is also available as a standalone command line tool and an R package. Genexpi is a useful part of gene network inference toolbox. It provides meaningful information about the composition of regulons and delivers biologically interpretable results.
O'Brien, M.A.; Costin, B.N.; Miles, M.F.
2014-01-01
Postgenomic studies of the function of genes and their role in disease have now become an area of intense study since efforts to define the raw sequence material of the genome have largely been completed. The use of whole-genome approaches such as microarray expression profiling and, more recently, RNA-sequence analysis of transcript abundance has allowed an unprecedented look at the workings of the genome. However, the accurate derivation of such high-throughput data and their analysis in terms of biological function has been critical to truly leveraging the postgenomic revolution. This chapter will describe an approach that focuses on the use of gene networks to both organize and interpret genomic expression data. Such networks, derived from statistical analysis of large genomic datasets and the application of multiple bioinformatics data resources, potentially allow the identification of key control elements for networks associated with human disease, and thus may lead to derivation of novel therapeutic approaches. However, as discussed in this chapter, the leveraging of such networks cannot occur without a thorough understanding of the technical and statistical factors influencing the derivation of genomic expression data. Thus, while the catch phrase may be “it's the network … stupid,” the understanding of factors extending from RNA isolation to genomic profiling technique, multivariate statistics, and bioinformatics are all critical to defining fully useful gene networks for study of complex biology. PMID:23195313
Identification of the Key Genes and Pathways in Esophageal Carcinoma.
Su, Peng; Wen, Shiwang; Zhang, Yuefeng; Li, Yong; Xu, Yanzhao; Zhu, Yonggang; Lv, Huilai; Zhang, Fan; Wang, Mingbo; Tian, Ziqiang
2016-01-01
Objective. Esophageal carcinoma (EC) is a common gastrointestinal malignancy worldwide. This study aims to screen key genes and pathways in EC and to elucidate its underlying mechanisms. Methods. Five microarray datasets of EC were downloaded from Gene Expression Omnibus. Differentially expressed genes (DEGs) were screened by bioinformatics analysis. Gene Ontology (GO) enrichment, Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment, and protein-protein interaction (PPI) network construction were performed to obtain the biological roles of DEGs in EC. Quantitative real-time polymerase chain reaction (qRT-PCR) was used to verify the expression level of DEGs in EC. Results. A total of 1955 genes were filtered as DEGs in EC. The upregulated genes were significantly enriched in the cell cycle and the downregulated genes were significantly enriched in endocytosis. The PPI network showed that CDK4 and CCT3 were hub proteins. The expression levels of 8 dysregulated DEGs, including CDK4, CCT3, THSD4, SIM2, MYBL2, CENPF, CDCA3, and CDKN3, were validated in EC relative to adjacent nontumor tissues, and the results matched the microarray analysis. Conclusion. The significant DEGs, including CDK4, CCT3, THSD4, and SIM2, may play key roles in the tumorigenesis and development of EC through their involvement in the cell cycle and endocytosis.
Gérard, Claude; Novák, Béla
2013-01-01
microRNAs (miRNAs) are small noncoding RNAs that are important post-transcriptional regulators of gene expression. miRNAs can induce thresholds in protein synthesis. Such thresholds in protein output can be also achieved by oligomerization of transcription factors (TF) for the control of gene expression. First, we propose a minimal model for protein expression regulated by miRNA and by oligomerization of TF. We show that miRNA and oligomerization of TF generate a buffer, which increases the robustness of protein output towards molecular noise as well as towards random variation of kinetics parameters. Next, we extend the model by considering that the same miRNA can bind to multiple messenger RNAs, which accounts for the dynamics of a minimal competing endogenous RNAs (ceRNAs) network. The model shows that, through common miRNA regulation, TF can control the expression of all proteins formed by the ceRNA network, even if it drives the expression of only one gene in the network. The model further suggests that the threshold in protein synthesis mediated by the oligomerization of TF can be propagated to the other genes, which can increase the robustness of the expression of all genes in such ceRNA network. Furthermore, we show that a miRNA could increase the time delay of a “Goodwin-like” oscillator model, which may favor the occurrence of oscillations of large amplitude. This result predicts important roles of miRNAs in the control of the molecular mechanisms leading to the emergence of biological rhythms. Moreover, a model for the latter oscillator embedded in a ceRNA network indicates that the oscillatory behavior can be propagated, via the shared miRNA, to all proteins formed by such ceRNA network. Thus, by means of computational models, we show that miRNAs could act as vectors allowing the propagation of robustness in protein synthesis as well as oscillatory behaviors within ceRNA networks. PMID:24376695
Crombach, Anton; Cicin-Sain, Damjan; Wotton, Karl R; Jaeger, Johannes
2012-01-01
Understanding the function and evolution of developmental regulatory networks requires the characterisation and quantification of spatio-temporal gene expression patterns across a range of systems and species. However, most high-throughput methods to measure the dynamics of gene expression do not preserve the detailed spatial information needed in this context. For this reason, quantification methods based on image bioinformatics have become increasingly important over the past few years. Most available approaches in this field either focus on the detailed and accurate quantification of a small set of gene expression patterns, or attempt high-throughput analysis of spatial expression through binary pattern extraction and large-scale analysis of the resulting datasets. Here we present a robust, "medium-throughput" pipeline to process in situ hybridisation patterns from embryos of different species of flies. It bridges the gap between high-resolution, and high-throughput image processing methods, enabling us to quantify graded expression patterns along the antero-posterior axis of the embryo in an efficient and straightforward manner. Our method is based on a robust enzymatic (colorimetric) in situ hybridisation protocol and rapid data acquisition through wide-field microscopy. Data processing consists of image segmentation, profile extraction, and determination of expression domain boundary positions using a spline approximation. It results in sets of measured boundaries sorted by gene and developmental time point, which are analysed in terms of expression variability or spatio-temporal dynamics. Our method yields integrated time series of spatial gene expression, which can be used to reverse-engineer developmental gene regulatory networks across species. It is easily adaptable to other processes and species, enabling the in silico reconstitution of gene regulatory networks in a wide range of developmental contexts.
Tuller, Tamir; Atar, Shimshi; Ruppin, Eytan; Gurevich, Michael; Achiron, Anat
2011-09-15
Multiple sclerosis (MS) is a central nervous system autoimmune inflammatory T-cell-mediated disease with a relapsing-remitting course in the majority of patients. In this study, we performed a high-resolution systems biology analysis of gene expression and physical interactions in MS relapse and remission. To this end, we integrated 164 large-scale measurements of gene expression in peripheral blood mononuclear cells of MS patients in relapse or remission and healthy subjects, with large-scale information about the physical interactions between these genes obtained from public databases. These data were analyzed with a variety of computational methods. We find that there is a clear and significant global network-level signal that is related to the changes in gene expression of MS patients in comparison to healthy subjects. However, despite the clear differences in the clinical symptoms of MS patients in relapse versus remission, the network-level signal is weaker when comparing patients in these two stages of the disease. This result suggests that most of the genes have relatively similar expression levels in the two stages of the disease. In accordance with previous studies, we found that the pathways related to regulation of cell death, chemotaxis and inflammatory response are differentially expressed in the disease in comparison to healthy subjects, while pathways related to cell adhesion, cell migration and cell-cell signaling are activated in relapse in comparison to remission. However, the current study includes a detailed report of the exact set of genes involved in these pathways and the interactions between them. For example, we found that the genes TP53 and IL1 are 'network hubs' that interact with many of the differentially expressed genes in MS patients versus healthy subjects, and the epidermal growth factor receptor is a 'network hub' in the case of MS patients with relapse versus remission. 
The statistical approaches employed in this study enabled us to report new sets of genes that according to their gene expression and physical interactions are predicted to be differentially expressed in MS versus healthy subjects, and in MS patients in relapse versus remission. Some of these genes may be useful biomarkers for diagnosing MS and predicting relapses in MS patients.
García-Alonso, Luz; Alonso, Roberto; Vidal, Enrique; Amadoz, Alicia; de María, Alejandro; Minguez, Pablo; Medina, Ignacio; Dopazo, Joaquín
2012-01-01
Genomic experiments (e.g. differential gene expression, single-nucleotide polymorphism association) typically produce ranked lists of genes. We present a simple but powerful approach which uses protein–protein interaction data to detect sub-networks within such ranked lists of genes or proteins. We performed an exhaustive study of network parameters that allowed us to conclude that the average number of components and the average number of nodes per component are the parameters that best discriminate between real and random networks. A novel aspect that increases the efficiency of this strategy in finding sub-networks is that, in addition to direct connections, connections mediated by intermediate nodes are also considered when building the sub-networks. The possibility of using such intermediate nodes makes this approach more robust to noise. It also overcomes some limitations intrinsic to experimental designs based on differential expression, in which some nodes are invariant across conditions. The proposed approach can also be used for candidate disease-gene prioritization. Here, we demonstrate the usefulness of the approach by means of several case examples that include a differential expression analysis in Fanconi Anemia, a genome-wide association study of bipolar disorder and a genome-scale study of essentiality in cancer genes. An efficient and easy-to-use web interface (available at http://www.babelomics.org) based on HTML5 technologies is also provided to run the algorithm and represent the network. PMID:22844098
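The two discriminating parameters named above (number of components and nodes per component) and the effect of allowing intermediate nodes can be sketched as follows. This is an illustrative reading of the abstract, not the published algorithm: the function name, the restriction to a single intermediate node, the choice to count unconnected genes as components of size one, and the toy interaction list are all assumptions.

```python
from collections import defaultdict


def subnetwork_components(ppi_edges, gene_list, allow_intermediate=True):
    """Connected components among the listed genes.

    Two listed genes are linked if they interact directly or, optionally,
    if they share one interactor that is not itself in the list.
    """
    neighbours = defaultdict(set)
    for a, b in ppi_edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    selected = set(gene_list)
    links = defaultdict(set)
    for gene in selected:
        for nb in neighbours[gene]:
            if nb in selected:
                links[gene].add(nb)              # direct connection
            elif allow_intermediate:
                for second in neighbours[nb]:    # connection via one outsider
                    if second in selected and second != gene:
                        links[gene].add(second)

    # Depth-first search over the derived links to collect components.
    components, seen = [], set()
    for gene in sorted(selected):
        if gene in seen:
            continue
        stack, component = [gene], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(links[current] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical interactions; X and E are not part of the ranked list.
ppi = [("A", "B"), ("B", "X"), ("X", "C"), ("D", "E")]
top_genes = ["A", "B", "C", "D"]

direct = subnetwork_components(ppi, top_genes, allow_intermediate=False)
bridged = subnetwork_components(ppi, top_genes, allow_intermediate=True)
mean_size = sum(len(c) for c in bridged) / len(bridged)
```

In the toy example, gene C joins the A-B component only when the unlisted interactor X is allowed to act as a bridge, which is the behaviour the abstract credits with making the method more robust to noise.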
Sierra, Crystal S.; Haase, Steven B.
2016-01-01
The pathogenic yeast Cryptococcus neoformans causes fungal meningitis in immune-compromised patients. Cell proliferation in the budding yeast form is required for C. neoformans to infect human hosts, and virulence factors such as capsule formation and melanin production are affected by cell-cycle perturbation. Thus, understanding cell-cycle regulation is critical for a full understanding of virulence factors for disease. Our group and others have demonstrated that a large fraction of genes in Saccharomyces cerevisiae is expressed periodically during the cell cycle, and that proper regulation of this transcriptional program is important for proper cell division. Despite the evolutionary divergence of the two budding yeasts, we found that a similar percentage of all genes (~20%) is periodically expressed during the cell cycle in both yeasts. However, the temporal ordering of periodic expression has diverged for some orthologous cell-cycle genes, especially those related to bud emergence and bud growth. Genes regulating DNA replication and mitosis exhibited a conserved ordering in both yeasts, suggesting that essential cell-cycle processes are conserved in periodicity and in timing of expression (i.e. duplication before division). In S. cerevisiae cells, we have proposed that an interconnected network of periodic transcription factors (TFs) controls the bulk of the cell-cycle transcriptional program. We found that temporal ordering of orthologous network TFs was not always maintained; however, the TF network topology at cell-cycle commitment appears to be conserved in C. neoformans. During the C. neoformans cell cycle, DNA replication genes, mitosis genes, and 40 genes involved in virulence are periodically expressed. Future work toward understanding the gene regulatory network that controls cell-cycle genes is critical for developing novel antifungals to inhibit pathogen proliferation. PMID:27918582
Fang, Lingzhao; Sørensen, Peter; Sahana, Goutam; Panitz, Frank; Su, Guosheng; Zhang, Shengli; Yu, Ying; Li, Bingjie; Ma, Li; Liu, George; Lund, Mogens Sandø; Thomsen, Bo
2018-06-19
MicroRNAs (miRNA) are key modulators of gene expression and so act as putative fine-tuners of complex phenotypes. Here, we hypothesized that causal variants of complex traits are enriched in miRNAs and miRNA-target networks. First, we conducted a genome-wide association study (GWAS) for seven functional and milk production traits using imputed sequence variants (13~15 million) and >10,000 animals from three dairy cattle breeds, i.e., Holstein (HOL), Nordic red cattle (RDC) and Jersey (JER). Second, we analyzed for enrichments of association signals in miRNAs and their miRNA-target networks. Our results demonstrated that genomic regions harboring miRNA genes were significantly (P < 0.05) enriched with GWAS signals for milk production traits and mastitis, and that enrichments within miRNA-target gene networks were significantly higher than in random gene-sets for the majority of traits. Furthermore, most between-trait and across-breed correlations of enrichments with miRNA-target networks were significantly greater than with random gene-sets, suggesting pleiotropic effects of miRNAs. Intriguingly, genes that were differentially expressed in response to mammary gland infections were significantly enriched in the miRNA-target networks associated with mastitis. All these findings were consistent across three breeds. Collectively, our observations demonstrate the importance of miRNAs and their targets for the expression of complex traits.
Yang, Yi; Maxwell, Andrew; Zhang, Xiaowei; Wang, Nan; Perkins, Edward J; Zhang, Chaoyang; Gong, Ping
2013-01-01
Pathway alterations reflected as changes in gene expression regulation and gene interaction can result from cellular exposure to toxicants. Such information is often used to elucidate toxicological modes of action. From a risk assessment perspective, alterations in biological pathways are a rich resource for setting toxicant thresholds, which may be more sensitive and mechanism-informed than traditional toxicity endpoints. Here we developed a novel differential networks (DNs) approach to connect pathway perturbation with toxicity threshold setting. Our DNs approach consists of 6 steps: time-series gene expression data collection, identification of altered genes, gene interaction network reconstruction, differential edge inference, mapping of genes with differential edges to pathways, and establishment of causal relationships between chemical concentration and perturbed pathways. A one-sample Gaussian process model and a linear regression model were used to identify genes that exhibited significant profile changes across an entire time course and between treatments, respectively. Interaction networks of differentially expressed (DE) genes were reconstructed for different treatments using a state space model and then compared to infer differential edges/interactions. DE genes possessing differential edges were mapped to biological pathways in databases such as KEGG pathways. Using the DNs approach, we analyzed a time-series Escherichia coli live cell gene expression dataset consisting of 4 treatments (control, 10, 100, 1000 mg/L naphthenic acids, NAs) and 18 time points. Through comparison of reconstructed networks and construction of differential networks, 80 genes were identified as DE genes with a significant number of differential edges, and 22 KEGG pathways were altered in a concentration-dependent manner. Some of these pathways were perturbed to a degree as high as 70% even at the lowest exposure concentration, implying a high sensitivity of our DNs approach. 
Findings from this proof-of-concept study suggest that our approach has a great potential in providing a novel and sensitive tool for threshold setting in chemical risk assessment. In future work, we plan to analyze more time-series datasets with a full spectrum of concentrations and sufficient replications per treatment. The pathway alteration-derived thresholds will also be compared with those derived from apical endpoints such as cell growth rate.
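The "differential edge inference" and "mapping to pathways" steps can be sketched in a few lines. The sketch assumes that the networks for two treatments have already been reconstructed (the abstract uses a state space model for that, which is not reproduced here) and treats an edge as differential when it is present in exactly one of them. The perturbation score, the fraction of a pathway's genes that carry at least one differential edge, is an illustrative assumption and not necessarily the measure behind the 70% figure reported above. Gene and pathway names are placeholders.

```python
def differential_edges(control_edges, treated_edges):
    """Edges present in exactly one of the two undirected networks."""
    control = {frozenset(edge) for edge in control_edges}
    treated = {frozenset(edge) for edge in treated_edges}
    return control ^ treated  # symmetric difference


def pathway_perturbation(pathway_genes, diff_edges):
    """Fraction of pathway genes that sit on at least one differential edge."""
    touched = {gene for edge in diff_edges for gene in edge}
    pathway = set(pathway_genes)
    return len(pathway & touched) / len(pathway)


# Hypothetical reconstructed networks for a control and one exposure level.
control_net = [("a", "b"), ("b", "c")]
treated_net = [("b", "a"), ("c", "d")]   # a-b kept, b-c lost, c-d gained

diff = differential_edges(control_net, treated_net)
score = pathway_perturbation(["a", "b", "c", "d"], diff)
```

Computing such a score separately for each concentration would give the kind of concentration-response relationship from which a threshold could be read off.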
2011-01-01
Background Gene co-expression, in the form of a correlation coefficient, has been valuable in the analysis, classification and prediction of protein-protein interactions. However, it is susceptible to bias from a few samples having a large effect on the correlation coefficient. Gene co-expression stability is a means of quantifying this bias, with high stability indicating robust, unbiased co-expression correlation coefficients. We assess the utility of gene co-expression stability as an additional measure to support the co-expression correlation in the analysis of protein-protein interaction networks. Results We studied the patterns of co-expression correlation and stability in interacting proteins with respect to their interaction promiscuity, levels of intrinsic disorder, and essentiality or disease-relatedness. Co-expression stability, along with co-expression correlation, acts as a better classifier of hub proteins in interaction networks, than co-expression correlation alone, enabling the identification of a class of hubs that are functionally distinct from the widely accepted transient (date) and obligate (party) hubs. Proteins with high levels of intrinsic disorder have low co-expression correlation and high stability with their interaction partners suggesting their involvement in transient interactions, except for a small group that have high co-expression correlation and are typically subunits of stable complexes. Similar behavior was seen for disease-related and essential genes. Interacting proteins that are both disordered have higher co-expression stability than ordered protein pairs. Using co-expression correlation and stability, we found that transient interactions are more likely to occur between an ordered and a disordered protein while obligate interactions primarily occur between proteins that are either both ordered, or disordered. 
Conclusions We observe that co-expression stability shows distinct patterns in structurally and functionally different groups of proteins and interactions. We conclude that it is a useful and important measure to be used in concert with gene co-expression correlation for further insights into the characteristics of proteins in the context of their interaction network. PMID:22369639
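The bias described in this abstract, where a few samples dominate a correlation coefficient, can be illustrated with a small leave-one-out check. This is only a sketch under assumptions: the abstract does not define how co-expression stability is computed, so the `leave_one_out_range` function and the expression values below are hypothetical stand-ins rather than the authors' measure.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def leave_one_out_range(x, y):
    """Spread (max - min) of the correlation when each sample is dropped once.

    Hypothetical instability score: a small spread means no single sample
    drives the correlation.
    """
    rs = [pearson(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(len(x))]
    return max(rs) - min(rs)

# Made-up expression values for two gene pairs across six samples.
robust_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
robust_b = [1.1, 2.1, 2.9, 4.2, 4.8, 6.1]
# Five similar samples plus one extreme sample that creates the correlation.
outlier_a = [1.0, 1.2, 0.9, 1.1, 1.0, 9.0]
outlier_b = [2.0, 1.8, 2.1, 1.9, 2.2, 9.5]

print(pearson(robust_a, robust_b), leave_one_out_range(robust_a, robust_b))
print(pearson(outlier_a, outlier_b), leave_one_out_range(outlier_a, outlier_b))
```

Both pairs show a high correlation on the full data, but only the second pair changes sharply when its extreme sample is left out.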
A Scalable Approach for Discovering Conserved Active Subnetworks across Species
Verfaillie, Catherine M.; Hu, Wei-Shou; Myers, Chad L.
2010-01-01
Overlaying differential changes in gene expression on protein interaction networks has proven to be a useful approach to interpreting the cell's dynamic response to a changing environment. Despite successes in finding active subnetworks in the context of a single species, the idea of overlaying lists of differentially expressed genes on networks has not yet been extended to support the analysis of multiple species' interaction networks. To address this problem, we designed a scalable, cross-species network search algorithm, neXus (Network - cross(X)-species - Search), that discovers conserved, active subnetworks based on parallel differential expression studies in multiple species. Our approach leverages functional linkage networks, which provide more comprehensive coverage of functional relationships than physical interaction networks by combining heterogeneous types of genomic data. We applied our cross-species approach to identify conserved modules that are differentially active in stem cells relative to differentiated cells based on parallel gene expression studies and functional linkage networks from mouse and human. We find hundreds of conserved active subnetworks enriched for stem cell-associated functions such as cell cycle, DNA repair, and chromatin modification processes. Using a variation of this approach, we also find a number of species-specific networks, which likely reflect mechanisms of stem cell function that have diverged between mouse and human. We assess the statistical significance of the subnetworks by comparing them with subnetworks discovered on random permutations of the differential expression data. We also describe several case examples that illustrate the utility of comparative analysis of active subnetworks. PMID:21170309
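The permutation-based significance assessment mentioned in this abstract can be sketched in simplified form. The version below is an assumption-laden illustration, not the neXus procedure: it scores one fixed subnetwork by its mean differential expression score and shuffles scores across genes, whereas the abstract describes rediscovering subnetworks on permuted data. The function names, gene labels and scores are hypothetical.

```python
import random

def subnetwork_score(scores, members):
    """Mean differential expression score over the subnetwork's genes."""
    return sum(scores[g] for g in members) / len(members)

def permutation_p_value(scores, members, n_perm=1000, seed=0):
    """Empirical p-value: fraction of label shuffles scoring at least as high."""
    rng = random.Random(seed)
    genes = list(scores)
    values = [scores[g] for g in genes]
    observed = subnetwork_score(scores, members)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)  # reassign scores to genes at random
        permuted = dict(zip(genes, values))
        if subnetwork_score(permuted, members) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Made-up scores: three strongly changed genes among twenty.
scores = {f"g{i}": 0.1 * (i % 5) for i in range(20)}
scores.update({"g0": 3.0, "g1": 2.8, "g2": 3.1})
p = permutation_p_value(scores, ["g0", "g1", "g2"])
print(p)
```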
Dutta, B; Pusztai, L; Qi, Y; André, F; Lazar, V; Bianchini, G; Ueno, N; Agarwal, R; Wang, B; Shiang, C Y; Hortobagyi, G N; Mills, G B; Symmans, W F; Balázsi, G
2012-01-01
Background: The rapid collection of diverse genome-scale data raises the urgent need to integrate and utilise these resources for biological discovery or biomedical applications. For example, diverse transcriptomic and gene copy number variation data are currently collected for various cancers, but relatively few current methods are capable of utilising the emerging information. Methods: We developed and tested a data-integration method to identify gene networks that drive the biology of breast cancer clinical subtypes. The method simultaneously overlays gene expression and gene copy number data on protein–protein interaction, transcriptional-regulatory and signalling networks by identifying coincident genomic and transcriptional disturbances in local network neighborhoods. Results: We identified distinct driver-networks for each of the three common clinical breast cancer subtypes: oestrogen receptor (ER)+, human epidermal growth factor receptor 2 (HER2)+, and triple receptor-negative breast cancers (TNBC) from patient and cell line data sets. Driver-networks inferred from independent datasets were significantly reproducible. We also confirmed the functional relevance of a subset of randomly selected driver-network members for TNBC in gene knockdown experiments in vitro. We found that TNBC driver-network member genes have increased functional specificity to TNBC cell lines and higher functional sensitivity compared with genes selected by differential expression alone. Conclusion: Clinical subtype-specific driver-networks identified through data integration are reproducible and functionally important. PMID:22343619
Ryan, Margaret M; Ryan, Brigid; Kyrke-Smith, Madeleine; Logan, Barbara; Tate, Warren P; Abraham, Wickliffe C; Williams, Joanna M
2012-01-01
Long-term potentiation (LTP) is widely accepted as a cellular mechanism underlying memory processes. It is well established that LTP persistence is strongly dependent on activation of constitutive and inducible transcription factors, but there is limited information regarding the downstream gene networks and controlling elements that coalesce to stabilise LTP. To identify these gene networks, we used Affymetrix RAT230.2 microarrays to detect genes regulated 5 h and 24 h (n = 5) after LTP induction at perforant path synapses in the dentate gyrus of awake adult rats. The functional relationships of the differentially expressed genes were examined using DAVID and Ingenuity Pathway Analysis, and compared with our previous data derived 20 min post-LTP induction in vivo. This analysis showed that LTP-related genes are predominantly upregulated at 5 h but that there is pronounced downregulation of gene expression at 24 h after LTP induction. Analysis of the structure of the networks and canonical pathways predicted a regulation of calcium dynamics via G-protein coupled receptors, dendritogenesis and neurogenesis at the 5 h time-point. By 24 h neurotrophin-NFKB driven pathways of neuronal growth were identified. The temporal shift in gene expression appears to be mediated by regulation of protein synthesis, ubiquitination and time-dependent regulation of specific microRNA and histone deacetylase expression. Together this programme of genomic responses, marked by both homeostatic and growth pathways, is likely to be critical for the consolidation of LTP in vivo.
A gene regulatory network armature for T-lymphocyte specification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fung, Elizabeth-sharon
Choice of a T-lymphoid fate by hematopoietic progenitor cells depends on sustained Notch-Delta signaling combined with tightly-regulated activities of multiple transcription factors. To dissect the regulatory network connections that mediate this process, we have used high-resolution analysis of regulatory gene expression trajectories from the beginning to the end of specification; tests of the short-term Notch-dependence of these gene expression changes; and perturbation analyses of the effects of overexpression of two essential transcription factors, namely PU.1 and GATA-3. Quantitative expression measurements of >50 transcription factor and marker genes have been used to derive the principal components of regulatory change through which T-cell precursors progress from primitive multipotency to T-lineage commitment. Distinct parts of the path reveal separate contributions of Notch signaling, GATA-3 activity, and downregulation of PU.1. Using BioTapestry, the results have been assembled into a draft gene regulatory network for the specification of T-cell precursors and the choice of T as opposed to myeloid dendritic or mast-cell fates. This network also accommodates effects of E proteins and mutual repression circuits of Gfi1 against Egr-2 and of TCF-1 against PU.1 as proposed elsewhere, but requires additional functions that remain unidentified. Distinctive features of this network structure include the intense dose-dependence of GATA-3 effects; the gene-specific modulation of PU.1 activity based on Notch activity; the lack of direct opposition between PU.1 and GATA-3; and the need for a distinct, late-acting repressive function or functions to extinguish stem and progenitor-derived regulatory gene expression.
NETWORK ASSISTED ANALYSIS TO REVEAL THE GENETIC BASIS OF AUTISM
Liu, Li; Lei, Jing; Roeder, Kathryn
2016-01-01
While studies show that autism is highly heritable, the nature of the genetic basis of this disorder remains elusive. Based on the idea that highly correlated genes are functionally interrelated and more likely to affect risk, we develop a novel statistical tool to find additional potential autism risk genes by combining the genetic association scores with gene co-expression in specific brain regions and periods of development. The gene dependence network is estimated using a novel partial neighborhood selection (PNS) algorithm, where node specific properties are incorporated into network estimation for improved statistical and computational efficiency. Then we adopt a hidden Markov random field (HMRF) model to combine the estimated network and the genetic association scores in a systematic manner. The proposed modeling framework can be naturally extended to incorporate additional structural information concerning the dependence between genes. Using currently available genetic association data from whole exome sequencing studies and brain gene expression levels, the proposed algorithm successfully identified 333 genes that plausibly affect autism risk. PMID:27134692
A Systems' Biology Approach to Study MicroRNA-Mediated Gene Regulatory Networks
Kunz, Manfred; Vera, Julio; Wolkenhauer, Olaf
2013-01-01
MicroRNAs (miRNAs) are potent effectors in gene regulatory networks where aberrant miRNA expression can contribute to human diseases such as cancer. For a better understanding of the regulatory role of miRNAs in coordinating gene expression, we here present a systems biology approach combining data-driven modeling and model-driven experiments. Such an approach is characterized by an iterative process, including biological data acquisition and integration, network construction, mathematical modeling and experimental validation. To demonstrate the application of this approach, we adopt it to investigate mechanisms of collective repression on p21 by multiple miRNAs. We first construct a p21 regulatory network based on data from the literature and further expand it using algorithms that predict molecular interactions. Based on the network structure, a detailed mechanistic model is established and its parameter values are determined using data. Finally, the calibrated model is used to study the effect of different miRNA expression profiles and cooperative target regulation on p21 expression levels in different biological contexts. PMID:24350286
Causal network analysis of head and neck keloid tissue identifies potential master regulators.
Garcia-Rodriguez, Laura; Jones, Lamont; Chen, Kang Mei; Datta, Indrani; Divine, George; Worsham, Maria J
2016-10-01
To generate novel insights and hypotheses in keloid development from potential master regulators. Prospective cohort. Six fresh keloid and six normal skin samples from 12 anonymous donors were used in a prospective cohort study. Genome-wide profiling was done previously on the cohort using the Infinium HumanMethylation450 BeadChip (Illumina, San Diego, CA). The 190 statistically significant CpG islands between keloid and normal tissue mapped to 152 genes (P < .05). The top 10 statistically significant genes (VAMP5, ACTR3C, GALNT3, KCNAB2, LRRC61, SCML4, SYNGR1, TNS1, PLEKHG5, PPP1R13-α, false discovery rate <.015) were uploaded into the Ingenuity Pathway Analysis software's Causal Network Analysis (QIAGEN, Redwood City, CA). To reflect expected gene expression direction in the context of methylation changes, the inverse of the methylation ratio from keloid versus normal tissue was used for the analysis. Causal Network Analysis identified disease-specific master regulator molecules based on downstream differentially expressed keloid-specific genes and expected directionality of expression (hypermethylated vs. hypomethylated). Causal Network Analysis software identified four hierarchical networks that included four master regulators (pyroxamide, tributyrin, PRKG2, and PENK) and 19 intermediate regulators. Causal Network Analysis of differentially methylated gene data of keloid versus normal skin demonstrated four causal networks with four master regulators. These hierarchical networks suggest potential driver roles for their downstream keloid gene targets in the pathogenesis of the keloid phenotype, likely triggered due to perturbation/injury to normal tissue. NA Laryngoscope, 126:E319-E324, 2016. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.
Nariai, N; Kim, S; Imoto, S; Miyano, S
2004-01-01
We propose a statistical method to estimate gene networks from DNA microarray data and protein-protein interactions. Because physical interactions between proteins or multiprotein complexes are likely to regulate biological processes, using only mRNA expression data is not sufficient for estimating a gene network accurately. Our method adds knowledge about protein-protein interactions to the estimation method of gene networks under a Bayesian statistical framework. In the estimated gene network, a protein complex is modeled as a virtual node based on principal component analysis. We show the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae cell cycle data. The proposed method improves the accuracy of the estimated gene networks, and successfully identifies some biological facts.
Guo, Sheng-Min; Wang, Jian-Xiong; Li, Jin; Xu, Fang-Yuan; Wei, Quan; Wang, Hai-Ming; Huang, Hou-Qiang; Zheng, Si-Lin; Xie, Yu-Jie; Zhang, Chi
2018-06-15
Osteoarthritis (OA) significantly influences the quality of life of people around the world. It is urgent to find an effective way to understand the genetic etiology of OA. We used weighted gene coexpression network analysis (WGCNA) to explore the key genes involved in the subchondral bone pathological process of OA. Fifty gene expression profiles of GSE51588 were downloaded from the Gene Expression Omnibus database. The OA-associated genes and gene ontologies were acquired from JuniorDoc. Weighted gene coexpression network analysis was used to find disease-related networks based on 21756 gene expression correlation coefficients, hub-genes with the highest connectivity in each module were selected, and the correlation between module eigengene and clinical traits was calculated. The genes in the traits-related gene coexpression modules were subject to functional annotation and pathway enrichment analysis using ClusterProfiler. A total of 73 gene modules were identified, of which 12 modules were found with high connectivity with clinical traits. Five modules were found with enriched OA-associated genes. Moreover, 310 OA-associated genes were found, and 34 of them were among hub-genes in each module. Consequently, enrichment results indicated some key metabolic pathways, such as extracellular matrix (ECM)-receptor interaction (hsa04512), focal adhesion (hsa04510), the phosphatidylinositol 3'-kinase (PI3K)-Akt signaling pathway (PI3K-AKT) (hsa04151), transforming growth factor beta pathway, and Wnt pathway. We intended to identify some core genes, collagen (COL)6A3, COL6A1, ITGA11, BAMBI, and HCK, which could influence downstream signaling pathways once they were activated. In this study, we identified important genes within key coexpression modules, which associate with a pathological process of subchondral bone in OA. Functional analysis results could provide important information to understand the mechanism of OA. © 2018 Wiley Periodicals, Inc.
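The hub-gene selection step described above (highest connectivity in a weighted coexpression network) can be sketched as follows. This is a minimal illustration, assuming the standard unsigned WGCNA adjacency |cor|^beta; the soft-threshold power of 6, the gene names and the expression values are assumptions for the example, not values taken from the study, and module detection and eigengene computation are left out.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)

def soft_threshold_connectivity(expr, beta=6):
    """Connectivity k_i = sum over j != i of |cor(i, j)| ** beta."""
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(expr[gi], expr[gj])) ** beta  # unsigned adjacency
            k[gi] += a
            k[gj] += a
    return k

# Made-up expression profiles across five samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.2, 1.9, 3.1, 4.1, 4.9],
    "geneC": [0.9, 2.2, 2.8, 4.0, 5.2],
    "geneD": [3.0, 1.0, 4.0, 1.5, 2.0],
}
connectivity = soft_threshold_connectivity(expr)
hub = max(connectivity, key=connectivity.get)  # candidate hub gene
print(hub, connectivity)
```

Raising the correlation to a power suppresses weak correlations, so the poorly correlated gene contributes almost nothing to the connectivity of the others.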
Harnessing Diversity towards the Reconstructing of Large Scale Gene Regulatory Networks
Yamanaka, Ryota; Kitano, Hiroaki
2013-01-01
Elucidating gene regulatory networks (GRNs) from large scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus driven approaches combining different algorithms, have become a potentially promising strategy to infer accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) a simple strategy to combine many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii) TopkNet integrating only high-performance algorithms provides significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can be useful to determine potential optimal algorithms for reconstruction of unknown regulatory networks, i.e., if expression data associated with a known regulatory network are similar to those associated with an unknown regulatory network, optimal algorithms determined for the known regulatory network can be repurposed to infer the unknown regulatory network. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks. PMID:24278007
Large-Scale Analysis of Network Bistability for Human Cancers
Shiraishi, Tetsuya; Matsuyama, Shinako; Kitano, Hiroaki
2010-01-01
Protein–protein interaction and gene regulatory networks are likely to be locked in a state corresponding to a disease by the behavior of one or more bistable circuits exhibiting switch-like behavior. Sets of genes could be over-expressed or repressed when anomalies due to disease appear, and the circuits responsible for this over- or under-expression might persist for as long as the disease state continues. This paper shows how a large-scale analysis of network bistability for various human cancers can identify genes that can potentially serve as drug targets or diagnosis biomarkers. PMID:20628618
Gehan, Malia A; Mockler, Todd C; Weinig, Cynthia; Ewers, Brent E
2017-01-01
The dynamics of local climates make development of agricultural strategies challenging. Yield improvement has progressed slowly, especially in drought-prone regions where annual crop production suffers from episodic aridity. Underlying drought responses are circadian and diel control of gene expression that regulate daily variations in metabolic and physiological pathways. To identify transcriptomic changes that occur in the crop Brassica rapa during initial perception of drought, we applied a co-expression network approach to associate rhythmic gene expression changes with physiological responses. Coupled analysis of transcriptome and physiological parameters over a two-day time course in control and drought-stressed plants provided temporal resolution necessary for correlation of network modules with dynamic changes in stomatal conductance, photosynthetic rate, and photosystem II efficiency. This approach enabled the identification of drought-responsive genes based on their differential rhythmic expression profiles in well-watered versus droughted networks and provided new insights into the dynamic physiological changes that occur during drought. PMID:28826479
Diverse types of genetic variation converge on functional gene networks involved in schizophrenia.
Gilman, Sarah R; Chang, Jonathan; Xu, Bin; Bawa, Tejdeep S; Gogos, Joseph A; Karayiorgou, Maria; Vitkup, Dennis
2012-12-01
Despite the successful identification of several relevant genomic loci, the underlying molecular mechanisms of schizophrenia remain largely unclear. We developed a computational approach (NETBAG+) that allows an integrated analysis of diverse disease-related genetic data using a unified statistical framework. The application of this approach to schizophrenia-associated genetic variations, obtained using unbiased whole-genome methods, allowed us to identify several cohesive gene networks related to axon guidance, neuronal cell mobility, synaptic function and chromosomal remodeling. The genes forming the networks are highly expressed in the brain, with higher brain expression during prenatal development. The identified networks are functionally related to genes previously implicated in schizophrenia, autism and intellectual disability. A comparative analysis of copy number variants associated with autism and schizophrenia suggests that although the molecular networks implicated in these distinct disorders may be related, the mutations associated with each disease are likely to lead, at least on average, to different functional consequences.
Stability and structural properties of gene regulation networks with coregulation rules.
Warrell, Jonathan; Mhlanga, Musa
2017-05-07
Coregulation of the expression of groups of genes has been extensively demonstrated empirically in bacterial and eukaryotic systems. Such coregulation can arise through the use of shared regulatory motifs, which allow the coordinated expression of modules (and module groups) of functionally related genes across the genome. Coregulation can also arise through the physical association of multi-gene complexes through chromosomal looping, which are then transcribed together. We present a general formalism for modeling coregulation rules in the framework of Random Boolean Networks (RBN), and develop specific models for transcription factor networks with modular structure (including module groups, and multi-input modules (MIM) with autoregulation) and multi-gene complexes (including hierarchical differentiation between multi-gene complex members). We develop a mean-field approach to analyse the dynamical stability of large networks incorporating coregulation, and show that autoregulated MIM and hierarchical gene-complex models can achieve greater stability than networks without coregulation whose rules have matching activation frequency. We provide further analysis of the stability of small networks of both kinds through simulations. We also characterize several general properties of the transients and attractors in the hierarchical coregulation model, and show using simulations that the steady-state distribution factorizes hierarchically as a Bayesian network in a Markov Jump Process analogue of the RBN model. Copyright © 2017. Published by Elsevier Ltd.
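For orientation, a classical Random Boolean Network, the baseline that this abstract extends, can be simulated in a few lines. The sketch below assumes synchronous updates, K randomly chosen inputs per node and unbiased random truth tables; it does not implement the coregulation rules, module structure or mean-field analysis of the paper, and all function names are hypothetical.

```python
import random

def make_rbn(n, k, seed=0):
    """Random wiring and random Boolean truth tables for n nodes with k inputs each."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronous update: every node reads its inputs from the current state."""
    new = []
    for ins, table in zip(inputs, tables):
        idx = 0
        for j in ins:
            idx = (idx << 1) | state[j]  # pack input bits into a table index
        new.append(table[idx])
    return tuple(new)

def find_attractor(state, inputs, tables):
    """Return (transient length, attractor cycle length) from a start state."""
    seen = {}
    t = 0
    while state not in seen:  # finite state space, so a repeat must occur
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return seen[state], t - seen[state]

inputs, tables = make_rbn(n=8, k=2, seed=1)
start = (1, 0, 1, 0, 1, 0, 1, 0)
transient, cycle = find_attractor(start, inputs, tables)
print(transient, cycle)
```

Shorter transients and shorter cycles across many random start states are one informal indicator of the ordered (stable) regime discussed in the abstract.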
Meyer, Miriah; Wunderlich, Zeba; Simirenko, Lisa; Luengo Hendriks, Cris L.; Keränen, Soile V. E.; Henriquez, Clara; Knowles, David W.; Biggin, Mark D.; Eisen, Michael B.; DePace, Angela H.
2011-01-01
Differences in the level, timing, or location of gene expression can contribute to alternative phenotypes at the molecular and organismal level. Understanding the origins of expression differences is complicated by the fact that organismal morphology and gene regulatory networks could potentially vary even between closely related species. To assess the scope of such changes, we used high-resolution imaging methods to measure mRNA expression in blastoderm embryos of Drosophila yakuba and Drosophila pseudoobscura and assembled these data into cellular resolution atlases, where expression levels for 13 genes in the segmentation network are averaged into species-specific, cellular resolution morphological frameworks. We demonstrate that the blastoderm embryos of these species differ in their morphology in terms of size, shape, and number of nuclei. We present an approach to compare cellular gene expression patterns between species, while accounting for varying embryo morphology, and apply it to our data and an equivalent dataset for Drosophila melanogaster. Our analysis reveals that all individual genes differ quantitatively in their spatio-temporal expression patterns between these species, primarily in terms of their relative position and dynamics. Despite many small quantitative differences, cellular gene expression profiles for the whole set of genes examined are largely similar. This suggests that cell types at this stage of development are conserved, though they can differ in their relative position by up to 3–4 cell widths and in their relative proportion between species by as much as 5-fold. Quantitative differences in the dynamics and relative level of a subset of genes between corresponding cell types may reflect altered regulatory functions between species. 
Our results emphasize that transcriptional networks can diverge over short evolutionary timescales and that even small changes can lead to distinct output in terms of the placement and number of equivalent cells. PMID:22046143
A mixed incoherent feed-forward loop contributes to the regulation of bacterial photosynthesis genes
Mank, Nils N.; Berghoff, Bork A.; Klug, Gabriele
2013-01-01
Living cells use a variety of regulatory network motifs for accurate gene expression in response to changes in their environment or during differentiation processes. In Rhodobacter sphaeroides, a complex regulatory network controls expression of photosynthesis genes to guarantee optimal energy supply on one hand and to avoid photooxidative stress on the other hand. Recently, we identified a mixed incoherent feed-forward loop comprising the transcription factor PrrA, the sRNA PcrZ and photosynthesis target genes as part of this regulatory network. This point-of-view provides a comparison to other described feed-forward loops and discusses the physiological relevance of PcrZ in more detail. PMID:23392242
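The behaviour of an incoherent feed-forward loop can be illustrated with a generic textbook model: an input X activates both a repressor Y and a target Z, while Y represses Z. The sketch below is an assumption throughout. It uses Hill-type repression and forward Euler integration with arbitrary parameters, whereas in the system described here the repressor is an sRNA (PcrZ) acting on photosynthesis gene expression, so this is not the authors' model.

```python
def simulate_iffl(k_rep=0.3, hill=2, dt=0.01, t_end=10.0):
    """Euler simulation of a type-1 incoherent feed-forward loop.

    X is switched on at t = 0 and held at 1. Y (repressor) and Z (target)
    start at 0, and both decay at unit rate. All parameters are hypothetical.
    """
    x, y, z = 1.0, 0.0, 0.0
    z_trace = []
    for _ in range(int(round(t_end / dt))):
        dy = x - y                                  # X activates Y
        dz = x / (1.0 + (y / k_rep) ** hill) - z    # X activates Z, Y represses Z
        y += dt * dy
        z += dt * dz
        z_trace.append(z)
    return z_trace

z_trace = simulate_iffl()
peak, final = max(z_trace), z_trace[-1]
print(peak, final)
```

Because the repressor accumulates with a delay, the target overshoots before settling at a lower level, which is the pulse-like response typically associated with this motif.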
Identifying gene networks underlying the neurobiology of ethanol and alcoholism.
Wolen, Aaron R; Miles, Michael F
2012-01-01
For complex disorders such as alcoholism, identifying the genes linked to these diseases and their specific roles is difficult. Traditional genetic approaches, such as genetic association studies (including genome-wide association studies) and analyses of quantitative trait loci (QTLs) in both humans and laboratory animals already have helped identify some candidate genes. However, because of technical obstacles, such as the small impact of any individual gene, these approaches only have limited effectiveness in identifying specific genes that contribute to complex diseases. The emerging field of systems biology, which allows for analyses of entire gene networks, may help researchers better elucidate the genetic basis of alcoholism, both in humans and in animal models. Such networks can be identified using approaches such as high-throughput molecular profiling (e.g., through microarray-based gene expression analyses) or strategies referred to as genetical genomics, such as the mapping of expression QTLs (eQTLs). Characterization of gene networks can shed light on the biological pathways underlying complex traits and provide the functional context for identifying those genes that contribute to disease development.
The WRKY transcription factor family and senescence in switchgrass.
Rinerson, Charles I; Scully, Erin D; Palmer, Nathan A; Donze-Reiner, Teresa; Rabara, Roel C; Tripathi, Prateek; Shen, Qingxi J; Sattler, Scott E; Rohila, Jai S; Sarath, Gautam; Rushton, Paul J
2015-11-09
Early aerial senescence in switchgrass (Panicum virgatum) can significantly limit biomass yields. WRKY transcription factors that can regulate senescence could be used to reprogram senescence and enhance biomass yields. All potential WRKY genes present in version 1.0 of the switchgrass genome were identified and curated using manual and bioinformatic methods. Expression profiles of WRKY genes in switchgrass flag leaf RNA-Seq datasets were analyzed using clustering and network analysis tools to identify both WRKY and WRKY-associated gene co-expression networks during leaf development and senescence onset. We identified 240 switchgrass WRKY genes including members of the RW5 and RW6 families of resistance proteins. Weighted gene co-expression network analysis of the flag leaf transcriptomes across development readily separated clusters of co-expressed genes into thirteen modules. A visualization highlighted separation of modules associated with the early and senescence-onset phases of flag leaf growth. The senescence-associated module contained 3000 genes including 23 WRKYs. Putative promoter regions of senescence-associated WRKY genes contained several cis-element-like sequences suggestive of responsiveness to both senescence and stress signaling pathways. A phylogenetic comparison of senescence-associated WRKY genes from switchgrass flag leaf with senescence-associated WRKY genes from other plants revealed notable hotspots in Group I, IIb, and IIe of the phylogenetic tree. We have identified and named 240 WRKY genes in the switchgrass genome. Twenty-three of these genes show elevated mRNA levels during the onset of flag leaf senescence. Eleven of the WRKY genes were found in hotspots of related senescence-associated genes from multiple species and thus represent promising targets for future switchgrass genetic improvement. Overall, individual WRKY gene expression profiles could be readily linked to developmental stages of flag leaves.
Diao, Hongyu; Li, Xinxing; Hu, Sheng; Liu, Yunhui
2012-01-01
Parkinson disease (PD) progresses relentlessly and affects approximately 4% of the population aged over 80 years. It is difficult to diagnose in its early stages. The purpose of our study is to identify molecular biomarkers for PD initiation using a computational bioinformatics analysis of gene expression. We downloaded the gene expression profile of PD from Gene Expression Omnibus and identified differentially coexpressed genes (DCGs) and dysfunctional pathways in PD patients compared to controls. In addition, we built a regulatory network by mapping the DCGs to known regulatory data between transcription factors (TFs) and target genes and calculated the regulatory impact factor of each transcription factor. As a result, a total of 1004 genes associated with PD initiation were identified. Pathway enrichment of these genes suggests that biological processes of protein turnover were impaired in PD. In the regulatory network, HLF, E2F1 and STAT4 were found to have altered expression levels in PD patients. The expression levels of other transcription factors, NKX3-1, TAL1, RFX1 and EGR3, were not found to be altered. However, they regulated differentially expressed genes. In conclusion, we suggest that HLF, E2F1 and STAT4 may be used as molecular biomarkers for PD; however, more work is needed to validate our result. PMID:23284986
Tao, Wenjing; Chen, Jinlin; Tan, Dejie; Yang, Jing; Sun, Lina; Wei, Jing; Conte, Matthew A; Kocher, Thomas D; Wang, Deshou
2018-05-15
The factors determining sex in teleosts are diverse. Great efforts have been made to characterize the underlying genetic network in various species. However, only seven master sex-determining genes have been identified in teleosts. While the function of a few genes involved in sex determination and differentiation has been studied, we are far from fully understanding how genes interact to coordinate in this process. To enable systematic insights into fish sexual differentiation, we generated a dynamic co-expression network from tilapia gonadal transcriptomes at 5, 20, 30, 40, 90, and 180 dah (days after hatching), plus 45 and 90 dat (days after treatment) and linked gene expression profiles to both development and sexual differentiation. Transcriptomic profiles of female and male gonads at 5 and 20 dah exhibited high similarities except for a small number of genes that were involved in sex determination, while drastic changes were observed from 90 to 180 dah, with a group of differently expressed genes which were involved in gonadal differentiation and gametogenesis. Weighted gene correlation network analysis identified changes in the expression of Borealin, Gtsf1, tesk1, Zar1, Cdn15, and Rpl that were correlated with the expression of genes previously known to be involved in sex differentiation, such as Foxl2, Cyp19a1a, Gsdf, Dmrt1, and Amh. Global gonadal gene expression kinetics during sex determination and differentiation have been extensively profiled in tilapia. These findings provide insights into the genetic framework underlying sex determination and sexual differentiation, and expand our current understanding of developmental pathways during teleost sex determination.
Ramayo-Caldas, Yuliaxis; Ballester, Maria; Fortes, Marina R S; Esteve-Codina, Anna; Castelló, Anna; Noguera, Jose L; Fernández, Ana I; Pérez-Enciso, Miguel; Reverter, Antonio; Folch, Josep M
2014-03-26
Fatty acids (FA) play a critical role in energy homeostasis and metabolic diseases; in the context of livestock species, their profile also impacts on meat quality for healthy human consumption. Molecular pathways controlling lipid metabolism are highly interconnected and are not fully understood. Elucidating these molecular processes will aid technological development towards improvement of pork meat quality and increased knowledge of FA metabolism, underpinning metabolic diseases in humans. The results from genome-wide association studies (GWAS) across 15 phenotypes were subjected to an Association Weight Matrix (AWM) approach to predict a network of 1,096 genes related to intramuscular FA composition in pigs. To identify the key regulators of FA metabolism, we focused on the minimal set of transcription factors (TF) that explored the majority of the network topology. Pathway and network analyses pointed towards a trio of TF as key regulators of FA metabolism: NCOA2, FHL2 and EP300. Promoter sequence analyses confirmed that these TF have binding sites for some well-known regulators of lipid and carbohydrate metabolism. For the first time in a non-model species, some of the co-associations observed at the genetic level were validated through co-expression at the transcriptomic level based on real-time PCR of 40 genes in adipose tissue, and a further 55 genes in liver. In particular, liver expression of NCOA2 and EP300 differed between pig breeds (Iberian and Landrace) extreme in terms of fat deposition. Highly clustered co-expression networks in both liver and adipose tissues were observed. EP300 and NCOA2 showed centrality parameters above average in both networks. Over all genes, co-expression analyses confirmed 28.9% of the AWM predicted gene-gene interactions in liver and 33.0% in adipose tissue. The magnitude of this validation varied across genes, with up to 60.8% of the connections of NCOA2 in adipose tissue being validated via co-expression. 
Our results recapitulate the known transcriptional regulation of FA metabolism, predict gene interactions that can be experimentally validated, and suggest that genetic variants mapped to EP300, FHL2, and NCOA2 modulate lipid metabolism and control energy homeostasis in pigs.
A proof of the DBRF-MEGN method, an algorithm for deducing minimum equivalent gene networks
2011-01-01
Background: We previously developed the DBRF-MEGN (difference-based regulation finding-minimum equivalent gene network) method, which deduces the most parsimonious signed directed graphs (SDGs) consistent with expression profiles of single-gene deletion mutants. However, until the present study, we have not presented the details of the method's algorithm or a proof of the algorithm. Results: We describe in detail the algorithm of the DBRF-MEGN method and prove that the algorithm deduces all of the exact solutions of the most parsimonious SDGs consistent with expression profiles of gene deletion mutants. Conclusions: The DBRF-MEGN method provides all of the exact solutions of the most parsimonious SDGs consistent with expression profiles of gene deletion mutants. PMID:21699737
Jambusaria, Ankit; Klomp, Jeff; Hong, Zhigang; Rafii, Shahin; Dai, Yang; Malik, Asrar B; Rehman, Jalees
2018-06-07
The heterogeneity of cells across tissue types represents a major challenge for studying biological mechanisms as well as for therapeutic targeting of distinct tissues. Computational prediction of tissue-specific gene regulatory networks may provide important insights into the mechanisms underlying the cellular heterogeneity of cells in distinct organs and tissues. Using three pathway analysis techniques, gene set enrichment analysis (GSEA), parametric analysis of gene set enrichment (PGSEA), alongside our novel model (HeteroPath), which assesses heterogeneously upregulated and downregulated genes within the context of pathways, we generated distinct tissue-specific gene regulatory networks. We analyzed gene expression data derived from freshly isolated heart, brain, and lung endothelial cells and populations of neurons in the hippocampus, cingulate cortex, and amygdala. In both datasets, we found that HeteroPath segregated the distinct cellular populations by identifying regulatory pathways that were not identified by GSEA or PGSEA. Using simulated datasets, HeteroPath demonstrated robustness that was comparable to what was seen using existing gene set enrichment methods. Furthermore, we generated tissue-specific gene regulatory networks involved in vascular heterogeneity and neuronal heterogeneity by performing motif enrichment of the heterogeneous genes identified by HeteroPath and linking the enriched motifs to regulatory transcription factors in the ENCODE database. HeteroPath assesses contextual bidirectional gene expression within pathways and thus allows for transcriptomic assessment of cellular heterogeneity. Unraveling tissue-specific heterogeneity of gene expression can lead to a better understanding of the molecular underpinnings of tissue-specific phenotypes.
Krienen, Fenna M.; Yeo, B. T. Thomas; Ge, Tian; Buckner, Randy L.; Sherwood, Chet C.
2016-01-01
The human brain is patterned with disproportionately large, distributed cerebral networks that connect multiple association zones in the frontal, temporal, and parietal lobes. The expansion of the cortical surface, along with the emergence of long-range connectivity networks, may be reflected in changes to the underlying molecular architecture. Using the Allen Institute’s human brain transcriptional atlas, we demonstrate that genes particularly enriched in supragranular layers of the human cerebral cortex relative to mouse distinguish major cortical classes. The topography of transcriptional expression reflects large-scale brain network organization consistent with estimates from functional connectivity MRI and anatomical tracing in nonhuman primates. Microarray expression data for genes preferentially expressed in human upper layers (II/III), but enriched only in lower layers (V/VI) of mouse, were cross-correlated to identify molecular profiles across the cerebral cortex of postmortem human brains (n = 6). Unimodal sensory and motor zones have similar molecular profiles, despite being distributed across the cortical mantle. Sensory/motor profiles were anticorrelated with paralimbic and certain distributed association network profiles. Tests of alternative gene sets did not consistently distinguish sensory and motor regions from paralimbic and association regions: (i) genes enriched in supragranular layers in both humans and mice, (ii) genes cortically enriched in humans relative to nonhuman primates, (iii) genes related to connectivity in rodents, (iv) genes associated with human and mouse connectivity, and (v) 1,454 gene sets curated from known gene ontologies. Molecular innovations of upper cortical layers may be an important component in the evolution of long-range corticocortical projections. PMID:26739559
Krienen, Fenna M; Yeo, B T Thomas; Ge, Tian; Buckner, Randy L; Sherwood, Chet C
2016-01-26
The human brain is patterned with disproportionately large, distributed cerebral networks that connect multiple association zones in the frontal, temporal, and parietal lobes. The expansion of the cortical surface, along with the emergence of long-range connectivity networks, may be reflected in changes to the underlying molecular architecture. Using the Allen Institute's human brain transcriptional atlas, we demonstrate that genes particularly enriched in supragranular layers of the human cerebral cortex relative to mouse distinguish major cortical classes. The topography of transcriptional expression reflects large-scale brain network organization consistent with estimates from functional connectivity MRI and anatomical tracing in nonhuman primates. Microarray expression data for genes preferentially expressed in human upper layers (II/III), but enriched only in lower layers (V/VI) of mouse, were cross-correlated to identify molecular profiles across the cerebral cortex of postmortem human brains (n = 6). Unimodal sensory and motor zones have similar molecular profiles, despite being distributed across the cortical mantle. Sensory/motor profiles were anticorrelated with paralimbic and certain distributed association network profiles. Tests of alternative gene sets did not consistently distinguish sensory and motor regions from paralimbic and association regions: (i) genes enriched in supragranular layers in both humans and mice, (ii) genes cortically enriched in humans relative to nonhuman primates, (iii) genes related to connectivity in rodents, (iv) genes associated with human and mouse connectivity, and (v) 1,454 gene sets curated from known gene ontologies. Molecular innovations of upper cortical layers may be an important component in the evolution of long-range corticocortical projections.
Integrated systems analysis reveals a molecular network underlying autism spectrum disorders
Li, Jingjing; Shi, Minyi; Ma, Zhihai; Zhao, Shuchun; Euskirchen, Ghia; Ziskin, Jennifer; Urban, Alexander; Hallmayer, Joachim; Snyder, Michael
2014-01-01
Autism is a complex disease whose etiology remains elusive. We integrated previously and newly generated data and developed a systems framework involving the interactome, gene expression and genome sequencing to identify a protein interaction module with members strongly enriched for autism candidate genes. Sequencing of 25 patients confirmed the involvement of this module in autism, which was subsequently validated using an independent cohort of over 500 patients. Expression of this module was dichotomized with a ubiquitously expressed subcomponent and another subcomponent preferentially expressed in the corpus callosum, which was significantly affected by our identified mutations in the network center. RNA-sequencing of the corpus callosum from patients with autism exhibited extensive gene mis-expression in this module, and our immunochemical analysis showed that the human corpus callosum is predominantly populated by oligodendrocyte cells. Analysis of functional genomic data further revealed a significant involvement of this module in the development of oligodendrocyte cells in mouse brain. Our analysis delineates a natural network involved in autism, helps uncover novel candidate genes for this disease and improves our understanding of its molecular pathology. PMID:25549968
Ghosh Dasgupta, Modhumita; Dharanishanthi, Veeramuthu
2017-09-05
Ecophysiological studies in Eucalyptus have shown that water is the principal factor limiting stem growth. Effect of water deficit conditions on physiological and biochemical parameters has been extensively reported in Eucalyptus. The present study was conducted to identify major polyethylene glycol induced water stress responsive transcripts in Eucalyptus grandis using gene co-expression network. A customized array representing 3359 water stress responsive genes was designed to document their expression in leaves of E. grandis cuttings subjected to -0.225MPa of PEG treatment. The differentially expressed transcripts were documented and significantly co-expressed transcripts were used for construction of network. The co-expression network was constructed with 915 nodes and 3454 edges with degree ranging from 2 to 45. Ninety four GO categories and 117 functional pathways were identified in the network. MCODE analysis generated 27 modules and module 6 with 479 nodes and 1005 edges was identified as the biologically relevant network. The major water responsive transcripts represented in the module included dehydrin, osmotin, LEA protein, expansin, arabinogalactans, heat shock proteins, major facilitator proteins, ARM repeat proteins, raffinose synthase, tonoplast intrinsic protein and transcription factors like DREB2A, ARF9, AGL24, UNE12, WLIM1 and MYB66, MYB70, MYB 55, MYB 16 and MYB 103. The coordinated analysis of gene expression patterns and coexpression networks developed in this study identified an array of transcripts that may regulate PEG induced water stress responses in E. grandis. Copyright © 2017 Elsevier B.V. All rights reserved.
Genome-Wide Responses of Female Fruit Flies Subjected to Divergent Mating Regimes
Gerrard, Dave T.; Fricke, Claudia; Edward, Dominic A.; Edwards, Dylan R.; Chapman, Tracey
2013-01-01
Elevated rates of mating and reproduction cause decreased female survival and lifetime reproductive success across a wide range of taxa from flies to humans. These costs are fundamentally important to the evolution of life histories. Here we investigate the potential mechanistic basis of this classic life history component. We conducted 4 independent replicated experiments in which female Drosophila melanogaster were subjected to ‘high’ and ‘low’ mating regimes, resulting in highly significant differences in lifespan. We sampled females for transcriptomic analysis at day 10 of life, before the visible onset of ageing, and used Tiling expression arrays to detect differential gene expression in two body parts (abdomen versus head+thorax). The divergent mating regimes were associated with significant differential expression in a network of genes showing evidence for interactions with ecdysone receptor. Preliminary experimental manipulation of two genes in that network with roles in post-transcriptional modification (CG11486, eyegone) tended to enhance sensitivity to mating costs. However, the subtle nature of those effects suggests substantial functional redundancy or parallelism in this gene network, which could buffer females against excessive responses. There was also evidence for differential expression in genes involved in germline maintenance, cell proliferation and in gustation / odorant reception. Interestingly, we detected differential expression in three specific genes (EcR, keap1, lbk1) and one class of genes (gustation / odorant receptors) with previously reported roles in determining lifespan. Our results suggest that high and low mating regimes that lead to divergence in lifespan are associated with changes in the expression of genes such as reproductive hormones, that influence resource allocation to the germ line, and that may modify post-translational gene expression. 
This predicts that the correct signalling of nutrient levels to the reproductive system is important for maintaining organismal integrity. PMID:23826372
Fu, Shijie; Pan, Xufeng; Fang, Wentao
2014-08-01
Lung cancer severely reduces the quality of life worldwide and causes high socioeconomic burdens. However, key genes leading to the generation of pulmonary adenocarcinoma remain elusive despite intensive research efforts. The present study aimed to identify the potential associations between transcription factors (TFs) and differentially co‑expressed genes (DCGs) in the regulation of transcription in pulmonary adenocarcinoma. Gene expression profiles of pulmonary adenocarcinoma were downloaded from the Gene Expression Omnibus, and gene expression was analyzed using a computational method. A total of 37,094 differentially co‑expressed links (DCLs) and 251 DCGs were identified, which were significantly enriched in 10 pathways. The construction of the regulatory network and the analysis of the regulatory impact factors revealed eight crucial TFs in the regulatory network. These TFs regulated the expression of DCGs by promoting or inhibiting their expression. In addition, certain TFs and target genes associated with DCGs did not appear in the DCLs, which indicated that those TFs could be synergistic with other factors. This is likely to provide novel insights for research into pulmonary adenocarcinoma. In conclusion, the present study may enhance the understanding of disease mechanisms and lead to an improved diagnosis of lung cancer. However, further studies are required to confirm these observations.
Bao, Weier; Greenwold, Matthew J; Sawyer, Roger H
2017-11-01
Gene co-expression network analysis has been a research method widely used in systematically exploring gene function and interaction. Using the Weighted Gene Co-expression Network Analysis (WGCNA) approach to construct a gene co-expression network using data from a customized 44K microarray transcriptome of chicken epidermal embryogenesis, we have identified two distinct modules that are highly correlated with scale or feather development traits. Signaling pathways related to feather development were enriched in the traditional KEGG pathway analysis and functional terms relating specifically to embryonic epidermal development were also enriched in the Gene Ontology analysis. Significant enrichment annotations were discovered from customized enrichment tools such as Modular Single-Set Enrichment Test (MSET) and Medical Subject Headings (MeSH). Hub genes in both trait-correlated modules showed strong specific functional enrichment toward epidermal development. Also, regulatory elements, such as transcription factors and miRNAs, were targeted in the significant enrichment result. This work highlights the advantage of this methodology for functional prediction of genes not previously associated with scale- and feather trait-related modules.
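As a rough illustration of the network-construction step behind a WGCNA-style analysis, the sketch below computes an unsigned soft-thresholded adjacency, a_ij = |cor(i, j)|^beta, for a few genes. The gene names, the expression values, and the choice beta = 6 are hypothetical and are not taken from the chicken epidermis study; the functions are written from scratch here and are not the WGCNA package API.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def unsigned_adjacency(profiles, beta=6):
    """Soft-thresholded adjacency |cor| ** beta for every unordered gene pair."""
    genes = sorted(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }

# Hypothetical expression values across five samples (illustration only).
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with geneA
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],   # only weakly related to geneA
}
adjacency = unsigned_adjacency(toy_profiles)
```

Raising the absolute correlation to a power keeps strong correlations close to 1 while pushing weak ones toward 0, which is why the weakly related pair above ends up with a near-zero edge weight without any hard cutoff being applied.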
A Poisson Log-Normal Model for Constructing Gene Covariation Network Using RNA-seq Data.
Choi, Yoonha; Coram, Marc; Peng, Jie; Tang, Hua
2017-07-01
Constructing expression networks using transcriptomic data is an effective approach for studying gene regulation. A popular approach for constructing such a network is based on the Gaussian graphical model (GGM), in which an edge between a pair of genes indicates that the expression levels of these two genes are conditionally dependent, given the expression levels of all other genes. However, GGMs are not appropriate for non-Gaussian data, such as those generated in RNA-seq experiments. We propose a novel statistical framework that maximizes a penalized likelihood, in which the observed count data follow a Poisson log-normal distribution. To overcome the computational challenges, we use Laplace's method to approximate the likelihood and its gradients, and apply the alternating directions method of multipliers to find the penalized maximum likelihood estimates. The proposed method is evaluated and compared with GGMs using both simulated and real RNA-seq data. The proposed method shows improved performance in detecting edges that represent covarying pairs of genes, particularly for edges connecting low-abundant genes and edges around regulatory hubs.
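The Gaussian graphical model mentioned above can be made concrete with a small sketch: an edge corresponds to a non-zero off-diagonal entry of the precision (inverse covariance) matrix, and the partial correlation between genes i and j given all other genes is -w_ij / sqrt(w_ii * w_jj). The three-gene precision matrix below is hypothetical, and this only illustrates how edges are read off a GGM; it is not the Poisson log-normal estimator proposed in the paper.

```python
import math

# Hypothetical precision matrix for genes A, B, C arranged as a chain A - B - C.
genes = ["A", "B", "C"]
precision = [
    [1.0, -0.5, 0.0],
    [-0.5, 1.25, -0.5],
    [0.0, -0.5, 1.0],
]

def partial_correlation(omega, i, j):
    """Partial correlation of variables i and j given all remaining variables."""
    return -omega[i][j] / math.sqrt(omega[i][i] * omega[j][j])

# An edge is drawn wherever the partial correlation is non-zero.
edges = [
    (genes[i], genes[j])
    for i in range(len(genes))
    for j in range(i + 1, len(genes))
    if abs(partial_correlation(precision, i, j)) > 1e-12
]
```

In this toy case A and C are marginally correlated through B, yet no A-C edge appears because their partial correlation is zero, which is the conditional-dependence reading of an edge described in the abstract.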
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harwood, Caroline S
The goal of this project is to identify gene networks that are critical for efficient biohydrogen production by leveraging variation in gene content and gene expression in independently isolated Rhodopseudomonas palustris strains. Coexpression methods were applied to large data sets that we have collected to define probabilistic causal gene networks. To our knowledge, this is a first systems-level approach that takes advantage of strain-to-strain variability to computationally define networks critical for a particular bacterial phenotypic trait.
Zhang, Fang; Xu, Xiang; Zhou, Ben; He, Zhishui; Zhai, Qiwei
2011-01-01
Food availability regulates basal metabolism and progression of many diseases, and liver plays an important role in these processes. The effects of food availability on the digital gene expression profile and on physiological and pathological functions in liver are yet to be further elucidated. In this study, we applied high-throughput sequencing technology to detect the digital gene expression profile of mouse liver in fed, fasted and refed states. In total, 12,162 genes were detected, and 2,305 genes were significantly regulated by food availability. Biological process and pathway analysis showed that fasting mainly affected lipid and carboxylic acid metabolic processes in liver. Moreover, the genes regulated by fasting and refeeding in liver were mainly enriched in lipid metabolic process or fatty acid metabolism. Network analysis demonstrated that fasting mainly regulated Drug Metabolism, Small Molecule Biochemistry and Endocrine System Development and Function, and the networks including Lipid Metabolism, Small Molecule Biochemistry and Gene Expression were affected by refeeding. In addition, FunDo analysis showed that liver cancer and diabetes mellitus were most likely to be affected by food availability. This study provides the digital gene expression profile of mouse liver regulated by food availability, and demonstrates the main biological processes, pathways, gene networks and potential hepatic diseases regulated by fasting and refeeding. These results show that food availability mainly regulates hepatic lipid metabolism and is highly correlated with liver-related diseases including liver cancer and diabetes. PMID:22096593
Zhang, Fang; Xu, Xiang; Zhou, Ben; He, Zhishui; Zhai, Qiwei
2011-01-01
Food availability regulates basal metabolism and progression of many diseases, and liver plays an important role in these processes. The effects of food availability on the digital gene expression profile and on physiological and pathological functions in liver are yet to be further elucidated. In this study, we applied high-throughput sequencing technology to detect the digital gene expression profile of mouse liver in fed, fasted and refed states. In total, 12,162 genes were detected, and 2,305 genes were significantly regulated by food availability. Biological process and pathway analysis showed that fasting mainly affected lipid and carboxylic acid metabolic processes in liver. Moreover, the genes regulated by fasting and refeeding in liver were mainly enriched in lipid metabolic process or fatty acid metabolism. Network analysis demonstrated that fasting mainly regulated Drug Metabolism, Small Molecule Biochemistry and Endocrine System Development and Function, and the networks including Lipid Metabolism, Small Molecule Biochemistry and Gene Expression were affected by refeeding. In addition, FunDo analysis showed that liver cancer and diabetes mellitus were most likely to be affected by food availability. This study provides the digital gene expression profile of mouse liver regulated by food availability, and demonstrates the main biological processes, pathways, gene networks and potential hepatic diseases regulated by fasting and refeeding. These results show that food availability mainly regulates hepatic lipid metabolism and is highly correlated with liver-related diseases including liver cancer and diabetes.
Integrative analyses of leprosy susceptibility genes indicate a common autoimmune profile.
Zhang, Deng-Feng; Wang, Dong; Li, Yu-Ye; Yao, Yong-Gang
2016-04-01
Leprosy is an ancient chronic infection of the skin and peripheral nerves caused by Mycobacterium leprae. The development of leprosy depends on genetic background and the immune status of the host. However, there is no systematic view focusing on the biological pathways, interaction networks and overall expression pattern of leprosy-related immune and genetic factors. We aimed to identify the hub genes at the center of the leprosy genetic network and to provide insight into immune and genetic factors contributing to leprosy. We retrieved all reported leprosy-related genes and performed integrative analyses covering gene expression profiling, pathway analysis, protein-protein interaction network, and evolutionary analyses. A list of 123 differentially expressed leprosy-related genes, which were enriched in activation and regulation of immune response, was obtained in our analyses. Cross-disorder analysis showed that the list of leprosy susceptibility genes was largely shared by typical autoimmune diseases such as lupus erythematosus and arthritis, suggesting that similar pathways might be affected in leprosy and autoimmune diseases. Protein-protein interaction (PPI) and positive selection analyses revealed a co-evolution network of leprosy risk genes. Our analyses showed that leprosy-associated genes constituted a co-evolution network and might undergo positive selection driven by M. leprae. We suggest that leprosy may be a kind of autoimmune disease and that the development of leprosy is a matter of defect or over-activation of body immunity. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Nigam, Deepti; Sawant, Samir V
2013-01-01
Technological development has led to an increased interest in systems biological approaches in plants to characterize developmental mechanisms and candidate genes relevant to specific tissue or cell morphology. AUX-IAA proteins are important plant-specific putative transcription factors. There are several reports on the physiological response of this family in Arabidopsis, but in cotton fiber the transcriptional network through which AUX-IAA regulates its target genes is still unknown. In silico modelling of cotton fiber development-specific gene expression data (108 microarrays and 22,737 genes) using the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) reveals 3690 putative AUX-IAA target genes, of which 139 genes were known to be AUX-IAA co-regulated within Arabidopsis. Further, the AUX-IAA targeted gene regulatory network (GRN) had substantial impact on the transcriptional dynamics of cotton fiber, as shown by altered TF networks, Gene Ontology (GO) biological processes and metabolic pathways associated with its target genes. Analysis of the AUX-IAA-correlated gene network reveals multiple functions for AUX-IAA target genes such as unidimensional cell growth, cellular nitrogen compound metabolic process, nucleosome organization, DNA-protein complex and processes related to cell wall. These candidate networks/pathways have a variety of profound impacts on such cellular functions as stress response, cell proliferation, and cell differentiation. While these functions are fairly broad, their underlying TF networks may provide a global view of AUX-IAA regulated gene expression and a GRN that guides future studies in understanding the role of AUX-IAA box protein and its targets in regulating fiber development. PMID:24497725
Statistical mechanics of scale-free gene expression networks
NASA Astrophysics Data System (ADS)
Gross, Eitan
2012-12-01
The gene co-expression networks of many organisms including bacteria, mice and man exhibit scale-free distribution. This heterogeneous distribution of connections decreases the vulnerability of the network to random attacks and thus may confer the genetic replication machinery an intrinsic resilience to such attacks, triggered by changing environmental conditions that the organism may be subject to during evolution. This resilience to random attacks comes at an energetic cost, however, reflected by the lower entropy of the scale-free distribution compared to the more homogenous, random network. In this study we found that the cell cycle-regulated gene expression pattern of the yeast Saccharomyces cerevisiae obeys a power-law distribution with an exponent α = 2.1 and an entropy of 1.58. The latter is very close to the maximal value of 1.65 obtained from linear optimization of the entropy function under the constraint of a constant cost function, determined by the average degree connectivity
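To make the entropy argument tangible, the sketch below computes the Shannon entropy of a truncated discrete power-law degree distribution, p(k) proportional to k^(-alpha), and compares it with the uniform distribution over the same degrees, which is the maximum-entropy case for that support. The cutoff k_max = 100 and the use of natural logarithms are assumptions made here for illustration, so the result is not expected to reproduce the 1.58 reported in the abstract, and the uniform baseline is a simple upper bound rather than the random network the authors compare against.

```python
import math

def power_law_pmf(alpha, k_max):
    """Normalised probabilities p(k) ~ k**(-alpha) for degrees k = 1..k_max."""
    weights = [k ** -alpha for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def shannon_entropy(pmf):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in pmf if p > 0)

alpha = 2.1   # exponent quoted in the abstract
k_max = 100   # assumed degree cutoff (hypothetical)

pmf = power_law_pmf(alpha, k_max)
h_power_law = shannon_entropy(pmf)
h_uniform = math.log(k_max)  # entropy of a uniform distribution on 1..k_max
```

Because most of the probability mass of a power law sits on low degrees, its entropy falls well below the uniform bound, which is the qualitative point about heterogeneous degree distributions having lower entropy.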
Primiani, Christopher T.; Ryan, Veronica H.; Rao, Jagadeesh S.; Cam, Margaret C.; Ahn, Kwangmi; Modi, Hiren R.; Rapoport, Stanley I.
2014-01-01
Background: Age changes in expression of inflammatory, synaptic, and neurotrophic genes are not well characterized during human brain development and senescence. Knowing these changes may elucidate structural, metabolic, and functional brain processes over the lifespan, as well as vulnerability to neurodevelopmental or neurodegenerative diseases. Hypothesis: Expression levels of inflammatory, synaptic, and neurotrophic genes in the human brain are coordinated over the lifespan and underlie changes in phenotypic networks or cascades. Methods: We used a large-scale microarray dataset from human prefrontal cortex, BrainCloud, to quantify age changes over the lifespan, divided into Development (0 to 21 years, 87 brains) and Aging (22 to 78 years, 144 brains) intervals, in transcription levels of 39 genes. Results: Gene expression levels followed different trajectories over the lifespan. Many changes were intercorrelated within three similar groups or clusters of genes during both Development and Aging, despite different roles of the gene products in the two intervals. During Development, changes were related to reported neuronal loss, dendritic growth and pruning, and microglial events; TLR4, IL1R1, NFKB1, MOBP, PLA2G4A, and PTGS2 expression increased in the first years of life, while expression of synaptic genes GAP43 and DBN1 decreased, before reaching plateaus. During Aging, expression was upregulated for potentially pro-inflammatory genes such as NFKB1, TRAF6, TLR4, IL1R1, TSPO, and GFAP, but downregulated for neurotrophic and synaptic integrity genes such as BDNF, NGF, PDGFA, SYN, and DBN1. Conclusions: Coordinated changes in gene transcription cascades underlie changes in synaptic, neurotrophic, and inflammatory phenotypic networks during brain Development and Aging. Early postnatal expression changes relate to neuronal, glial, and myelin growth and synaptic pruning events, while late Aging is associated with pro-inflammatory and synaptic loss changes. 
Thus, comparable transcriptional regulatory networks that operate throughout the lifespan underlie different phenotypic processes during Aging compared to Development. PMID:25329999
Primiani, Christopher T; Ryan, Veronica H; Rao, Jagadeesh S; Cam, Margaret C; Ahn, Kwangmi; Modi, Hiren R; Rapoport, Stanley I
2014-01-01
Age changes in expression of inflammatory, synaptic, and neurotrophic genes are not well characterized during human brain development and senescence. Knowing these changes may elucidate structural, metabolic, and functional brain processes over the lifespan, as well as vulnerability to neurodevelopmental or neurodegenerative diseases. Expression levels of inflammatory, synaptic, and neurotrophic genes in the human brain are coordinated over the lifespan and underlie changes in phenotypic networks or cascades. We used a large-scale microarray dataset from human prefrontal cortex, BrainCloud, to quantify age changes over the lifespan, divided into Development (0 to 21 years, 87 brains) and Aging (22 to 78 years, 144 brains) intervals, in transcription levels of 39 genes. Gene expression levels followed different trajectories over the lifespan. Many changes were intercorrelated within three similar groups or clusters of genes during both Development and Aging, despite different roles of the gene products in the two intervals. During Development, changes were related to reported neuronal loss, dendritic growth and pruning, and microglial events; TLR4, IL1R1, NFKB1, MOBP, PLA2G4A, and PTGS2 expression increased in the first years of life, while expression of synaptic genes GAP43 and DBN1 decreased, before reaching plateaus. During Aging, expression was upregulated for potentially pro-inflammatory genes such as NFKB1, TRAF6, TLR4, IL1R1, TSPO, and GFAP, but downregulated for neurotrophic and synaptic integrity genes such as BDNF, NGF, PDGFA, SYN, and DBN1. Coordinated changes in gene transcription cascades underlie changes in synaptic, neurotrophic, and inflammatory phenotypic networks during brain Development and Aging. Early postnatal expression changes relate to neuronal, glial, and myelin growth and synaptic pruning events, while late Aging is associated with pro-inflammatory and synaptic loss changes. 
Thus, comparable transcriptional regulatory networks that operate throughout the lifespan underlie different phenotypic processes during Aging compared to Development.
Yao, Ting; Wang, Qinfu; Zhang, Wenyong; Bian, Aihong; Zhang, Jinping
2016-07-01
Renal cell carcinoma (RCC) is the most common type of kidney cancer in adults and accounts for ~80% of all kidney cancer cases. However, the pathogenesis of RCC has not yet been fully elucidated. To interpret the pathogenesis of RCC at the molecular level, gene expression data and bioinformatics methods were used to identify RCC-associated genes. Gene expression data were downloaded from the Gene Expression Omnibus (GEO) database, and differentially coexpressed genes (DCGs) and dysfunctional pathways were identified in RCC patients compared with controls. In addition, a regulatory network was constructed using the known regulatory data between transcription factors (TFs) and target genes in the University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu), and the regulatory impact factor of each TF was calculated. A total of 2,580,427 pairs of DCGs were identified. The regulatory network contained 1,525 pairs of regulatory associations between 126 TFs and 1,259 target genes, and these genes were mainly enriched in cancer pathways, ErbB and MAPK. In the regulatory network, the 10 most strongly associated TFs were FOXC1, GATA3, ESR1, FOXL1, PATZ1, MYB, STAT5A, EGR2, EGR3 and PELP1. GATA3, ERG and MYB serve important roles in RCC while FOXC1, ESR1, FOXL1, PATZ1, STAT5A and PELP1 may be potential genes associated with RCC. In conclusion, the present study constructed a regulatory network and screened out several TFs that may be used as molecular biomarkers of RCC. However, future studies are needed to confirm the findings of the present study. PMID:27347102
Li, Angsheng; Yin, Xianchen; Pan, Yicheng
2016-01-01
In this study, we propose a method for constructing cell sample networks from gene expression profiles, and a structural entropy minimisation principle for detecting natural structure of networks and for identifying cancer cell subtypes. Our method establishes a three-dimensional gene map of cancer cell types and subtypes. The identified subtypes are defined by a unique gene expression pattern, and a three-dimensional gene map is established by defining the unique gene expression pattern for each identified subtype for cancers, including acute leukaemia, lymphoma, multi-tissue, lung cancer and healthy tissue. Our three-dimensional gene map demonstrates that a true tumour type may be divided into subtypes, each defined by a unique gene expression pattern. Clinical data analyses demonstrate that most cell samples of an identified subtype share similar survival times, survival indicators and International Prognostic Index (IPI) scores and indicate that distinct subtypes identified by our algorithms exhibit different overall survival times, survival ratios and IPI scores. Our three-dimensional gene map establishes a high-definition, one-to-one map between the biologically and medically meaningful tumour subtypes and the gene expression patterns, and identifies remarkable cells that form singleton submodules. PMID:26842724
Pan, Weiran; Li, Gang; Yang, Xiaoxiao; Miao, Jinming
2015-04-01
This study aims to explore the potential mechanism of glioma through bioinformatic approaches. The gene expression profile (GSE4290) of glioma tumor and non-tumor samples was downloaded from the Gene Expression Omnibus database. A total of 180 samples were available, including 23 non-tumor and 157 tumor samples. The raw data were then preprocessed using robust multiarray analysis, and 8,890 differentially expressed genes (DEGs) were identified using a t-test (false discovery rate < 0.0005). Furthermore, 16 known glioma-related genes were extracted from the Genetic Association Database. After mapping the 8,890 DEGs and 16 known glioma-related genes to the Human Protein Reference Database, a glioma-associated protein-protein interaction network (GAPN) was constructed. In addition, 51 sub-networks in GAPN were screened out through Molecular Complex Detection (score ≥ 1), and sub-network 1 was found to have the closest interaction (score = 3). Moreover, for the top 10 sub-networks, Gene Ontology (GO) enrichment analysis (p value < 0.05) was performed, and DEGs involved in sub-networks 1 and 2, such as BRMS1L and CCNA1, were predicted to regulate cell growth, cell cycle, and DNA replication via interacting with known glioma-related genes. Finally, the overlaps of DEGs with human essential, housekeeping, and tissue-specific genes were calculated (p value = 1.0, 1.0, and 0.00014, respectively) and visualized with the VennDiagram package in R. About 61% of human tissue-specific genes were DEGs as well. This research sheds new light on the pathogenesis of glioma based on DEGs and GAPN, and our findings might provide potential targets for clinical glioma treatment.
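The abstract above reports a false discovery rate cutoff of 0.0005 but does not say how the FDR was estimated. As a minimal sketch, assuming the widely used Benjamini-Hochberg adjustment (an assumption, not something the abstract confirms), selection of DEGs from per-gene t-test p-values could look like the following. The function names and the shape of the input dictionary are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(gene_p_values, fdr_cutoff=0.0005):
    """Keep genes whose adjusted p-value is below the cutoff.

    gene_p_values: dict mapping a gene name to its raw t-test p-value
    (hypothetical input format; the cutoff default mirrors the abstract).
    """
    genes = list(gene_p_values)
    adjusted = benjamini_hochberg([gene_p_values[g] for g in genes])
    return [g for g, q in zip(genes, adjusted) if q < fdr_cutoff]
```

The adjustment is computed over all tested genes at once, so the list passed in should contain every p-value from the screen, not only the small ones.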
NASA Astrophysics Data System (ADS)
Furusawa, Chikara; Kaneko, Kunihiko
2003-02-01
Using data from gene expression databases on various organisms and tissues, including yeast, nematodes, human normal and cancer tissues, and embryonic stem cells, we found that the abundances of expressed genes exhibit a power-law distribution with an exponent close to -1; i.e., they obey Zipf’s law. Furthermore, by simulations of a simple model with an intracellular reaction network, we found that Zipf’s law of chemical abundance is a universal feature of cells where such a network optimizes the efficiency and faithfulness of self-reproduction. These findings provide novel insights into the nature of the organization of reaction dynamics in living cells.
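One way to check a claim of this kind on a new dataset is to rank genes by abundance and estimate the slope of log(abundance) against log(rank); a slope near -1 is what Zipf's law predicts. The sketch below uses ordinary least squares on a synthetic series. The function name and the toy data are illustrative assumptions, and the abstract does not state which fitting procedure the authors used.

```python
import math


def rank_abundance_exponent(abundances):
    """Least-squares slope of log(abundance) against log(rank)."""
    # Rank 1 is the most abundant gene; zero counts cannot be log-transformed.
    ranked = sorted((a for a in abundances if a > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(a) for a in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic abundances that follow abundance = 1000 / rank exactly.
toy = [1000.0 / r for r in range(1, 201)]
slope = rank_abundance_exponent(toy)  # close to -1 for this toy series
```

A plain least-squares fit on log-log axes is sensitive to the noisy low-abundance tail, so real data would usually need a restricted rank range or a more careful estimator.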
Romero-Garcia, Rafael; Whitaker, Kirstie J; Váša, František; Seidlitz, Jakob; Shinn, Maxwell; Fonagy, Peter; Dolan, Raymond J; Jones, Peter B; Goodyer, Ian M; Bullmore, Edward T; Vértes, Petra E
2018-05-01
Complex network topology is characteristic of many biological systems, including anatomical and functional brain networks (connectomes). Here, we first constructed a structural covariance network (SCN) from MRI measures of cortical thickness on 296 healthy volunteers, aged 14-24 years. Next, we designed a new algorithm for matching sample locations from the Allen Brain Atlas to the nodes of the SCN. Subsequently, we used this to define transcriptomic brain networks by estimating gene co-expression between pairs of cortical regions. Finally, we explored the hypothesis that transcriptional networks and structural MRI connectomes are coupled. A transcriptional brain network (TBN) and the SCN were correlated across connection weights and showed qualitatively similar complex topological properties: assortativity, small-worldness, modularity, and a rich-club. In both networks, the weight of an edge was inversely related to the anatomical (Euclidean) distance between regions. There were differences between networks in degree and distance distributions: the transcriptional network had a less fat-tailed degree distribution and a less positively skewed distance distribution than the SCN. However, cortical areas connected to each other within modules of the SCN had significantly higher levels of whole genome co-expression than expected by chance. Nodes connected in the SCN had especially high levels of expression and co-expression of a human supragranular enriched (HSE) gene set that has been specifically located to supragranular layers of human cerebral cortex and is known to be important for large-scale, long-distance cortico-cortical connectivity. This coupling of brain transcriptome and connectome topologies was largely but not entirely accounted for by the common constraint of physical distance on both networks. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Monteiro, Antónia
2012-03-01
Co-option of the eye developmental gene regulatory network may have led to the appearance of novel functional traits on the wings of flies and butterflies. The first trait is a recently described wing organ in a species of extinct midge resembling the outer layers of the midge's own compound eye. The second trait is red pigment patches on Heliconius butterfly wings connected to the expression of an eye selector gene, optix. These examples, as well as others, are discussed regarding the type of empirical evidence and burden of proof that have been used to infer gene network co-option underlying the origin of novel traits. A conceptual framework describing increasing confidence in inference of network co-option is proposed. Novel research directions to facilitate inference of network co-option are also highlighted, especially in cases where the pre-existent and novel traits do not resemble each other. Copyright © 2012 WILEY Periodicals, Inc.
Regulatory Divergence between Parental Alleles Determines Gene Expression Patterns in Hybrids
Combes, Marie-Christine; Hueber, Yann; Dereeper, Alexis; Rialle, Stéphanie; Herrera, Juan-Carlos; Lashermes, Philippe
2015-01-01
Both hybridization and allopolyploidization generate novel phenotypes by reconciling divergent genomes and regulatory networks in the same cellular context. To understand the rewiring of gene expression in hybrids, the total expression of 21,025 genes and the allele-specific expression of over 11,000 genes were quantified in interspecific hybrids and their parental species, Coffea canephora and Coffea eugenioides, using RNA-seq technology. Between parental species, cis- and trans-regulatory divergences affected around 32% and 35% of analyzed genes, respectively, with nearly 17% of them showing both. The relative importance of trans-regulatory divergences between both species could be related to their low genetic divergence and perennial habit. In hybrids, among divergently expressed genes between parental species and hybrids, 77% were expressed like one parent (expression level dominance), including 65% like C. eugenioides. Gene expression was shown to result from the expression of both alleles affected by intertwined parental trans-regulatory factors. A strong impact of C. eugenioides trans-regulatory factors on the upregulation of C. canephora alleles was revealed. The gene expression patterns appeared to be determined by complex combinations of cis- and trans-regulatory divergences. In particular, the observed biased expression level dominance seemed to be derived from the asymmetric effects of trans-regulatory parental factors on regulation of alleles. More generally, this study illustrates the effects of divergent trans-regulatory parental factors on the gene expression pattern in hybrids. The characteristics of the transcriptional response to hybridization appear to be determined by the compatibility of gene regulatory networks and therefore depend on genetic divergences between the parental species and their evolutionary history. PMID:25819221
Li, Wei; Xiang, Fen; Zhong, Micai; Zhou, Lingyun; Liu, Hongyan; Li, Saijun; Wang, Xuewen
2017-05-10
Applied nitrogen (N) fertilizer significantly increases leaf yield. However, most N is not utilized by the plant, negatively impacting the environment. To date, little is known regarding N utilization genes and mechanisms in leaf production. To understand this, we investigated transcriptomes using RNA-seq and amino acid levels with N treatment in tea (Camellia sinensis), the most popular beverage crop. We identified 196 and 29 common differentially expressed genes in roots and leaves, respectively, in response to ammonium in two tea varieties. Among those genes, AMT, NRT and AQP for N uptake and GOGAT and GS for N assimilation were the key genes, validated by RT-qPCR, which were expressed in a network manner with tissue specificity. Importantly, only AQP and three novel DEGs associated with stress, manganese binding, and gibberellin-regulated transcription factor were common in N responses across all tissues and varieties. A hypothesized gene regulatory network for N was proposed. A strong statistical correlation between key genes' expression and amino acid content was revealed. The key genes and regulatory network improve our understanding of the molecular mechanism of N usage and offer gene targets for plant improvement.
Meta-Analysis of Tumor Stem-Like Breast Cancer Cells Using Gene Set and Network Analysis
Lee, Won Jun; Kim, Sang Cheol; Yoon, Jung-Ho; Yoon, Sang Jun; Lim, Johan; Kim, You-Sun; Kwon, Sung Won; Park, Jeong Hill
2016-01-01
Generally, cancer stem cells have epithelial-to-mesenchymal-transition characteristics and other aggressive properties that cause metastasis. However, there are no reliable markers for the identification of cancer stem cells, and comparative methods examining adherent and sphere cells are widely used to investigate the mechanisms underlying cancer stem cells, because sphere cells are known to maintain cancer stem cell characteristics. In this study, we conducted a meta-analysis that combined gene expression profiles from several studies that utilized tumorsphere technology to investigate tumor stem-like breast cancer cells. We used our own gene expression profiles along with three different gene expression profiles from the Gene Expression Omnibus, which we combined using the ComBat method, and obtained significant gene sets using gene set analysis of our datasets and the combined dataset. This experiment focused on four gene sets, such as cytokine-cytokine receptor interaction, that demonstrated significance in both datasets. Our observations demonstrated that among the genes of the four significant gene sets, six genes were consistently up-regulated and satisfied the p-value of < 0.05, and our network analysis showed high connectivity in five genes. From these results, we established CXCR4, CXCL1 and HMGCS1, the intersecting genes of the datasets with high connectivity and p-value of < 0.05, as significant genes in the identification of cancer stem cells. An additional experiment using quantitative reverse transcription-polymerase chain reaction showed significant up-regulation in MCF-7 derived sphere cells and confirmed the importance of these three genes. Taken together, using a meta-analysis that combines gene set and network analysis, we suggest CXCR4, CXCL1 and HMGCS1 as candidates involved in tumor stem-like breast cancer cells. 
Unlike other meta-analyses, by using gene set analysis, we selected possible markers that can explain the biological mechanisms and suggested network analysis as an additional criterion for selecting candidates. PMID:26870956
Network Analysis of Rodent Transcriptomes in Spaceflight
NASA Technical Reports Server (NTRS)
Ramachandran, Maya; Fogle, Homer; Costes, Sylvain
2017-01-01
Network analysis methods leverage prior knowledge of cellular systems and the statistical and conceptual relationships between analyte measurements to determine gene connectivity. Correlation and conditional metrics are used to infer a network topology and provide a systems-level context for cellular responses. Integration across multiple experimental conditions and omics domains can reveal the regulatory mechanisms that underlie gene expression. GeneLab has assembled rich multi-omic (transcriptomics, proteomics, epigenomics, and epitranscriptomics) datasets for multiple murine tissues from the Rodent Research 1 (RR-1) experiment. RR-1 assesses the impact of 37 days of spaceflight on gene expression across a variety of tissue types, such as adrenal glands, quadriceps, gastrocnemius, tibialis anterior, extensor digitorum longus, soleus, eye, and kidney. Network analysis is particularly useful for RR-1 -omics datasets because it reinforces subtle relationships that may be overlooked in isolated analyses and subdues confounding factors. Our objective is to use network analysis to determine potential target nodes for therapeutic intervention and identify similarities with existing disease models. Multiple network algorithms are used for a higher confidence consensus.
Wotton, Karl R; Jiménez-Guri, Eva; Crombach, Anton; Janssens, Hilde; Alcaine-Colet, Anna; Lemke, Steffen; Schmidt-Ott, Urs; Jaeger, Johannes
2015-01-01
The segmentation gene network in insects can produce equivalent phenotypic outputs despite differences in upstream regulatory inputs between species. We investigate the mechanistic basis of this phenomenon through a systems-level analysis of the gap gene network in the scuttle fly Megaselia abdita (Phoridae). It combines quantification of gene expression at high spatio-temporal resolution with systematic knock-downs by RNA interference (RNAi). Initiation and dynamics of gap gene expression differ markedly between M. abdita and Drosophila melanogaster, while the output of the system converges to equivalent patterns at the end of the blastoderm stage. Although the qualitative structure of the gap gene network is conserved, there are differences in the strength of regulatory interactions between species. We term such network rewiring ‘quantitative system drift’. It provides a mechanistic explanation for the developmental hourglass model in the dipteran lineage. Quantitative system drift is likely to be a widespread mechanism for developmental evolution. DOI: http://dx.doi.org/10.7554/eLife.04785.001 PMID:25560971
Penfold, Christopher A.; Jenkins, Dafyd J.; Legaie, Roxane; Lawson, Tracy; Vialet-Chabrand, Silvere R.M.; Subramaniam, Sunitha; Hickman, Richard; Feil, Regina; Bowden, Laura; Hill, Claire; Lunn, John E.; Finkenstädt, Bärbel; Buchanan-Wollaston, Vicky; Beynon, Jim; Wild, David L.; Ott, Sascha
2016-01-01
In Arabidopsis thaliana, changes in metabolism and gene expression drive increased drought tolerance and initiate diverse drought avoidance and escape responses. To address regulatory processes that link these responses, we set out to identify genes that govern early responses to drought. To do this, a high-resolution time series transcriptomics data set was produced, coupled with detailed physiological and metabolic analyses of plants subjected to a slow transition from well-watered to drought conditions. A total of 1815 drought-responsive differentially expressed genes were identified. The early changes in gene expression coincided with a drop in carbon assimilation, and only in the late stages with an increase in foliar abscisic acid content. To identify gene regulatory networks (GRNs) mediating the transition between the early and late stages of drought, we used Bayesian network modeling of differentially expressed transcription factor (TF) genes. This approach identified AGAMOUS-LIKE22 (AGL22) as a key hub gene in a TF GRN. It has previously been shown that AGL22 is involved in the transition from vegetative state to flowering, but here we show that AGL22 expression influences steady state photosynthetic rates and lifetime water use. This suggests that AGL22 uniquely regulates a transcriptional network during drought stress, linking changes in primary metabolism and the initiation of stress responses. PMID:26842464
CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks
Baumbach, Jan
2007-01-01
Background: Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms, deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results: Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. As in previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto those networks. To this end, we added stimulon data directly into the database, and also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP based Web Service server, which can easily be consumed by other bioinformatics software systems. 
Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known transcriptional regulatory networks to predict putative contradictions or further gene regulatory interactions. Furthermore, it integrates protein clusters by means of heuristically solving the weighted graph cluster editing problem. In addition, it provides Web Service based access to up-to-date gene annotation data from GenDB. Conclusion: Release 4.0 of CoryneRegNet is a comprehensive system for the integrated analysis of procaryotic gene regulatory networks. It is a versatile systems biology platform to support the efficient and large-scale analysis of transcriptional regulation of gene expression in microorganisms. It is publicly available at . PMID:17986320
When is hub gene selection better than standard meta-analysis?
Langfelder, Peter; Mischel, Paul S; Horvath, Steve
2013-01-01
Since hub nodes have been found to play important roles in many networks, highly connected hub genes are expected to play an important role in biology as well. However, the empirical evidence remains ambiguous. An open question is whether (or when) hub gene selection leads to more meaningful gene lists than a standard statistical analysis based on significance testing when analyzing genomic data sets (e.g., gene expression or DNA methylation data). Here we address this question for the special case when multiple genomic data sets are available. This is of great practical importance since for many research questions multiple data sets are publicly available. In this case, the data analyst can decide between a standard statistical approach (e.g., based on meta-analysis) and a co-expression network analysis approach that selects intramodular hubs in consensus modules. We assess the performance of these two types of approaches according to two criteria. The first criterion evaluates the biological insights gained and is relevant in basic research. The second criterion evaluates the validation success (reproducibility) in independent data sets and often applies in clinical diagnostic or prognostic applications. We compare meta-analysis with consensus network analysis based on weighted correlation network analysis (WGCNA) in three comprehensive and unbiased empirical studies: (1) finding genes predictive of lung cancer survival, (2) finding methylation markers related to age, and (3) finding mouse genes related to total cholesterol. The results demonstrate that intramodular hub gene status with respect to consensus modules is more useful than a meta-analysis p-value when identifying biologically meaningful gene lists (reflecting criterion 1). However, standard meta-analysis methods perform as well as (if not better than) a consensus network approach in terms of validation success (criterion 2). 
The article also reports a comparison of meta-analysis techniques applied to gene expression data and presents novel R functions for carrying out consensus network analysis, network based screening, and meta analysis.
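To make the notion of an intramodular hub concrete, the sketch below computes a WGCNA-style intramodular connectivity for each gene in a module: the sum of soft-thresholded absolute Pearson correlations with the other module members. This is an illustrative Python approximation, not the R functions the article presents. The soft-threshold power `beta=6`, the unsigned adjacency, and the toy expression values are all assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def intramodular_connectivity(expression, module_genes, beta=6):
    """Sum of |correlation|**beta between each gene and the rest of its module."""
    return {
        g: sum(
            abs(pearson(expression[g], expression[h])) ** beta
            for h in module_genes
            if h != g
        )
        for g in module_genes
    }


def top_hubs(expression, module_genes, n=1, beta=6):
    """Return the n genes with the highest intramodular connectivity."""
    k = intramodular_connectivity(expression, module_genes, beta)
    return sorted(k, key=k.get, reverse=True)[:n]


# Toy module: g1 and g2 are perfectly correlated, g4 is weakly connected.
toy_expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, 2.0, 3.0, 5.0],
    "g4": [4.0, 1.0, 3.0, 2.0],
}
toy_k = intramodular_connectivity(toy_expression, list(toy_expression))
```

In a consensus setting, such connectivities would be computed per data set and then combined, which is the step the article compares against meta-analysis p-values.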
USDA-ARS's Scientific Manuscript database
Functional annotations of large plant genome projects mostly provide information on gene function and gene families based on the presence of protein domains and gene homology, but not necessarily in association with gene expression or metabolic and regulatory networks. These additional annotations a...
Identification and function analysis of contrary genes in Dupuytren's contracture.
Ji, Xianglu; Tian, Feng; Tian, Lijie
2015-07-01
The present study aimed to analyze the expression of genes involved in Dupuytren's contracture (DC), using bioinformatic methods. The profile of GSE21221 was downloaded from the Gene Expression Omnibus, which included six samples derived from fibroblasts and six healthy control samples derived from carpal-tunnel fibroblasts. A Distributed Intrusion Detection System was used in order to identify differentially expressed genes. The term contrary genes is proposed. Contrary genes were the genes that exhibited opposite expression patterns in the positive and negative groups, and likely exhibited opposite functions. These were identified using Coexpress software. Gene ontology (GO) function analysis was conducted for the contrary genes. A network of GO terms was constructed using the reduce and visualize gene ontology database. Significantly expressed genes (801) and contrary genes (98) were screened. A significant association was observed between Chitinase-3-like protein 1 and ten genes in the positive gene set. Positive regulation of transcription and the activation of nuclear factor-κB (NF-κB)-inducing kinase activity exhibited the highest degree values in the network of GO terms. In the present study, the expression of genes involved in the development of DC was analyzed, and the concept of contrary genes was proposed. The genes identified in the present study are involved in the positive regulation of transcription and activation of NF-κB-inducing kinase activity. The contrary genes and GO terms identified in the present study may potentially be used for DC diagnosis and treatment.
de Jong, Simone; Boks, Marco P. M.; Fuller, Tova F.; Strengman, Eric; Janson, Esther; de Kovel, Carolien G. F.; Ori, Anil P. S.; Vi, Nancy; Mulder, Flip; Blom, Jan Dirk; Glenthøj, Birte; Schubart, Chris D.; Cahn, Wiepke; Kahn, René S.; Horvath, Steve; Ophoff, Roel A.
2012-01-01
Despite large-scale genome-wide association studies (GWAS), the underlying genes for schizophrenia are largely unknown. Additional approaches are therefore required to identify the genetic background of this disorder. Here we report findings from a large gene expression study in peripheral blood of schizophrenia patients and controls. We applied a systems biology approach to genome-wide expression data from whole blood of 92 medicated and 29 antipsychotic-free schizophrenia patients and 118 healthy controls. We show that gene expression profiling in whole blood can identify twelve large gene co-expression modules associated with schizophrenia. Several of these disease-related modules are likely to reflect expression changes due to antipsychotic medication. However, two of the disease modules could be replicated in an independent second data set involving antipsychotic-free patients and controls. One of these robustly defined disease modules is significantly enriched with brain-expressed genes and with genetic variants that were implicated in a GWAS, which could imply a causal role in schizophrenia etiology. The most highly connected intramodular hub gene in this module (ABCF1) is located in, and regulated by, the major histocompatibility complex (MHC), which is intriguing in light of the fact that common allelic variants from the MHC region have been implicated in schizophrenia. This suggests that the MHC increases schizophrenia susceptibility via altered gene expression of regulatory genes in this network. PMID:22761806
Regional and temporal differences in gene expression of LH(BETA)T(AG) retinoblastoma tumors.
Houston, Samuel K; Pina, Yolanda; Clarke, Jennifer; Koru-Sengul, Tulay; Scott, William K; Nathanson, Lubov; Schefler, Amy C; Murray, Timothy G
2011-07-23
The purpose of this study was to evaluate by microarray the hypothesis that LH(BETA)T(AG) retinoblastoma tumors exhibit regional and temporal variations in gene expression. LH(BETA)T(AG) mice aged 12, 16, and 20 weeks were euthanatized (n = 9). Specimens were taken from five tumor areas (apex, anterior lateral, center, base, and posterior lateral). Samples were hybridized to gene microarrays. The data were preprocessed and analyzed, and genes with a P < 0.01, according to the ANOVA models, and a log2 fold change >2.5 were considered to be differentially expressed. Differentially expressed genes were analyzed for overlap with known networks by using pathway analysis tools. There were significant temporal (P < 10^-8) and regional differences in gene expression for LH(BETA)T(AG) retinoblastoma tumors. At P < 0.01 and log2 fold change >2.5, there were significant changes in gene expression of 190 genes apically, 84 genes anterolaterally, 126 genes posteriorly, 56 genes centrally, and 134 genes at the base. Differentially expressed genes overlapped with known networks, with significant involvement in regulation of cellular proliferation and growth, response to oxygen levels and hypoxia, regulation of cellular processes, cellular signaling cascades, and angiogenesis. There are significant temporal and regional variations in the LH(BETA)T(AG) retinoblastoma model. Differentially expressed genes overlap with key pathways that may play pivotal roles in murine retinoblastoma development. These findings suggest the mechanisms involved in tumor growth and progression in murine retinoblastoma tumors and identify pathways for analysis at a functional level, to determine significance in human retinoblastoma. Microarray analysis of LH(BETA)T(AG) retinal tumors showed significant regional and temporal variations in gene expression, including dysregulation of genes involved in hypoxic responses and angiogenesis.
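The two-part selection rule stated in this abstract (ANOVA P < 0.01 and log2 fold change > 2.5) can be sketched as a simple filter. The abstract does not say whether the fold-change threshold applies to magnitude only, so the sketch assumes the absolute value is used, which would also keep strongly down-regulated genes. The gene names and values are made up for illustration.

```python
def differentially_expressed(results, p_cutoff=0.01, lfc_cutoff=2.5):
    """Select genes passing both the p-value and fold-change thresholds.

    results: dict mapping gene name -> (p_value, log2_fold_change).
    Assumption: the fold-change cutoff is applied to the absolute value.
    """
    return [
        gene
        for gene, (p_value, log2_fc) in results.items()
        if p_value < p_cutoff and abs(log2_fc) > lfc_cutoff
    ]


# Hypothetical per-gene results for one tumor region.
toy_results = {
    "geneA": (0.001, 3.1),    # passes both thresholds
    "geneB": (0.02, 4.0),     # fails the p-value threshold
    "geneC": (0.005, -2.8),   # passes; strongly down-regulated
    "geneD": (0.0001, 1.2),   # fails the fold-change threshold
}
hits = differentially_expressed(toy_results)
```

Running the same filter once per region would give per-region gene counts of the kind the abstract reports.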
Novel transcriptional networks regulated by CLOCK in human neurons.
Fontenot, Miles R; Berto, Stefano; Liu, Yuxiang; Werthmann, Gordon; Douglas, Connor; Usui, Noriyoshi; Gleason, Kelly; Tamminga, Carol A; Takahashi, Joseph S; Konopka, Genevieve
2017-11-01
The molecular mechanisms underlying human brain evolution are not fully understood; however, previous work suggested that expression of the transcription factor CLOCK in the human cortex might be relevant to human cognition and disease. In this study, we investigated this novel transcriptional role for CLOCK in human neurons by performing chromatin immunoprecipitation sequencing for endogenous CLOCK in adult neocortices and RNA sequencing following CLOCK knockdown in differentiated human neurons in vitro. These data suggested that CLOCK regulates the expression of genes involved in neuronal migration, and a functional assay showed that CLOCK knockdown increased neuronal migratory distance. Furthermore, dysregulation of CLOCK disrupts coexpressed networks of genes implicated in neuropsychiatric disorders, and the expression of these networks is driven by hub genes with human-specific patterns of expression. These data support a role for CLOCK-regulated transcriptional cascades involved in human brain evolution and function. © 2017 Fontenot et al.; Published by Cold Spring Harbor Laboratory Press.
Uncovering co-expression gene network modules regulating fruit acidity in diverse apples.
Bai, Yang; Dougherty, Laura; Cheng, Lailiang; Zhong, Gan-Yuan; Xu, Kenong
2015-08-16
Acidity is a major contributor to fruit quality. Several organic acids are present in apple fruit, but malic acid is predominant and determines fruit acidity. The trait is largely controlled by the Malic acid (Ma) locus, for which Ma1, which putatively encodes a vacuolar aluminum-activated malate transporter1 (ALMT1)-like protein, is a strong candidate gene. We hypothesize that fruit acidity is governed by a gene network in which Ma1 is a key member. The goal of this study is to identify the gene network and the potential mechanisms through which the network operates. Guided by Ma1, we analyzed the transcriptomes of mature fruit of contrasting acidity from six apple accessions of genotype Ma_ (MaMa or Mama) and four of mama using RNA-seq and identified 1301 fruit acidity associated genes, among which 18 were most significant acidity genes (MSAGs). Network inference using weighted gene co-expression network analysis (WGCNA) revealed five co-expression gene network modules of significant (P < 0.001) correlation with malate. Of these, the Ma1 containing module (Turquoise) of 336 genes showed the highest correlation (0.79). We also identified 12 intramodular hub genes from each of the five modules and 18 enriched gene ontology (GO) terms and MapMan sub-bins, including two GO terms (GO:0015979 and GO:0009765) and two MapMan sub-bins (1.3.4 and 1.1.1.1) related to photosynthesis in module Turquoise. Using Lemon-Tree algorithms, we identified 12 regulator genes of probabilistic scores 35.5-81.0, including MDP0000525602 (a LLR receptor kinase), MDP0000319170 (an IQD2-like CaM binding protein) and MDP0000190273 (an EIN3-like transcription factor) of greater interest for being one of the 18 MSAGs or one of the 12 intramodular hub genes in Turquoise, and/or a regulator to the cluster containing Ma1.
The most relevant finding of this study is the identification of the MSAGs, intramodular hub genes, enriched photosynthesis related processes, and regulator genes in a WGCNA module Turquoise that not only encompasses Ma1 but also shows the highest modular correlation with acidity. Overall, this study provides important insight into the Ma1-mediated gene network controlling acidity in mature apple fruit of diverse genetic background.
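The idea of an intramodular hub gene can be sketched in a few lines of Python. This is a minimal illustration of the general WGCNA approach, not the authors' pipeline: it assumes an unsigned network with adjacency |r|^beta, uses beta = 6 as a commonly chosen soft-threshold power (the abstract does not report the power used), and the gene names and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def intramodular_connectivity(expr, beta=6):
    """Sum of soft-thresholded adjacencies |r|**beta to all other genes.

    expr maps gene name -> expression values across samples for the
    genes of ONE module. beta is an assumed soft-threshold power.
    """
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(expr[gi], expr[gj])) ** beta
            k[gi] += a
            k[gj] += a
    return k


# Hypothetical expression profiles for four genes across five samples.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.1, 6.0, 8.2, 10.0],
    "g3": [1.1, 1.9, 3.2, 3.9, 5.1],
    "g4": [5.0, 1.0, 4.0, 2.0, 3.0],  # weakly correlated with the rest
}
k = intramodular_connectivity(expr)
hub = max(k, key=k.get)  # gene with the highest connectivity
print(hub, {g: round(v, 3) for g, v in k.items()})
```

Raising the correlation to a power suppresses weak correlations without imposing a hard cutoff, so a gene such as g4 contributes almost nothing to the connectivity of the others.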
Zahoor, Imran; de Koning, Dirk-Jan; Hocking, Paul M
2017-09-20
In recent years, the commercial importance of changes in muscle function of broiler chickens and of the corresponding effects on meat quality has increased. Furthermore, broilers are more sensitive to heat stress during transport and at high ambient temperatures than smaller egg-laying chickens. We hypothesised that heat stress would amplify muscle damage and expression of genes that are involved in such changes and, thus, lead to the identification of pathways and networks associated with broiler muscle and meat quality traits. Broiler and layer chickens were exposed to control or high ambient temperatures to characterise differences in gene expression between the two genotypes and the two environments. Whole-genome expression studies in breast muscles of broiler and layer chickens were conducted before and after heat stress; 2213 differentially-expressed genes were detected based on a significant (P < 0.05) genotype × treatment interaction. This gene set was analysed with the BioLayout Express 3D and Ingenuity Pathway Analysis software and relevant biological pathways and networks were identified. Genes involved in functions related to inflammatory reactions, cell death, oxidative stress and tissue damage were upregulated in control broilers compared with control and heat-stressed layers. Expression of these genes was further increased in heat-stressed broilers. Differences in gene expression between broiler and layer chickens under control and heat stress conditions suggest that damage of breast muscles in broilers at normal ambient temperatures is similar to that in heat-stressed layers and is amplified when broilers are exposed to heat stress. The patterns of gene expression of the two genotypes under heat stress were almost the polar opposite of each other, which is consistent with the conclusion that broiler chickens were not able to cope with heat stress by dissipating their body heat. 
The differentially expressed gene networks and pathways were consistent with the pathological changes that are observed in the breast muscle of heat-stressed broilers.
D'Antò, Vincenzo; Cantile, Monica; D'Armiento, Maria; Schiavo, Giulia; Spagnuolo, Gianrico; Terracciano, Luigi; Vecchione, Raffaela; Cillo, Clemente
2006-03-01
Homeobox-containing genes play a crucial role in odontogenesis. After the detection of Dlx and Msx genes in overlapping domains along maxillary and mandibular processes, a homeobox odontogenic code has been proposed to explain the interaction between different homeobox genes during dental lamina patterning. No role has so far been assigned to the Hox gene network in the homeobox odontogenic code due to studies on specific Hox genes and evolutionary considerations. Despite its involvement in early patterning during embryonal development, the HOX gene network, which lies in the most repeat-poor regions of the human genome, controls the phenotype identity of adult eukaryotic cells. Here, according to our results, the HOX gene network appears to be active in human tooth germs between 18 and 24 weeks of development. The immunohistochemical localization of specific HOX proteins mostly concerns the epithelial tooth germ compartment. Furthermore, only a few genes of the network are active in embryonal retromolar tissues, as well as in ectomesenchymal dental pulp cells (DPCs) grown in vitro from adult human molars. Exposure of DPCs to cAMP induces the expression of three to nine HOX genes of the network in parallel with phenotype modifications with traits of neuronal differentiation. Our observations suggest that: (i) by combining its component genes, the HOX gene network determines the phenotype identity of epithelial and ectomesenchymal cells interacting in the generation of human tooth germ; (ii) cAMP treatment activates the HOX network and induces, in parallel, a neuronal-like phenotype in human primary ectomesenchymal dental pulp cells. 2005 Wiley-Liss, Inc.
Pleiotropic and Epistatic Network-Based Discovery: Integrated Networks for Target Gene Discovery
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weighill, Deborah; Jones, Piet; Shah, Manesh; ...
2018-05-11
Biological organisms are complex systems that are composed of functional networks of interacting molecules and macro-molecules. Complex phenotypes are the result of orchestrated, hierarchical, heterogeneous collections of expressed genomic variants. However, the effects of these variants are the result of historic selective pressure and current environmental and epigenetic signals, and, as such, their co-occurrence can be seen as genome-wide correlations in a number of different manners. Biomass recalcitrance (i.e., the resistance of plants to degradation or deconstruction, which ultimately enables access to a plant's sugars) is a complex polygenic phenotype of high importance to biofuels initiatives. This study makes use of data derived from the re-sequenced genomes from over 800 different Populus trichocarpa genotypes in combination with metabolomic and pyMBMS data across this population, as well as co-expression and co-methylation networks in order to better understand the molecular interactions involved in recalcitrance, and identify target genes involved in lignin biosynthesis/degradation. A Lines Of Evidence (LOE) scoring system is developed to integrate the information in the different layers and quantify the number of lines of evidence linking genes to target functions. This new scoring system was applied to quantify the lines of evidence linking genes to lignin-related genes and phenotypes across the network layers, and allowed for the generation of new hypotheses surrounding potential new candidate genes involved in lignin biosynthesis in P. trichocarpa, including various AGAMOUS-LIKE genes.
Lastly, the resulting Genome Wide Association Study networks, integrated with Single Nucleotide Polymorphism (SNP) correlation, co-methylation, and co-expression networks through the LOE scores are proving to be a powerful approach to determine the pleiotropic and epistatic relationships underlying cellular functions and, as such, the molecular basis for complex phenotypes, such as recalcitrance.
Lin, Mingyan; Pedrosa, Erika; Hrabovsky, Anastasia; Chen, Jian; Puliafito, Benjamin R; Gilbert, Stephanie R; Zheng, Deyou; Lachman, Herbert M
2016-11-15
Individuals with 22q11.2 Deletion Syndrome (22q11.2 DS) are a specific high-risk group for developing schizophrenia (SZ), schizoaffective disorder (SAD) and autism spectrum disorders (ASD). Several genes in the deleted region have been implicated in the development of SZ, e.g., PRODH and DGCR8. However, the mechanistic connection between these genes and the neuropsychiatric phenotype remains unclear. To elucidate the molecular consequences of 22q11.2 deletion in early neural development, we carried out RNA-seq analysis to investigate gene expression in early differentiating human neurons derived from induced pluripotent stem cells (iPSCs) of 22q11.2 DS SZ and SAD patients. Eight cases (ten iPSC-neuron samples in total including duplicate clones) and seven controls (nine in total including duplicate clones) were subjected to RNA sequencing. Using a systems level analysis, differentially expressed genes/gene-modules and pathways of interest were identified. Lastly, we related our findings from in vitro neuronal cultures to brain development by mapping differentially expressed genes to BrainSpan transcriptomes. We observed ~2-fold reduction in expression of almost all genes in the 22q11.2 region in SZ (37 genes reached p-value < 0.05, 36 of which reached a false discovery rate < 0.05). Outside of the deleted region, 745 genes showed significant differences in expression between SZ and control neurons (p < 0.05). Function enrichment and network analysis of the differentially expressed genes uncovered converging evidence on abnormal expression in key functional pathways, such as apoptosis, cell cycle and survival, and MAPK signaling in the SZ and SAD samples.
By leveraging transcriptome profiles of normal human brain tissues across human development into adulthood, we showed that the differentially expressed genes converge on a sub-network mediated by CDC45 and the cell cycle, which would be disrupted by the 22q11.2 deletion during embryonic brain development, and another sub-network modulated by PRODH, which could contribute to disruption of brain function during adolescence. This study has provided evidence for disruption of potential molecular events in SZ patients with 22q11.2 deletion and related our findings from in vitro neuronal cultures to functional perturbations that can occur during brain development in SZ.
Acerbi, Enzo; Viganò, Elena; Poidinger, Michael; Mortellaro, Alessandra; Zelante, Teresa; Stella, Fabio
2016-01-01
T helper 17 (TH17) cells represent a pivotal adaptive cell subset involved in multiple immune disorders in mammalian species. Deciphering the molecular interactions regulating TH17 cell differentiation is particularly critical for novel drug target discovery designed to control maladaptive inflammatory conditions. Using continuous time Bayesian networks over a time-course gene expression dataset, we inferred the global regulatory network controlling TH17 differentiation. From the network, we identified the Prdm1 gene encoding the B lymphocyte-induced maturation protein 1 as a crucial negative regulator of human TH17 cell differentiation. The results have been validated by perturbing Prdm1 expression on freshly isolated CD4+ naïve T cells: reduction of Prdm1 expression leads to augmentation of IL-17 release. These data unravel a possible novel target to control TH17 polarization in inflammatory disorders. Furthermore, this study represents the first in vitro validation of continuous time Bayesian networks as a gene network reconstruction method and as a hypothesis generation tool for wet-lab biological experiments. PMID:26976045
Ji, S C; Pan, Y T; Lu, Q Y; Sun, Z Y; Liu, Y Z
2014-03-17
The purpose of this study was to identify critical genes associated with septic multiple trauma by comparing peripheral whole blood samples from multiple trauma patients with and without sepsis. A microarray data set was downloaded from the Gene Expression Omnibus (GEO) database. This data set included 70 samples, 36 from multiple trauma patients with sepsis and 34 from multiple trauma patients without sepsis (as a control set). The data were preprocessed, and differentially expressed genes (DEGs) were then screened using R packages. Functional analysis of DEGs was performed with DAVID. Interaction networks were then established for the most up- and down-regulated genes using HitPredict. Pathway-enrichment analysis was conducted for genes in the networks using WebGestalt. Fifty-eight DEGs were identified. The expression levels of PLAU (down-regulated) and MMP8 (up-regulated) presented the largest fold-changes, and interaction networks were established for these genes. Further analysis revealed that PLAT (plasminogen activator, tissue) and SERPINF2 (serpin peptidase inhibitor, clade F, member 2), which interact with PLAU, play important roles in the complement and coagulation cascades pathway. We hypothesize that PLAU is a major regulator of the complement and coagulation cascades, and that down-regulation of PLAU results in dysfunction of the pathway, causing sepsis.
Ma, Min; Chen, Xiaofei; Lu, Liangyu; Yuan, Feng; Zeng, Wen; Luo, Shulin; Yin, Feng; Cai, Junfeng
2016-12-01
Postmenopausal osteoporosis is a common bone disease characterized by low bone mineral density. This study aimed to reveal key genes associated with postmenopausal osteoporosis (PMO), and provide a theoretical basis for subsequent experiments. The dataset GSE7429 was obtained from Gene Expression Omnibus. A total of 20 B cell samples (ten each from postmenopausal women with low or high bone mineral density (BMD)) were included in this dataset. Following screening of differentially expressed genes (DEGs), coexpression analysis of all genes was performed, and key genes in the coexpression network were screened using the random walk algorithm. Afterwards, functional and pathway analyses were conducted. Additionally, protein-protein interactions (PPIs) between DEGs and key genes were analyzed. A set of 308 DEGs (170 up-regulated and 138 down-regulated) between low BMD and high BMD samples was identified, and 101 key genes in the coexpression network were screened out. In the coexpression network, some genes had a higher score and degree, such as CSTA. The key genes in the coexpression network were mainly enriched in the GO terms defense response (e.g., SERPINA1 and CST3) and immune response (e.g., IL32 and CLEC7A), while the DEGs were mainly enriched in structural constituent of cytoskeleton (e.g., CYLC2 and TUBA1B) and membrane-enclosed lumen (e.g., CCNE1 and INTS5). In the PPI network, CCNE1 interacted with REL, and TUBA1B interacted with ESR1. A series of interactions, such as CSTA/TYROBP, CCNE1/REL and TUBA1B/ESR1, might play pivotal roles in the occurrence and development of PMO.
Identifying key genes associated with acute myocardial infarction.
Cheng, Ming; An, Shoukuan; Li, Junquan
2017-10-01
This study aimed to identify key genes associated with acute myocardial infarction (AMI) by reanalyzing microarray data. Three gene expression profile datasets GSE66360, GSE34198, and GSE48060 were downloaded from GEO database. After data preprocessing, genes without heterogeneity across different platforms were subjected to differential expression analysis between the AMI group and the control group using metaDE package. P < .05 was used as the cutoff for a differentially expressed gene (DEG). The expression data matrices of DEGs were imported in ReactomeFIViz to construct a gene functional interaction (FI) network. Then, DEGs in each module were subjected to pathway enrichment analysis using DAVID. MiRNAs and transcription factors predicted to regulate target DEGs were identified. Quantitative real-time polymerase chain reaction (RT-PCR) was applied to verify the expression of genes. A total of 913 upregulated genes and 1060 downregulated genes were identified in the AMI group. A FI network consists of 21 modules and DEGs in 12 modules were significantly enriched in pathways. The transcription factor-miRNA-gene network contains 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p. RT-PCR validations showed that expression levels of FOXO3 and MYBL2 were significantly increased in AMI, and expression levels of hsa-miR-21-5p and hsa-miR-30c-5p were obviously decreased in AMI. A total of 41 DEGs, such as SOCS3, VAPA, and COL5A2, are speculated to have roles in the pathogenesis of AMI; 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p may be involved in the regulation of the expression of these DEGs.
Identifying key genes associated with acute myocardial infarction
Cheng, Ming; An, Shoukuan; Li, Junquan
2017-01-01
Background: This study aimed to identify key genes associated with acute myocardial infarction (AMI) by reanalyzing microarray data. Methods: Three gene expression profile datasets GSE66360, GSE34198, and GSE48060 were downloaded from GEO database. After data preprocessing, genes without heterogeneity across different platforms were subjected to differential expression analysis between the AMI group and the control group using metaDE package. P < .05 was used as the cutoff for a differentially expressed gene (DEG). The expression data matrices of DEGs were imported in ReactomeFIViz to construct a gene functional interaction (FI) network. Then, DEGs in each module were subjected to pathway enrichment analysis using DAVID. MiRNAs and transcription factors predicted to regulate target DEGs were identified. Quantitative real-time polymerase chain reaction (RT-PCR) was applied to verify the expression of genes. Result: A total of 913 upregulated genes and 1060 downregulated genes were identified in the AMI group. A FI network consists of 21 modules and DEGs in 12 modules were significantly enriched in pathways. The transcription factor-miRNA-gene network contains 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p. RT-PCR validations showed that expression levels of FOXO3 and MYBL2 were significantly increased in AMI, and expression levels of hsa-miR-21-5p and hsa-miR-30c-5p were obviously decreased in AMI. Conclusion: A total of 41 DEGs, such as SOCS3, VAPA, and COL5A2, are speculated to have roles in the pathogenesis of AMI; 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p may be involved in the regulation of the expression of these DEGs. PMID:29049183
Pathania, Shivalika; Bagler, Ganesh; Ahuja, Paramvir S.
2016-01-01
Comparative co-expression analysis of multiple species using high-throughput data is an integrative approach to determine the uniformity as well as diversification in biological processes. Rauvolfia serpentina and Catharanthus roseus, both members of the Apocynaceae family, are reported to have remedial properties against multiple diseases. Despite sharing the upstream part of the terpenoid indole alkaloid pathway, these plants show significant diversity in tissue-specific synthesis and accumulation of specialized metabolites. This led us to implement comparative co-expression network analysis to investigate the modules and genes responsible for differential tissue-specific expression as well as species-specific synthesis of metabolites. Toward these goals, differential network analysis was implemented to identify candidate genes responsible for diversification of metabolite profiles. Three genes were identified with significant difference in connectivity leading to differential regulatory behavior between these plants. These genes may be responsible for diversification of secondary metabolism, and thereby for species-specific metabolite synthesis. The network robustness of R. serpentina, determined based on topological properties, was also complemented by comparison of gene-metabolite networks of both plants, and may have evolved to have complex metabolic mechanisms as compared to C. roseus under the influence of various stimuli. This study reveals evolution of complexity in secondary metabolism of R. serpentina, and key genes that contribute toward diversification of specific metabolites. PMID:27588023
Liu, Rong; Guo, Cheng-Xian; Zhou, Hong-Hao
2015-01-01
This study aims to identify effective gene networks and prognostic biomarkers associated with estrogen receptor positive (ER+) breast cancer using human mRNA studies. Weighted gene coexpression network analysis was performed with a complex ER+ breast cancer transcriptome to investigate the function of networks and key genes in the prognosis of breast cancer. We found a significant correlation of an expression module with distant metastasis-free survival (HR = 2.25; 95% CI = 1.03-4.88 in discovery set; HR = 1.78; 95% CI = 1.07-2.93 in validation set). This module contained genes enriched in the biological process of the M phase. From this module, we further identified and validated 5 hub genes (CDK1, DLGAP5, MELK, NUSAP1, and RRM2), the expression levels of which were strongly associated with poor survival. Highly expressed MELK indicated poor survival in luminal A and luminal B breast cancer molecular subtypes. This gene was also found to be associated with tamoxifen resistance. Results indicated that a network-based approach may facilitate the discovery of biomarkers for the prognosis of ER+ breast cancer and may also be used as a basis for establishing personalized therapies. Nevertheless, before the application of this approach in clinical settings, in vivo and in vitro experiments and multi-center randomized controlled clinical trials are still needed.
Hwang, Sun-Goo; Kim, Dong Sub; Hwang, Jung Eun; Han, A-Reum; Jang, Cheol Seong
2014-05-15
In order to better understand the biological systems that are affected in response to cosmic ray (CR), we conducted weighted gene co-expression network analysis using the module detection method. By using the Pearson's correlation coefficient (PCC) value, we evaluated complex gene-gene functional interactions between 680 CR-responsive probes from integrated microarray data sets, which included large-scale transcriptional profiling of 1000 microarray samples. These probes were divided into 6 distinct modules that contained 20 enriched gene ontology (GO) functions, such as oxidoreductase activity, hydrolase activity, and response to stimulus and stress. In particular, modules 1 and 2 commonly showed enriched annotation categories such as oxidoreductase activity, including enriched cis-regulatory elements known as ROS-specific regulators. These results suggest that the ROS-mediated irradiation response pathway is affected by CR in modules 1 and 2. We found 243 ionizing radiation (IR)-responsive probes that exhibited similarities in expression patterns in various irradiation microarray data sets. The expression patterns of 6 randomly selected IR-responsive genes were evaluated by quantitative reverse transcription polymerase chain reaction following treatment with CR, gamma rays (GR), and ion beam (IB); similar patterns were observed among these genes under these 3 treatments. Moreover, we constructed subnetworks of IR-responsive genes and evaluated the expression levels of their neighboring genes following GR treatment; similar patterns were observed among them. These results of network-based analyses might provide a clue to understanding the complex biological system related to the CR response in plants. Copyright © 2014 Elsevier B.V. All rights reserved.
Xiao, Xiaolin; Moreno-Moral, Aida; Rotival, Maxime; Bottolo, Leonardo; Petretto, Enrico
2014-01-01
Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data in multiple conditions (e.g., cell-types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet, genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally-efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network-modules simultaneously across multiple conditions. Our method can accommodate weighted (and unweighted) networks of any size and can similarly use co-expression or raw gene expression input data, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, we demonstrated distinctive advantages of our method over existing methods, which was able to recover accurately both common and condition-specific network-modules without entailing ad-hoc input parameters as required by other approaches. We applied our method to genome-scale and multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance, which were not detected by other methods. In humans we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interactions processes in ventricular zones during neocortex expansion and further, we uncovered pathways related to development of later cognitive functions in the cortical plate of the developing brain which were previously unappreciated. 
Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed heat shock protein (Hsp) and cardiomyopathy genes (Bag3, Cryab, Kras, Emd, Plec), which was significantly replicated using separate failing heart and liver gene expression datasets in humans, thus revealing a conserved functional role for Hsp genes in cardiovascular disease.
Zinati, Zahra; Shamloo-Dashtpagerdi, Roohollah; Behpouri, Ali
2016-01-01
Saffron (Crocus sativus L.), an aromatic and colorful plant with a distinctive taste, owes these properties to a growing class of secondary metabolites derived from the carotenoids, the apocarotenoids. Given the critical role of microRNAs (miRNAs) in secondary metabolite synthesis and the limited number of miRNAs identified in C. sativus, characterizing miRNAs and their target genes in C. sativus might expand our perspective on the roles of miRNAs in the carotenoid/apocarotenoid biosynthetic pathway. A computational analysis was used to identify miRNAs and their targets using an EST (Expressed Sequence Tag) library from mature saffron stigmas. Then, a gene co-expression network was constructed to identify genes which are potentially involved in carotenoid/apocarotenoid biosynthetic pathways. EST analysis led to the identification of two putative miRNAs (miR414 and miR837-5p) along with the corresponding stem-looped precursors. To our knowledge, this is the first report on miR414 and miR837-5p in C. sativus. Co-expression network analysis indicated that miR414 and miR837-5p may play roles in C. sativus metabolic pathways and led to identification of candidate genes including six transcription factors and one protein kinase probably involved in the carotenoid/apocarotenoid biosynthetic pathway. The presence of transcription factors, miRNAs and a protein kinase in the network indicated multiple layers of regulation in saffron stigma. The candidate genes from this study may help in unraveling regulatory networks underlying carotenoid/apocarotenoid biosynthesis in saffron and in designing metabolic engineering for enhanced secondary metabolites. PMID:28261627
Kaufman, Alon; Dror, Gideon; Meilijson, Isaac; Ruppin, Eytan
2006-12-08
The claim that genetic properties of neurons significantly influence their synaptic network structure is a common notion in neuroscience. The nematode Caenorhabditis elegans provides an exciting opportunity to approach this question in a large-scale quantitative manner. Its synaptic connectivity network has been identified, and, combined with cellular studies, we currently have characteristic connectivity and gene expression signatures for most of its neurons. By using two complementary analysis assays we show that the expression signature of a neuron carries significant information about its synaptic connectivity signature, and identify a list of putative genes predicting neural connectivity. The current study rigorously quantifies the relation between gene expression and synaptic connectivity signatures in the C. elegans nervous system and identifies subsets of neurons where this relation is highly marked. The results presented and the genes identified provide a promising starting point for further, more detailed computational and experimental investigations.
Network information improves cancer outcome prediction.
Roy, Janine; Winter, Christof; Isik, Zerrin; Schroeder, Michael
2014-07-01
Disease progression in cancer can vary substantially between patients. Yet, patients often receive the same treatment. Recently, there has been much work on predicting disease progression and patient outcome variables from gene expression in order to personalize treatment options. Despite first diagnostic kits in the market, there are open problems such as the choice of random gene signatures or noisy expression data. One approach to deal with these two problems employs protein-protein interaction networks and ranks genes using the random surfer model of Google's PageRank algorithm. In this work, we created a benchmark dataset collection comprising 25 cancer outcome prediction datasets from literature and systematically evaluated the use of networks and a PageRank derivative, NetRank, for signature identification. We show that the NetRank performs significantly better than classical methods such as fold change or t-test. Despite an order of magnitude difference in network size, a regulatory and protein-protein interaction network perform equally well. Experimental evaluation on cancer outcome prediction in all of the 25 underlying datasets suggests that the network-based methodology identifies highly overlapping signatures over all cancer types, in contrast to classical methods that fail to identify highly common gene sets across the same cancer types. Integration of network information into gene expression analysis allows the identification of more reliable and accurate biomarkers and provides a deeper understanding of processes occurring in cancer development and progression. © The Author 2012. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
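The ranking idea in the abstract above, a random surfer on a gene network whose restarts are biased by expression-derived scores, can be illustrated with a personalized PageRank. This is a minimal sketch under stated assumptions, not the authors' NetRank implementation: the toy network, the gene names, the uniform scores and the damping factor of 0.85 are all hypothetical.

```python
def network_rank(edges, scores, damping=0.85, iterations=100):
    """Personalized PageRank over an undirected gene network.

    edges:  iterable of (gene_a, gene_b) pairs (hypothetical toy network)
    scores: dict gene -> non-negative expression-derived score (assumed input)
    Assumes every gene has at least one neighbor, so no rank mass is lost.
    """
    neighbors = {gene: set() for gene in scores}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    total = sum(scores.values())
    restart = {gene: s / total for gene, s in scores.items()}  # restart distribution
    rank = dict(restart)

    for _ in range(iterations):
        new_rank = {}
        for gene in neighbors:
            # Rank flowing in from each neighbor, split evenly over that neighbor's links.
            inflow = sum(rank[n] / len(neighbors[n]) for n in neighbors[gene])
            new_rank[gene] = (1 - damping) * restart[gene] + damping * inflow
        rank = new_rank
    return rank


# Hypothetical star-shaped network: gene "A" is connected to every other gene.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D")]
toy_scores = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
toy_rank = network_rank(toy_edges, toy_scores)
print(sorted(toy_rank, key=toy_rank.get, reverse=True))
```

With equal scores, the hub gene ranks first purely because of its connectivity. Raising the score of a peripheral gene shifts rank toward it, which is the trade-off between expression evidence and network position that this family of methods relies on.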
Dumas, Marc-Emmanuel; Domange, Céline; Calderari, Sophie; Martínez, Andrea Rodríguez; Ayala, Rafael; Wilder, Steven P; Suárez-Zamorano, Nicolas; Collins, Stephan C; Wallis, Robert H; Gu, Quan; Wang, Yulan; Hue, Christophe; Otto, Georg W; Argoud, Karène; Navratil, Vincent; Mitchell, Steve C; Lindon, John C; Holmes, Elaine; Cazier, Jean-Baptiste; Nicholson, Jeremy K; Gauguier, Dominique
2016-09-30
The genetic regulation of metabolic phenotypes (i.e., metabotypes) in type 2 diabetes mellitus occurs through complex organ-specific cellular mechanisms and networks contributing to impaired insulin secretion and insulin resistance. Genome-wide gene expression profiling systems can dissect the genetic contributions to metabolome and transcriptome regulations. The integrative analysis of multiple gene expression traits and metabotypes together with their underlying genetic regulation remains a challenge. Here, we introduce a systems genetics approach based on the topological analysis of a combined molecular network made of genes and metabolites identified through expression and metabotype quantitative trait locus mapping (i.e., eQTL and mQTL) to prioritise biological characterisation of candidate genes and traits. We used systematic metabotyping by 1H NMR spectroscopy and genome-wide gene expression in white adipose tissue to map molecular phenotypes to genomic blocks associated with obesity and insulin secretion in a series of rat congenic strains derived from spontaneously diabetic Goto-Kakizaki (GK) and normoglycemic Brown-Norway (BN) rats. We implemented a network biology strategy to visualize the shortest paths between metabolites and genes significantly associated with each genomic block. Despite strong genomic similarities (95-99%) among congenics, each strain exhibited specific patterns of gene expression and metabotypes, reflecting the metabolic consequences of a series of linked genetic polymorphisms in the congenic intervals. We subsequently used the congenic panel to map quantitative trait loci underlying specific mQTLs and genome-wide eQTLs. Variation in key metabolites like glucose, succinate, lactate, or 3-hydroxybutyrate and second messenger precursors like inositol was associated with several independent genomic intervals, indicating functional redundancy in these regions.
To navigate through the complexity of these association networks we mapped candidate genes and metabolites onto metabolic pathways and implemented a shortest path strategy to highlight potential mechanistic links between metabolites and transcripts at colocalized mQTLs and eQTLs. Minimizing the shortest path length drove prioritization of biological validations by gene silencing. These results underline the importance of network-based integration of multilevel systems genetics datasets to improve understanding of the genetic architecture of metabotype and transcriptomic regulation and to characterize novel functional roles for genes determining tissue-specific metabolism.
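The shortest path strategy described above can be sketched as a breadth-first search on an unweighted network that mixes metabolite and gene nodes. This is an illustration only: the graph, the node names (metabolite_A, gene_1 and so on) and the choice of an unweighted search are assumptions, not details taken from the study.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Breadth-first search; returns one shortest path as a node list, or None."""
    previous = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical pathway fragment linking a metabolite to a transcript.
toy_graph = {
    "glucose": ["metabolite_A", "metabolite_B"],
    "metabolite_A": ["glucose", "gene_1"],
    "metabolite_B": ["glucose", "metabolite_C"],
    "metabolite_C": ["metabolite_B", "gene_1"],
    "gene_1": ["metabolite_A", "metabolite_C"],
}
toy_path = shortest_path(toy_graph, "glucose", "gene_1")
print(toy_path, len(toy_path) - 1)
```

Ranking candidate metabolite and gene pairs by path length (number of edges) is one simple way to prioritise the pairs that are most directly connected.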
Gene expression variability in human hepatic drug metabolizing enzymes and transporters.
Yang, Lun; Price, Elvin T; Chang, Ching-Wei; Li, Yan; Huang, Ying; Guo, Li-Wu; Guo, Yongli; Kaput, Jim; Shi, Leming; Ning, Baitang
2013-01-01
Interindividual variability in the expression of drug-metabolizing enzymes and transporters (DMETs) in human liver may contribute to interindividual differences in drug efficacy and adverse reactions. Published studies that analyzed variability in the expression of DMET genes were limited by sample sizes and the number of genes profiled. We systematically analyzed the expression of 374 DMETs from a microarray data set consisting of gene expression profiles derived from 427 human liver samples. The standard deviation of interindividual expression for DMET genes was much higher than that for non-DMET genes. The 20 DMET genes with the largest variability in expression provided examples of the interindividual variation. Gene expression data were also analyzed using network analysis methods, which delineated the similarities of biological functionalities and regulation mechanisms for these highly variable DMET genes. Expression variability of human hepatic DMET genes may affect drug-gene interactions and disease susceptibility, with concomitant clinical implications.
Detection of gene communities in multi-networks reveals cancer drivers
NASA Astrophysics Data System (ADS)
Cantini, Laura; Medico, Enzo; Fortunato, Santo; Caselle, Michele
2015-12-01
We propose a new multi-network-based strategy to integrate different layers of genomic information and use them in a coordinated way to identify driving cancer genes. The multi-networks that we consider combine transcription factor co-targeting, microRNA co-targeting, protein-protein interaction and gene co-expression networks. The rationale behind this choice is that gene co-expression and protein-protein interactions require a tight coregulation of the partners and that such fine-tuned regulation can be obtained only by combining both the transcriptional and post-transcriptional layers of regulation. To extract the relevant biological information from the multi-network, we studied its partition into communities. To this end, we applied a consensus clustering algorithm based on state-of-the-art community detection methods. Although our procedure is valid in principle for any pathology, in this work we concentrate on gastric, lung, pancreas and colorectal cancer, and we identified a set of candidate driver cancer genes from the enrichment analysis of the multi-network communities. Some of them were already known oncogenes, while a few are new. The combination of the different layers of information allowed us to extract from the multi-network indications on the regulatory pattern and functional role of both the already known and the new candidate driver genes.
Poswar, Fabiano de Oliveira; Farias, Lucyana Conceição; Fraga, Carlos Alberto de Carvalho; Bambirra, Wilson; Brito-Júnior, Manoel; Sousa-Neto, Manoel Damião; Santos, Sérgio Henrique Souza; de Paula, Alfredo Maurício Batista; D'Angelo, Marcos Flávio Silveira Vasconcelos; Guimarães, André Luiz Sena
2015-06-01
Bioinformatics has emerged as an important tool to analyze the large amount of data generated by research in different diseases. In this study, gene expression for radicular cysts (RCs) and periapical granulomas (PGs) was characterized based on a leader gene approach. A validated bioinformatics algorithm was applied to identify leader genes for RCs and PGs. Genes related to RCs and PGs were first identified in PubMed, GenBank, GeneAtlas, and GeneCards databases. The Web-available STRING software (The European Molecular Biology Laboratory [EMBL], Heidelberg, Baden-Württemberg, Germany) was used in order to build the interaction map among the identified genes by a significance score named weighted number of links. Based on the weighted number of links, genes were clustered using k-means. The genes in the highest cluster were considered leader genes. Multilayer perceptron neural network analysis was used as a complementary supplement for gene classification. For RCs, the suggested leader genes were TP53 and EP300, whereas PGs were associated with IL2RG, CCL2, CCL4, CCL5, CCR1, CCR3, and CCR5 genes. Our data revealed different gene expression for RCs and PGs, suggesting that not only the inflammatory nature but also other biological processes might differentiate RCs and PGs. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
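The clustering step described above, grouping genes by their weighted number of links and taking the highest cluster as leader genes, can be sketched with a one-dimensional k-means. This is a minimal illustration under stated assumptions: the gene names and scores are hypothetical, the deterministic initialisation is a simple choice made here, and the number of clusters used in the study is not given in the abstract.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's k-means on scalars (requires k >= 2)."""
    ordered = sorted(values)
    # Initial centers: k evenly spaced order statistics (an assumed, simple choice).
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers


def leader_genes(weighted_links, k=2):
    """Genes assigned to the cluster with the highest center."""
    centers = kmeans_1d(list(weighted_links.values()), k)
    top = max(range(k), key=lambda i: centers[i])
    return sorted(
        gene for gene, v in weighted_links.items()
        if min(range(k), key=lambda i: abs(v - centers[i])) == top
    )


# Hypothetical weighted-number-of-links scores, not values from the study.
toy_links = {"gene_1": 9.1, "gene_2": 8.4, "gene_3": 2.0, "gene_4": 1.5, "gene_5": 0.7}
toy_leaders = leader_genes(toy_links)
print(toy_leaders)
```

Because the scores are scalars, the clusters are contiguous ranges, so the top cluster is effectively a data-driven cut-off on the weighted number of links.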
Dubovenko, Alexey; Serebryiskaya, Tatiana; Nikolsky, Yuri; Nikolskaya, Tatiana; Perlina, Ally; JeBailey, Lellean; Bureeva, Svetlana; Katta, Shilpa; Srivastava, Shiv; Dobi, Albert; Khasanova, Tatiana
2015-01-01
Background: Despite a growing number of studies evaluating cancer of prostate (CaP) specific gene alterations, oncogenic activation of the ETS Related Gene (ERG) by gene fusions remains the most validated cancer gene alteration in CaP. Prevalent gene fusions have been described between the ERG gene and promoter upstream sequences of androgen-inducible genes, predominantly TMPRSS2 (transmembrane protease serine 2). Despite the extensive evaluations of ERG genomic rearrangements, fusion transcripts and the ERG oncoprotein, the prognostic value of ERG remains to be better understood. Using a gene expression dataset from matched prostate tumor and normal epithelial cells from an 80 GeneChip experiment examining 40 tumors and their matching normal pairs in 40 patients with known ERG status, we conducted a cancer signaling-focused functional analysis of prostatic carcinoma representing moderate and aggressive cancers stratified by ERG expression. Results: In the present study of matched pairs of laser capture microdissected normal epithelial cells and well-to-moderately differentiated tumor epithelial cells with known ERG gene expression status from 20 patients with localized prostate cancer, we have discovered novel ERG-associated biochemical networks. Conclusions: Using causal network reconstruction methods, we have identified three major signaling pathways related to the MAPK/PI3K cascade that may contribute synergistically to ERG-dependent tumor development. Moreover, the key components of these pathways have potential as biomarkers and therapeutic targets for ERG-positive prostate tumors. PMID:26000039
Munger, Steven C.; Aylor, David L.; Syed, Haider Ali; Magwene, Paul M.; Threadgill, David W.; Capel, Blanche
2009-01-01
Despite the identification of some key genes that regulate sex determination, most cases of disorders of sexual development remain unexplained. Evidence suggests that the sexual fate decision in the developing gonad depends on a complex network of interacting factors that converge on a critical threshold. To elucidate the transcriptional network underlying sex determination, we took the first expression quantitative trait loci (eQTL) approach in a developing organ. We identified reproducible differences in the transcriptome of the embryonic day 11.5 (E11.5) XY gonad between C57BL/6J (B6) and 129S1/SvImJ (129S1), indicating that the reported sensitivity of B6 to sex reversal is consistent with a higher expression of a female-like transcriptome in B6. Gene expression is highly variable in F2 XY gonads from B6 and 129S1 intercrosses, yet strong correlations emerged. We estimated the F2 coexpression network and predicted roles for genes of unknown function based on their connectivity and position within the network. A genetic analysis of the F2 population detected autosomal regions that control the expression of many sex-related genes, including Sry (sex-determining region of the Y chromosome) and Sox9 (Sry-box containing gene 9), the key regulators of male sex determination. Our results reveal the complex transcription architecture underlying sex determination, and provide a mechanism by which individuals may be sensitized for sex reversal. PMID:19884258
Network Security via Biometric Recognition of Patterns of Gene Expression
NASA Technical Reports Server (NTRS)
Shaw, Harry C.
2016-01-01
Molecular biology provides the ability to implement forms of information and network security completely outside the bounds of legacy security protocols and algorithms. This paper addresses an approach which instantiates the power of gene expression for security. Molecular biology provides a rich source of gene expression and regulation mechanisms, which can be adopted to use in the information and electronic communication domains. Conventional security protocols are becoming increasingly vulnerable due to more intensive, highly capable attacks on the underlying mathematics of cryptography. Security protocols are being undermined by social engineering and substandard implementations by IT organizations. Molecular biology can provide countermeasures to these weak points with the current security approaches. Future advances in instruments for analyzing assays will also enable this protocol to advance from one of cryptographic algorithms to an integrated system of cryptographic algorithms and real-time expression and assay of gene expression products.
Pan, Yu; Bradley, Glyn; Pyke, Kevin; Ball, Graham; Lu, Chungui; Fray, Rupert; Marshall, Alexandra; Jayasuta, Subhalai; Baxter, Charles; van Wijk, Rik; Boyden, Laurie; Cade, Rebecca; Chapman, Natalie H; Fraser, Paul D; Hodgman, Charlie; Seymour, Graham B
2013-03-01
Carotenoids represent some of the most important secondary metabolites in the human diet, and tomato (Solanum lycopersicum) is a rich source of these health-promoting compounds. In this work, a novel and fruit-related regulator of pigment accumulation in tomato has been identified by artificial neural network inference analysis and its function validated in transgenic plants. A tomato fruit gene regulatory network was generated using artificial neural network inference analysis and transcription factor gene expression profiles derived from fruits sampled at various points during development and ripening. One of the transcription factor gene expression profiles with a sequence related to an Arabidopsis (Arabidopsis thaliana) ARABIDOPSIS PSEUDO RESPONSE REGULATOR2-LIKE gene (APRR2-Like) was up-regulated at the breaker stage in wild-type tomato fruits and, when overexpressed in transgenic lines, increased plastid number, area, and pigment content, enhancing the levels of chlorophyll in immature unripe fruits and carotenoids in red ripe fruits. Analysis of the transcriptome of transgenic lines overexpressing the tomato APRR2-Like gene revealed up-regulation of several ripening-related genes in the overexpression lines, providing a link between the expression of this tomato gene and the ripening process. A putative ortholog of the tomato APRR2-Like gene in sweet pepper (Capsicum annuum) was associated with pigment accumulation in fruit tissues. We conclude that the function of this gene is conserved across taxa and that it encodes a protein that has an important role in ripening.
Martyniuk, Christopher J.; Prucha, Melinda S.; Doperalski, Nicholas J.; Antczak, Philipp; Kroll, Kevin J.; Falciani, Francesco; Barber, David S.; Denslow, Nancy D.
2013-01-01
Background: Oocyte maturation in fish involves numerous cell signaling cascades that are activated or inhibited during specific stages of oocyte development. The objectives of this study were to characterize molecular pathways and temporal gene expression patterns throughout a complete breeding cycle in wild female largemouth bass to improve understanding of the molecular sequence of events underlying oocyte maturation. Methods: Transcriptomic analysis was performed on eight morphologically diverse stages of the ovary, including primary and secondary stages of oocyte growth, ovulation, and atresia. Ovary histology, plasma vitellogenin, 17β-estradiol, and testosterone were also measured to correlate with gene networks. Results: Global expression patterns revealed dramatic differences across ovarian development, with 552 and 2070 genes being differentially expressed during ovulation and atresia, respectively. Gene set enrichment analysis (GSEA) revealed that early primary stages of oocyte growth involved increases in expression of genes involved in pathways of B-cell and T-cell receptor-mediated signaling cascades and fibronectin regulation. These pathways, as well as pathways that included adrenergic receptor signaling, sphingolipid metabolism and natural killer cell activation, were down-regulated at ovulation. At atresia, down-regulated pathways included gap junction and actin cytoskeleton regulation, gonadotrope and mast cell activation, and vasopressin receptor signaling, and up-regulated pathways included oxidative phosphorylation and reactive oxygen species metabolism. Expression targets for luteinizing hormone signaling were low during vitellogenesis but increased 150% at ovulation.
Other networks found to play a significant role in oocyte maturation included those with genes regulated by members of the TGF-beta superfamily (activins, inhibins, bone morphogenic protein 7 and growth differentiation factor 9), neuregulin 1, retinoid X receptor, and nerve growth factor family. Conclusions: This study offers novel insight into the gene networks underlying vitellogenesis, ovulation and atresia and generates new hypotheses about the cellular pathways regulating oocyte maturation. PMID:23527095
Listening to the noise: random fluctuations reveal gene network parameters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Munsky, Brian; Khammash, Mustafa
2009-01-01
The cellular environment is abuzz with noise. The origin of this noise is attributed to the inherent random motion of reacting molecules that take part in gene expression and post expression interactions. In this noisy environment, clonal populations of cells exhibit cell-to-cell variability that frequently manifests as significant phenotypic differences within the cellular population. The stochastic fluctuations in cellular constituents induced by noise can be measured and their statistics quantified. We show that these random fluctuations carry within them valuable information about the underlying genetic network. Far from being a nuisance, the ever-present cellular noise acts as a rich source of excitation that, when processed through a gene network, carries its distinctive fingerprint that encodes a wealth of information about that network. We demonstrate that in some cases the analysis of these random fluctuations enables the full identification of network parameters, including those that may otherwise be difficult to measure. This establishes a potentially powerful approach for the identification of gene networks and offers a new window into the workings of these networks.
Musungu, Bryan M.; Bhatnagar, Deepak; Brown, Robert L.; Payne, Gary A.; OBrian, Greg; Fakhoury, Ahmad M.; Geisler, Matt
2016-01-01
A gene co-expression network (GEN) was generated using a dual RNA-seq study with the fungal pathogen Aspergillus flavus and its plant host Zea mays during the initial 3 days of infection. The analysis deciphered novel pathways and mapped genes of interest in both organisms during the infection. This network revealed a high degree of connectivity in many of the previously recognized pathways in Z. mays such as jasmonic acid, ethylene, and reactive oxygen species (ROS). For the pathogen A. flavus, a link between aflatoxin production and vesicular transport was identified within the network. There was significant interspecies correlation of expression between Z. mays and A. flavus for a subset of 104 Z. mays, and 1942 A. flavus genes. This resulted in an interspecies subnetwork enriched in multiple Z. mays genes involved in the production of ROS. In addition to the ROS from Z. mays, there was enrichment in the vesicular transport pathways and the aflatoxin pathway for A. flavus. Included in these genes, a key aflatoxin cluster regulator, AflS, was found to be co-regulated with multiple Z. mays ROS producing genes within the network, suggesting AflS may be monitoring host ROS levels. The entire GEN for both host and pathogen, and the subset of interspecies correlations, is presented as a tool for hypothesis generation and discovery for events in the early stages of fungal infection of Z. mays by A. flavus. PMID:27917194
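A co-expression network of the kind described above can be sketched by correlating expression profiles across samples and keeping strongly correlated pairs as edges, including host-pathogen pairs. This is an illustration only: the abstract does not state which correlation measure or cut-off was used, so the Pearson coefficient, the 0.9 threshold, the gene names and the expression values below are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene_a, gene_b, r) for every pair with |r| >= threshold."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, round(r, 3)))
    return edges


# Hypothetical expression values over four time points of infection.
toy_expression = {
    "host_gene_1": [1.0, 2.0, 3.0, 4.0],
    "host_gene_2": [5.0, 1.0, 4.0, 2.0],
    "pathogen_gene_1": [2.0, 4.0, 6.0, 8.5],
}
toy_edges = coexpression_edges(toy_expression)
print(toy_edges)
```

Edges whose two genes come from different organisms form the interspecies subnetwork. With only a few time points such correlations are noisy, so in practice a significance filter would be needed on top of the raw threshold.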
Cánovas, Angela; Reverter, Antonio; DeAtley, Kasey L.; Ashley, Ryan L.; Colgrave, Michelle L.; Fortes, Marina R. S.; Islas-Trejo, Alma; Lehnert, Sigrid; Porto-Neto, Laercio; Rincón, Gonzalo; Silver, Gail A.; Snelling, Warren M.; Medrano, Juan F.; Thomas, Milton G.
2014-01-01
Puberty is a complex physiological event by which animals mature into adults capable of sexual reproduction. In order to enhance our understanding of the genes and regulatory pathways and networks involved in puberty, we characterized the transcriptome of five reproductive tissues (i.e., hypothalamus, pituitary gland, ovary, uterus, and endometrium) as well as tissues known to be relevant to growth and metabolism needed to achieve puberty (i.e., longissimus dorsi muscle, adipose, and liver). These tissues were collected from pre- and post-pubertal Brangus heifers (3/8 Brahman; Bos indicus x 5/8 Angus; Bos taurus) derived from a population of cattle used to identify quantitative trait loci associated with fertility traits (i.e., age of first observed corpus luteum (ACL), first service conception (FSC), and heifer pregnancy (HPG)). In order to exploit the power of complementary omics analyses, pre- and post-puberty co-expression gene networks were constructed by combining the results from genome-wide association studies (GWAS), RNA-Seq, and bovine transcription factors. Eight tissues among pre-pubertal and post-pubertal Brangus heifers revealed 1,515 differentially expressed and 943 tissue-specific genes within the 17,832 genes confirmed by RNA-Seq analysis. The hypothalamus experienced the most notable up-regulation of genes through puberty (i.e., 204 out of 275 genes). Combining the results of GWAS and RNA-Seq, we identified 25 loci containing a single nucleotide polymorphism (SNP) associated with ACL, FSC, and/or HPG. Seventeen of these SNPs were within a gene and 13 of the genes were expressed in uterus or endometrium. Multi-tissue omics analyses revealed 2,450 co-expressed genes relative to puberty. The pre-pubertal network had 372,861 connections whereas the post-pubertal network had 328,357 connections. A sub-network from this process revealed key transcriptional regulators (i.e., PITX2, FOXA1, DACH2, PROP1, SIX6, etc.).
Results from these multi-tissue omics analyses improve understanding of the number of genes and their complex interactions for puberty in cattle. PMID:25048735
Chen, Kai; Li, Yajie; Xu, Hui; Zhang, Chunfeng; Li, Zhiqiang; Wang, Wei; Wang, Baofeng
2017-10-20
Although many studies have examined non-small cell lung cancer (NSCLC), complete accounts of its oncogenes and their mechanisms in tumors have rarely been reported. Here, we used biological methods with known NSCLC oncogenes to find a new oncogene and explore its functional mechanism in NSCLC. The study first built an NSCLC genetic interaction network using bioinformatics methods and then combined a shortest path algorithm with significance testing to identify core genes closely associated with the given genes. Real-time qPCR was conducted to compare expression levels between patients with NSCLC and normal individuals, and the role of PARP1 in migration and invasion was assessed by transwell and wound-healing assays. The gene interaction network showed that core genes such as PARP1, EGFR and ALK interacted directly. The TCGA database showed that PARP1 was strongly expressed in NSCLC and that its expression level in metastatic NSCLC was significantly higher than in non-metastatic NSCLC. In the scratch test, NSCLC cell migration was suppressed by PARP1 silencing but noticeably stimulated by PARP1 overexpression. According to Kaplan-Meier survival curves, higher PARP1 expression was associated with poorer patient survival and prognosis; thus, PARP1 expression was negatively correlated with patient survival and prognosis. A new oncogene, PARP1, was identified from known NSCLC oncogenes by means of the gene interaction network, demonstrating the impact of PARP1 on NSCLC cell migration.
The many faces of REST oversee epigenetic programming of neuronal genes.
Ballas, Nurit; Mandel, Gail
2005-10-01
Nervous system development relies on a complex signaling network to engineer the orderly transitions that lead to the acquisition of a neural cell fate. Progression from the non-neuronal pluripotent stem cell to a restricted neural lineage is characterized by distinct patterns of gene expression, particularly the restriction of neuronal gene expression to neurons. Concurrently, cells outside the nervous system acquire and maintain a non-neuronal fate that permanently excludes expression of neuronal genes. Studies of the transcriptional repressor REST, which regulates a large network of neuronal genes, provide a paradigm for elucidating the link between epigenetic mechanisms and neurogenesis. REST orchestrates a set of epigenetic modifications that are distinct between non-neuronal cells that give rise to neurons and those that are destined to remain as nervous system outsiders.
eXpression2Kinases (X2K) Web: linking expression signatures to upstream cell signaling networks.
Clarke, Daniel J B; Kuleshov, Maxim V; Schilder, Brian M; Torre, Denis; Duffy, Mary E; Keenan, Alexandra B; Lachmann, Alexander; Feldmann, Axel S; Gundersen, Gregory W; Silverstein, Moshe C; Wang, Zichen; Ma'ayan, Avi
2018-05-25
While gene expression data at the mRNA level can be globally and accurately measured, profiling the activity of cell signaling pathways is currently much more difficult. eXpression2Kinases (X2K) computationally predicts involvement of upstream cell signaling pathways, given a signature of differentially expressed genes. X2K first computes enrichment for transcription factors likely to regulate the expression of the differentially expressed genes. The next step of X2K connects these enriched transcription factors through known protein-protein interactions (PPIs) to construct a subnetwork. The final step performs kinase enrichment analysis on the members of the subnetwork. X2K Web is a new implementation of the original eXpression2Kinases algorithm with important enhancements. X2K Web includes many new transcription factor and kinase libraries, and PPI networks. For demonstration, thousands of gene expression signatures induced by kinase inhibitors, applied to six breast cancer cell lines, are provided for fetching directly into X2K Web. The results are displayed as interactive downloadable vector graphic network images and bar graphs. Benchmarking various settings via random permutations enabled the identification of an optimal set of parameters to be used as the default settings in X2K Web. X2K Web is freely available from http://X2K.cloud.
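The enrichment steps described above (scoring transcription factors, and later kinases, by how strongly their known targets overlap an input gene list) can be illustrated with a hypergeometric over-representation test. This is a sketch under stated assumptions: the abstract does not say which statistic X2K uses, and the library contents, gene names and background size below are hypothetical.

```python
from math import comb


def enrichment_p_value(signature, gene_set, background_size):
    """P(overlap >= observed) under the hypergeometric null.

    Assumes both lists are drawn from one background of `background_size` genes.
    """
    signature, gene_set = set(signature), set(gene_set)
    observed = len(signature & gene_set)
    n, k = len(signature), len(gene_set)
    tail = sum(
        comb(k, i) * comb(background_size - k, n - i)
        for i in range(observed, min(n, k) + 1)
    )
    return tail / comb(background_size, n)


def rank_regulators(signature, library, background_size):
    """Sort regulators by ascending p-value (most enriched first)."""
    results = [
        (name, enrichment_p_value(signature, targets, background_size))
        for name, targets in library.items()
    ]
    return sorted(results, key=lambda item: item[1])


# Hypothetical signature and target library over a 20-gene background.
toy_signature = ["g1", "g2", "g3", "g4", "g5"]
toy_library = {
    "TF_A": ["g1", "g2", "g3", "g4", "g10"],
    "TF_B": ["g5", "g11", "g12", "g13", "g14"],
}
toy_ranking = rank_regulators(toy_signature, toy_library, background_size=20)
print(toy_ranking)
```

The same function could be reused for the final step by swapping the library for kinase-substrate sets and the signature for the members of the protein-protein interaction subnetwork.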
Crocker, Amanda; Guan, Xiao-Juan; Murphy, Coleen T; Murthy, Mala
2016-05-17
Learning and memory formation in Drosophila rely on a network of neurons in the mushroom bodies (MBs). Whereas numerous studies have delineated roles for individual cell types within this network in aspects of learning or memory, whether or not these cells can also be distinguished by the genes they express remains unresolved. In addition, the changes in gene expression that accompany long-term memory formation within the MBs have not yet been studied by neuron type. Here, we address both issues by performing RNA sequencing on single cell types (harvested via patch pipets) within the MB. We discover that the expression of genes that encode cell surface receptors is sufficient to identify cell types and that a subset of these genes, required for sensory transduction in peripheral sensory neurons, is not only expressed within individual neurons of the MB in the central brain, but is also critical for memory formation. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
Liang, Yajun; Wu, Heng; Lei, Rong; Chong, Robert A.; Wei, Yong; Lu, Xin; Tagkopoulos, Ilias; Kung, Sun-Yuan; Yang, Qifeng; Hu, Guohong; Kang, Yibin
2012-01-01
The application of functional genomic analysis of breast cancer metastasis has led to the identification of a growing number of organ-specific metastasis genes, which often function in concert to facilitate different steps of the metastatic cascade. However, the gene regulatory network that controls the expression of these metastasis genes remains largely unknown. Here, we demonstrate a computational approach for the deconvolution of transcriptional networks to discover master regulators of breast cancer bone metastasis. Several known regulators of breast cancer bone metastasis such as Smad4 and HIF1 were identified in our analysis. Experimental validation of the networks revealed BACH1, a basic leucine zipper transcription factor, as the common regulator of several functional metastasis genes, including MMP1 and CXCR4. Ectopic expression of BACH1 enhanced the malignancy of breast cancer cells, and conversely, BACH1 knockdown significantly reduced bone metastasis. The expression of BACH1 and its target genes was linked to a higher risk of breast cancer recurrence in patients. This study established BACH1 as the master regulator of breast cancer bone metastasis and provided a paradigm to identify molecular determinants in complex pathological processes. PMID:22875853
Genes uniquely expressed in human growth plate chondrocytes uncover a distinct regulatory network.
Li, Bing; Balasubramanian, Karthika; Krakow, Deborah; Cohn, Daniel H
2017-12-20
Chondrogenesis is the earliest stage of skeletal development and is a highly dynamic process, integrating the activities and functions of transcription factors, cell signaling molecules and extracellular matrix proteins. The molecular mechanisms underlying chondrogenesis have been extensively studied and multiple key regulators of this process have been identified. However, a genome-wide overview of the gene regulatory network in chondrogenesis has not been achieved. In this study, employing RNA sequencing, we identified 332 protein coding genes and 34 long non-coding RNA (lncRNA) genes that are highly selectively expressed in human fetal growth plate chondrocytes. Among the protein coding genes, 32 genes were associated with 62 distinct human skeletal disorders and 153 genes were associated with skeletal defects in knockout mice, confirming their essential roles in skeletal formation. These gene products formed a comprehensive physical interaction network and participated in multiple cellular processes regulating skeletal development. The data also revealed 34 transcription factors and 11,334 distal enhancers that were uniquely active in chondrocytes, functioning as transcriptional regulators for the cartilage-selective genes. Our findings revealed a complex gene regulatory network controlling skeletal development whereby transcription factors, enhancers and lncRNAs participate in chondrogenesis by transcriptional regulation of key genes. Additionally, the cartilage-selective genes represent candidate genes for unsolved human skeletal disorders.
GSNFS: Gene subnetwork biomarker identification of lung cancer expression data.
Doungpan, Narumol; Engchuan, Worrawat; Chan, Jonathan H; Meechai, Asawin
2016-12-05
Gene expression has been used to identify disease gene biomarkers, but there are ongoing challenges. Single gene or gene-set biomarkers are inadequate to provide sufficient understanding of complex disease mechanisms and the relationship among those genes. Network-based methods have thus been considered for inferring the interaction within a group of genes to further study the disease mechanism. Recently, the Gene-Network-based Feature Set (GNFS), which is capable of handling case-control and multiclass expression for gene biomarker identification, has been proposed, partly taking network topology into account. However, its performance relies on a greedy search for building subnetworks and thus requires further improvement. In this work, we establish a new approach named Gene Sub-Network-based Feature Selection (GSNFS) by implementing the GNFS framework with two proposed searching and scoring algorithms, namely gene-set-based (GS) search and parent-node-based (PN) search, to identify subnetworks. An additional dataset is used to validate the results. The two proposed searching algorithms of the GSNFS method for subnetwork expansion are concerned with the degree of connectivity and the scoring scheme for building subnetworks and their topology. In each iteration of expansion, the neighbour genes of the current subnetwork whose expression data improve the overall subnetwork score are recruited. While the GS search calculates the subnetwork score using an activity score of the current subnetwork and the gene expression values of its neighbours, the PN search uses the expression value of the corresponding parent of each neighbour gene. Four lung cancer expression datasets were used for subnetwork identification. In addition, the use of pathway data and protein-protein interactions as network data, in order to consider the interaction among significant genes, was discussed.
Classification was performed to compare the performance of the identified gene subnetworks with three subnetwork identification algorithms. The two searching algorithms resulted in better classification and gene/gene-set agreement compared to the original greedy search of the GNFS method. The identified lung cancer subnetwork using the proposed searching algorithm resulted in an improvement of the cross-dataset validation and an increase in the consistency of findings between two independent datasets. The homogeneity measurement of the datasets was conducted to assess dataset compatibility in cross-dataset validation. The lung cancer dataset with higher homogeneity showed a better result when using the GS search while the dataset with low homogeneity showed a better result when using the PN search. The 10-fold cross-dataset validation on the independent lung cancer datasets showed higher classification performance of the proposed algorithms when compared with the greedy search in the original GNFS method. The proposed searching algorithms provide a higher number of genes in the subnetwork expansion step than the greedy algorithm. As a result, the performance of the subnetworks identified from the GSNFS method was improved in terms of classification performance and gene/gene-set level agreement depending on the homogeneity of the datasets used in the analysis. Some common genes obtained from the four datasets using different searching algorithms are genes known to play a role in lung cancer. The improvement of classification performance and the gene/gene-set level agreement, and the biological relevance indicated the effectiveness of the GSNFS method for gene subnetwork identification using expression data.
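The expansion step described above (recruit a neighbour gene only if it improves the subnetwork score) can be sketched as a generic greedy loop. This is an illustration under stated assumptions, not the GS or PN search of GSNFS: the scoring function (absolute difference in mean subnetwork activity between cases and controls), the function names and the data layout are all hypothetical.

```python
from statistics import mean


def activity_score(genes, expr, labels):
    """Hypothetical score: per-sample activity is the mean expression of the
    subnetwork genes; the score is |mean(cases) - mean(controls)|."""
    activity = [mean(expr[g][i] for g in genes) for i in range(len(labels))]
    cases = [a for a, y in zip(activity, labels) if y == 1]
    controls = [a for a, y in zip(activity, labels) if y == 0]
    return abs(mean(cases) - mean(controls))


def expand_subnetwork(seed, graph, expr, labels):
    """Grow a subnetwork from `seed`, adding a neighbour only when it
    strictly improves the score; stop when no neighbour helps."""
    sub = {seed}
    best = activity_score(sub, expr, labels)
    improved = True
    while improved:
        improved = False
        # Genes adjacent to the current subnetwork but not yet in it.
        neighbours = {n for g in sub for n in graph.get(g, ())} - sub
        for n in sorted(neighbours):
            score = activity_score(sub | {n}, expr, labels)
            if score > best:
                sub.add(n)
                best = score
                improved = True
    return sub, best


# Toy data: adjacency lists, expression per gene across 4 samples,
# and labels (1 = case, 0 = control). All values are made up.
toy_graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
toy_expr = {"A": [2, 2, 0, 0], "B": [4, 4, 0, 0], "C": [1, 1, 1, 1]}
toy_labels = [1, 1, 0, 0]
subnetwork, score = expand_subnetwork("A", toy_graph, toy_expr, toy_labels)
```

In the toy data, gene B sharpens the case-control contrast and is recruited, while gene C is uninformative and is left out.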
Zhou, Lei-Lei; Xu, Xiao-Yue; Ni, Jie; Zhao, Xia; Zhou, Jian-Wei; Feng, Ji-Feng
2018-06-01
Due to the low incidence and the heterogeneity of subtypes, the biological process of T-cell lymphomas is largely unknown. Although many genes have been detected in T-cell lymphomas, the role of these genes in the biological processes of T-cell lymphomas has not been further analyzed. Two qualified datasets were downloaded from the Gene Expression Omnibus database. The biological functions of differentially expressed genes were evaluated by gene ontology enrichment and KEGG pathway analysis. The network for intersection genes was constructed with Cytoscape v3.0. Kaplan-Meier survival curves and the log-rank test were employed to assess the association between differentially expressed genes and clinical characteristics. The intersection mRNAs were shown to be associated with fundamental processes of T-cell lymphoma cells. These intersection mRNAs were involved in the activation of some cancer-related pathways, including the PI3K/AKT, Ras, JAK-STAT, and NF-kappa B signaling pathways. PDGFRA, CXCL12, and CCL19 were the most significant central genes in the signal-net analysis. The results of survival analysis are not entirely credible. Our findings uncovered aberrantly expressed genes and a complex RNA signal network in T-cell lymphomas and indicated cancer-related pathways involved in disease initiation and progression, providing a new insight for biotargeted therapy in T-cell lymphomas. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
The purpose of this study was to develop a method of classifying cancers to specific diagnostic categories based on their gene expression signatures using artificial neural networks (ANNs). We trained the ANNs using the small, round blue-cell tumors (SRBCTs) as a model. These cancers belong to four distinct diagnostic categories and often present diagnostic dilemmas in
Identification of Cell Cycle-Regulated Genes by Convolutional Neural Network.
Liu, Chenglin; Cui, Peng; Huang, Tao
2017-01-01
Cell cycle-regulated genes are expressed periodically with the cell cycle stages, and the identification and study of these genes can provide a deep understanding of the cell cycle process. High false-positive rates and low overlap are major problems in cell cycle-regulated gene detection. Here, a computational framework called DLGene was proposed for cell cycle-regulated gene detection. It is based on the convolutional neural network, a deep learning algorithm that represents raw data patterns without assumptions about their distribution. First, the expression data were transformed into categorical state data to denote the changing state of gene expression, and four different expression patterns were revealed for the reported cell cycle-regulated genes. Then, DLGene was applied to discriminate non-cell cycle genes from the four subtypes of cell cycle genes. Its performance was compared with that of six traditional machine learning methods. Finally, the biological functions of representative cell cycle genes for each subtype were analyzed. Our method showed better and more balanced sensitivity and specificity than the other machine learning algorithms. The cell cycle genes had expression patterns very different from those of non-cell cycle genes, and among the cell cycle genes there were four subtypes. Our method not only detects the cell cycle genes, but also describes their expression patterns, such as when the highest expression level is reached and how it changes with time. For each type, we analyzed the biological functions of the representative genes, and the results provided novel insight into the cell cycle mechanisms. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Oh, Sunghee; Song, Seongho
2017-01-01
In gene expression profiling, the data analysis pipeline is categorized into four levels of major downstream tasks, i.e., (1) identification of differential expression; (2) clustering of co-expression patterns; (3) classification of subtypes of samples; and (4) detection of genetic regulatory networks, all of which are performed after preprocessing procedures such as normalization. More specifically, temporal dynamic gene expression data have an inherent feature: two neighboring time points (previous and current state) are highly correlated with each other, in contrast to static expression data, in which samples are assumed to be independent individuals. In this chapter, we demonstrate how HMMs and hierarchical Bayesian modeling methods capture the horizontal time dependency structures in time series expression profiles by focusing on the identification of differential expression. In addition, the differentially expressed genes and transcript variant isoforms detected over time in these core prerequisite steps can be further applied, as a coupled framework, to the detection of genetic regulatory networks in order to comprehensively uncover dynamic repertoires from a systems biology perspective.
Verdugo, Ricardo A.; Zeller, Tanja; Rotival, Maxime; Wild, Philipp S.; Münzel, Thomas; Lackner, Karl J.; Weidmann, Henri; Ninio, Ewa; Trégouët, David-Alexandre; Cambien, François; Blankenberg, Stefan; Tiret, Laurence
2013-01-01
Smoking is a risk factor for atherosclerosis with reported widespread effects on gene expression in circulating blood cells. We hypothesized that a molecular signature mediating the relation between smoking and atherosclerosis may be found in the transcriptome of circulating monocytes. Genome-wide expression profiles and counts of atherosclerotic plaques in carotid arteries were collected in 248 smokers and 688 non-smokers from the general population. Patterns of co-expressed genes were identified by Independent Component Analysis (ICA) and network structure of the pattern-specific gene modules was inferred by the PC-algorithm. A likelihood-based causality test was implemented to select patterns that fit models containing a path “smoking→gene expression→plaques”. Robustness of the causal inference was assessed by bootstrapping. At an FDR ≤0.10, 3,368 genes were associated with smoking or plaques, of which 93% were associated with smoking only. SASH1 showed the strongest association with smoking and PPARG the strongest association with plaques. Twenty-nine gene patterns were identified by ICA. Modules containing SASH1 and PPARG did not show evidence for the “smoking→gene expression→plaques” causality model. Conversely, three modules had good support for causal effects and exhibited a network topology consistent with gene expression mediating the relation between smoking and plaques. The network with the strongest support for causal effects was connected to plaques through SLC39A8, a gene with known association with HDL-cholesterol and cellular uptake of cadmium from tobacco, while smoking was directly connected to GAS6, a gene reported to have anti-inflammatory effects in atherosclerosis and to be up-regulated in the placenta of women smoking during pregnancy. Our analysis of the transcriptome of monocytes recovered genes relevant for association with smoking and atherosclerosis, and connected genes that were previously studied only in separate contexts.
Inspection of correlation structure revealed candidates that would be missed by expression-phenotype association analysis alone. PMID:23372645
Zhang, Yuji
2015-01-01
Molecular networks act as the backbone of molecular activities within cells, offering a unique opportunity to better understand the mechanism of diseases. While network data usually constitute only static network maps, integrating them with time course gene expression information can provide clues to the dynamic features of these networks and unravel the mechanistic driver genes characterizing cellular responses. Time course gene expression data allow us to broadly "watch" the dynamics of the system. However, one challenge in the analysis of such data is to establish and characterize the interplay among genes that are altered at different time points in the context of a biological process or functional category. Integrative analysis of these data sources will lead us to a more complete understanding of how biological entities (e.g., genes and proteins) coordinately perform their biological functions in biological systems. In this paper, we introduced a novel network-based approach to extract functional knowledge from time-dependent biological processes at a system level using time course mRNA sequencing data in zebrafish embryo development. The proposed method was applied to investigate 1α, 25(OH)2D3-altered mechanisms in zebrafish embryo development. We applied the proposed method to a public zebrafish time course mRNA-Seq dataset, containing two different treatments along four time points. We constructed networks between gene ontology biological process categories, which were enriched in differentially expressed genes between consecutive time points and different conditions. The temporal propagation of 1α, 25-Dihydroxyvitamin D3-altered transcriptional changes started from a few genes that were altered initially at an earlier stage and spread to large groups of biologically coherent genes at later stages. The most notable biological processes included neuronal and retinal development and generalized stress response.
In addition, we also investigated the relationship among biological processes enriched in co-expressed genes under different conditions. The enriched biological processes include translation elongation, nucleosome assembly, and retina development. These network dynamics provide new insights into the impact of 1α, 25-Dihydroxyvitamin D3 treatment in bone and cartilage development. We developed a network-based approach to analyzing the DEGs at different time points by integrating molecular interactions and gene ontology information. These results demonstrate that the proposed approach can provide insight on the molecular mechanisms taking place in vertebrate embryo development upon treatment with 1α, 25(OH)2D3. Our approach enables the monitoring of biological processes that can serve as a basis for generating new testable hypotheses. Such network-based integration approach can be easily extended to any temporal- or condition-dependent genomic data analyses.
Estimation of Dynamic Systems for Gene Regulatory Networks from Dependent Time-Course Data.
Kim, Yoonji; Kim, Jaejik
2018-06-15
A dynamic system consisting of ordinary differential equations (ODEs) is a well-known tool for describing the dynamic nature of gene regulatory networks (GRNs), and the dynamic features of GRNs are usually captured through time-course gene expression data. Owing to high-throughput technologies, time-course gene expression data have complex structures such as heteroscedasticity, correlations between genes, and time dependence. Since gene experiments typically yield highly noisy data with small sample sizes, these complex structures should be taken into account in ODE models for a more accurate prediction of the dynamics. Hence, this study proposes an ODE model that considers such data structures, together with a fast and stable estimation method for the ODE parameters based on the generalized profiling approach with data smoothing techniques. The proposed method also provides statistical inference for the ODE estimator, and it is applied to a zebrafish retina cell network.
Network of proteins, enzymes and genes linked to biomass degradation shared by Trichoderma species.
Horta, Maria Augusta Crivelente; Filho, Jaire Alves Ferreira; Murad, Natália Faraj; de Oliveira Santos, Eidy; Dos Santos, Clelton Aparecido; Mendes, Juliano Sales; Brandão, Marcelo Mendes; Azzoni, Sindelia Freitas; de Souza, Anete Pereira
2018-01-22
Understanding relationships between genes responsible for enzymatic hydrolysis of cellulose and synergistic reactions is fundamental for improving biomass biodegradation technologies. To reveal synergistic reactions, the transcriptome, exoproteome, and enzymatic activities of extracts from Trichoderma harzianum, Trichoderma reesei and Trichoderma atroviride under biodegradation conditions were examined. This work revealed co-regulatory networks across carbohydrate-active enzyme (CAZy) genes and secreted proteins in extracts. A set of 80 proteins and respective genes that might correspond to a common system for biodegradation from the studied species were evaluated to elucidate new co-regulated genes. Differences such as one unique base pair between fungal genomes might influence enzyme-substrate binding sites and alter fungal gene expression responses, explaining the enzymatic activities specific to each species observed in the corresponding extracts. These differences are also responsible for the different architectures observed in the co-expression networks.
Insights into the Ecology and Evolution of Polyploid Plants through Network Analysis.
Gallagher, Joseph P; Grover, Corrinne E; Hu, Guanjing; Wendel, Jonathan F
2016-06-01
Polyploidy is a widespread phenomenon throughout eukaryotes, with important ecological and evolutionary consequences. Although genes operate as components of complex pathways and networks, polyploid changes in genes and gene expression have typically been evaluated as either individual genes or as a part of broad-scale analyses. Network analysis has been fruitful in associating genomic and other 'omic'-based changes with phenotype for many systems. In polyploid species, network analysis has the potential not only to facilitate a better understanding of the complex 'omic' underpinnings of phenotypic and ecological traits common to polyploidy, but also to provide novel insight into the interaction among duplicated genes and genomes. This adds perspective to the global patterns of expression (and other 'omic') change that accompany polyploidy and to the patterns of recruitment and/or loss of genes following polyploidization. While network analysis in polyploid species faces challenges common to other analyses of duplicated genomes, present technologies combined with thoughtful experimental design provide a powerful system to explore polyploid evolution. Here, we demonstrate the utility and potential of network analysis to questions pertaining to polyploidy with an example involving evolution of the transgressively superior cotton fibres found in polyploid Gossypium hirsutum. By combining network analysis with prior knowledge, we provide further insights into the role of profilins in fibre domestication and exemplify the potential for network analysis in polyploid species. © 2016 John Wiley & Sons Ltd.
Regulatory divergence between parental alleles determines gene expression patterns in hybrids.
Combes, Marie-Christine; Hueber, Yann; Dereeper, Alexis; Rialle, Stéphanie; Herrera, Juan-Carlos; Lashermes, Philippe
2015-03-29
Both hybridization and allopolyploidization generate novel phenotypes by conciliating divergent genomes and regulatory networks in the same cellular context. To understand the rewiring of gene expression in hybrids, the total expression of 21,025 genes and the allele-specific expression of over 11,000 genes were quantified in interspecific hybrids and their parental species, Coffea canephora and Coffea eugenioides, using RNA-seq technology. Between parental species, cis- and trans-regulatory divergences affected around 32% and 35% of analyzed genes, respectively, with nearly 17% of them showing both. The relative importance of trans-regulatory divergences between both species could be related to their low genetic divergence and perennial habit. In hybrids, among divergently expressed genes between parental species and hybrids, 77% were expressed like one parent (expression level dominance), including 65% like C. eugenioides. Gene expression was shown to result from the expression of both alleles affected by intertwined parental trans-regulatory factors. A strong impact of C. eugenioides trans-regulatory factors on the upregulation of C. canephora alleles was revealed. The gene expression patterns appeared to be determined by complex combinations of cis- and trans-regulatory divergences. In particular, the observed biased expression level dominance seemed to be derived from the asymmetric effects of trans-regulatory parental factors on regulation of alleles. More generally, this study illustrates the effects of divergent trans-regulatory parental factors on the gene expression pattern in hybrids. The characteristics of the transcriptional response to hybridization appear to be determined by the compatibility of gene regulatory networks and therefore depend on genetic divergences between the parental species and their evolutionary history. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
De Cegli, Rossella; Iacobacci, Simona; Flore, Gemma; Gambardella, Gennaro; Mao, Lei; Cutillo, Luisa; Lauria, Mario; Klose, Joachim; Illingworth, Elizabeth; Banfi, Sandro; di Bernardo, Diego
2013-01-01
Gene expression profiles can be used to infer previously unknown transcriptional regulatory interaction among thousands of genes, via systems biology 'reverse engineering' approaches. We 'reverse engineered' an embryonic stem (ES)-specific transcriptional network from 171 gene expression profiles, measured in ES cells, to identify master regulators of gene expression ('hubs'). We discovered that E130012A19Rik (E13), highly expressed in mouse ES cells as compared with differentiated cells, was a central 'hub' of the network. We demonstrated that E13 is a protein-coding gene implicated in regulating the commitment towards the different neuronal subtypes and glia cells. The overexpression and knock-down of E13 in ES cell lines, undergoing differentiation into neurons and glia cells, caused a strong up-regulation of the glutamatergic neurons marker Vglut2 and a strong down-regulation of the GABAergic neurons marker GAD65 and of the radial glia marker Blbp. We confirmed E13 expression in the cerebral cortex of adult mice and during development. By immuno-based affinity purification, we characterized protein partners of E13, involved in the Polycomb complex. Our results suggest a role of E13 in regulating the division between glutamatergic projection neurons and GABAergic interneurons and glia cells possibly by epigenetic-mediated transcriptional regulation.
The Transcriptome of the Reference Potato Genome Solanum tuberosum Group Phureja Clone DM1-3 516R44
Massa, Alicia N.; Childs, Kevin L.; Lin, Haining; Bryan, Glenn J.; Giuliano, Giovanni; Buell, C. Robin
2011-01-01
Advances in molecular breeding in potato have been limited by its complex biological system, which includes vegetative propagation, autotetraploidy, and extreme heterozygosity. The availability of the potato genome and accompanying gene complement with corresponding gene structure, location, and functional annotation are powerful resources for understanding this complex plant and advancing molecular breeding efforts. Here, we report a reference for the potato transcriptome using 32 tissues and growth conditions from the doubled monoploid Solanum tuberosum Group Phureja clone DM1-3 516R44 for which a genome sequence is available. Analysis of greater than 550 million RNA-Seq reads permitted the detection and quantification of expression levels of over 22,000 genes. Hierarchical clustering and principal component analyses captured the biological variability that accounts for gene expression differences among tissues suggesting tissue-specific gene expression, and genes with tissue or condition restricted expression. Using gene co-expression network analysis, we identified 18 gene modules that represent tissue-specific transcriptional networks of major potato organs and developmental stages. This information provides a powerful resource for potato research as well as studies on other members of the Solanaceae family. PMID:22046362
Sibout, Richard; Proost, Sebastian; Hansen, Bjoern Oest; Vaid, Neha; Giorgi, Federico M; Ho-Yue-Kuang, Severine; Legée, Frédéric; Cézart, Laurent; Bouchabké-Coussa, Oumaya; Soulhat, Camille; Provart, Nicholas; Pasha, Asher; Le Bris, Philippe; Roujol, David; Hofte, Herman; Jamet, Elisabeth; Lapierre, Catherine; Persson, Staffan; Mutwil, Marek
2017-08-01
While Brachypodium distachyon (Brachypodium) is an emerging model for grasses, no expression atlas or gene coexpression network is available. Such tools are of high importance to provide insights into the function of Brachypodium genes. We present a detailed Brachypodium expression atlas, capturing gene expression in its major organs at different developmental stages. The data were integrated into a large-scale coexpression database (www.gene2function.de), enabling identification of duplicated pathways and conserved processes across 10 plant species, thus allowing genome-wide inference of gene function. We highlight the importance of the atlas and the platform through the identification of duplicated cell wall modules, and show that a lignin biosynthesis module is conserved across angiosperms. We identified and functionally characterised a putative ferulate 5-hydroxylase gene through overexpression of it in Brachypodium, which resulted in an increase in lignin syringyl units and reduced lignin content of mature stems, and led to improved saccharification of the stem biomass. Our Brachypodium expression atlas thus provides a powerful resource to reveal functionally related genes, which may advance our understanding of important biological processes in grasses. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Roy, Janine; Aust, Daniela; Knösel, Thomas; Rümmele, Petra; Jahnke, Beatrix; Hentrich, Vera; Rückert, Felix; Niedergethmann, Marco; Weichert, Wilko; Bahra, Marcus; Schlitt, Hans J.; Settmacher, Utz; Friess, Helmut; Büchler, Markus; Saeger, Hans-Detlev; Schroeder, Michael; Pilarsky, Christian; Grützmann, Robert
2012-01-01
Predicting the clinical outcome of cancer patients based on the expression of marker genes in their tumors has received increasing interest in the past decade. Accurate predictors of outcome and response to therapy could be used to personalize and thereby improve therapy. However, state of the art methods used so far often found marker genes with limited prediction accuracy, limited reproducibility, and unclear biological relevance. To address this problem, we developed a novel computational approach to identify genes prognostic for outcome that couples gene expression measurements from primary tumor samples with a network of known relationships between the genes. Our approach ranks genes according to their prognostic relevance using both expression and network information in a manner similar to Google's PageRank. We applied this method to gene expression profiles which we obtained from 30 patients with pancreatic cancer, and identified seven candidate marker genes prognostic for outcome. Compared to genes found with state of the art methods, such as Pearson correlation of gene expression with survival time, we improve the prediction accuracy by up to 7%. Accuracies were assessed using support vector machine classifiers and Monte Carlo cross-validation. We then validated the prognostic value of our seven candidate markers using immunohistochemistry on an independent set of 412 pancreatic cancer samples. Notably, signatures derived from our candidate markers were independently predictive of outcome and superior to established clinical prognostic factors such as grade, tumor size, and nodal status. As the amount of genomic data of individual tumors grows rapidly, our algorithm meets the need for powerful computational approaches that are key to exploit these data for personalized cancer therapies in clinical practice. PMID:22615549
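The idea of ranking genes with both expression and network information "in a manner similar to Google's PageRank" can be sketched with a personalised PageRank iteration. This is a minimal illustration, not the authors' algorithm: using a per-gene relevance value (for example, the absolute correlation of expression with survival) as the restart distribution is an assumption, and the function name, damping factor and toy network are hypothetical.

```python
def network_rank(graph, relevance, damping=0.85, iters=100):
    """Personalised PageRank by power iteration.

    graph:     dict gene -> list of neighbour genes (every neighbour must
               also be a key of `graph`).
    relevance: dict gene -> non-negative expression-derived weight, used
               as the restart distribution (an assumption of this sketch).
    """
    genes = sorted(graph)
    total = sum(relevance[g] for g in genes)
    prior = {g: relevance[g] / total for g in genes}
    rank = dict(prior)
    for _ in range(iters):
        # Restart term: mass returned to genes in proportion to relevance.
        new = {g: (1 - damping) * prior[g] for g in genes}
        for g in genes:
            nbrs = graph[g]
            if nbrs:
                # Propagation term: each gene shares its rank with neighbours.
                share = damping * rank[g] / len(nbrs)
                for h in nbrs:
                    new[h] += share
            else:
                # Isolated gene: redistribute its rank according to the prior.
                for h in genes:
                    new[h] += damping * rank[g] * prior[h]
        rank = new
    return rank


# Toy star-shaped network with equal relevance for every gene.
toy_network = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
toy_relevance = {"hub": 1.0, "a": 1.0, "b": 1.0, "c": 1.0}
scores = network_rank(toy_network, toy_relevance)
top_gene = max(scores, key=scores.get)
```

With equal relevance everywhere, the ranking is driven by connectivity alone, so the hub comes out on top; unequal relevance values would shift rank towards genes whose expression is more strongly tied to outcome.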
Identification of Causal Genes, Networks, and Transcriptional Regulators of REM Sleep and Wake
Millstein, Joshua; Winrow, Christopher J.; Kasarskis, Andrew; Owens, Joseph R.; Zhou, Lili; Summa, Keith C.; Fitzpatrick, Karrie; Zhang, Bin; Vitaterna, Martha H.; Schadt, Eric E.; Renger, John J.; Turek, Fred W.
2011-01-01
Study Objective: Sleep-wake traits are well-known to be under substantial genetic control, but the specific genes and gene networks underlying primary sleep-wake traits have largely eluded identification using conventional approaches, especially in mammals. Thus, the aim of this study was to use systems genetics and statistical approaches to uncover the genetic networks underlying 2 primary sleep traits in the mouse: 24-h duration of REM sleep and wake. Design: Genome-wide RNA expression data from 3 tissues (anterior cortex, hypothalamus, thalamus/midbrain) were used in conjunction with high-density genotyping to identify candidate causal genes and networks mediating the effects of 2 QTL regulating the 24-h duration of REM sleep and one regulating the 24-h duration of wake. Setting: Basic sleep research laboratory. Patients or Participants: Male [C57BL/6J × (BALB/cByJ × C57BL/6J*) F1] N2 mice (n = 283). Interventions: None. Measurements and Results: The genetic variation of a mouse N2 mapping cross was leveraged against sleep-state phenotypic variation as well as quantitative gene expression measurement in key brain regions using integrative genomics approaches to uncover multiple causal sleep-state regulatory genes, including several surprising novel candidates, which interact as components of networks that modulate REM sleep and wake. In particular, it was discovered that a core network module, consisting of 20 genes, involved in the regulation of REM sleep duration is conserved across the cortex, hypothalamus, and thalamus. A novel application of a formal causal inference test was also used to identify those genes directly regulating sleep via control of expression. Conclusion: Systems genetics approaches reveal novel candidate genes, complex networks and specific transcriptional regulators of REM sleep and wake duration in mammals. 
Citation: Millstein J; Winrow CJ; Kasarskis A; Owens JR; Zhou L; Summa KC; Fitzpatrick K; Zhang B; Vitaterna MH; Schadt EE; Renger JJ; Turek FW. Identification of causal genes, networks, and transcriptional regulators of REM sleep and wake. SLEEP 2011;34(11):1469-1477. PMID:22043117
Sangha, Susan; Ilenseer, Jasmin; Sosulina, Ludmila; Lesting, Jörg; Pape, Hans-Christian
2012-04-17
Extinction reduces fear to stimuli that were once associated with an aversive event by no longer coupling the stimulus with the aversive event. Extinction learning is supported by a network comprising the amygdala, hippocampus, and prefrontal cortex. Previous studies implicate a critical role of GABA in extinction learning, specifically the GAD65 isoform of the GABA synthesizing enzyme glutamic acid decarboxylase (GAD). However, a detailed analysis of changes in gene expression of GAD in the subregions comprising the extinction network has not been undertaken. Here, we report changes in gene expression of the GAD65 and GAD67 isoforms of GAD, as measured by relative quantitative real-time RT-PCR, in subregions of the amygdala, hippocampus, and prefrontal cortex 24-26 h after extinction of a recent (1-d) or intermediate (14-d) fear memory. Our results show that extinction of a recent memory induces a down-regulation of Gad65 gene expression in the hippocampus (CA1, dentate gyrus) and an up-regulation of Gad67 gene expression in the infralimbic cortex. Extinguishing an intermediate memory increased Gad65 gene expression in the central amygdala. These results indicate a differential regulation of Gad gene expression after extinction of a recent memory vs. intermediate memory.
Ruzicka, W Brad; Subburaju, Sivan; Benes, Francine M
2015-06-01
Dysfunction related to γ-aminobutyric acid (GABA)-ergic neurotransmission in the pathophysiology of major psychosis has been well established by the work of multiple groups across several decades, including the widely replicated downregulation of GAD1. Prior gene expression and network analyses within the human hippocampus implicate a broader network of genes, termed the GAD1 regulatory network, in regulation of GAD1 expression. Several genes within this GAD1 regulatory network show diagnosis- and sector-specific expression changes within the circuitry of the hippocampus, influencing abnormal GAD1 expression in schizophrenia and bipolar disorder. The objective was to investigate the hypothesis that aberrant DNA methylation contributes to circuit- and diagnosis-specific abnormal expression of GAD1 regulatory network genes in psychotic illness. This epigenetic association study targeting GAD1 regulatory network genes was conducted between July 1, 2012, and June 30, 2014. Postmortem human hippocampus tissue samples were obtained from 8 patients with schizophrenia, 8 patients with bipolar disorder, and 8 healthy control participants matched for age, sex, postmortem interval, and other potential confounds from the Harvard Brain Tissue Resource Center, McLean Hospital, Belmont, Massachusetts. We extracted DNA from laser-microdissected stratum oriens tissue of cornu ammonis 2/3 (CA2/3) and CA1 postmortem human hippocampus, bisulfite modified it, and assessed it with the Infinium HumanMethylation450 BeadChip (Illumina, Inc). The subset of CpG loci associated with GAD1 regulatory network genes was analyzed in R version 3.1.0 software (R Foundation) using the minfi package. Findings were validated using bisulfite pyrosequencing. Methylation levels at 1308 GAD1 regulatory network-associated CpG loci were assessed both as individual sites to identify differentially methylated positions and by sharing information among colocalized probes to identify differentially methylated regions. 
A total of 146 differentially methylated positions with a false detection rate lower than 0.05 were identified across all 6 groups (2 circuit locations in each of 3 diagnostic categories), and 54 differentially methylated regions with P < .01 were identified in single-group comparisons. Methylation changes were enriched in MSX1, CCND2, and DAXX at specific loci within the hippocampus of patients with schizophrenia and bipolar disorder. This work demonstrates diagnosis- and circuit-specific DNA methylation changes at a subset of GAD1 regulatory network genes in the human hippocampus in schizophrenia and bipolar disorder. These genes participate in chromatin regulation and cell cycle control, supporting the concept that the established GABAergic dysfunction in these disorders is related to disruption of GABAergic interneuron physiology at specific circuit locations within the human hippocampus.
Effects of threshold on the topology of gene co-expression networks.
Couto, Cynthia Martins Villar; Comin, César Henrique; Costa, Luciano da Fontoura
2017-09-26
Several developments regarding the analysis of gene co-expression profiles using complex network theory have been reported recently. Such approaches usually start with the construction of an unweighted gene co-expression network, therefore requiring the selection of a suitable threshold defining which pairs of vertices will be connected. We aimed at addressing such an important problem by suggesting and comparing five different approaches for threshold selection. Each of the methods considers a respective biologically-motivated criterion for electing a potentially suitable threshold. A set of 21 microarray experiments from different biological groups was used to investigate the effect of applying the five proposed criteria to several biological situations. For each experiment, we used the Pearson correlation coefficient to measure the relationship between each gene pair, and the resulting weight matrices were thresholded considering several values, generating respective adjacency matrices (co-expression networks). Each of the five proposed criteria was then applied in order to select the respective threshold value. The effects of these thresholding approaches on the topology of the resulting networks were compared by using several measurements, and we verified that, depending on the database, the impact on the topological properties can be large. However, a group of databases was verified to be similarly affected by most of the considered criteria. Based on such results, it can be suggested that when the generated networks present similar measurements, the thresholding method can be chosen with greater freedom. If the generated networks are markedly different, the thresholding method that better suits the interests of each specific research study represents a reasonable choice.
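The construction step described in this abstract (pairwise Pearson correlation followed by thresholding into an unweighted network) can be sketched as follows. This is an illustrative example only: the function names and expression profiles are made up, and applying the threshold to the absolute value of the correlation is an assumption, since the abstract does not say how negative correlations are handled. None of the five threshold-selection criteria is reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpression_edges(profiles, threshold):
    """Connect every gene pair whose |r| reaches the threshold (assumed rule)."""
    genes = sorted(profiles)
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(profiles[g], profiles[h])) >= threshold:
                edges.add((g, h))
    return edges


# Hypothetical expression profiles over four samples.
toy_profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [4, 3, 2, 1],   # perfectly anti-correlated with g1
    "g3": [1, 3, 2, 4],   # r = 0.8 with g1
}
for t in (0.7, 0.9):
    print(t, sorted(coexpression_edges(toy_profiles, t)))
```

Even in this tiny example the network goes from fully connected at a threshold of 0.7 to a single edge at 0.9, which illustrates why the choice of threshold can change network topology so strongly.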
Analysis of genetic association using hierarchical clustering and cluster validation indices.
Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L
2017-10-01
It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criterion of similarity, is an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated data and real genomics data, where performance is measured by its ability to detect co-regulated sets compared with a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
Mallik, Saurav; Sen, Sagnik; Maulik, Ujjwal
2016-07-15
The involvement of intrinsically disordered proteins (IDPs) in serious diseases such as cancer is an interesting research topic. To gain novel insights into the regulation of IDPs, in this article we perform a transcriptomic analysis of mRNAs (genes) for transcripts encoding IDPs on a human multi-omics prostate carcinoma dataset having both gene expression and methylation data. We first select the genes that have both expression and methylation data and that correspond to the cancer-related, prostate-tissue-specific disordered proteins of the MobiDb database. We apply a standard t-test to determine differentially expressed genes as well as differentially methylated genes. A network comprising these genes, the miRNAs that target them from the Diana Tarbase v7.0 database, and the corresponding transcription factors from the TRANSFAC and ITFP databases is then built. Thereafter, we perform a literature search, and KEGG pathway and Gene Ontology analyses using the DAVID database. Finally, we report several significant potential gene markers (with the corresponding IDPs) that show an inverse relationship between differential expression and methylation patterns, and that are hub genes of the TF-miRNA-gene network. Copyright © 2016 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ray, Anamika; Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK 74078; Liu Jing
2010-10-15
Chlorpyrifos (CPF) is a widely used organophosphorus insecticide (OP) and putative developmental neurotoxicant in humans. The acute toxicity of CPF is elicited by acetylcholinesterase (AChE) inhibition. We characterized dose-related (0.1, 0.5, 1 and 2 mg/kg) gene expression profiles and changes in cell signaling pathways 24 h following acute CPF exposure in 7-day-old rats. Microarray experiments indicated that approximately 9% of the 44,000 genes were differentially expressed following either one of the four CPF dosages studied (546, 505, 522, and 3,066 genes with 0.1, 0.5, 1.0 and 2.0 mg/kg CPF). Genes were grouped according to dose-related expression patterns using K-means clustering while gene networks and canonical pathways were evaluated using Ingenuity Pathway Analysis®. Twenty clusters were identified and differential expression of selected genes was verified by RT-PCR. The four largest clusters (each containing from 276 to 905 genes) constituted over 50% of all differentially expressed genes and exhibited up-regulation following exposure to the highest dosage (2 mg/kg CPF). The total number of gene networks affected by CPF also rose sharply with the highest dosage of CPF (18, 16, 18 and 50 with 0.1, 0.5, 1 and 2 mg/kg CPF). Forebrain cholinesterase (ChE) activity was significantly reduced (26%) only in the highest dosage group. Based on magnitude of dose-related changes in differentially expressed genes, relative numbers of gene clusters and signaling networks affected, and forebrain ChE inhibition only at 2 mg/kg CPF, we focused subsequent analyses on this treatment group. Six canonical pathways were identified that were significantly affected by 2 mg/kg CPF (MAPK, oxidative stress, NF-κB, mitochondrial dysfunction, arylhydrocarbon receptor and adrenergic receptor signaling). 
Evaluation of different cellular functions of the differentially expressed genes suggested changes related to olfactory receptors, cell adhesion/migration, synapse/synaptic transmission and transcription/translation. Nine genes were differentially affected in all four CPF dosing groups. We conclude that the most robust, consistent changes in differential gene expression in neonatal forebrain across a range of acute CPF dosages occurred at an exposure level associated with the classical marker of OP toxicity, AChE inhibition. Disruption of multiple cellular pathways, in particular cell adhesion, may contribute to the developmental neurotoxicity potential of this pesticide.
Genes and gene networks implicated in aggression related behaviour.
Malki, Karim; Pain, Oliver; Du Rietz, Ebba; Tosto, Maria Grazia; Paya-Cano, Jose; Sandnabba, Kenneth N; de Boer, Sietse; Schalkwyk, Leonard C; Sluyter, Frans
2014-10-01
Aggressive behaviour is a major cause of mortality and morbidity. Despite moderate heritability estimates, progress in identifying the genetic factors underlying aggressive behaviour has been limited. There are currently three genetic mouse models of high and low aggression created using selective breeding. This is the first study to offer a global transcriptomic characterization of the prefrontal cortex across all three genetic mouse models of aggression. A systems biology approach has been applied to transcriptomic data across the three pairs of selected inbred mouse strains (Turku Aggressive (TA) and Turku Non-Aggressive (TNA), Short Attack Latency (SAL) and Long Attack Latency (LAL) mice and North Carolina Aggressive (NC900) and North Carolina Non-Aggressive (NC100)), providing novel insight into the neurobiological mechanisms and genetics underlying aggression. First, weighted gene co-expression network analysis (WGCNA) was performed to identify modules of highly correlated genes associated with aggression. Probe sets belonging to gene modules uncovered by WGCNA were carried forward for network analysis using Ingenuity Pathway Analysis (IPA). The RankProd non-parametric algorithm was then used to statistically evaluate expression differences across the genes belonging to modules significantly associated with aggression. IPA uncovered two pathways, involving NF-kB and MAPKs. The secondary RankProd analysis yielded 14 differentially expressed genes, some of which have previously been implicated in pathways associated with aggressive behaviour, such as Adrbk2. The results highlighted plausible candidate genes and gene networks implicated in aggression-related behaviour.
2014-01-01
Background Network inference of gene expression data is an important challenge in systems biology. Novel algorithms may provide more detailed gene regulatory networks (GRN) for complex, chronic inflammatory diseases such as rheumatoid arthritis (RA), in which activated synovial fibroblasts (SFBs) play a major role. Since the detailed mechanisms underlying this activation are still unclear, simultaneous investigation of multi-stimuli activation of SFBs offers the possibility to elucidate the regulatory effects of multiple mediators and to gain new insights into disease pathogenesis. Methods A GRN was therefore inferred from RA-SFBs treated with 4 different stimuli (IL-1β, TNF-α, TGF-β, and PDGF-D). Data from time series microarray experiments (0, 1, 2, 4, 12 h; Affymetrix HG-U133 Plus 2.0) were batch-corrected applying ‘ComBat’, analyzed for differentially expressed genes over time with ‘Limma’, and used for the inference of a robust GRN with NetGenerator V2.0, a heuristic ordinary differential equation-based method with soft integration of prior knowledge. Results Using all genes differentially expressed over time in RA-SFBs for any stimulus, and selecting the genes belonging to the most significant gene ontology (GO) term, i.e., ‘cartilage development’, a dynamic, robust, moderately complex multi-stimuli GRN was generated with 24 genes and 57 edges in total, 31 of which were gene-to-gene edges. Prior literature-based knowledge derived from Pathway Studio or manual searches was reflected in the final network by 25/57 confirmed edges (44%). The model contained known network motifs crucial for dynamic cellular behavior, e.g., cross-talk among pathways, positive feed-back loops, and positive feed-forward motifs (including suppression of the transcriptional repressor OSR2 by all 4 stimuli). Conclusion A multi-stimuli GRN highly concordant with literature data was successfully generated by network inference from the gene expression of stimulated RA-SFBs. 
The GRN showed high reliability, since 10 predicted edges were independently validated by literature findings post network inference. The selected GO term ‘cartilage development’ contained a number of differentiation markers, growth factors, and transcription factors with potential relevance for RA. Finally, the model provided new insight into the response of RA-SFBs to multiple stimuli implicated in the pathogenesis of RA, in particular to the ‘novel’ potent growth factor PDGF-D. PMID:24989895
Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling
Li, Xia; Rao, Shaoqi; Jiang, Wei; Li, Chuanxing; Xiao, Yun; Guo, Zheng; Zhang, Qingpu; Wang, Lihong; Du, Lei; Li, Jing; Li, Li; Zhang, Tianwen; Wang, Qing K
2006-01-01
Background It is one of the ultimate goals of modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, including, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. Large-scale, genome-wide time-resolved data are becoming increasingly available, which provides a golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. 
In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex gene regulations related to the development, aging and progressive pathogenesis of a complex disease where potential dependences between different experiment units might occur. PMID:16420705
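The notion of a delayed correlation mentioned in this abstract can be illustrated by correlating one gene's profile with another gene's profile shifted by a number of time points and keeping the shift with the strongest correlation. This is a minimal sketch, not the TdGRN toolbox: the function names, the use of Pearson correlation as the overlap measure, and the toy time series are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm = sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    return cov / norm if norm > 0 else 0.0


def best_delay(regulator, target, max_lag):
    """Return (lag, r) maximizing |r| between regulator[t] and target[t + lag]."""
    best_lag, best_r = 0, 0.0
    for lag in range(max_lag + 1):
        # Align regulator at time t with target at time t + lag.
        x = regulator[:len(regulator) - lag]
        y = target[lag:]
        r = pearson(x, y)
        if abs(r) > abs(best_r):
            best_lag, best_r = lag, r
    return best_lag, best_r


# Hypothetical time course: the target repeats the regulator two time points later.
regulator = [1, 3, 2, 5, 4, 6, 3, 1]
target = [0, 0, 1, 3, 2, 5, 4, 6]
lag, r = best_delay(regulator, target, max_lag=3)
print(lag, round(r, 3))
```

Longer lags leave fewer overlapping time points, so on real data the maximum lag would normally be kept small relative to the length of the series.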
Cross-platform method for identifying candidate network biomarkers for prostate cancer.
Jin, G; Zhou, X; Cui, K; Zhang, X-S; Chen, L; Wong, S T C
2009-11-01
Discovering biomarkers using mass spectrometry (MS) and microarray expression profiles is a promising strategy in molecular diagnosis. Here, the authors propose a new pipeline for biomarker discovery that integrates disease information for proteins and genes, expression profiles at both the genomic and proteomic levels, and protein-protein interactions (PPIs) to discover high-confidence network biomarkers. Using this pipeline, a total of 474 molecules (genes and proteins) related to prostate cancer were identified and a prostate-cancer-related network (PCRN) was derived from the integrative information. A set of candidate network biomarkers was then identified from multiple expression profiles composed of eight microarray datasets and one proteomics dataset. The network biomarkers with PPIs can accurately distinguish prostate cancer patients from normal individuals, and potentially provide more reliable biomarker candidates than conventional biomarker discovery methods.
Lu, Tao
2016-01-01
The gene regulation network (GRN) evaluates the interactions between genes and looks for models to describe the gene expression behavior. These models have many applications; for instance, by characterizing the gene expression mechanisms that cause certain disorders, it would be possible to target those genes to block the progress of the disease. Many biological processes are driven by nonlinear dynamic GRN. In this article, we propose a nonparametric ordinary differential equation (ODE) to model the nonlinear dynamic GRN. Specifically, we address the following tasks simultaneously: (i) extract information from noisy time course gene expression data; (ii) model the nonlinear ODE through a nonparametric smoothing function; (iii) identify the important regulatory gene(s) through a group smoothly clipped absolute deviation (SCAD) approach; (iv) test the robustness of the model against possible shortening of experimental duration. We illustrate the usefulness of the model and associated statistical methods through a simulation and a real application example.
A stele-enriched gene regulatory network in the Arabidopsis root
Brady, Siobhan M; Zhang, Lifang; Megraw, Molly; Martinez, Natalia J; Jiang, Eric; Yi, Charles S; Liu, Weilin; Zeng, Anna; Taylor-Teeples, Mallorie; Kim, Dahae; Ahnert, Sebastian; Ohler, Uwe; Ware, Doreen; Walhout, Albertha J M; Benfey, Philip N
2011-01-01
Tightly controlled gene expression is a hallmark of multicellular development and is accomplished by transcription factors (TFs) and microRNAs (miRNAs). Although many studies have focused on identifying downstream targets of these molecules, less is known about the factors that regulate their differential expression. We used data from high spatial resolution gene expression experiments and yeast one-hybrid (Y1H) and two-hybrid (Y2H) assays to delineate a subset of interactions occurring within a gene regulatory network (GRN) that determines tissue-specific TF and miRNA expression in plants. We find that upstream TFs are expressed in more diverse cell types than their targets and that promoters that are bound by a relatively large number of TFs correspond to key developmental regulators. The regulatory consequence of many TFs for their target was experimentally determined using genetic analysis. Remarkably, molecular phenotypes were identified for 65% of the TFs, but morphological phenotypes were associated with only 16%. This indicates that the GRN is robust, and that gene expression changes may be canalized or buffered. PMID:21245844
Khan, Faheem Ahmed; Liu, Hui; Zhou, Hao; Wang, Kai; Qamar, Muhammad Tahir Ul; Pandupuspitasari, Nuruliarizki Shinta; Shujun, Zhang
2017-01-01
The biology of sperm, its capability of fertilizing an egg and its role in sex ratio are major questions in reproductive biology. To answer these questions, we integrated X and Y chromosome transcriptomes across two species, Bos taurus and Sus scrofa, and identified reproductive driver genes using the Weighted Gene Co-Expression Network Analysis (WGCNA) algorithm. Our strategy resulted in 11007 and 10445 unique genes consisting of 9 and 11 reproductive modules in Bos taurus and Sus scrofa, respectively. The consensus module calculation yielded a total of 167 overlapping genes, which were mapped to 846 DEGs in Bos taurus to obtain a final list of 67 dual feature genes. We developed a gene co-expression network of the selected 67 genes that consists of 58 nodes (27 down-regulated and 31 up-regulated genes) enriched in 66 GO biological process (BP) terms, including 6 GO annotations related to reproduction, and two KEGG pathways. Moreover, we searched for significantly related TFs (ISRE, AP1FJ, RP58, CREL) and miRNAs (bta-miR-181a, bta-miR-17-5p, bta-miR-146b, bta-miR-146a) that target the genes in the co-expression network. In addition, we performed genetic analyses, including phylogenetic analysis, functional domain identification, epigenetic modification analysis and mutation analysis, of the most important reproductive driver genes PRM1, PPP2R2B and PAFAH1B1, and finally performed a protein docking analysis to visualize their therapeutic and gene expression regulation ability. PMID:28903352
Homoeolog-specific transcriptional bias in allopolyploid wheat
2010-01-01
Background Interaction between parental genomes is accompanied by global changes in gene expression which eventually contribute to growth vigor and the broader phenotypic diversity of allopolyploid species. In order to gain a better understanding of the effects of allopolyploidization on the regulation of diverged gene networks, we performed a genome-wide analysis of homoeolog-specific gene expression in re-synthesized allohexaploid wheat created by the hybridization of a tetraploid derivative of hexaploid wheat with the diploid ancestor of the wheat D genome Ae. tauschii. Results Affymetrix wheat genome arrays were used for both the discovery of divergent homoeolog-specific mutations and analysis of homoeolog-specific gene expression in re-synthesized allohexaploid wheat. More than 34,000 detectable parent-specific features (PSF) distributed across the wheat genome were used to assess AB genome (could not differentiate A and B genome contributions) and D genome parental expression in the allopolyploid transcriptome. In the re-synthesized polyploid, 81% of PSFs detected mid-parent levels of gene expression, and only 19% of PSFs showed evidence of non-additive expression. Non-additive expression in both AB and D genomes was strongly biased toward up-regulation of parental type of gene expression with only 6% and 11% of genes, respectively, being down-regulated. Of all the non-additive gene expression, 84% can be explained by differences in the parental genotypes used to make the allopolyploid. Homoeolog-specific co-regulation of several functional gene categories was found, particularly genes involved in photosynthesis and protein biosynthesis in wheat. Conclusions Here, we have demonstrated that the establishment of interactions between the diverged regulatory networks in allopolyploids is accompanied by massive homoeolog-specific up- and down-regulation of gene expression. 
This study provides insights into interactions between homoeologous genomes and their role in growth vigor, development, and fertility of allopolyploid species. PMID:20849627
Analysis of the dynamic co-expression network of heart regeneration in the zebrafish
Rodius, Sophie; Androsova, Ganna; Götz, Lou; Liechti, Robin; Crespo, Isaac; Merz, Susanne; Nazarov, Petr V.; de Klein, Niek; Jeanty, Céline; González-Rosa, Juan M.; Muller, Arnaud; Bernardin, Francois; Niclou, Simone P.; Vallar, Laurent; Mercader, Nadia; Ibberson, Mark; Xenarios, Ioannis; Azuaje, Francisco
2016-01-01
The zebrafish has the capacity to regenerate its heart after severe injury. While the function of a few genes during this process has been studied, we are far from fully understanding how genes interact to coordinate heart regeneration. To enable systematic insights into this phenomenon, we generated and integrated a dynamic co-expression network of heart regeneration in the zebrafish and linked systems-level properties to the underlying molecular events. Across multiple post-injury time points, the network displays topological attributes of biological relevance. We show that regeneration steps are mediated by modules of transcriptionally coordinated genes, and by genes acting as network hubs. We also established direct associations between hubs and validated drivers of heart regeneration with murine and human orthologs. The resulting models and interactive analysis tools are available at http://infused.vital-it.ch. Using a worked example, we demonstrate the usefulness of this unique open resource for hypothesis generation and in silico screening for genes involved in heart regeneration. PMID:27241320
Analysis of the dynamic co-expression network of heart regeneration in the zebrafish
NASA Astrophysics Data System (ADS)
Rodius, Sophie; Androsova, Ganna; Götz, Lou; Liechti, Robin; Crespo, Isaac; Merz, Susanne; Nazarov, Petr V.; de Klein, Niek; Jeanty, Céline; González-Rosa, Juan M.; Muller, Arnaud; Bernardin, Francois; Niclou, Simone P.; Vallar, Laurent; Mercader, Nadia; Ibberson, Mark; Xenarios, Ioannis; Azuaje, Francisco
2016-05-01
The zebrafish has the capacity to regenerate its heart after severe injury. While the function of a few genes during this process has been studied, we are far from fully understanding how genes interact to coordinate heart regeneration. To enable systematic insights into this phenomenon, we generated and integrated a dynamic co-expression network of heart regeneration in the zebrafish and linked systems-level properties to the underlying molecular events. Across multiple post-injury time points, the network displays topological attributes of biological relevance. We show that regeneration steps are mediated by modules of transcriptionally coordinated genes, and by genes acting as network hubs. We also established direct associations between hubs and validated drivers of heart regeneration with murine and human orthologs. The resulting models and interactive analysis tools are available at http://infused.vital-it.ch. Using a worked example, we demonstrate the usefulness of this unique open resource for hypothesis generation and in silico screening for genes involved in heart regeneration.
Angelovici, Ruthie; Fait, Aaron; Zhu, Xiaohong; Szymanski, Jedrzej; Feldmesser, Ester; Fernie, Alisdair R; Galili, Gad
2009-12-01
In order to elucidate transcriptional and metabolic networks associated with lysine (Lys) metabolism, we utilized developing Arabidopsis (Arabidopsis thaliana) seeds as a system in which Lys synthesis could be stimulated developmentally without application of chemicals and coupled this to a T-DNA insertion knockout mutation impaired in Lys catabolism. This seed-specific metabolic perturbation stimulated Lys accumulation starting from the initiation of storage reserve accumulation. Our results revealed that the response of seed metabolism to the inducible alteration of Lys metabolism was relatively minor; however, that which was observable operated in a modular manner. They also demonstrated that Lys metabolism is strongly associated with the operation of the tricarboxylic acid cycle while largely disconnected from other metabolic networks. In contrast, the inducible alteration of Lys metabolism was strongly associated with gene networks, stimulating the expression of hundreds of genes controlling anabolic processes that are associated with plant performance and vigor while suppressing a small number of genes associated with plant stress interactions. The most pronounced effect of the developmentally inducible alteration of Lys metabolism was an induction of expression of a large set of genes encoding ribosomal proteins as well as genes encoding translation initiation and elongation factors, all of which are associated with protein synthesis. With respect to metabolic regulation, the inducible alteration of Lys metabolism was primarily associated with altered expression of genes belonging to networks of amino acids and sugar metabolism. The combined data are discussed within the context of network interactions both between and within metabolic and transcriptional control systems.
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks.
Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
2017-01-01
The PubMed database offers an extensive set of publication data that can be useful, yet is inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show that our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS-Bcl-2 and RAS-AKT, and finds significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
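The consensus step described in this abstract can be illustrated with a small sketch. The abstract does not define "network resolution" precisely, so the code below assumes it is the fraction of individual networks in which an edge must appear to be kept. The function name `consensus_edges` and the toy edge lists are hypothetical and are not taken from the authors' software.

```python
from collections import Counter


def consensus_edges(networks, resolution):
    """Keep directed edges found in at least `resolution` (0..1) of the networks.

    ASSUMPTION: "network resolution" is treated here as an edge-frequency
    threshold; the abstract does not spell out its exact definition.
    """
    if not networks:
        return set()
    # Count each edge at most once per network.
    counts = Counter(edge for net in networks for edge in set(net))
    cutoff = resolution * len(networks)
    return {edge for edge, n in counts.items() if n >= cutoff}


# Toy edge lists standing in for Bayesian networks learned under
# different randomized topological orderings (illustrative only).
networks = [
    [("JAK", "STAT"), ("PI3K", "AKT"), ("RAS", "AKT")],
    [("JAK", "STAT"), ("PI3K", "AKT")],
    [("JAK", "STAT"), ("AKT", "MTOR")],
]

strict = consensus_edges(networks, resolution=1.0)  # edges in every network
loose = consensus_edges(networks, resolution=0.5)   # edges in at least half
```

Under this assumption, lowering the threshold admits more speculative edges, while raising it keeps only edges that are stable across orderings.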
2012-01-01
Visualization and analysis of molecular networks are both central to systems biology. However, there still exists a large technological gap between them, especially when assessing multiple network levels or hierarchies. Here we present RedeR, an R/Bioconductor package combined with a Java core engine for representing modular networks. The functionality of RedeR is demonstrated in two different scenarios: hierarchical and modular organization in gene co-expression networks and nested structures in time-course gene expression subnetworks. Our results demonstrate RedeR as a new framework to deal with the multiple network levels that are inherent to complex biological systems. RedeR is available from http://bioconductor.org/packages/release/bioc/html/RedeR.html. PMID:22531049
Balazadeh, Salma; Siddiqui, Hamad; Allu, Annapurna D; Matallana-Ramirez, Lilian P; Caldana, Camila; Mehrnia, Mohammad; Zanor, Maria-Inés; Köhler, Barbara; Mueller-Roeber, Bernd
2010-04-01
The onset and progression of senescence are under genetic and environmental control. The Arabidopsis thaliana NAC transcription factor ANAC092 (also called AtNAC2 and ORE1) has recently been shown to control age-dependent senescence, but its mode of action has not been analysed yet. To explore the regulatory network administered by ANAC092 we performed microarray-based expression profiling using estradiol-inducible ANAC092 overexpression lines. Approximately 46% of the 170 genes up-regulated upon ANAC092 induction are known senescence-associated genes, suggesting that the NAC factor exerts its role in senescence through a regulatory network that includes many of the genes previously reported to be senescence regulated. We selected 39 candidate genes and confirmed their time-dependent response to enhanced ANAC092 expression by quantitative RT-PCR. We also found that the majority of them (24 genes) are up-regulated by salt stress, a major promoter of plant senescence, in a manner similar to that of ANAC092, which itself is salt responsive. Furthermore, 24 genes like ANAC092 turned out to be stage-dependently expressed during seed growth with low expression at early and elevated expression at late stages of seed development. Disruption of ANAC092 increased the rate of seed germination under saline conditions, whereas the opposite occurred in respective overexpression plants. We also detected a delay of salinity-induced chlorophyll loss in detached anac092-1 mutant leaves. Promoter-reporter (GUS) studies revealed transcriptional control of ANAC092 expression during leaf and flower ageing and in response to salt stress. We conclude that ANAC092 exerts its functions during senescence and seed germination through partly overlapping target gene sets.
Key genes and pathways in measles and their interaction with environmental chemicals.
Zhang, Rongqiang; Jiang, Hualin; Li, Fengying; Su, Ning; Ding, Yi; Mao, Xiang; Ren, Dan; Wang, Jing
2018-06-01
The aim of the present study was to explore key genes that may have a role in the pathology of measles virus infection and to clarify the interaction networks between environmental factors and differentially expressed genes (DEGs). After screening the database of the Gene Expression Omnibus of the National Center for Biotechnology Information, the dataset GSE5808 was downloaded and analyzed. A global normalization method was performed to minimize data inconsistencies and heterogeneity. DEGs during different stages of measles virus infection were explored using R software (v3.4.0). Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the DEGs were performed using Cytoscape 3.4.0 software. A protein-protein interaction (PPI) network of the DEGs was obtained from the STRING database v9.05. A total of 43 DEGs were obtained from four analyzed sample groups, including 10 highly expressed genes and 33 genes with decreased expression. The most enriched pathways based on KEGG analysis were fatty acid elongation, cytokine-cytokine receptor interaction and RNA degradation. The genes mentioned in the PPI network were mainly associated with protein binding and chemokine activity. A total of 219 chemicals were identified that may, jointly or on their own, interact with the 6 DEGs between the control group and patients with measles (at hospital entry), including benzo(a)pyrene (BaP) and tetrachlorodibenzodioxin (TCDD). In conclusion, the present study revealed that chemokines and environmental chemicals, e.g. BaP and TCDD, may affect the development of measles.
Concerted Perturbation Observed in a Hub Network in Alzheimer’s Disease
Liang, Dapeng; Han, Guangchun; Feng, Xuemei; Sun, Jiya; Duan, Yong; Lei, Hongxing
2012-01-01
Alzheimer’s disease (AD) is a progressive neurodegenerative disease involving the alteration of gene expression at the whole genome level. Genome-wide transcriptional profiling of AD has been conducted by many groups on several relevant brain regions. However, identifying the most critical dys-regulated genes has been challenging. In this work, we addressed this issue by deriving critical genes from perturbed subnetworks. Using a recent microarray dataset on six brain regions, we applied a heaviest induced subgraph algorithm with a modular scoring function to reveal the significantly perturbed subnetwork in each brain region. These perturbed subnetworks were found to be significantly overlapped with each other. Furthermore, the hub genes from these perturbed subnetworks formed a connected hub network consisting of 136 genes. Comparison between AD and several related diseases demonstrated that the hub network was robustly and specifically perturbed in AD. In addition, strong correlation between the expression level of these hub genes and indicators of AD severity suggested that this hub network can partially reflect AD progression. More importantly, this hub network reflected the adaptation of neurons to the AD-specific microenvironment through a variety of adjustments, including reduction of neuronal and synaptic activities and alteration of survival signaling. Therefore, it is potentially useful for the development of biomarkers and network medicine for AD. PMID:22815752
Yu, Bowen; Doraiswamy, Harish; Chen, Xi; Miraldi, Emily; Arrieta-Ortiz, Mario Luis; Hafemeister, Christoph; Madar, Aviv; Bonneau, Richard; Silva, Cláudio T
2014-12-01
Elucidation of transcriptional regulatory networks (TRNs) is a fundamental goal in biology, and one of the most important components of TRNs are transcription factors (TFs), proteins that specifically bind to gene promoter and enhancer regions to alter target gene expression patterns. Advances in genomic technologies as well as advances in computational biology have led to multiple large regulatory network models (directed networks) each with a large corpus of supporting data and gene-annotation. There are multiple possible biological motivations for exploring large regulatory network models, including: validating TF-target gene relationships, figuring out co-regulation patterns, and exploring the coordination of cell processes in response to changes in cell state or environment. Here we focus on queries aimed at validating regulatory network models, and on coordinating visualization of primary data and directed weighted gene regulatory networks. The large size of both the network models and the primary data can make such coordinated queries cumbersome with existing tools and, in particular, inhibits the sharing of results between collaborators. In this work, we develop and demonstrate a web-based framework for coordinating visualization and exploration of expression data (RNA-seq, microarray), network models and gene-binding data (ChIP-seq). Using specialized data structures and multiple coordinated views, we design an efficient querying model to support interactive analysis of the data. Finally, we show the effectiveness of our framework through case studies for the mouse immune system (a dataset focused on a subset of key cellular functions) and a model bacteria (a small genome with high data-completeness).
Identification of transcription regulatory relationships in rheumatoid arthritis and osteoarthritis.
Li, Guofeng; Han, Ning; Li, Zengchun; Lu, Qingyou
2013-05-01
Rheumatoid arthritis (RA) is recognized as the most crippling or disabling type of arthritis, and osteoarthritis (OA) is the most common form of arthritis. These diseases severely reduce the quality of life, and cause high socioeconomic burdens. However, the molecular mechanisms of RA and OA development remain elusive despite intensive research efforts. In this study, we aimed to identify the potential transcription regulatory relationships between transcription factors (TFs) and differentially co-expressed genes (DCGs) in RA and OA, respectively. We downloaded the gene expression profiles of RA and OA from the Gene Expression Omnibus and analyzed the gene expression using computational methods. We identified a set of 4,076 DCGs in pairwise comparisons between RA and OA patients, RA and normal donors (NDs), or OA and NDs. After regulatory network construction and regulatory impact factor analysis, we found that EGR1, NFE2L1, and NFYA were crucial TFs in the regulatory network of RA, whereas NFYA, CBFB, CREB1, YY1 and PATZ1 were crucial TFs in the regulatory network of OA. These TFs could contribute to RA and OA by promoting or inhibiting the expression of the DCGs. Altogether, our work may extend our understanding of disease mechanisms and may lead to an improved diagnosis. However, further experiments are still needed to confirm these observations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Acquaah-Mensah, George K.; Taylor, Ronald C.
Microarray data have been a valuable resource for identifying transcriptional regulatory relationships among genes. As an example, brain region-specific transcriptional regulatory events have the potential of providing etiological insights into Alzheimer Disease (AD). However, there is often a paucity of suitable brain-region specific expression data obtained via microarrays or other high throughput means. The Allen Brain Atlas in situ hybridization (ISH) data sets (Jones et al., 2009) represent a potentially valuable alternative source of high-throughput brain region-specific gene expression data for such purposes. In this study, Allen Brain Atlas mouse ISH data in the hippocampal fields were extracted, focusing on 508 genes relevant to neurodegeneration. Transcriptional regulatory networks were learned using three high-performing network inference algorithms. Only 17% of regulatory edges from a network reverse-engineered based on brain region-specific ISH data were also found in a network constructed upon gene expression correlations in mouse whole brain microarrays, thus showing the specificity of gene expression within brain sub-regions. Furthermore, the ISH data-based networks were used to identify instructive transcriptional regulatory relationships. Ncor2, Sp3 and Usf2 form a unique three-party regulatory motif, potentially affecting memory formation pathways. Nfe2l1, Egr1 and Usf2 emerge among regulators of genes involved in AD (e.g. Dhcr24, Aplp2, Tia1, Pdrx1, Vdac1, and Syn2). Further, Nfe2l1, Egr1 and Usf2 are sensitive to dietary factors and could be among links between dietary influences and genes in the AD etiology. Thus, this approach of harnessing brain region-specific ISH data represents a rare opportunity for gleaning unique etiological insights for diseases such as AD.
Integrating Genetic and Functional Genomic Data to Elucidate Common Disease Tra
NASA Astrophysics Data System (ADS)
Schadt, Eric
2005-03-01
The reconstruction of genetic networks in mammalian systems is one of the primary goals in biological research, especially as such reconstructions relate to elucidating not only common, polygenic human diseases, but living systems more generally. Here I present a statistical procedure for inferring causal relationships between gene expression traits and more classic clinical traits, including complex disease traits. This procedure has been generalized to the gene network reconstruction problem, where naturally occurring genetic variations in segregating mouse populations are used as a source of perturbations to elucidate tissue-specific gene networks. Differences in the extent of genetic control between genders and among four different tissues are highlighted. I also demonstrate that the networks derived from expression data in segregating mouse populations using the novel network reconstruction algorithm are able to capture causal associations between genes that result in increased predictive power, compared to more classically reconstructed networks derived from the same data. This approach to causal inference in large segregating mouse populations over multiple tissues not only elucidates fundamental aspects of transcriptional control, it also allows for the objective identification of key drivers of common human diseases.
A provisional regulatory gene network for specification of endomesoderm in the sea urchin embryo
NASA Technical Reports Server (NTRS)
Davidson, Eric H.; Rast, Jonathan P.; Oliveri, Paola; Ransick, Andrew; Calestani, Cristina; Yuh, Chiou-Hwa; Minokawa, Takuya; Amore, Gabriele; Hinman, Veronica; Arenas-Mena, Cesar;
2002-01-01
We present the current form of a provisional DNA sequence-based regulatory gene network that explains in outline how endomesodermal specification in the sea urchin embryo is controlled. The model of the network is in a continuous process of revision and growth as new genes are added and new experimental results become available; see http://www.its.caltech.edu/mirsky/endomeso.htm (End-mes Gene Network Update) for the latest version. The network contains over 40 genes at present, many newly uncovered in the course of this work, and most encoding DNA-binding transcriptional regulatory factors. The architecture of the network was approached initially by construction of a logic model that integrated the extensive experimental evidence now available on endomesoderm specification. The internal linkages between genes in the network have been determined functionally, by measurement of the effects of regulatory perturbations on the expression of all relevant genes in the network. Five kinds of perturbation have been applied: (1) use of morpholino antisense oligonucleotides targeted to many of the key regulatory genes in the network; (2) transformation of other regulatory factors into dominant repressors by construction of Engrailed repressor domain fusions; (3) ectopic expression of given regulatory factors, from genetic expression constructs and from injected mRNAs; (4) blockade of the beta-catenin/Tcf pathway by introduction of mRNA encoding the intracellular domain of cadherin; and (5) blockade of the Notch signaling pathway by introduction of mRNA encoding the extracellular domain of the Notch receptor. The network model predicts the cis-regulatory inputs that link each gene into the network. Therefore, its architecture is testable by cis-regulatory analysis. Strongylocentrotus purpuratus and Lytechinus variegatus genomic BAC recombinants that include a large number of the genes in the network have been sequenced and annotated. 
Tests of the cis-regulatory predictions of the model are greatly facilitated by interspecific computational sequence comparison, which affords a rapid identification of likely cis-regulatory elements in advance of experimental analysis. The network specifies genomically encoded regulatory processes between early cleavage and gastrula stages. These control the specification of the micromere lineage and of the initial veg(2) endomesodermal domain; the blastula-stage separation of the central veg(2) mesodermal domain (i.e., the secondary mesenchyme progenitor field) from the peripheral veg(2) endodermal domain; the stabilization of specification state within these domains; and activation of some downstream differentiation genes. Each of the temporal-spatial phases of specification is represented in a subelement of the network model, that treats regulatory events within the relevant embryonic nuclei at particular stages. (c) 2002 Elsevier Science (USA).
2013-01-01
Background: A co-ordinated tissue-independent gene expression profile associated with growth is present in rodent models and this is hypothesised to extend to all mammals. Growth in humans has similarities to other mammals but the return to active long bone growth in the pubertal growth spurt is a distinctly human growth event. The aim of this study was to describe gene expression and biological pathways associated with stages of growth in children and to assess tissue-independent expression patterns in relation to human growth. Results: We conducted gene expression analysis on a library of datasets from normal children with age annotation, collated from the NCBI Gene Expression Omnibus (GEO) and EBI Arrayexpress databases. A primary data set was generated using cells of lymphoid origin from normal children; the expression of 688 genes (ANOVA false discovery rate modified p-value, q < 0.1) was associated with age, and subsets of these genes formed clusters that correlated with the phases of growth – infancy, childhood, puberty and final height. Network analysis on these clusters identified evolutionarily conserved growth pathways (NOTCH, VEGF, TGFB, WNT and glucocorticoid receptor – Hyper-geometric test, q < 0.05). The greatest degree of network ‘connectivity’ and hence functional significance was present in infancy (Wilcoxon test, p < 0.05), which then decreased through to adulthood. These observations were confirmed in a separate validation data set from lymphoid tissue. Similar biological pathways were observed to be associated with development-related gene expression in other tissues (conjunctival epithelia, temporal lobe brain tissue and bone marrow) suggesting the existence of a tissue-independent genetic program for human growth and maturation. Conclusions: Similar evolutionarily conserved pathways have been associated with gene expression and child growth in multiple tissues. 
These expression profiles associate with the developmental phases of growth including the return to active long bone growth in puberty, a distinctly human event. These observations also have direct medical relevance to pathological changes that induce disease in children. Taking into account development-dependent gene expression profiles for normal children will be key to the appropriate selection of genes and pathways as potential biomarkers of disease or as drug targets. PMID:23941278
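The q-value thresholds quoted in this abstract (for example q < 0.1 for the ANOVA results) come from a false discovery rate adjustment. The abstract does not name the procedure, so the sketch below assumes the widely used Benjamini-Hochberg step-up method. The function name and the p-values are hypothetical and only show how such adjusted values are computed.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order.

    ASSUMPTION: the abstract only says "false discovery rate modified
    p-value"; Benjamini-Hochberg is one common choice, not a confirmed one.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical per-gene p-values from an age-association test.
pvals = [0.01, 0.04, 0.03, 0.20]
qvals = benjamini_hochberg(pvals)
selected = [i for i, q in enumerate(qvals) if q < 0.1]
```

With these toy values, three of the four hypothetical genes would pass a q < 0.1 cutoff.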
Multiple hot-deck imputation for network inference from RNA sequencing data.
Imbert, Alyssa; Valsesia, Armand; Le Gall, Caroline; Armenise, Claudia; Lefebvre, Gregory; Gourraud, Pierre-Antoine; Viguerie, Nathalie; Villa-Vialaneix, Nathalie
2018-05-15
Network inference provides a global view of the relations existing between gene expression in a given transcriptomic experiment (often only for a restricted list of chosen genes). However, it is still a challenging problem: even if the cost of sequencing techniques has decreased over the last years, the number of samples in a given experiment is still (very) small compared to the number of genes. We propose a method to increase the reliability of the inference when RNA-seq expression data have been measured together with an auxiliary dataset that can provide external information on gene expression similarity between samples. Our statistical approach, hd-MI, is based on imputation for samples without available RNA-seq data that are considered as missing data but are observed on the secondary dataset. hd-MI can improve the reliability of the inference for missing rates up to 30% and provides more stable networks with a smaller number of false positive edges. From a biological point of view, hd-MI was also found relevant to infer networks from RNA-seq data acquired in adipose tissue during a nutritional intervention in obese individuals. In these networks, novel links between genes were highlighted, as well as an improved comparability between the two steps of the nutritional intervention. Software and sample data are available as an R package, RNAseqNet, that can be downloaded from the Comprehensive R Archive Network (CRAN). Contact: alyssa.imbert@inra.fr or nathalie.villa-vialaneix@inra.fr. Supplementary data are available at Bioinformatics online.
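The general idea of hot-deck multiple imputation can be sketched as follows. This is a generic illustration and not the hd-MI algorithm as implemented in RNAseqNet. It assumes that a sample lacking RNA-seq data borrows the expression profile of a randomly chosen donor among its `k` nearest neighbours in the auxiliary dataset, and that this draw is repeated to obtain several completed datasets. All names and numbers are hypothetical.

```python
import math
import random


def hot_deck_impute(aux, expr, k=2, rng=None):
    """Complete `expr` by copying a donor profile for each missing sample.

    aux:  dict sample -> auxiliary measurements (available for all samples)
    expr: dict sample -> expression values (observed samples only)
    ASSUMPTION: donors are the k observed samples closest in Euclidean
    distance on the auxiliary data; the real hd-MI donor pool may differ.
    """
    if rng is None:
        rng = random.Random(0)
    completed = dict(expr)
    for sample, vec in aux.items():
        if sample in expr:
            continue  # already observed, nothing to impute
        donors = sorted(expr, key=lambda d: math.dist(vec, aux[d]))[:k]
        completed[sample] = list(expr[rng.choice(donors)])
    return completed


def multiple_imputation(aux, expr, m=5, k=2, seed=0):
    """Draw m completed datasets; a network would be inferred on each one."""
    rng = random.Random(seed)
    return [hot_deck_impute(aux, expr, k, rng) for _ in range(m)]


# Toy data: sample s4 has auxiliary data but no RNA-seq profile.
aux = {"s1": [0.0, 0.0], "s2": [1.0, 1.0], "s3": [5.0, 5.0], "s4": [0.1, 0.0]}
expr = {"s1": [10.0, 2.0], "s2": [9.0, 3.0], "s3": [1.0, 8.0]}
datasets = multiple_imputation(aux, expr, m=5, k=2)
```

In a multiple-imputation workflow, a network would then be inferred on each completed dataset and only edges recovered consistently across them would be retained; that aggregation step is not shown here.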
DEEP--a tool for differential expression effector prediction.
Degenhardt, Jost; Haubrock, Martin; Dönitz, Jürgen; Wingender, Edgar; Crass, Torsten
2007-07-01
High-throughput methods for measuring transcript abundance, like SAGE or microarrays, are widely used for determining differences in gene expression between different tissue types, dignities (normal/malignant) or time points. Further analysis of such data frequently aims at the identification of gene interaction networks that form the causal basis for the observed properties of the systems under examination. To this end, it is usually not sufficient to rely on the measured gene expression levels alone; rather, additional biological knowledge has to be taken into account in order to generate useful hypotheses about the molecular mechanism leading to the realization of a certain phenotype. We present a method that combines gene expression data with biological expert knowledge on molecular interaction networks, as described by the TRANSPATH database on signal transduction, to predict additional--and not necessarily differentially expressed--genes or gene products which might participate in processes specific for either of the examined tissues or conditions. In a first step, significance values for over-expression in tissue/condition A or B are assigned to all genes in the expression data set. Genes with a significance value exceeding a certain threshold are used as starting points for the reconstruction of a graph with signaling components as nodes and signaling events as edges. In a subsequent graph traversal process, again starting from the previously identified differentially expressed genes, all encountered nodes 'inherit' all their starting nodes' significance values. In a final step, the graph is visualized, the nodes being colored according to a weighted average of their inherited significance values. 
Each node's, or sub-network's, predominant color, ranging from green (significant for tissue/condition A) over yellow (not significant for either tissue/condition) to red (significant for tissue/condition B), thus gives an immediate visual clue on which molecules--differentially expressed or not--may play pivotal roles in the tissues or conditions under examination. The described method has been implemented in Java as a client/server application and a web interface called DEEP (Differential Expression Effector Prediction). The client, which features an easy-to-use graphical interface, can freely be downloaded from the following URL: http://deep.bioinf.med.uni-goettingen.de.
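The traversal and colouring steps described in this abstract can be approximated with a short sketch. Several details are assumptions because the abstract does not specify them: significance is encoded as a signed score (negative for tissue/condition A, positive for B), the weight of an inherited value decays as 1/(1 + distance) from its starting node, and the colour cutoffs are arbitrary. The function names are hypothetical and this is not the DEEP implementation.

```python
from collections import deque


def propagate(graph, scores, threshold=0.5, max_depth=2):
    """Let nodes inherit the scores of the seed genes that reach them.

    graph:  dict node -> list of neighbouring nodes (signaling graph)
    scores: dict gene -> signed significance (ASSUMED: <0 means A, >0 means B)
    Returns dict node -> weighted average of inherited scores.
    """
    inherited = {}  # node -> list of (weight, seed score)
    seeds = [g for g, s in scores.items() if abs(s) >= threshold]
    for seed in seeds:
        dist = {seed: 0}
        queue = deque([seed])
        while queue:  # breadth-first traversal from this seed
            node = queue.popleft()
            weight = 1.0 / (1 + dist[node])  # ASSUMED distance decay
            inherited.setdefault(node, []).append((weight, scores[seed]))
            if dist[node] == max_depth:
                continue
            for nb in graph.get(node, []):
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
    return {
        node: sum(w * s for w, s in pairs) / sum(w for w, _ in pairs)
        for node, pairs in inherited.items()
    }


def colour(value, cutoff=0.2):
    """Map an averaged score to the green / yellow / red scale."""
    if value <= -cutoff:
        return "green"
    if value >= cutoff:
        return "red"
    return "yellow"


# Toy chain geneA - X - geneB; X itself is not differentially expressed.
graph = {"geneA": ["X"], "X": ["geneA", "geneB"], "geneB": ["X"]}
scores = {"geneA": -0.9, "geneB": 0.9, "X": 0.1}
result = propagate(graph, scores)
colours = {node: colour(v) for node, v in result.items()}
```

In the toy chain, the intermediate node X receives equal and opposite contributions from both seeds and therefore ends up neutral, while each seed keeps the colour of its own condition.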
Contreras-López, Orlando; Moyano, Tomás C; Soto, Daniela C; Gutiérrez, Rodrigo A
2018-01-01
The rapid increase in the availability of transcriptomics data generated by RNA sequencing represents both a challenge and an opportunity for biologists without bioinformatics training. The challenge is handling, integrating, and interpreting these data sets. The opportunity is to use this information to generate testable hypotheses to understand molecular mechanisms controlling gene expression and biological processes (Fig. 1). A successful strategy to generate tractable hypotheses from transcriptomics data has been to build undirected network graphs based on patterns of gene co-expression. Many examples of new hypotheses derived from network analyses can be found in the literature, spanning different organisms including plants and specific fields such as root developmental biology. In order to make the process of constructing a gene co-expression network more accessible to biologists, here we provide step-by-step instructions using published RNA-seq experimental data obtained from a public database. Similar strategies have been used in previous studies to advance root developmental biology. This guide includes basic instructions for the operation of widely used open source platforms such as Bio-Linux, R, and Cytoscape. Even though the data we used in this example was obtained from Arabidopsis thaliana, the workflow developed in this guide can be easily adapted to work with RNA-seq data from any organism.
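The core of an undirected co-expression network is a thresholded pairwise correlation. The guide itself works in R and Cytoscape, so the Python sketch below is only an illustration of the principle. It assumes Pearson correlation and an absolute cutoff of 0.9; the expression values and gene names are invented.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def coexpression_edges(expression, cutoff=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= cutoff.

    ASSUMPTION: a fixed absolute-correlation cutoff; the published
    workflow may choose its threshold differently.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= cutoff:
            edges.append((g1, g2, round(r, 3)))
    return edges


# Invented expression profiles across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 1.0, 3.5, 2.0],
}
edges = coexpression_edges(expression)
```

An edge list of this shape can be written to a delimited text file and loaded into a network viewer such as Cytoscape as a table of source, target and weight columns.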
Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering
NASA Technical Reports Server (NTRS)
Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland
2000-01-01
Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review data mining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.
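A discrete Boolean network, the simplest model class mentioned in this review, can be simulated in a few lines. The three-gene wiring below is invented purely for illustration and assumes synchronous updates, in which every gene reads the previous state at the same time.

```python
def step(state, rules):
    """Synchronously update every gene from the current state."""
    return {gene: bool(rule(state)) for gene, rule in rules.items()}


def trajectory(state, rules, n_steps):
    """Return the list of states visited, starting with `state`."""
    states = [state]
    for _ in range(n_steps):
        state = step(state, rules)
        states.append(state)
    return states


# Hypothetical wiring: C represses A, A activates B, A AND B activate C.
rules = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["A"] and s["B"],
}

start = {"A": True, "B": False, "C": False}
states = trajectory(start, rules, n_steps=5)
```

With this wiring the network returns to its starting state after five updates, that is, it sits on a cyclic attractor. Reverse engineering runs in the opposite direction: given observed state sequences, it searches for rules that reproduce them.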
Regulation of behaviorally associated gene networks in worker honey bee ovaries
Wang, Ying; Kocher, Sarah D.; Linksvayer, Timothy A.; Grozinger, Christina M.; Page, Robert E.; Amdam, Gro V.
2012-01-01
SUMMARY Several lines of evidence support genetic links between ovary size and division of labor in worker honey bees. However, it is largely unknown how ovaries influence behavior. To address this question, we first performed transcriptional profiling on worker ovaries from two genotypes that differ in social behavior and ovary size. Then, we contrasted the differentially expressed ovarian genes with six sets of available brain transcriptomes. Finally, we probed behavior-related candidate gene networks in wild-type ovaries of different sizes. We found differential expression in 2151 ovarian transcripts in these artificially selected honey bee strains, corresponding to approximately 20.3% of the predicted gene set of honey bees. Differences in gene expression overlapped significantly with changes in the brain transcriptomes. Differentially expressed genes were associated with neural signal transmission (tyramine receptor, TYR) and ecdysteroid signaling; two independently tested nuclear hormone receptors (HR46 and ftz-f1) were also significantly correlated with ovary size in wild-type bees. We suggest that the correspondence between ovary and brain transcriptomes identified here indicates systemic regulatory networks among hormones (juvenile hormone and ecdysteroids), pheromones (queen mandibular pheromone), reproductive organs and nervous tissues in worker honey bees. Furthermore, robust correlations between ovary size and neural and endocrine response genes are consistent with the hypothesized roles of the ovaries in honey bee behavioral regulation. PMID:22162860
He, Hailong; Mao, Lingzhou; Xu, Peng; Xi, Yanhai; Xu, Ning; Xue, Mingtao; Yu, Jiangming; Ye, Xiaojian
2014-01-10
Ossification of the posterior longitudinal ligament (OPLL) is a disease that causes physical impairment and neurological disorders. The objective of this study was to explore the differentially expressed genes (DEGs) in ligament cells from OPLL patients and to identify potential targets for the clinical prevention and treatment of OPLL. Gene expression data set GSE5464 was downloaded from the Gene Expression Omnibus; DEGs were then screened with the limma package in R, and the functions and pathways altered in OPLL cells compared with normal cells were identified with DAVID (The Database for Annotation, Visualization and Integrated Discovery); finally, an interaction network of the DEGs was constructed with STRING. A total of 1536 DEGs were screened, with 31 down-regulated and 1505 up-regulated genes. The response-to-wounding function and the Toll-like receptor signaling pathway may be involved in the development of OPLL. Genes such as PDGFB and PRDX2 may be involved in OPLL through the response-to-wounding function. Genes enriched in the Toll-like receptor signaling pathway, such as TLR1, TLR5, and TLR7, may be involved in spinal cord injury in OPLL. PIK3R1 was the hub gene with the highest degree in the DEG network, and INSR was one of the genes most closely related to it. OPLL-related genes screened by microarray gene expression profiling and bioinformatics analysis may be helpful for elucidating the mechanism of OPLL.
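Identifying a hub gene as the node with the highest degree, as done for the DEG network above, amounts to counting distinct neighbours. The sketch below uses placeholder node names rather than real STRING interactions, and the helper name `degrees` is hypothetical.

```python
from collections import Counter


def degrees(edges):
    """Count distinct neighbours per node in an undirected edge list.

    Duplicate pairs (in either orientation) and self-loops are ignored.
    """
    unique_pairs = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for pair in unique_pairs:
        for node in pair:
            degree[node] += 1
    return degree


# Placeholder interaction list (NOT real STRING data); the last entry
# repeats the GENE_A / GENE_C interaction in reverse order.
edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_C", "GENE_A"),
]

degree = degrees(edges)
hub, hub_degree = degree.most_common(1)[0]
```

Degree is only one possible centrality measure; it is used here because the abstract explicitly defines the hub by highest degree.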
Lin, Ying; Sibanda, Vusumuzi Leroy; Zhang, Hong-Mei; Hu, Hui; Liu, Hui; Guo, An-Yuan
2015-04-13
Myocardial infarction (MI) is a leading cause of death in the world and many genes are involved in it. Transcription factors (TFs) and microRNAs (miRNAs) are key regulators of gene expression. We hypothesized that miRNAs and TFs might play combinatory regulatory roles in MI. After collecting MI candidate genes and miRNAs from various resources, we constructed a comprehensive MI-specific miRNA-TF co-regulatory network by integrating predicted and experimentally validated TF and miRNA targets. We found that some hub nodes (e.g. miR-16 and miR-26) in this network are important regulators, and that the network can serve as a bridge to interpret the associations among previous results, as shown by the case of miR-29 in this study. We also constructed a regulatory network for MI recurrence and found several important genes (e.g. DAB2, BMP6, miR-320 and miR-103) whose abnormal expression may represent potential regulatory mechanisms and markers of MI recurrence. Finally, we proposed a cellular model to discuss major TF and miRNA regulators together with signaling pathways in MI. This study provides more details on gene expression regulation and regulators involved in MI progression and recurrence. It also links and interprets many previous results.
Tumor SHB gene expression affects disease characteristics in human acute myeloid leukemia.
Jamalpour, Maria; Li, Xiujuan; Cavelier, Lucia; Gustafsson, Karin; Mostoslavsky, Gustavo; Höglund, Martin; Welsh, Michael
2017-10-01
The mouse Shb gene coding for the Src Homology 2-domain containing adapter protein B has recently been placed in the context of BCRABL1-induced myeloid leukemia in mice, and the current study was performed in order to relate SHB to human acute myeloid leukemia (AML). Publicly available AML databases were mined for SHB gene expression and patient survival. SHB gene expression was determined in the Uppsala cohort of AML patients by qPCR. Cell proliferation was determined after SHB gene knockdown in leukemic cell lines. Despite a low frequency of SHB gene mutations, many tumors overexpressed SHB mRNA compared with normal myeloid blood cells. AML patients with tumors expressing low SHB mRNA displayed longer survival times. A subgroup of AML exhibiting a favorable prognosis, acute promyelocytic leukemia (APL) with a PMLRARA translocation, expressed less SHB mRNA than AML tumors in general. When examining genes co-expressed with SHB in AML tumors, four other genes (PAX5, HDAC7, BCORL1, TET1) related to leukemia were identified. A network consisting of these genes plus SHB was identified that relates to certain phenotypic characteristics, such as immune cell, vascular and apoptotic features. SHB knockdown in the APL PMLRARA cell line NB4 and the monocyte/macrophage cell line MM6 adversely affected proliferation, linking SHB gene expression to tumor cell expansion and consequently to patient survival. It is concluded that tumor SHB gene expression relates to survival in AML and its subgroup APL. Moreover, this gene is included in a network of genes that plays a role in an AML phenotype exhibiting certain immune cell, vascular and apoptotic characteristics.
Yan, Yan; Wang, Lianzhe; Ding, Zehong; Tie, Weiwei; Ding, Xupo; Zeng, Changying; Wei, Yunxie; Zhao, Hongliang; Peng, Ming; Hu, Wei
2016-01-01
Mitogen-activated protein kinases (MAPKs) play central roles in plant developmental processes, hormone signaling transduction, and responses to abiotic stress. However, no data are currently available about the MAPK family in cassava, an important tropical crop. Herein, 21 MeMAPK genes were identified from cassava. Phylogenetic analysis indicated that MeMAPKs could be classified into four subfamilies. Gene structure analysis demonstrated that the number of introns in MeMAPK genes ranged from 1 to 10, suggesting large variation among cassava MAPK genes. Conserved motif analysis indicated that all MeMAPKs had typical protein kinase domains. Transcriptomic analysis suggested that MeMAPK genes showed differential expression patterns in distinct tissues and in response to drought stress between wild subspecies and cultivated varieties. Interaction networks and co-expression analyses revealed that crucial pathways controlled by MeMAPK networks may be involved in the differential response to drought stress in different accessions of cassava. Expression of nine selected MAPK genes showed that these genes could comprehensively respond to osmotic, salt, cold, oxidative stressors, and abscisic acid (ABA) signaling. These findings yield new insights into the transcriptional control of MAPK gene expression, provide an improved understanding of abiotic stress responses and signaling transduction in cassava, and lead to potential applications in the genetic improvement of cassava cultivars. PMID:27625666
Zhang, J D; Berntenis, N; Roth, A; Ebeling, M
2014-06-01
Gene signatures of drug-induced toxicity are of broad interest, but they are often identified from small-scale, single-time point experiments, and are therefore of limited applicability. To address this issue, we performed multivariate analysis of gene expression, cell-based assays, and histopathological data in the TG-GATEs (Toxicogenomics Project-Genomics Assisted Toxicity Evaluation system) database. Data mining highlights four genes (EGR1, ATF3, GDF15 and FGF21) that are induced 2 h after drug administration in human and rat primary hepatocytes poised to eventually undergo cytotoxicity-induced cell death. Modelling and simulation reveal that these early stress-response genes form a functional network with evolutionarily conserved structure and intrinsic dynamics. This is underlined by the fact that early induction of this network in vivo predicts drug-induced liver and kidney pathology with high accuracy. Our findings demonstrate the value of early gene-expression signatures in predicting and understanding compound-induced toxicity. The identified network can empower first-line tests that reduce animal use and costs of safety evaluation.
Gap Gene Regulatory Dynamics Evolve along a Genotype Network
Crombach, Anton; Wotton, Karl R.; Jiménez-Guri, Eva; Jaeger, Johannes
2016-01-01
Developmental gene networks implement the dynamic regulatory mechanisms that pattern and shape the organism. Over evolutionary time, the wiring of these networks changes, yet the patterning outcome is often preserved, a phenomenon known as “system drift.” System drift is illustrated by the gap gene network—involved in segmental patterning—in dipteran insects. In the classic model organism Drosophila melanogaster and the nonmodel scuttle fly Megaselia abdita, early activation and placement of gap gene expression domains show significant quantitative differences, yet the final patterning output of the system is essentially identical in both species. In this detailed modeling analysis of system drift, we use gene circuits which are fit to quantitative gap gene expression data in M. abdita and compare them with an equivalent set of models from D. melanogaster. The results of this comparative analysis show precisely how compensatory regulatory mechanisms achieve equivalent final patterns in both species. We discuss the larger implications of the work in terms of “genotype networks” and the ways in which the structure of regulatory networks can influence patterns of evolutionary change (evolvability). PMID:26796549
NASA Astrophysics Data System (ADS)
Pagnuco, Inti A.; Pastore, Juan I.; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L.
2016-04-01
It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, in which significant groups of genes are defined based on some criteria. This task is usually performed by clustering algorithms, where the whole family of genes, or a subset of them, is clustered into meaningful groups based on their expression values in a set of experiments. In this work we used a methodology based on the Silhouette index as a measure of cluster quality for individual gene groups, and a combination of several variants of hierarchical clustering to generate the candidate groups, to obtain sets of co-expressed genes for two real data examples. We analyzed the quality of the best-ranked groups obtained by the algorithm using an online bioinformatics tool that provides network information for the selected genes. Moreover, to verify the performance of the algorithm, considering that it does not find all possible subsets, we compared its results against a full search to determine the number of good co-regulated sets not detected.
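Scoring individual gene groups with the Silhouette index can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance between expression profiles, takes the candidate groups as given labels (in the study they come from hierarchical clustering), and uses made-up profiles. At least two groups are required.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def group_silhouettes(profiles, labels):
    """Return {label: mean silhouette of the genes carrying that label}."""
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    scores = {lab: [] for lab in groups}
    for i, lab in enumerate(labels):
        own = [j for j in groups[lab] if j != i]
        if not own:
            scores[lab].append(0.0)  # usual convention for singleton groups
            continue
        # a: mean distance to the other members of the gene's own group
        a = sum(euclidean(profiles[i], profiles[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other group
        b = min(
            sum(euclidean(profiles[i], profiles[j]) for j in members) / len(members)
            for other, members in groups.items()
            if other != lab
        )
        denom = max(a, b)
        scores[lab].append(0.0 if denom == 0 else (b - a) / denom)
    return {lab: sum(vals) / len(vals) for lab, vals in scores.items()}


# Made-up expression profiles: group "A" is tight, group "B" holds an outlier.
profiles = [[1.0, 1.1, 0.9], [1.1, 1.0, 1.0],
            [5.0, 5.2, 4.9], [5.1, 5.0, 5.0], [3.0, 1.0, 5.0]]
labels = ["A", "A", "B", "B", "B"]
scores = group_silhouettes(profiles, labels)
print(scores)
```

Ranking groups by this per-group mean favors tight, well-separated sets such as "A" over looser ones such as "B".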
Mukherjee, Shubhabrata; Russell, Joshua C; Carr, Daniel T; Burgess, Jeremy D; Allen, Mariet; Serie, Daniel J; Boehme, Kevin L; Kauwe, John S K; Naj, Adam C; Fardo, David W; Dickson, Dennis W; Montine, Thomas J; Ertekin-Taner, Nilufer; Kaeberlein, Matt R; Crane, Paul K
2017-10-01
We sought to determine whether a systems biology approach may identify novel late-onset Alzheimer's disease (LOAD) loci. We performed gene-wide association analyses and integrated results with human protein-protein interaction data using network analyses. We performed functional validation on novel genes using a transgenic Caenorhabditis elegans Aβ proteotoxicity model and evaluated novel genes using brain expression data from people with LOAD and other neurodegenerative conditions. We identified 13 novel candidate LOAD genes outside chromosome 19. Of those, RNA interference knockdowns of the C. elegans orthologs of UBC, NDUFS3, EGR1, and ATP5H were associated with Aβ toxicity, and NDUFS3, SLC25A11, ATP5H, and APP were differentially expressed in the temporal cortex. Network analyses identified novel LOAD candidate genes. We demonstrated a functional role for four of these in a C. elegans model and found enrichment of differentially expressed genes in the temporal cortex. Copyright © 2017 the Alzheimer's Association. Published by Elsevier Inc. All rights reserved.
[Weighted gene co-expression network analysis in biomedicine research].
Liu, Wei; Li, Li; Ye, Hua; Tu, Wei
2017-11-25
High-throughput biological technologies are now widely applied in biology and medicine, allowing scientists to monitor thousands of parameters simultaneously in a specific sample. However, it is still an enormous challenge to mine useful information from high-throughput data. The emergence of network biology provides deeper insights into complex biological systems and reveals the modularity in tissue/cellular networks. Correlation networks are increasingly used in bioinformatics applications. The weighted gene co-expression network analysis (WGCNA) tool can detect clusters of highly correlated genes. Therefore, we systematically reviewed the application of WGCNA in the study of disease diagnosis, pathogenesis and other related fields. First, we introduce the principle, workflow, advantages and disadvantages of WGCNA. Second, we present the application of WGCNA in disease, physiology, drug, evolution and genome annotation. Then, we describe the application of WGCNA in newly developed high-throughput methods. We hope this review will help to promote the application of WGCNA in biomedicine research.
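The "weighted" part of a weighted co-expression network can be sketched in a few lines. The example assumes the unsigned soft-thresholding form, in which the adjacency between two genes is the absolute Pearson correlation raised to a power beta; the value beta = 6 and the expression values are illustrative assumptions, and the WGCNA R package itself is not used here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long profiles (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_adjacency(expr, beta=6):
    """Unsigned weighted adjacency |cor(g, h)| ** beta for every gene pair."""
    genes = sorted(expr)
    adj = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            adj[(g, h)] = abs(pearson(expr[g], expr[h])) ** beta
    return adj


# Made-up expression values across five samples.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
adj = soft_adjacency(expr)
print(adj)
```

Raising the correlation to a power keeps strong correlations close to 1 while pushing weak ones toward 0, so no hard cutoff is needed before module detection.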
Pan, Yu; Bradley, Glyn; Pyke, Kevin; Ball, Graham; Lu, Chungui; Fray, Rupert; Marshall, Alexandra; Jayasuta, Subhalai; Baxter, Charles; van Wijk, Rik; Boyden, Laurie; Cade, Rebecca; Chapman, Natalie H.; Fraser, Paul D.; Hodgman, Charlie; Seymour, Graham B.
2013-01-01
Carotenoids represent some of the most important secondary metabolites in the human diet, and tomato (Solanum lycopersicum) is a rich source of these health-promoting compounds. In this work, a novel and fruit-related regulator of pigment accumulation in tomato has been identified by artificial neural network inference analysis and its function validated in transgenic plants. A tomato fruit gene regulatory network was generated using artificial neural network inference analysis and transcription factor gene expression profiles derived from fruits sampled at various points during development and ripening. One of the transcription factor gene expression profiles with a sequence related to an Arabidopsis (Arabidopsis thaliana) ARABIDOPSIS PSEUDO RESPONSE REGULATOR2-LIKE gene (APRR2-Like) was up-regulated at the breaker stage in wild-type tomato fruits and, when overexpressed in transgenic lines, increased plastid number, area, and pigment content, enhancing the levels of chlorophyll in immature unripe fruits and carotenoids in red ripe fruits. Analysis of the transcriptome of transgenic lines overexpressing the tomato APRR2-Like gene revealed up-regulation of several ripening-related genes in the overexpression lines, providing a link between the expression of this tomato gene and the ripening process. A putative ortholog of the tomato APRR2-Like gene in sweet pepper (Capsicum annuum) was associated with pigment accumulation in fruit tissues. We conclude that the function of this gene is conserved across taxa and that it encodes a protein that has an important role in ripening. PMID:23292788
Hosseini Ashtiani, Saman; Moeini, Ali; Nowzari-Dalini, Abbas; Masoudi-Nejad, Ali
2013-01-01
The goal of this study was to reconstruct a “genome-scale co-expression network” and find important modules in lung adenocarcinoma so that we could identify the genes involved in the disease. We integrated gene mutation, GWAS, CGH, array-CGH and SNP array data in order to identify important genes and loci at genome scale. Afterwards, on the basis of the identified genes, a co-expression network was reconstructed from the co-expression data. The reconstructed network was named the “genome-scale co-expression network”. As the next step, 23 key modules were identified through clustering. By analyzing the modules, a number of genes were implicated in lung adenocarcinoma for the first time. The genes EGFR, PIK3CA, TAF15, XIAP, VAPB, Appl1, Rab5a, ARF4, CLPTM1L, SP4, ZNF124, LPP, FOXP1, SOX18, MSX2, NFE2L2, SMARCC1, TRA2B, CBX3, PRPF6, ATP6V1C1, MYBBP1A, MACF1, GRM2, TBXA2R, PRKAR2A, PTK2, PGF and MYO10 are among the genes that belong to modules 1 and 22. All these genes, being implicated in at least one of cell survival, proliferation and metastasis, have an over-expression pattern similar to that of EGFR. A few modules contain genes such as CCNA2 (Cyclin A2), CCNB2 (Cyclin B2), CDK1, CDK5, CDC27, CDCA5, CDCA8, ASPM, BUB1, KIF15, KIF2C, NEK2, NUSAP1, PRC1, SMC4, SYCE2, TFDP1, CDC42 and ARHGEF9 that play a crucial role in cell cycle progression. In addition to the genes mentioned, the modules contain other genes (e.g. DLGAP5, BIRC5, PSMD2, Src, TTK, SENP2, DOK2 and FUS). PMID:23874428
Transcriptional master regulator analysis in breast cancer genetic networks.
Tovar, Hugo; García-Herrera, Rodrigo; Espinal-Enríquez, Jesús; Hernández-Lemus, Enrique
2015-12-01
Gene regulatory networks account for the delicate mechanisms that control gene expression. Under certain circumstances, gene regulatory programs may give rise to amplification cascades. Such transcriptional cascades are events in which activation of key-responsive transcription factors called master regulators triggers a series of gene expression events. The action of transcriptional master regulators is then important for the establishment of certain programs like cell development and differentiation. However, such cascades have also been related to the onset and maintenance of cancer phenotypes. Here we present a systematic implementation of a series of algorithms aimed at the inference of a gene regulatory network and analysis of transcriptional master regulators in the context of primary breast cancer cells. Such studies were performed in a highly curated database of 880 microarray gene expression experiments on biopsy-captured tissue corresponding to primary breast cancer and healthy controls. Biological function and biochemical pathway enrichment analyses were also performed to study the role that the processes controlled, at the transcriptional level, by such master regulators may have in relation to primary breast cancer. We found that transcription factors such as AGTR2, ZNF132, TFDP3 and others are master regulators in this gene regulatory network. Sets of genes controlled by these regulators are involved in processes that are well-known hallmarks of cancer. This kind of analysis may help to understand the most upstream events in the development of phenotypes, in particular those regarding cancer biology. Copyright © 2015 Elsevier Ltd. All rights reserved.
Shchetynsky, Klementy; Diaz-Gallo, Lina-Marcella; Folkersen, Lasse; Hensvold, Aase Haj; Catrina, Anca Irinel; Berg, Louise; Klareskog, Lars; Padyukov, Leonid
2017-02-02
Here we integrate verified signals from previous genetic association studies with gene expression and pathway analysis for discovery of new candidate genes and signaling networks relevant for rheumatoid arthritis (RA). RNA-sequencing (RNA-seq)-based expression analysis of 377 genes from previously verified RA-associated loci was performed in blood cells from 5 newly diagnosed, non-treated patients with RA, 7 patients with treated RA and 12 healthy controls. Differentially expressed genes sharing a similar expression pattern in treated and untreated RA sub-groups were selected for pathway analysis. A set of "connector" genes derived from pathway analysis was tested for differential expression in the initial discovery cohort and validated in blood cells from 73 patients with RA and in 35 healthy controls. There were 11 qualifying genes selected for pathway analysis and these were grouped into two evidence-based functional networks, containing 29 and 27 additional connector molecules. The expression of genes corresponding to connector molecules was then tested in the initial RNA-seq data. Differences in the expression of ERBB2, TP53 and THOP1 were similar in both treated and non-treated patients with RA, and an additional nine genes were differentially expressed in at least one group of patients compared to healthy controls. The ERBB2, TP53 and THOP1 expression profile was successfully replicated in RNA-seq data from peripheral blood mononuclear cells from healthy controls and non-treated patients with RA, in an independent collection of samples. Integration of RNA-seq data with findings from association studies, and consequent pathway analysis, implicate new candidate genes, ERBB2, TP53 and THOP1, in the pathogenesis of RA.
Chatterjee, Sumantra; Kapoor, Ashish; Akiyama, Jennifer A.; ...
2016-09-29
Common sequence variants in cis-regulatory elements (CREs) are suspected etiological causes of complex disorders. We previously identified an intronic enhancer variant in the RET gene disrupting SOX10 binding and increasing Hirschsprung disease (HSCR) risk 4-fold. We now show that two other functionally independent CRE variants, one binding Gata2 and the other binding Rarb, also reduce Ret expression and increase risk 2- and 1.7-fold. By studying human and mouse fetal gut tissues and cell lines, we demonstrate that reduced RET expression propagates throughout its gene regulatory network, exerting effects on both its positive and negative feedback components. We also provide evidence that the presence of a combination of CRE variants synergistically reduces RET expression and its effects throughout the GRN. These studies show how the effects of functionally independent non-coding variants in a coordinated gene regulatory network amplify their individually small effects, providing a model for complex disorders.
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.
Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan
2014-01-01
One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network, in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene, and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available.
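Rankwise averaging of predictions from several methods can be sketched as below. This is a hedged illustration rather than the published NIMEFI code: it assumes each method returns an importance score for the same set of candidate (regulator, target) edges, converts scores to ranks (1 = most important), and averages the ranks. The method tables, gene names and scores are hypothetical.

```python
def ranks(scores):
    """Map each edge to its rank, 1 being the highest score (ties broken by edge name)."""
    ordered = sorted(scores, key=lambda edge: (-scores[edge], edge))
    return {edge: pos for pos, edge in enumerate(ordered, start=1)}


def rank_average(score_tables):
    """Average per-method ranks and return (edges sorted best-first, mean ranks)."""
    edges = set(score_tables[0])
    if any(set(table) != edges for table in score_tables):
        raise ValueError("every method must score the same candidate edges")
    per_method = [ranks(table) for table in score_tables]
    mean_rank = {e: sum(r[e] for r in per_method) / len(per_method) for e in edges}
    order = sorted(edges, key=lambda e: (mean_rank[e], e))
    return order, mean_rank


# Hypothetical importance scores from three feature-importance methods.
method_a = {("TF1", "g1"): 0.9, ("TF1", "g2"): 0.2, ("TF2", "g1"): 0.5}
method_b = {("TF1", "g1"): 0.4, ("TF1", "g2"): 0.1, ("TF2", "g1"): 0.7}
method_c = {("TF1", "g1"): 0.8, ("TF1", "g2"): 0.3, ("TF2", "g1"): 0.6}

order, mean_rank = rank_average([method_a, method_b, method_c])
print(order)
```

Averaging ranks instead of raw scores sidesteps the fact that importance values from different methods are not on a common scale.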
Ye, R; Carneiro, A M D; Han, Q; Airey, D; Sanders-Bush, E; Zhang, B; Lu, L; Williams, R; Blakely, R D
2014-03-01
Presynaptic serotonin (5-hydroxytryptamine, 5-HT) transporters (SERT) regulate 5-HT signaling via antidepressant-sensitive clearance of released neurotransmitter. Polymorphisms in the human SERT gene (SLC6A4) have been linked to risk for multiple neuropsychiatric disorders, including depression, obsessive-compulsive disorder and autism. Using BXD recombinant inbred mice, a genetic reference population that can support the discovery of novel determinants of complex traits, merging collective trait assessments with bioinformatics approaches, we examine phenotypic and molecular networks associated with SERT gene and protein expression. Correlational analyses revealed a network of genes that significantly associated with SERT mRNA levels. We quantified SERT protein expression levels and identified region- and gender-specific quantitative trait loci (QTLs), one of which associated with male midbrain SERT protein expression, centered on the protocadherin-15 gene (Pcdh15), overlapped with a QTL for midbrain 5-HT levels. Pcdh15 was also the only QTL-associated gene whose midbrain mRNA expression significantly associated with both SERT protein and 5-HT traits, suggesting an unrecognized role of the cell adhesion protein in the development or function of 5-HT neurons. To test this hypothesis, we assessed SERT protein and 5-HT traits in the Pcdh15 functional null line (Pcdh15(av-3J)), studies that revealed a strong, negative influence of Pcdh15 on these phenotypes. Together, our findings illustrate the power of multidimensional profiling of recombinant inbred lines in the analysis of molecular networks that support synaptic signaling, and that, as in the case of Pcdh15, can reveal novel relationships that may underlie risk for mental illness. © 2014 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.
Predicting hepatocellular carcinoma through cross-talk genes identified by risk pathways
Shao, Zhuo; Huo, Diwei; Zhang, Denan; Xie, Hongbo; Yang, Jingbo; Liu, Qiuqi; Chen, Xiujie
2018-01-01
Hepatocellular carcinoma (HCC) is the most frequent type of liver cancer, with a poor survival rate and high mortality. Despite efforts to understand the mechanism of HCC, new molecular markers are needed for accurate diagnosis, evaluation and treatment. Here, we combined the HCC transcriptome with networks and pathways to identify reliable molecular markers. By integrating 249 differentially expressed genes with syncretic protein interaction networks, we constructed an HCC-specific network, from which we further extracted 480 pivotal genes. Based on the cross-talk between the enriched pathways of the pivotal genes, we finally identified an HCC signature of 45 genes, which could accurately distinguish HCC patients from normal individuals and reveal the prognosis of HCC patients. Among these 45 genes, 15 showed dysregulated expression patterns and some have been reported to be associated with HCC and/or other cancers. These findings suggest that the identified 45-gene signature could provide potential and valuable molecular markers for the diagnosis and evaluation of HCC. PMID:29765536
Liu, Jie; Xie, Yaxiong; Ducharme, Danica M K; Shen, Jun; Diwan, Bhalchandra A; Merrick, B Alex; Grissom, Sherry F; Tucker, Charles J; Paules, Richard S; Tennant, Raymond; Waalkes, Michael P
2006-03-01
Our previous work has shown that exposure to inorganic arsenic in utero produces hepatocellular carcinoma (HCC) in adult male mice. To explore further the molecular mechanisms of transplacental arsenic hepatocarcinogenesis, we conducted a second arsenic transplacental carcinogenesis study and used a genomewide microarray to profile arsenic-induced aberrant gene expression more extensively. Briefly, pregnant C3H mice were given drinking water containing 85 ppm arsenic as sodium arsenite or unaltered water from days 8 to 18 of gestation. The incidence of HCC in adult male offspring was increased 4-fold and tumor multiplicity 3-fold after transplacental arsenic exposure. Samples of normal liver and liver tumors were taken at autopsy for genomic analysis. Arsenic exposure in utero resulted in significant alterations (p < 0.001) in the expression of 2,010 genes in arsenic-exposed liver samples and in the expression of 2,540 genes in arsenic-induced HCC. Ingenuity Pathway Analysis revealed that significant alterations in gene expression occurred in a number of biological networks, and Myc plays a critical role in one of the primary networks. Real-time reverse transcriptase-polymerase chain reaction and Western blot analysis of selected genes/proteins showed > 90% concordance. Arsenic-altered gene expression included activation of oncogenes and HCC biomarkers, and increased expression of cell proliferation-related genes, stress proteins, and insulin-like growth factors and genes involved in cell-cell communications. Liver feminization was evidenced by increased expression of estrogen-linked genes and altered expression of genes that encode gender-related metabolic enzymes. These novel findings are in agreement with the biology and histology of arsenic-induced HCC, thereby indicating that multiple genetic events are associated with transplacental arsenic hepatocarcinogenesis.
A network of heterochronic genes including Imp1 regulates temporal changes in stem cell properties
Nishino, Jinsuke; Kim, Sunjung; Zhu, Yuan; Zhu, Hao; Morrison, Sean J
2013-01-01
Stem cell properties change over time to match the changing growth and regeneration demands of tissues. We showed previously that adult forebrain stem cell function declines during aging because of increased expression of let-7 microRNAs, evolutionarily conserved heterochronic genes that reduce HMGA2 expression. Here we asked whether let-7 targets also regulate changes between fetal and adult stem cells. We found a second let-7 target, the RNA binding protein IMP1, that is expressed by fetal, but not adult, neural stem cells. IMP1 expression was promoted by Wnt signaling and Lin28a expression and opposed by let-7 microRNAs. Imp1-deficient neural stem cells were prematurely depleted in the dorsal telencephalon due to accelerated differentiation, impairing pallial expansion. IMP1 post-transcriptionally inhibited the expression of differentiation-associated genes while promoting the expression of self-renewal genes, including Hmga2. A network of heterochronic gene products including Lin28a, let-7, IMP1, and HMGA2 thus regulates temporal changes in stem cell properties. DOI: http://dx.doi.org/10.7554/eLife.00924.001 PMID:24192035
Functional clustering of time series gene expression data by Granger causality
2012-01-01
Background: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
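The idea behind Granger causality can be illustrated with a pairwise, single-lag sketch. Note that this is a simplification: the study works with sets of time series, whereas the code below only asks whether the previous value of one series x improves an ordinary least-squares prediction of another series y beyond y's own previous value, and summarizes the improvement as an F statistic. The simulated series and all function names are assumptions made for illustration.

```python
import random


def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def rss(rows, target):
    """Residual sum of squares of a least-squares fit of target on the design rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, target))


def granger_f(x, y):
    """F statistic for 'x Granger-causes y' using one lag of each series."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]       # own past only
    full = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]   # own past + past of x
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    dof = len(target) - 3  # observations minus parameters of the full model
    return (rss_r - rss_f) / (rss_f / dof)


# Simulated series in which y follows x with a delay of one time step.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 0.1) for t in range(1, 200)]

f_xy = granger_f(x, y)  # past of x helps predict y: large F
f_yx = granger_f(y, x)  # past of y does not help predict x: small F
print(f_xy, f_yx)
```

In a clustering setting, such pairwise statistics could serve as edge weights of a directed graph; how the study aggregates them over gene sets is not reproduced here.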
Hafemeister, Christoph; Nicotra, Adrienne B.; Jagadish, S.V. Krishna; Bonneau, Richard; Purugganan, Michael
2016-01-01
Environmental gene regulatory influence networks (EGRINs) coordinate the timing and rate of gene expression in response to environmental signals. EGRINs encompass many layers of regulation, which culminate in changes in accumulated transcript levels. Here, we inferred EGRINs for the response of five tropical Asian rice (Oryza sativa) cultivars to high temperatures, water deficit, and agricultural field conditions by systematically integrating time-series transcriptome data, patterns of nucleosome-free chromatin, and the occurrence of known cis-regulatory elements. First, we identified 5447 putative target genes for 445 transcription factors (TFs) by connecting TFs with genes harboring known cis-regulatory motifs in nucleosome-free regions proximal to their transcriptional start sites. We then used network component analysis to estimate the regulatory activity for each TF based on the expression of its putative target genes. Finally, we inferred an EGRIN using the estimated transcription factor activity (TFA) as the regulator. The EGRINs include regulatory interactions between 4052 target genes regulated by 113 TFs. We resolved distinct regulatory roles for members of the heat shock factor family, including a putative regulatory connection between abiotic stress and the circadian clock. TFA estimation using network component analysis is an effective way of incorporating multiple genome-scale measurements into network inference. PMID:27655842
Slattery, Martha L; Pellatt, Daniel F; Mullany, Lila E; Wolff, Roger K
2015-01-01
Several diet and lifestyle factors may impact health by influencing oxidative stress levels. We hypothesize that level of cigarette smoking, alcohol, anti-inflammatory drugs, and diet alter gene expression. We analyzed RNA-seq data from 144 colon cancer patients who had information on recent cigarette smoking, recent alcohol consumption, diet, and recent aspirin/non-steroidal anti-inflammatory use. Using a false discovery rate of 0.1, we evaluated gene differential expression between high and low levels of exposure using DESeq2. Ingenuity Pathway Analysis (IPA) was used to determine networks associated with de-regulated genes in our data. We identified 46 deregulated genes associated with recent cigarette use; these genes enriched causal networks regulated by TEK and MAP2K3. Distinct differentially expressed genes were associated with each type of alcohol intake; five genes were associated with total alcohol, six were associated with beer intake, six were associated with wine intake, and four were associated with liquor consumption. Recent use of aspirin and/or ibuprofen was associated with differential expression of TMC06, ST8SIA4, and STEAP3, while a summary oxidative balance score (OBS) was associated with SYCP3, HDX, and NRG4 (all up-regulated with greater oxidative balance). Of the dietary antioxidants and carotenoids evaluated, only intake of beta carotene (1 gene), lutein/zeaxanthin (5 genes), and vitamin E (4 genes) was associated with differential gene expression. There were similarities in biological function of de-regulated genes associated with various dietary and lifestyle factors. Our data support the hypothesis that diet and lifestyle factors associated with oxidative stress can alter gene expression. However, the genes altered were unique to type of alcohol and type of antioxidant. Because of potential differences in associations observed between platforms, these findings need replication in other populations.
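Selecting genes at a false discovery rate of 0.1 can be sketched with the Benjamini-Hochberg adjustment, which is the multiple-testing correction commonly applied to per-gene p-values in differential expression analysis. The sketch is not the DESeq2 code, and the p-values are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented per-gene p-values.
pvals = [0.001, 0.02, 0.03, 0.5]
adjusted = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(adjusted) if q < 0.1]
print(adjusted, significant)
```

With a threshold of 0.1, roughly one in ten of the genes called significant is expected to be a false positive, which is one reason the abstract asks for replication.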
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zambon, Alexander C.; Zhang, Lingzhi; Minovitsky, Simon
Although a substantial number of hormones and drugs increase cellular cAMP levels, the global impact of cAMP and its major effector mechanism, protein kinase A (PKA), on gene expression is not known. Here we show that treatment of murine wild-type S49 lymphoma cells for 24 h with 8-(4-chlorophenylthio)-cAMP (8-CPT-cAMP), a PKA-selective cAMP analog, alters the expression of ≈4,500 of ≈13,600 unique genes. By contrast, gene expression was unaltered in Kin- S49 cells (that lack PKA) incubated with 8-CPT-cAMP. Changes in mRNA and protein expression of several cell cycle regulators accompanied cAMP-induced G1-phase cell-cycle arrest of wild-type S49 cells. Within 2 h, 8-CPT-cAMP altered expression of 152 genes that contain evolutionarily conserved cAMP-response elements within 5 kb of transcriptional start sites, including the circadian clock gene Per1. Thus, cAMP through its activation of PKA produces extensive transcriptional regulation in eukaryotic cells. These transcriptional networks include a primary group of cAMP-response element-containing genes and secondary networks that include the circadian clock.
Williamson, Cait M.; Franks, Becca; Curley, James P.
2016-01-01
Laboratory studies of social behavior have typically focused on dyadic interactions occurring within a limited spatiotemporal context. However, this strategy prevents analyses of the dynamics of group social behavior and constrains identification of the biological pathways mediating individual differences in behavior. In the current study, we aimed to identify the spatiotemporal dynamics and hierarchical organization of a large social network of male mice. We also sought to determine if standard assays of social and exploratory behavior are predictive of social behavior in this social network and whether individual network position was associated with the mRNA expression of two plasticity-related genes, DNA methyltransferase 1 and 3a. Mice were observed to form a hierarchically organized social network and self-organized into two separate social network communities. Members of both communities exhibited distinct patterns of socio-spatial organization within the vivaria that were not limited to agonistic interactions. We further established that exploratory and social behaviors in standard behavioral assays conducted prior to placing the mice into the large group were predictive of initial network position and behavior but were not associated with final social network position. Finally, we determined that social network position is associated with variation in mRNA levels of two neural plasticity genes, DNMT1 and DNMT3a, in the hippocampus but not the mPOA. This work demonstrates the importance of understanding the role of social context and complex social dynamics in determining the relationship between individual differences in social behavior and brain gene expression. PMID:27540359
Protein interaction networks from literature mining
NASA Astrophysics Data System (ADS)
Ihara, Sigeo
2005-03-01
The ability to accurately predict and understand physiological changes in the biological network system in response to disease or drug therapeutics is of crucial importance in life science. The extensive amount of gene expression data generated from even a single microarray experiment often proves difficult to fully interpret and comprehend the biological significance. An increasing knowledge of protein interactions stored in the PubMed database, as well as the advancement of natural language processing, however, makes it possible to construct protein interaction networks from the gene expression information that are essential for understanding the biological meaning. From the in house literature mining system we have developed, the protein interaction network for humans was constructed. By analysis based on the graph-theoretical characterization of the total interaction network in literature, we found that the network is scale-free and semantic long-ranged interactions (i.e. inhibit, induce) between proteins dominate in the total interaction network, reducing the degree exponent. Interaction networks generated based on scientific text in which the interaction event is ambiguously described result in disconnected networks. In contrast interaction networks based on text in which the interaction events are clearly stated result in strongly connected networks. The results of protein-protein interaction networks obtained in real applications from microarray experiments are discussed: For example, comparisons of the gene expression data indicative of either a good or a poor prognosis for acute lymphoblastic leukemia with MLL rearrangements, using our system, showed newly discovered signaling cross-talk.
Begum, Tina; Ghosh, Tapash Chandra
2014-10-05
To date, numerous studies have been attempted to determine the extent of variation in evolutionary rates between human disease and nondisease (ND) genes. In our present study, we have considered human autosomal monogenic (Mendelian) disease genes, which were classified into two groups according to the number of phenotypic defects, that is, specific disease (SPD) gene (one gene: one defect) and shared disease (SHD) gene (one gene: multiple defects). Here, we have compared the evolutionary rates of these two groups of genes, that is, SPD genes and SHD genes with respect to ND genes. We observed that the average evolutionary rates are slow in SHD group, intermediate in SPD group, and fast in ND group. Group-to-group evolutionary rate differences remain statistically significant regardless of their gene expression levels and number of defects. We demonstrated that disease genes are under strong selective constraint if they emerge through edgetic perturbation or drug-induced perturbation of the interactome network, show tissue-restricted expression, and are involved in transmembrane transport. Among all the factors, our regression analyses interestingly suggest the independent effects of 1) drug-induced perturbation and 2) the interaction term of expression breadth and transmembrane transport on protein evolutionary rates. We reasoned that the drug-induced network disruption is a combination of several edgetic perturbations and, thus, has more severe effect on gene phenotypes. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Network Security via Biometric Recognition of Patterns of Gene Expression
NASA Technical Reports Server (NTRS)
Shaw, Harry C.
2016-01-01
Molecular biology provides the ability to implement forms of information and network security completely outside the bounds of legacy security protocols and algorithms. This paper addresses an approach which instantiates the power of gene expression for security. Molecular biology provides a rich source of gene expression and regulation mechanisms, which can be adopted to use in the information and electronic communication domains. Conventional security protocols are becoming increasingly vulnerable due to more intensive, highly capable attacks on the underlying mathematics of cryptography. Security protocols are being undermined by social engineering and substandard implementations by IT (Information Technology) organizations. Molecular biology can provide countermeasures to these weak points with the current security approaches. Future advances in instruments for analyzing assays will also enable this protocol to advance from one of cryptographic algorithms to an integrated system of cryptographic algorithms and real-time assays of gene expression products.
METscout: a pathfinder exploring the landscape of metabolites, enzymes and transporters.
Geffers, Lars; Tetzlaff, Benjamin; Cui, Xiao; Yan, Jun; Eichele, Gregor
2013-01-01
METscout (http://metscout.mpg.de) brings together metabolism and gene expression landscapes. It is a MySQL relational database linking biochemical pathway information with 3D patterns of gene expression determined by robotic in situ hybridization in the E14.5 mouse embryo. The sites of expression of ∼1500 metabolic enzymes and of ∼350 solute carriers (SLCs) were included and are accessible as single cell resolution images and in the form of semi-quantitative image abstractions. METscout provides several graphical web-interfaces allowing navigation through complex anatomical and metabolic information. Specifically, the database shows where in the organism each of the many metabolic reactions take place and where SLCs transport metabolites. To link enzymatic reactions and transport, the KEGG metabolic reaction network was extended to include metabolite transport. This network in conjunction with spatial expression pattern of the network genes allows for a tracing of metabolic reactions and transport processes across the entire body of the embryo.
Yunoki, Tatsuya; Tabuchi, Yoshiaki; Hayashi, Atsushi; Kondo, Takashi
2016-07-01
BCL2-associated athanogene 3 (BAG3), a co-chaperone of the heat shock 70 kDa protein (HSPA) family of proteins, is a cytoprotective protein that acts against various stresses, including heat stress. The aim of the present study was to identify gene networks involved in the enhancement of hyperthermia (HT) sensitivity by the knockdown (KD) of BAG3 in human oral squamous cell carcinoma (OSCC) cells. Although a marked elevation in the protein expression of BAG3 was detected in the human OSCC HSC-3 cells exposed to HT at 44 °C for 90 min, its expression was almost completely suppressed in the cells transfected with small interfering RNA against BAG3 (siBAG) under normal and HT conditions. The silencing of BAG3 also enhanced the cell death that was increased in the HSC-3 cells by exposure to HT. Global gene expression analysis revealed many genes that were differentially expressed by >2-fold in the cells exposed to HT and transfected with siBAG. Moreover, Ingenuity® pathways analysis demonstrated two unique gene networks, designated as Pro-cell death and Anti-cell death, which were obtained from upregulated genes and were mainly associated with the biological functions of induction and the prevention of cell death, respectively. Of note, the expression levels of genes in the Pro-cell death and Anti-cell death gene networks were significantly elevated and reduced in the HT + BAG3-KD group compared to those in the HT control group, respectively. These results provide further insight into the molecular mechanisms involved in the enhancement of HT sensitivity by the silencing of BAG3 in human OSCC cells.
2012-01-01
Background Starch serves as a temporal storage of carbohydrates in plant leaves during day/night cycles. To study transcriptional regulatory modules of this dynamic metabolic process, we conducted gene regulation network analysis based on small-sample inference of graphical Gaussian model (GGM). Results Time-series significant analysis was applied for Arabidopsis leaf transcriptome data to obtain a set of genes that are highly regulated under a diurnal cycle. A total of 1,480 diurnally regulated genes included 21 starch metabolic enzymes, 6 clock-associated genes, and 106 transcription factors (TF). A starch-clock-TF gene regulation network comprising 117 nodes and 266 edges was constructed by GGM from these 133 significant genes that are potentially related to the diurnal control of starch metabolism. From this network, we found that β-amylase 3 (b-amy3: At4g17090), which participates in starch degradation in chloroplast, is the most frequently connected gene (a hub gene). The robustness of gene-to-gene regulatory network was further analyzed by TF binding site prediction and by evaluating global co-expression of TFs and target starch metabolic enzymes. As a result, two TFs, indeterminate domain 5 (AtIDD5: At2g02070) and constans-like (COL: At2g21320), were identified as positive regulators of starch synthase 4 (SS4: At4g18240). The inference model of AtIDD5-dependent positive regulation of SS4 gene expression was experimentally supported by decreased SS4 mRNA accumulation in Atidd5 mutant plants during the light period of both short and long day conditions. COL was also shown to positively control SS4 mRNA accumulation. Furthermore, the knockout of AtIDD5 and COL led to deformation of chloroplast and its contained starch granules. This deformity also affected the number of starch granules per chloroplast, which increased significantly in both knockout mutant lines. 
Conclusions In this study, we utilized a systematic approach of microarray analysis to discover the transcriptional regulatory network of starch metabolism in Arabidopsis leaves. With this inference method, the starch regulatory network of Arabidopsis was found to be strongly associated with clock genes and TFs, of which AtIDD5 and COL were evidenced to control SS4 gene expression and starch granule formation in chloroplasts. PMID:22898356
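As a rough illustration of the graphical Gaussian model (GGM) idea used above, the sketch below derives partial correlations from the inverse of a correlation matrix and keeps gene pairs whose partial correlation exceeds a cutoff. It is a simplified sketch, not the authors' procedure: the study used a small-sample inference method, whereas this example inverts a small, well-conditioned matrix directly. The correlation values, the cutoff, and all function names are hypothetical.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(corr):
    """Partial correlation of each pair given all other variables."""
    omega = invert(corr)  # precision (inverse correlation) matrix
    n = len(corr)
    return [[1.0 if i == j
             else -omega[i][j] / math.sqrt(omega[i][i] * omega[j][j])
             for j in range(n)] for i in range(n)]


def ggm_edges(corr, cutoff=0.3):
    """Keep pairs whose absolute partial correlation exceeds the cutoff."""
    pc = partial_correlations(corr)
    n = len(corr)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(pc[i][j]) > cutoff]


# Hypothetical chain A - B - C: A and C correlate only through B.
chain = [[1.0, 0.8, 0.64],
         [0.8, 1.0, 0.8],
         [0.64, 0.8, 1.0]]
edges = ggm_edges(chain)
```

In this toy chain the marginal correlation between the first and third variable is 0.64, but their partial correlation given the second is zero, so only the two direct edges survive. This is the property that makes GGM edges easier to read as direct associations than plain correlation edges.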
Ruzicka, W. Brad; Subburaju, Sivan; Benes, Francine M.
2017-01-01
IMPORTANCE Dysfunction related to γ-aminobutyric acid (GABA)–ergic neurotransmission in the pathophysiology of major psychosis has been well established by the work of multiple groups across several decades, including the widely replicated downregulation of GAD1. Prior gene expression and network analyses within the human hippocampus implicate a broader network of genes, termed the GAD1 regulatory network, in regulation of GAD1 expression. Several genes within this GAD1 regulatory network show diagnosis- and sector-specific expression changes within the circuitry of the hippocampus, influencing abnormal GAD1 expression in schizophrenia and bipolar disorder. OBJECTIVE To investigate the hypothesis that aberrant DNA methylation contributes to circuit- and diagnosis-specific abnormal expression of GAD1 regulatory network genes in psychotic illness. DESIGN, SETTING, AND PARTICIPANTS This epigenetic association study targeting GAD1 regulatory network genes was conducted between July 1, 2012, and June 30, 2014. Postmortem human hippocampus tissue samples were obtained from 8 patients with schizophrenia, 8 patients with bipolar disorder, and 8 healthy control participants matched for age, sex, postmortem interval, and other potential confounds from the Harvard Brain Tissue Resource Center, McLean Hospital, Belmont, Massachusetts. We extracted DNA from laser-microdissected stratum oriens tissue of cornu ammonis 2/3 (CA2/3) and CA1 postmortem human hippocampus, bisulfite modified it, and assessed it with the Infinium HumanMethylation450 BeadChip (Illumina, Inc). The subset of CpG loci associated with GAD1 regulatory network genes was analyzed in R version 3.1.0 software (R Foundation) using the minfi package. Findings were validated using bisulfite pyrosequencing.
MAIN OUTCOMES AND MEASURES Methylation levels at 1308 GAD1 regulatory network–associated CpG loci were assessed both as individual sites to identify differentially methylated positions and by sharing information among colocalized probes to identify differentially methylated regions. RESULTS A total of 146 differentially methylated positions with a false detection rate lower than 0.05 were identified across all 6 groups (2 circuit locations in each of 3 diagnostic categories), and 54 differentially methylated regions with P < .01 were identified in single-group comparisons. Methylation changes were enriched in MSX1, CCND2, and DAXX at specific loci within the hippocampus of patients with schizophrenia and bipolar disorder. CONCLUSIONS AND RELEVANCE This work demonstrates diagnosis- and circuit-specific DNA methylation changes at a subset of GAD1 regulatory network genes in the human hippocampus in schizophrenia and bipolar disorder. These genes participate in chromatin regulation and cell cycle control, supporting the concept that the established GABAergic dysfunction in these disorders is related to disruption of GABAergic interneuron physiology at specific circuit locations within the human hippocampus. PMID:25738424
Modeling Bi-modality Improves Characterization of Cell Cycle on Gene Expression in Single Cells
Danaher, Patrick; Finak, Greg; Krouse, Michael; Wang, Alice; Webster, Philippa; Beechem, Joseph; Gottardo, Raphael
2014-01-01
Advances in high-throughput, single cell gene expression are allowing interrogation of cell heterogeneity. However, there is concern that the cell cycle phase of a cell might bias characterizations of gene expression at the single-cell level. We assess the effect of cell cycle phase on gene expression in single cells by measuring 333 genes in 930 cells across three phases and three cell lines. We determine each cell's phase non-invasively without chemical arrest and use it as a covariate in tests of differential expression. We observe bi-modal gene expression, a previously-described phenomenon, wherein the expression of otherwise abundant genes is either strongly positive, or undetectable within individual cells. This bi-modality is likely both biologically and technically driven. Irrespective of its source, we show that it should be modeled to draw accurate inferences from single cell expression experiments. To this end, we propose a semi-continuous modeling framework based on the generalized linear model, and use it to characterize genes with consistent cell cycle effects across three cell lines. Our new computational framework improves the detection of previously characterized cell-cycle genes compared to approaches that do not account for the bi-modality of single-cell data. We use our semi-continuous modelling framework to estimate single cell gene co-expression networks. These networks suggest that in addition to having phase-dependent shifts in expression (when averaged over many cells), some, but not all, canonical cell cycle genes tend to be co-expressed in groups in single cells. We estimate the amount of single cell expression variability attributable to the cell cycle. We find that the cell cycle explains only 5%–17% of expression variability, suggesting that the cell cycle will not tend to be a large nuisance factor in analysis of the single cell transcriptome. PMID:25032992
Effect of Temperature on Synthetic Positive and Negative Feedback Gene Networks
NASA Astrophysics Data System (ADS)
Charlebois, Daniel A.; Marshall, Sylvia; Balazsi, Gabor
Synthetic biological systems are built and tested under well controlled laboratory conditions. How altering the environment, such as the ambient temperature, affects their function is not well understood. To address this question for synthetic gene networks with positive and negative feedback, we used mathematical modeling coupled with experiments in the budding yeast Saccharomyces cerevisiae. We found that cellular growth rates and gene expression dose responses change significantly at temperatures above and below the physiological optimum for yeast. Gene expression distributions for the negative feedback-based circuit changed from unimodal to bimodal at high temperature, while the bifurcation point of the positive feedback circuit shifted up with temperature. These results demonstrate that synthetic gene network function is context-dependent. Temperature effects should thus be tested and incorporated into their design and validation for real-world applications. NSERC Postdoctoral Fellowship (Grant No. PDF-453977-2014).
Comparison of co-expression measures: mutual information, correlation, and model based indices.
Song, Lin; Langfelder, Peter; Horvath, Steve
2012-12-09
Co-expression measures are often used to define networks among genes. Mutual information (MI) is often used as a generalized correlation measure. It is not clear how much MI adds beyond standard (robust) correlation measures or regression model based association measures. Further, it is important to assess what transformations of these and other co-expression measures lead to biologically meaningful modules (clusters of genes). We provide a comprehensive comparison between mutual information and several correlation measures in 8 empirical data sets and in simulations. We also study different approaches for transforming an adjacency matrix, e.g. using the topological overlap measure. Overall, we confirm close relationships between MI and correlation in all data sets which reflects the fact that most gene pairs satisfy linear or monotonic relationships. We discuss rare situations when the two measures disagree. We also compare correlation and MI based approaches when it comes to defining co-expression network modules. We show that a robust measure of correlation (the biweight midcorrelation transformed via the topological overlap transformation) leads to modules that are superior to MI based modules and maximal information coefficient (MIC) based modules in terms of gene ontology enrichment. We present a function that relates correlation to mutual information which can be used to approximate the mutual information from the corresponding correlation coefficient. We propose the use of polynomial or spline regression models as an alternative to MI for capturing non-linear relationships between quantitative variables. The biweight midcorrelation outperforms MI in terms of elucidating gene pairwise relationships. Coupled with the topological overlap matrix transformation, it often leads to more significantly enriched co-expression modules. Spline and polynomial networks form attractive alternatives to MI in case of non-linear relationships. 
Our results indicate that MI networks can safely be replaced by correlation networks when it comes to measuring co-expression relationships in stationary data.
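The two ingredients highlighted above, the biweight midcorrelation and the topological overlap transformation, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the raw (unscaled) median absolute deviation, raises an error when that deviation is zero, and expects a symmetric adjacency matrix with entries in [0, 1]. All function names are hypothetical.

```python
import math
from statistics import median


def _biweight_terms(values):
    """Centre on the median, down-weight outliers, and scale to unit norm."""
    med = median(values)
    mad = median([abs(v - med) for v in values])  # raw MAD (assumption)
    if mad == 0:
        raise ValueError("MAD is zero; bicor is undefined for this vector")
    terms = []
    for v in values:
        u = (v - med) / (9.0 * mad)
        weight = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        terms.append((v - med) * weight)
    norm = math.sqrt(sum(t * t for t in terms))
    return [t / norm for t in terms]


def bicor(x, y):
    """Biweight midcorrelation of two equally long expression vectors."""
    return sum(a * b for a, b in zip(_biweight_terms(x), _biweight_terms(y)))


def topological_overlap(adj):
    """Topological overlap matrix of a symmetric adjacency matrix."""
    n = len(adj)
    degree = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (
                min(degree[i], degree[j]) + 1.0 - adj[i][j])
    return tom


# Hypothetical profiles: y follows x except for one gross outlier.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 2, 3, 4, 5, 6, 7, 8, -100]
robust_r = bicor(x, y)

# Hypothetical path network 0 - 1 - 2.
path_tom = topological_overlap([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
```

In the outlier example the extreme value receives zero weight, so the estimate stays strongly positive. In the path network, the two end nodes share their only neighbour and receive an overlap of 0.5 even though they are not directly connected.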
Liu, Yonghong; Liu, Yuanyuan; Wu, Jiaming; Roizman, Bernard; Zhou, Grace Guoying
2018-04-03
Analyses of the levels of mRNAs encoding IFIT1, IFI16, RIG-1, MDA5, CXCL10, LGP2, PUM1, LSD1, STING, and IFNβ in cell lines from which the gene encoding LGP2, LSD1, PML, HDAC4, IFI16, PUM1, STING, MDA5, IRF3, or HDAC1 had been knocked out, as well as the ability of these cell lines to support the replication of HSV-1, revealed the following: (i) Cell lines lacking the gene encoding LGP2, PML, or HDAC4 (cluster 1) exhibited increased levels of expression of partially overlapping gene networks. Concurrently, these cell lines produced 5-fold to 12-fold lower yields of HSV-1 than the parental cells. (ii) Cell lines lacking the genes encoding STING, LSD1, MDA5, IRF3, or HDAC1 (cluster 2) exhibited decreased levels of mRNAs of partially overlapping gene networks. Concurrently, these cell lines produced virus yields that did not differ from those produced by the parental cell line. The genes up-regulated in cell lines forming cluster 1 overlapped in part with genes down-regulated in cluster 2. The key conclusions are that gene knockouts and subsequent selection for growth cause changes in expression of multiple genes, and hence the phenotype of the cell lines cannot be ascribed to a single gene; the patterns of gene expression may be shared by multiple knockouts; and the enhanced immunity to viral replication by cluster 1 knockout cell lines but not by cluster 2 cell lines suggests that in parental cells, the expression of innate resistance to infection is specifically repressed.
Ludovini, Vienna; Bianconi, Fortunato; Siggillino, Annamaria; Piobbico, Danilo; Vannucci, Jacopo; Metro, Giulio; Chiari, Rita; Bellezza, Guido; Puma, Francesco; Della Fazia, Maria Agnese; Servillo, Giuseppe; Crinò, Lucio
2016-05-24
Risk assessment and treatment choice remain a challenge in early non-small-cell lung cancer (NSCLC). The aim of this study was to identify novel genes involved in the risk of early relapse (ER) compared to no relapse (NR) in resected lung adenocarcinoma (AD) patients using a combination of high throughput technology and computational analysis. We identified 18 patients (13 NR and 5 ER) with stage I AD. Frozen samples of patients in ER, NR and corresponding normal lung (NL) were subjected to microarray technology and quantitative PCR (Q-PCR). A gene network computational analysis was performed to select predictive genes. An independent set of 79 stage I AD samples was used to validate selected genes by Q-PCR. From microarray analysis we selected 50 genes, using the fold change ratio of ER versus NR. They were validated both in pool and individually in patient samples (ER and NR) by Q-PCR. Fourteen increased and 25 decreased genes showed concordance between the two methods. They were used to perform a computational gene network analysis that identified 4 increased (HOXA10, CLCA2, AKR1B10, FABP3) and 6 decreased (SCGB1A1, PGC, TFF1, PSCA, SPRR1B and PRSS1) genes. Moreover, in an independent dataset of AD samples, we showed that both high FABP3 expression and low SCGB1A1 expression were associated with a worse disease-free survival (DFS). Our results indicate that it is possible to define, through gene expression and computational analysis, a characteristic gene profile of patients with an increased risk of relapse that may become a tool for patient selection for adjuvant therapy.
Inference of cancer-specific gene regulatory networks using soft computing rules.
Wang, Xiaosheng; Gotoh, Osamu
2010-03-24
Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer) using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.
Chamber Specific Gene Expression Landscape of the Zebrafish Heart
Singh, Angom Ramcharan; Sivadas, Ambily; Sabharwal, Ankit; Vellarikal, Shamsudheen Karuthedath; Jayarajan, Rijith; Verma, Ankit; Kapoor, Shruti; Joshi, Adita; Scaria, Vinod; Sivasubbu, Sridhar
2016-01-01
The organization of structure and function of cardiac chambers in vertebrates is defined by chamber-specific distinct gene expression. This peculiarity and uniqueness of the genetic signatures demonstrates functional resolution attributed to the different chambers of the heart. Altered expression of the cardiac chamber genes can lead to individual chamber related dysfunctions and disease patho-physiologies. Information on transcriptional repertoire of cardiac compartments is important to understand the spectrum of chamber specific anomalies. We have carried out a genome wide transcriptome profiling study of the three cardiac chambers in the zebrafish heart using RNA sequencing. We have captured the gene expression patterns of 13,396 protein coding genes in the three cardiac chambers: atrium, ventricle and bulbus arteriosus. Of these, 7,260 known protein coding genes are highly expressed (≥10 FPKM) in the zebrafish heart. Thus, this study represents nearly all-inclusive information on the zebrafish cardiac transcriptome. In this study, a total of 96 differentially expressed genes across the three cardiac chambers in zebrafish were identified. The atrium, ventricle and bulbus arteriosus displayed 20, 32 and 44 uniquely expressed genes respectively. We validated the expression of predicted chamber-restricted genes using independent semi-quantitative and qualitative experimental techniques. In addition, we identified 23 putative novel protein coding genes that are specifically restricted to the ventricle and not found in the atrium or bulbus arteriosus. To our knowledge, these 23 novel genes have either not been investigated in detail or are sparsely studied. The transcriptome identified in this study includes 68 differentially expressed zebrafish cardiac chamber genes that have a human ortholog. 
We also carried out spatiotemporal gene expression profiling of the 96 differentially expressed genes throughout the three cardiac chambers in 11 developmental stages and 6 tissue types of zebrafish. We hypothesize that clustering the differentially expressed genes with both known and unknown functions will deliver detailed insights on fundamental gene networks that are important for the development and specification of the cardiac chambers. It is also postulated that this transcriptome atlas will help utilize zebrafish in a better way as a model for studying cardiac development and to explore functional role of gene networks in cardiac disease pathogenesis. PMID:26815362
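The "highly expressed (≥10 FPKM)" criterion above can be made concrete with a short sketch. FPKM is the number of fragments mapped to a gene, normalised per kilobase of gene length and per million mapped fragments. The gene labels and counts below are made up for illustration, and the helper names are hypothetical.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def highly_expressed(fpkm_by_gene, threshold=10.0):
    """Return the genes at or above the FPKM threshold, sorted by name."""
    return sorted(g for g, value in fpkm_by_gene.items() if value >= threshold)


# Hypothetical library of 25 million mapped fragments.
total = 25_000_000
values = {
    "geneA": fpkm(500, 2000, total),   # 10 FPKM, meets the threshold
    "geneB": fpkm(100, 4000, total),   # 1 FPKM, below the threshold
    "geneC": fpkm(9000, 1500, total),  # 240 FPKM
}
selected = highly_expressed(values)
```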
Hoffman, Robert W; Merrill, Joan T; Alarcón-Riquelme, Marta M E; Petri, Michelle; Dow, Ernst R; Nantz, Eric; Nisenbaum, Laura K; Schroeder, Krista M; Komocsar, Wendy J; Perumal, Narayanan B; Linnik, Matthew D; Airey, David C; Liu, Yushi; Rocha, Guilherme V; Higgs, Richard E
2017-03-01
To characterize baseline gene expression and pharmacodynamically induced changes in whole blood gene expression in 1,760 systemic lupus erythematosus (SLE) patients from 2 phase III, 52-week, randomized, placebo-controlled, double-blind studies in which patients were treated with the BAFF-blocking IgG4 monoclonal antibody tabalumab. Patient samples were obtained from SLE patients from the ILLUMINATE-1 and ILLUMINATE-2 studies, and control samples were obtained from healthy donors. Blood was collected in Tempus tubes at baseline, week 16, and week 52. RNA was analyzed using Affymetrix Human Transcriptome Array 2.0 and NanoString. At baseline, expression of the interferon (IFN) response gene was elevated in patients compared with controls, with 75% of patients being positive for this IFN response gene signature. There was, however, substantial heterogeneity of IFN response gene expression and complex relationships among gene networks. The IFN response gene signature was a predictor of time to disease flare, independent of anti-double-stranded DNA (anti-dsDNA) antibody and C3 and C4 levels, and overall disease activity. Pharmacodynamically induced changes in gene expression following tabalumab treatment were extensive, occurring predominantly in B cell-related and immunoglobulin genes, and were consistent with other pharmacodynamic changes including anti-dsDNA antibody, C3, and immunoglobulin levels. SLE patients demonstrated increased expression of an IFN response gene signature (75% of patients had an elevated IFN response gene signature) at baseline in ILLUMINATE-1 and ILLUMINATE-2. Substantial heterogeneity of gene expression was detected among individual patients and in gene networks. The IFN response gene signature was an independent risk factor for future disease flares. Pharmacodynamic changes in gene expression were consistent with the mechanism of BAFF blockade by tabalumab. © 2016, American College of Rheumatology.
Functional networks inference from rule-based machine learning models.
Lazzarini, Nicola; Widera, Paweł; Williamson, Stuart; Heer, Rakesh; Krasnogor, Natalio; Bacardit, Jaume
2016-01-01
Functional networks play an important role in the analysis of biological processes and systems. The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, the similarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumes a functional relationship between genes which are expressed at similar levels across different samples. An alternative to this paradigm is the inference of relationships from the structure of machine learning models. These models are able to capture complex relationships between variables that are often different from, or complementary to, those found by similarity-based methods. We propose a protocol to infer functional networks from machine learning models, called FuNeL. It assumes that genes used together within a rule-based machine learning model to classify the samples might also be functionally related at a biological level. The protocol is first tested on synthetic datasets and then evaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from the real-world data are compared against gene co-expression networks of equal size, generated with 3 different methods. The comparison is performed from two different points of view. We analyse the enriched biological terms in the set of network nodes and the relationships between known disease-associated genes in the context of the network topology. The comparison confirms both the biological relevance and the complementary character of the knowledge captured by the FuNeL networks in relation to similarity-based methods and demonstrates its potential to identify known disease associations as core elements of the network. Finally, using a prostate cancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant to the disease and consistent with the specialised literature and with an independent dataset not used in the inference process. 
The implementation of our network inference protocol is available at: http://ico2s.org/software/funel.html.
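The core assumption stated in the abstract (genes used together in a classification rule may be functionally related) can be illustrated with a small co-occurrence count. This is only a sketch of that idea, not the FuNeL implementation: the function name `cooccurrence_network`, the `min_count` filter, and the example rules and gene names are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(rules, min_count=1):
    """Count how often each pair of genes appears together in one rule.

    rules: iterable of gene-name collections, one per rule (hypothetical input).
    Returns a dict mapping a sorted gene pair to its co-occurrence count.
    """
    counts = Counter()
    for genes in rules:
        # Sort so that each undirected pair has one canonical key.
        for a, b in combinations(sorted(set(genes)), 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Illustrative rules only; the gene sets are made up for the example.
rules = [
    {"TP53", "BRCA1", "MYC"},
    {"TP53", "MYC"},
    {"EGFR", "KRAS"},
]
edges = cooccurrence_network(rules, min_count=2)
print(edges)  # {('MYC', 'TP53'): 2}
```

Raising `min_count` keeps only the pairs that recur across rules, which is one simple way to control the size of the resulting network.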
Preservation affinity in consensus modules among stages of HIV-1 progression.
Mosaddek Hossain, Sk Md; Ray, Sumanta; Mukhopadhyay, Anirban
2017-03-20
Analysis of gene expression data provides valuable insights into disease mechanisms. Investigating the relationships among co-expression modules of different stages is a meaningful way to understand how a disease progresses. Identifying topological preservation of modular structure also contributes to that understanding. HIV-1 disease provides a well-documented progression pattern through three stages of infection: acute, chronic and non-progressor. In this article, we have developed a novel framework to describe the relationship among the consensus (or shared) co-expression modules for each pair of HIV-1 infection stages. The consensus modules are identified to assess the preservation of network properties. We have investigated the preservation patterns of co-expression networks during HIV-1 disease progression through an eigengene-based approach. We discovered that the expression patterns of consensus modules are strongly preserved during the transitions between the three infection stages. In particular, preservation between the acute and non-progressor stages is slightly higher than between the other pairs of stages. Moreover, we have constructed eigengene networks for the identified consensus modules and observed the preservation structure among them. Some consensus modules are marked as preserved in two pairs of stages and are analyzed further to form a higher order meta-network consisting of a group of preserved modules. Additionally, we observed that module membership (MM) values of genes within a module are consistent with the preservation characteristics. The MM values of genes within a pair of preserved modules show strong correlation patterns across two infection stages. We have performed an extensive analysis to discover the preservation pattern of co-expression networks constructed from microarray gene expression data of three different HIV-1 progression stages.
The preservation pattern is investigated through identification of consensus modules in each pair of infection stages. It is observed that the preservation of the expression pattern of consensus modules remains more prominent during the transition of infection from acute stage to non-progressor stage. Additionally, we observed that the module membership values of genes are coherent with preserved modules across the HIV-1 progression stages.
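The two quantities this abstract relies on, the module eigengene and module membership (MM), can be sketched in a few lines. The sketch uses the common definitions (eigengene as the first principal component of a module's standardized expression, MM as the correlation of each gene with that eigengene); the function names and the synthetic data are assumptions for illustration, not the authors' code.

```python
import numpy as np


def module_eigengene(expr):
    """First principal component of a module's expression matrix.

    expr: array of shape (samples, genes) for the genes of one module.
    """
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0, ddof=1)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # The sign of a singular vector is arbitrary; align it with mean expression.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene


def module_membership(expr, eigengene):
    """Pearson correlation of each gene's profile with the eigengene."""
    return np.array(
        [np.corrcoef(expr[:, j], eigengene)[0, 1] for j in range(expr.shape[1])]
    )


# Synthetic module: three genes driven by one shared signal plus small noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=20)
expr = signal[:, None] * np.array([1.0, 0.8, 1.2]) + 0.1 * rng.normal(size=(20, 3))

eigengene = module_eigengene(expr)
mm = module_membership(expr, eigengene)
```

Comparing `mm` vectors computed separately for two conditions (for example by correlating them) is one simple way to ask whether a module's internal structure is preserved.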
Scholz, Birger; Doidge, Amie N.; Barnes, Philip; Hall, Jeremy; Wilkinson, Lawrence S.; Thomas, Kerrie L.
2016-01-01
We investigated the distinctiveness of gene regulatory networks in CA1 associated with the extinction of contextual fear memory (CFM) after recall using Affymetrix GeneChip Rat Genome 230 2.0 Arrays. These data were compared to previously published retrieval and reconsolidation-attributed, and consolidation datasets. A stringent dual normalization and pareto-scaled orthogonal partial least-square discriminant multivariate analysis together with a jack-knifing-based cross-validation approach was used on all datasets to reduce false positives. Consolidation, retrieval and extinction were correlated with distinct patterns of gene expression 2 hours later. Extinction-related gene expression was most distinct from the profile accompanying consolidation. A highly specific feature was the discrete regulation of neuroimmunological gene expression associated with retrieval and extinction. Immunity-associated genes of the tyrosine kinase receptor TGFβ and PDGF, and TNF families characterized extinction. Cytokines and proinflammatory interleukins of the IL-1 and IL-6 families were enriched with the no-extinction retrieval condition. We used comparative genomics to predict transcription factor binding sites in proximal promoter regions of the retrieval-regulated genes. Retrieval that does not lead to extinction was associated with NF-κB-mediated gene expression. We confirmed differential NF-κBp65 expression and activity in all of a representative sample of our candidate genes in the no-extinction condition. The differential regulation of cytokine networks after the acquisition and retrieval of CFM identifies the important contribution that neuroimmune signalling plays in normal hippocampal function. Further, targeting cytokine signalling upon retrieval offers a therapeutic strategy to promote extinction mechanisms in human disorders characterised by dysregulation of associative memory. PMID:27224427
Co-expression Network Approach to Studying the Effects of Botulinum Neurotoxin-A.
Mukund, Kavitha; Ward, Samuel R; Lieber, Richard L; Subramaniam, Shankar
2017-10-16
Botulinum Neurotoxin A (BoNT-A) is a potent neurotoxin with several clinical applications. The goal of this study was to utilize co-expression network theory to analyze temporal transcriptional data from skeletal muscle after BoNT-A treatment. Expression data for 2000 genes (extracted using a ranking heuristic) served as the basis for this analysis. Using weighted gene co-expression network analysis (WGCNA), we identified 19 co-expressed modules, further hierarchically clustered into 5 groups. Quantifying average expression and co-expression patterns across these groups revealed temporal aspects of muscle's response to BoNT-A. Functional analysis revealed enrichment of group 1 with metabolism; group 5 with contradictory functions of atrophy and cellular recovery; and groups 2 and 3 with extracellular matrix (ECM) and non-fast fiber isoforms. Topological positioning of two highly ranked, significantly expressed genes, Dclk1 and Ostalpha, within group 5 suggested possible mechanistic roles in recovery from BoNT-A induced atrophy. Phenotypic correlations of groups with titin and myosin protein content further emphasized the effect of BoNT-A on the sarcomeric contraction machinery in early phase of chemodenervation. In summary, our approach revealed a hierarchical functional response to BoNT-A induced paralysis with early metabolic and later ECM responses and identified putative biomarkers associated with chemodenervation. Additionally, our results provide an unbiased validation of the response documented in our previous work.
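The first step of a weighted co-expression analysis of this kind is turning pairwise correlations into a soft-thresholded adjacency, a_ij = |cor(x_i, x_j)|^beta for an unsigned network. The sketch below shows only that step in NumPy and is not the WGCNA R package; the function name, the choice `beta=6`, and the random data are assumptions for illustration.

```python
import numpy as np


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: |Pearson correlation| raised to beta.

    expr: array of shape (samples, genes).
    """
    corr = np.corrcoef(expr, rowvar=False)
    adj = np.abs(corr) ** beta
    np.fill_diagonal(adj, 0.0)  # ignore self-connections
    return adj


# Synthetic data: gene 1 closely tracks gene 0, the rest are independent noise.
rng = np.random.default_rng(1)
expr = rng.normal(size=(30, 5))
expr[:, 1] = expr[:, 0] + 0.1 * rng.normal(size=30)

adj = soft_threshold_adjacency(expr)
connectivity = adj.sum(axis=1)  # weighted degree of each gene
```

Raising the correlation to a power suppresses weak correlations while keeping the network weighted; modules are then usually found by clustering a dissimilarity derived from this adjacency.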
An iterative network partition algorithm for accurate identification of dense network modules
Sun, Siqi; Dong, Xinran; Fu, Yao; Tian, Weidong
2012-01-01
A key step in network analysis is to partition a complex network into dense modules. Currently, modularity is one of the most popular benefit functions used to partition network modules. However, recent studies suggested that it has an inherent limitation in detecting dense network modules. In this study, we observed that despite the limitation, modularity has the advantage of preserving the primary network structure of the undetected modules. Thus, we have developed a simple iterative Network Partition (iNP) algorithm to partition a network. The iNP algorithm provides a general framework in which any modularity-based algorithm can be implemented in the network partition step. Here, we tested iNP with three modularity-based algorithms: multi-step greedy (MSG), spectral clustering and Qcut. Compared with the original three methods, iNP achieved a significant improvement in the quality of network partition in a benchmark study with simulated networks, identified more modules with significantly better enrichment of functionally related genes in both yeast protein complex network and breast cancer gene co-expression network, and discovered more cancer-specific modules in the cancer gene co-expression network. As such, iNP should have a broad application as a general method to assist in the analysis of biological networks. PMID:22121225
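The benefit function referred to here, Newman's modularity, scores a partition as Q = (1/2m) * sum_ij [A_ij - k_i k_j / 2m] * delta(c_i, c_j), where A is the adjacency matrix, k_i the degree of node i and m the number of edges. The sketch below only evaluates Q for a given partition; it is not the iNP algorithm, and the function name and toy graph are illustrative assumptions.

```python
import numpy as np


def modularity(adj, communities):
    """Newman modularity Q of a partition of an undirected graph.

    adj: symmetric adjacency matrix; communities: one label per node.
    """
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(communities)
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()  # twice the number of edges
    same = labels[:, None] == labels[None, :]
    expected = np.outer(degrees, degrees) / two_m
    return float(((adj - expected) * same).sum() / two_m)


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

q_split = modularity(adj, [0, 0, 0, 1, 1, 1])
q_single = modularity(adj, [0, 0, 0, 0, 0, 0])
print(round(q_split, 4))  # 0.3571
```

Splitting the graph into its two triangles gives Q = 5/14, while putting all nodes into one community gives Q = 0, which is why modularity-based methods prefer the split.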
Rittman, Timothy; Rubinov, Mikail; Vértes, Petra E; Patel, Ameera X; Ginestet, Cedric E; Ghosh, Boyd C P; Barker, Roger A; Spillantini, Maria Grazia; Bullmore, Edward T; Rowe, James B
2016-12-01
Abnormalities of tau protein are central to the pathogenesis of progressive supranuclear palsy, whereas haplotype variation of the tau gene MAPT influences the risk of Parkinson disease and Parkinson's disease dementia. We assessed whether regional MAPT expression might be associated with selective vulnerability of global brain networks to neurodegenerative pathology. Using task-free functional magnetic resonance imaging in progressive supranuclear palsy, Parkinson disease, and healthy subjects (n = 128), we examined functional brain networks and measured the connection strength between 471 gray matter regions. We obtained MAPT and SNCA microarray expression data in healthy subjects from the Allen brain atlas. Regional connectivity varied according to the normal expression of MAPT. The regional expression of MAPT correlated with the proportionate loss of regional connectivity in Parkinson's disease. Executive cognition was impaired in proportion to the loss of hub connectivity. These effects were not seen with SNCA, suggesting that alpha-synuclein pathology is not mediated through global network properties. The results establish a link between regional MAPT expression and selective vulnerability of functional brain networks to neurodegeneration. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
2011-01-01
Background: Green plant leaves have always fascinated biologists as hosts for photosynthesis and providers of basic energy to many food webs. Today, comprehensive databases of gene expression data enable us to apply increasingly more advanced computational methods for reverse-engineering the regulatory network of leaves, and to begin to understand the gene interactions underlying complex emergent properties related to stress-response and development. These new systems biology methods are now also being applied to organisms such as Populus, a woody perennial tree, in order to understand the specific characteristics of these species. Results: We present a systems biology model of the regulatory network of Populus leaves. The network is reverse-engineered from promoter information and expression profiles of leaf-specific genes measured over a large set of conditions related to stress and development. The network model incorporates interactions between regulators, such as synergistic and competitive relationships, by evaluating increasingly more complex regulatory mechanisms, and is therefore able to identify new regulators of leaf development not found by traditional genomics methods based on pair-wise expression similarity. The approach is shown to explain available gene function information and to provide robust prediction of expression levels in new data. We also use the predictive capability of the model to identify condition-specific regulation as well as conserved regulation between Populus and Arabidopsis. Conclusions: We outline a computationally inferred model of the regulatory network of Populus leaves, and show how treating genes as interacting, rather than individual, entities identifies new regulators compared to traditional genomics analysis.
Although systems biology models should be used with care considering the complexity of regulatory programs and the limitations of current genomics data, methods describing interactions can provide hypotheses about the underlying cause of emergent properties and are needed if we are to identify target genes other than those constituting the "low hanging fruit" of genomic analysis. PMID:21232107
Exploring Plant Co-Expression and Gene-Gene Interactions with CORNET 3.0.
Van Bel, Michiel; Coppens, Frederik
2017-01-01
Selecting and filtering a reference expression and interaction dataset when studying specific pathways and regulatory interactions can be a very time-consuming and error-prone task. In order to reduce the duplicated efforts required to amass such datasets, we have created the CORNET (CORrelation NETworks) platform which allows for easy access to a wide variety of data types: coexpression data, protein-protein interactions, regulatory interactions, and functional annotations. The CORNET platform outputs its results in either text format or through the Cytoscape framework, which is automatically launched by the CORNET website. CORNET 3.0 is the third iteration of the web platform designed for the user exploration of the coexpression space of plant genomes, with a focus on the model species Arabidopsis thaliana. Here we describe the platform: the tools, data, and best practices when using the platform. We indicate how the platform can be used to infer networks from a set of input genes, such as upregulated genes from an expression experiment. By exploring the network, new target and regulator genes can be discovered, allowing for follow-up experiments and more in-depth study. We also indicate how to avoid common pitfalls when evaluating the networks and how to avoid over-interpretation of the results. All CORNET versions are available at http://bioinformatics.psb.ugent.be/cornet/ .
Kaushik, Abhinav; Ali, Shakir; Gupta, Dinesh
2017-01-01
Gene connection rewiring is an essential feature of gene network dynamics. Apart from its normal functional role, it may also lead to dysregulated functional states by disturbing pathway homeostasis. Very few computational tools measure rewiring within gene co-expression and its corresponding regulatory networks in order to identify and prioritize altered pathways which may or may not be differentially regulated. We have developed Altered Pathway Analyzer (APA), a microarray dataset analysis tool for identification and prioritization of altered pathways, including those which are differentially regulated by TFs, by quantifying rewired sub-network topology. Moreover, APA also helps in re-prioritization of APA shortlisted altered pathways enriched with context-specific genes. We performed APA analysis of simulated datasets and p53 status NCI-60 cell line microarray data to demonstrate potential of APA for identification of several case-specific altered pathways. APA analysis reveals several altered pathways not detected by other tools evaluated by us. APA analysis of unrelated prostate cancer datasets identifies sample-specific as well as conserved altered biological processes, mainly associated with lipid metabolism, cellular differentiation and proliferation. APA is designed as a cross platform tool which may be transparently customized to perform pathway analysis in different gene expression datasets. APA is freely available at http://bioinfo.icgeb.res.in/APA. PMID:28084397
Shannon, Casey P; Chen, Virginia; Takhar, Mandeep; Hollander, Zsuzsanna; Balshaw, Robert; McManus, Bruce M; Tebbutt, Scott J; Sin, Don D; Ng, Raymond T
2016-11-14
Gene network inference (GNI) algorithms can be used to identify sets of coordinately expressed genes, termed network modules, from whole transcriptome gene expression data. The identification of such modules has become a popular approach to systems biology, with important applications in translational research. Although diverse computational and statistical approaches have been devised to identify such modules, their performance behavior is still not fully understood, particularly in complex human tissues. Given human heterogeneity, one important question is how sensitive the outputs of these computational methods are to the input sample set, that is, their stability. A related question is how this sensitivity depends on the size of the sample set. We describe here the SABRE (Similarity Across Bootstrap RE-sampling) procedure for assessing the stability of gene network modules using a re-sampling strategy, introduce a novel criterion for identifying stable modules, and demonstrate the utility of this approach in a clinically-relevant cohort, using two different gene network module discovery algorithms. The stability of modules increased as sample size increased and stable modules were more likely to be replicated in larger sets of samples. Random modules derived from permuted gene expression data were consistently unstable, as assessed by SABRE, and provide a useful baseline value for our proposed stability criterion. Gene module sets identified by different algorithms varied with respect to their stability, as assessed by SABRE. Finally, stable modules were more readily annotated in various curated gene set databases. The SABRE procedure and proposed stability criterion may provide guidance when designing systems biology studies in complex human disease and tissues.
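The general idea of re-sampling-based stability can be sketched as follows: re-draw the samples with replacement, re-run module discovery, and record how well each reference module is recovered. This is a generic bootstrap sketch and not the SABRE criterion itself, whose exact definition is not given in the abstract; the toy module finder (connected components of a thresholded correlation graph), the Jaccard score, and all names and parameters are assumptions.

```python
import numpy as np


def find_modules(expr, threshold=0.8):
    """Toy module finder: connected components of the |corr| >= threshold graph."""
    corr = np.corrcoef(expr, rowvar=False)
    n = corr.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i, j]) >= threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), set()).add(i)
    return [g for g in groups.values() if len(g) > 1]


def jaccard(a, b):
    return len(a & b) / len(a | b)


def bootstrap_stability(expr, n_boot=50, seed=0):
    """Mean best-match Jaccard of each reference module across bootstrap runs."""
    rng = np.random.default_rng(seed)
    reference = find_modules(expr)
    scores = np.zeros(len(reference))
    for _ in range(n_boot):
        idx = rng.integers(0, expr.shape[0], size=expr.shape[0])
        boot = find_modules(expr[idx])
        for k, module in enumerate(reference):
            scores[k] += max((jaccard(module, b) for b in boot), default=0.0)
    return reference, scores / n_boot


# Synthetic data: genes 0-2 share one signal, genes 3-5 share another.
rng = np.random.default_rng(0)
a, b = rng.normal(size=(2, 40))
expr = np.column_stack([a, a, a, b, b, b]) + 0.2 * rng.normal(size=(40, 6))
reference, stability = bootstrap_stability(expr)
```

Running the same procedure on data with permuted gene values would give a baseline against which these stability scores can be judged, in the spirit of the comparison described in the abstract.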
Dynamic Visualization of Co-expression in Systems Genetics Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
New, Joshua Ryan; Huang, Jian; Chesler, Elissa J
2008-01-01
Biologists hope to address grand scientific challenges by exploring the abundance of data made available through modern microarray technology and other high-throughput techniques. The impact of this data, however, is limited unless researchers can effectively assimilate such complex information and integrate it into their daily research; interactive visualization tools are called for to support the effort. Specifically, typical studies of gene co-expression require novel visualization tools that enable the dynamic formulation and fine-tuning of hypotheses to aid the process of evaluating sensitivity of key parameters. These tools should allow biologists to develop an intuitive understanding of the structure of biological networks and discover genes which reside in critical positions in networks and pathways. By using a graph as a universal data representation of correlation in gene expression data, our novel visualization tool employs several techniques that when used in an integrated manner provide innovative analytical capabilities. Our tool for interacting with gene co-expression data integrates techniques such as: graph layout, qualitative subgraph extraction through a novel 2D user interface, quantitative subgraph extraction using graph-theoretic algorithms or by querying an optimized b-tree, dynamic level-of-detail graph abstraction, and template-based fuzzy classification using neural networks. We demonstrate our system using a real-world workflow from a large-scale, systems genetics study of mammalian gene co-expression.
De Cegli, Rossella; Iacobacci, Simona; Flore, Gemma; Gambardella, Gennaro; Mao, Lei; Cutillo, Luisa; Lauria, Mario; Klose, Joachim; Illingworth, Elizabeth; Banfi, Sandro; di Bernardo, Diego
2013-01-01
Gene expression profiles can be used to infer previously unknown transcriptional regulatory interaction among thousands of genes, via systems biology ‘reverse engineering’ approaches. We ‘reverse engineered’ an embryonic stem (ES)-specific transcriptional network from 171 gene expression profiles, measured in ES cells, to identify master regulators of gene expression (‘hubs’). We discovered that E130012A19Rik (E13), highly expressed in mouse ES cells as compared with differentiated cells, was a central ‘hub’ of the network. We demonstrated that E13 is a protein-coding gene implicated in regulating the commitment towards the different neuronal subtypes and glia cells. The overexpression and knock-down of E13 in ES cell lines, undergoing differentiation into neurons and glia cells, caused a strong up-regulation of the glutamatergic neurons marker Vglut2 and a strong down-regulation of the GABAergic neurons marker GAD65 and of the radial glia marker Blbp. We confirmed E13 expression in the cerebral cortex of adult mice and during development. By immuno-based affinity purification, we characterized protein partners of E13, involved in the Polycomb complex. Our results suggest a role of E13 in regulating the division between glutamatergic projection neurons and GABAergic interneurons and glia cells possibly by epigenetic-mediated transcriptional regulation. PMID:23180766