Science.gov

Sample records for coexpressed gene networks

  1. Differentially Coexpressed Disease Gene Identification Based on Gene Coexpression Network.

    PubMed

    Jiang, Xue; Zhang, Han; Quan, Xiongwen

    2016-01-01

    Screening disease-related genes by analyzing gene expression data has become a popular theme. Traditional disease-related gene selection methods always focus on identifying differentially expressed genes between case samples and a control group. These traditional methods may not fully consider the changes of interactions between genes at different cell states and the dynamic processes of gene expression levels during the disease progression. However, in order to understand the mechanism of disease, it is important to explore the dynamic changes of interactions between genes in biological networks at different cell states. In this study, we designed a novel framework to identify disease-related genes and developed a differentially coexpressed disease-related gene identification method based on gene coexpression network (DCGN) to screen differentially coexpressed genes. We first constructed phase-specific gene coexpression networks using time-series gene expression data and defined the concept of differential coexpression of genes in a coexpression network. Then, we designed two metrics to measure the value of gene differential coexpression according to the change of local topological structures between different phase-specific networks. Finally, we conducted meta-analysis of gene differential coexpression based on the rank-product method. Experimental results on real-world gene expression data sets demonstrated the feasibility and effectiveness of DCGN and its superior performance over other popular disease-related gene selection methods.
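The rank-product step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each metric yields one score per gene, ranks genes within each score table (rank 1 = largest score), and combines the ranks by their geometric mean. The function names and the toy scores are hypothetical and are not taken from the DCGN paper.

```python
import math

def rank_descending(scores):
    """Map each gene to its rank (1 = largest score); ties are broken by gene name."""
    ordered = sorted(scores, key=lambda g: (-scores[g], g))
    return {gene: i + 1 for i, gene in enumerate(ordered)}

def rank_product(score_tables):
    """Geometric mean of each gene's rank across several score tables."""
    rankings = [rank_descending(table) for table in score_tables]
    genes = set(rankings[0])
    for ranking in rankings[1:]:
        genes &= set(ranking)  # keep only genes scored in every table
    k = len(rankings)
    return {g: math.prod(r[g] for r in rankings) ** (1.0 / k) for g in genes}

# Hypothetical differential-coexpression scores from two metrics.
metric_1 = {"geneA": 0.9, "geneB": 0.5, "geneC": 0.1}
metric_2 = {"geneA": 0.8, "geneB": 0.2, "geneC": 0.4}
combined = rank_product([metric_1, metric_2])
```

A smaller rank product means a gene is ranked consistently near the top by every metric; a significance test on the rank products would normally follow and is omitted here.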

  2. Differentially Coexpressed Disease Gene Identification Based on Gene Coexpression Network

    PubMed Central

    Quan, Xiongwen

    2016-01-01

    Screening disease-related genes by analyzing gene expression data has become a popular theme. Traditional disease-related gene selection methods always focus on identifying differentially expressed genes between case samples and a control group. These traditional methods may not fully consider the changes of interactions between genes at different cell states and the dynamic processes of gene expression levels during the disease progression. However, in order to understand the mechanism of disease, it is important to explore the dynamic changes of interactions between genes in biological networks at different cell states. In this study, we designed a novel framework to identify disease-related genes and developed a differentially coexpressed disease-related gene identification method based on gene coexpression network (DCGN) to screen differentially coexpressed genes. We first constructed phase-specific gene coexpression networks using time-series gene expression data and defined the concept of differential coexpression of genes in a coexpression network. Then, we designed two metrics to measure the value of gene differential coexpression according to the change of local topological structures between different phase-specific networks. Finally, we conducted meta-analysis of gene differential coexpression based on the rank-product method. Experimental results on real-world gene expression data sets demonstrated the feasibility and effectiveness of DCGN and its superior performance over other popular disease-related gene selection methods. PMID:28042568

  3. COXPRESdb: a database of coexpressed gene networks in mammals.

    PubMed

    Obayashi, Takeshi; Hayashi, Shinpei; Shibaoka, Masayuki; Saeki, Motoshi; Ohta, Hiroyuki; Kinoshita, Kengo

    2008-01-01

    A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19 777 and 21 036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of the four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for the static picture on the web in GO networks or in tissue networks, we used Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.

  4. Gene Coexpression Network Topology of Cardiac Development, Hypertrophy, and Failure

    PubMed Central

    Dewey, Frederick E.; Perez, Marco V.; Wheeler, Matthew T.; Watt, Clifton; Spin, Joshua; Langfelder, Peter; Horvath, Stephen; Hannenhalli, Sridhar; Cappola, Thomas P.; Ashley, Euan A.

    2011-01-01

    Background Network analysis techniques allow a more accurate reflection of underlying systems biology to be realized than traditional unidimensional molecular biology approaches. Here, using gene coexpression network analysis, we define the gene expression network topology of cardiac hypertrophy and failure and the extent of recapitulation of fetal gene expression programs in failing and hypertrophied adult myocardium. Methods and Results We assembled all myocardial transcript data in the Gene Expression Omnibus (n = 1617). Since hierarchical analysis revealed species had primacy over disease clustering, we focused this analysis on the most complete (murine) dataset (n = 478). Using gene coexpression network analysis, we derived functional modules, regulatory mediators and higher order topological relationships between genes and identified 50 gene co-expression modules in developing myocardium that were not present in normal adult tissue. We found that known gene expression markers of myocardial adaptation were members of upregulated modules but not hub genes. We identified ZIC2 as a novel transcription factor associated with coexpression modules common to developing and failing myocardium. Of 50 fetal gene co-expression modules, three (6%) were reproduced in hypertrophied myocardium and seven (14%) were reproduced in failing myocardium. One fetal module was common to both failing and hypertrophied myocardium. Conclusions Network modeling allows systems analysis of cardiovascular development and disease. While we did not find evidence for a global coordinated program of fetal gene expression in adult myocardial adaptation, our analysis revealed specific gene expression modules active during both development and disease and specific candidates for their regulation. PMID:21127201

  5. Multiscale Embedded Gene Co-expression Network Analysis

    PubMed Central

    Song, Won-Min; Zhang, Bin

    2015-01-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution or small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778

  6. Functional Module Analysis for Gene Coexpression Networks with Network Integration

    PubMed Central

    Zhang, Shuqin; Zhao, Hongyu

    2015-01-01

    Networks are a general tool for studying the complex interactions between different genes, proteins and other small molecules. Modularity is a fundamental property of many biological networks and has been widely studied, and many computational methods have been proposed to identify the modules in an individual network. However, in many cases a single network is insufficient for module analysis due to the noise in the data or the tuning of parameters when building the biological network. The availability of a large number of biological networks makes network integration study possible. By integrating such networks, more informative modules for some specific disease can be derived from the networks constructed from different tissues, and consistent factors for different diseases can be inferred. In this paper, we have developed an effective method for module identification from multiple networks under different conditions. The problem is formulated as an optimization model, which combines the module identification in each individual network and alignment of the modules from different networks together. An approximation algorithm based on eigenvector computation is proposed. In simulation studies, our method outperforms the existing methods, especially when the underlying modules in multiple networks are different. We also applied our method to two groups of gene coexpression networks for humans, which include one for three different cancers, and one for three tissues from morbidly obese patients. We identified 13 modules with 3 complete subgraphs, and 11 modules with 2 complete subgraphs, respectively. The modules were validated through Gene Ontology enrichment and KEGG pathway enrichment analysis. We also showed that the main functions of most modules for the corresponding disease have been addressed by other researchers, which may provide the theoretical basis for further studying the modules experimentally. PMID:26451826

  7. Beyond Genomics: Studying Evolution with Gene Coexpression Networks.

    PubMed

    Ruprecht, Colin; Vaid, Neha; Proost, Sebastian; Persson, Staffan; Mutwil, Marek

    2017-04-01

    Understanding how genomes change as organisms become more complex is a central question in evolution. Molecular evolutionary studies typically correlate the appearance of genes and gene families with the emergence of biological pathways and morphological features. While such approaches are of great importance to understand how organisms evolve, they are also limited, as functionally related genes work together in contexts of dynamic gene networks. Since functionally related genes are often transcriptionally coregulated, gene coexpression networks present a resource to study the evolution of biological pathways. In this opinion article, we discuss recent developments in this field and how coexpression analyses can be merged with existing genomic approaches to transfer functional knowledge between species to study the appearance or extension of pathways.

  8. Gene Coexpression Network Analysis as a Source of Functional Annotation for Rice Genes

    PubMed Central

    Childs, Kevin L.; Davidson, Rebecca M.; Buell, C. Robin

    2011-01-01

    With the existence of large publicly available plant gene expression data sets, many groups have undertaken data analyses to construct gene coexpression networks and functionally annotate genes. Often, a large compendium of unrelated or condition-independent expression data is used to construct gene networks. Condition-dependent expression experiments consisting of well-defined conditions/treatments have also been used to create coexpression networks to help examine particular biological processes. Gene networks derived from either condition-dependent or condition-independent data can be difficult to interpret if a large number of genes and connections are present. However, algorithms exist to identify modules of highly connected and biologically relevant genes within coexpression networks. In this study, we have used publicly available rice (Oryza sativa) gene expression data to create gene coexpression networks using both condition-dependent and condition-independent data and have identified gene modules within these networks using the Weighted Gene Coexpression Network Analysis method. We compared the number of genes assigned to modules and the biological interpretability of gene coexpression modules to assess the utility of condition-dependent and condition-independent gene coexpression networks. For the purpose of providing functional annotation to rice genes, we found gene modules identified by coexpression analysis of condition-dependent gene expression experiments to be more useful than gene modules identified by analysis of a condition-independent data set. We have incorporated our results into the MSU Rice Genome Annotation Project database as additional expression-based annotation for 13,537 genes, 2,980 of which lack a functional annotation description. These results provide two new types of functional annotation for our database. Genes in modules are now associated with groups of genes that constitute a collective functional annotation of those

  9. Random matrix analysis of localization properties of gene coexpression network.

    PubMed

    Jalan, Sarika; Solymosi, Norbert; Vattay, Gábor; Li, Baowen

    2010-04-01

    We analyze a gene coexpression network under the random matrix theory (RMT) framework. The nearest-neighbor spacing distribution of the adjacency matrix of this network follows the Gaussian orthogonal statistics of RMT. The spectral rigidity test follows the random matrix prediction for a certain range and deviates afterwards. Eigenvector analysis of the network using the inverse participation ratio (IPR) suggests that the statistics of the bulk of the eigenvalues of the network are consistent with those of a real symmetric random matrix, whereas a few eigenvalues are localized. Based on these IPR calculations, we can divide the eigenvalues into three sets: (a) the nondegenerate part that follows RMT; (b) the nondegenerate part, at both ends and at intermediate eigenvalues, which deviates from RMT and is expected to contain information about important nodes in the network; and (c) the degenerate part with zero eigenvalue, which fluctuates around the RMT-predicted value. We identify nodes corresponding to the dominant modes of the corresponding eigenvectors and analyze their structural properties.
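As a small illustration of the inverse participation ratio used in the abstract above, the sketch below computes the sum of the fourth powers of the components of a normalized eigenvector. The two test vectors are hypothetical: a vector spread evenly over N nodes gives a value of 1/N, while a vector concentrated on a single node gives 1. Computing eigenvectors of a real adjacency matrix is left out.

```python
import math

def inverse_participation_ratio(vector):
    """IPR of a vector: sum of the fourth powers of its unit-normalized components."""
    norm = math.sqrt(sum(c * c for c in vector))
    if norm == 0:
        raise ValueError("the zero vector has no defined IPR")
    return sum((c / norm) ** 4 for c in vector)

n = 100
delocalized = [1.0] * n                # weight spread evenly over all nodes
localized = [1.0] + [0.0] * (n - 1)    # all weight on a single node

ipr_delocalized = inverse_participation_ratio(delocalized)  # close to 1 / n
ipr_localized = inverse_participation_ratio(localized)      # equal to 1
```

Eigenvectors with an IPR far above the 1/N scale are the localized ones, which is why they point to a small set of dominant nodes.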

  10. Analysis of bHLH coding genes using gene co-expression network approach.

    PubMed

    Srivastava, Swati; Sanchita; Singh, Garima; Singh, Noopur; Srivastava, Gaurava; Sharma, Ashok

    2016-07-01

    Network analysis provides a powerful framework for the interpretation of data. It uses novel reference network-based metrics for module evolution. These could be used to identify modules of highly connected genes showing variation in a co-expression network. In this study, a co-expression network-based approach was used for analyzing the genes from microarray data. Our approach consists of a simple but robust rank-based network construction. The publicly available gene expression data of Solanum tuberosum under cold and heat stresses were considered to create and analyze a gene co-expression network. The analysis provides highly co-expressed modules of bHLH coding genes based on correlation values. Our approach was to analyze the variation of gene expression according to the time period of stress through a co-expression network approach. As a result, seed genes were identified that show multiple connections with other genes in the same cluster. Seed genes were found to vary across different time periods of stress. These seed genes may be used further as marker genes for developing stress-tolerant plant species.

  11. Investigating the Combinatory Effects of Biological Networks on Gene Co-expression

    PubMed Central

    Zhang, Cheng; Lee, Sunjae; Mardinoglu, Adil; Hua, Qiang

    2016-01-01

    Co-expressed genes often share similar functions, and gene co-expression networks have been widely used in studying the functionality of gene modules. Previous analysis indicated that genes are more likely to be co-expressed if they are regulated by the same transcription factors, form protein complexes, or share similar topological properties in protein-protein interaction networks. Here, we reconstructed transcriptional regulatory and protein-protein networks for Saccharomyces cerevisiae using well-established databases, and we evaluated their co-expression activities using publicly available gene expression data. Based on our network-dependent analysis, we found that genes that were co-regulated in the transcription regulatory networks and shared similar neighbors in the protein-protein networks were more likely to be co-expressed. Moreover, their biological functions were closely related. PMID:27445830

  12. Drug Repositioning through Systematic Mining of Gene Coexpression Networks in Cancer

    PubMed Central

    Ivliev, Alexander E.; ‘t Hoen, Peter A. C.; Borisevich, Dmitrii; Nikolsky, Yuri; Sergeeva, Marina G.

    2016-01-01

    Gene coexpression network analysis is a powerful “data-driven” approach essential for understanding cancer biology and mechanisms of tumor development. Yet, despite the completion of thousands of studies on cancer gene expression, there have been few attempts to normalize and integrate co-expression data from scattered sources in a concise “meta-analysis” framework. We generated such a resource by exploring gene coexpression networks in 82 microarray datasets from 9 major human cancer types. The analysis was conducted using an elaborate weighted gene coexpression network (WGCNA) methodology and identified over 3,000 robust gene coexpression modules. The modules covered a range of known tumor features, such as proliferation, extracellular matrix remodeling, hypoxia, inflammation, angiogenesis, tumor differentiation programs, specific signaling pathways, genomic alterations, and biomarkers of individual tumor subtypes. To prioritize genes with respect to those tumor features, we ranked genes within each module by connectivity, leading to identification of module-specific functionally prominent hub genes. To showcase the utility of this network information, we positioned known cancer drug targets within the coexpression networks and predicted that Anakinra, an anti-rheumatoid therapeutic agent, may be promising for development in colorectal cancer. We offer a comprehensive, normalized and well documented collection of >3000 gene coexpression modules in a variety of cancers as a rich data resource to facilitate further progress in cancer research. PMID:27824868
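Ranking genes within a module by connectivity, as described in the abstract above, can be sketched as follows. The sketch assumes an unsigned weighted network in which the adjacency between two genes is the absolute Pearson correlation raised to a soft-threshold power; the power of 6, the helper names, and the toy expression profiles are illustrative assumptions and do not reproduce the authors' pipeline or the WGCNA R package.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)

def weighted_connectivity(expr, beta=6):
    """Connectivity k_i = sum over j != i of |cor(i, j)| ** beta."""
    genes = sorted(expr)
    k = {g: 0.0 for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            weight = abs(pearson(expr[gi], expr[gj])) ** beta
            k[gi] += weight
            k[gj] += weight
    return k

def rank_hubs(expr, beta=6):
    """Genes ordered from most to least connected; ties are broken by name."""
    k = weighted_connectivity(expr, beta)
    return sorted(k, key=lambda g: (-k[g], g))

# Hypothetical module: g1 and g2 each follow "hub" plus independent noise.
module_expr = {
    "hub": [1, 2, 3, 4, 5, 6],
    "g1": [3, 0, 3, 4, 3, 8],
    "g2": [1, 4, 1, 2, 7, 6],
}
ranking = rank_hubs(module_expr)
```

Because g1 and g2 correlate more strongly with "hub" than with each other, "hub" receives the largest connectivity and heads the ranking.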

  13. Drug Repositioning through Systematic Mining of Gene Coexpression Networks in Cancer.

    PubMed

    Ivliev, Alexander E; 't Hoen, Peter A C; Borisevich, Dmitrii; Nikolsky, Yuri; Sergeeva, Marina G

    2016-01-01

    Gene coexpression network analysis is a powerful "data-driven" approach essential for understanding cancer biology and mechanisms of tumor development. Yet, despite the completion of thousands of studies on cancer gene expression, there have been few attempts to normalize and integrate co-expression data from scattered sources in a concise "meta-analysis" framework. We generated such a resource by exploring gene coexpression networks in 82 microarray datasets from 9 major human cancer types. The analysis was conducted using an elaborate weighted gene coexpression network (WGCNA) methodology and identified over 3,000 robust gene coexpression modules. The modules covered a range of known tumor features, such as proliferation, extracellular matrix remodeling, hypoxia, inflammation, angiogenesis, tumor differentiation programs, specific signaling pathways, genomic alterations, and biomarkers of individual tumor subtypes. To prioritize genes with respect to those tumor features, we ranked genes within each module by connectivity, leading to identification of module-specific functionally prominent hub genes. To showcase the utility of this network information, we positioned known cancer drug targets within the coexpression networks and predicted that Anakinra, an anti-rheumatoid therapeutic agent, may be promising for development in colorectal cancer. We offer a comprehensive, normalized and well documented collection of >3000 gene coexpression modules in a variety of cancers as a rich data resource to facilitate further progress in cancer research.

  14. Differentially correlated genes in co-expression networks control phenotype transitions.

    PubMed

    Thomas, Lina D; Vyshenska, Dariia; Shulzhenko, Natalia; Yambartsev, Anatoly; Morgun, Andrey

    2016-01-01

    Co-expression networks are a tool widely used for analysis of "Big Data" in biology that can range from transcriptomes to proteomes, metabolomes and more recently even microbiomes. Several methods were proposed to answer biological questions interrogating these networks. Differential co-expression analysis is a recent approach that measures how gene interactions change when a biological system transitions from one state to another. Although the importance of differentially co-expressed genes to identify dysregulated pathways has been noted, their role in gene regulation is not well studied. Herein we investigated differentially co-expressed genes in a relatively simple mono-causal process (B lymphocyte deficiency) and in a complex multi-causal system (cervical cancer). Co-expression networks of B cell deficiency (Control and BcKO) were reconstructed using Pearson correlation coefficient for two Mus musculus datasets: B10.A strain (12 normal, 12 BcKO) and BALB/c strain (10 normal, 10 BcKO). Co-expression networks of cervical cancer (normal and cancer) were reconstructed using local partial correlation method for five datasets (total of 64 normal, 148 cancer). Differentially correlated pairs were identified along with the location of their genes in BcKO and in cancer networks. Minimum Shortest Path and Bi-partite Betweenness Centrality were statistically evaluated for differentially co-expressed genes in corresponding networks. Results: We show that in B cell deficiency the differentially co-expressed genes are highly enriched with immunoglobulin genes (causal genes). In cancer we found that differentially co-expressed genes act as "bottlenecks" rather than causal drivers with most flows that come from the key driver genes to the peripheral genes passing through differentially co-expressed genes. Using in vitro knockdown experiments for two out of 14 differentially co-expressed genes found in cervical cancer (FGFR2 and CACYBP), we showed that they play

  15. Differentially correlated genes in co-expression networks control phenotype transitions

    PubMed Central

    Thomas, Lina D.; Vyshenska, Dariia; Shulzhenko, Natalia; Yambartsev, Anatoly; Morgun, Andrey

    2016-01-01

    Background: Co-expression networks are a tool widely used for analysis of “Big Data” in biology that can range from transcriptomes to proteomes, metabolomes and more recently even microbiomes. Several methods were proposed to answer biological questions interrogating these networks. Differential co-expression analysis is a recent approach that measures how gene interactions change when a biological system transitions from one state to another. Although the importance of differentially co-expressed genes to identify dysregulated pathways has been noted, their role in gene regulation is not well studied. Herein we investigated differentially co-expressed genes in a relatively simple mono-causal process (B lymphocyte deficiency) and in a complex multi-causal system (cervical cancer). Methods: Co-expression networks of B cell deficiency (Control and BcKO) were reconstructed using Pearson correlation coefficient for two Mus musculus datasets: B10.A strain (12 normal, 12 BcKO) and BALB/c strain (10 normal, 10 BcKO). Co-expression networks of cervical cancer (normal and cancer) were reconstructed using local partial correlation method for five datasets (total of 64 normal, 148 cancer). Differentially correlated pairs were identified along with the location of their genes in BcKO and in cancer networks. Minimum Shortest Path and Bi-partite Betweenness Centrality were statistically evaluated for differentially co-expressed genes in corresponding networks. Results: We show that in B cell deficiency the differentially co-expressed genes are highly enriched with immunoglobulin genes (causal genes). In cancer we found that differentially co-expressed genes act as “bottlenecks” rather than causal drivers with most flows that come from the key driver genes to the peripheral genes passing through differentially co-expressed genes. Using in vitro knockdown experiments for two out of 14 differentially co-expressed genes found in cervical cancer (FGFR2 and CACYBP), we
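The idea of a differentially correlated gene pair from the abstract above can be illustrated with a minimal sketch: compute the Pearson correlation of each gene pair separately in two states and keep the pairs whose correlation changes by at least a chosen amount. The fixed cutoff, the function names, and the toy profiles are assumptions made for illustration; the study itself evaluates such changes statistically, which this sketch does not do.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)

def differential_pairs(expr_a, expr_b, min_delta):
    """Gene pairs whose correlation differs by at least min_delta between two states."""
    genes = sorted(set(expr_a) & set(expr_b))
    hits = []
    for g1, g2 in combinations(genes, 2):
        r_a = pearson(expr_a[g1], expr_a[g2])
        r_b = pearson(expr_b[g1], expr_b[g2])
        if abs(r_a - r_b) >= min_delta:
            hits.append((g1, g2, r_a, r_b))
    return hits

# Hypothetical profiles: g2 follows g1 in the normal state and opposes it in disease.
expr_normal = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, 3, 2, 4]}
expr_disease = {"g1": [1, 2, 3, 4], "g2": [8, 6, 4, 2], "g3": [1, 3, 2, 4]}
changed = differential_pairs(expr_normal, expr_disease, min_delta=1.0)
```

Here the pairs involving g2 are reported because their correlations flip sign between states, while the g1 and g3 pair is unchanged and is left out.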

  16. Constructing gene co-expression networks and predicting functions of unknown genes by random matrix theory

    PubMed Central

    Luo, Feng; Yang, Yunfeng; Zhong, Jianxin; Gao, Haichun; Khan, Latifur; Thompson, Dorothea K; Zhou, Jizhong

    2007-01-01

    Background Large-scale sequencing of entire genomes has ushered in a new age in biology. One of the next grand challenges is to dissect the cellular networks consisting of many individual functional modules. Defining co-expression networks without ambiguity based on genome-wide microarray data is difficult and current methods are not robust and consistent with different data sets. This is particularly problematic for little understood organisms since not much existing biological knowledge can be exploited for determining the threshold to differentiate true correlation from random noise. Random matrix theory (RMT), which has been widely and successfully used in physics, is a powerful approach to distinguish system-specific, non-random properties embedded in complex systems from random noise. Here, we have hypothesized that the universal predictions of RMT are also applicable to biological systems and the correlation threshold can be determined by characterizing the correlation matrix of microarray profiles using random matrix theory. Results Application of random matrix theory to microarray data of S. oneidensis, E. coli, yeast, A. thaliana, Drosophila, mouse and human indicates that there is a sharp transition of nearest neighbour spacing distribution (NNSD) of correlation matrix after gradually removing certain elements inside the matrix. Testing on an in silico modular model has demonstrated that this transition can be used to determine the correlation threshold for revealing modular co-expression networks. The co-expression network derived from yeast cell cycling microarray data is supported by gene annotation. The topological properties of the resulting co-expression network agree well with the general properties of biological networks. Computational evaluations have shown that the RMT approach is sensitive and robust. Furthermore, evaluation on sampled expression data of an in silico modular gene system has shown that under-sampled expressions do not affect the

  17. Co-expression network analysis of duplicate genes in maize (Zea mays L.) reveals no subgenome bias.

    PubMed

    Li, Lin; Briskine, Roman; Schaefer, Robert; Schnable, Patrick S; Myers, Chad L; Flagel, Lex E; Springer, Nathan M; Muehlbauer, Gary J

    2016-11-04

    Gene duplication is prevalent in many species and can result in coding and regulatory divergence. Gene duplications can be classified as whole genome duplication (WGD), tandem and inserted (non-syntenic). In maize, WGD resulted in the subgenomes maize1 and maize2, of which maize1 is considered the dominant subgenome. However, the landscape of co-expression network divergence of duplicate genes in maize is still largely uncharacterized. To address the consequence of gene duplication on co-expression network divergence, we developed a gene co-expression network from RNA-seq data derived from 64 different tissues/stages of the maize reference inbred B73. WGD, tandem and inserted gene duplications exhibited distinct regulatory divergence. Inserted duplicate genes were more likely to be singletons in the co-expression networks, while WGD duplicate genes were likely to be co-expressed with other genes. Tandem duplicate genes were enriched in the co-expression pattern where co-expressed genes were nearly identical for the duplicates in the network. Older gene duplications exhibit more extensive co-expression variation than younger duplications. Overall, non-syntenic genes primarily from inserted duplications show more co-expression divergence. Also, such enlarged co-expression divergence is significantly related to duplication age. Moreover, subgenome dominance was not observed in the co-expression networks: maize1 and maize2 exhibit similar levels of intra subgenome correlations. Intriguingly, the level of inter subgenome co-expression was similar to the level of intra subgenome correlations, genes from specific subgenomes were not likely to be enriched in co-expression network modules, and the hub genes were not predominantly from any specific subgenome in maize. Our work provides a comprehensive analysis of maize co-expression network divergence for three different types of gene duplications and identifies potential relationships between duplication types

  18. Predicting glioblastoma prognosis networks using weighted gene co-expression network analysis on TCGA data

    PubMed Central

    2012-01-01

    Background Using gene co-expression analysis, researchers were able to predict clusters of genes with consistent functions that are relevant to cancer development and prognosis. We applied a weighted gene co-expression network (WGCN) analysis algorithm on glioblastoma multiforme (GBM) data obtained from the TCGA project and predicted a set of gene co-expression networks which are related to GBM prognosis. Methods We modified the Quasi-Clique Merger algorithm (QCM algorithm) into edge-covering Quasi-Clique Merger algorithm (eQCM) for mining weighted sub-network in WGCN. Each sub-network is considered a set of features to separate patients into two groups using K-means algorithm. Survival times of the two groups are compared using log-rank test and Kaplan-Meier curves. Simulations using random sets of genes are carried out to determine the thresholds for log-rank test p-values for network selection. Sub-networks with p-values less than their corresponding thresholds were further merged into clusters based on overlap ratios (>50%). The functions for each cluster are analyzed using gene ontology enrichment analysis. Results Using the eQCM algorithm, we identified 8,124 sub-networks in the WGCN, out of which 170 sub-networks show p-values less than their corresponding thresholds. They were then merged into 16 clusters. Conclusions We identified 16 gene clusters associated with GBM prognosis using the eQCM algorithm. Our results not only confirmed previous findings including the importance of cell cycle and immune response in GBM, but also suggested important epigenetic events in GBM development and prognosis. PMID:22536863

  19. RiceFREND: a platform for retrieving coexpressed gene networks in rice.

    PubMed

    Sato, Yutaka; Namiki, Nobukazu; Takehisa, Hinako; Kamatsuki, Kaori; Minami, Hiroshi; Ikawa, Hiroshi; Ohyanagi, Hajime; Sugimoto, Kazuhiko; Itoh, Jun-Ichi; Antonio, Baltazar A; Nagamura, Yoshiaki

    2013-01-01

    Similarity of gene expression across a wide range of biological conditions can be efficiently used in characterization of gene function. We have constructed a rice gene coexpression database, RiceFREND (http://ricefrend.dna.affrc.go.jp/), to identify gene modules with similar expression profiles and provide a platform for more accurate prediction of gene functions. Coexpression analysis of 27 201 genes was performed against 815 microarray data derived from expression profiling of various organs and tissues at different developmental stages, mature organs throughout the growth from transplanting until harvesting in the field and plant hormone treatment conditions, using a single microarray platform. The database is provided with two search options, namely, 'single guide gene search' and 'multiple guide gene search' to efficiently retrieve information on coexpressed genes. A user-friendly web interface facilitates visualization and interpretation of gene coexpression networks in HyperTree, Cytoscape Web and Graphviz formats. In addition, analysis tools for identification of enriched Gene Ontology terms and cis-elements provide clue for better prediction of biological functions associated with the coexpressed genes. These features allow users to clarify gene functions and gene regulatory networks that could lead to a more thorough understanding of many complex agronomic traits.

  20. Identification of hub genes and pathways associated with retinoblastoma based on co-expression network analysis.

    PubMed

    Wang, Q L; Chen, X; Zhang, M H; Shen, Q H; Qin, Z M

    2015-12-08

    The objective of this paper was to identify hub genes and pathways associated with retinoblastoma using centrality analysis of the co-expression network and pathway-enrichment analysis. The co-expression network of retinoblastoma was constructed by weighted gene co-expression network analysis (WGCNA) based on differentially expressed (DE) genes, and clusters were obtained through the molecular complex detection (MCODE) algorithm. Degree centrality analysis of the co-expression network was performed to explore hub genes present in retinoblastoma. Pathway-enrichment analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Validation of hub gene expression in retinoblastoma was performed by reverse transcription-polymerase chain reaction (RT-PCR) analysis. The co-expression network based on 221 DE genes between retinoblastoma and normal controls consisted of 210 nodes and 3965 edges, and 5 clusters of the network were evaluated. By assessing the centrality analysis of the co-expression network, 21 hub genes were identified, such as SNORD115-41, RASSF2, and SNORD115-44. According to RT-PCR analysis, 16 of the 21 hub genes were differently expressed, including RASSF2 and CDCA7, and 5 were not differently expressed in retinoblastoma compared to normal controls. Pathway analysis showed that genes in 2 clusters were enriched in 3 pathways: purine metabolism, p53 signaling pathway, and melanogenesis. In this study, we successfully identified 16 hub genes and 3 pathways associated with retinoblastoma, which may be potential biomarkers for early detection and therapy for retinoblastoma.
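
    Degree centrality, which the abstract above uses to pick hub genes, can be sketched in a few lines. This is a minimal illustration on an unweighted, undirected edge list; the helper names `degree_centrality` and `top_hubs` and the toy gene labels are hypothetical, and the study's actual cutoff for calling a gene a hub is not stated in the abstract.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected, unweighted edge list."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

def top_hubs(edges, k):
    """Return the k nodes with the highest degree centrality (ties broken by name)."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda node: (-scores[node], node))[:k]

# Hypothetical toy network; the gene labels are placeholders, not study data.
toy_edges = [("geneA", "geneB"), ("geneA", "geneC"), ("geneA", "geneD"), ("geneB", "geneC")]
hubs = top_hubs(toy_edges, 2)
```

    In this toy network `geneA` touches every other node, so it ranks first.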

  1. Annotation of gene function in citrus using gene expression information and co-expression networks

    PubMed Central

    2014-01-01

    Background The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. Results We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. Conclusions Integration of citrus gene co-expression networks

  2. Annotation of gene function in citrus using gene expression information and co-expression networks.

    PubMed

    Wong, Darren C J; Sweetman, Crystal; Ford, Christopher M

    2014-07-15

    The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world's most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a "guilt-by-association" principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. 
    Integration of citrus gene co-expression networks, functional enrichment analysis and gene

  3. Construction of citrus gene coexpression networks from microarray data using random matrix theory.

    PubMed

    Du, Dongliang; Rawat, Nidhi; Deng, Zhanao; Gmitter, Fred G

    2015-01-01

    After the sequencing of citrus genomes, gene function annotation is becoming a new challenge. Gene coexpression analysis can be employed for function annotation using publicly available microarray data sets. In this study, 230 sweet orange (Citrus sinensis) microarrays were used to construct seven coexpression networks, including one condition-independent and six condition-dependent (Citrus canker, Huanglongbing, leaves, flavedo, albedo, and flesh) networks. In total, these networks contain 37 633 edges among 6256 nodes (genes), which accounts for 52.11% measurable genes of the citrus microarray. Then, these networks were partitioned into functional modules using the Markov Cluster Algorithm. Significantly enriched Gene Ontology biological process terms and KEGG pathway terms were detected for 343 and 60 modules, respectively. Finally, independent verification of these networks was performed using another expression data of 371 genes. This study provides new targets for further functional analyses in citrus.

  4. Construction of citrus gene coexpression networks from microarray data using random matrix theory

    PubMed Central

    Du, Dongliang; Rawat, Nidhi; Deng, Zhanao; Gmitter, Fred G.

    2015-01-01

    After the sequencing of citrus genomes, gene function annotation is becoming a new challenge. Gene coexpression analysis can be employed for function annotation using publicly available microarray data sets. In this study, 230 sweet orange (Citrus sinensis) microarrays were used to construct seven coexpression networks, including one condition-independent and six condition-dependent (Citrus canker, Huanglongbing, leaves, flavedo, albedo, and flesh) networks. In total, these networks contain 37 633 edges among 6256 nodes (genes), which accounts for 52.11% measurable genes of the citrus microarray. Then, these networks were partitioned into functional modules using the Markov Cluster Algorithm. Significantly enriched Gene Ontology biological process terms and KEGG pathway terms were detected for 343 and 60 modules, respectively. Finally, independent verification of these networks was performed using another expression data of 371 genes. This study provides new targets for further functional analyses in citrus. PMID:26504573

  5. Identifying key genes in rheumatoid arthritis by weighted gene co-expression network analysis.

    PubMed

    Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin

    2017-08-01

    This study aimed to identify rheumatoid arthritis (RA) related genes based on microarray data using the WGCNA (weighted gene co-expression network analysis) method. Two gene expression profile datasets, GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls), were downloaded from the Gene Expression Omnibus database. Characteristic genes were identified using the metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance was defined as the average gene significance of all genes used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subject to functional annotation and pathway enrichment analysis using the Database for Annotation, Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in a total of 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs, sanguinarine and papaverine, were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
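
    The definition of module significance given above (the average gene significance over a module's genes) can be sketched as follows. The abstract does not say how gene significance itself was computed; this sketch assumes the common convention of taking the absolute Pearson correlation between a gene's expression and a 0/1 disease status. All function names and the toy values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def gene_significance(expression, trait):
    """Assumed definition: |correlation| between expression and the trait."""
    return abs(pearson(expression, trait))

def module_significance(module_expression, trait):
    """Average gene significance over all genes in one module."""
    scores = [gene_significance(profile, trait) for profile in module_expression.values()]
    return sum(scores) / len(scores)

# Hypothetical toy module: four samples, trait 0 = control, 1 = RA.
trait = [0, 0, 1, 1]
module = {
    "gene1": [1.0, 1.0, 3.0, 3.0],  # rises with disease status
    "gene2": [3.0, 3.0, 1.0, 1.0],  # falls with disease status
    "gene3": [1.0, 2.0, 1.0, 2.0],  # unrelated to disease status
}
significance = module_significance(module, trait)
```

    Taking the absolute value lets up- and down-regulated genes both count toward the module's association with the trait.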

  6. Evaluation of Gene Association Methods for Coexpression Network Construction and Biological Knowledge Discovery

    PubMed Central

    Kumari, Sapna; Nie, Jeff; Chen, Huann-Sheng; Ma, Hao; Stewart, Ron; Li, Xiang; Lu, Meng-Zhu; Taylor, William M.; Wei, Hairong

    2012-01-01

    Background Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. Methods and Results In this study, we compared eight gene association methods – Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson – and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined the behaviors of different methods on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. Conclusions We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave distinctly from the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best performing method for gene association and coexpression network construction. PMID:23226279
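
    The contrast between Pearson and rank-based association measures can be seen in a small sketch. It covers only two of the eight methods compared above (Pearson and Spearman), and the toy expression profiles are hypothetical: one gene follows another monotonically but non-linearly. Spearman is computed here as the Pearson correlation of average ranks, which is its standard definition.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical profiles: gene_y is an exponential (monotonic) function of gene_x.
gene_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_y = [math.exp(v) for v in gene_x]
r_pearson = pearson(gene_x, gene_y)
r_spearman = spearman(gene_x, gene_y)
```

    On this toy pair the Spearman coefficient is 1 up to rounding, while Pearson is noticeably lower because the relationship is not linear.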

  7. Context Specific and Differential Gene Co-expression Networks via Bayesian Biclustering

    PubMed Central

    McDowell, Ian C.; Zhao, Shiwen; Brown, Christopher D.; Engelhardt, Barbara E.

    2016-01-01

    Identifying latent structure in high-dimensional genomic data is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-regulated genes that covary in all of the samples or in only a subset of the samples. Our biclustering method, BicMix, allows overcomplete representations of the data, computational tractability, and joint modeling of unknown confounders and biological signals. Compared with state-of-the-art biclustering methods, BicMix recovers latent structure with higher precision across diverse simulation scenarios. Further, we develop a principled method to recover context specific gene co-expression networks from the estimated sparse biclustering matrices. We apply BicMix to breast cancer gene expression data and to gene expression data from a cardiovascular study cohort, and we recover gene co-expression networks that are differential across ER+ and ER- samples and across male and female samples. We apply BicMix to the Genotype-Tissue Expression (GTEx) pilot data, and we find tissue specific gene networks. We validate these findings by using our tissue specific networks to identify trans-eQTLs specific to one of four primary tissues. PMID:27467526

  8. Massive-Scale Gene Co-Expression Network Construction and Robustness Testing Using Random Matrix Theory

    PubMed Central

    Isaacson, Sven; Luo, Feng; Feltus, Frank A.; Smith, Melissa C.

    2013-01-01

    The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. For such a test, hundreds of networks are needed and hence a tool to rapidly construct these networks. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust. PMID:23409071

  9. Massive-scale gene co-expression network construction and robustness testing using random matrix theory.

    PubMed

    Gibson, Scott M; Ficklin, Stephen P; Isaacson, Sven; Luo, Feng; Feltus, Frank A; Smith, Melissa C

    2013-01-01

    The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. For such a test, hundreds of networks are needed and hence a tool to rapidly construct these networks. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust.

  10. Pan- and core- network analysis of co-expression genes in a model plant

    DOE PAGES

    He, Fei; Maslov, Sergei

    2016-12-16

    Genome-wide gene expression experiments have been performed using the model plant Arabidopsis during the last decade. Some studies involved construction of coexpression networks, a popular technique used to identify groups of co-regulated genes, to infer unknown gene functions. One approach is to construct a single coexpression network by combining multiple expression datasets generated in different labs. We advocate a complementary approach in which we construct a large collection of 134 coexpression networks based on expression datasets reported in individual publications. To this end we reanalyzed public expression data. To describe this collection of networks we introduced concepts of ‘pan-network’ and ‘core-network’ representing union and intersection between sizeable fractions of individual networks, respectively. Here, we showed that these two types of networks are different both in terms of their topology and biological function of interacting genes. For example, the modules of the pan-network are enriched in regulatory and signaling functions, while the modules of the core-network tend to include components of large macromolecular complexes such as ribosomes and photosynthetic machinery. Our analysis is aimed to help the plant research community to better explore the information contained within the existing vast collection of gene expression data in Arabidopsis.

  11. Pan- and core- network analysis of co-expression genes in a model plant

    SciTech Connect

    He, Fei; Maslov, Sergei

    2016-12-16

    Genome-wide gene expression experiments have been performed using the model plant Arabidopsis during the last decade. Some studies involved construction of coexpression networks, a popular technique used to identify groups of co-regulated genes, to infer unknown gene functions. One approach is to construct a single coexpression network by combining multiple expression datasets generated in different labs. We advocate a complementary approach in which we construct a large collection of 134 coexpression networks based on expression datasets reported in individual publications. To this end we reanalyzed public expression data. To describe this collection of networks we introduced concepts of ‘pan-network’ and ‘core-network’ representing union and intersection between a sizeable fractions of individual networks, respectively. Here, we showed that these two types of networks are different both in terms of their topology and biological function of interacting genes. For example, the modules of the pan-network are enriched in regulatory and signaling functions, while the modules of the core-network tend to include components of large macromolecular complexes such as ribosomes and photosynthetic machinery. Our analysis is aimed to help the plant research community to better explore the information contained within the existing vast collection of gene expression data in Arabidopsis.

  12. Pan- and core- network analysis of co-expression genes in a model plant

    PubMed Central

    He, Fei; Maslov, Sergei

    2016-01-01

    Genome-wide gene expression experiments have been performed using the model plant Arabidopsis during the last decade. Some studies involved construction of coexpression networks, a popular technique used to identify groups of co-regulated genes, to infer unknown gene functions. One approach is to construct a single coexpression network by combining multiple expression datasets generated in different labs. We advocate a complementary approach in which we construct a large collection of 134 coexpression networks based on expression datasets reported in individual publications. To this end we reanalyzed public expression data. To describe this collection of networks we introduced concepts of ‘pan-network’ and ‘core-network’ representing union and intersection between a sizeable fractions of individual networks, respectively. We showed that these two types of networks are different both in terms of their topology and biological function of interacting genes. For example, the modules of the pan-network are enriched in regulatory and signaling functions, while the modules of the core-network tend to include components of large macromolecular complexes such as ribosomes and photosynthetic machinery. Our analysis is aimed to help the plant research community to better explore the information contained within the existing vast collection of gene expression data in Arabidopsis. PMID:27982071
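
    The pan-network and core-network idea (a union-like and an intersection-like combination of many individual networks) can be sketched with edge counts. This is one possible reading of the abstract, not the authors' definition: an edge enters the pan-network if it appears in at least `pan_min` networks and the core-network if it appears in at least `core_min` networks. Both thresholds, the function names and the toy networks are hypothetical.

```python
from collections import Counter

def normalize_edge(u, v):
    """Store an undirected edge in a canonical (sorted) orientation."""
    return (u, v) if u <= v else (v, u)

def pan_and_core(networks, pan_min, core_min):
    """Return (pan_edges, core_edges) from a list of undirected edge lists.

    pan_min  : minimum number of networks an edge must appear in to join the
               pan-network (1 gives the plain union)
    core_min : minimum number of networks an edge must appear in to join the
               core-network (len(networks) gives the strict intersection)
    """
    counts = Counter()
    for edges in networks:
        # Count each edge at most once per network.
        counts.update({normalize_edge(u, v) for u, v in edges})
    pan_edges = {edge for edge, c in counts.items() if c >= pan_min}
    core_edges = {edge for edge, c in counts.items() if c >= core_min}
    return pan_edges, core_edges

# Hypothetical toy collection of three small networks.
toy_networks = [
    [("a", "b"), ("b", "c")],
    [("b", "a"), ("c", "d")],
    [("a", "b"), ("b", "c")],
]
pan_edges, core_edges = pan_and_core(toy_networks, pan_min=1, core_min=3)
```

    Relaxing `core_min` below the number of networks, or raising `pan_min` above 1, corresponds to combining only a sizeable fraction of the networks rather than all of them.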

  13. Reconstructing differentially co-expressed gene modules and regulatory networks of soybean cells

    PubMed Central

    2012-01-01

    Background Current experimental evidence indicates that functionally related genes show coordinated expression in order to perform their cellular functions. In this way, the cell transcriptional machinery can respond optimally to internal or external stimuli. This provides a research opportunity to identify and study co-expressed gene modules whose transcription is controlled by shared gene regulatory networks. Results We developed and integrated a set of computational methods of differential gene expression analysis, gene clustering, gene network inference, gene function prediction, and DNA motif identification to automatically identify differentially co-expressed gene modules, reconstruct their regulatory networks, and validate their correctness. We tested the methods using microarray data derived from soybean cells grown under various stress conditions. Our methods were able to identify 42 coherent gene modules within which average gene expression correlation coefficients are greater than 0.8 and reconstruct their putative regulatory networks. A total of 32 modules and their regulatory networks were further validated by the coherence of predicted gene functions and the consistency of putative transcription factor binding motifs. Approximately half of the 32 modules were partially supported by the literature, which demonstrates that the bioinformatic methods used can help elucidate the molecular responses of soybean cells upon various environmental stresses. Conclusions The bioinformatics methods and genome-wide data sources for gene expression, clustering, regulation, and function analysis were integrated seamlessly into one modular protocol to systematically analyze and infer modules and networks from only differential expression genes in soybean cells grown under stress conditions. 
    Our approach appears to effectively reduce the complexity of the problem, and is sufficiently robust and accurate to generate a rather complete and detailed view of putative soybean

  14. Inferring pathway crosstalk networks using gene set co-expression signatures.

    PubMed

    Wang, Ting; Gu, Jin; Yuan, Jun; Tao, Ran; Li, Yanda; Li, Shao

    2013-07-01

    Constructing molecular interaction networks in cells is important for understanding the underlying mechanisms of biological processes. Except for single gene analysis, several gene set-based methods have been proposed to infer pathway crosstalk by analyzing large-scale gene expression data. But most of them take all pathway genes as a whole to infer the crosstalk. Biological evidence suggests that the pathway crosstalk usually occurs between some subsets rather than the whole sets of pathway genes. In this study, we propose a novel method, sGSCA (signature-based gene set co-expression analysis) which can use the co-expression correlations between subsets of pathway genes to infer the pathway crosstalk networks. The method applies sparse canonical correlation analysis (sCCA) to measure the pathway level co-expression and simultaneously obtain the subsets or signature genes that contribute to the co-expression of pathways. On simulated datasets, sGSCA can efficiently detect pathway crosstalk and the corresponding highly correlated signature genes. We applied sGSCA to two cancer gene expression datasets (one for hepatocellular cancer and the other for lung cancer). In the inferred networks, we found several important pathway crosstalks related to the cancers. The identified signature genes also show high enrichment for the cancer related genes. sGSCA can infer pathway crosstalk networks using large-scale gene expression data, and should be a useful tool for systematically studying the molecular mechanisms of complex diseases on both pathway and gene levels at the same time.

  15. Weighted gene co-expression network analysis reveals key genes involved in pancreatic ductal adenocarcinoma development.

    PubMed

    Giulietti, Matteo; Occhipinti, Giulia; Principato, Giovanni; Piva, Francesco

    2016-08-01

    Pancreatic ductal adenocarcinoma (PDAC) is a highly aggressive malignancy. Up till now, the patient's prognosis remains poor which, among others, is due to the paucity of reliable early diagnostic biomarkers. In the past, candidate diagnostic biomarkers and therapeutic targets have been delineated from genes that were found to be differentially expressed in normal versus tumour samples. Recently, new systems biology approaches have been developed to analyse gene expression data, which may yield new biomarkers. As of yet, the weighted gene co-expression network analysis (WGCNA) tool has not been applied to PDAC microarray-based gene expression data. PDAC microarray-based gene expression datasets, listed in the Gene Expression Omnibus (GEO) database, were analysed. After pre-processing of the data, we built two final datasets, Normal and PDAC, encompassing 104 and 129 patient samples, respectively. Next, we constructed a weighted gene co-expression network and identified modules of co-expressed genes distinguishing normal from disease conditions. Functional annotations of the genes in these modules were carried out to highlight PDAC-associated molecular pathways and common regulatory mechanisms. Finally, overall survival analyses were carried out to assess the suitability of the genes identified as prognostic biomarkers. Using WGCNA, we identified several key genes that may play important roles in PDAC. These genes are mainly related to either endoplasmic reticulum, mitochondrion or membrane functions, exhibit transferase or hydrolase activities and are involved in biological processes such as lipid metabolism or transmembrane transport. As a validation of the applied method, we found that some of the identified key genes (CEACAM1, MCU, VDAC1, CYCS, C15ORF52, TMEM51, LARP1 and ERLIN2) have previously been reported by others as potential PDAC biomarkers. Using overall survival analyses, we found that several of the newly identified genes may serve as biomarkers to

  16. Evolutionary Conservation and Divergence of Gene Coexpression Networks in Gossypium (Cotton) Seeds.

    PubMed

    Hu, Guanjing; Hovav, Ran; Grover, Corrinne E; Faigenboim-Doron, Adi; Kadmon, Noa; Page, Justin T; Udall, Joshua A; Wendel, Jonathan F

    2016-12-01

    The cotton genus (Gossypium) provides a superior system for the study of diversification, genome evolution, polyploidization, and human-mediated selection. To gain insight into phenotypic diversification in cotton seeds, we conducted coexpression network analysis of developing seeds from diploid and allopolyploid cotton species and explored network properties. Key network modules and functional associations were identified related to seed oil content and seed weight. We compared species-specific networks to reveal topological changes, including rewired edges and differentially coexpressed genes, associated with speciation, polyploidy, and cotton domestication. Network comparisons among species indicate that topologies are altered in addition to gene expression profiles, indicating that changes in transcriptomic coexpression relationships play a role in the developmental architecture of cotton seed development. The global network topology of allopolyploids, especially for domesticated G. hirsutum, resembles the network of the A-genome diploid more than that of the D-genome parent, despite its D-like phenotype in oil content. Expression modifications associated with allopolyploidy include coexpression level dominance and transgressive expression, suggesting that the transcriptomic architecture in polyploids is to some extent a modular combination of that of its progenitor genomes. Among allopolyploids, intermodular relationships are more preserved between two different wild allopolyploid species than they are between wild and domesticated forms of a cultivated cotton, and regulatory connections of oil synthesis-related pathways are denser and more closely clustered in domesticated vs. wild G. hirsutum. These results demonstrate substantial modification of genic coexpression under domestication. 
Our work demonstrates how network inference informs our understanding of the transcriptomic architecture of phenotypic variation associated with temporal scales ranging from

  17. Evolutionary Conservation and Divergence of Gene Coexpression Networks in Gossypium (Cotton) Seeds

    PubMed Central

    Hu, Guanjing; Grover, Corrinne E.; Faigenboim-Doron, Adi; Kadmon, Noa; Page, Justin T.; Udall, Joshua A.

    2016-01-01

    The cotton genus (Gossypium) provides a superior system for the study of diversification, genome evolution, polyploidization, and human-mediated selection. To gain insight into phenotypic diversification in cotton seeds, we conducted coexpression network analysis of developing seeds from diploid and allopolyploid cotton species and explored network properties. Key network modules and functional associations were identified related to seed oil content and seed weight. We compared species-specific networks to reveal topological changes, including rewired edges and differentially coexpressed genes, associated with speciation, polyploidy, and cotton domestication. Network comparisons among species indicate that topologies are altered in addition to gene expression profiles, indicating that changes in transcriptomic coexpression relationships play a role in the developmental architecture of cotton seed development. The global network topology of allopolyploids, especially for domesticated G. hirsutum, resembles the network of the A-genome diploid more than that of the D-genome parent, despite its D-like phenotype in oil content. Expression modifications associated with allopolyploidy include coexpression level dominance and transgressive expression, suggesting that the transcriptomic architecture in polyploids is to some extent a modular combination of that of its progenitor genomes. Among allopolyploids, intermodular relationships are more preserved between two different wild allopolyploid species than they are between wild and domesticated forms of a cultivated cotton, and regulatory connections of oil synthesis-related pathways are denser and more closely clustered in domesticated vs. wild G. hirsutum. These results demonstrate substantial modification of genic coexpression under domestication. 
Our work demonstrates how network inference informs our understanding of the transcriptomic architecture of phenotypic variation associated with temporal scales ranging from

  18. Characterization of Genes for Beef Marbling Based on Applying Gene Coexpression Network

    PubMed Central

    Lim, Dajeong; Kim, Nam-Kuk; Lee, Seung-Hwan; Park, Hye-Sun; Cho, Yong-Min; Chai, Han-Ha; Kim, Heebal

    2014-01-01

    Marbling is an important trait in characterizing beef quality and a major factor in determining the price of beef in the Korean beef market. In particular, marbling is a complex trait and needs a system-level approach for identifying candidate genes related to the trait. To find candidate genes associated with marbling, we used a weighted gene coexpression network analysis from the expression value of bovine genes. Hub genes were identified; they were topologically centered with large degree and betweenness centrality (BC) values in the global network. We performed gene expression analysis to detect candidate genes in M. longissimus with divergent marbling phenotype (marbling scores 2 to 7) using qRT-PCR. The results demonstrate that transmembrane protein 60 (TMEM60) and dihydropyrimidine dehydrogenase (DPYD) are associated with increasing marbling fat. We suggest that the network-based approach in livestock may be an important method for analyzing the complex effects of candidate genes associated with complex traits like marbling or tenderness. PMID:24624372

  19. FastGCN: a GPU accelerated tool for fast gene co-expression networks.

    PubMed

    Liang, Meimei; Zhang, Futao; Jin, Gulei; Zhu, Jun

    2015-01-01

    Gene co-expression networks comprise one type of valuable biological networks. Many methods and tools have been published to construct gene co-expression networks; however, most of these tools and methods are inconvenient and time consuming for large datasets. We have developed a user-friendly, accelerated and optimized tool for constructing gene co-expression networks that can fully harness the parallel nature of GPU (Graphics Processing Unit) architectures. Genetic entropies were exploited to filter out genes with no or small expression changes in the raw data preprocessing step. Pearson correlation coefficients were then calculated. After that, we normalized these coefficients and employed the False Discovery Rate to control for multiple testing. Finally, module identification was conducted to construct the co-expression networks. All of these calculations were implemented on a GPU. We also compressed the coefficient matrix to save space. We compared the performance of the GPU implementation with those of multi-core CPU implementations with 16 CPU threads, single-thread C/C++ implementation and single-thread R implementation. Our results show that GPU implementation largely outperforms single-thread C/C++ implementation and single-thread R implementation, and GPU implementation outperforms multi-core CPU implementation when the number of genes increases. With the test dataset containing 16,000 genes and 590 individuals, we can achieve greater than 63 times the speed using a GPU implementation compared with a single-thread R implementation when 50 percent of genes were filtered out and about 80 times the speed when no genes were filtered out.
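
    The correlation and multiple-testing steps described in this abstract can be illustrated with a small CPU-only sketch. It is not FastGCN's implementation: the entropy filter, the GPU kernels, and module identification are omitted, and the Fisher z-transform (as the "normalization") and the Benjamini-Hochberg procedure (as the FDR control) are assumptions chosen for illustration. The function names and the `expr` dictionary layout (gene name mapped to a list of expression values, more than three samples per gene) are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p_value(r, n):
    """Two-sided p-value for r via the Fisher z-transform (normal approximation)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which tests are rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff_rank = rank  # largest rank passing the step-up criterion
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected


def coexpression_edges(expr, alpha=0.05):
    """Return (gene_a, gene_b, r) for gene pairs significant after FDR control."""
    genes = sorted(expr)
    n = len(expr[genes[0]])
    pairs = list(combinations(genes, 2))
    rs = [pearson(expr[a], expr[b]) for a, b in pairs]
    ps = [fisher_p_value(r, n) for r in rs]
    keep = benjamini_hochberg(ps, alpha)
    return [(a, b, r) for (a, b), r, k in zip(pairs, rs, keep) if k]
```

    Calling `coexpression_edges(expr)` on a dictionary of expression profiles returns only the gene pairs whose correlation survives the FDR cutoff. The all-pairs loop is quadratic in the number of genes, which is the cost the abstract says motivates a GPU implementation.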

  20. FastGCN: A GPU Accelerated Tool for Fast Gene Co-Expression Networks

    PubMed Central

    Liang, Meimei; Zhang, Futao; Jin, Gulei; Zhu, Jun

    2015-01-01

    Gene co-expression networks comprise one type of valuable biological networks. Many methods and tools have been published to construct gene co-expression networks; however, most of these tools and methods are inconvenient and time consuming for large datasets. We have developed a user-friendly, accelerated and optimized tool for constructing gene co-expression networks that can fully harness the parallel nature of GPU (Graphics Processing Unit) architectures. Genetic entropies were exploited to filter out genes with no or small expression changes in the raw data preprocessing step. Pearson correlation coefficients were then calculated. After that, we normalized these coefficients and employed the False Discovery Rate to control for multiple testing. Finally, module identification was conducted to construct the co-expression networks. All of these calculations were implemented on a GPU. We also compressed the coefficient matrix to save space. We compared the performance of the GPU implementation with those of multi-core CPU implementations with 16 CPU threads, single-thread C/C++ implementation and single-thread R implementation. Our results show that GPU implementation largely outperforms single-thread C/C++ implementation and single-thread R implementation, and GPU implementation outperforms multi-core CPU implementation when the number of genes increases. With the test dataset containing 16,000 genes and 590 individuals, we can achieve greater than 63 times the speed using a GPU implementation compared with a single-thread R implementation when 50 percent of genes were filtered out and about 80 times the speed when no genes were filtered out. PMID:25602758

  1. Global Landscape of a Co-Expressed Gene Network in Barley and its Application to Gene Discovery in Triticeae Crops

    PubMed Central

    Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo

    2011-01-01

    Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235

  2. Inter-Tissue Gene Co-Expression Networks between Metabolically Healthy and Unhealthy Obese Individuals.

    PubMed

    Kogelman, Lisette J A; Fu, Jingyuan; Franke, Lude; Greve, Jan Willem; Hofker, Marten; Rensen, Sander S; Kadarmideen, Haja N

    2016-01-01

    Obesity is associated with severe co-morbidities such as type 2 diabetes and nonalcoholic steatohepatitis. However, studies have shown that 10-25 percent of the severely obese individuals are metabolically healthy. To date, the identification of genetic factors underlying the metabolically healthy obese (MHO) state is limited. Systems genetics approaches have led to the identification of genes and pathways in complex diseases. Here, we have used such approaches across tissues to detect genes and pathways involved in obesity-induced disease development. Expression data of 60 severely obese individuals was accessible, of which 28 individuals were MHO and 32 were metabolically unhealthy obese (MUO). A whole genome expression profile of four tissues was available: liver, muscle, subcutaneous adipose tissue and visceral adipose tissue. Using insulin-related genes, we used the weighted gene co-expression network analysis (WGCNA) method to build within- and inter-tissue gene networks. We identified genes that were differentially connected between MHO and MUO individuals, which were further investigated by homing in on the modules they were active in. To identify potentially causal genes, we integrated genomic and transcriptomic data using an eQTL mapping approach. Both IL-6 and IL1B were identified as highly differentially co-expressed genes across tissues between MHO and MUO individuals, showing their potential role in obesity-induced disease development. WGCNA showed that those genes were clustering together within tissues, and further analysis showed different co-expression patterns between MHO and MUO subnetworks. A potential causal role for metabolic differences under similar obesity state was detected for PTPRE, IL-6R and SLC6A5. We used a novel integrative approach by integration of co-expression networks across tissues to elucidate genetic factors related to obesity-induced metabolic disease development. The identified genes and their interactions give more

  3. Inter-Tissue Gene Co-Expression Networks between Metabolically Healthy and Unhealthy Obese Individuals

    PubMed Central

    Kogelman, Lisette J. A.; Fu, Jingyuan; Franke, Lude; Greve, Jan Willem; Hofker, Marten; Rensen, Sander S.; Kadarmideen, Haja N.

    2016-01-01

    Background: Obesity is associated with severe co-morbidities such as type 2 diabetes and nonalcoholic steatohepatitis. However, studies have shown that 10–25 percent of the severely obese individuals are metabolically healthy. To date, the identification of genetic factors underlying the metabolically healthy obese (MHO) state is limited. Systems genetics approaches have led to the identification of genes and pathways in complex diseases. Here, we have used such approaches across tissues to detect genes and pathways involved in obesity-induced disease development. Methods: Expression data of 60 severely obese individuals was accessible, of which 28 individuals were MHO and 32 were metabolically unhealthy obese (MUO). A whole genome expression profile of four tissues was available: liver, muscle, subcutaneous adipose tissue and visceral adipose tissue. Using insulin-related genes, we used the weighted gene co-expression network analysis (WGCNA) method to build within- and inter-tissue gene networks. We identified genes that were differentially connected between MHO and MUO individuals, which were further investigated by homing in on the modules they were active in. To identify potentially causal genes, we integrated genomic and transcriptomic data using an eQTL mapping approach. Results: Both IL-6 and IL1B were identified as highly differentially co-expressed genes across tissues between MHO and MUO individuals, showing their potential role in obesity-induced disease development. WGCNA showed that those genes were clustering together within tissues, and further analysis showed different co-expression patterns between MHO and MUO subnetworks. A potential causal role for metabolic differences under similar obesity state was detected for PTPRE, IL-6R and SLC6A5. Conclusions: We used a novel integrative approach by integration of co-expression networks across tissues to elucidate genetic factors related to obesity-induced metabolic disease development. The identified

  4. Coexpression landscape in ATTED-II: usage of gene list and gene network for various types of pathways.

    PubMed

    Obayashi, Takeshi; Kinoshita, Kengo

    2010-05-01

    Gene coexpression analyses are a powerful method to predict the function of genes and/or to identify genes that are functionally related to query genes. The basic idea of gene coexpression analyses is that genes with similar functions should have similar expression patterns under many different conditions. This approach is now widely used by many experimental researchers, especially in the field of plant biology. In this review, we will summarize recent successful examples obtained by using our gene coexpression database, ATTED-II. Specifically, the examples will describe the identification of new genes, such as the subunits of a complex protein, the enzymes in a metabolic pathway and transporters. In addition, we will discuss the discovery of a new intercellular signaling factor and new regulatory relationships between transcription factors and their target genes. In ATTED-II, we provide two basic views of gene coexpression, a gene list view and a gene network view, which can be used as guide gene approach and narrow-down approach, respectively. In addition, we will discuss the coexpression effectiveness for various types of gene sets.

  5. Discovering missing reactions of metabolic networks by using gene co-expression data

    PubMed Central

    Hosseini, Zhaleh; Marashi, Sayed-Amir

    2017-01-01

    Flux coupling analysis is a computational method which is able to explain co-expression of metabolic genes by analyzing the topological structure of a metabolic network. It has been suggested that if genes in two seemingly fully-coupled reactions are not highly co-expressed, then these two reactions are not fully coupled in reality, and hence, there is a gap or missing reaction in the network. Here, we present GAUGE as a novel approach for gap filling of metabolic networks, which is a two-step algorithm based on a mixed integer linear programming formulation. In GAUGE, the discrepancies between experimental co-expression data and predicted flux coupling relations are minimized by adding a minimum number of reactions to the network. We show that GAUGE is able to predict missing reactions of E. coli metabolism that are not detectable by other popular gap filling approaches. We propose that our algorithm may be used as a complementary strategy for the gap filling problem of metabolic networks. Since GAUGE relies only on gene expression data, it can be potentially useful for exploring missing reactions in the metabolism of non-model organisms, which are often poorly characterized, cannot grow in the laboratory, and lack genetic tools for generating knockouts. PMID:28150713

  6. Discovering missing reactions of metabolic networks by using gene co-expression data

    NASA Astrophysics Data System (ADS)

    Hosseini, Zhaleh; Marashi, Sayed-Amir

    2017-02-01

    Flux coupling analysis is a computational method which is able to explain co-expression of metabolic genes by analyzing the topological structure of a metabolic network. It has been suggested that if genes in two seemingly fully-coupled reactions are not highly co-expressed, then these two reactions are not fully coupled in reality, and hence, there is a gap or missing reaction in the network. Here, we present GAUGE as a novel approach for gap filling of metabolic networks, which is a two-step algorithm based on a mixed integer linear programming formulation. In GAUGE, the discrepancies between experimental co-expression data and predicted flux coupling relations are minimized by adding a minimum number of reactions to the network. We show that GAUGE is able to predict missing reactions of E. coli metabolism that are not detectable by other popular gap filling approaches. We propose that our algorithm may be used as a complementary strategy for the gap filling problem of metabolic networks. Since GAUGE relies only on gene expression data, it can be potentially useful for exploring missing reactions in the metabolism of non-model organisms, which are often poorly characterized, cannot grow in the laboratory, and lack genetic tools for generating knockouts.

  7. Meta-Analysis of Differential Connectivity in Gene Co-Expression Networks in Multiple Sclerosis

    PubMed Central

    Creanza, Teresa Maria; Liguori, Maria; Liuni, Sabino; Nuzziello, Nicoletta; Ancona, Nicola

    2016-01-01

    Differential gene expression analyses to investigate multiple sclerosis (MS) molecular pathogenesis cannot detect genes harboring genetic and/or epigenetic modifications that change the gene functions without affecting their expression. Differential co-expression network approaches may capture changes in functional interactions resulting from these alterations. We re-analyzed 595 mRNA arrays from publicly available datasets by studying changes in gene co-expression networks in MS and in response to interferon (IFN)-β treatment. Interestingly, MS networks show a reduced connectivity relative to the healthy condition, and the treatment activates the transcription of genes and increases their connectivity in MS patients. Importantly, the analysis of changes in gene connectivity in MS patients provides new evidence of association for genes already implicated in MS by single-nucleotide polymorphism studies and that do not show differential expression. This is the case of amiloride-sensitive cation channel 1 neuronal (ACCN1) that shows a reduced number of interacting partners in MS networks, and it is known for its role in synaptic transmission and central nervous system (CNS) development. Furthermore, our study confirms a deregulation of the vitamin D system: among the transcription factors that potentially regulate the deregulated genes, we find TCF3 and SP1 that are both involved in vitamin D3-induced p27Kip1 expression. Unveiling differential network properties allows us to gain systems-level insights into disease mechanisms and may suggest putative targets for the treatment. PMID:27314336

  8. Gene Coexpression Analyses Differentiate Networks Associated with Diverse Cancers Harboring TP53 Missense or Null Mutations

    PubMed Central

    Oros Klein, Kathleen; Oualkacha, Karim; Lafond, Marie-Hélène; Bhatnagar, Sahir; Tonin, Patricia N.; Greenwood, Celia M. T.

    2016-01-01

    In a variety of solid cancers, missense mutations in the well-established TP53 tumor suppressor gene may lead to the presence of a partially-functioning protein molecule, whereas mutations affecting the protein encoding reading frame, often referred to as null mutations, result in the absence of p53 protein. Both types of mutations have been observed in the same cancer type. As the resulting tumor biology may be quite different between these two groups, we used RNA-sequencing data from The Cancer Genome Atlas (TCGA) from four different cancers with poor prognosis, namely ovarian, breast, lung and skin cancers, to compare the patterns of coexpression of genes in tumors grouped according to their TP53 missense or null mutation status. We used Weighted Gene Coexpression Network analysis (WGCNA) and a new test statistic built on differences between groups in the measures of gene connectivity. For each cancer, our analysis identified a set of genes showing differential coexpression patterns between the TP53 missense- and null mutation-carrying groups that was robust to the choice of the tuning parameter in WGCNA. After comparing these sets of genes across the four cancers, one gene (KIR3DL2) consistently showed differential coexpression patterns between the null and missense groups. KIR3DL2 is known to play an important role in regulating the immune response, which is consistent with our observation that this gene's strongly-correlated partners implicated many immune-related pathways. Examining mutation-type-related changes in correlations between sets of genes may provide new insight into tumor biology. PMID:27536319
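
    As a rough illustration of comparing gene connectivity between two groups, the sketch below computes a WGCNA-style weighted connectivity (the sum of absolute correlations raised to a soft-threshold power `beta`) for each gene in two correlation matrices and takes the per-gene difference. It is not the test statistic proposed by the authors, which the abstract does not specify; the function names and the default `beta=6` are assumptions made for this example.

```python
def connectivity(corr, beta=6):
    """Weighted connectivity k_i = sum over j != i of |corr[i][j]| ** beta."""
    n = len(corr)
    return [
        sum(abs(corr[i][j]) ** beta for j in range(n) if j != i)
        for i in range(n)
    ]


def differential_connectivity(corr_a, corr_b, beta=6):
    """Per-gene connectivity in group A minus connectivity in group B."""
    k_a = connectivity(corr_a, beta)
    k_b = connectivity(corr_b, beta)
    return [a - b for a, b in zip(k_a, k_b)]
```

    Here `corr_a` and `corr_b` would be gene-by-gene correlation matrices (as nested lists, with genes in the same order) estimated separately in, for example, the missense and null groups. Repeating the calculation over several values of `beta` is one simple way to check whether a ranking of genes depends on the tuning parameter.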

  9. Co-expression network analysis and genetic algorithms for gene prioritization in preeclampsia.

    PubMed

    Tejera, Eduardo; Bernardes, João; Rebelo, Irene

    2013-11-12

    In this study, we explored gene prioritization in preeclampsia, combining co-expression network analysis and genetic algorithm optimization approaches. We analysed five public projects obtaining 1,146 significant genes after cross-platform and processing of 81 and 149 microarrays in preeclamptic and normal conditions, respectively. After co-expression network construction, modular and node analyses were performed using several approaches. Moreover, genetic algorithms were also applied in combination with the nearest neighbour and discriminant analysis classification methods. Significant differences were found in the gene connectivity distribution, both in normal and preeclampsia conditions, pointing to the need and importance of examining connectivity alongside expression for prioritization. We discuss the global as well as intra-modular connectivity for hub detection and also the utility of genetic algorithms in combination with the network information. FLT1, LEP, INHA and ENG genes were identified in accordance with the literature; however, we also found other genes, such as FLNB, INHBA, NDRG1 and LYN, to be highly significant but underexplored during normal pregnancy or preeclampsia. Weighted gene co-expression network analysis reveals a similar distribution along the modules detected both in normal and preeclampsia conditions. However, major differences were obtained by analysing node connectivity. All models obtained by genetic algorithm procedures were consistent with a correct classification, higher than 90%, restricting to 30 variables in both classification methods applied. Combining the two methods, we identified well-known genes related to preeclampsia and were also led to propose new candidates that are poorly explored or completely unknown in the pathogenesis of preeclampsia, which may have to be validated experimentally.

  10. Chronic ethanol exposure produces time- and brain region-dependent changes in gene coexpression networks.

    PubMed

    Osterndorff-Kahanek, Elizabeth A; Becker, Howard C; Lopez, Marcelo F; Farris, Sean P; Tiwari, Gayatri R; Nunez, Yury O; Harris, R Adron; Mayfield, R Dayne

    2015-01-01

    Repeated ethanol exposure and withdrawal in mice increases voluntary drinking and represents an animal model of physical dependence. We examined time- and brain region-dependent changes in gene coexpression networks in amygdala (AMY), nucleus accumbens (NAC), prefrontal cortex (PFC), and liver after four weekly cycles of chronic intermittent ethanol (CIE) vapor exposure in C57BL/6J mice. Microarrays were used to compare gene expression profiles at 0-, 8-, and 120-hours following the last ethanol exposure. Each brain region exhibited a large number of differentially expressed genes (2,000-3,000) at the 0- and 8-hour time points, but fewer changes were detected at the 120-hour time point (400-600). Within each region, there was little gene overlap across time (~20%). All brain regions were significantly enriched with differentially expressed immune-related genes at the 8-hour time point. Weighted gene correlation network analysis identified modules that were highly enriched with differentially expressed genes at the 0- and 8-hour time points with virtually no enrichment at 120 hours. Modules enriched for both ethanol-responsive and cell-specific genes were identified in each brain region. These results indicate that chronic alcohol exposure causes global 'rewiring' of coexpression systems involving glial and immune signaling as well as neuronal genes.

  11. Gene co-expression networks shed light into diseases of brain iron accumulation

    PubMed Central

    Bettencourt, Conceição; Forabosco, Paola; Wiethoff, Sarah; Heidari, Moones; Johnstone, Daniel M.; Botía, Juan A.; Collingwood, Joanna F.; Hardy, John; Milward, Elizabeth A.; Ryten, Mina; Houlden, Henry

    2016-01-01

    Aberrant brain iron deposition is observed in both common and rare neurodegenerative disorders, including those categorized as Neurodegeneration with Brain Iron Accumulation (NBIA), which are characterized by focal iron accumulation in the basal ganglia. Two NBIA genes are directly involved in iron metabolism, but whether other NBIA-related genes also regulate iron homeostasis in the human brain, and whether aberrant iron deposition contributes to neurodegenerative processes remains largely unknown. This study aims to expand our understanding of these iron overload diseases and identify relationships between known NBIA genes and their main interacting partners by using a systems biology approach. We used whole-transcriptome gene expression data from human brain samples originating from 101 neuropathologically normal individuals (10 brain regions) to generate weighted gene co-expression networks and cluster the 10 known NBIA genes in an unsupervised manner. We investigated NBIA-enriched networks for relevant cell types and pathways, and whether they are disrupted by iron loading in NBIA diseased tissue and in an in vivo mouse model. We identified two basal ganglia gene co-expression modules significantly enriched for NBIA genes, which resemble neuronal and oligodendrocytic signatures. These NBIA gene networks are enriched for iron-related genes, and implicate synapse and lipid metabolism related pathways. Our data also indicates that these networks are disrupted by excessive brain iron loading. We identified multiple cell types in the origin of NBIA disorders. We also found unforeseen links between NBIA networks and iron-related processes, and demonstrate convergent pathways connecting NBIAs and phenotypically overlapping diseases. Our results are of further relevance for these diseases by providing candidates for new causative genes and possible points for therapeutic intervention. PMID:26707700

  12. Effects of threshold on the topology of gene co-expression networks.

    PubMed

    Couto, Cynthia Martins Villar; Comin, César Henrique; Costa, Luciano da Fontoura

    2017-09-26

    Several developments regarding the analysis of gene co-expression profiles using complex network theory have been reported recently. Such approaches usually start with the construction of an unweighted gene co-expression network, therefore requiring the selection of a suitable threshold defining which pairs of vertices will be connected. We aimed at addressing such an important problem by suggesting and comparing five different approaches for threshold selection. Each of the methods considers a respective biologically-motivated criterion for electing a potentially suitable threshold. A set of 21 microarray experiments from different biological groups was used to investigate the effect of applying the five proposed criteria to several biological situations. For each experiment, we used the Pearson correlation coefficient to measure the relationship between each gene pair, and the resulting weight matrices were thresholded considering several values, generating respective adjacency matrices (co-expression networks). Each of the five proposed criteria was then applied in order to select the respective threshold value. The effects of these thresholding approaches on the topology of the resulting networks were compared by using several measurements, and we verified that, depending on the database, the impact on the topological properties can be large. However, a group of databases was verified to be similarly affected by most of the considered criteria. Based on such results, it can be suggested that when the generated networks present similar measurements, the thresholding method can be chosen with greater freedom. If the generated networks are markedly different, the thresholding method that better suits the interests of each specific research study represents a reasonable choice.
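
    The thresholding step described above can be sketched as follows: a correlation (weight) matrix is converted into an unweighted adjacency matrix at a given cutoff, and simple topological measurements are computed so that the effect of different cutoffs can be compared. The five threshold-selection criteria proposed by the authors are not reproduced here; the function names and the use of the absolute correlation are assumptions made for this example.

```python
def threshold_network(corr, threshold):
    """Binary adjacency matrix: connect i and j when |corr[i][j]| >= threshold."""
    n = len(corr)
    return [
        [1 if i != j and abs(corr[i][j]) >= threshold else 0 for j in range(n)]
        for i in range(n)
    ]


def edge_density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return edges / (n * (n - 1) / 2)


def degrees(adj):
    """Number of neighbours of each vertex."""
    return [sum(row) for row in adj]
```

    Sweeping `threshold` over a grid of values and recording `edge_density` and `degrees` for each resulting network gives a simple picture of how strongly the topology depends on the cutoff for a given dataset.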

  13. The Structure of a Gene Co-Expression Network Reveals Biological Functions Underlying eQTLs

    PubMed Central

    Villa-Vialaneix, Nathalie; Liaubet, Laurence; Laurent, Thibault; Cherel, Pierre; Gamot, Adrien; SanCristobal, Magali

    2013-01-01

    What are the commonalities between genes, whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations, and for inferring knowledge about as yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology. PMID:23577081
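
    To show why a network based on partial correlation differs from one based on plain correlation, the sketch below computes a first-order partial correlation between two genes x and y given a single third gene z. This is only the simplest case, used here for illustration; the abstract does not state how the authors estimated partial correlations across all genes, and the function name is hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z.

    Uses r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("partial correlation undefined when |r_xz| or |r_yz| is 1")
    return (r_xy - r_xz * r_yz) / denom
```

    For instance, with `r_xy = 0.48`, `r_xz = 0.6` and `r_yz = 0.8`, the partial correlation is zero up to rounding: the apparent association between x and y is fully accounted for by their shared dependence on z, so no direct edge would be drawn between them.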

  14. The structure of a gene co-expression network reveals biological functions underlying eQTLs.

    PubMed

    Villa-Vialaneix, Nathalie; Liaubet, Laurence; Laurent, Thibault; Cherel, Pierre; Gamot, Adrien; SanCristobal, Magali

    2013-01-01

    What are the commonalities between genes whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations and for inferring knowledge about as yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found to be correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology.

  15. A contribution to the study of plant development evolution based on gene co-expression networks

    PubMed Central

    Romero-Campero, Francisco J.; Lucas-Reina, Eva; Said, Fatima E.; Romero, José M.; Valverde, Federico

    2013-01-01

    Phototrophic eukaryotes are among the most successful organisms on Earth due to their unparalleled efficiency at capturing light energy and fixing carbon dioxide to produce organic molecules. A conserved and efficient network of light-dependent regulatory modules could be at the basis of this success. This regulatory system conferred early advantages to phototrophic eukaryotes that allowed for specialization, complex developmental processes and modern plant characteristics. We have studied light-dependent gene regulatory modules from algae to plants employing integrative-omics approaches based on gene co-expression networks. Our study reveals some remarkably conserved ways in which eukaryotic phototrophs deal with day length and light signaling. Here we describe how a family of Arabidopsis transcription factors involved in photoperiod response has evolved from a single algal gene according to the innovation, amplification and divergence theory of gene evolution by duplication. These modifications of the gene co-expression networks from the ancient unicellular green alga Chlamydomonas reinhardtii to the modern brassica Arabidopsis thaliana may hint at the evolution and specialization of plants and other organisms. PMID:23935602

  16. Identification of Common Regulators of Genes in Co-Expression Networks Affecting Muscle and Meat Properties

    PubMed Central

    Ponsuksili, Siriluck; Siengdee, Puntita; Du, Yang; Trakooljul, Nares; Murani, Eduard; Schwerin, Manfred; Wimmers, Klaus

    2015-01-01

    Understanding the genetic contributions behind skeletal muscle composition and metabolism is of great interest in medicine and agriculture. Attempts to dissect these complex traits combine genome-wide genotyping, expression data analyses and network analyses. Weighted gene co-expression network analysis (WGCNA) groups genes into modules based on patterns of co-expression, which can be linked to phenotypes by correlation analysis of trait values and the module eigengenes, i.e. the first principal component of a given module. Network hub genes and regulators of the genes in the modules are likely to play an important role in the emergence of respective traits. In order to detect common regulators of genes in modules showing association with meat quality traits, we identified eQTL for each of these genes, including the highly connected hub genes. Additionally, the module eigengene values were used for association analyses in order to derive a joint eQTL for the respective module. Thereby major sites of orchestrated regulation of genes within trait-associated modules were detected as hotspots of eQTL of many genes of a module and of its eigengene. These sites likely harbor common regulators of genes in the modules. As an example, we showed the consistent impact of candidate common regulators on the expression of members of respective modules by RNAi knockdown experiments. In fact, Cxcr7 was identified and validated as a regulator of genes in a module, which is involved in the function of defense response in muscle cells. Zfp36l2 was confirmed as a regulator of genes of a module related to cell death or apoptosis pathways. The integration of eQTL in module networks made it possible to interpret the differentially regulated genes from a systems perspective. By integrating genome-wide genomic and transcriptomic data, employing co-expression and eQTL analyses, the study revealed likely regulators that are involved in the fine-tuning and synchronization of genes with trait
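
    The module eigengene mentioned in this abstract, i.e. the first principal component of a module, can be computed in a few lines. The sketch below is not the WGCNA package itself: the function name, the simulated module and the simulated trait are assumptions, and aligning the sign with the module's average expression is a convention chosen here for readability.

```python
import numpy as np

def module_eigengene(module_expr):
    """First principal component of a module (genes x samples matrix).

    Each gene is standardised across samples; the eigengene is the first
    right singular vector, giving one value per sample.
    """
    z = module_expr - module_expr.mean(axis=1, keepdims=True)
    z = z / z.std(axis=1, ddof=1, keepdims=True)
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    eigengene = vt[0]
    # The sign of a principal component is arbitrary; align it with the
    # module's average expression so that it is easier to interpret.
    if np.corrcoef(eigengene, z.mean(axis=0))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene

rng = np.random.default_rng(1)
n_samples = 40
trait = rng.normal(size=n_samples)      # hypothetical trait values
# Hypothetical module: 10 genes that all follow the trait plus noise.
module_expr = trait + 0.5 * rng.normal(size=(10, n_samples))

me = module_eigengene(module_expr)
r = np.corrcoef(me, trait)[0, 1]
print(f"module-trait correlation: {r:.2f}")
```

    The same per-sample eigengene values could then be used as a phenotype in an association analysis, which is how the abstract describes deriving a joint eQTL for a module.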

  17. Uncovering co-expression gene network modules regulating fruit acidity in diverse apples.

    PubMed

    Bai, Yang; Dougherty, Laura; Cheng, Lailiang; Zhong, Gan-Yuan; Xu, Kenong

    2015-08-16

    Acidity is a major contributor to fruit quality. Several organic acids are present in apple fruit, but malic acid is predominant and determines fruit acidity. The trait is largely controlled by the Malic acid (Ma) locus, at which Ma1, which putatively encodes a vacuolar aluminum-activated malate transporter1 (ALMT1)-like protein, is a strong candidate gene. We hypothesize that fruit acidity is governed by a gene network in which Ma1 is a key member. The goal of this study is to identify the gene network and the potential mechanisms through which the network operates. Guided by Ma1, we analyzed the transcriptomes of mature fruit of contrasting acidity from six apple accessions of genotype Ma_ (MaMa or Mama) and four of mama using RNA-seq and identified 1301 fruit acidity associated genes, among which 18 were most significant acidity genes (MSAGs). Network inference using weighted gene co-expression network analysis (WGCNA) revealed five co-expression gene network modules of significant (P < 0.001) correlation with malate. Of these, the Ma1 containing module (Turquoise) of 336 genes showed the highest correlation (0.79). We also identified 12 intramodular hub genes from each of the five modules and 18 enriched gene ontology (GO) terms and MapMan sub-bins, including two GO terms (GO:0015979 and GO:0009765) and two MapMan sub-bins (1.3.4 and 1.1.1.1) related to photosynthesis in module Turquoise. Using Lemon-Tree algorithms, we identified 12 regulator genes of probabilistic scores 35.5-81.0, including MDP0000525602 (an LRR receptor kinase), MDP0000319170 (an IQD2-like CaM binding protein) and MDP0000190273 (an EIN3-like transcription factor) of greater interest for being one of the 18 MSAGs or one of the 12 intramodular hub genes in Turquoise, and/or a regulator to the cluster containing Ma1. The most relevant finding of this study is the identification of the MSAGs, intramodular hub genes, enriched photosynthesis related processes, and regulator genes in a

  18. Tissue and cell-type co-expression networks of transcription factors and wood component genes in Populus trichocarpa.

    PubMed

    Shi, Rui; Wang, Jack P; Lin, Ying-Chung; Li, Quanzi; Sun, Ying-Hsuan; Chen, Hao; Sederoff, Ronald R; Chiang, Vincent L

    2017-05-01

    Co-expression networks based on transcriptomes of Populus trichocarpa major tissues and specific cell types suggest redundant control of cell wall component biosynthetic genes by transcription factors in wood formation. We analyzed the transcriptomes of five tissues (xylem, phloem, shoot, leaf, and root) and two wood forming cell types (fiber and vessel) of Populus trichocarpa to assemble gene co-expression subnetworks associated with wood formation. We identified 165 transcription factors (TFs) that showed xylem-, fiber-, and vessel-specific expression. Of these 165 TFs, 101 co-expressed (correlation coefficient, r > 0.7) with the 45 secondary cell wall cellulose, hemicellulose, and lignin biosynthetic genes. Each cell wall component gene co-expressed on average with 34 TFs, suggesting redundant control of the cell wall component gene expression. Co-expression analysis showed that the 101 TFs and the 45 cell wall component genes each form two distinct groups (groups 1 and 2), based on their co-expression patterns. The group 1 TFs (44 members) are predominantly xylem and fiber specific, and are all highly positively co-expressed with the group 1 cell wall component genes (30 members), suggesting their roles as major wood formation regulators. Group 1 TFs include a lateral organ boundary domain gene (LBD) that has the highest number of positively correlated cell wall component genes (36) and TFs (47). The group 2 TFs have 57 members, including 14 vessel-specific TFs, and are generally less correlated with the cell wall component genes. An exception is a vessel-specific basic helix-loop-helix (bHLH) gene that negatively correlates with 20 cell wall component genes, and may function as a key transcriptional suppressor. The co-expression networks revealed here suggest a well-structured transcriptional homeostasis for cell wall component biosynthesis during wood formation.
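
    Counts such as "each cell wall component gene co-expressed on average with 34 TFs" come from a TF-by-gene correlation table cut at r > 0.7. The sketch below shows that bookkeeping on simulated data; the helper name, the sample count and the expression values are assumptions, and only the 0.7 cutoff is taken from the abstract.

```python
import numpy as np

def cross_correlation(a, b):
    """Pearson r between every row of a and every row of b (rows x samples)."""
    az = (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
    bz = (b - b.mean(axis=1, keepdims=True)) / b.std(axis=1, keepdims=True)
    return az @ bz.T / a.shape[1]

rng = np.random.default_rng(2)
n_samples = 7                       # hypothetical number of profiled samples
driver = rng.normal(size=n_samples)
# Simulated data: 3 TFs and 2 target genes share one expression profile,
# 3 further TFs are unrelated noise.
tfs = np.vstack([driver + 0.1 * rng.normal(size=(3, n_samples)),
                 rng.normal(size=(3, n_samples))])
genes = driver + 0.1 * rng.normal(size=(2, n_samples))

r = cross_correlation(tfs, genes)   # shape: (number of TFs, number of genes)
coexpressed = r > 0.7               # cutoff quoted in the abstract
print("TFs per gene:", coexpressed.sum(axis=0))
print("genes per TF:", coexpressed.sum(axis=1))
```

    With so few samples, unrelated profiles can exceed the cutoff by chance, so counts from a table like this are best read together with the number of samples behind them.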

  19. Construction and Optimization of a Large Gene Coexpression Network in Maize Using RNA-Seq Data.

    PubMed

    Huang, Ji; Vendramin, Stefania; Shi, Lizhen; McGinnis, Karen M

    2017-09-01

    With the emergence of massively parallel sequencing, genomewide expression data production has reached an unprecedented level. This abundance of data has greatly facilitated maize research, but may not be amenable to traditional analysis techniques that were optimized for other data types. Using publicly available data, a gene coexpression network (GCN) can be constructed and used for gene function prediction, candidate gene selection, and improving understanding of regulatory pathways. Several GCN studies have been done in maize (Zea mays), mostly using microarray datasets. To build an optimal GCN from RNA-Seq data of plant materials, parameters for expression data normalization and network inference were evaluated. A comprehensive evaluation of the effect of these two parameters and of a ranked aggregation strategy on network performance was conducted, using libraries from 1266 maize samples. Three normalization methods and 10 inference methods, including six correlation and four mutual information methods, were tested. The three normalization methods had very similar performance. For network inference, correlation methods performed better than mutual information methods for some genes. Increasing sample size also had a positive effect on GCN. Aggregating single networks together resulted in improved performance compared to single networks. © 2017 American Society of Plant Biologists. All Rights Reserved.
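
    One simple way to aggregate networks inferred by different methods is to rank the edges within each network and average the ranks. The sketch below does this for a Pearson and a Spearman network on simulated data. It illustrates rank aggregation in general, not the specific strategy or methods evaluated in the study; all names and data are assumptions.

```python
import numpy as np

def rank_vector(values):
    """Ranks 0..n-1, larger value -> larger rank (ties broken by position)."""
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def pearson_scores(expr):
    return np.abs(np.corrcoef(expr))

def spearman_scores(expr):
    # Spearman correlation is Pearson correlation computed on per-gene ranks.
    ranked = np.vstack([rank_vector(row) for row in expr])
    return np.abs(np.corrcoef(ranked))

def aggregate_networks(score_matrices):
    """Average the per-method edge ranks over the upper triangle."""
    n = score_matrices[0].shape[0]
    iu = np.triu_indices(n, k=1)
    mean_rank = np.mean([rank_vector(m[iu]) for m in score_matrices], axis=0)
    return iu, mean_rank

rng = np.random.default_rng(3)
base = rng.normal(size=30)
# Simulated data: genes 0 and 1 share a profile, genes 2 and 3 are noise.
expr = np.vstack([base + 0.2 * rng.normal(size=30),
                  base + 0.2 * rng.normal(size=30),
                  rng.normal(size=30),
                  rng.normal(size=30)])

(rows, cols), mean_rank = aggregate_networks(
    [pearson_scores(expr), spearman_scores(expr)])
best = int(np.argmax(mean_rank))
top_edge = (int(rows[best]), int(cols[best]))
print("top-ranked edge:", top_edge)
```

    Working on ranks rather than raw scores keeps methods with different score scales, such as correlation and mutual information, comparable before they are combined.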

  20. Frontotemporal dementia: insights into the biological underpinnings of disease through gene co-expression network analysis.

    PubMed

    Ferrari, Raffaele; Forabosco, Paola; Vandrovcova, Jana; Botía, Juan A; Guelfi, Sebastian; Warren, Jason D; Momeni, Parastoo; Weale, Michael E; Ryten, Mina; Hardy, John

    2016-02-24

    In frontotemporal dementia (FTD) there is a critical lack in the understanding of biological and molecular mechanisms involved in disease pathogenesis. The heterogeneous genetic features associated with FTD suggest that multiple disease-mechanisms are likely to contribute to the development of this neurodegenerative condition. We here present a systems biology approach with the aim of i) shedding light on the biological processes potentially implicated in the pathogenesis of FTD and ii) identifying novel potential risk factors for FTD. We performed a gene co-expression network analysis of microarray expression data from 101 individuals without neurodegenerative diseases to explore regional-specific co-expression patterns in the frontal and temporal cortices for 12 genes (MAPT, GRN, CHMP2B, CTSC, HLA-DRA, TMEM106B, C9orf72, VCP, UBQLN2, OPTN, TARDBP and FUS) associated with FTD and we then carried out gene set enrichment and pathway analyses, and investigated known protein-protein interactors (PPIs) of FTD-gene products. Gene co-expression networks revealed that several FTD-genes (such as MAPT and GRN, CTSC and HLA-DRA, TMEM106B, and C9orf72, VCP, UBQLN2 and OPTN) were clustering in modules of relevance in the frontal and temporal cortices. Functional annotation and pathway analyses of such modules indicated enrichment for: i) DNA metabolism, i.e. transcription regulation, DNA protection and chromatin remodelling (MAPT and GRN modules); ii) immune and lysosomal processes (CTSC and HLA-DRA modules); and iii) protein meta/catabolism (C9orf72, VCP, UBQLN2 and OPTN, and TMEM106B modules). PPI analysis supported the results of the functional annotation and pathway analyses. This work further characterizes known FTD-genes and elaborates on their biological relevance to disease: not only do we indicate likely impacted regional-specific biological processes driven by FTD-genes containing modules, but also do we suggest novel potential risk factors among the FTD-genes

  1. Weighted gene co-expression network analysis in identification of endometrial cancer prognosis markers.

    PubMed

    Zhu, Xiao-Lu; Ai, Zhi-Hong; Wang, Juan; Xu, Yan-Li; Teng, Yin-Cheng

    2012-01-01

    Endometrial cancer (EC) is the most common gynecologic malignancy. Identification of potential biomarkers of EC would be helpful for the detection and monitoring of malignancy, improving clinical outcomes. The Weighted Gene Co-expression Network Analysis method was used to identify prognostic markers for EC in this study. Moreover, underlying molecular mechanisms were characterized by KEGG pathway enrichment and transcriptional regulation analyses. Seven gene co-expression modules were obtained, but only the turquoise module was positively related with EC stage. Among the genes in the turquoise module, COL5A2 (collagen, type V, alpha 2) could be regulated by PBX1/2 (pre-B-cell leukemia homeobox 1/2) and HOXB1 (homeobox B1) transcription factors to be involved in the focal adhesion pathway; CENP-E (centromere protein E, 312kDa) by E2F4 (E2F transcription factor 4, p107/p130-binding); MYCN (v-myc myelocytomatosis viral related oncogene, neuroblastoma derived [avian]) by PAX5 (paired box 5); and BCL-2 (B-cell CLL/lymphoma 2) and IGFBP-6 (insulin-like growth factor binding protein 6) by GLI1. They were predicted to be associated with EC progression via Hedgehog signaling and other cancer-related pathways. These data on transcriptional regulation may provide a better understanding of molecular mechanisms and clues to potential therapeutic targets in the treatment of EC.

  2. Gene coexpression network analysis of oil biosynthesis in an interspecific backcross of oil palm.

    PubMed

    Guerin, Chloé; Joët, Thierry; Serret, Julien; Lashermes, Philippe; Vaissayre, Virginie; Agbessi, Mawussé D T; Beulé, Thierry; Severac, Dany; Amblard, Philippe; Tregear, James; Durand-Gasselin, Tristan; Morcillo, Fabienne; Dussert, Stéphane

    2016-09-01

    Global demand for vegetable oils is increasing at a dramatic rate, while our understanding of the regulation of oil biosynthesis in plants remains limited. To gain insights into the mechanisms that govern oil synthesis and fatty acid (FA) composition in the oil palm fruit, we used a multilevel approach combining gene coexpression analysis, quantification of allele-specific expression and joint multivariate analysis of transcriptomic and lipid data, in an interspecific backcross population between the African oil palm, Elaeis guineensis, and the American oil palm, Elaeis oleifera, which display contrasting oil contents and FA compositions. The gene coexpression network produced revealed tight transcriptional coordination of fatty acid synthesis (FAS) in the plastid with sugar sensing, plastidial glycolysis, transient starch storage and carbon recapture pathways. It also revealed a concerted regulation, along with FAS, of both the transfer of nascent FA to the endoplasmic reticulum, where triacylglycerol assembly occurs, and of the production of glycerol-3-phosphate, which provides the backbone of triacylglycerols. Plastid biogenesis and auxin transport were the two other biological processes most tightly connected to FAS in the network. In addition to WRINKLED1, a transcription factor (TF) known to activate FAS genes, two novel TFs, termed NF-YB-1 and ZFP-1, were found at the core of the FAS module. The saturated FA content of palm oil appeared to vary above all in relation to the level of transcripts of the gene coding for β-ketoacyl-acyl carrier protein synthase II. Our findings should facilitate the development of breeding and engineering strategies in this and other oil crops. © 2016 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.

  3. Weighted gene co-expression network analysis identifies specific modules and hub genes related to coronary artery disease.

    PubMed

    Liu, Jing; Jing, Ling; Tu, Xilin

    2016-03-05

    The analysis of the potential molecular targets of coronary artery disease (CAD) is critical for understanding the molecular mechanisms of disease. However, studies of global microarray gene co-expression analysis of CAD remain limited. Microarray data of CAD (GSE23561) were downloaded from Gene Expression Omnibus, including peripheral blood samples from CAD patients (n = 6) and controls (n = 9). The limma package in R was used to identify the differentially expressed genes (DEGs) between CAD and control samples. Using the weighted gene co-expression network analysis (WGCNA) package in R, WGCNA was performed to identify significant modules in the network. Then, functional and pathway enrichment analyses were conducted for genes in the most significant module using DAVID software. Moreover, hub genes in the module were analyzed with the iSubpathwayMiner package in R and the GenCLiP 2.0 tool to identify the significant sub-pathways. In total, 3711 DEGs and 21 modules for them were identified in CAD samples. The most significant module was associated with the pathways of hypertrophic cardiomyopathy and membrane related functions. In addition, the top 30 hub genes with high connectivity in the module were selected, and two genes (G6PD and S100A7) were taken as key molecules via sub-pathway screening and data mining. A module associated with the hypertrophic cardiomyopathy pathway was detected in CAD samples. G6PD and S100A7 were the potential targets in CAD. Our findings might provide novel insight into the underlying molecular mechanism of CAD.

  4. Broad integration of expression maps and co-expression networks compassing novel gene functions in the brain.

    PubMed

    Okamura-Oho, Yuko; Shimokawa, Kazuro; Nishimura, Masaomi; Takemoto, Satoko; Sato, Akira; Furuichi, Teiichi; Yokota, Hideo

    2014-11-10

    Using a recently invented technique for gene expression mapping in the whole-anatomy context, termed transcriptome tomography, we have generated a dataset of 36,000 maps of overall gene expression in the adult-mouse brain. Here, using an informatics approach, we identified a broad co-expression network that follows an inverse power law and is rich in functional interaction and gene-ontology terms. Our framework for the integrated analysis of expression maps and graphs of co-expression networks revealed that groups of combinatorially expressed genes, which regulate cell differentiation during development, were present in the adult brain and each of these groups was associated with a discrete cell type. These groups included non-coding genes of unknown function. We found that these genes specifically linked developmentally conserved groups in the network. A previously unrecognized robust expression pattern covering the whole brain was related to the molecular anatomy of key biological processes occurring in particular areas.
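
    A network is commonly said to follow an inverse power law when the fraction of nodes with degree k falls off as k to the power of minus gamma. Assuming that this is the sense intended in the abstract, which does not say which quantity was fitted, the sketch below estimates gamma from a degree sequence with a straight-line fit in log-log space. The function name and the simulated degrees are assumptions, and this simple fit is only a rough diagnostic.

```python
import numpy as np

def fit_power_law_exponent(degrees):
    """Estimate gamma in P(k) ~ k**(-gamma) by a line fit in log-log space."""
    ks, counts = np.unique(degrees, return_counts=True)
    keep = ks > 0                          # log(0) is undefined
    log_k = np.log(ks[keep])
    log_p = np.log(counts[keep] / counts[keep].sum())
    slope, _ = np.polyfit(log_k, log_p, 1)
    return -slope

rng = np.random.default_rng(4)
true_gamma = 2.5                           # exponent used for the simulation
support = np.arange(1, 21).astype(float)
probs = support ** -true_gamma
probs /= probs.sum()
# Simulated degree sequence drawn from a truncated discrete power law.
degrees = rng.choice(support, size=100_000, p=probs)

estimate = fit_power_law_exponent(degrees)
print(round(float(estimate), 2))
```

    On real networks the tail of the degree distribution is sparse and noisy, so a least-squares fit like this one can be biased and is better treated as a first look than as evidence of scale-free structure.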

  5. ALCOdb: Gene Coexpression Database for Microalgae.

    PubMed

    Aoki, Yuichi; Okamura, Yasunobu; Ohta, Hiroyuki; Kinoshita, Kengo; Obayashi, Takeshi

    2016-01-01

    In the era of energy and food shortage, microalgae have gained much attention as promising sources of biofuels and food ingredients. However, only a small fraction of microalgal genes have been functionally characterized. Here, we have developed the Algae Gene Coexpression database (ALCOdb; http://alcodb.jp), which provides gene coexpression information to survey gene modules for a function of interest. ALCOdb currently supports two model algae: the green alga Chlamydomonas reinhardtii and the red alga Cyanidioschyzon merolae. Users can retrieve coexpression information for genes of interest through three unique data pages: (i) Coexpressed Gene List; (ii) Gene Information; and (iii) Coexpressed Gene Network. In addition to the basal coexpression information, ALCOdb also provides several advanced functionalities such as an expression profile viewer and a differentially expressed gene search tool. Using these user interfaces, we demonstrated that our gene coexpression data have the potential to detect functionally related genes and are useful in extrapolating the biological roles of uncharacterized genes. ALCOdb will facilitate molecular and biochemical studies of microalgal biological phenomena, such as lipid metabolism and organelle development, and promote the evolutionary understanding of plant cellular systems.

  6. ALCOdb: Gene Coexpression Database for Microalgae

    PubMed Central

    Aoki, Yuichi; Okamura, Yasunobu; Ohta, Hiroyuki; Kinoshita, Kengo; Obayashi, Takeshi

    2016-01-01

    In the era of energy and food shortage, microalgae have gained much attention as promising sources of biofuels and food ingredients. However, only a small fraction of microalgal genes have been functionally characterized. Here, we have developed the Algae Gene Coexpression database (ALCOdb; http://alcodb.jp), which provides gene coexpression information to survey gene modules for a function of interest. ALCOdb currently supports two model algae: the green alga Chlamydomonas reinhardtii and the red alga Cyanidioschyzon merolae. Users can retrieve coexpression information for genes of interest through three unique data pages: (i) Coexpressed Gene List; (ii) Gene Information; and (iii) Coexpressed Gene Network. In addition to the basal coexpression information, ALCOdb also provides several advanced functionalities such as an expression profile viewer and a differentially expressed gene search tool. Using these user interfaces, we demonstrated that our gene coexpression data have the potential to detect functionally related genes and are useful in extrapolating the biological roles of uncharacterized genes. ALCOdb will facilitate molecular and biochemical studies of microalgal biological phenomena, such as lipid metabolism and organelle development, and promote the evolutionary understanding of plant cellular systems. PMID:26644461

  7. An expression atlas of human primary cells: inference of gene function from coexpression networks

    PubMed Central

    2013-01-01

    Background: The specialisation of mammalian cells in time and space requires genes associated with specific pathways and functions to be co-ordinately expressed. Here we have combined a large number of publicly available microarray datasets derived from human primary cells and analysed large correlation graphs of these data. Results: Using the network analysis tool BioLayout Express3D we identify robust co-associations of genes expressed in a wide variety of cell lineages. We discuss the biological significance of a number of these associations, in particular the coexpression of key transcription factors with the genes that they are likely to control. Conclusions: We consider the regulation of genes in human primary cells and specifically in the human mononuclear phagocyte system. Of particular note is the fact that these data do not support the identity of putative markers of antigen-presenting dendritic cells, nor classification of M1 and M2 activation states, a current subject of debate within the immunological field. We have provided this data resource on the BioGPS web site (http://biogps.org/dataset/2429/primary-cell-atlas/) and on macrophages.com (http://www.macrophages.com/hu-cell-atlas). PMID:24053356

  8. An expression atlas of human primary cells: inference of gene function from coexpression networks.

    PubMed

    Mabbott, Neil A; Baillie, J Kenneth; Brown, Helen; Freeman, Tom C; Hume, David A

    2013-09-20

    The specialisation of mammalian cells in time and space requires genes associated with specific pathways and functions to be co-ordinately expressed. Here we have combined a large number of publicly available microarray datasets derived from human primary cells and analysed large correlation graphs of these data. Using the network analysis tool BioLayout Express3D we identify robust co-associations of genes expressed in a wide variety of cell lineages. We discuss the biological significance of a number of these associations, in particular the coexpression of key transcription factors with the genes that they are likely to control. We consider the regulation of genes in human primary cells and specifically in the human mononuclear phagocyte system. Of particular note is the fact that these data do not support the identity of putative markers of antigen-presenting dendritic cells, nor classification of M1 and M2 activation states, a current subject of debate within the immunological field. We have provided this data resource on the BioGPS web site (http://biogps.org/dataset/2429/primary-cell-atlas/) and on macrophages.com (http://www.macrophages.com/hu-cell-atlas).

  9. Incorporating Motif Analysis into Gene Co-expression Networks Reveals Novel Modular Expression Pattern and New Signaling Pathways

    PubMed Central

    Ma, Shisong; Shah, Smit; Bohnert, Hans J.; Snyder, Michael; Dinesh-Kumar, Savithramma P.

    2013-01-01

    Understanding of gene regulatory networks requires discovery of expression modules within gene co-expression networks and identification of promoter motifs and corresponding transcription factors that regulate their expression. A commonly used method for this purpose is a top-down approach based on clustering the network into a range of densely connected segments, treating these segments as expression modules, and extracting promoter motifs from these modules. Here, we describe a novel bottom-up approach to identify gene expression modules driven by known cis-regulatory motifs in the gene promoters. For a specific motif, genes in the co-expression network are ranked according to their probability of belonging to an expression module regulated by that motif. The ranking is conducted via motif enrichment or motif position bias analysis. Our results indicate that motif position bias analysis is an effective tool for genome-wide motif analysis. Sub-networks containing the top ranked genes are extracted and analyzed for inherent gene expression modules. This approach identified novel expression modules for the G-box, W-box, site II, and MYB motifs from an Arabidopsis thaliana gene co-expression network based on the graphical Gaussian model. The novel expression modules include those involved in house-keeping functions, primary and secondary metabolism, and abiotic and biotic stress responses. In addition to confirmation of previously described modules, we identified modules that include new signaling pathways. To associate transcription factors that regulate genes in these co-expression modules, we developed a novel reporter system. Using this approach, we evaluated MYB transcription factor-promoter interactions within MYB motif modules. PMID:24098147
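
    Ranking genes by motif enrichment can be illustrated with a hypergeometric test on each gene's network neighbourhood: how surprising is it that this many neighbours carry the motif? The sketch below is a generic version of that idea, not the authors' procedure, and says nothing about their position bias analysis. The gene names, motif set and neighbourhoods are made up.

```python
from math import comb

def hypergeom_enrichment_p(n_genome, n_with_motif, n_drawn, n_overlap):
    """P(X >= n_overlap) when n_drawn genes are taken without replacement
    from n_genome genes, of which n_with_motif carry the motif."""
    total = comb(n_genome, n_drawn)
    upper = min(n_with_motif, n_drawn)
    tail = sum(comb(n_with_motif, k) * comb(n_genome - n_with_motif, n_drawn - k)
               for k in range(n_overlap, upper + 1))
    return tail / total

# Hypothetical toy network: 20 genes, 5 of which carry the motif.
genome = [f"g{i}" for i in range(20)]
motif_genes = {"g0", "g1", "g2", "g3", "g4"}
neighbours = {
    "g0": {"g1", "g2", "g3", "g10"},       # 3 of 4 neighbours carry the motif
    "g10": {"g0", "g11", "g12", "g13"},    # 1 of 4 neighbours carries it
}

scores = {
    gene: hypergeom_enrichment_p(len(genome), len(motif_genes),
                                 len(nbrs), len(nbrs & motif_genes))
    for gene, nbrs in neighbours.items()
}
# Smaller p-value means stronger enrichment, so sort in ascending order.
ranked = sorted(scores.items(), key=lambda item: item[1])
for gene, p in ranked:
    print(gene, f"{p:.4f}")
```

    Sub-networks built from the top of such a ranking are the kind of candidate set the abstract describes extracting and analysing for expression modules.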

  10. Large-scale gene co-expression network as a source of functional annotation for cattle genes.

    PubMed

    Beiki, Hamid; Nejati-Javaremi, Ardeshir; Pakdel, Abbas; Masoudi-Nejad, Ali; Hu, Zhi-Liang; Reecy, James M

    2016-11-02

    Genome sequencing and subsequent gene annotation of genomes have led to the elucidation of many genes, but in vertebrates the actual number of protein-coding genes is very consistent across species (~20,000). Seven years after sequencing the cattle genome, there are still genes that have limited annotation and the function of many genes is still not understood, or partly understood at best. Based on the assumption that genes with similar patterns of expression across a vast array of tissues and experimental conditions are likely to encode proteins with related functions or participate within a given pathway, we constructed a genome-wide Cattle Gene Co-expression Network (CGCN) using 72 microarray datasets that contained a total of 1470 Affymetrix Genechip Bovine Genome Arrays that were retrieved from either NCBI GEO or EBI ArrayExpress. The total of 16,607 probe sets, which represented 11,397 genes, with unique Entrez ID were consolidated into 32 co-expression modules that contained between 29 and 2569 probe sets. All of the identified modules showed strong functional enrichment for gene ontology (GO) terms and Reactome pathways. For example, modules with important biological functions such as response to virus, response to bacteria, energy metabolism, cell signaling and cell cycle have been identified. Moreover, gene co-expression networks using the "guilt-by-association" principle have been used to predict the potential function of 132 genes with no functional annotation. Four unknown hub genes were identified in modules highly enriched for GO terms related to leukocyte activation (LOC509513), RNA processing (LOC100848208), nucleic acid metabolic process (LOC100850151) and organic-acid metabolic process (MGC137211). Such highly connected genes should be investigated more closely as they are likely to have key regulatory roles. We have demonstrated that the CGCN and its corresponding regulons provide rich information for experimental biologists to design experiments

  11. Comparison of low and high dose ionising radiation using topological analysis of gene coexpression networks.

    PubMed

    Ray, Monika; Yunis, Reem; Chen, Xiucui; Rocke, David M

    2012-05-17

    The growing use of imaging procedures in medicine has raised concerns about exposure to low-dose ionising radiation (LDIR). While the disastrous effects of high dose ionising radiation (HDIR) are well documented, the detrimental effects of LDIR are not well understood and have been a topic of much debate. Since little is known about the effects of LDIR, various kinds of wet-lab and computational analyses are required to advance knowledge in this domain. In this paper we carry out an "upside-down pyramid" form of systems biology analysis of microarray data. We characterised the global genomic response following 10 cGy (low dose) and 100 cGy (high dose) doses of X-ray ionising radiation at four time points by analysing the topology of gene coexpression networks. This study includes a rich experimental design and state-of-the-art computational systems biology methods of analysis to study the differences in the transcriptional response of skin cells exposed to low and high doses of radiation. Using this method we found important genes that have been linked to immune response, cell survival and apoptosis. Furthermore, we were also able to identify genes such as BRCA1, ABCA1, TNFRSF1B and MLLT11 that have been associated with various types of cancers. We were also able to detect many genes known to be associated with various medical conditions. Our method of applying network topological differences can aid in identifying the differences among similar (e.g. radiation effect) yet very different biological conditions (e.g. different dose and time) to generate testable hypotheses. This is the first study where a network level analysis was performed across two different radiation doses at various time points, thereby illustrating changes in the cellular response over time.

  12. Comparison of low and high dose ionising radiation using topological analysis of gene coexpression networks

    PubMed Central

    2012-01-01

    Background: The growing use of imaging procedures in medicine has raised concerns about exposure to low-dose ionising radiation (LDIR). While the disastrous effects of high dose ionising radiation (HDIR) are well documented, the detrimental effects of LDIR are not well understood and have been a topic of much debate. Since little is known about the effects of LDIR, various kinds of wet-lab and computational analyses are required to advance knowledge in this domain. In this paper we carry out an “upside-down pyramid” form of systems biology analysis of microarray data. We characterised the global genomic response following 10 cGy (low dose) and 100 cGy (high dose) doses of X-ray ionising radiation at four time points by analysing the topology of gene coexpression networks. This study includes a rich experimental design and state-of-the-art computational systems biology methods of analysis to study the differences in the transcriptional response of skin cells exposed to low and high doses of radiation. Results: Using this method we found important genes that have been linked to immune response, cell survival and apoptosis. Furthermore, we also were able to identify genes such as BRCA1, ABCA1, TNFRSF1B and MLLT11 that have been associated with various types of cancers. We were also able to detect many genes known to be associated with various medical conditions. Conclusions: Our method of applying network topological differences can aid in identifying the differences among similar (e.g., radiation effect) yet very different biological conditions (e.g., different dose and time) to generate testable hypotheses. This is the first study where a network level analysis was performed across two different radiation doses at various time points, thereby illustrating changes in the cellular response over time. PMID:22594378

  13. Gene networks in skeletal muscle following endurance exercise are coexpressed in blood neutrophils and linked with blood inflammation markers.

    PubMed

    Broadbent, James; Sampson, Dayle; Sabapathy, Surendran; Haseler, Luke J; Wagner, Karl-Heinz; Bulmer, Andrew C; Peake, Jonathan M; Neubauer, Oliver

    2017-04-01

    It remains incompletely understood whether there is an association between the transcriptome profiles of skeletal muscle and blood leukocytes in response to exercise or other physiological stressors. We have previously analyzed the changes in the muscle and blood neutrophil transcriptome in eight trained men before and 3, 48, and 96 h after 2 h cycling and running. Because we collected muscle and blood in the same individuals and under the same conditions, we were able to directly compare gene expression between the muscle and blood neutrophils. Applying weighted gene coexpression network analysis (WGCNA) as an advanced network-driven method to these original data sets enabled us to compare the muscle and neutrophil transcriptomes in a rigorous and systematic manner. Two gene networks were identified that were preserved between skeletal muscle and blood neutrophils, functionally related to mitochondria and posttranslational processes. Strong preservation measures (Zsummary > 10) for both muscle-neutrophil gene networks were evident within the postexercise recovery period. Muscle and neutrophil gene coexpression was strongly correlated in the mitochondria-related network (r = 0.97; P = 3.17E-2). We also identified multiple correlations between muscular gene subnetworks and exercise-induced changes in blood leukocyte counts, inflammation, and muscle damage markers. These data reveal previously unidentified gene coexpression between skeletal muscle and blood neutrophils following exercise, showing the value of WGCNA to understand exercise physiology. Furthermore, these findings provide preliminary evidence in support of the notion that blood neutrophil gene networks may potentially help us to track physiological and pathophysiological changes in the muscle. NEW & NOTEWORTHY By using weighted gene coexpression network analysis, an advanced bioinformatics method, we have identified previously unknown, functional gene networks that are preserved between skeletal muscle

  14. Identification of candidate genes in Populus cell wall biosynthesis using text-mining, co-expression network and comparative genomics

    SciTech Connect

    Yang, Xiaohan; Ye, Chuyu; Bisaria, Anjali; Tuskan, Gerald A; Kalluri, Udaya C

    2011-01-01

    Populus is an important bioenergy crop for bioethanol production. A greater understanding of cell wall biosynthesis processes is critical in reducing biomass recalcitrance, a major hindrance in efficient generation of ethanol from lignocellulosic biomass. Here, we report the identification of candidate cell wall biosynthesis genes through the development and application of a novel bioinformatics pipeline. As a first step, via text-mining of PubMed publications, we obtained 121 Arabidopsis genes that had experimental evidence supporting their involvement in cell wall biosynthesis or remodeling. The 121 genes were then used as bait genes to query an Arabidopsis co-expression database and additional genes were identified as neighbors of the bait genes in the network, increasing the number of genes to 548. The 548 Arabidopsis genes were then used to re-query the Arabidopsis co-expression database and re-construct a network that captured additional network neighbors, expanding to a total of 694 genes. The 694 Arabidopsis genes were computationally divided into 22 clusters. Queries of the Populus genome using the Arabidopsis genes revealed 817 Populus orthologs. Functional analysis of gene ontology and tissue-specific gene expression indicated that these Arabidopsis and Populus genes are high-likelihood candidates for functional genomics in relation to cell wall biosynthesis.
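    The two rounds of bait-gene expansion described in this abstract can be illustrated with a minimal sketch. It assumes the co-expression database is available as a plain mapping from each gene to its set of network neighbors; the function name `expand_with_neighbors` and the toy gene labels are hypothetical and are not taken from the authors' pipeline.

```python
def expand_with_neighbors(network, bait_genes):
    """Return the bait genes plus every gene directly linked to one of them."""
    expanded = set(bait_genes)
    for gene in bait_genes:
        # Genes missing from the network simply contribute no neighbors.
        expanded.update(network.get(gene, ()))
    return expanded


# Hypothetical toy network: gene -> set of co-expressed neighbors.
toy_network = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}

# Round 1: query with the bait set; round 2: re-query with the expanded set.
first_round = expand_with_neighbors(toy_network, {"A"})
second_round = expand_with_neighbors(toy_network, first_round)
```

    Each round adds only direct neighbors, so the gene list grows outward one network step at a time, in the same spirit as the 121 to 548 to 694 progression reported above.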

  15. A co-expression gene network associated with developmental regulation of apple fruit acidity.

    PubMed

    Bai, Yang; Dougherty, Laura; Cheng, Lailiang; Xu, Kenong

    2015-08-01

    Apple fruit acidity, which affects the fruit's overall taste and flavor to a large extent, is primarily determined by the concentration of malic acid. Previous studies demonstrated that the major QTL malic acid (Ma) on chromosome 16 is largely responsible for fruit acidity variations in apple. Recent advances suggested that a natural mutation that gives rise to a premature stop codon in one of the two aluminum-activated malate transporter (ALMT)-like genes (called Ma1) is the genetic causal element underlying Ma. However, the natural mutation does not explain the developmental changes of fruit malate levels in a given genotype. Using RNA-seq data from the fruit of 'Golden Delicious' taken at 14 developmental stages from 1 week after full-bloom (WAF01) to harvest (WAF20), we characterized their transcriptomes in groups of high (12.2 ± 1.6 mg/g fw, WAF03-WAF08), mid (7.4 ± 0.5 mg/g fw, WAF01-WAF02 and WAF10-WAF14) and low (5.4 ± 0.4 mg/g fw, WAF16-WAF20) malate concentrations. Detailed analyses showed that a set of 3,066 genes (including Ma1) were expressed not only differentially (P FDR < 0.05) between the high and low malate groups (or between the early and late developmental stages) but also in significant (P < 0.05) correlation with malate concentrations. The 3,066 genes fell in 648 MapMan (sub-)bins or functional classes, and 19 of them were significantly (P FDR < 0.05) co-enriched or co-suppressed in a malate-dependent manner. Network inference using the 363 genes encompassed in the 19 (sub-)bins identified a major co-expression network of 239 genes. Since the 239 genes were also differentially expressed between the early (WAF03-WAF08) and late (WAF16-WAF20) developmental stages, the major network was considered to be associated with developmental regulation of apple fruit acidity in 'Golden Delicious'.

  16. Statistical Approaches for Gene Selection, Hub Gene Identification and Module Interaction in Gene Co-Expression Network Analysis: An Application to Aluminum Stress in Soybean (Glycine max L.)

    PubMed Central

    Das, Samarendra; Meher, Prabina Kumar; Bhar, Lal Mohan; Mandal, Baidya Nath

    2017-01-01

    Selection of informative genes is an important problem in gene expression studies. The small sample size and the large number of genes in gene expression data make the selection process complex. Further, the selected informative genes may act as a vital input for gene co-expression network analysis. Moreover, the identification of hub genes and module interactions in gene co-expression networks is yet to be fully explored. This paper presents a statistically sound gene selection technique based on the support vector machine algorithm for selecting informative genes from high dimensional gene expression data. Also, an attempt has been made to develop a statistical approach for identification of hub genes in the gene co-expression network. Besides, a differential hub gene analysis approach has also been developed to group the identified hub genes into various groups based on their gene connectivity in a case vs. control study. Based on this proposed approach, an R package, i.e., dhga (https://cran.r-project.org/web/packages/dhga) has been developed. The comparative performance of the proposed gene selection technique as well as hub gene identification approach was evaluated on three different crop microarray datasets. The proposed gene selection technique outperformed most of the existing techniques for selecting a robust set of informative genes. Based on the proposed hub gene identification approach, fewer hub genes were identified than with the existing approach, which is in accordance with the principle of the scale-free property of real networks. In this study, some key genes along with their Arabidopsis orthologs have been reported, which can be used for Aluminum toxic stress response engineering in soybean. The functional analysis of various selected key genes revealed the underlying molecular mechanisms of Aluminum toxic stress response in soybean. PMID:28056073

  17. Statistical Approaches for Gene Selection, Hub Gene Identification and Module Interaction in Gene Co-Expression Network Analysis: An Application to Aluminum Stress in Soybean (Glycine max L.).

    PubMed

    Das, Samarendra; Meher, Prabina Kumar; Rai, Anil; Bhar, Lal Mohan; Mandal, Baidya Nath

    2017-01-01

    Selection of informative genes is an important problem in gene expression studies. The small sample size and the large number of genes in gene expression data make the selection process complex. Further, the selected informative genes may act as a vital input for gene co-expression network analysis. Moreover, the identification of hub genes and module interactions in gene co-expression networks is yet to be fully explored. This paper presents a statistically sound gene selection technique based on the support vector machine algorithm for selecting informative genes from high dimensional gene expression data. Also, an attempt has been made to develop a statistical approach for identification of hub genes in the gene co-expression network. Besides, a differential hub gene analysis approach has also been developed to group the identified hub genes into various groups based on their gene connectivity in a case vs. control study. Based on this proposed approach, an R package, i.e., dhga (https://cran.r-project.org/web/packages/dhga) has been developed. The comparative performance of the proposed gene selection technique as well as hub gene identification approach was evaluated on three different crop microarray datasets. The proposed gene selection technique outperformed most of the existing techniques for selecting a robust set of informative genes. Based on the proposed hub gene identification approach, fewer hub genes were identified than with the existing approach, which is in accordance with the principle of the scale-free property of real networks. In this study, some key genes along with their Arabidopsis orthologs have been reported, which can be used for Aluminum toxic stress response engineering in soybean. The functional analysis of various selected key genes revealed the underlying molecular mechanisms of Aluminum toxic stress response in soybean.

  18. GeNET: a web application to explore and share Gene Co-expression Network Analysis data.

    PubMed

    Desai, Amit P; Razeghin, Mehdi; Meruvia-Pastor, Oscar; Peña-Castillo, Lourdes

    2017-01-01

    Gene Co-expression Network Analysis (GCNA) is a popular approach to analyze a collection of gene expression profiles. GCNA yields an assignment of genes to gene co-expression modules, a list of gene sets statistically over-represented in these modules, and a gene-to-gene network. There are several computer programs for gene-to-gene network visualization, but these programs have limitations in terms of integrating all the data generated by a GCNA and making these data available online. To facilitate sharing and study of GCNA data, we developed GeNET. For researchers interested in sharing their GCNA data, GeNET provides a convenient interface to upload their data and automatically make it accessible to the public through an online server. For researchers interested in exploring GCNA data published by others, GeNET provides an intuitive online tool to interactively explore GCNA data by genes, gene sets or modules. In addition, GeNET allows users to download all or part of the published data for further computational analysis. To demonstrate the applicability of GeNET, we imported three published GCNA datasets, the largest of which consists of roughly 17,000 genes and 200 conditions. GeNET is available at bengi.cs.mun.ca/genet.

  19. Gene co-expression network analysis in Rhodobacter capsulatus and application to comparative expression analysis of Rhodobacter sphaeroides

    SciTech Connect

    Pena-Castillo, Lourdes; Mercer, Ryan; Gurinovich, Anastasia; Callister, Stephen J.; Wright, Aaron T.; Westbye, Alexander; Beatty, J. T.; Lang, Andrew S.

    2014-08-28

    The genus Rhodobacter contains purple nonsulfur bacteria found mostly in freshwater environments. Representative strains of two Rhodobacter species, R. capsulatus and R. sphaeroides, have had their genomes fully sequenced and both have been the subject of transcriptional profiling studies. Gene co-expression networks can be used to identify modules of genes with similar expression profiles. Functional analysis of gene modules can then associate co-expressed genes with biological pathways, and network statistics can determine the degree of module preservation in related networks. In this paper, we constructed an R. capsulatus gene co-expression network, performed functional analysis of identified gene modules, and investigated preservation of these modules in R. capsulatus proteomics data and in R. sphaeroides transcriptomics data. Results: The analysis identified 40 gene co-expression modules in R. capsulatus. Investigation of the module gene contents and expression profiles revealed patterns that were validated based on previous studies supporting the biological relevance of these modules. We identified two R. capsulatus gene modules preserved in the protein abundance data. We also identified several gene modules preserved between both Rhodobacter species, which indicate that these cellular processes are conserved between the species and are candidates for functional information transfer between species. Many gene modules were non-preserved, providing insight into processes that differentiate the two species. In addition, using Local Network Similarity (LNS), a recently proposed metric for expression divergence, we assessed the expression conservation of between-species pairs of orthologs, and within-species gene-protein expression profiles. Conclusions: Our analyses provide new sources of information for functional annotation in R. capsulatus because uncharacterized genes in modules are now connected with groups of genes that constitute a joint functional

  20. Gene Co-Expression Network Analysis for Identifying Modules and Functionally Enriched Pathways in Type 1 Diabetes

    PubMed Central

    Riquelme Medina, Ignacio; Lubovac-Pilav, Zelmina

    2016-01-01

    Type 1 diabetes (T1D) is a complex disease, caused by the autoimmune destruction of the insulin producing pancreatic beta cells, resulting in the body’s inability to produce insulin. While great efforts have been put into understanding the genetic and environmental factors that contribute to the etiology of the disease, the exact molecular mechanisms are still largely unknown. T1D is a heterogeneous disease, and previous research in this field is mainly focused on the analysis of single genes, or using traditional gene expression profiling, which generally does not reveal the functional context of a gene associated with a complex disorder. However, network-based analysis does take into account the interactions between the diabetes specific genes or proteins and contributes to new knowledge about disease modules, which in turn can be used for identification of potential new biomarkers for T1D. In this study, we analyzed public microarray data of T1D patients and healthy controls by applying a systems biology approach that combines network-based Weighted Gene Co-Expression Network Analysis (WGCNA) with functional enrichment analysis. Novel co-expression gene network modules associated with T1D were elucidated, which in turn provided a basis for the identification of potential pathways and biomarker genes that may be involved in development of T1D. PMID:27257970
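    For readers unfamiliar with the weighted networks that WGCNA builds, the sketch below shows the unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, together with the whole-network connectivity obtained by summing each gene's adjacencies. It is an illustration in plain Python and not the WGCNA R package; the value beta = 6, the gene labels and the expression values are assumptions chosen only for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            weight = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = weight
            adj[h][g] = weight
    return adj


def connectivity(adj):
    """Sum of adjacency weights per gene (higher means more hub-like)."""
    return {g: sum(neighbors.values()) for g, neighbors in adj.items()}


# Hypothetical expression profiles across four samples.
toy_expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],   # perfectly correlated with g1
    "g3": [4, 3, 2, 1],   # perfectly anti-correlated with g1
    "g4": [1, 3, 2, 4],   # correlation 0.8 with g1
}
toy_adj = soft_threshold_adjacency(toy_expr)
toy_k = connectivity(toy_adj)
```

    Raising the correlation to a power keeps strong relationships near 1 while pushing moderate ones toward 0 (here 0.8 becomes roughly 0.26), which is why weighted networks do not need a hard cutoff.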

  1. Identification of hub genes of pneumocyte senescence induced by thoracic irradiation using weighted gene co-expression network analysis

    PubMed Central

    XING, YONGHUA; ZHANG, JUNLING; LU, LU; LI, DEGUAN; WANG, YUEYING; HUANG, SONG; LI, CHENGCHENG; ZHANG, ZHUBO; LI, JIANGUO; MENG, AIMIN

    2016-01-01

    Irradiation commonly causes pneumocyte senescence, which may lead to severe fatal lung injury characterized by pulmonary dysfunction and respiratory failure. However, the molecular mechanism underlying the induction of pneumocyte senescence by irradiation remains to be elucidated. In the present study, weighted gene co-expression network analysis (WGCNA) was used to screen for differentially expressed genes, and to identify the hub genes and gene modules, which may be critical for senescence. A total of 2,916 differentially expressed genes were identified between the senescence and non-senescence groups following thoracic irradiation. In total, 10 gene modules associated with cell senescence were detected, and six hub genes were identified, including B-cell scaffold protein with ankyrin repeats 1, translocase of outer mitochondrial membrane 70 homolog A, actin filament-associated protein 1, Cd84, Nuf2 and nuclear factor erythroid 2. These genes were markedly associated with cell proliferation, cell division and cell cycle arrest. The results of the present study demonstrated that WGCNA of microarray data may provide further insight into the molecular mechanism underlying pneumocyte senescence. PMID:26572216

  2. A combination of gene expression ranking and co-expression network analysis increases discovery rate in large-scale mutant screens for novel Arabidopsis thaliana abiotic stress genes.

    PubMed

    Ransbotyn, Vanessa; Yeger-Lotem, Esti; Basha, Omer; Acuna, Tania; Verduyn, Christoph; Gordon, Michal; Chalifa-Caspi, Vered; Hannah, Matthew A; Barak, Simon

    2015-05-01

    As challenges to food security increase, the demand for lead genes for improving crop production is growing. However, genetic screens of plant mutants typically yield very low frequencies of desired phenotypes. Here, we present a powerful computational approach for selecting candidate genes for screening insertion mutants. We combined ranking of Arabidopsis thaliana regulatory genes according to their expression in response to multiple abiotic stresses (Multiple Stress [MST] score), with stress-responsive RNA co-expression network analysis to select candidate multiple stress regulatory (MSTR) genes. Screening of 62 T-DNA insertion mutants defective in candidate MSTR genes for abiotic stress germination phenotypes yielded a remarkable hit rate of up to 62%; this gene discovery rate is 48-fold greater than that of other large-scale insertional mutant screens. Moreover, the MST score of these genes could be used to prioritize them for screening. To evaluate the contribution of the co-expression analysis, we screened 64 additional mutant lines of MST-scored genes that did not appear in the RNA co-expression network. The screening of these MST-scored genes yielded a gene discovery rate of 36%, which is much higher than that of classic mutant screens but not as high as when picking candidate genes from the co-expression network. The MSTR co-expression network that we created, AraSTressRegNet, is publicly available at http://netbio.bgu.ac.il/arnet. This systems biology-based screening approach combining gene ranking and network analysis could be generally applicable to enhancing identification of genes regulating additional processes in plants and other organisms provided that suitable transcriptome data are available.

  3. Broad Integration of Expression Maps and Co-Expression Networks Compassing Novel Gene Functions in the Brain

    PubMed Central

    Okamura-Oho, Yuko; Shimokawa, Kazuro; Nishimura, Masaomi; Takemoto, Satoko; Sato, Akira; Furuichi, Teiichi; Yokota, Hideo

    2014-01-01

    Using a recently invented technique for gene expression mapping in the whole-anatomy context, termed transcriptome tomography, we have generated a dataset of 36,000 maps of overall gene expression in the adult-mouse brain. Here, using an informatics approach, we identified a broad co-expression network that follows an inverse power law and is rich in functional interaction and gene-ontology terms. Our framework for the integrated analysis of expression maps and graphs of co-expression networks revealed that groups of combinatorially expressed genes, which regulate cell differentiation during development, were present in the adult brain and each of these groups was associated with a discrete cell type. These groups included non-coding genes of unknown function. We found that these genes specifically linked developmentally conserved groups in the network. A previously unrecognized robust expression pattern covering the whole brain was related to the molecular anatomy of key biological processes occurring in particular areas. PMID:25382412

  4. Gene Coexpression Networks in Human Brain Developmental Transcriptomes Implicate the Association of Long Noncoding RNAs with Intellectual Disability

    PubMed Central

    Gudenas, Brian L.; Wang, Liangjiang

    2015-01-01

    The advent of next-generation sequencing for genetic diagnoses of complex developmental disorders, such as intellectual disability (ID), has facilitated the identification of hundreds of predisposing genetic variants. However, there still exists a vast gap in our knowledge of causal genetic factors for ID as evidenced by low diagnostic yield of genetic screening, in which identifiable genetic causes are not found for the majority of ID cases. Most methods of genetic screening focus on protein-coding genes; however, noncoding RNAs may outnumber protein-coding genes and play important roles in brain development. Long noncoding RNAs (lncRNAs) specifically have been shown to be enriched in the brain and have diverse roles in gene regulation at the transcriptional and posttranscriptional levels. LncRNAs are a vastly uncharacterized group of noncoding genes, which could function in brain development and harbor ID-predisposing genetic variants. We analyzed lncRNAs for coexpression with known ID genes and affected biological pathways within a weighted gene coexpression network derived from RNA-sequencing data spanning human brain development. Several ID-associated gene modules were found to be enriched for lncRNAs, known ID genes, and affected biological pathways. Utilizing a list of de novo and pathogenic copy number variants detected in ID probands, we identified lncRNAs overlapping these genetic structural variants. By integrating our results, we have made a prioritized list of potential ID-associated lncRNAs based on the developing brain gene coexpression network and genetic structural variants found in ID probands. PMID:26523118

  5. Identification of rice genes associated with cosmic-ray response via co-expression gene network analysis.

    PubMed

    Hwang, Sun-Goo; Kim, Dong Sub; Hwang, Jung Eun; Han, A-Reum; Jang, Cheol Seong

    2014-05-15

    In order to better understand the biological systems that are affected in response to cosmic ray (CR), we conducted weighted gene co-expression network analysis using the module detection method. By using the Pearson's correlation coefficient (PCC) value, we evaluated complex gene-gene functional interactions between 680 CR-responsive probes from integrated microarray data sets, which included large-scale transcriptional profiling of 1000 microarray samples. These probes were divided into 6 distinct modules that contained 20 enriched gene ontology (GO) functions, such as oxidoreductase activity, hydrolase activity, and response to stimulus and stress. In particular, modules 1 and 2 commonly showed enriched annotation categories such as oxidoreductase activity, including enriched cis-regulatory elements known as ROS-specific regulators. These results suggest that the ROS-mediated irradiation response pathway is affected by CR in modules 1 and 2. We found 243 ionizing radiation (IR)-responsive probes that exhibited similarities in expression patterns in various irradiation microarray data sets. The expression patterns of 6 randomly selected IR-responsive genes were evaluated by quantitative reverse transcription polymerase chain reaction following treatment with CR, gamma rays (GR), and ion beam (IB); similar patterns were observed among these genes under these 3 treatments. Moreover, we constructed subnetworks of IR-responsive genes and evaluated the expression levels of their neighboring genes following GR treatment; similar patterns were observed among them. These results of network-based analyses might provide a clue to understanding the complex biological system related to the CR response in plants. Copyright © 2014 Elsevier B.V. All rights reserved.
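    As a rough illustration of how Pearson correlation values can turn probe profiles into a network and then into groups, the sketch below links probes whose absolute PCC meets a cutoff and reports connected components as crude "modules". This is a simplified stand-in and not the module detection method used in the study; the cutoff of 0.9, the probe labels and the profiles are assumptions made for the example.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient (assumes non-constant inputs)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(expr, cutoff=0.9):
    """Probe pairs whose absolute PCC is at or above the cutoff."""
    return [
        (g, h)
        for g, h in combinations(sorted(expr), 2)
        if abs(pearson(expr[g], expr[h])) >= cutoff
    ]


def connected_components(nodes, edges):
    """Group nodes that are reachable from one another through the edges."""
    neighbors = {n: set() for n in nodes}
    for g, h in edges:
        neighbors[g].add(h)
        neighbors[h].add(g)
    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical probe profiles across five samples.
toy_probes = {
    "p1": [1, 2, 3, 4, 5],
    "p2": [2, 4, 6, 8, 10],   # tracks p1
    "p3": [5, 1, 4, 2, 3],
    "p4": [10, 2, 8, 4, 6],   # tracks p3
}
toy_edges = coexpression_edges(toy_probes)
toy_modules = connected_components(toy_probes, toy_edges)
```

    A hard cutoff is the simplest choice; weighted approaches instead keep every pair and down-weight the weak correlations.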

  6. Assessing the utility of gene co-expression stability in combination with correlation in the analysis of protein-protein interaction networks

    PubMed Central

    2011-01-01

    Background: Gene co-expression, in the form of a correlation coefficient, has been valuable in the analysis, classification and prediction of protein-protein interactions. However, it is susceptible to bias from a few samples having a large effect on the correlation coefficient. Gene co-expression stability is a means of quantifying this bias, with high stability indicating robust, unbiased co-expression correlation coefficients. We assess the utility of gene co-expression stability as an additional measure to support the co-expression correlation in the analysis of protein-protein interaction networks. Results: We studied the patterns of co-expression correlation and stability in interacting proteins with respect to their interaction promiscuity, levels of intrinsic disorder, and essentiality or disease-relatedness. Co-expression stability, along with co-expression correlation, acts as a better classifier of hub proteins in interaction networks than co-expression correlation alone, enabling the identification of a class of hubs that are functionally distinct from the widely accepted transient (date) and obligate (party) hubs. Proteins with high levels of intrinsic disorder have low co-expression correlation and high stability with their interaction partners, suggesting their involvement in transient interactions, except for a small group that have high co-expression correlation and are typically subunits of stable complexes. Similar behavior was seen for disease-related and essential genes. Interacting proteins that are both disordered have higher co-expression stability than ordered protein pairs. Using co-expression correlation and stability, we found that transient interactions are more likely to occur between an ordered and a disordered protein while obligate interactions primarily occur between proteins that are either both ordered, or disordered. Conclusions: We observe that co-expression stability shows distinct patterns in structurally and functionally

  7. SeqEnrich: A tool to predict transcription factor networks from co-expressed Arabidopsis and Brassica napus gene sets.

    PubMed

    Becker, Michael G; Walker, Philip L; Pulgar-Vidal, Nadège C; Belmonte, Mark F

    2017-01-01

    Transcription factors and their associated DNA binding sites are key regulatory elements of cellular differentiation, development, and environmental response. New tools that predict transcriptional regulation of biological processes are valuable to researchers studying both model and emerging-model plant systems. SeqEnrich predicts transcription factor networks from co-expressed Arabidopsis or Brassica napus gene sets. The networks produced by SeqEnrich are supported by existing literature and predicted transcription factor-DNA interactions that can be functionally validated at the laboratory bench. The program functions with gene sets of varying sizes and derived from diverse tissues and environmental treatments. SeqEnrich presents as a powerful predictive framework for the analysis of Arabidopsis and Brassica napus co-expression data, and is designed so that researchers at all levels can easily access and interpret predicted transcriptional circuits. The program outperformed its ancestral program ChipEnrich, and produced detailed transcription factor networks from Arabidopsis and Brassica napus gene expression data. The SeqEnrich program is ideal for generating new hypotheses and distilling biological information from large-scale expression data.

  8. A network approach of gene co-expression in the zea mays/Aspergillus flavus pathosystem to map host/pathogen interaction pathways

    USDA-ARS's Scientific Manuscript database

    A gene co-expression network was generated using a dual RNA-seq study with the fungal pathogen A. flavus and its plant host Z. mays during the initial 3 days of infection. The analysis deciphered novel pathways and mapped genes of interest in both organisms during the infection. This network reveal...

  9. Gene coexpression networks in human brain identify epigenetic modifications in alcohol dependence.

    PubMed

    Ponomarev, Igor; Wang, Shi; Zhang, Lingling; Harris, R Adron; Mayfield, R Dayne

    2012-02-01

    Alcohol abuse causes widespread changes in gene expression in human brain, some of which contribute to alcohol dependence. Previous microarray studies identified individual genes as candidates for alcohol phenotypes, but efforts to generate an integrated view of molecular and cellular changes underlying alcohol addiction are lacking. Here, we applied a novel systems approach to transcriptome profiling in postmortem human brains and generated a systemic view of brain alterations associated with alcohol abuse. We identified critical cellular components and previously unrecognized epigenetic determinants of gene coexpression relationships and discovered novel markers of chromatin modifications in alcoholic brain. Higher expression levels of endogenous retroviruses and genes with high GC content in alcoholics were associated with DNA hypomethylation and increased histone H3K4 trimethylation, suggesting a critical role of epigenetic mechanisms in alcohol addiction. Analysis of cell-type-specific transcriptomes revealed remarkable consistency between molecular profiles and cellular abnormalities in alcoholic brain. Based on evidence from this study and others, we generated a systems hypothesis for the central role of chromatin modifications in alcohol dependence that integrates epigenetic regulation of gene expression with pathophysiological and neuroadaptive changes in alcoholic brain. Our results offer implications for epigenetic therapeutics in alcohol and drug addiction.

  10. Identifying gene coexpression networks underlying the dynamic regulation of wood-forming tissues in Populus under diverse environmental conditions.

    PubMed

    Zinkgraf, Matthew; Liu, Lijun; Groover, Andrew; Filkov, Vladimir

    2017-03-01

    Trees modify wood formation through integration of environmental and developmental signals in complex but poorly defined transcriptional networks, allowing trees to produce woody tissues appropriate to diverse environmental conditions. In order to identify relationships among genes expressed during wood formation, we integrated data from new and publicly available datasets in Populus. These datasets were generated from woody tissue and include transcriptome profiling, transcription factor binding, DNA accessibility and genome-wide association mapping experiments. Coexpression modules were calculated, each of which contains genes showing similar expression patterns across experimental conditions, genotypes and treatments. Conserved gene coexpression modules (four modules totaling 8398 genes) were identified that were highly preserved across diverse environmental conditions and genetic backgrounds. Functional annotations as well as correlations with specific experimental treatments associated individual conserved modules with distinct biological processes underlying wood formation, such as cell-wall biosynthesis, meristem development and epigenetic pathways. Module genes were also enriched for DNase I hypersensitivity footprints and binding from four transcription factors associated with wood formation. The conserved modules are excellent candidates for modeling core developmental pathways common to wood formation in diverse environments and genotypes, and serve as testbeds for hypothesis generation and testing for future studies.

  11. Studying the complex expression dependences between sets of coexpressed genes.

    PubMed

    Huerta, Mario; Casanova, Oriol; Barchino, Roberto; Flores, Jose; Querol, Enrique; Cedano, Juan

    2014-01-01

    Organisms simplify the orchestration of gene expression by coregulating genes whose products function together in the cell. The use of clustering methods to obtain sets of coexpressed genes from expression arrays is very common; nevertheless there are no appropriate tools to study the expression networks among these sets of coexpressed genes. The aim of the developed tools is to allow studying the complex expression dependences that exist between sets of coexpressed genes. For this purpose, we start detecting the nonlinear expression relationships between pairs of genes, plus the coexpressed genes. Next, we form networks among sets of coexpressed genes that maintain nonlinear expression dependences between all of them. The expression relationship between the sets of coexpressed genes is defined by the expression relationship between the skeletons of these sets, where this skeleton represents the coexpressed genes with a well-defined nonlinear expression relationship with the skeleton of the other sets. As a result, we can study the nonlinear expression relationships between a target gene and other sets of coexpressed genes, or start the study from the skeleton of the sets, to study the complex relationships of activation and deactivation between the sets of coexpressed genes that carry out the different cellular processes present in the expression experiments.

  12. Gene Co-Expression Network Analysis Provides Novel Insights into Myostatin Regulation at Three Different Mouse Developmental Timepoints

    PubMed Central

    Yang, Xuerong; Koltes, James E.; Park, Carissa A.; Chen, Daiwen; Reecy, James M.

    2015-01-01

    Myostatin (Mstn) knockout mice exhibit large increases in skeletal muscle mass. However, relatively few of the genes that mediate or modify MSTN effects are known. In this study, we performed co-expression network analysis using whole transcriptome microarray data from MSTN-null and wild-type mice to identify genes involved in important biological processes and pathways related to skeletal muscle and adipose development. Genes differentially expressed between wild-type and MSTN-null mice were further analyzed for shared DNA motifs using DREME. Differentially expressed genes were identified at 13.5 d.p.c. during primary myogenesis and at d35 during postnatal muscle development, but not at 17.5 d.p.c. during secondary myogenesis. In total, 283 and 2034 genes were differentially expressed at 13.5 d.p.c. and d35, respectively. Over-represented transcription factor binding sites in differentially expressed genes included SMAD3, SP1, ZFP187, and PLAGL1. The use of regulatory (RIF) and phenotypic (PIF) impact factor and differential hubbing co-expression analyses identified both known and potentially novel regulators of skeletal muscle growth, including Apobec2, Atp2a2, and Mmp13 at d35 and Sox2, Tmsb4x, and Vdac1 at 13.5 d.p.c. Among the genes with the highest PIF scores were many fiber type specifying genes. The use of RIF, PIF, and differential hubbing analyses identified both known and potentially novel regulators of muscle development. These results provide new details of how MSTN may mediate transcriptional regulation as well as insight into novel regulators of MSTN signal transduction that merit further study regarding their physiological roles in muscle and adipose development. PMID:25695797

  13. STARNET 2: a web-based tool for accelerating discovery of gene regulatory networks using microarray co-expression data

    PubMed Central

    Jupiter, Daniel; Chen, Hailin; VanBuren, Vincent

    2009-01-01

    Background Although expression microarrays have become a standard tool used by biologists, analysis of data produced by microarray experiments may still present challenges. Comparison of data from different platforms, organisms, and labs may involve complicated data processing, and inferring relationships between genes remains difficult. Results STARNET 2 is a new web-based tool that allows post hoc visual analysis of correlations that are derived from expression microarray data. STARNET 2 facilitates user discovery of putative gene regulatory networks in a variety of species (human, rat, mouse, chicken, zebrafish, Drosophila, C. elegans, S. cerevisiae, Arabidopsis and rice) by graphing networks of genes that are closely co-expressed across a large heterogeneous set of preselected microarray experiments. For each of the represented organisms, raw microarray data were retrieved from NCBI's Gene Expression Omnibus for a selected Affymetrix platform. All pairwise Pearson correlation coefficients were computed for expression profiles measured on each platform, respectively. These precompiled results were stored in a MySQL database, and supplemented by additional data retrieved from NCBI. A web-based tool allows user-specified queries of the database, centered at a gene of interest. The result of a query includes graphs of correlation networks, graphs of known interactions involving genes and gene products that are present in the correlation networks, and initial statistical analyses. Two analyses may be performed in parallel to compare networks, which is facilitated by the new HEATSEEKER module. Conclusion STARNET 2 is a useful tool for developing new hypotheses about regulatory relationships between genes and gene products, and has coverage for 10 species. 
Interpretation of the correlation networks is supported with a database of previously documented interactions, a test for enrichment of Gene Ontology terms, and heat maps of correlation distances that may be used to

  14. Differential Coexpression Analysis Reveals Extensive Rewiring of Arabidopsis Gene Coexpression in Response to Pseudomonas syringae Infection

    PubMed Central

    Jiang, Zhenhong; Dong, Xiaobao; Li, Zhi-Gang; He, Fei; Zhang, Ziding

    2016-01-01

    Plant defense responses to pathogens involve massive transcriptional reprogramming. Recently, differential coexpression analysis has been developed to study the rewiring of gene networks through microarray data, which is becoming an important complement to traditional differential expression analysis. Using time-series microarray data of Arabidopsis thaliana infected with Pseudomonas syringae, we analyzed Arabidopsis defense responses to P. syringae through differential coexpression analysis. Overall, we found that differential coexpression was a common phenomenon of plant immunity. Genes that were frequently involved in differential coexpression tend to be related to plant immune responses. Importantly, many of those genes have similar average expression levels between normal plant growth and pathogen infection but have different coexpression partners. By integrating the Arabidopsis regulatory network into our analysis, we identified several transcription factors that may be regulators of differential coexpression during plant immune responses. We also observed extensive differential coexpression between genes within the same metabolic pathways. Several metabolic pathways, such as photosynthesis light reactions, exhibited significant changes in expression correlation between normal growth and pathogen infection. Taken together, differential coexpression analysis provides a new strategy for analyzing transcriptional data related to plant defense responses and new insights into the understanding of plant-pathogen interactions. PMID:27721457
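The idea of differential coexpression described in this record can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the toy matrices and the `delta_cutoff` value are assumptions chosen for illustration. The sketch computes a Pearson correlation matrix per condition and counts, for each gene, how many partners change correlation by more than the cutoff.

```python
import numpy as np

def differential_coexpression(expr_a, expr_b, delta_cutoff=1.0):
    """Count, per gene, the partners whose Pearson r shifts by more than delta_cutoff.

    expr_a, expr_b: arrays of shape (genes, samples), one per condition.
    delta_cutoff is a hypothetical value, not taken from the record above.
    """
    r_a = np.corrcoef(expr_a)        # gene-by-gene correlations, condition A
    r_b = np.corrcoef(expr_b)        # gene-by-gene correlations, condition B
    delta = np.abs(r_a - r_b)        # change in correlation for every gene pair
    np.fill_diagonal(delta, 0.0)     # ignore self-correlation
    return (delta > delta_cutoff).sum(axis=1)

# Toy data (hypothetical): 3 genes x 4 samples per condition.
normal = np.array([[1, 2, 3, 4], [2, 4, 6, 8], [1, 3, 2, 4]], dtype=float)
infected = np.array([[1, 2, 3, 4], [8, 6, 4, 2], [1, 3, 2, 4]], dtype=float)
counts = differential_coexpression(normal, infected)
```

In the toy data the second gene has the same mean expression in both conditions, yet its correlation with both partners flips sign, which is the kind of rewiring that a differential expression test alone would likely miss.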

  15. Differential Coexpression Analysis Reveals Extensive Rewiring of Arabidopsis Gene Coexpression in Response to Pseudomonas syringae Infection.

    PubMed

    Jiang, Zhenhong; Dong, Xiaobao; Li, Zhi-Gang; He, Fei; Zhang, Ziding

    2016-10-10

    Plant defense responses to pathogens involve massive transcriptional reprogramming. Recently, differential coexpression analysis has been developed to study the rewiring of gene networks through microarray data, which is becoming an important complement to traditional differential expression analysis. Using time-series microarray data of Arabidopsis thaliana infected with Pseudomonas syringae, we analyzed Arabidopsis defense responses to P. syringae through differential coexpression analysis. Overall, we found that differential coexpression was a common phenomenon of plant immunity. Genes that were frequently involved in differential coexpression tend to be related to plant immune responses. Importantly, many of those genes have similar average expression levels between normal plant growth and pathogen infection but have different coexpression partners. By integrating the Arabidopsis regulatory network into our analysis, we identified several transcription factors that may be regulators of differential coexpression during plant immune responses. We also observed extensive differential coexpression between genes within the same metabolic pathways. Several metabolic pathways, such as photosynthesis light reactions, exhibited significant changes in expression correlation between normal growth and pathogen infection. Taken together, differential coexpression analysis provides a new strategy for analyzing transcriptional data related to plant defense responses and new insights into the understanding of plant-pathogen interactions.

  16. Coexpression network analysis of the genes regulated by two types of resistance responses to powdery mildew in wheat

    PubMed Central

    Zhang, Juncheng; Zheng, Hongyuan; Li, Yiwen; Li, Hongjie; Liu, Xin; Qin, Huanju; Dong, Lingli; Wang, Daowen

    2016-01-01

    Powdery mildew disease caused by Blumeria graminis f. sp. tritici (Bgt) inflicts severe economic losses in wheat crops. A systematic understanding of the molecular mechanisms involved in wheat resistance to Bgt is essential for effectively controlling the disease. Here, using the diploid wheat Triticum urartu as a host, the genes regulated by immune (IM) and hypersensitive reaction (HR) resistance responses to Bgt were investigated through transcriptome sequencing. Four gene coexpression networks (GCNs) were developed using transcriptomic data generated for 20 T. urartu accessions showing IM, HR or susceptible responses. The powdery mildew resistance regulated (PMRR) genes whose expression was significantly correlated with Bgt resistance were identified, and they tended to be hubs and enriched in six major modules. A wide occurrence of negative regulation of PMRR genes was observed. Three new candidate immune receptor genes (TRIUR3_13045, TRIUR3_01037 and TRIUR3_06195) positively associated with Bgt resistance were discovered. Finally, the involvement of TRIUR3_01037 in Bgt resistance was tentatively verified through cosegregation analysis in a F2 population and functional expression assay in Bgt susceptible leaf cells. This research provides insights into the global network properties of PMRR genes. Potential molecular differences between IM and HR resistance responses to Bgt are discussed. PMID:27033636

  17. Weighted gene co-expression network analysis in identification of metastasis-related genes of lung squamous cell carcinoma based on the Cancer Genome Atlas database.

    PubMed

    Tian, Feng; Zhao, Jinlong; Fan, Xinlei; Kang, Zhenxing

    2017-01-01

    Lung squamous cell carcinoma (lung SCC) is a common type of malignancy. Its pathogenesis is unclear. The aim of this study was to identify key genes as diagnostic biomarkers in lung SCC metastasis. We searched and downloaded mRNA expression data and clinical data from The Cancer Genome Atlas (TCGA) database to identify differences in mRNA expression of primary tumor tissues from lung SCC with and without metastasis. Gene co-expression network analysis, protein-protein interaction (PPI) network, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and quantitative real-time polymerase chain reactions (qRT-PCR) were used to explore the biological functions of the identified dysregulated genes. Four hundred and eighty-two differentially expressed genes (DEGs) were identified between lung SCC with and without metastasis. Nineteen modules were identified in lung SCC through weighted gene co-expression network analysis (WGCNA). Twenty-three DEGs and 26 DEGs were significantly enriched in the pink and black modules, respectively. KEGG pathway analysis showed that 26 DEGs in the black module were significantly enriched in the bile secretion pathway. Forty-nine DEGs in the two gene co-expression modules were used to construct a PPI network. CFTR in the black module was the hub protein, with connections to 182 genes. The results of qRT-PCR showed that FIGF, SFTPD and DYNLRB2 were significantly down-regulated in the tumor samples of lung SCC with metastasis, and that CFTR, SCGB3A2, SSTR1, SCTR and ROPN1L showed a tendency toward down-regulation in lung SCC with metastasis compared to lung SCC without metastasis. The dysregulated genes including CFTR, SCTR and FIGF might be involved in the pathology of lung SCC metastasis and could be used as potential diagnostic biomarkers or therapeutic targets for lung SCC.

  18. Weighted gene co-expression network analysis in identification of metastasis-related genes of lung squamous cell carcinoma based on the Cancer Genome Atlas database

    PubMed Central

    Tian, Feng; Zhao, Jinlong; Kang, Zhenxing

    2017-01-01

    Background Lung squamous cell carcinoma (lung SCC) is a common type of malignancy. Its pathogenesis is unclear. The aim of this study was to identify key genes as diagnostic biomarkers in lung SCC metastasis. Methods We searched and downloaded mRNA expression data and clinical data from The Cancer Genome Atlas (TCGA) database to identify differences in mRNA expression of primary tumor tissues from lung SCC with and without metastasis. Gene co-expression network analysis, protein-protein interaction (PPI) network, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and quantitative real-time polymerase chain reactions (qRT-PCR) were used to explore the biological functions of the identified dysregulated genes. Results Four hundred and eighty-two differentially expressed genes (DEGs) were identified between lung SCC with and without metastasis. Nineteen modules were identified in lung SCC through weighted gene co-expression network analysis (WGCNA). Twenty-three DEGs and 26 DEGs were significantly enriched in the pink and black modules, respectively. KEGG pathway analysis showed that 26 DEGs in the black module were significantly enriched in the bile secretion pathway. Forty-nine DEGs in the two gene co-expression modules were used to construct a PPI network. CFTR in the black module was the hub protein, with connections to 182 genes. The results of qRT-PCR showed that FIGF, SFTPD and DYNLRB2 were significantly down-regulated in the tumor samples of lung SCC with metastasis, and that CFTR, SCGB3A2, SSTR1, SCTR and ROPN1L showed a tendency toward down-regulation in lung SCC with metastasis compared to lung SCC without metastasis. Conclusions The dysregulated genes including CFTR, SCTR and FIGF might be involved in the pathology of lung SCC metastasis and could be used as potential diagnostic biomarkers or therapeutic targets for lung SCC. PMID:28203405

  19. Learning from Co-expression Networks: Possibilities and Challenges

    PubMed Central

    Serin, Elise A. R.; Nijveen, Harm; Hilhorst, Henk W. M.; Ligterink, Wilco

    2016-01-01

    Plants are fascinating and complex organisms. A comprehensive understanding of the organization, function and evolution of plant genes is essential to disentangle important biological processes and to advance crop engineering and breeding strategies. The ultimate aim in deciphering complex biological processes is the discovery of causal genes and regulatory mechanisms controlling these processes. The recent surge of omics data has opened the door to a system-wide understanding of the flow of biological information underlying complex traits. However, dealing with the corresponding large data sets represents a challenging endeavor that calls for the development of powerful bioinformatics methods. A popular approach is the construction and analysis of gene networks. Such networks are often used for genome-wide representation of the complex functional organization of biological systems. Networks based on similarity in gene expression are called (gene) co-expression networks. One of the major applications of gene co-expression networks is the functional annotation of unknown genes. Constructing co-expression networks is generally straightforward. In contrast, the resulting network of connected genes can become very complex, which limits its biological interpretation. Several strategies can be employed to enhance the interpretation of the networks. A strategy in coherence with the biological question addressed needs to be established to infer reliable networks. Additional benefits can be gained from network-based strategies using prior knowledge and data integration to further enhance the elucidation of gene regulatory relationships. As a result, biological networks provide many more applications beyond the simple visualization of co-expressed genes. In this study we review the different approaches for co-expression network inference in plants. 
We analyse integrative genomics strategies used in recent studies that successfully identified candidate genes taking advantage of

  20. Physiological Responses and Gene Co-Expression Network of Mycorrhizal Roots under K+ Deprivation

    PubMed Central

    Roy, Sushmita

    2017-01-01

    Arbuscular mycorrhizal (AM) associations enhance the phosphorous and nitrogen nutrition of host plants, but little is known about their role in potassium (K+) nutrition. Medicago truncatula plants were cocultured with the AM fungus Rhizophagus irregularis under high and low K+ regimes for 6 weeks. We determined how K+ deprivation affects plant development and mineral acquisition and how these negative effects are tempered by the AM colonization. The transcriptional response of AM roots under K+ deficiency was analyzed by whole-genome RNA sequencing. K+ deprivation decreased root biomass and external K+ uptake and modulated oxidative stress gene expression in M. truncatula roots. AM colonization induced specific transcriptional responses to K+ deprivation that seem to temper these negative effects. A gene network analysis revealed putative key regulators of these responses. This study confirmed that AM associations provide some tolerance to K+ deprivation to host plants, revealed that AM symbiosis modulates the expression of specific root genes to cope with this nutrient stress, and identified putative regulators participating in these tolerance mechanisms. PMID:28159827

  1. Physiological Responses and Gene Co-Expression Network of Mycorrhizal Roots under K(+) Deprivation.

    PubMed

    Garcia, Kevin; Chasman, Deborah; Roy, Sushmita; Ané, Jean-Michel

    2017-03-01

    Arbuscular mycorrhizal (AM) associations enhance the phosphorous and nitrogen nutrition of host plants, but little is known about their role in potassium (K(+)) nutrition. Medicago truncatula plants were cocultured with the AM fungus Rhizophagus irregularis under high and low K(+) regimes for 6 weeks. We determined how K(+) deprivation affects plant development and mineral acquisition and how these negative effects are tempered by the AM colonization. The transcriptional response of AM roots under K(+) deficiency was analyzed by whole-genome RNA sequencing. K(+) deprivation decreased root biomass and external K(+) uptake and modulated oxidative stress gene expression in M. truncatula roots. AM colonization induced specific transcriptional responses to K(+) deprivation that seem to temper these negative effects. A gene network analysis revealed putative key regulators of these responses. This study confirmed that AM associations provide some tolerance to K(+) deprivation to host plants, revealed that AM symbiosis modulates the expression of specific root genes to cope with this nutrient stress, and identified putative regulators participating in these tolerance mechanisms.

  2. Identification of co-expression gene networks, regulatory genes and pathways for obesity based on adipose tissue RNA Sequencing in a porcine model.

    PubMed

    Kogelman, Lisette J A; Cirera, Susanna; Zhernakova, Daria V; Fredholm, Merete; Franke, Lude; Kadarmideen, Haja N

    2014-09-30

    Obesity is a complex metabolic condition in strong association with various diseases, like type 2 diabetes, resulting in major public health and economic implications. Obesity is the result of environmental and genetic factors and their interactions, including genome-wide genetic interactions. Identification of co-expressed and regulatory genes in RNA extracted from relevant tissues representing lean and obese individuals provides an entry point for the identification of genes and pathways of importance to the development of obesity. The pig, an omnivorous animal, is an excellent model for human obesity, offering the possibility to study in-depth organ-level transcriptomic regulations of obesity, unfeasible in humans. Our aim was to reveal adipose tissue co-expression networks, pathways and transcriptional regulations of obesity using RNA Sequencing based systems biology approaches in a porcine model. We selected 36 animals for RNA Sequencing from a previously created F2 pig population representing three extreme groups based on their predicted genetic risks for obesity. We applied Weighted Gene Co-expression Network Analysis (WGCNA) to detect clusters of highly co-expressed genes (modules). Additionally, regulator genes were detected using Lemon-Tree algorithms. WGCNA revealed five modules which were strongly correlated with at least one obesity-related phenotype (correlations ranging from -0.54 to 0.72, P < 0.001). Functional annotation identified pathways enlightening the association between obesity and other diseases, like osteoporosis (osteoclast differentiation, P = 1.4E-7), and immune-related complications (e.g. Natural killer cell mediated cytotoxicity, P = 3.8E-5; B cell receptor signaling pathway, P = 7.2E-5). Lemon-Tree identified three potential regulator genes, using confident scores, for the WGCNA module which was associated with osteoclast differentiation: CCR1, MSR1 and SI1 (probability scores respectively 95.30, 62.28, and 34.58). 
Moreover, detection

  3. Identification of co-expression gene networks, regulatory genes and pathways for obesity based on adipose tissue RNA Sequencing in a porcine model

    PubMed Central

    2014-01-01

    Background Obesity is a complex metabolic condition in strong association with various diseases, like type 2 diabetes, resulting in major public health and economic implications. Obesity is the result of environmental and genetic factors and their interactions, including genome-wide genetic interactions. Identification of co-expressed and regulatory genes in RNA extracted from relevant tissues representing lean and obese individuals provides an entry point for the identification of genes and pathways of importance to the development of obesity. The pig, an omnivorous animal, is an excellent model for human obesity, offering the possibility to study in-depth organ-level transcriptomic regulations of obesity, unfeasible in humans. Our aim was to reveal adipose tissue co-expression networks, pathways and transcriptional regulations of obesity using RNA Sequencing based systems biology approaches in a porcine model. Methods We selected 36 animals for RNA Sequencing from a previously created F2 pig population representing three extreme groups based on their predicted genetic risks for obesity. We applied Weighted Gene Co-expression Network Analysis (WGCNA) to detect clusters of highly co-expressed genes (modules). Additionally, regulator genes were detected using Lemon-Tree algorithms. Results WGCNA revealed five modules which were strongly correlated with at least one obesity-related phenotype (correlations ranging from -0.54 to 0.72, P < 0.001). Functional annotation identified pathways enlightening the association between obesity and other diseases, like osteoporosis (osteoclast differentiation, P = 1.4E-7), and immune-related complications (e.g. Natural killer cell mediated cytotoxicity, P = 3.8E-5; B cell receptor signaling pathway, P = 7.2E-5). 
Lemon-Tree identified three potential regulator genes, using confident scores, for the WGCNA module which was associated with osteoclast differentiation: CCR1, MSR1 and SI1 (probability scores respectively 95.30, 62.28, and

  4. A Null Model for Pearson Coexpression Networks

    PubMed Central

    Gobbi, Andrea; Jurman, Giuseppe

    2015-01-01

    Gene coexpression networks inferred by correlation from high-throughput profiling such as microarray data represent simple but effective structures for discovering and interpreting linear gene relationships. In recent years, several approaches have been proposed to tackle the problem of deciding when the resulting correlation values are statistically significant. This is most crucial when the number of samples is small, yielding a non-negligible chance that even high correlation values are due to random effects. Here we introduce a novel hard thresholding solution based on the assumption that a coexpression network inferred by randomly generated data is expected to be empty. The threshold is theoretically derived by means of an analytic approach and, as a deterministic independent null model, it depends only on the dimensions of the starting data matrix, with assumptions on the skewness of the data distribution compatible with the structure of gene expression levels data. We show, on synthetic and array datasets, that the proposed threshold is effective in eliminating all false positive links, with an offsetting cost in terms of false negative detected edges. PMID:26030917
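Hard thresholding of a Pearson coexpression network, as discussed in this record, can be sketched as follows. The analytic threshold derived by the authors is not reproduced in the abstract, so `threshold` is left as a caller-supplied parameter here; the function name and toy matrix are likewise assumptions made for illustration.

```python
import numpy as np

def threshold_network(expr, threshold):
    """Return a boolean adjacency matrix keeping gene pairs with |Pearson r| >= threshold.

    expr: array of shape (genes, samples).
    threshold: supplied by the caller; the record's analytic value is not given here.
    """
    r = np.corrcoef(expr)             # gene-by-gene Pearson correlations
    adj = np.abs(r) >= threshold      # hard threshold on absolute correlation
    np.fill_diagonal(adj, False)      # no self-loops
    return adj

# Toy data (hypothetical): 3 genes x 4 samples.
expr = np.array([[1, 2, 3, 4], [2, 4, 6, 8], [1, 3, 2, 4]], dtype=float)
strict = threshold_network(expr, 0.9)   # keeps only the perfectly correlated pair
loose = threshold_network(expr, 0.5)    # also keeps the two pairs with r = 0.8
```

A stricter threshold removes more candidate links, which reflects the trade-off the record describes: fewer false positive edges at the cost of more false negatives.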

  5. A null model for Pearson coexpression networks.

    PubMed

    Gobbi, Andrea; Jurman, Giuseppe

    2015-01-01

    Gene coexpression networks inferred by correlation from high-throughput profiling such as microarray data represent simple but effective structures for discovering and interpreting linear gene relationships. In recent years, several approaches have been proposed to tackle the problem of deciding when the resulting correlation values are statistically significant. This is most crucial when the number of samples is small, yielding a non-negligible chance that even high correlation values are due to random effects. Here we introduce a novel hard thresholding solution based on the assumption that a coexpression network inferred by randomly generated data is expected to be empty. The threshold is theoretically derived by means of an analytic approach and, as a deterministic independent null model, it depends only on the dimensions of the starting data matrix, with assumptions on the skewness of the data distribution compatible with the structure of gene expression levels data. We show, on synthetic and array datasets, that the proposed threshold is effective in eliminating all false positive links, with an offsetting cost in terms of false negative detected edges.

  6. Gene Co-Expression Network Analysis Unraveling Transcriptional Regulation of High-Altitude Adaptation of Tibetan Pig

    PubMed Central

    Koltes, James E.; Gou, Xiao; Yang, Shuli; Yan, Dawei; Lu, Shaoxiong

    2016-01-01

    Tibetan pigs have survived at high altitude for millennia and they have a suite of adaptive features to tolerate the hypoxic environment. However, the molecular mechanisms underlying the regulation of hypoxia-adaptive phenotypes have not been completely elucidated. In this study, we analyzed differentially expressed genes (DEGs), biological pathways and constructed co-expression regulation networks using whole-transcriptome microarrays from lung tissues of Tibetan and Duroc pigs both at high and low altitude. A total of 3,066 DEGs were identified and this list was over-represented for the ontology terms including metabolic process, catalytic activity, and KEGG pathway including metabolic pathway and PI3K-Akt signaling pathway. The regulatory (RIF) and phenotypic (PIF) impact factor analysis identified several known and several potentially novel regulators of hypoxia adaption, including: IKBKG, KLF6 and RBPJ (RIF1), SF3B1, EFEMP1, HOXB6 and ATF6 (RIF2). These findings provide new details of the regulatory architecture of hypoxia-adaptive genes and also insight into which genes may undergo epigenetic modification for further study in the high-altitude adaptation. PMID:27936142

  7. Immuno-Navigator, a batch-corrected coexpression database, reveals cell type-specific gene networks in the immune system

    PubMed Central

    Vandenbon, Alexis; Dinh, Viet H.; Mikami, Norihisa; Kitagawa, Yohko; Teraguchi, Shunsuke; Ohkura, Naganari; Sakaguchi, Shimon

    2016-01-01

    High-throughput gene expression data are one of the primary resources for exploring complex intracellular dynamics in modern biology. The integration of large amounts of public data may allow us to examine general dynamical relationships between regulators and target genes. However, obstacles for such analyses are study-specific biases or batch effects in the original data. Here we present Immuno-Navigator, a batch-corrected gene expression and coexpression database for 24 cell types of the mouse immune system. We systematically removed batch effects from the underlying gene expression data and showed that this removal considerably improved the consistency between inferred correlations and prior knowledge. The data revealed widespread cell type-specific correlation of expression. Integrated analysis tools allow users to use this correlation of expression for the generation of hypotheses about biological networks and candidate regulators in specific cell types. We show several applications of Immuno-Navigator as examples. In one application we successfully predicted known regulators of importance in naturally occurring Treg cells from their expression correlation with a set of Treg-specific genes. For one high-scoring gene, integrin β8 (Itgb8), we confirmed an association between Itgb8 expression in forkhead box P3 (Foxp3)-positive T cells and Treg-specific epigenetic remodeling. Our results also suggest that the regulation of Treg-specific genes within Treg cells is relatively independent of Foxp3 expression, supporting recent results pointing to a Foxp3-independent component in the development of Treg cells. PMID:27078110

  8. ALS disrupts spinal motor neuron maturation and aging pathways within gene co-expression networks

    PubMed Central

    Ho, Ritchie; Sances, Samuel; Gowing, Genevieve; Amoroso, Mackenzie Weygandt; O'Rourke, Jacqueline G.; Sahabian, Anais; Wichterle, Hynek; Baloh, Robert H.; Sareen, Dhruv; Svendsen, Clive N.

    2016-01-01

    Modeling Amyotrophic Lateral Sclerosis (ALS) with human induced pluripotent stem cells (iPSCs) aims to reenact embryogenesis, maturation, and aging of spinal motor neurons (spMNs) in vitro. As the maturity of spMNs grown in vitro compared to spMNs in vivo remains largely unaddressed, it is unclear to what extent this in vitro system captures critical aspects of spMN development and molecular signatures associated with ALS. Here, we compared transcriptomes among iPSC-derived spMNs, fetal, and adult spinal tissues. This approach produced a maturation scale revealing that iPSC-derived spMNs were more similar to fetal spinal tissue than to adult spMNs. Additionally, we resolved gene networks and pathways associated with spMN maturation and aging. These networks enriched for pathogenic familial ALS genetic variants and were disrupted in sporadic ALS spMNs. Altogether, our findings suggest that developing strategies to further mature and age iPSC-derived spMNs will provide more effective iPSC models of ALS pathology. PMID:27428653

  9. ALS disrupts spinal motor neuron maturation and aging pathways within gene co-expression networks.

    PubMed

    Ho, Ritchie; Sances, Samuel; Gowing, Genevieve; Amoroso, Mackenzie Weygandt; O'Rourke, Jacqueline G; Sahabian, Anais; Wichterle, Hynek; Baloh, Robert H; Sareen, Dhruv; Svendsen, Clive N

    2016-09-01

    Modeling amyotrophic lateral sclerosis (ALS) with human induced pluripotent stem cells (iPSCs) aims to reenact embryogenesis, maturation and aging of spinal motor neurons (spMNs) in vitro. As the maturity of spMNs grown in vitro compared to spMNs in vivo remains largely unaddressed, it is unclear to what extent this in vitro system captures critical aspects of spMN development and molecular signatures associated with ALS. Here, we compared transcriptomes among iPSC-derived spMNs, fetal spinal tissues and adult spinal tissues. This approach produced a maturation scale revealing that iPSC-derived spMNs were more similar to fetal spinal tissue than to adult spMNs. Additionally, we resolved gene networks and pathways associated with spMN maturation and aging. These networks enriched for pathogenic familial ALS genetic variants and were disrupted in sporadic ALS spMNs. Altogether, our findings suggest that developing strategies to further mature and age iPSC-derived spMNs will provide more effective iPSC models of ALS pathology.

  10. Differential expression and co-expression gene networks reveal candidate biomarkers of boar taint in non-castrated pigs.

    PubMed

    Drag, Markus; Skinkyté-Juskiené, Ruta; Do, Duy N; Kogelman, Lisette J A; Kadarmideen, Haja N

    2017-09-22

    Boar taint (BT) is an offensive odour or taste observed in pork from a proportion of non-castrated male pigs. Surgical castration is effective in avoiding BT, but animal welfare issues have created an incentive for alternatives such as genomic selection. In order to find candidate biomarkers, gene expression profiles were analysed from tissues of non-castrated pigs grouped by their genetic merit of BT. Differential expression analysis revealed substantial changes with log-transformed fold changes of liver and testis from -3.39 to 2.96 and -7.51 to 3.53, respectively. Co-expression network analysis revealed one module with a correlation of -0.27 in liver and three modules with correlations of 0.31, -0.44 and -0.49 in testis. Differential expression and co-expression analysis revealed candidate biomarkers with varying biological functions: phase I (COQ3, COX6C, CYP2J2, CYP2B6, ACOX2) and phase II metabolism (GSTO1, GSR, FMO3) of skatole and androstenone in liver to steroidogenesis (HSD17B7, HSD17B8, CYP27A1), regulation of steroidogenesis (STARD10, CYB5R3) and GnRH signalling (MAPK3, MAP2K2, MAP3K2) in testis. Overrepresented pathways included "Ribosome", "Protein export" and "Oxidative phosphorylation" in liver and "Steroid hormone biosynthesis" and "Gap junction" in testis. Future work should evaluate the biomarkers in large populations to ensure their usefulness in genomic selection programs.

  11. Functional Analysis and Characterization of Differential Coexpression Networks

    PubMed Central

    Hsu, Chia-Lang; Juan, Hsueh-Fen; Huang, Hsuan-Cheng

    2015-01-01

    Differential coexpression analysis is emerging as a complement to conventional differential gene expression analysis. The identified differential coexpression links can be assembled into a differential coexpression network (DCEN) in response to environmental stresses or genetic changes. Differential coexpression analyses have been successfully used to identify condition-specific modules; however, the structural properties and biological significance of general DCENs have not been well investigated. Here, we analyzed two independent Saccharomyces cerevisiae DCENs constructed from large-scale time-course gene expression profiles in response to different situations. Topological analyses show that DCENs are tree-like networks possessing scale-free characteristics, but not small-world. Functional analyses indicate that differentially coexpressed gene pairs in DCEN tend to link different biological processes, achieving complementary or synergistic effects. Furthermore, the gene pairs lacking common transcription factors are sensitive to perturbation and hence lead to differential coexpression. Based on these observations, we integrated transcriptional regulatory information into DCEN and identified transcription factors that might cause differential coexpression by gain or loss of activation in response to different situations. Collectively, our results not only uncover the unique structural characteristics of DCEN but also provide new insights into interpretation of DCEN to reveal its biological significance and infer the underlying gene regulatory dynamics. PMID:26282208
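
    The notion of a differential coexpression link in the abstract above can be illustrated with a minimal sketch. It assumes a link is declared when the Pearson correlation of a gene pair changes by more than a chosen cutoff between two conditions; the cutoff, the gene names and the toy expression values are hypothetical, and the abstract does not state which statistic the authors actually used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def differential_links(expr_a, expr_b, cutoff=1.0):
    """Gene pairs whose correlation differs by more than `cutoff` between conditions.

    expr_a and expr_b map gene name -> list of expression values (same genes in both).
    The cutoff is an illustrative assumption, not a value from the abstract.
    """
    links = []
    for g1, g2 in combinations(sorted(expr_a), 2):
        r_a = pearson(expr_a[g1], expr_a[g2])
        r_b = pearson(expr_b[g1], expr_b[g2])
        if abs(r_a - r_b) > cutoff:
            links.append((g1, g2))
    return links


# Hypothetical toy profiles: g2 reverses its trend in condition B.
condition_a = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, 3, 2, 4]}
condition_b = {"g1": [1, 2, 3, 4], "g2": [8, 6, 4, 2], "g3": [1, 3, 2, 4]}
links = differential_links(condition_a, condition_b)
print(links)
```

    In this toy case only the pairs involving g2 are flagged, because g1 and g3 keep the same correlation in both conditions. The resulting list of links is what would be assembled into a DCEN.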

  12. Functional Analysis and Characterization of Differential Coexpression Networks.

    PubMed

    Hsu, Chia-Lang; Juan, Hsueh-Fen; Huang, Hsuan-Cheng

    2015-08-18

    Differential coexpression analysis is emerging as a complement to conventional differential gene expression analysis. The identified differential coexpression links can be assembled into a differential coexpression network (DCEN) in response to environmental stresses or genetic changes. Differential coexpression analyses have been successfully used to identify condition-specific modules; however, the structural properties and biological significance of general DCENs have not been well investigated. Here, we analyzed two independent Saccharomyces cerevisiae DCENs constructed from large-scale time-course gene expression profiles in response to different situations. Topological analyses show that DCENs are tree-like networks possessing scale-free characteristics, but not small-world. Functional analyses indicate that differentially coexpressed gene pairs in DCEN tend to link different biological processes, achieving complementary or synergistic effects. Furthermore, the gene pairs lacking common transcription factors are sensitive to perturbation and hence lead to differential coexpression. Based on these observations, we integrated transcriptional regulatory information into DCEN and identified transcription factors that might cause differential coexpression by gain or loss of activation in response to different situations. Collectively, our results not only uncover the unique structural characteristics of DCEN but also provide new insights into interpretation of DCEN to reveal its biological significance and infer the underlying gene regulatory dynamics.

  13. Rat Hepatocytes Weighted Gene Co-Expression Network Analysis Identifies Specific Modules and Hub Genes Related to Liver Regeneration after Partial Hepatectomy

    PubMed Central

    Zhou, Yun; Xu, Jiucheng; Liu, Yunqing; Li, Juntao; Chang, Cuifang; Xu, Cunshuan

    2014-01-01

    The recovery of liver mass is mainly mediated by proliferation of hepatocytes after 2/3 partial hepatectomy (PH) in rats. Studying the gene expression profiles of hepatocytes after 2/3 PH will be helpful to investigate the molecular mechanisms of liver regeneration (LR). We report here the first application of weighted gene co-expression network analysis (WGCNA) to analyze the biological implications of gene expression changes associated with LR. WGCNA identifies 12 specific gene modules and some hub genes from hepatocyte genome-scale microarray data in rat LR. The results suggest that upregulated MCM5 may promote hepatocyte proliferation during LR; BCL3 may play an important role by activating or inhibiting the NF-κB pathway; MAPK9 may play a permissive role in DNA replication by p38 MAPK inactivation in the hepatocyte proliferation stage. Thus, WGCNA can provide novel insight into understanding the molecular mechanisms of LR. PMID:24743545
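
    The idea of a weighted network with hub genes can be sketched as follows. This is an illustration only, assuming an unsigned soft-threshold adjacency (absolute Pearson correlation raised to a power beta) and defining the hub as the gene with the largest summed adjacency; it does not use the WGCNA R package, and the gene names, expression values and beta = 6 are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def connectivity(expr, beta=6):
    """Summed soft-threshold adjacency |r|**beta of each gene to all other genes."""
    genes = sorted(expr)
    return {
        g: sum(abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g)
        for g in genes
    }


# Hypothetical toy expression profiles (four genes, four samples).
toy_expr = {
    "gene_a": [1, 2, 3, 4],
    "gene_b": [1, 2, 3, 5],
    "gene_c": [1, 3, 2, 4],
    "gene_d": [4, 1, 3, 2],
}
k = connectivity(toy_expr)
hub = max(k, key=k.get)  # gene with the highest connectivity
print(hub, round(k[hub], 3))
```

    Raising correlations to a power keeps strong relationships and pushes weak ones towards zero, so the connectivity ranking is dominated by tightly co-expressed genes.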

  14. Identification of candidate genes in Arabidopsis and Populus cell wall biosynthesis using text-mining, co-expression network analysis and comparative genomics.

    PubMed

    Yang, Xiaohan; Ye, Chu-Yu; Bisaria, Anjali; Tuskan, Gerald A; Kalluri, Udaya C

    2011-12-01

    Populus is an important bioenergy crop for bioethanol production. A greater understanding of cell wall biosynthesis processes is critical in reducing biomass recalcitrance, a major hindrance in efficient generation of biofuels from lignocellulosic biomass. Here, we report the identification of candidate cell wall biosynthesis genes through the development and application of a novel bioinformatics pipeline. As a first step, via text-mining of PubMed publications, we obtained 121 Arabidopsis genes that had the experimental evidence supporting their involvement in cell wall biosynthesis or remodeling. The 121 genes were then used as bait genes to query an Arabidopsis co-expression database, and additional genes were identified as neighbors of the bait genes in the network, increasing the number of genes to 548. The 548 Arabidopsis genes were then used to re-query the Arabidopsis co-expression database and re-construct a network that captured additional network neighbors, expanding to a total of 694 genes. The 694 Arabidopsis genes were computationally divided into 22 clusters. Queries of the Populus genome using the Arabidopsis genes revealed 817 Populus orthologs. Functional analysis of gene ontology and tissue-specific gene expression indicated that these Arabidopsis and Populus genes are high likelihood candidates for functional characterization in relation to cell wall biosynthesis.
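
    The two rounds of bait-gene expansion described above (121 to 548 to 694 genes) can be sketched on a toy network. The adjacency dictionary and gene names below are hypothetical stand-ins for the Arabidopsis co-expression database, whose actual query interface is not described in the abstract.

```python
def expand_with_neighbors(network, genes):
    """Return the input genes plus every direct co-expression neighbor of them."""
    expanded = set(genes)
    for gene in genes:
        expanded |= network.get(gene, set())
    return expanded


# Hypothetical co-expression network: gene -> set of co-expressed neighbors.
toy_network = {
    "bait1": {"n1", "n2"},
    "bait2": {"n2"},
    "n1": {"bait1", "n3"},
    "n2": {"bait1", "bait2"},
    "n3": {"n1", "n4"},
    "n4": {"n3"},
}

baits = {"bait1", "bait2"}                                  # text-mined seed genes
round_one = expand_with_neighbors(toy_network, baits)       # baits plus their neighbors
round_two = expand_with_neighbors(toy_network, round_one)   # re-query with the expanded set
print(sorted(round_one))
print(sorted(round_two))
```

    Each round adds only direct neighbors, so genes two steps away from a bait (n3 here) enter in the second query, while more distant genes (n4) stay out.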

  15. CluGene: A Bioinformatics Framework for the Identification of Co-Localized, Co-Expressed and Co-Regulated Genes Aimed at the Investigation of Transcriptional Regulatory Networks from High-Throughput Expression Data

    PubMed Central

    Dottorini, Tania; Palladino, Pietro; Senin, Nicola; Persampieri, Tania; Spaccapelo, Roberta; Crisanti, Andrea

    2013-01-01

    The full understanding of the mechanisms underlying transcriptional regulatory networks requires unravelling of complex causal relationships. Genome high-throughput technologies produce a huge amount of information pertaining to gene expression and regulation; however, the complexity of the available data is often overwhelming and tools are needed to extract and organize the relevant information. This work starts from the assumption that the observation of co-occurrent events (in particular co-localization, co-expression and co-regulation) may provide a powerful starting point to begin unravelling transcriptional regulatory networks. Co-expressed genes often imply shared functional pathways; co-expressed and functionally related genes are often co-localized, too; moreover, co-expressed and co-localized genes are also potential targets for co-regulation; finally, co-regulation seems more frequent for genes mapped to proximal chromosome regions. Despite the recognized importance of analysing co-occurrent events, no bioinformatics solution allowing the simultaneous analysis of co-expression, co-localization and co-regulation is currently available. Our work resulted in developing and evaluating CluGene, a software package providing tools to analyze multiple types of co-occurrences within a single interactive environment, allowing the interactive investigation of combined co-expression, co-localization and co-regulation of genes. The use of CluGene will enhance the power of testing hypotheses and experimental approaches aimed at unravelling transcriptional regulatory networks. The software is freely available at http://bioinfolab.unipg.it/. PMID:23823315

  16. A gene co-expression network predicts functional genes controlling the re-establishment of desiccation tolerance in germinated Arabidopsis thaliana seeds.

    PubMed

    Costa, Maria Cecília D; Righetti, Karima; Nijveen, Harm; Yazdanpanah, Farzaneh; Ligterink, Wilco; Buitink, Julia; Hilhorst, Henk W M

    2015-08-01

    During re-establishment of desiccation tolerance (DT), early events promote initial protection and growth arrest, while late events promote stress adaptation and contribute to survival in the dry state. Mature seeds of Arabidopsis thaliana are desiccation tolerant, but they lose desiccation tolerance (DT) while progressing to germination. Yet, there is a small developmental window during which DT can be rescued by treatment with abscisic acid (ABA). To gain temporal resolution and identify relevant genes in this process, data from a time series of microarrays were used to build a gene co-expression network. The network has two regions, namely early response (ER) and late response (LR). Genes in the ER region are related to biological processes, such as dormancy, acquisition of DT and drought, amplification of signals, growth arrest and induction of protection mechanisms (such as LEA proteins). Genes in the LR region lead to inhibition of photosynthesis and primary metabolism, promote adaptation to stress conditions and contribute to seed longevity. Phenotyping of 12 hubs in relation to re-establishment of DT with T-DNA insertion lines indicated a significant increase in the ability to re-establish DT compared with the wild-type in the lines cbsx4, at3g53040 and at4g25580, suggesting the operation of redundant and compensatory mechanisms. Moreover, we show that re-establishment of DT by polyethylene glycol and ABA occurs through partially overlapping mechanisms. Our data confirm that co-expression network analysis is a valid approach to examine data from time series of transcriptome analysis, as it provides promising insights into biologically relevant relations that help to generate new information about the roles of certain genes for DT.

  17. Large-scale prediction of long non-coding RNA functions in a coding–non-coding gene co-expression network

    PubMed Central

    Liao, Qi; Liu, Changning; Yuan, Xiongying; Kang, Shuli; Miao, Ruoyu; Xiao, Hui; Zhao, Guoguang; Luo, Haitao; Bu, Dechao; Zhao, Haitao; Skogerbø, Geir; Wu, Zhongdao; Zhao, Yi

    2011-01-01

    Although accumulating evidence has provided insight into the various functions of long-non-coding RNAs (lncRNAs), the exact functions of the majority of such transcripts are still unknown. Here, we report the first computational annotation of lncRNA functions based on public microarray expression profiles. A coding–non-coding gene co-expression (CNC) network was constructed from re-annotated Affymetrix Mouse Genome Array data. Probable functions for altogether 340 lncRNAs were predicted based on topological or other network characteristics, such as module sharing, association with network hubs and combinations of co-expression and genomic adjacency. The functions annotated to the lncRNAs mainly involve organ or tissue development (e.g. neuron, eye and muscle development), cellular transport (e.g. neuronal transport and sodium ion, acid or lipid transport) or metabolic processes (e.g. involving macromolecules, phosphocreatine and tyrosine). PMID:21247874

  18. Gene co-expression network analysis identifies porcine genes associated with variation in metabolizing fenbendazole and flunixin meglumine in the liver.

    PubMed

    Howard, Jeremy T; Ashwell, Melissa S; Baynes, Ronald E; Brooks, James D; Yeatts, James L; Maltecca, Christian

    2017-05-02

    Identifying individual genetic variation in drug metabolism pathways is of importance not only in livestock, but also in humans in order to provide the ultimate goal of giving the right drug at the right dose at the right time. Our objective was to identify individual genes and gene networks involved in metabolizing fenbendazole (FBZ) and flunixin meglumine (FLU) in swine liver. The population consisted of female and castrated male pigs that were sired by boars represented by 4 breeds. Progeny were randomly placed into groups: no drug (UNT), FLU or FBZ administered. Liver transcriptome profiles from 60 animals with extreme (i.e. fast or slow drug metabolism) pharmacokinetic (PK) profiles were generated from RNA sequencing. Multiple cytochrome P450 (CYP1A1, CYP2A19 and CYP2C36) genes displayed different transcript levels across treated versus UNT. Weighted gene co-expression network analysis identified 5 and 3 modules of genes correlated with PK parameters and a portion of these were enriched for biological processes relevant to drug metabolism for FBZ and FLU, respectively. Genes within identified modules were shown to have a higher transcript level relationship (i.e. connectivity) in treated versus UNT animals. Investigation into the identified genes would allow for greater insight into FBZ and FLU metabolism.

  19. Identifying genetic loci and spleen gene coexpression networks underlying immunophenotypes in BXD recombinant inbred mice

    PubMed Central

    Lynch, Rachel M.; Naswa, Sudhir; Rogers, Gary L.; Kania, Stephen A.; Das, Suchita; Chesler, Elissa J.; Saxton, Arnold M.; Langston, Michael A.

    2010-01-01

    The immune system plays a pivotal role in the susceptibility to and progression of a variety of diseases. Due to a strong genetic basis, heritable differences in immune function may contribute to differential disease susceptibility between individuals. Genetic reference populations, such as the BXD (C57BL/6J × DBA/2J) panel of recombinant inbred (RI) mouse strains, provide unique models through which to integrate baseline phenotypes in healthy individuals with heritable risk for disease because of the ability to combine data collected from these populations across both multiple studies and time. We performed basic immunophenotyping (e.g., percentage of circulating B and T lymphocytes and CD4+ and CD8+ T cell subpopulations) in peripheral blood of healthy mice from 41 BXD RI strains to define the immunophenotypic variation in this strain panel and to characterize the genetic architecture that underlies these traits. Significant QTL models that explained the majority (50–77%) of phenotypic variance were derived for each trait and for the T:B cell and CD4+:CD8+ ratios. Combining QTL mapping with spleen gene expression data uncovered two quantitative trait transcripts, Ptprk and Acp1, as candidates for heritable differences in the relative abundance of helper and cytotoxic T cells. These data will be valuable in extracting genetic correlates of the immune system in the BXD panel. In addition, they will be a useful resource for prospective, phenotype-driven model selection to test hypotheses about differential disease or environmental susceptibility between individuals with baseline differences in the composition of the immune system. PMID:20179155

  20. A Gene Co-Expression Network in Whole Blood of Schizophrenia Patients Is Independent of Antipsychotic-Use and Enriched for Brain-Expressed Genes

    PubMed Central

    de Jong, Simone; Boks, Marco P. M.; Fuller, Tova F.; Strengman, Eric; Janson, Esther; de Kovel, Carolien G. F.; Ori, Anil P. S.; Vi, Nancy; Mulder, Flip; Blom, Jan Dirk; Glenthøj, Birte; Schubart, Chris D.; Cahn, Wiepke; Kahn, René S.; Horvath, Steve; Ophoff, Roel A.

    2012-01-01

    Despite large-scale genome-wide association studies (GWAS), the underlying genes for schizophrenia are largely unknown. Additional approaches are therefore required to identify the genetic background of this disorder. Here we report findings from a large gene expression study in peripheral blood of schizophrenia patients and controls. We applied a systems biology approach to genome-wide expression data from whole blood of 92 medicated and 29 antipsychotic-free schizophrenia patients and 118 healthy controls. We show that gene expression profiling in whole blood can identify twelve large gene co-expression modules associated with schizophrenia. Several of these disease related modules are likely to reflect expression changes due to antipsychotic medication. However, two of the disease modules could be replicated in an independent second data set involving antipsychotic-free patients and controls. One of these robustly defined disease modules is significantly enriched with brain-expressed genes and with genetic variants that were implicated in a GWAS study, which could imply a causal role in schizophrenia etiology. The most highly connected intramodular hub gene in this module (ABCF1), is located in, and regulated by the major histocompatibility (MHC) complex, which is intriguing in light of the fact that common allelic variants from the MHC region have been implicated in schizophrenia. This suggests that the MHC increases schizophrenia susceptibility via altered gene expression of regulatory genes in this network. PMID:22761806

  1. A gene co-expression network in whole blood of schizophrenia patients is independent of antipsychotic-use and enriched for brain-expressed genes.

    PubMed

    de Jong, Simone; Boks, Marco P M; Fuller, Tova F; Strengman, Eric; Janson, Esther; de Kovel, Carolien G F; Ori, Anil P S; Vi, Nancy; Mulder, Flip; Blom, Jan Dirk; Glenthøj, Birte; Schubart, Chris D; Cahn, Wiepke; Kahn, René S; Horvath, Steve; Ophoff, Roel A

    2012-01-01

    Despite large-scale genome-wide association studies (GWAS), the underlying genes for schizophrenia are largely unknown. Additional approaches are therefore required to identify the genetic background of this disorder. Here we report findings from a large gene expression study in peripheral blood of schizophrenia patients and controls. We applied a systems biology approach to genome-wide expression data from whole blood of 92 medicated and 29 antipsychotic-free schizophrenia patients and 118 healthy controls. We show that gene expression profiling in whole blood can identify twelve large gene co-expression modules associated with schizophrenia. Several of these disease related modules are likely to reflect expression changes due to antipsychotic medication. However, two of the disease modules could be replicated in an independent second data set involving antipsychotic-free patients and controls. One of these robustly defined disease modules is significantly enriched with brain-expressed genes and with genetic variants that were implicated in a GWAS study, which could imply a causal role in schizophrenia etiology. The most highly connected intramodular hub gene in this module (ABCF1), is located in, and regulated by the major histocompatibility (MHC) complex, which is intriguing in light of the fact that common allelic variants from the MHC region have been implicated in schizophrenia. This suggests that the MHC increases schizophrenia susceptibility via altered gene expression of regulatory genes in this network.

  2. Identification of therapeutic targets for Alzheimer's disease via differentially expressed gene and weighted gene co-expression network analyses.

    PubMed

    Jia, Yujie; Nie, Kun; Li, Jing; Liang, Xinyue; Zhang, Xuezhu

    2016-11-01

    In order to investigate the pathogenic targets and associated biological process of Alzheimer's disease in the present study, mRNA expression profiles (GSE28146) and microRNA (miRNA) expression profiles (GSE16759) were downloaded from the Gene Expression Omnibus database. In GSE28146, eight control samples and Alzheimer's disease samples (seven incipient, eight moderate and seven severe) were included. The Affy package in R was used for background correction and normalization of the raw microarray data. The differentially expressed genes (DEGs) and differentially expressed miRNAs were identified using the Limma package. In addition, mRNAs were clustered using weighted gene correlation network analysis, and modules found to be significantly associated with the stages of Alzheimer's disease were screened out. The Database for Annotation, Visualization, and Integrated Discovery was used to perform Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses. The target genes of the differentially expressed miRNAs were identified using the miRWalk database. Compared with the control samples, 175, 59 and 90 DEGs were identified in the incipient, moderate and severe Alzheimer's disease samples, respectively. A module, which contained 1,592 genes, was found to be closely associated with the stage of Alzheimer's disease and biological processes. In addition, pathways associated with Alzheimer's disease and other neurological diseases were found to be enriched in those genes. A total of 139 overlapped genes were identified between those genes and the DEGs in the three groups. From the miRNA expression profiles, 189 miRNAs were found differentially expressed in the samples from patients with Alzheimer's disease and 1,647 target genes were obtained. In addition, five overlapped genes were identified between those 1,647 target genes and the 139 genes, and these genes may be important pathogenic targets for Alzheimer

  3. From SNP co-association to RNA co-expression: novel insights into gene networks for intramuscular fatty acid composition in porcine.

    PubMed

    Ramayo-Caldas, Yuliaxis; Ballester, Maria; Fortes, Marina R S; Esteve-Codina, Anna; Castelló, Anna; Noguera, Jose L; Fernández, Ana I; Pérez-Enciso, Miguel; Reverter, Antonio; Folch, Josep M

    2014-03-26

    Fatty acids (FA) play a critical role in energy homeostasis and metabolic diseases; in the context of livestock species, their profile also impacts on meat quality for healthy human consumption. Molecular pathways controlling lipid metabolism are highly interconnected and are not fully understood. Elucidating these molecular processes will aid technological development towards improvement of pork meat quality and increased knowledge of FA metabolism, underpinning metabolic diseases in humans. The results from genome-wide association studies (GWAS) across 15 phenotypes were subjected to an Association Weight Matrix (AWM) approach to predict a network of 1,096 genes related to intramuscular FA composition in pigs. To identify the key regulators of FA metabolism, we focused on the minimal set of transcription factors (TF) that explored the majority of the network topology. Pathway and network analyses pointed towards a trio of TF as key regulators of FA metabolism: NCOA2, FHL2 and EP300. Promoter sequence analyses confirmed that these TF have binding sites for some well-known regulators of lipid and carbohydrate metabolism. For the first time in a non-model species, some of the co-associations observed at the genetic level were validated through co-expression at the transcriptomic level based on real-time PCR of 40 genes in adipose tissue, and a further 55 genes in liver. In particular, liver expression of NCOA2 and EP300 differed between pig breeds (Iberian and Landrace) extreme in terms of fat deposition. Highly clustered co-expression networks in both liver and adipose tissues were observed. EP300 and NCOA2 showed centrality parameters above average in both networks. Over all genes, co-expression analyses confirmed 28.9% of the AWM predicted gene-gene interactions in liver and 33.0% in adipose tissue. The magnitude of this validation varied across genes, with up to 60.8% of the connections of NCOA2 in adipose tissue being validated via co-expression. Our

  4. In silico identification of miRNAs and their target genes and analysis of gene co-expression network in saffron (Crocus sativus L.) stigma

    PubMed Central

    Zinati, Zahra; Shamloo-Dashtpagerdi, Roohollah; Behpouri, Ali

    2016-01-01

    As an aromatic and colorful plant of substantive taste, saffron (Crocus sativus L.) owes such properties of matter to growing class of the secondary metabolites derived from the carotenoids, apocarotenoids. Regarding the critical role of microRNAs in secondary metabolic synthesis and the limited number of identified miRNAs in C. sativus, on the other hand, one may see the point how the characterization of miRNAs along with the corresponding target genes in C. sativus might expand our perspectives on the roles of miRNAs in carotenoid/apocarotenoid biosynthetic pathway. A computational analysis was used to identify miRNAs and their targets using EST (Expressed Sequence Tag) library from mature saffron stigmas. Then, a gene co-expression network was constructed to identify genes which are potentially involved in carotenoid/apocarotenoid biosynthetic pathways. EST analysis led to the identification of two putative miRNAs (miR414 and miR837-5p) along with the corresponding stem-looped precursors. To our knowledge, this is the first report on miR414 and miR837-5p in C. sativus. Co-expression network analysis indicated that miR414 and miR837-5p may play roles in C. sativus metabolic pathways and led to identification of candidate genes including six transcription factors and one protein kinase probably involved in carotenoid/apocarotenoid biosynthetic pathway. Presence of transcription factors, miRNAs and protein kinase in the network indicated multiple layers of regulation in saffron stigma. The candidate genes from this study may help unraveling regulatory networks underlying the carotenoid/apocarotenoid biosynthesis in saffron and designing metabolic engineering for enhanced secondary metabolites. PMID:28261627

  5. In silico identification of miRNAs and their target genes and analysis of gene co-expression network in saffron (Crocus sativus L.) stigma.

    PubMed

    Zinati, Zahra; Shamloo-Dashtpagerdi, Roohollah; Behpouri, Ali

    2016-12-01

    As an aromatic and colorful plant of substantive taste, saffron (Crocus sativus L.) owes such properties of matter to growing class of the secondary metabolites derived from the carotenoids, apocarotenoids. Regarding the critical role of microRNAs in secondary metabolic synthesis and the limited number of identified miRNAs in C. sativus, on the other hand, one may see the point how the characterization of miRNAs along with the corresponding target genes in C. sativus might expand our perspectives on the roles of miRNAs in carotenoid/apocarotenoid biosynthetic pathway. A computational analysis was used to identify miRNAs and their targets using EST (Expressed Sequence Tag) library from mature saffron stigmas. Then, a gene co-expression network was constructed to identify genes which are potentially involved in carotenoid/apocarotenoid biosynthetic pathways. EST analysis led to the identification of two putative miRNAs (miR414 and miR837-5p) along with the corresponding stem-looped precursors. To our knowledge, this is the first report on miR414 and miR837-5p in C. sativus. Co-expression network analysis indicated that miR414 and miR837-5p may play roles in C. sativus metabolic pathways and led to identification of candidate genes including six transcription factors and one protein kinase probably involved in carotenoid/apocarotenoid biosynthetic pathway. Presence of transcription factors, miRNAs and protein kinase in the network indicated multiple layers of regulation in saffron stigma. The candidate genes from this study may help unraveling regulatory networks underlying the carotenoid/apocarotenoid biosynthesis in saffron and designing metabolic engineering for enhanced secondary metabolites.

  6. Inference of Longevity-Related Genes from a Robust Coexpression Network of Seed Maturation Identifies Regulators Linking Seed Storability to Biotic Defense-Related Pathways

    PubMed Central

    Righetti, Karima; Vu, Joseph Ly; Pelletier, Sandra; Vu, Benoit Ly; Glaab, Enrico; Lalanne, David; Pasha, Asher; Patel, Rohan V.; Provart, Nicholas J.; Verdier, Jerome; Leprince, Olivier

    2015-01-01

    Seed longevity, the maintenance of viability during storage, is a crucial factor for preservation of genetic resources and ensuring proper seedling establishment and high crop yield. We used a systems biology approach to identify key genes regulating the acquisition of longevity during seed maturation of Medicago truncatula. Using 104 transcriptomes from seed developmental time courses obtained in five growth environments, we generated a robust, stable coexpression network (MatNet), thereby capturing the conserved backbone of maturation. Using a trait-based gene significance measure, a coexpression module related to the acquisition of longevity was inferred from MatNet. Comparative analysis of the maturation processes in M. truncatula and Arabidopsis thaliana seeds and mining Arabidopsis interaction databases revealed conserved connectivity for 87% of longevity module nodes between both species. Arabidopsis mutant screening for longevity and maturation phenotypes demonstrated high predictive power of the longevity cross-species network. Overrepresentation analysis of the network nodes indicated biological functions related to defense, light, and auxin. Characterization of defense-related wrky3 and nf-x1-like1 (nfxl1) transcription factor mutants demonstrated that these genes regulate some of the network nodes and exhibit impaired acquisition of longevity during maturation. These data suggest that seed longevity evolved by co-opting existing genetic pathways regulating the activation of defense against pathogens. PMID:26410298

  7. Transcriptome comparison and gene coexpression network analysis provide a systems view of citrus response to ‘Candidatus Liberibacter asiaticus’ infection

    PubMed Central

    2013-01-01

    Background Huanglongbing (HLB) is arguably the most destructive disease for the citrus industry. HLB is caused by infection of the bacterium, Candidatus Liberibacter spp. Several citrus GeneChip studies have revealed thousands of genes that are up- or down-regulated by infection with Ca. Liberibacter asiaticus. However, whether and how these host genes act to protect against HLB remains poorly understood. Results As a first step towards a mechanistic view of citrus in response to the HLB bacterial infection, we performed a comparative transcriptome analysis and found that a total of 21 Probesets are commonly up-regulated by the HLB bacterial infection. In addition, a number of genes are likely regulated specifically at early, late or very late stages of the infection. Furthermore, using Pearson correlation coefficient-based gene coexpression analysis, we constructed a citrus HLB response network consisting of 3,507 Probesets and 56,287 interactions. Genes involved in carbohydrate and nitrogen metabolic processes, transport, defense, signaling and hormone response were overrepresented in the HLB response network and the subnetworks for these processes were constructed. Analysis of the defense and hormone response subnetworks indicates that hormone response is interconnected with defense response. In addition, mapping the commonly up-regulated HLB responsive genes into the HLB response network resulted in a core subnetwork where transport plays a key role in the citrus response to the HLB bacterial infection. Moreover, analysis of a phloem protein subnetwork indicates a role for this protein and zinc transporters or zinc-binding proteins in the citrus HLB defense response. Conclusion Through integrating transcriptome comparison and gene coexpression network analysis, we have provided for the first time a systems view of citrus in response to the Ca. Liberibacter spp. infection causing HLB. PMID:23324561
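    The Pearson correlation coefficient-based network construction mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy expression values, and the 0.9 cutoff are all assumptions chosen for illustration, and the abstract does not state which threshold was used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (probe1, probe2, r) for every pair with |r| >= threshold.

    `expression` maps a probe set name to its expression values across
    samples. The threshold is a hypothetical cutoff, not one taken from
    the study.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy data (invented): probeA and probeB rise together, probeC does not.
toy = {
    "probeA": [1.0, 2.0, 3.0, 4.0],
    "probeB": [2.0, 4.0, 6.0, 8.0],
    "probeC": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(toy, threshold=0.9)
```

    On the toy data only the probeA/probeB pair passes the cutoff. Real analyses of thousands of probe sets would normally use a vectorized correlation matrix rather than a pairwise loop.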

  8. Expression atlas and comparative coexpression network analyses reveal important genes involved in the formation of lignified cell wall in Brachypodium distachyon.

    PubMed

    Sibout, Richard; Proost, Sebastian; Hansen, Bjoern Oest; Vaid, Neha; Giorgi, Federico M; Ho-Yue-Kuang, Severine; Legée, Frédéric; Cézart, Laurent; Bouchabké-Coussa, Oumaya; Soulhat, Camille; Provart, Nicholas; Pasha, Asher; Le Bris, Philippe; Roujol, David; Hofte, Herman; Jamet, Elisabeth; Lapierre, Catherine; Persson, Staffan; Mutwil, Marek

    2017-08-01

    While Brachypodium distachyon (Brachypodium) is an emerging model for grasses, no expression atlas or gene coexpression network is available. Such tools are of high importance to provide insights into the function of Brachypodium genes. We present a detailed Brachypodium expression atlas, capturing gene expression in its major organs at different developmental stages. The data were integrated into a large-scale coexpression database (www.gene2function.de), enabling identification of duplicated pathways and conserved processes across 10 plant species, thus allowing genome-wide inference of gene function. We highlight the importance of the atlas and the platform through the identification of duplicated cell wall modules, and show that a lignin biosynthesis module is conserved across angiosperms. We identified and functionally characterised a putative ferulate 5-hydroxylase gene through overexpression of it in Brachypodium, which resulted in an increase in lignin syringyl units and reduced lignin content of mature stems, and led to improved saccharification of the stem biomass. Our Brachypodium expression atlas thus provides a powerful resource to reveal functionally related genes, which may advance our understanding of important biological processes in grasses. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.

  9. Protein Co-Expression Network Analysis (ProCoNA)

    SciTech Connect

    Gibbs, David L.; Baratt, Arie; Baric, Ralph; Kawaoka, Yoshihiro; Smith, Richard D.; Orwoll, Eric S.; Katze, Michael G.; Mcweeney, Shannon K.

    2013-06-01

    Biological networks are important for elucidating disease etiology due to their ability to model complex high dimensional data and biological systems. Proteomics provides a critical data source for such models, but currently lacks robust de novo methods for network construction, which could bring important insights in systems biology. We have evaluated the construction of network models using methods derived from weighted gene co-expression network analysis (WGCNA). We show that approximately scale-free peptide networks, composed of statistically significant modules, are feasible and biologically meaningful using two mouse lung experiments and one human plasma experiment. Within each network, peptides derived from the same protein are shown to have a statistically higher topological overlap and concordance in abundance, which is potentially important for inferring protein abundance. The module representatives, called eigenpeptides, correlate significantly with biological phenotypes. Furthermore, within modules, we find significant enrichment for biological function and known interactions (gene ontology and protein-protein interactions). Biological networks are important tools in the analysis of complex systems. In this paper we evaluate the application of weighted co-expression network analysis to quantitative proteomics data. Protein co-expression networks allow novel approaches for biological interpretation, quality control, inference of protein abundance, a framework for potentially resolving degenerate peptide-protein mappings, and a biomarker signature discovery.
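    The WGCNA-derived quantities named in this abstract (a weighted network and topological overlap) can be sketched as follows. This is a hedged illustration of the standard unsigned WGCNA definitions, not the ProCoNA implementation: the function names, the toy correlation matrix, and the soft-thresholding power are assumptions.

```python
def soft_adjacency(cor, beta=6):
    """Unsigned WGCNA-style adjacency: a_ij = |cor_ij| ** beta.

    `beta` is the soft-thresholding power; 6 is only an illustrative
    default here, and in practice it is chosen per data set.
    """
    return [[abs(r) ** beta for r in row] for row in cor]


def topological_overlap(adj):
    """Topological overlap matrix from a symmetric adjacency matrix.

    TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where
    l_ij sums a_iu * a_uj over shared neighbours u (u != i, j) and
    k_i is the connectivity of node i (sum of a_iu over u != i).
    """
    n = len(adj)
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]  # diagonal set to 1 by convention
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
    return tom


# Toy peptide-peptide correlation matrix (invented values).
cor = [
    [1.0, 0.9, 0.8],
    [0.9, 1.0, 0.7],
    [0.8, 0.7, 1.0],
]
adj = soft_adjacency(cor, beta=2)
tom = topological_overlap(adj)
```

    Raising correlations to a power suppresses weak links without a hard cutoff, which is what makes the resulting network weighted. Peptide pairs that share many strong neighbours receive a high overlap score.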

  10. A developmental transcriptional network for maize defines coexpression modules.

    PubMed

    Downs, Gregory S; Bi, Yong-Mei; Colasanti, Joseph; Wu, Wenqing; Chen, Xi; Zhu, Tong; Rothstein, Steven J; Lukens, Lewis N

    2013-04-01

    Here, we present a genome-wide overview of transcriptional circuits in the agriculturally significant crop species maize (Zea mays). We examined transcript abundance data at 50 developmental stages, from embryogenesis to senescence, for 34,876 gene models and classified genes into 24 robust coexpression modules. Modules were strongly associated with tissue types and related biological processes. Sixteen of the 24 modules (67%) have preferential transcript abundance within specific tissues. One-third of modules had an absence of gene expression in specific tissues. Genes within a number of modules also correlated with the developmental age of tissues. Coexpression of genes is likely due to transcriptional control. For a number of modules, key genes involved in transcriptional control have expression profiles that mimic the expression profiles of module genes, although the expression of transcriptional control genes is not unusually representative of module gene expression. Known regulatory motifs are enriched in several modules. Finally, of the 13 network modules with more than 200 genes, three contain genes that are notably clustered (P < 0.05) within the genome. This work, based on a carefully selected set of major tissues representing diverse stages of maize development, demonstrates the remarkable power of transcript-level coexpression networks to identify underlying biological processes and their molecular components.

  11. Network-Based Identification of Biomarkers Coexpressed with Multiple Pathways

    PubMed Central

    Guo, Nancy Lan; Wan, Ying-Wooi

    2014-01-01

    Unraveling complex molecular interactions and networks and incorporating clinical information in modeling will present a paradigm shift in molecular medicine. Embedding biological relevance via modeling molecular networks and pathways has become increasingly important for biomarker identification in cancer susceptibility and metastasis studies. Here, we give a comprehensive overview of computational methods used for biomarker identification, and provide a performance comparison of several network models used in studies of cancer susceptibility, disease progression, and prognostication. Specifically, we evaluated implication networks, Boolean networks, Bayesian networks, and Pearson’s correlation networks in constructing gene coexpression networks for identifying lung cancer diagnostic and prognostic biomarkers. The results show that implication networks, implemented in Genet package, identified sets of biomarkers that generated an accurate prediction of lung cancer risk and metastases; meanwhile, implication networks revealed more biologically relevant molecular interactions than Boolean networks, Bayesian networks, and Pearson’s correlation networks when evaluated with MSigDB database. PMID:25392692

  12. A Network Approach of Gene Co-expression in the Zea mays/Aspergillus flavus Pathosystem to Map Host/Pathogen Interaction Pathways

    PubMed Central

    Musungu, Bryan M.; Bhatnagar, Deepak; Brown, Robert L.; Payne, Gary A.; OBrian, Greg; Fakhoury, Ahmad M.; Geisler, Matt

    2016-01-01

    A gene co-expression network (GEN) was generated using a dual RNA-seq study with the fungal pathogen Aspergillus flavus and its plant host Zea mays during the initial 3 days of infection. The analysis deciphered novel pathways and mapped genes of interest in both organisms during the infection. This network revealed a high degree of connectivity in many of the previously recognized pathways in Z. mays such as jasmonic acid, ethylene, and reactive oxygen species (ROS). For the pathogen A. flavus, a link between aflatoxin production and vesicular transport was identified within the network. There was significant interspecies correlation of expression between Z. mays and A. flavus for a subset of 104 Z. mays, and 1942 A. flavus genes. This resulted in an interspecies subnetwork enriched in multiple Z. mays genes involved in the production of ROS. In addition to the ROS from Z. mays, there was enrichment in the vesicular transport pathways and the aflatoxin pathway for A. flavus. Included in these genes, a key aflatoxin cluster regulator, AflS, was found to be co-regulated with multiple Z. mays ROS producing genes within the network, suggesting AflS may be monitoring host ROS levels. The entire GEN for both host and pathogen, and the subset of interspecies correlations, is presented as a tool for hypothesis generation and discovery for events in the early stages of fungal infection of Z. mays by A. flavus. PMID:27917194

  13. A Network Approach of Gene Co-expression in the Zea mays/Aspergillus flavus Pathosystem to Map Host/Pathogen Interaction Pathways.

    PubMed

    Musungu, Bryan M; Bhatnagar, Deepak; Brown, Robert L; Payne, Gary A; OBrian, Greg; Fakhoury, Ahmad M; Geisler, Matt

    2016-01-01

    A gene co-expression network (GEN) was generated using a dual RNA-seq study with the fungal pathogen Aspergillus flavus and its plant host Zea mays during the initial 3 days of infection. The analysis deciphered novel pathways and mapped genes of interest in both organisms during the infection. This network revealed a high degree of connectivity in many of the previously recognized pathways in Z. mays such as jasmonic acid, ethylene, and reactive oxygen species (ROS). For the pathogen A. flavus, a link between aflatoxin production and vesicular transport was identified within the network. There was significant interspecies correlation of expression between Z. mays and A. flavus for a subset of 104 Z. mays, and 1942 A. flavus genes. This resulted in an interspecies subnetwork enriched in multiple Z. mays genes involved in the production of ROS. In addition to the ROS from Z. mays, there was enrichment in the vesicular transport pathways and the aflatoxin pathway for A. flavus. Included in these genes, a key aflatoxin cluster regulator, AflS, was found to be co-regulated with multiple Z. mays ROS producing genes within the network, suggesting AflS may be monitoring host ROS levels. The entire GEN for both host and pathogen, and the subset of interspecies correlations, is presented as a tool for hypothesis generation and discovery for events in the early stages of fungal infection of Z. mays by A. flavus.

  14. Gene coexpression measures in large heterogeneous samples using count statistics.

    PubMed

    Wang, Y X Rachel; Waterman, Michael S; Huang, Haiyan

    2014-11-18

    With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the "big data" challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance.

  15. Integration of liver gene co-expression networks and eGWAs analyses highlighted candidate regulators implicated in lipid metabolism in pigs.

    PubMed

    Ballester, Maria; Ramayo-Caldas, Yuliaxis; Revilla, Manuel; Corominas, Jordi; Castelló, Anna; Estellé, Jordi; Fernández, Ana I; Folch, Josep M

    2017-04-19

    In the present study, liver co-expression networks and expression Genome Wide Association Study (eGWAS) were performed to identify DNA variants and molecular pathways implicated in the functional regulatory mechanisms of meat quality traits in pigs. For this purpose, the liver mRNA expression of 44 candidate genes related to lipid metabolism was analysed in 111 Iberian x Landrace backcross animals. The eGWAS identified 92 eSNPs located in seven chromosomal regions and associated with eight genes: CROT, CYP2U1, DGAT1, EGF, FABP1, FABP5, PLA2G12A, and PPARA. Remarkably, cis-eSNPs associated with FABP1 gene expression which may be determining the C18:2(n-6)/C18:3(n-3) ratio in backfat through the multiple interaction of DNA variants and genes were identified. Furthermore, a hotspot on SSC8 associated with the gene expression of eight genes was identified and the TBCK gene was pointed out as candidate gene regulating it. Our results also suggested that the PI3K-Akt-mTOR pathway plays an important role in the control of the analysed genes, highlighting nuclear receptors such as NR3C1 or PPARA. Finally, sex-dimorphism associated with hepatic lipid metabolism was identified with over-representation of female-biased genes. These results increase our knowledge of the genetic architecture underlying fat composition traits.

  16. Comprehensive meta-analysis, co-expression, and miRNA nested network analysis identifies gene candidates in citrus against Huanglongbing disease.

    PubMed

    Rawat, Nidhi; Kiran, Sandhya P; Du, Dongliang; Gmitter, Fred G; Deng, Zhanao

    2015-07-28

    Huanglongbing (HLB), the most devastating disease of citrus, is associated with infection by Candidatus Liberibacter asiaticus (CaLas) and is vectored by the Asian citrus psyllid (ACP). Recently, the molecular basis of citrus-HLB interactions has been examined using transcriptome analyses, and these analyses have identified many probe sets and pathways modulated by CaLas infection among different citrus cultivars. However, lack of consistency among reported findings indicates that an integrative approach is needed. This study was designed to identify the candidate probe sets in citrus-HLB interactions using meta-analysis and gene co-expression network modelling. Twenty-two publicly available transcriptome studies on citrus-HLB interactions, comprising 18 susceptible (S) datasets and four resistant (R) datasets, were investigated using Limma and RankProd methods of meta-analysis. A combined list of 7,412 differentially expressed probe sets was generated using a Teradata in-house Structured Query Language (SQL) script. We identified the 65 most common probe sets modulated in HLB disease among different tissues from the S and R datasets. Gene ontology analysis of these probe sets suggested that carbohydrate metabolism, nutrient transport, and biotic stress were the core pathways that were modulated in citrus by CaLas infection and HLB development. We also identified R-specific probe sets, which encoded leucine-rich repeat proteins, chitinase, constitutive disease resistance (CDR), miraculins, and lectins. Weighted gene co-expression network analysis (WGCNA) was conducted on 3,499 probe sets, and 21 modules with major hub probe sets were identified. Further, a miRNA nested network was created to examine gene regulation of the 3,499 target probe sets. Results suggest that csi-miR167 and csi-miR396 could affect ion transporters and defence response pathways, respectively. Most of the potential candidate hub probe sets were co-expressed with gibberellin pathway (GA

  17. Functional differentiation and spatial-temporal co-expression networks of the NBS-encoding gene family in Jilin ginseng, Panax ginseng C.A. Meyer

    PubMed Central

    Wang, Kangyu; Lin, Yanping; Wang, Yanfang; Sun, Chunyu; Wang, Yi

    2017-01-01

    Ginseng, Panax ginseng C.A. Meyer, is one of the most important medicinal plants for human health and medicine. It has been documented that over 80% of genes conferring resistance to bacteria, viruses, fungi and nematodes are contributed by the nucleotide binding site (NBS)-encoding gene family. Therefore, identification and characterization of NBS genes expressed in ginseng are paramount to its genetic improvement and breeding. However, little is known about the NBS-encoding genes in ginseng. Here we report genome-wide identification and systems analysis of the NBS genes actively expressed in ginseng (PgNBS genes). Four hundred twelve PgNBS gene transcripts, derived from 284 gene models, were identified from the transcriptomes of 14 ginseng tissues. These genes were classified into eight types, including TNL, TN, CNL, CN, NL, N, RPW8-NL and RPW8-N. Seven conserved motifs were identified in both the Toll/interleukine-1 receptor (TIR) and coiled-coil (CC) typed genes whereas six were identified in the RPW8 typed genes. Phylogenetic analysis showed that the PgNBS gene family is an ancient family, with a vast majority of its genes originated before ginseng originated. In spite of their belonging to a family, the PgNBS genes have functionally dramatically differentiated and been categorized into numerous functional categories. The expressions of the across tissues, different aged roots and the roots of different genotypes. However, they are coordinating in expression, forming a single co-expression network. These results provide a deeper understanding of the origin, evolution and functional differentiation and expression dynamics of the NBS-encoding gene family in plants in general and in ginseng particularly, and a NBS gene toolkit useful for isolation and characterization of disease resistance genes and for enhanced disease resistance breeding in ginseng and related species. PMID:28727829

  18. Functional differentiation and spatial-temporal co-expression networks of the NBS-encoding gene family in Jilin ginseng, Panax ginseng C.A. Meyer.

    PubMed

    Yin, Rui; Zhao, Mingzhu; Wang, Kangyu; Lin, Yanping; Wang, Yanfang; Sun, Chunyu; Wang, Yi; Zhang, Meiping

    2017-01-01

    Ginseng, Panax ginseng C.A. Meyer, is one of the most important medicinal plants for human health and medicine. It has been documented that over 80% of genes conferring resistance to bacteria, viruses, fungi and nematodes are contributed by the nucleotide binding site (NBS)-encoding gene family. Therefore, identification and characterization of NBS genes expressed in ginseng are paramount to its genetic improvement and breeding. However, little is known about the NBS-encoding genes in ginseng. Here we report genome-wide identification and systems analysis of the NBS genes actively expressed in ginseng (PgNBS genes). Four hundred twelve PgNBS gene transcripts, derived from 284 gene models, were identified from the transcriptomes of 14 ginseng tissues. These genes were classified into eight types, including TNL, TN, CNL, CN, NL, N, RPW8-NL and RPW8-N. Seven conserved motifs were identified in both the Toll/interleukine-1 receptor (TIR) and coiled-coil (CC) typed genes whereas six were identified in the RPW8 typed genes. Phylogenetic analysis showed that the PgNBS gene family is an ancient family, with a vast majority of its genes originated before ginseng originated. In spite of their belonging to a family, the PgNBS genes have functionally dramatically differentiated and been categorized into numerous functional categories. The expressions of the across tissues, different aged roots and the roots of different genotypes. However, they are coordinating in expression, forming a single co-expression network. These results provide a deeper understanding of the origin, evolution and functional differentiation and expression dynamics of the NBS-encoding gene family in plants in general and in ginseng particularly, and a NBS gene toolkit useful for isolation and characterization of disease resistance genes and for enhanced disease resistance breeding in ginseng and related species.

  19. Integrating mRNA and miRNA Weighted Gene Co-Expression Networks with eQTLs in the Nucleus Accumbens of Subjects with Alcohol Dependence

    PubMed Central

    Blevins, Tana; Aliev, Fazil; Adkins, Amy; Hack, Laura; Bigdeli, Tim; D. van der Vaart, Andrew; Web, Bradley Todd; Bacanu, Silviu-Alin; Kalsi, Gursharan; Kendler, Kenneth S.; Miles, Michael F.; Dick, Danielle; Riley, Brien P.; Dumur, Catherine; Vladimirov, Vladimir I.

    2015-01-01

    Alcohol consumption is known to lead to gene expression changes in the brain. After performing weighted gene co-expression network analyses (WGCNA) on genome-wide mRNA and microRNA (miRNA) expression in Nucleus Accumbens (NAc) of subjects with alcohol dependence (AD; N = 18) and of matched controls (N = 18), six mRNA and three miRNA modules significantly correlated with AD were identified (Bonferroni-adj. p≤ 0.05). Cell-type-specific transcriptome analyses revealed two of the mRNA modules to be enriched for neuronal specific marker genes and downregulated in AD, whereas the remaining four mRNA modules were enriched for astrocyte and microglial specific marker genes and upregulated in AD. Gene set enrichment analysis demonstrated that neuronal specific modules were enriched for genes involved in oxidative phosphorylation, mitochondrial dysfunction and MAPK signaling. Glial-specific modules were predominantly enriched for genes involved in processes related to immune functions, i.e. cytokine signaling (all adj. p≤ 0.05). In mRNA and miRNA modules, 461 and 25 candidate hub genes were identified, respectively. In contrast to the expected biological functions of miRNAs, correlation analyses between mRNA and miRNA hub genes revealed a higher number of positive than negative correlations (χ2 test p≤ 0.0001). Integration of hub gene expression with genome-wide genotypic data resulted in 591 mRNA cis-eQTLs and 62 miRNA cis-eQTLs. mRNA cis-eQTLs were significantly enriched for AD diagnosis and AD symptom counts (adj. p = 0.014 and p = 0.024, respectively) in AD GWAS signals in a large, independent genetic sample from the Collaborative Study on Genetics of Alcohol (COGA). In conclusion, our study identified putative gene network hubs coordinating mRNA and miRNA co-expression changes in the NAc of AD subjects, and our genetic (cis-eQTL) analysis provides novel insights into the etiological mechanisms of AD. PMID:26381263

  20. Global transcriptional responses of denitrifying bacteria to functionalized single-walled carbon nanotubes revealed by weighted gene-coexpression network analysis.

    PubMed

    Zheng, Xiong; Su, Yinglong; Chen, Yinguang; Huang, Haining; Shen, Qiuting

    2017-09-24

    Functionalized single-walled carbon nanotubes (f-SWNTs) are widely used in many fields due to their unique structure and excellent properties. Although these nanomaterials have been reported to cause negative effects on denitrifying bacteria once they enter the environment, the toxic behaviors and regulatory mechanisms of f-SWNTs to denitrification remain unclear. In this study, the denitrification performance of a model denitrifier exposed to pristine and functionalized SWNTs was investigated, and the global transcriptional responses were comprehensively explored by RNA-seq and weighted gene-coexpression network analysis (WGCNA). Although both hydroxyl SWNTs (SWNTs-OH) and carboxyl SWNTs (SWNTs-COOH) showed inhibitory effects on bacterial denitrification, the former more severely inhibited denitrification than the latter. Transcriptional profiles showed that compared with SWNTs-COOH, SWNTs-OH much more strongly influenced the expressions of the key genes related to signal transduction, substance transport, electron transfer and transcriptional regulation. Functional analysis further indicated that the genes associated with substrate transport, carbon source metabolism and electron transfer underwent dramatic down-regulation. Using WGCNA, 12 gene modules were established corresponding to various types of carbon nanotubes, and eigengene adjacency analysis revealed the key gene modules related to denitrification performance under different conditions. Hub gene network analysis revealed the key regulatory factors of bacterial denitrification induced by f-SWNTs. The results suggested that f-SWNTs modulated the key genes responsible for the glycerolipid/free fatty acid (GL/FFA) cycle, and thus disturbed processes associated with denitrification, including signaling process, energy homeostasis, intracellular redox balance and transportation. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Divergent and convergent modes of interaction between wheat and Puccinia graminis f. sp. tritici isolates revealed by the comparative gene co-expression network and genome analyses.

    PubMed

    Rutter, William B; Salcedo, Andres; Akhunova, Alina; He, Fei; Wang, Shichen; Liang, Hanquan; Bowden, Robert L; Akhunov, Eduard

    2017-04-12

    Two opposing evolutionary constraints exert pressure on plant pathogens: one to diversify virulence factors in order to evade plant defenses, and the other to retain virulence factors critical for maintaining a compatible interaction with the plant host. To better understand how the diversified arsenals of fungal genes promote interaction with the same compatible wheat line, we performed a comparative genomic analysis of two North American isolates of Puccinia graminis f. sp. tritici (Pgt). The patterns of inter-isolate divergence in the secreted candidate effector genes were compared with the levels of conservation and divergence of plant-pathogen gene co-expression networks (GCN) developed for each isolate. Comparative genomic analyses revealed a substantial level of inter-isolate divergence in effector gene complement and sequence divergence. Gene Ontology (GO) analyses of the conserved and unique parts of the isolate-specific GCNs identified a number of conserved host pathways targeted by both isolates. Interestingly, the degree of inter-isolate sub-network conservation varied widely for the different host pathways and was positively associated with the proportion of conserved effector candidates associated with each sub-network. While different Pgt isolates tended to exploit similar wheat pathways for infection, the mode of plant-pathogen interaction varied for different pathways with some pathways being associated with the conserved set of effectors and others being linked with the diverged or isolate-specific effectors. Our data suggest that at the intra-species level pathogen populations likely maintain divergent sets of effectors capable of targeting the same plant host pathways. This functional redundancy may play an important role in the dynamic of the "arms-race" between host and pathogen serving as the basis for diverse virulence strategies and creating conditions where mutations in certain effector groups will not have a major effect on the pathogen

  2. Proteome Profiling Outperforms Transcriptome Profiling for Coexpression Based Gene Function Prediction

    SciTech Connect

    Wang, Jing; Ma, Zihao; Carr, Steven A.; Mertins, Philipp; Zhang, Hui; Zhang, Zhen; Chan, Daniel W.; Ellis, Matthew J. C.; Townsend, R. Reid; Smith, Richard D.; McDermott, Jason E.; Chen, Xian; Paulovich, Amanda G.; Boja, Emily S.; Mesri, Mehdi; Kinsinger, Christopher R.; Rodriguez, Henry; Rodland, Karin D.; Liebler, Daniel C.; Zhang, Bing

    2016-11-11

    Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies

  3. Proteome Profiling Outperforms Transcriptome Profiling for Coexpression Based Gene Function Prediction.

    PubMed

    Wang, Jing; Ma, Zihao; Carr, Steven A; Mertins, Philipp; Zhang, Hui; Zhang, Zhen; Chan, Daniel W; Ellis, Matthew J C; Townsend, R Reid; Smith, Richard D; McDermott, Jason E; Chen, Xian; Paulovich, Amanda G; Boja, Emily S; Mesri, Mehdi; Kinsinger, Christopher R; Rodriguez, Henry; Rodland, Karin D; Liebler, Daniel C; Zhang, Bing

    2017-01-01

    Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this "guilt-by-association" (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies. 

  4. Proteome Profiling Outperforms Transcriptome Profiling for Coexpression Based Gene Function Prediction*

    PubMed Central

    Wang, Jing; Ma, Zihao; Carr, Steven A.; Mertins, Philipp; Zhang, Hui; Zhang, Zhen; Chan, Daniel W.; Ellis, Matthew J. C.; Townsend, R. Reid; Smith, Richard D.; McDermott, Jason E.; Chen, Xian; Paulovich, Amanda G.; Boja, Emily S.; Mesri, Mehdi; Kinsinger, Christopher R.; Rodriguez, Henry; Rodland, Karin D.; Liebler, Daniel C.; Zhang, Bing

    2017-01-01

    Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies. PMID

  5. ChlamyNET: a Chlamydomonas gene co-expression network reveals global properties of the transcriptome and the early setup of key co-expression patterns in the green lineage.

    PubMed

    Romero-Campero, Francisco J; Perez-Hurtado, Ignacio; Lucas-Reina, Eva; Romero, Jose M; Valverde, Federico

    2016-03-12

    Chlamydomonas reinhardtii is the model organism that serves as a reference for studies in algal genomics and physiology. It is of special interest in the study of the evolution of regulatory pathways from algae to higher plants. Additionally, it has recently gained attention as a potential source for bio-fuel and bio-hydrogen production. The genome of Chlamydomonas is available, facilitating the analysis of its transcriptome by RNA-seq data. This has produced a massive amount of data that remains fragmented, making necessary the application of integrative approaches based on molecular systems biology. We constructed a gene co-expression network based on RNA-seq data and developed a web-based tool, ChlamyNET, for the exploration of the Chlamydomonas transcriptome. ChlamyNET exhibits a scale-free and small world topology. Applying clustering techniques, we identified nine gene clusters that capture the structure of the transcriptome under the analyzed conditions. One of the most central clusters was shown to be involved in carbon/nitrogen metabolism and signalling, whereas one of the most peripheral clusters was involved in DNA replication and cell cycle regulation. The transcription factors and regulators in the Chlamydomonas genome have been identified in ChlamyNET. The biological processes potentially regulated by them as well as their putative transcription factor binding sites were determined. The putative light regulated transcription factors and regulators in the Chlamydomonas genome were analyzed in order to provide a case study on the use of ChlamyNET. Finally, we used an independent data set to cross-validate the predictive power of ChlamyNET. The topological properties of ChlamyNET suggest that the Chlamydomonas transcriptome possesses important characteristics related to error tolerance, vulnerability and information propagation. The central part of ChlamyNET constitutes the core of the transcriptome where most authoritative hub genes are located

  6. Transcriptional modules related to hepatocellular carcinoma survival: coexpression network analysis.

    PubMed

    Xu, Xinsen; Zhou, Yanyan; Miao, Runchen; Chen, Wei; Qu, Kai; Pang, Qing; Liu, Chang

    2016-06-01

    We performed weighted gene coexpression network analysis (WGCNA) to gain insights into the molecular aspects of hepatocellular carcinoma (HCC). Raw microarray datasets (including 488 samples) were downloaded from the Gene Expression Omnibus (GEO) website. Data were normalized using the RMA algorithm. We utilized the WGCNA to identify the coexpressed genes (modules) after non-specific filtering. Correlation and survival analyses were conducted using the modules, and gene ontology (GO) enrichment was applied to explore the possible mechanisms. Eight distinct modules were identified by the WGCNA. Pink and red modules were associated with liver function, whereas turquoise and black modules were inversely correlated with tumor staging. Poor outcomes were found in the low expression group in the turquoise module and in the high expression group in the red module. In addition, GO enrichment analysis suggested that inflammation, immune, virus-related, and interferon-mediated pathways were enriched in the turquoise module. Several potential biomarkers, such as cyclin-dependent kinase 1 (CDK1), topoisomerase 2α (TOP2A), and serpin peptidase inhibitor clade C (antithrombin) member 1 (SERPINC1), were also identified. In conclusion, gene signatures identified from the genome-based assays could contribute to HCC stratification. WGCNA was able to identify significant groups of genes associated with cancer prognosis.
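
    The core WGCNA step behind the modules described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes an unsigned network in which the adjacency between two genes is the absolute Pearson correlation raised to a soft-thresholding power. The function names, the toy data layout, and the default power of 6 are assumptions made for illustration; the abstract does not state which power was used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency |cor(g, h)|**beta for every gene pair.

    expr maps a gene name to its expression values across samples
    (hypothetical layout). beta is an assumed soft-thresholding power.
    """
    genes = list(expr)
    adjacency = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            adjacency[(g, h)] = abs(pearson(expr[g], expr[h])) ** beta
    return adjacency
```

    Raising the correlation to a power keeps strong correlations close to 1 while pushing weak ones toward 0, so no hard cutoff is needed before genes are clustered into modules.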

  7. Regulatory Networks: Inferring Functional Relationships Through Co-Expression

    NASA Astrophysics Data System (ADS)

    Wanke, Dierk; Hahn, Achim; Kilian, Joachim; Harter, Klaus; Berendzen, Kenneth W.

    2010-01-01

    Gene expression data not only provide us insights into discrete transcript abundance of specific genes, but contain cryptic information that cannot readily be assessed without interpretation. We again used data of the plant Arabidopsis thaliana as our reference organism, yet the analysis presented herein can be performed with any organism with various data sources. Within the cell, information is transduced via different signaling cascades and results in differential gene expression responses. The incoming signals are perceived from upstream signaling components and handed to downstream messengers that further deliver the signals to effector proteins which can directly influence gene expression. In most cases, we can assume that proteins, which are connected to other signaling components within such a regulatory network, exhibit similar expression trajectories. Thus, we extracted a known functional network from literature and demonstrated that it is possible to superimpose microarray expression data onto the pathways. Thereby, we could follow the information flow through time reflected by gene expression changes. This allowed us to predict whether the upstream signal was transmitted from known elements contained in the network or relayed from outside components. We next took the converse approach and used large scale microarray expression data to build a co-expression matrix for all genes present on the array. From this, we computed a regulatory network, which allowed us to deduce known and novel signaling pathways.

  8. Weighted gene co-expression network analysis of colorectal cancer liver metastasis genome sequencing data and screening of anti-metastasis drugs.

    PubMed

    Gao, Bo; Shao, Qin; Choudhry, Hani; Marcus, Victoria; Dong, Kung; Ragoussis, Jiannis; Gao, Zu-Hua

    2016-09-01

    Approximately 9% of cancer-related deaths are caused by colorectal cancer (CRC). CRC patients are prone to liver metastasis, which is the most important cause for the high CRC mortality rate. Understanding the molecular mechanism of CRC liver metastasis could help us to find novel targets for the effective treatment of this deadly disease. Using weighted gene co-expression network analysis on the sequencing data of CRC with and without metastasis, we identified 5 colorectal cancer liver metastasis related modules which were labeled as brown, blue, grey, yellow and turquoise. In the brown module, which represents the metastatic tumor in the liver, gene ontology (GO) analysis revealed functions including the G-protein coupled receptor protein signaling pathway, epithelial cell differentiation and cell surface receptor linked signal transduction. In the blue module, which represents the primary CRC that has metastasized, GO analysis showed that the genes were mainly enriched in GO terms including G-protein coupled receptor protein signaling pathway, cell surface receptor linked signal transduction, and negative regulation of cell differentiation. In the yellow and turquoise modules, which represent the primary non-metastatic CRC, 13 downregulated CRC liver metastasis-related candidate miRNAs were identified (e.g. hsa-miR-204, hsa-miR-455, etc.). Furthermore, analyzing the DrugBank database and mining the literature identified 25 and 12 candidate drugs that could potentially block the metastatic processes of the primary tumor and inhibit the progression of metastatic tumors in the liver, respectively. Data generated from this study not only furthers our understanding of the genetic alterations that drive the metastatic process, but also guides the development of molecular-targeted therapy of colorectal cancer liver metastasis.

  9. Gene Sets Net Correlations Analysis (GSNCA): a multivariate differential coexpression test for gene sets

    PubMed Central

    Rahmatallah, Yasir; Emmert-Streib, Frank; Glazko, Galina

    2014-01-01

    Motivation: To date, gene set analysis approaches primarily focus on identifying differentially expressed gene sets (pathways). Methods for identifying differentially coexpressed pathways also exist but are mostly based on aggregated pairwise correlations or other pairwise measures of coexpression. Instead, we propose Gene Sets Net Correlations Analysis (GSNCA), a multivariate differential coexpression test that accounts for the complete correlation structure between genes. Results: In GSNCA, weight factors are assigned to genes in proportion to the genes’ cross-correlations (intergene correlations). The problem of finding the weight vectors is formulated as an eigenvector problem with a unique solution. GSNCA tests the null hypothesis that for a gene set there is no difference in the weight vectors of the genes between two conditions. In simulation studies and the analyses of experimental data, we demonstrate that GSNCA captures changes in the structure of genes’ cross-correlations rather than differences in the averaged pairwise correlations. Thus, GSNCA infers differences in coexpression networks, however, bypassing method-dependent steps of network inference. As an additional result from GSNCA, we define hub genes as genes with the largest weights and show that these genes correspond frequently to major and specific pathway regulators, as well as to genes that are most affected by the biological difference between two conditions. In summary, GSNCA is a new approach for the analysis of differentially coexpressed pathways that also evaluates the importance of the genes in the pathways, thus providing unique information that may result in the generation of novel biological hypotheses. Availability and implementation: Implementation of the GSNCA test in R is available upon request from the authors. Contact: YRahmatallah@uams.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24292935
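
    The eigenvector formulation described above can be illustrated with a short sketch. It is not the authors' R implementation. It assumes that each gene's weight is proportional to the weighted sum of its absolute correlations with the other genes, which makes the weight vector the leading eigenvector of the absolute correlation matrix with a zeroed diagonal. The function name, the power-iteration solver, and the normalisation to unit sum are choices made here for illustration.

```python
def gsnca_like_weights(corr, iterations=200):
    """Leading-eigenvector gene weights from a correlation matrix.

    corr is a square list-of-lists correlation matrix for one gene set.
    Self-correlations are ignored; weights are normalised to sum to 1
    (an assumed convention, chosen for readability).
    """
    n = len(corr)
    # Absolute cross-correlations with the diagonal zeroed out.
    a = [[0.0 if i == j else abs(corr[i][j]) for j in range(n)]
         for i in range(n)]
    weights = [1.0 / n] * n
    for _ in range(iterations):
        updated = [sum(a[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(updated)
        if total == 0.0:  # no cross-correlation at all: keep uniform weights
            return weights
        weights = [value / total for value in updated]
    return weights
```

    Under these assumptions, the gene with the largest weight is the hub gene of the set, and a two-condition comparison would contrast the weight vectors obtained from each condition's correlation matrix.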

  10. Co-expression network-based analysis of hippocampal expression data associated with Alzheimer's disease using a novel algorithm

    PubMed Central

    YUE, HONG; YANG, BO; YANG, FANG; HU, XIAO-LI; KONG, FAN-BIN

    2016-01-01

    Recent progress in bioinformatics has facilitated the clarification of biological processes associated with complex diseases. Numerous methods of co-expression analysis have been proposed for use in the study of pairwise relationships among genes. In the present study, a combined network based on gene pairs was constructed following the conversion and combination of gene pair score values using a novel algorithm across multiple approaches. Three hippocampal expression profiles of patients with Alzheimer's disease (AD) and normal controls were extracted from the ArrayExpress database, and a total of 144 differentially expressed (DE) genes across multiple studies were identified by a rank product (RP) method. Five groups of co-expression gene pairs and five networks were identified and constructed using four existing methods [weighted gene co-expression network analysis (WGCNA), empirical Bayesian (EB), differentially co-expressed genes and links (DCGL), search tool for the retrieval of interacting genes/proteins database (STRING)] and a novel rank-based algorithm with combined score, respectively. Topological analysis indicated that the co-expression network constructed by the WGCNA method had the tendency to exhibit small-world characteristics, and the combined co-expression network was confirmed to be a scale-free network. Functional analysis of the co-expression gene pairs was conducted by Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. The co-expression gene pairs were mostly enriched in five pathways, namely proteasome, oxidative phosphorylation, Parkinson's disease, Huntington's disease and AD. This study provides a new perspective to co-expression analysis. Since different methods of analysis often present varying abilities, the novel combination algorithm may provide a more credible and robust outcome, and could be used to complement to traditional co-expression analysis. PMID:27168792
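
    The rank product step used to combine several studies can be sketched as below. This is a minimal illustration that computes only the statistic itself, the geometric mean of a gene's ranks across studies. The significance estimation normally paired with the rank product method is omitted, and the function name and input layout are assumptions.

```python
import math


def rank_product(rank_lists):
    """Geometric mean of each gene's rank across studies.

    rank_lists is a list of dicts, one per study, mapping gene -> rank,
    where rank 1 is the most strongly changed gene (hypothetical layout).
    Every study is assumed to rank the same set of genes.
    """
    k = len(rank_lists)
    genes = rank_lists[0].keys()
    return {
        gene: math.prod(study[gene] for study in rank_lists) ** (1.0 / k)
        for gene in genes
    }
```

    A gene that ranks near the top in every study gets a small rank product, so sorting by this value in ascending order puts consistently changed genes first.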

  11. ImmuCo: a database of gene co-expression in immune cells

    PubMed Central

    Wang, Pingzhang; Qi, Huiying; Song, Shibin; Li, Shuang; Huang, Ningyu; Han, Wenling; Ma, Dalong

    2015-01-01

    Current gene co-expression databases and correlation networks do not support cell-specific analysis. Gene co-expression and expression correlation are subtly different phenomena, although both are likely to be functionally significant. Here, we report a new database, ImmuCo (http://immuco.bjmu.edu.cn), which is a cell-specific database that contains information about gene co-expression in immune cells, identifying co-expression and correlation between any two genes. The strength of co-expression of queried genes is indicated by signal values and detection calls, whereas expression correlation and strength are reflected by Pearson correlation coefficients. A scatter plot of the signal values is provided to directly illustrate the extent of co-expression and correlation. In addition, the database allows the analysis of cell-specific gene expression profile across multiple experimental conditions and can generate a list of genes that are highly correlated with the queried genes. Currently, the database covers 18 human cell groups and 10 mouse cell groups, including 20 283 human genes and 20 963 mouse genes. More than 8.6 × 10^8 and 7.4 × 10^8 probe set combinations are provided for querying each human and mouse cell group, respectively. Sample applications support the distinctive advantages of the database. PMID:25326331

  12. Canonical correlation analysis for RNA-seq co-expression networks

    PubMed Central

    Hong, Shengjun; Chen, Xiangning; Jin, Li; Xiong, Momiao

    2013-01-01

    Digital transcriptome analysis by next-generation sequencing discovers substantial mRNA variants. Variation in gene expression underlies many biological processes and holds a key to unravelling mechanism of common diseases. However, the current methods for construction of co-expression networks using overall gene expression are originally designed for microarray expression data, and they overlook a large number of variations in gene expressions. To use information on exon, genomic positional level and allele-specific expressions, we develop novel component-based methods, single and bivariate canonical correlation analysis, for construction of co-expression networks with RNA-seq data. To evaluate the performance of our methods for co-expression network inference with RNA-seq data, they are applied to lung squamous cell cancer expression data from TCGA database and our bipolar disorder and schizophrenia RNA-seq study. The preliminary results demonstrate that the co-expression networks constructed by canonical correlation analysis and RNA-seq data provide rich genetic and molecular information to gain insight into biological processes and disease mechanism. Our new methods substantially outperform the current statistical methods for co-expression network construction with microarray expression data or RNA-seq data based on overall gene expression levels. PMID:23460206

  13. Pathways of Lipid Metabolism in Marine Algae, Co-Expression Network, Bottlenecks and Candidate Genes for Enhanced Production of EPA and DHA in Species of Chromista

    PubMed Central

    Mühlroth, Alice; Li, Keshuai; Røkke, Gunvor; Winge, Per; Olsen, Yngvar; Hohmann-Marriott, Martin F.; Vadstein, Olav; Bones, Atle M.

    2013-01-01

    The importance of n-3 long chain polyunsaturated fatty acids (LC-PUFAs) for human health has received more focus over the last decades, and the global consumption of n-3 LC-PUFA has increased. Seafood, the natural n-3 LC-PUFA source, is harvested beyond a sustainable capacity, and it is therefore imperative to develop alternative n-3 LC-PUFA sources for both eicosapentaenoic acid (EPA, 20:5n-3) and docosahexaenoic acid (DHA, 22:6n-3). Genera of algae such as Nannochloropsis, Schizochytrium, Isochrysis and Phaeodactylum within the kingdom Chromista have received attention due to their ability to produce n-3 LC-PUFAs. Knowledge of LC-PUFA synthesis and its regulation in algae at the molecular level is fragmentary and represents a bottleneck for attempts to enhance the n-3 LC-PUFA levels for industrial production. In the present review, Phaeodactylum tricornutum has been used to exemplify the synthesis and compartmentalization of n-3 LC-PUFAs. Based on recent transcriptome data a co-expression network of 106 genes involved in lipid metabolism has been created. Together with recent molecular biological and metabolic studies, a model pathway for n-3 LC-PUFA synthesis in P. tricornutum has been proposed, and is compared to industrialized species of Chromista. Limitations of the n-3 LC-PUFA synthesis by enzymes such as thioesterases, elongases, acyl-CoA synthetases and acyltransferases are discussed and metabolic bottlenecks are hypothesized such as the supply of the acetyl-CoA and NADPH. A future industrialization will depend on optimization of chemical compositions and increased biomass production, which can be achieved by exploitation of the physiological potential, by selective breeding and by genetic engineering. PMID:24284429

  14. Weighted gene co-expression based biomarker discovery for psoriasis detection.

    PubMed

    Sundarrajan, Sudharsana; Arumugam, Mohanapriya

    2016-11-15

    Psoriasis is a chronic inflammatory disease of the skin with an unknown aetiology. The disease manifests itself as red and silvery scaly plaques distributed over the scalp, lower back and extensor aspects of the limbs. After receiving scant consideration for quite a few years, psoriasis has now become a prominent focus for new drug development. A group of closely connected and differentially co-expressed genes may act in a network and may serve as molecular signatures for an underlying phenotype. Weighted gene coexpression network analysis (WGCNA), a systems biology approach, has been utilized for identification of new molecular targets for psoriasis. Gene coexpression relationships were investigated in 58 psoriatic lesional samples, resulting in five gene modules clustered based on the gene coexpression patterns. The coexpression pattern was validated using three psoriatic datasets. Ten highly connected and informative genes from each module were selected and termed psoriasis-specific hub signatures. A random forest based binary classifier built using the expression profiles of signature genes robustly distinguished psoriatic samples from the normal samples in the validation set with an accuracy of 0.95 to 1. These signature genes may serve as potential candidates for biomarker discovery leading to new therapeutic targets. WGCNA, a network-based approach, has provided an alternative path to mine key controllers and drivers of psoriasis. The study principle from the current work can be extended to other pathological conditions.

  15. An atlas of tissue-specific conserved coexpression for functional annotation and disease gene prediction.

    PubMed

    Piro, Rosario Michael; Ala, Ugo; Molineris, Ivan; Grassi, Elena; Bracco, Chiara; Perego, Gian Paolo; Provero, Paolo; Di Cunto, Ferdinando

    2011-11-01

    Gene coexpression relationships that are phylogenetically conserved between human and mouse have been shown to provide important clues about gene function that can be efficiently used to identify promising candidate genes for human hereditary disorders. In the past, such approaches have considered mostly generic gene expression profiles that cover multiple tissues and organs. The individual genes of multicellular organisms, however, can participate in different transcriptional programs, operating at scales as different as single-cell types, tissues, organs, body regions or the entire organism. Therefore, systematic analysis of tissue-specific coexpression could be, in principle, a very powerful strategy to dissect those functional relationships among genes that emerge only in particular tissues or organs. In this report, we show that, in fact, conserved coexpression as determined from tissue-specific and condition-specific data sets can predict many functional relationships that are not detected by analyzing heterogeneous microarray data sets. More importantly, we find that, when combined with disease networks, the simultaneous use of both generic (multi-tissue) and tissue-specific conserved coexpression allows a more efficient prediction of human disease genes than the use of generic conserved coexpression alone. Using this strategy, we were able to identify high-probability candidates for 238 orphan disease loci. We provide proof of concept that this combined use of generic and tissue-specific conserved coexpression can be very useful to prioritize the mutational candidates obtained from deep-sequencing projects, even in the case of genetic disorders as heterogeneous as XLMR.

  16. Hi-C Chromatin Interaction Networks Predict Co-expression in the Mouse Cortex

    PubMed Central

    Hulsman, Marc; Lelieveldt, Boudewijn P. F.; de Ridder, Jeroen; Reinders, Marcel

    2015-01-01

    The three dimensional conformation of the genome in the cell nucleus influences important biological processes such as gene expression regulation. Recent studies have shown a strong correlation between chromatin interactions and gene co-expression. However, predicting gene co-expression from frequent long-range chromatin interactions remains challenging. We address this by characterizing the topology of the cortical chromatin interaction network using scale-aware topological measures. We demonstrate that based on these characterizations it is possible to accurately predict spatial co-expression between genes in the mouse cortex. Consistent with previous findings, we find that the chromatin interaction profile of a gene-pair is a good predictor of their spatial co-expression. However, the accuracy of the prediction can be substantially improved when chromatin interactions are described using scale-aware topological measures of the multi-resolution chromatin interaction network. We conclude that, for co-expression prediction, it is necessary to take into account different levels of chromatin interactions ranging from direct interaction between genes (i.e. small-scale) to chromatin compartment interactions (i.e. large-scale). PMID:25965262

  17. Differential Regulatory Analysis Based on Coexpression Network in Cancer Research

    PubMed Central

    2016-01-01

    With rapid development of high-throughput techniques and accumulation of big transcriptomic data, plenty of computational methods and algorithms such as differential analysis and network analysis have been proposed to explore genome-wide gene expression characteristics. These efforts aim to transform underlying genomic information into valuable knowledge in biological and medical research fields. Recently, tremendous integrative research methods have been dedicated to interpreting the development and progress of neoplastic diseases, whereas differential regulatory analysis (DRA) based on gene coexpression network (GCN) increasingly serves as a robust complement to regular differential expression analysis in revealing regulatory functions of cancer related genes such as evading growth suppressors and resisting cell death. Differential regulatory analysis based on GCN is prospective and shows its essential role in discovering the system properties of carcinogenesis features. Here we briefly review the paradigm of differential regulatory analysis based on GCN. We also focus on the applications of differential regulatory analysis based on GCN in cancer research and point out that DRA is necessary and extraordinary to reveal the underlying molecular mechanisms in large-scale carcinogenesis studies. PMID:27597964

  18. Differential Regulatory Analysis Based on Coexpression Network in Cancer Research.

    PubMed

    Li, Junyi; Li, Yi-Xue; Li, Yuan-Yuan

    2016-01-01

    With rapid development of high-throughput techniques and accumulation of big transcriptomic data, plenty of computational methods and algorithms such as differential analysis and network analysis have been proposed to explore genome-wide gene expression characteristics. These efforts aim to transform underlying genomic information into valuable knowledge in biological and medical research fields. Recently, tremendous integrative research methods have been dedicated to interpreting the development and progress of neoplastic diseases, whereas differential regulatory analysis (DRA) based on gene coexpression network (GCN) increasingly serves as a robust complement to regular differential expression analysis in revealing regulatory functions of cancer related genes such as evading growth suppressors and resisting cell death. Differential regulatory analysis based on GCN is prospective and shows its essential role in discovering the system properties of carcinogenesis features. Here we briefly review the paradigm of differential regulatory analysis based on GCN. We also focus on the applications of differential regulatory analysis based on GCN in cancer research and point out that DRA is necessary and extraordinary to reveal the underlying molecular mechanisms in large-scale carcinogenesis studies.

  19. Co-expressed Pathways DataBase for Tomato: a database to predict pathways relevant to a query gene.

    PubMed

    Narise, Takafumi; Sakurai, Nozomu; Obayashi, Takeshi; Ohta, Hiroyuki; Shibata, Daisuke

    2017-06-05

    Gene co-expression, the similarity of gene expression profiles under various experimental conditions, has been used as an indicator of functional relationships between genes, and many co-expression databases have been developed for predicting gene functions. These databases usually provide users with a co-expression network and a list of strongly co-expressed genes for a query gene. Several of these databases also provide functional information on a set of strongly co-expressed genes (i.e., provide biological processes and pathways that are enriched in these strongly co-expressed genes), which is generally analyzed via over-representation analysis (ORA). A limitation of this approach may be that users can predict gene functions only based on the strongly co-expressed genes. In this study, we developed a new co-expression database that enables users to predict the function of tomato genes from the results of functional enrichment analyses of co-expressed genes while considering the genes that are not strongly co-expressed. To achieve this, we used the ORA approach with several thresholds to select co-expressed genes, and performed gene set enrichment analysis (GSEA) applied to a ranked list of genes ordered by the co-expression degree. We found that internal correlation in pathways affected the significance levels of the enrichment analyses. Therefore, we introduced a new measure for evaluating the relationship between the gene and pathway, termed the percentile (p)-score, which enables users to predict functionally relevant pathways without being affected by the internal correlation in pathways. In addition, we evaluated our approaches using receiver operating characteristic curves, which concluded that the p-score could improve the performance of the ORA. We developed a new database, named Co-expressed Pathways DataBase for Tomato, which is available at http://cox-path-db.kazusa.or.jp/tomato . The database allows users to predict pathways that are relevant to a
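
    The over-representation analysis (ORA) mentioned above is commonly carried out with a one-sided hypergeometric test, sketched below. This illustrates that generic test only. It is not the database's own code and does not implement the p-score introduced in the abstract. The function and parameter names are assumptions.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value for pathway over-representation.

    n_universe: genes considered in total
    n_pathway:  genes annotated to the pathway
    n_selected: co-expressed genes selected at some threshold
    n_overlap:  selected genes that belong to the pathway
    Returns P(overlap >= n_overlap) under random selection.
    """
    total = comb(n_universe, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

    Re-running the test with several values of n_selected corresponds to the abstract's use of ORA at several thresholds, and shows how the verdict for a pathway can depend on where the co-expression cutoff is placed.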

  20. Co-expression analysis reveals a group of genes potentially involved in regulation of plant response to iron-deficiency.

    PubMed

    Li, Hua; Wang, Lei; Yang, Zhi Min

    2015-01-01

    Iron (Fe) is an essential element for plant growth and development. Iron deficiency results in abnormal metabolism, from respiration to photosynthesis. Exploration of Fe-deficiency-responsive genes and their networks is critically important for understanding the molecular mechanisms of plant adaptation to soil Fe limitation. Co-expression genes are a cluster of genes that share a similar expression pattern and carry out related biological functions at a given stage of development or under a certain environmental condition. They may share a common regulatory mechanism. In this study, we investigated Fe-starvation-related co-expression genes from Arabidopsis. From the biological process GO annotation of TAIR (The Arabidopsis Information Resource), 180 iron-deficiency-responsive genes were detected. Using the ATTED-II database, we generated six gene co-expression networks. Among these, two modules of PYE and IRT1 were successfully constructed. There are 30 co-expression genes that are incorporated in the two modules (12 in the PYE module and 18 in the IRT1 module). Sixteen of the co-expression genes were well characterized. The remaining 14 genes have been poorly characterized, or not characterized at all, with respect to iron stress. Validation of the 14 genes using real-time PCR showed differential expression under iron deficiency. Most of the co-expression genes (23/30) could be validated in pye and fit mutant plants under iron deficiency. We further identified iron-responsive cis-elements upstream of the co-expression genes and found that 22 out of 30 genes contain the iron-responsive motif IDE1. Furthermore, some auxin- and ethylene-responsive elements were detected in the promoters of the co-expression genes. These results suggest that some of the genes may also be involved in the iron stress response through phytohormone-responsive pathways.

  1. Correlating transcriptional networks to breast cancer survival: a large-scale coexpression analysis.

    PubMed

    Clarke, Colin; Madden, Stephen F; Doolan, Padraig; Aherne, Sinead T; Joyce, Helena; O'Driscoll, Lorraine; Gallagher, William M; Hennessy, Bryan T; Moriarty, Michael; Crown, John; Kennedy, Susan; Clynes, Martin

    2013-10-01

    Weighted gene coexpression network analysis (WGCNA) is a powerful 'guilt-by-association'-based method to extract coexpressed groups of genes from large heterogeneous messenger RNA expression data sets. We have utilized WGCNA to identify 11 coregulated gene clusters across 2342 breast cancer samples from 13 microarray-based gene expression studies. A number of these transcriptional modules were found to be correlated to clinicopathological variables (e.g. tumor grade), survival endpoints for breast cancer as a whole (disease-free survival, distant disease-free survival and overall survival) and also its molecular subtypes (luminal A, luminal B, HER2+ and basal-like). Examples of findings arising from this work include the identification of a cluster of proliferation-related genes that when upregulated correlated to increased tumor grade and were associated with poor survival in general. The prognostic potential of novel genes, for example, ubiquitin-conjugating enzyme E2S (UBE2S) within this group was confirmed in an independent data set. In addition, gene clusters were also associated with survival for breast cancer molecular subtypes including a cluster of genes that was found to correlate with prognosis exclusively for basal-like breast cancer. The upregulation of several single genes within this coexpression cluster, for example, the potassium channel, subfamily K, member 5 (KCNK5) was associated with poor outcome for the basal-like molecular subtype. We have developed an online database to allow user-friendly access to the coexpression patterns and the survival analysis outputs uncovered in this study (available at http://glados.ucd.ie/Coexpression/).
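
    The core weighting step of WGCNA can be illustrated with a short sketch. It assumes the unsigned adjacency, in which the absolute Pearson correlation between two genes is raised to a soft-thresholding power beta; the value beta = 6, the dictionary-based input format and the function names are assumptions made for illustration and are not taken from the study above. Module detection (topological overlap, hierarchical clustering) is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    expr maps a gene name to its expression values across samples
    (hypothetical input format). beta is the soft-thresholding power.
    """
    genes = list(expr)
    adj = {g: {} for g in genes}
    for g in genes:
        for h in genes:
            if g == h:
                adj[g][h] = 1.0
            else:
                adj[g][h] = abs(pearson(expr[g], expr[h])) ** beta
    return adj


def connectivity(adj):
    """Weighted degree of each gene: sum of adjacencies to all other genes."""
    return {
        g: sum(w for h, w in row.items() if h != g)
        for g, row in adj.items()
    }
```

    Raising the correlation to a power keeps strong correlations close to 1 while pushing weak ones toward 0, so no hard cutoff is needed to decide which gene pairs are connected.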

  2. Exploring Plant Co-Expression and Gene-Gene Interactions with CORNET 3.0.

    PubMed

    Van Bel, Michiel; Coppens, Frederik

    2017-01-01

    Selecting and filtering a reference expression and interaction dataset when studying specific pathways and regulatory interactions can be a very time-consuming and error-prone task. In order to reduce the duplicated efforts required to amass such datasets, we have created the CORNET (CORrelation NETworks) platform, which allows for easy access to a wide variety of data types: coexpression data, protein-protein interactions, regulatory interactions, and functional annotations. The CORNET platform outputs its results in either text format or through the Cytoscape framework, which is automatically launched by the CORNET website. CORNET 3.0 is the third iteration of the web platform designed for the user exploration of the coexpression space of plant genomes, with a focus on the model species Arabidopsis thaliana. Here we describe the platform: the tools, data, and best practices when using the platform. We indicate how the platform can be used to infer networks from a set of input genes, such as upregulated genes from an expression experiment. By exploring the network, new target and regulator genes can be discovered, allowing for follow-up experiments and more in-depth study. We also indicate how to avoid common pitfalls when evaluating the networks and how to avoid overinterpretation of the results. All CORNET versions are available at http://bioinformatics.psb.ugent.be/cornet/ .

  3. CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool.

    PubMed

    Tzfadia, Oren; Diels, Tim; De Meyer, Sam; Vandepoele, Klaas; Aharoni, Asaph; Van de Peer, Yves

    2015-01-01

    Comparative transcriptomics is a common approach in functional gene discovery efforts. It allows for finding conserved co-expression patterns between orthologous genes in closely related plant species, suggesting that these genes potentially share similar function and regulation. Several efficient co-expression-based tools have been commonly used in plant research, but most of these pipelines are limited to data from model systems, which greatly limits their utility. Moreover, none of the existing pipelines allows plant researchers to make use of their own unpublished gene expression data for performing a comparative co-expression analysis and generating multi-species co-expression networks. We introduce CoExpNetViz, a computational tool that uses a set of query or "bait" genes as an input (chosen by the user) and a minimum of one pre-processed gene expression dataset. The CoExpNetViz algorithm proceeds in three main steps: (i) for every bait gene submitted, co-expression values are calculated using mutual information and Pearson correlation coefficients, (ii) non-bait (or target) genes are grouped based on cross-species orthology, and (iii) output files are generated and results can be visualized as network graphs in Cytoscape. The CoExpNetViz tool is freely available both as a PHP web server (link: http://bioinformatics.psb.ugent.be/webtools/coexpr/) (implemented in C++) and as a Cytoscape plugin (implemented in Java). Both versions of the CoExpNetViz tool support LINUX and Windows platforms.

  4. Module discovery by exhaustive search for densely connected, co-expressed regions in biomolecular interaction networks.

    PubMed

    Colak, Recep; Moser, Flavia; Chu, Jeffrey Shih-Chieh; Schönhuth, Alexander; Chen, Nansheng; Ester, Martin

    2010-10-25

    Computational prediction of functionally related groups of genes (functional modules) from large-scale data is an important issue in computational biology. Gene expression experiments and interaction networks are well studied large-scale data sources, available for many not yet exhaustively annotated organisms. It has been well established, when analyzing these two data sources jointly, modules are often reflected by highly interconnected (dense) regions in the interaction networks whose participating genes are co-expressed. However, the tractability of the problem had remained unclear and methods by which to exhaustively search for such constellations had not been presented. We provide an algorithmic framework, referred to as Densely Connected Biclustering (DECOB), by which the aforementioned search problem becomes tractable. To benchmark the predictive power inherent to the approach, we computed all co-expressed, dense regions in physical protein and genetic interaction networks from human and yeast. An automatized filtering procedure reduces our output which results in smaller collections of modules, comparable to state-of-the-art approaches. Our results performed favorably in a fair benchmarking competition which adheres to standard criteria. We demonstrate the usefulness of an exhaustive module search, by using the unreduced output to more quickly perform GO term related function prediction tasks. We point out the advantages of our exhaustive output by predicting functional relationships using two examples. We demonstrate that the computation of all densely connected and co-expressed regions in interaction networks is an approach to module discovery of considerable value. 
    Beyond confirming the well-settled hypothesis that such co-expressed, densely connected interaction network regions reflect functional modules, we open up novel computational ways to comprehensively analyze the modular organization of an organism based on prevalent and largely available large

  5. Novel structural co-expression analysis linking the NPM1-associated ribosomal biogenesis network to chronic myelogenous leukemia

    PubMed Central

    Chan, Lawrence WC; Lin, Xihong; Yung, Godwin; Lui, Thomas; Chiu, Ya Ming; Wang, Fengfeng; Tsui, Nancy BY; Cho, William CS; Yip, SP; Siu, Parco M.; Wong, SC Cesar; Yung, Benjamin YM

    2015-01-01

    Co-expression analysis reveals useful dysregulation patterns of gene cooperativeness for understanding cancer biology and identifying new targets for treatment. We developed a structural strategy to identify co-expressed gene networks that are important for chronic myelogenous leukemia (CML). This strategy compared the distributions of expressional correlations between CML and normal states, and it identified a data-driven threshold to classify strongly co-expressed networks that had the best coherence with CML. Using this strategy, we found a transcriptome-wide reduction of co-expression connectivity in CML, reflecting potentially loosened molecular regulation. Conversely, when we focused on nucleophosmin 1 (NPM1) associated networks, NPM1 established more co-expression linkages with BCR-ABL pathways and ribosomal protein networks in CML than normal. This finding implicates a new role of NPM1 in conveying tumorigenic signals from the BCR-ABL oncoprotein to ribosome biogenesis, affecting cellular growth. Transcription factors may be regulators of the differential co-expression patterns between CML and normal. PMID:26205693

  6. Lentiviral vector system for coordinated constitutive and drug controlled tetracycline-regulated gene co-expression.

    PubMed

    Stahlhut, Maike; Schwarzer, Adrian; Eder, Matthias; Yang, Min; Li, Zhixiong; Morgan, Michael; Schambach, Axel; Kustikova, Olga S

    2015-09-01

    Constitutive co-expression of cooperating transgenes using retroviral integrating vectors is frequently used for genetic modification of different cell types to establish therapeutic or cancer models. However, such approaches are unable to dissect the influence of dose, order and reversibility of transgene expression on the fate of newly developed therapeutic/malignant phenotypes. We present a modular lentiviral vector system, which provides expression of constitutive and inducible components. To demonstrate its functionality, we constitutively expressed the well-described transcription factor Meis1 followed by inducible co-expression of collaborating partner Hoxa9 under the control of tetracycline responsive promoters in murine fibroblasts and primary hematopoietic progenitor cells (HPCs). Fluorescent markers to track transgene co-expression revealed tightly controlled, efficiently inducible and reversible but cell type dependent gene transfer over time. We demonstrated dose-dependent blockade of myeloid differentiation when both Meis1/Hoxa9 were concomitantly overexpressed in primary HPCs in vitro, but the absence of the transformed phenotype in non-induced samples or when Hoxa9 expression was down-regulated. This system combines the advantages of lentiviral gene transfer and the opportunity for drug-controlled co-expression of multiple transgenes to dissect, among others, gene networks governing complex cell behavior, such as proto-oncogene dose-dependent leukemogenic pathways or collaborating mechanisms of genes enhancing competitive fitness of hematopoietic cells.

  7. Differential co-expression analysis reveals a novel prognostic gene module in ovarian cancer.

    PubMed

    Gov, Esra; Arga, Kazim Yalcin

    2017-07-10

    Ovarian cancer is one of the most significant diseases among the gynecological disorders that women have suffered from over the centuries. However, disease-specific and effective biomarkers are still not available, since studies have focused on individual genes associated with ovarian cancer, ignoring the interactions and associations among the gene products. Here, ovarian cancer differential co-expression networks were reconstructed via meta-analysis of gene expression data, and co-expressed gene modules were identified in epithelial cells from ovarian tumor and healthy ovarian surface epithelial samples to propose ovarian cancer associated genes and their interactions. We propose a novel, highly interconnected, differentially co-expressed, and co-regulated gene module in ovarian cancer consisting of 84 prognostic genes. Furthermore, the specificity of the module to ovarian cancer was shown through analyses of datasets in nine other cancers. These observations underscore the importance of transcriptome-based systems biomarker research in deciphering the elusive pathophysiology of ovarian cancer, and here, we present the reciprocal interplay between candidate ovarian cancer genes and their transcriptional regulatory dynamics. The corresponding gene module might provide new insights into the prognosis and treatment strategies of ovarian cancer, which continues to place a significant burden on global health.

  8. Genetic architecture of wood properties based on association analysis and co-expression networks in white spruce.

    PubMed

    Lamara, Mebarek; Raherison, Elie; Lenz, Patrick; Beaulieu, Jean; Bousquet, Jean; MacKay, John

    2016-04-01

    Association studies are widely utilized to analyze complex traits but their ability to disclose genetic architectures is often limited by statistical constraints, and functional insights are usually minimal in nonmodel organisms like forest trees. We developed an approach to integrate association mapping results with co-expression networks. We tested single nucleotide polymorphisms (SNPs) in 2652 candidate genes for statistical associations with wood density, stiffness, microfibril angle and ring width in a population of 1694 white spruce trees (Picea glauca). Association mapping identified 229-292 genes per wood trait using a statistical significance level of P < 0.05 to maximize discovery. Over-representation of associated genes was found for nearly all traits in a xylem preferential co-expression group developed in independent experiments. A xylem co-expression network was reconstructed with 180 wood associated genes and several known MYB and NAC regulators were identified as network hubs. The network revealed a link between the gene PgNAC8, wood stiffness and microfibril angle, as well as considerable within-season variation for both genetic control of wood traits and gene expression. Trait associations were distributed throughout the network suggesting complex interactions and pleiotropic effects. Our findings indicate that integration of association mapping and co-expression networks enhances our understanding of complex wood traits. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  9. Co-expression network with protein-protein interaction and transcription regulation in malaria parasite Plasmodium falciparum.

    PubMed

    Yu, Fu-Dong; Yang, Shao-You; Li, Yuan-Yuan; Hu, Wei

    2013-04-10

    Malaria continues to be one of the most severe global infectious diseases and a major threat to human health and economic development. Network-based biological analysis is a promising approach to uncover key genes and biological processes from a network viewpoint, which could not be recognized from individual gene-based signatures. We integrated gene co-expression profiles with protein-protein interaction and transcriptional regulation information to construct a comprehensive gene co-expression network of Plasmodium falciparum. Based on this network, we identified 10 core modules using the ICE (Iterative Clique Enumeration) algorithm, which were essential for malaria parasite development in the intraerythrocytic developmental cycle (IDC) stages. In each module, all genes were highly correlated, probably due to co-regulation or formation of a protein complex. Some of these genes were found to be differentially coexpressed among three close-by IDC stages. The gene prpf8 (PFD0265w), encoding pre-mRNA processing splicing factor 8, was identified as a DCG (differentially co-expressed gene) among IDC stages, although its function has seldom been reported in previous research. Integrating species-specific gene prediction and differential co-expression gene detection, we found that some modules, such as module 10, could perform species-specific functions, because some of the genes in these modules were species-specific. Furthermore, in order to reveal the underlying mechanisms of erythrocyte invasion by P. falciparum, the Steiner Tree algorithm was employed to identify the invasion subnetwork from our gene co-expression network. The subnetwork-based analysis indicated that some important Plasmodium parasite-specific genes could cooperate with each other and be co-regulated during the parasite invasion process, including a head-to-head gene pair of PfRH2a (PF13_0198) and PfRH2b (MAL13P1.176).
    This study based on gene co-expression network could shed new

  10. Identification of Transcriptional Modules and Key Genes in Chickens Infected with Salmonella enterica Serovar Pullorum Using Integrated Coexpression Analyses

    PubMed Central

    2017-01-01

    Salmonella enterica Pullorum is one of the leading causes of mortality in poultry. Understanding the molecular response of chickens to infection by S. enterica is important for revealing the mechanisms of pathogenesis and disease progression. There have been studies on identifying genes associated with Salmonella infection by differential expression analysis, but the relationships among regulated genes have not been investigated. In this study, we employed weighted gene coexpression network analysis (WGCNA) and differential coexpression analysis (DCEA) to identify coexpression modules by exploring microarray data derived from chicken splenic tissues in response to S. enterica infection. A total of 19 modules from 13,538 genes were associated with the Jak-STAT signaling pathway, the extracellular matrix, cytoskeleton organization, the regulation of the actin cytoskeleton, G-protein coupled receptor activity, Toll-like receptor signaling pathways, and immune system processes; among them, 14 differentially coexpressed modules (DCMs) and 2,856 differentially coexpressed genes (DCGs) were identified. The global expression of module genes showed only slight differences between infected and uninfected chickens, whereas global coexpression changed considerably. Furthermore, DCGs were consistently linked to the hubs of the modules. These results will help prioritize candidate genes for future studies of Salmonella infection. PMID:28529955

  11. Identification of Transcriptional Modules and Key Genes in Chickens Infected with Salmonella enterica Serovar Pullorum Using Integrated Coexpression Analyses.

    PubMed

    Liu, Bao-Hong; Cai, Jian-Ping

    2017-01-01

    Salmonella enterica Pullorum is one of the leading causes of mortality in poultry. Understanding the molecular response of chickens to infection by S. enterica is important for revealing the mechanisms of pathogenesis and disease progression. There have been studies on identifying genes associated with Salmonella infection by differential expression analysis, but the relationships among regulated genes have not been investigated. In this study, we employed weighted gene coexpression network analysis (WGCNA) and differential coexpression analysis (DCEA) to identify coexpression modules by exploring microarray data derived from chicken splenic tissues in response to S. enterica infection. A total of 19 modules from 13,538 genes were associated with the Jak-STAT signaling pathway, the extracellular matrix, cytoskeleton organization, the regulation of the actin cytoskeleton, G-protein coupled receptor activity, Toll-like receptor signaling pathways, and immune system processes; among them, 14 differentially coexpressed modules (DCMs) and 2,856 differentially coexpressed genes (DCGs) were identified. The global expression of module genes showed only slight differences between infected and uninfected chickens, whereas global coexpression changed considerably. Furthermore, DCGs were consistently linked to the hubs of the modules. These results will help prioritize candidate genes for future studies of Salmonella infection.

  12. CoGA: An R Package to Identify Differentially Co-Expressed Gene Sets by Analyzing the Graph Spectra.

    PubMed

    Santos, Suzana de Siqueira; Galatro, Thais Fernanda de Almeida; Watanabe, Rodrigo Akira; Oba-Shinjo, Sueli Mieko; Nagahashi Marie, Suely Kazue; Fujita, André

    2015-01-01

    Gene set analysis aims to identify predefined sets of functionally related genes that are differentially expressed between two conditions. Although gene set analysis has been very successful, by incorporating biological knowledge about the gene sets and enhancing statistical power over gene-by-gene analyses, it does not take into account the correlation (association) structure among the genes. In this work, we present CoGA (Co-expression Graph Analyzer), an R package for the identification of groups of differentially associated genes between two phenotypes. The analysis is based on concepts of Information Theory applied to the spectral distributions of the gene co-expression graphs, such as the spectral entropy to measure the randomness of a graph structure and the Jensen-Shannon divergence to discriminate classes of graphs. The package also includes common measures to compare gene co-expression networks in terms of their structural properties, such as centrality, degree distribution, shortest path length, and clustering coefficient. Besides the structural analyses, CoGA also includes graphical interfaces for visual inspection of the networks, ranking of genes according to their "importance" in the network, and the standard differential expression analysis. We show by both simulation experiments and analyses of real data that the statistical tests performed by CoGA indeed control the rate of false positives and are able to identify differentially co-expressed genes that other methods failed to detect.

  13. Whole brain and brain regional coexpression network interactions associated with predisposition to alcohol consumption.

    PubMed

    Vanderlinden, Lauren A; Saba, Laura M; Kechris, Katerina; Miles, Michael F; Hoffman, Paula L; Tabakoff, Boris

    2013-01-01

    To identify brain transcriptional networks that may predispose an animal to consume alcohol, we used weighted gene coexpression network analysis (WGCNA). Candidate coexpression modules are those with an eigengene expression level that correlates significantly with the level of alcohol consumption across a panel of BXD recombinant inbred mouse strains, and that share a genomic region that regulates the module transcript expression levels (mQTL) with a genomic region that regulates alcohol consumption (bQTL). To address a controversy regarding utility of gene expression profiles from whole brain, vs specific brain regions, as indicators of the relationship of gene expression to phenotype, we compared candidate coexpression modules from whole brain gene expression data (gathered with Affymetrix 430 v2 arrays in the Colorado laboratories) and from gene expression data from 6 brain regions (nucleus accumbens (NA); prefrontal cortex (PFC); ventral tegmental area (VTA); striatum (ST); hippocampus (HP); cerebellum (CB)) available from GeneNetwork. The candidate modules were used to construct candidate eigengene networks across brain regions, resulting in three "meta-modules", composed of candidate modules from two or more brain regions (NA, PFC, ST, VTA) and whole brain. To mitigate the potential influence of chromosomal location of transcripts and cis-eQTLs in linkage disequilibrium, we calculated a semi-partial correlation of the transcripts in the meta-modules with alcohol consumption conditional on the transcripts' cis-eQTLs. The function of transcripts that retained the correlation with the phenotype after correction for the strong genetic influence, implicates processes of protein metabolism in the ER and Golgi as influencing susceptibility to variation in alcohol consumption. Integration of these data with human GWAS provides further information on the function of polymorphisms associated with alcohol-related traits.
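
    The semi-partial correlation mentioned in this record can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes a single numeric covariate (for example a genotype coding at the transcript's cis-eQTL), removes its linear effect from the transcript only, and then correlates the residual with the phenotype. All function and variable names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def residualize(x, z):
    """Residuals of x after a simple linear regression on z (with intercept)."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    szz = sum((b - mz) ** 2 for b in z)
    slope = sum((a - mx) * (b - mz) for a, b in zip(x, z)) / szz
    return [(a - mx) - slope * (b - mz) for a, b in zip(x, z)]


def semipartial_correlation(phenotype, transcript, covariate):
    """Correlate the phenotype with the part of the transcript that is
    not linearly explained by the covariate. The phenotype itself is
    left unadjusted, which is what makes this semi-partial rather than
    partial."""
    return pearson(phenotype, residualize(transcript, covariate))
```

    In a full partial correlation the covariate would be regressed out of both variables; here only the transcript is adjusted, so the result describes how much of the phenotype is associated with transcript variation beyond the covariate's effect.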

  14. Whole Brain and Brain Regional Coexpression Network Interactions Associated with Predisposition to Alcohol Consumption

    PubMed Central

    Vanderlinden, Lauren A.; Saba, Laura M.; Kechris, Katerina; Miles, Michael F.; Hoffman, Paula L.; Tabakoff, Boris

    2013-01-01

    To identify brain transcriptional networks that may predispose an animal to consume alcohol, we used weighted gene coexpression network analysis (WGCNA). Candidate coexpression modules are those with an eigengene expression level that correlates significantly with the level of alcohol consumption across a panel of BXD recombinant inbred mouse strains, and that share a genomic region that regulates the module transcript expression levels (mQTL) with a genomic region that regulates alcohol consumption (bQTL). To address a controversy regarding utility of gene expression profiles from whole brain, vs specific brain regions, as indicators of the relationship of gene expression to phenotype, we compared candidate coexpression modules from whole brain gene expression data (gathered with Affymetrix 430 v2 arrays in the Colorado laboratories) and from gene expression data from 6 brain regions (nucleus accumbens (NA); prefrontal cortex (PFC); ventral tegmental area (VTA); striatum (ST); hippocampus (HP); cerebellum (CB)) available from GeneNetwork. The candidate modules were used to construct candidate eigengene networks across brain regions, resulting in three “meta-modules”, composed of candidate modules from two or more brain regions (NA, PFC, ST, VTA) and whole brain. To mitigate the potential influence of chromosomal location of transcripts and cis-eQTLs in linkage disequilibrium, we calculated a semi-partial correlation of the transcripts in the meta-modules with alcohol consumption conditional on the transcripts' cis-eQTLs. The function of transcripts that retained the correlation with the phenotype after correction for the strong genetic influence, implicates processes of protein metabolism in the ER and Golgi as influencing susceptibility to variation in alcohol consumption. Integration of these data with human GWAS provides further information on the function of polymorphisms associated with alcohol-related traits. PMID:23894363

  15. Dynamic functional modules in co-expressed protein interaction networks of dilated cardiomyopathy

    PubMed Central

    2010-01-01

    Background: Molecular networks represent the backbone of molecular activity within cells and provide opportunities for understanding the mechanism of diseases. While protein-protein interaction data constitute static network maps, integration of condition-specific co-expression information provides clues to the dynamic features of these networks. Dilated cardiomyopathy is a leading cause of heart failure. Although previous studies have identified putative biomarkers or therapeutic targets for heart failure, the underlying molecular mechanism of dilated cardiomyopathy remains unclear. Results: We developed a network-based comparative analysis approach that integrates protein-protein interactions with gene expression profiles and biological function annotations to reveal dynamic functional modules under different biological states. We found that hub proteins in condition-specific co-expressed protein interaction networks tended to be differentially expressed between biological states. Applying this method to a cohort of heart failure patients, we identified two functional modules that significantly emerged from the interaction networks. The dynamics of these modules between normal and disease states further suggest a potential molecular model of dilated cardiomyopathy. Conclusions: We propose a novel framework to analyze the interaction networks in different biological states. It successfully reveals network modules closely related to heart failure; more importantly, these network dynamics provide new insights into the cause of dilated cardiomyopathy. The revealed molecular modules might be used as potential drug targets and provide new directions for heart failure therapy. PMID:20950417

  16. Genome-Wide Tissue-Specific Gene Expression, Co-expression and Regulation of Co-expressed Genes in Adult Nematode Ascaris suum

    PubMed Central

    Rosa, Bruce A.; Jasmer, Douglas P.; Mitreva, Makedonka

    2014-01-01

    Background: Caenorhabditis elegans has traditionally been used as a model for studying nematode biology, but its small size limits the ability of researchers to perform some experiments such as high-throughput tissue-specific gene expression studies. However, the dissection of individual tissues is possible in the parasitic nematode Ascaris suum due to its relatively large size. Here, we take advantage of the recent genome sequencing of Ascaris suum and the ability to physically dissect its separate tissues to produce wide-scale tissue-specific nematode RNA-seq datasets, including data on three non-reproductive tissues (head, pharynx, and intestine) in both male and female worms, as well as four reproductive tissues (testis, seminal vesicle, ovary, and uterus). We obtained fundamental information about the biology of diverse cell types and potential interactions among tissues within this multicellular organism. Methodology/Principal Findings: Overexpression and functional enrichment analyses identified many putative biological functions enriched in each tissue studied, including functions which have not been previously studied in detail in nematodes. Putative tissue-specific transcriptional factors and corresponding binding motifs that regulate expression in each tissue were identified, including the intestine-enriched ELT-2 motif/transcription factor previously described in nematode intestines. Constitutively expressed and novel genes were also characterized, with the largest number of novel genes found to be overexpressed in the testis. Finally, a putative acetylcholine-mediated transcriptional network connecting biological activity in the head to the male reproductive system is described using co-expression networks, along with a similar ecdysone-mediated system in the female. Conclusions/Significance: The expression profiles, co-expression networks and co-expression regulation of the 10 tissues studied and the tissue-specific analysis presented here are a

  17. DTW-MIC Coexpression Networks from Time-Course Data

    PubMed Central

    Riccadonna, Samantha; Jurman, Giuseppe; Visintainer, Roberto; Filosi, Michele; Furlanello, Cesare

    2016-01-01

    When modeling coexpression networks from high-throughput time course data, Pearson Correlation Coefficient (PCC) is one of the most effective and popular similarity functions. However, its reliability is limited since it cannot capture non-linear interactions and time shifts. Here we propose to overcome these two issues by employing a novel similarity function, Dynamic Time Warping Maximal Information Coefficient (DTW-MIC), combining a measure taking care of functional interactions of signals (MIC) and a measure identifying time lag (DTW). By using the Hamming-Ipsen-Mikhailov (HIM) metric to quantify network differences, the effectiveness of the DTW-MIC approach is demonstrated on a set of four synthetic and one transcriptomic datasets, also in comparison to TimeDelay ARACNE and Transfer Entropy. PMID:27031641
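
    The time-shift limitation of PCC mentioned above can be illustrated with a minimal sketch. This is not the authors' DTW-MIC implementation: it only contrasts a hand-written Pearson correlation with a plain dynamic time warping (DTW) distance, and the two expression profiles (`gene_a`, `gene_b`) are hypothetical values invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def dtw_distance(x, y):
    """Classic DTW distance with absolute difference as the local cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # extend the cheapest of the three admissible warping steps
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical profiles: the same expression pulse, delayed by two time points.
gene_a = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
gene_b = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]
flat = [1] * 10  # hypothetical unrelated (constant) profile

r = pearson(gene_a, gene_b)          # weakly negative despite identical shape
d_shifted = dtw_distance(gene_a, gene_b)
d_flat = dtw_distance(gene_a, flat)
```

    In this toy case the warping path absorbs the delay, so the DTW distance between the two shifted pulses is zero while their Pearson correlation is slightly negative. A real analysis would still need a measure for non-linear dependence, which is the role MIC plays in the method described above.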

  18. Differential co-expression network centrality and machine learning feature selection for identifying susceptibility hubs in networks with scale-free structure.

    PubMed

    Lareau, Caleb A; White, Bill C; Oberg, Ann L; McKinney, Brett A

    2015-01-01

    Biological insights into group differences, such as disease status, have been achieved through differential co-expression analysis of microarray data. Additional understanding of group differences may be achieved by integrating the connectivity structure of the differential co-expression network and per-gene differential expression between phenotypic groups. Such a global differential co-expression network strategy may increase sensitivity to detect gene-gene interactions (or expression epistasis) that may act as candidates for rewiring susceptibility co-expression networks. We test two methods for inferring Genetic Association Interaction Networks (GAIN) incorporating both differential co-expression effects and differential expression effects: a generalized linear model (GLM) regression method with interaction effects (reGAIN) and a Fisher test method for correlation differences (dcGAIN). We rank the importance of each gene with complete interaction network centrality (CINC), which integrates each gene's differential co-expression effects in the GAIN model along with each gene's individual differential expression measure. We compare these methods with statistical learning methods Relief-F, Random Forests and Lasso. We also develop a mixture model and permutation approach for determining significant importance score thresholds for network centralities, Relief-F and Random Forest. We introduce a novel simulation strategy that generates microarray case-control data with embedded differential co-expression networks and underlying correlation structure based on scale-free or Erdos-Renyi (ER) random networks. Using the network simulation strategy, we find that Relief-F and reGAIN provide the best balance between detecting interactions and main effects, plus reGAIN has the ability to adjust for covariates and model quantitative traits. The dcGAIN approach performs best at finding differential co-expression effects by design but worst for main effects, and it does not
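
    As an illustration of testing a correlation difference between two phenotypic groups, the sketch below uses the standard Fisher z-transformation test for two independent correlations. The abstract does not give the exact statistic used by dcGAIN, so treating it as this test is an assumption, and the function name `fisher_z_diff` and the input values are hypothetical.

```python
import math

def fisher_z_diff(r1, n1, r2, n2):
    """Two-sided test of H0: rho1 == rho2 for correlations from independent groups.

    r1, r2: sample correlations of one gene pair in group 1 and group 2
    n1, n2: number of samples in each group (each must exceed 3)
    Returns the z statistic and its two-sided normal p-value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher z-transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # equals 2 * (1 - Phi(|z|))
    return z, p

# Hypothetical gene pair: strongly co-expressed in cases, barely in controls.
z, p = fisher_z_diff(0.8, 50, 0.1, 50)
z_same, p_same = fisher_z_diff(0.5, 40, 0.5, 40)
```

    A large |z| for a gene pair would mark it as a candidate differentially co-expressed edge; in a genome-wide setting the p-values would additionally need multiple-testing correction.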

  19. Functional annotation of novel lineage-specific genes using co-expression and promoter analysis

    PubMed Central

    2010-01-01

    Background The diversity of placental architectures within and among mammalian orders is believed to be the result of adaptive evolution. Although the genetic basis for these differences is unknown, some may arise from rapidly diverging and lineage-specific genes. Previously, we identified 91 novel lineage-specific transcripts (LSTs) from a cow term-placenta cDNA library, which are excellent candidates for adaptive placental functions acquired by the ruminant lineage. The aim of the present study was to infer functions of previously uncharacterized lineage-specific genes (LSGs) using co-expression, promoter, pathway and network analysis. Results Clusters of co-expressed genes preferentially expressed in liver, placenta and thymus were found using 49 previously uncharacterized LSTs as seeds. Over-represented composite transcription factor binding sites (TFBS) in promoters of clustered LSGs and known genes were then identified computationally. Functions were inferred for nine previously uncharacterized LSGs using co-expression analysis and pathway analysis tools. Our results predict that these LSGs may function in cell signaling, glycerophospholipid/fatty acid metabolism, protein trafficking, regulatory processes in the nucleus, and processes that initiate parturition and immune system development. Conclusions The placenta is a rich source of lineage-specific genes that function in the adaptive evolution of placental architecture and functions. We have shown that co-expression, promoter, and gene network analyses are useful methods to infer functions of LSGs with heretofore unknown functions. Our results indicate that many LSGs are involved in cellular recognition and developmental processes. Furthermore, they provide guidance for experimental approaches to validate the functions of LSGs and to study their evolution. PMID:20214810

  20. Functional annotation of novel lineage-specific genes using co-expression and promoter analysis.

    PubMed

    Kumar, Charu G; Everts, Robin E; Loor, Juan J; Lewin, Harris A

    2010-03-09

    The diversity of placental architectures within and among mammalian orders is believed to be the result of adaptive evolution. Although the genetic basis for these differences is unknown, some may arise from rapidly diverging and lineage-specific genes. Previously, we identified 91 novel lineage-specific transcripts (LSTs) from a cow term-placenta cDNA library, which are excellent candidates for adaptive placental functions acquired by the ruminant lineage. The aim of the present study was to infer functions of previously uncharacterized lineage-specific genes (LSGs) using co-expression, promoter, pathway and network analysis. Clusters of co-expressed genes preferentially expressed in liver, placenta and thymus were found using 49 previously uncharacterized LSTs as seeds. Over-represented composite transcription factor binding sites (TFBS) in promoters of clustered LSGs and known genes were then identified computationally. Functions were inferred for nine previously uncharacterized LSGs using co-expression analysis and pathway analysis tools. Our results predict that these LSGs may function in cell signaling, glycerophospholipid/fatty acid metabolism, protein trafficking, regulatory processes in the nucleus, and processes that initiate parturition and immune system development. The placenta is a rich source of lineage-specific genes that function in the adaptive evolution of placental architecture and functions. We have shown that co-expression, promoter, and gene network analyses are useful methods to infer functions of LSGs with heretofore unknown functions. Our results indicate that many LSGs are involved in cellular recognition and developmental processes. Furthermore, they provide guidance for experimental approaches to validate the functions of LSGs and to study their evolution.

  1. Identification of PEG-induced water stress responsive transcripts using co-expression network in Eucalyptus grandis.

    PubMed

    Ghosh Dasgupta, Modhumita; Dharanishanthi, Veeramuthu

    2017-09-05

    Ecophysiological studies in Eucalyptus have shown that water is the principal factor limiting stem growth. The effect of water deficit conditions on physiological and biochemical parameters has been extensively reported in Eucalyptus. The present study was conducted to identify major polyethylene glycol induced water stress responsive transcripts in Eucalyptus grandis using a gene co-expression network. A customized array representing 3359 water stress responsive genes was designed to document their expression in leaves of E. grandis cuttings subjected to -0.225 MPa of PEG treatment. The differentially expressed transcripts were documented and significantly co-expressed transcripts were used for construction of the network. The co-expression network was constructed with 915 nodes and 3454 edges with degree ranging from 2 to 45. Ninety-four GO categories and 117 functional pathways were identified in the network. MCODE analysis generated 27 modules and module 6 with 479 nodes and 1005 edges was identified as the biologically relevant network. The major water responsive transcripts represented in the module included dehydrin, osmotin, LEA protein, expansin, arabinogalactans, heat shock proteins, major facilitator proteins, ARM repeat proteins, raffinose synthase, tonoplast intrinsic protein and transcription factors like DREB2A, ARF9, AGL24, UNE12, WLIM1 and MYB66, MYB70, MYB55, MYB16 and MYB103. The coordinated analysis of gene expression patterns and coexpression networks developed in this study identified an array of transcripts that may regulate PEG induced water stress responses in E. grandis.

  2. Mining differential top-k co-expression patterns from time course comparative gene expression datasets

    PubMed Central

    2013-01-01

    Background Frequent pattern mining analysis applied on microarray dataset appears to be a promising strategy for identifying relationships between gene expression levels. Unfortunately, too many itemsets (co-expressed genes) are identified by this analysis method since it does not consider the importance of each gene within biological processes to a cellular response and does not take into account temporal properties under biological treatment-control matched conditions in a microarray dataset. Results We propose a method termed TIIM (Top-k Impactful Itemsets Miner), which only requires specifying a user-defined number k to explore the top k itemsets with the most significantly differentially co-expressed genes between 2 conditions in a time course. To give genes different weights, a table with impact degrees for each gene was constructed based on the number of neighboring genes that are differently expressed in the dataset within gene regulatory networks. Finally, the resulting top-k impactful itemsets were manually evaluated using previous literature and analyzed by a Gene Ontology enrichment method. Conclusions In this study, the proposed method was evaluated in 2 publicly available time course microarray datasets with 2 different experimental conditions. Both datasets identified potential itemsets with co-expressed genes evaluated from the literature and showed higher accuracies compared to the 2 corresponding control methods: i) performing TIIM without considering the gene expression differentiation between 2 different experimental conditions and impact degrees, and ii) performing TIIM with a constant impact degree for each gene. Our proposed method found that several new gene regulations involved in these itemsets were useful for biologists and provided further insights into the mechanisms underpinning biological processes. The Java source code and other related materials used in this study are available at

  3. Construction and application of a co-expression network in Mycobacterium tuberculosis

    PubMed Central

    Jiang, Jun; Sun, Xian; Wu, Wei; Li, Li; Wu, Hai; Zhang, Lu; Yu, Guohua; Li, Yao

    2016-01-01

    Because of its high pathogenicity and infectivity, tuberculosis is a serious threat to human health. Some information about the functions of the genes in the Mycobacterium tuberculosis genome is currently available, but it is not enough to explore transcriptional regulatory mechanisms. Here, we applied the WGCNA (Weighted Gene Correlation Network Analysis) algorithm to mine pooled microarray datasets for the M. tuberculosis H37Rv strain. We constructed a co-expression network that was subdivided into 78 co-expression gene modules. The different responses to two kinds of in vitro models (a constant 0.2% oxygen hypoxia model and a Wayne model) were explained based on these modules. We identified potential transcription factors based on high Pearson’s correlation coefficients between the modules and genes. Three modules that may be associated with hypoxic stimulation were identified, and their potential transcription factors were predicted. In the validation experiment, we determined the expression levels of genes in the modules under hypoxic conditions and under overexpression of potential transcription factors (Rv0081, furA (Rv1909c), Rv0324, Rv3334, and Rv3833). The experimental results showed that the three identified modules were related to hypoxia and that the overexpression of transcription factors could significantly change the expression levels of genes in the corresponding modules. PMID:27328747
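
    The network-construction step of WGCNA can be sketched as follows. This is a minimal illustration of the unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, and of per-gene connectivity; it is not the WGCNA package itself and omits the topological overlap and module detection steps. The gene names, expression values and the choice beta = 6 are assumptions made for the example, not values from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = a
            adj[h][g] = a  # the network is undirected
    return adj

def connectivity(adj):
    """Sum of a gene's adjacencies to all other genes."""
    return {g: sum(neighbours.values()) for g, neighbours in adj.items()}

# Hypothetical expression profiles across five samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
adj = soft_threshold_adjacency(expr)
k = connectivity(adj)
```

    Raising the correlation to a power keeps every edge but pushes weak correlations toward zero, so here the geneA-geneB edge stays close to 1 while the edges to geneC become negligible.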

  4. Dissecting nutrient-related co-expression networks in phosphate starved poplars

    PubMed Central

    Kavka, Mareike; Polle, Andrea

    2017-01-01

    Phosphorus (P) is an essential plant nutrient, but its availability is often limited in soil. Here, we studied changes in the transcriptome and in nutrient element concentrations in leaves and roots of poplars (Populus × canescens) in response to P deficiency. P starvation resulted in decreased concentrations of S and major cations (K, Mg, Ca), in increased concentrations of N, Zn and Al, while C, Fe and Mn were only little affected. In roots and leaves >4,000 and >9,000 genes were differently expressed upon P starvation. These genes clustered in eleven co-expression modules of which seven were correlated with distinct elements in the plant tissues. One module (4.7% of all differentially expressed genes) was strongly correlated with changes in the P concentration in the plant. In this module the GO term “response to P starvation” was enriched with phosphoenolpyruvate carboxylase kinases, phosphatases and pyrophosphatases as well as regulatory domains such as SPX, but no phosphate transporters. The P-related module was also enriched in genes of the functional category “galactolipid synthesis”. Galactolipids substitute phospholipids in membranes under P limitation. Two modules, one correlated with C and N and the other with biomass, S and Mg, were connected with the P-related module by co-expression. In these modules GO terms indicating “DNA modification” and “cell division” as well as “defense” and “RNA modification” and “signaling” were enriched; they contained phosphate transporters. Bark storage proteins were among the most strongly upregulated genes in the growth-related module suggesting that N, which could not be used for growth, accumulated in typical storage compounds. In conclusion, weighted gene coexpression network analysis revealed a hierarchical structure of gene clusters, which separated phosphate starvation responses correlated with P tissue concentrations from other gene modules, which most likely represented

  5. Differential coexpression analysis of obesity-associated networks in human subcutaneous adipose tissue.

    PubMed

    Walley, A J; Jacobson, P; Falchi, M; Bottolo, L; Andersson, J C; Petretto, E; Bonnefond, A; Vaillant, E; Lecoeur, C; Vatin, V; Jernas, M; Balding, D; Petteni, M; Park, Y S; Aitman, T; Richardson, S; Sjostrom, L; Carlsson, L M S; Froguel, P

    2012-01-01

    To use a unique obesity-discordant sib-pair study design to combine differential expression analysis, expression quantitative trait loci (eQTLs) mapping and a coexpression regulatory network approach in subcutaneous human adipose tissue to identify genes relevant to the obese state. Genome-wide transcript expression in subcutaneous human adipose tissue was measured using Affymetrix U133 Plus 2.0 microarrays (Affymetrix, Santa Clara, CA, USA), and genome-wide genotyping data was obtained using an Applied Biosystems (Applied Biosystems; Life Technologies, Carlsbad, CA, USA) SNPlex linkage panel. A total of 154 Swedish families ascertained through an obese proband (body mass index (BMI) >30 kg/m²) with a discordant sibling (BMI >10 kg/m² less than proband). Approximately one-third of the transcripts were differentially expressed between lean and obese siblings. The cellular adhesion molecules (CAMs) KEGG grouping contained the largest number of differentially expressed genes under cis-acting genetic control. By using a novel approach to contrast CAMs coexpression networks between lean and obese siblings, a subset of differentially regulated genes was identified, with the previously GWAS obesity-associated neuronal growth regulator 1 (NEGR1) as a central hub. Independent analysis using mouse data demonstrated that this finding of NEGR1 is conserved across species. Our data suggest that in addition to its reported role in the brain, NEGR1 is also expressed in subcutaneous adipose tissue and acts as a central 'hub' in an obesity-related transcript network.

  6. Protein-protein interaction and gene co-expression maps of ARFs and Aux/IAAs in Arabidopsis

    PubMed Central

    Piya, Sarbottam; Shrestha, Sandesh K.; Binder, Brad; Stewart, C. Neal; Hewezi, Tarek

    2014-01-01

    The phytohormone auxin regulates nearly all aspects of plant growth and development. Based on the current model in Arabidopsis thaliana, Auxin/indole-3-acetic acid (Aux/IAA) proteins repress auxin-inducible genes by inhibiting auxin response transcription factors (ARFs). Experimental evidence suggests that heterodimerization between Aux/IAA and ARF proteins is related to their unique biological functions. The objective of this study was to generate the Aux/IAA-ARF protein-protein interaction map using full length sequences and locate the interacting protein pairs to specific gene co-expression networks in order to define tissue-specific responses of the Aux/IAA-ARF interactome. Pairwise interactions between 19 ARFs and 29 Aux/IAAs resulted in the identification of 213 specific interactions of which 79 interactions were previously unknown. The incorporation of co-expression profiles with protein-protein interaction data revealed a strong correlation of gene co-expression for 70% of the ARF-Aux/IAA interacting pairs in at least one tissue/organ, indicative of the biological significance of these interactions. Importantly, ARF4-8 and 19, which were found to interact with almost all Aux-Aux/IAA, showed broad co-expression relationships with Aux/IAA genes and thus formed the central hubs of the co-expression network. Our analyses provide new insights into the biological significance of ARF-Aux/IAA associations in the morphogenesis and development of various plant tissues and organs. PMID:25566309

  7. Computational, Integrative, and Comparative Methods for the Elucidation of Genetic Coexpression Networks

    DOE PAGES

    Baldwin, Nicole E.; Chesler, Elissa J.; Kirov, Stefan; ...

    2005-01-01

    Gene expression microarray data can be used for the assembly of genetic coexpression network graphs. Using mRNA samples obtained from recombinant inbred Mus musculus strains, it is possible to integrate allelic variation with molecular and higher-order phenotypes. The depth of quantitative genetic analysis of microarray data can be vastly enhanced utilizing this mouse resource in combination with powerful computational algorithms, platforms, and data repositories. The resulting network graphs transect many levels of biological scale. This approach is illustrated with the extraction of cliques of putatively co-regulated genes and their annotation using gene ontology analysis and cis-regulatory element discovery. The causal basis for co-regulation is detected through the use of quantitative trait locus mapping.

  8. Co-expression networks in generation of induced pluripotent stem cells

    PubMed Central

    Paul, Sharan; Pflieger, Lance; Dansithong, Warunee; Figueroa, Karla P.; Gao, Fuying; Coppola, Giovanni; Pulst, Stefan M.

    2016-01-01

    We developed an adenoviral vector, in which Yamanaka's four reprogramming factors (RFs) were controlled by individual CMV promoters in a single cassette (Ad-SOcMK). This permitted coordinated expression of RFs (SOX2, OCT3/4, c-MYC and KLF4) in a cell for a transient period of time, synchronizing the reprogramming process with the majority of transduced cells assuming induced pluripotent stem cell (iPSC)-like characteristics as early as three days post-transduction. These reprogrammed cells resembled human embryonic stem cells (ESCs) with regard to morphology and biomarker expression, and could be differentiated into cells of the germ layers in vitro and in vivo. These iPSC-like cells, however, failed to expand into larger iPSC colonies. The short and synchronized reprogramming process allowed us to study global transcription changes within short time intervals. Weighted gene co-expression network analysis (WGCNA) identified sixteen large gene co-expression modules, each including members of gene ontology categories involved in cell differentiation and development. In particular, the brown module contained a significant number of ESC marker genes, whereas the turquoise module contained cell-cycle-related genes that were downregulated in contrast to upregulation in human ESCs. Strong coordinated expression of all four RFs via adenoviral transduction may constrain stochastic processes and lead to silencing of genes important for cellular proliferation. PMID:26892236

  9. Co-expression networks in generation of induced pluripotent stem cells.

    PubMed

    Paul, Sharan; Pflieger, Lance; Dansithong, Warunee; Figueroa, Karla P; Gao, Fuying; Coppola, Giovanni; Pulst, Stefan M

    2016-02-18

    We developed an adenoviral vector, in which Yamanaka's four reprogramming factors (RFs) were controlled by individual CMV promoters in a single cassette (Ad-SOcMK). This permitted coordinated expression of RFs (SOX2, OCT3/4, c-MYC and KLF4) in a cell for a transient period of time, synchronizing the reprogramming process with the majority of transduced cells assuming induced pluripotent stem cell (iPSC)-like characteristics as early as three days post-transduction. These reprogrammed cells resembled human embryonic stem cells (ESCs) with regard to morphology and biomarker expression, and could be differentiated into cells of the germ layers in vitro and in vivo. These iPSC-like cells, however, failed to expand into larger iPSC colonies. The short and synchronized reprogramming process allowed us to study global transcription changes within short time intervals. Weighted gene co-expression network analysis (WGCNA) identified sixteen large gene co-expression modules, each including members of gene ontology categories involved in cell differentiation and development. In particular, the brown module contained a significant number of ESC marker genes, whereas the turquoise module contained cell-cycle-related genes that were downregulated in contrast to upregulation in human ESCs. Strong coordinated expression of all four RFs via adenoviral transduction may constrain stochastic processes and lead to silencing of genes important for cellular proliferation.

  10. Co-expression network analyses identify functional modules associated with development and stress response in Gossypium arboreum

    PubMed Central

    You, Qi; Zhang, Liwei; Yi, Xin; Zhang, Kang; Yao, Dongxia; Zhang, Xueyan; Wang, Qianhua; Zhao, Xinhua; Ling, Yi; Xu, Wenying; Li, Fuguang; Su, Zhen

    2016-01-01

    Cotton is an economically important crop, essential for the agriculture and textile industries. Through integrating transcriptomic data, we discovered that multi-dimensional co-expression network analysis was powerful for predicting cotton gene functions and functional modules. Here, the recently available transcriptomic data on Gossypium arboreum, including data on multiple growth stages of tissues and stress treatment samples were applied to construct a co-expression network exploring multi-dimensional expression (development and stress) through multi-layered approaches. Based on differential gene expression and network analysis, a fibre development regulatory module of the gene GaKNL1 was found to regulate the second cell wall through repressing the activity of REVOLUTA, and a tissue-selective module of GaJAZ1a was examined in response to water stress. Moreover, comparative genomics analysis of the JAZ1-related regulatory module revealed high conservation across plant species. In addition, 1155 functional modules were identified through integrating the co-expression network, module classification and function enrichment tools, which cover functions such as metabolism, stress responses, and transcriptional regulation. In the end, an online platform was built for network analysis (http://structuralbiology.cau.edu.cn/arboreum), which could help to refine the annotation of cotton gene function and establish a data mining system to identify functional genes or modules with important agronomic traits. PMID:27922095

  11. Ligand Similarity Complements Sequence, Physical Interaction, and Co-Expression for Gene Function Prediction

    PubMed Central

    Shoichet, Brian K.; Gillis, Jesse

    2016-01-01

    The expansion of protein-ligand annotation databases has enabled large-scale networking of proteins by ligand similarity. These ligand-based protein networks, which implicitly predict the ability of neighboring proteins to bind related ligands, may complement biologically-oriented gene networks, which are used to predict functional or disease relevance. To quantify the degree to which such ligand-based protein associations might complement functional genomic associations, including sequence similarity, physical protein-protein interactions, co-expression, and disease gene annotations, we calculated a network based on the Similarity Ensemble Approach (SEA: sea.docking.org), where protein neighbors reflect the similarity of their ligands. We also measured the similarity with functional genomic networks over a common set of 1,131 genes, and found that the networks had only small overlaps, which were significant only due to the large scale of the data. Consistent with the view that the networks contain different information, combining them substantially improved Molecular Function prediction within GO (from AUROC~0.63–0.75 for the individual data modalities to AUROC~0.8 in the aggregate). We investigated the boost in guilt-by-association gene function prediction when the networks are combined and describe underlying properties that can be further exploited. PMID:27467773

  12. KaPPA-View4: a metabolic pathway database for representation and analysis of correlation networks of gene co-expression and metabolite co-accumulation and omics data

    PubMed Central

    Sakurai, Nozomu; Ara, Takeshi; Ogata, Yoshiyuki; Sano, Ryosuke; Ohno, Takashi; Sugiyama, Kenjiro; Hiruta, Atsushi; Yamazaki, Kiyoshi; Yano, Kentaro; Aoki, Koh; Aharoni, Asaph; Hamada, Kazuki; Yokoyama, Koji; Kawamura, Shingo; Otsuka, Hirofumi; Tokimatsu, Toshiaki; Kanehisa, Minoru; Suzuki, Hideyuki; Saito, Kazuki; Shibata, Daisuke

    2011-01-01

    Correlations of gene-to-gene co-expression and metabolite-to-metabolite co-accumulation calculated from large amounts of transcriptome and metabolome data are useful for uncovering unknown functions of genes, functional diversities of gene family members and regulatory mechanisms of metabolic pathway flows. Many databases and tools are available to interpret quantitative transcriptome and metabolome data, but only a few connect correlation data to biological knowledge and can be used to find its biological significance. We report here a new metabolic pathway database, KaPPA-View4 (http://kpv.kazusa.or.jp/kpv4/), which is able to overlay gene-to-gene and/or metabolite-to-metabolite relationships as curves on a metabolic pathway map, or on a combination of up to four maps. This representation would help to discover, for example, novel functions of a transcription factor that regulates genes on a metabolic pathway. Pathway maps of the Kyoto Encyclopedia of Genes and Genomes (KEGG) and maps generated from their gene classifications are available at KaPPA-View4 KEGG version (http://kpv.kazusa.or.jp/kpv4-kegg/). At present, gene co-expression data from the databases ATTED-II, COXPRESdb, CoP and MiBASE for human, mouse, rat, Arabidopsis, rice, tomato and other plants are available. PMID:21097783

  13. Divergent and convergent modes of interaction between wheat and Puccinia graminis f. sp. tritici isolates revealed by the comparative gene co-expression network and genome analyses

    USDA-ARS's Scientific Manuscript database

    Two opposing evolutionary constraints exert pressure on pathogens: one to diversify virulence factors in order to evade host defenses, and the other to retain virulence factors critical for maintaining a compatible interaction. To better understand how the diversified arsenals of fungal genes promot...

  14. Recursive Indirect-Paths Modularity (RIP-M) for Detecting Community Structure in RNA-Seq Co-expression Networks.

    PubMed

    Rahmani, Bahareh; Zimmermann, Michael T; Grill, Diane E; Kennedy, Richard B; Oberg, Ann L; White, Bill C; Poland, Gregory A; McKinney, Brett A

    2016-01-01

    Clusters of genes in co-expression networks are commonly used as functional units for gene set enrichment detection and increasingly as features (attribute construction) for statistical inference and sample classification. One of the practical challenges of clustering for these purposes is to identify an optimal partition of the network where the individual clusters are neither too large, prohibiting interpretation, nor too small, precluding general inference. Newman Modularity is a spectral clustering algorithm that automatically finds the number of clusters, but for many biological networks the cluster sizes are suboptimal. In this work, we generalize Newman Modularity to incorporate information from indirect paths in RNA-Seq co-expression networks. We implement a merge-and-split algorithm that allows the user to constrain the range of cluster sizes: large enough to capture genes in relevant pathways, yet small enough to resolve distinct functions. We investigate the properties of our recursive indirect-pathways modularity (RIP-M) and compare it with other clustering methods using simulated co-expression networks and RNA-seq data from an influenza vaccine response study. RIP-M had higher cluster assignment accuracy than Newman Modularity for finding clusters in simulated co-expression networks for all scenarios, and RIP-M had comparable accuracy to Weighted Gene Correlation Network Analysis (WGCNA). RIP-M was more accurate than WGCNA for modest hard thresholds and comparable for high, while WGCNA was slightly more accurate for soft thresholds. In the vaccine study data, RIP-M and WGCNA enriched for a comparable number of immunologically relevant pathways.
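
    For orientation, the quantity that Newman Modularity optimizes can be computed directly for a given partition. The sketch below evaluates the standard modularity score Q of an unweighted, undirected graph as the sum over communities of (L_c / m) - (d_c / 2m)^2, where m is the number of edges, L_c the number of edges inside community c and d_c the total degree of its nodes. It does not implement the spectral optimization or the RIP-M extension described above, and the toy graph and labels are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an unweighted, undirected graph.

    edges: list of (u, v) pairs, each undirected edge listed once
    community: dict mapping every node to a community label
    """
    m = len(edges)
    degree = {}
    intra = {}  # number of edges with both endpoints in the same community
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            intra[community[u]] = intra.get(community[u], 0) + 1
    total_degree = {}  # summed degree of the nodes in each community
    for node, k in degree.items():
        c = community[node]
        total_degree[c] = total_degree.get(c, 0) + k
    return sum(
        intra.get(c, 0) / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )

# Hypothetical toy network: two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_clusters = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
one_cluster = {node: 1 for node in two_clusters}

q_two = modularity(edges, two_clusters)
q_one = modularity(edges, one_cluster)
```

    Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the behaviour a modularity-based method exploits; constraints on cluster size, as in the merge-and-split step described above, would have to be added on top.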

  15. Metabolic and co-expression network-based analyses associated with nitrate response in rice.

    PubMed

    Coneva, Viktoriya; Simopoulos, Caitlin; Casaretto, José A; El-Kereamy, Ashraf; Guevara, David R; Cohn, Jonathan; Zhu, Tong; Guo, Lining; Alexander, Danny C; Bi, Yong-Mei; McNicholas, Paul D; Rothstein, Steven J

    2014-12-03

    Understanding gene expression and metabolic re-programming that occur in response to limiting nitrogen (N) conditions in crop plants is crucial for the ongoing progress towards the development of varieties with improved nitrogen use efficiency (NUE). To unravel new details on the molecular and metabolic responses to N availability in a major food crop, we conducted analyses on a weighted gene co-expression network and metabolic profile data obtained from leaves and roots of rice plants adapted to sufficient and limiting N as well as after shifting them to limiting (reduction) and sufficient (induction) N conditions. A gene co-expression network representing clusters of rice genes with similar expression patterns across four nitrogen conditions and two tissue types was generated. The resulting 18 clusters were analyzed for enrichment of significant gene ontology (GO) terms. Four clusters exhibited significant correlation with limiting and reducing nitrate treatments. The enriched GO terms identified include those related to nucleoside/nucleotide, purine and ATP binding, defense response, sugar/carbohydrate binding, protein kinase activities, cell-death and cell wall enzymatic activity. Although a subset of functional categories are more broadly associated with the response of rice organs to limiting N and N reduction, our analyses suggest that N reduction elicits a response distinguishable from that to adaptation to limiting N, particularly in leaves. This observation is further supported by metabolic profiling which shows that several compounds in leaves change proportionally to the nitrate level (i.e. higher in sufficient N vs. limiting N) and respond with even higher levels when the nitrate level is reduced. Notably, these compounds are directly involved in N assimilation, transport, and storage (glutamine, asparagine, glutamate and allantoin) and extend to most amino acids. Based on these data, we hypothesize that plants respond by rapidly mobilizing

  16. Analysis of the dynamic co-expression network of heart regeneration in the zebrafish

    NASA Astrophysics Data System (ADS)

    Rodius, Sophie; Androsova, Ganna; Götz, Lou; Liechti, Robin; Crespo, Isaac; Merz, Susanne; Nazarov, Petr V.; de Klein, Niek; Jeanty, Céline; González-Rosa, Juan M.; Muller, Arnaud; Bernardin, Francois; Niclou, Simone P.; Vallar, Laurent; Mercader, Nadia; Ibberson, Mark; Xenarios, Ioannis; Azuaje, Francisco

    2016-05-01

    The zebrafish has the capacity to regenerate its heart after severe injury. While the function of a few genes during this process has been studied, we are far from fully understanding how genes interact to coordinate heart regeneration. To enable systematic insights into this phenomenon, we generated and integrated a dynamic co-expression network of heart regeneration in the zebrafish and linked systems-level properties to the underlying molecular events. Across multiple post-injury time points, the network displays topological attributes of biological relevance. We show that regeneration steps are mediated by modules of transcriptionally coordinated genes, and by genes acting as network hubs. We also established direct associations between hubs and validated drivers of heart regeneration with murine and human orthologs. The resulting models and interactive analysis tools are available at http://infused.vital-it.ch. Using a worked example, we demonstrate the usefulness of this unique open resource for hypothesis generation and in silico screening for genes involved in heart regeneration.

  17. Analysis of the dynamic co-expression network of heart regeneration in the zebrafish

    PubMed Central

    Rodius, Sophie; Androsova, Ganna; Götz, Lou; Liechti, Robin; Crespo, Isaac; Merz, Susanne; Nazarov, Petr V.; de Klein, Niek; Jeanty, Céline; González-Rosa, Juan M.; Muller, Arnaud; Bernardin, Francois; Niclou, Simone P.; Vallar, Laurent; Mercader, Nadia; Ibberson, Mark; Xenarios, Ioannis; Azuaje, Francisco

    2016-01-01

    The zebrafish has the capacity to regenerate its heart after severe injury. While the function of a few genes during this process has been studied, we are far from fully understanding how genes interact to coordinate heart regeneration. To enable systematic insights into this phenomenon, we generated and integrated a dynamic co-expression network of heart regeneration in the zebrafish and linked systems-level properties to the underlying molecular events. Across multiple post-injury time points, the network displays topological attributes of biological relevance. We show that regeneration steps are mediated by modules of transcriptionally coordinated genes, and by genes acting as network hubs. We also established direct associations between hubs and validated drivers of heart regeneration with murine and human orthologs. The resulting models and interactive analysis tools are available at http://infused.vital-it.ch. Using a worked example, we demonstrate the usefulness of this unique open resource for hypothesis generation and in silico screening for genes involved in heart regeneration. PMID:27241320

  18. ccNET: Database of co-expression networks with functional modules for diploid and polyploid Gossypium

    PubMed Central

    You, Qi; Xu, Wenying; Zhang, Kang; Zhang, Liwei; Yi, Xin; Yao, Dongxia; Wang, Chunchao; Zhang, Xueyan; Zhao, Xinhua; Provart, Nicholas J.; Li, Fuguang; Su, Zhen

    2017-01-01

    Plant genera with both diploid and polyploid species are a common evolutionary occurrence. Polyploids, especially allopolyploids such as cotton and wheat, are a great model system for heterosis research. Here, we have integrated genome sequences and transcriptome data of Gossypium species to construct co-expression networks and identified functional modules from different cotton species, including 1155 and 1884 modules in G. arboreum and G. hirsutum, respectively. We overlaid the gene expression results onto the co-expression network. We further provided network comparison analysis for orthologous genes across the diploid and allotetraploid Gossypium. We also constructed miRNA-target networks and predicted PPI networks for both cotton species. Furthermore, we integrated in-house ChIP-seq data of histone modification (H3K4me3) together with cis-element analysis and gene set enrichment analysis tools for studying possible gene regulatory mechanisms in Gossypium species. Finally, we have constructed an online ccNET database (http://structuralbiology.cau.edu.cn/gossypium) for comparative gene functional analyses at a multi-dimensional network and epigenomic level across diploid and polyploid Gossypium species. The ccNET database will help the community gain novel insights into gene/module functions during cotton development and stress response, and might be useful for studying conservation and diversity in other polyploid plants, such as T. aestivum and Brassica napus. PMID:28053168

  19. Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation

    PubMed Central

    Li, Wenyuan; Liu, Chun-Chi; Zhang, Tong; Li, Haifeng; Waterman, Michael S.; Zhou, Xianghong Jasmine

    2011-01-01

    The rapid accumulation of biological networks poses new challenges and calls for powerful integrative analysis tools. Most existing methods capable of simultaneously analyzing a large number of networks were primarily designed for unweighted networks, and cannot easily be extended to weighted networks. However, it is known that transforming weighted into unweighted networks by dichotomizing the edges of weighted networks with a threshold generally leads to information loss. We have developed a novel, tensor-based computational framework for mining recurrent heavy subgraphs in a large set of massive weighted networks. Specifically, we formulate the recurrent heavy subgraph identification problem as a heavy 3D subtensor discovery problem with sparse constraints. We describe an effective approach to solving this problem by designing a multi-stage, convex relaxation protocol, and a non-uniform edge sampling technique. We applied our method to 130 co-expression networks, and identified 11,394 recurrent heavy subgraphs, grouped into 2,810 families. We demonstrated that the identified subgraphs represent meaningful biological modules by validating against a large set of compiled biological knowledge bases. We also showed that the likelihood for a heavy subgraph to be meaningful increases significantly with its recurrence in multiple networks, highlighting the importance of the integrative approach to biological network analysis. Moreover, our approach based on weighted graphs detects many patterns that would be overlooked using unweighted graphs. In addition, we identified a large number of modules that occur predominately under specific phenotypes. This analysis resulted in a genome-wide mapping of gene network modules onto the phenome. Finally, by comparing module activities across many datasets, we discovered high-order dynamic cooperativeness in protein complex networks and transcriptional regulatory networks. PMID:21698123
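
    The information loss from dichotomizing edges can be illustrated with a small sketch. It is not the authors' tensor method (no convex relaxation or edge sampling is attempted); it only stacks a few weighted networks into a list that plays the role of a 3D tensor and compares the summed edge weight ("heaviness") of one gene set before and after thresholding. The function names, the toy weights and the 0.5 threshold are all assumptions made for illustration.

```python
from itertools import combinations


def heaviness(network, genes):
    """Sum of edge weights among the given gene indices in one network."""
    return sum(network[i][j] for i, j in combinations(genes, 2))


def dichotomize(network, threshold):
    """Turn a weighted adjacency matrix into a 0/1 matrix (hypothetical cutoff)."""
    return [[1.0 if w >= threshold else 0.0 for w in row] for row in network]


def recurrence(tensor, genes, min_heaviness):
    """Count the networks in which the gene set reaches min_heaviness."""
    return sum(1 for net in tensor if heaviness(net, genes) >= min_heaviness)


def uniform_network(weight, n=3):
    """Toy network: every off-diagonal edge carries the same weight."""
    return [[0.0 if i == j else weight for j in range(n)] for i in range(n)]


# Three toy co-expression networks over the same three genes.
tensor = [uniform_network(0.45), uniform_network(0.40), uniform_network(0.05)]
module = [0, 1, 2]

weighted = [heaviness(net, module) for net in tensor]
binary = [heaviness(dichotomize(net, 0.5), module) for net in tensor]
```

    In this toy case the weighted view separates the two networks with moderately strong edges from the weak one, while the thresholded view scores all three as empty.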

  20. An integrative approach predicted co-expression sub-networks regulating properties of stem cells and differentiation.

    PubMed

    Sahu, Mousumi; Mallick, Bibekanand

    2016-10-01

    The differentiation of human Embryonic Stem Cells (hESCs) is accompanied by the formation of different intermediary cells, gradually losing its stemness and acquiring differentiation. The precise mechanisms underlying hESCs integrity and its differentiation into fibroblast (Fib) are still elusive. Here, we aimed to assess important genes and co-expression sub-networks responsible for stemness, early differentiation of hESCs into embryoid bodies (EBs) and its lineage specification into Fibs. To achieve this, we compared transcriptional profiles of hESCs-EBs and EBs-Fibs and obtained differentially expressed genes (DEGs) exclusive to hESCs-EBs (early differentiation), EBs-Fibs (late differentiation) and common DEGs in hESCs-EBs and EBs-Fibs. Then, we performed gene set enrichment analysis (GSEA) followed by overrepresentation study and identified key genes for each gene category. The regulations of these genes were studied by integrating ChIP-Seq data of core transcription factors (TFs) and histone methylation marks in hESCs. Finally, we identified co-expression sub-networks from key genes of each gene category using k-clique sub-network extraction method. Our study predicted seven genes dictating core stemness properties forming a co-expression network. From the pathway analysis of sub-networks of hESCs-EBs, we hypothesize that FGF2 is contributing to pluripotent transcription network of hESCs in association with DNMT3B and JARID2 thereby facilitating cell proliferation. On the contrary, FGF2 is found to promote cell migration in Fibs along with DDR2, CAV1, DAB2, and PARVA. Moreover, our study identified three k-clique sub-networks regulating TGF-β signaling pathway thereby promoting EBs to Fibs differentiation by: (i) modulating extracellular matrix involving ITGB1, TGFB1I1 and GBP1, (ii) regulating cell cycle remodeling involving CDKN1A, JUNB and DUSP1 and (iii) helping in epithelial to mesenchymal transition (EMT) involving THBS1, INHBA and LOX. This study put

  1. Co-expression network analysis of Down's syndrome based on microarray data

    PubMed Central

    Zhao, Jianping; Zhang, Zhengguo; Ren, Shumin; Zong, Yanan; Kong, Xiangdong

    2016-01-01

    Down's syndrome (DS) is a type of chromosome disease. The present study aimed to explore the underlying molecular mechanisms of DS. GSE5390 microarray data downloaded from the Gene Expression Omnibus database was used to identify differentially expressed genes (DEGs) in DS. Pathway enrichment analysis of the DEGs was performed, followed by co-expression network construction. Significant differential modules were mined by mutual information, followed by functional analysis. The accuracy of sample classification for the significant differential modules of DEGs was evaluated by leave-one-out cross-validation. A total of 997 DEGs, including 638 upregulated and 359 downregulated genes, were identified. Upregulated DEGs were enriched in 15 pathways, such as cell adhesion molecules, whereas downregulated DEGs were enriched in maturity onset diabetes of the young. Three significant differential modules with the highest discriminative scores (mutual information > 0.35) were selected from a co-expression network. The classification accuracy of GSE16677 expression profile samples was 54.55% and 72.73% when characterized by 12 DEGs and 3 significant differential modules, respectively. Genes in significant differential modules were significantly enriched in 5 functions, including the endoplasmic reticulum (P=0.018) and regulation of apoptosis (P=0.061). The identified DEGs, in particular the 12 DEGs in the significant differential modules, such as B-cell lymphoma 2-associated transcription factor 1, heat shock protein 90 kDa beta member 1, UBX domain-containing protein 2 and transmembrane protein 50B, may serve important roles in the pathogenesis of DS. PMID:27588071
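
    As a rough illustration of a mutual-information discriminative score, the sketch below computes the mutual information (in bits) between a discretized module activity level and the sample class. How the study discretized module activity and which logarithm base it used are not stated in the abstract, so the "high"/"low" levels, the labels and the base-2 logarithm are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information in bits between two equally long label sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# Hypothetical samples: class label and discretized activity of two modules.
labels = ["DS", "DS", "control", "control"]
module_a = ["high", "high", "low", "low"]   # tracks the class perfectly
module_b = ["high", "low", "high", "low"]   # unrelated to the class

score_a = mutual_information(module_a, labels)
score_b = mutual_information(module_b, labels)
```

    A module whose activity follows the class labels gets a high score, while one that varies independently of them scores zero, which is the sense in which such a score can rank modules by discriminative power.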

  2. A Gene Recommender Algorithm to Identify Coexpressed Genes in C. elegans

    PubMed Central

    Owen, Art B.; Stuart, Josh; Mach, Kathy; Villeneuve, Anne M.; Kim, Stuart

    2003-01-01

    One of the most important uses of whole-genome expression data is for the discovery of new genes with similar function to a given list of genes (the query) already known to have closely related function. We have developed an algorithm, called the gene recommender, that ranks genes according to how strongly they correlate with a set of query genes in those experiments for which the query genes are most strongly coregulated. We used the gene recommender to find other genes coexpressed with several sets of query genes, including genes known to function in the retinoblastoma complex. Genetic experiments confirmed that one gene (JC8.6) identified by the gene recommender acts with lin-35 Rb to regulate vulval cell fates, and that another gene (wrm-1) acts antagonistically. We find that the gene recommender returns lists of genes with better precision, for fixed levels of recall, than lists generated using the C. elegans expression topomap. PMID:12902378
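
    The two-step idea described above (pick the experiments in which the query genes move together, then rank other genes by correlation with the query in those experiments) can be sketched as follows. This is a simplified stand-in, not the published gene recommender scoring: the coregulation score (absolute mean over spread), the fixed number of selected experiments and all gene names are assumptions.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equally long numeric lists."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


def recommend(expr, query, n_experiments):
    """Rank non-query genes by correlation with the query centroid,
    using only the experiments where the query genes agree most."""
    n = len(next(iter(expr.values())))

    def coregulation(j):
        vals = [expr[g][j] for g in query]
        # Large shared shift with little spread = tightly coregulated.
        return abs(mean(vals)) / (pstdev(vals) + 1e-9)

    chosen = sorted(sorted(range(n), key=coregulation, reverse=True)[:n_experiments])
    centroid = [mean(expr[g][j] for g in query) for j in chosen]
    scores = {
        g: pearson([v[j] for j in chosen], centroid)
        for g, v in expr.items()
        if g not in query
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression values: genes by five experiments.
expr = {
    "q1": [2.0, -1.5, 0.1, 1.8, -0.3],
    "q2": [2.2, -1.4, -0.9, 1.6, 0.8],
    "candidate_a": [1.9, -1.2, 0.5, 1.7, 0.0],
    "candidate_b": [-2.0, 1.3, 0.2, -1.5, 0.4],
}
ranking = recommend(expr, ["q1", "q2"], n_experiments=3)
```

    Restricting the correlation to the selected experiments is the design choice that matters here: experiments in which the query genes disagree contribute only noise to the ranking.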

  3. Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering

    NASA Technical Reports Server (NTRS)

    Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland

    2000-01-01

    Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review data mining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.

  4. Gene coexpression as Hebbian learning in prokaryotic genomes.

    PubMed

    Vey, Gregory

    2013-12-01

    Biological interaction networks represent a powerful tool for characterizing intracellular functional relationships, such as transcriptional regulation and protein interactions. Although artificial neural networks are routinely employed for a broad range of applications across computational biology, their underlying connectionist basis has not been extensively applied to modeling biological interaction networks. In particular, the Hopfield network offers nonlinear dynamics that represent the minimization of a system energy function through temporally distinct rewiring events. Here, a scaled energy minimization model is presented to test the feasibility of deriving a composite biological interaction network from multiple constituent data sets using the Hebbian learning principle. The performance of the scaled energy minimization model is compared against the standard Hopfield model using simulated data. Several networks are also derived from real data, compared to one another, and then combined to produce an aggregate network. The utility and limitations of the proposed model are discussed, along with possible implications for a genomic learning analogy where the fundamental Hebbian postulate is rendered into its genomic equivalent: Genes that function together junction together.
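
    For orientation, the standard Hopfield formulation that the abstract uses as its baseline can be sketched in a few lines. This is the textbook Hebbian rule and energy function, not the scaled energy minimization model proposed in the paper; treating each row as a binarized (+1/-1) expression state of three genes is an assumption made for illustration.

```python
def hebbian_weights(patterns):
    """Hebbian rule: w[i][j] is the average of s_i * s_j over all patterns,
    with no self-connections."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w


def energy(w, state):
    """Hopfield energy E = -1/2 * sum_ij w_ij * s_i * s_j."""
    n = len(state)
    return -0.5 * sum(
        w[i][j] * state[i] * state[j] for i in range(n) for j in range(n)
    )


# Hypothetical binarized states of three genes in four data sets
# (+1 = expressed above baseline, -1 = below).
patterns = [[1, 1, -1], [1, 1, -1], [-1, -1, 1], [1, -1, -1]]
w = hebbian_weights(patterns)
```

    Genes 0 and 1 usually share a state and therefore receive a positive weight, while gene 2 is always in the opposite state to gene 0 and receives a weight of -1; states consistent with these weights have lower energy.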

  5. Coexpression networks implicate human midfetal deep cortical projection neurons in the pathogenesis of autism

    PubMed Central

    Willsey, A. Jeremy; Sanders, Stephan J.; Li, Mingfeng; Dong, Shan; Tebbenkamp, Andrew T.; Muhle, Rebecca A.; Reilly, Steven K.; Lin, Leon; Fertuzinhos, Sofia; Miller, Jeremy A.; Murtha, Michael T.; Bichsel, Candace; Niu, Wei; Cotney, Justin; Ercan-Sencicek, A. Gulhan; Gockley, Jake; Gupta, Abha; Han, Wenqi; He, Xin; Hoffman, Ellen; Klei, Lambertus; Lei, Jing; Liu, Wenzhong; Liu, Li; Lu, Cong; Xu, Xuming; Zhu, Ying; Mane, Shrikant M.; Lein, Edward S.; Wei, Liping; Noonan, James P.; Roeder, Kathryn; Devlin, Bernie; Šestan, Nenad; State, Matthew W.

    2013-01-01

    Autism spectrum disorder (ASD) is a complex developmental syndrome of unknown etiology. Recent studies employing exome- and genome-wide sequencing have identified nine high-confidence ASD (hcASD) genes. Working from the hypothesis that ASD-associated mutations in these biologically pleiotropic genes will disrupt intersecting developmental processes to contribute to a common phenotype, we have attempted to identify time periods, brain regions, and cell types in which these genes converge. We have constructed coexpression networks based on the hcASD “seed” genes, leveraging a rich expression data set encompassing multiple human brain regions across human development and into adulthood. By assessing enrichment of an independent set of probable ASD (pASD) genes, derived from the same sequencing studies, we demonstrate a key point of convergence in midfetal layer 5/6 cortical projection neurons. This approach informs when, where, and in what cell types mutations in these specific genes may be productively studied to clarify ASD pathophysiology. PMID:24267886

  6. Correlated mRNAs and miRNAs from co-expression and regulatory networks affect porcine muscle and finally meat properties.

    PubMed

    Ponsuksili, Siriluck; Du, Yang; Hadlich, Frieder; Siengdee, Puntita; Murani, Eduard; Schwerin, Manfred; Wimmers, Klaus

    2013-08-05

    Physiological processes aiding the conversion of muscle to meat involve many genes associated with muscle structure and metabolic processes. MicroRNAs regulate networks of genes to orchestrate cellular functions, in turn regulating phenotypes. We applied weighted gene co-expression network analysis to identify co-expression modules that correlated to meat quality phenotypes and were highly enriched for genes involved in glucose metabolism, response to wounding, mitochondrial ribosome, mitochondrion, and extracellular matrix. Negative correlation of miRNA with mRNA and target prediction were used to select transcripts out of the modules of trait-associated mRNAs to further identify those genes that are correlated with post mortem traits. Porcine muscle co-expression transcript networks that correlated to post mortem traits were identified. The integration of miRNA and mRNA expression analyses, as well as network analysis, enabled us to interpret the differentially-regulated genes from a systems perspective. Linking co-expression networks of transcripts and hierarchically organized pairs of miRNAs and mRNAs to meat properties yields new insight into several biological pathways underlying phenotype differences. These pathways may also be diagnostic for many myopathies, which are accompanied by deficient nutrient and oxygen supply of muscle fibers.

  7. Connecting genes, coexpression modules, and molecular signatures to environmental stress phenotypes in plants

    SciTech Connect

    Weston, David; Gunter, Lee E; Rogers, Alistair; Wullschleger, Stan D

    2008-01-01

    Background: One of the eminent opportunities afforded by modern genomic technologies is the potential to provide a mechanistic understanding of the processes by which genetic change translates to phenotypic variation and the resultant appearance of distinct physiological traits. Indeed much progress has been made in this area, particularly in biomedicine where functional genomic information can be used to determine the physiological state (e.g., diagnosis) and predict phenotypic outcome (e.g., patient survival). Ecology currently lacks an analogous approach where genomic information can be used to diagnose the presence of a given physiological state (e.g., stress response) and then predict likely phenotypic outcomes (e.g., stress duration and tolerance, fitness). Results: Here, we demonstrate that a compendium of genomic signatures can be used to classify the plant abiotic stress phenotype in Arabidopsis according to the architecture of the transcriptome, and then be linked with gene coexpression network analysis to determine the underlying genes governing the phenotypic response. Using this approach, we confirm the existence of known stress responsive pathways and marker genes, report a common abiotic stress responsive transcriptome and relate phenotypic classification to stress duration. Conclusion: Linking genomic signatures to gene coexpression analysis provides a unique method of relating an observed plant phenotype to changes in gene expression that underlie that phenotype. Such information is critical to current and future investigations in plant biology and, in particular, to evolutionary ecology, where a mechanistic understanding of adaptive physiological responses to abiotic stress can provide researchers with a tool of great predictive value in understanding species and population level adaptation to climate change.

  8. Connecting genes, coexpression modules, and molecular signatures to environmental stress phenotypes in plants

    PubMed Central

    Weston, David J; Gunter, Lee E; Rogers, Alistair; Wullschleger, Stan D

    2008-01-01

    Background: One of the eminent opportunities afforded by modern genomic technologies is the potential to provide a mechanistic understanding of the processes by which genetic change translates to phenotypic variation and the resultant appearance of distinct physiological traits. Indeed much progress has been made in this area, particularly in biomedicine where functional genomic information can be used to determine the physiological state (e.g., diagnosis) and predict phenotypic outcome (e.g., patient survival). Ecology currently lacks an analogous approach where genomic information can be used to diagnose the presence of a given physiological state (e.g., stress response) and then predict likely phenotypic outcomes (e.g., stress duration and tolerance, fitness). Results: Here, we demonstrate that a compendium of genomic signatures can be used to classify the plant abiotic stress phenotype in Arabidopsis according to the architecture of the transcriptome, and then be linked with gene coexpression network analysis to determine the underlying genes governing the phenotypic response. Using this approach, we confirm the existence of known stress responsive pathways and marker genes, report a common abiotic stress responsive transcriptome and relate phenotypic classification to stress duration. Conclusion: Linking genomic signatures to gene coexpression analysis provides a unique method of relating an observed plant phenotype to changes in gene expression that underlie that phenotype. Such information is critical to current and future investigations in plant biology and, in particular, to evolutionary ecology, where a mechanistic understanding of adaptive physiological responses to abiotic stress can provide researchers with a tool of great predictive value in understanding species and population level adaptation to climate change. PMID:18248680

  9. Identification of breast cancer candidate genes using gene co-expression and protein-protein interaction information.

    PubMed

    Yue, Zhenyu; Li, Hai-Tao; Yang, Yabing; Hussain, Sajid; Zheng, Chun-Hou; Xia, Junfeng; Chen, Yan

    2016-06-14

    Breast cancer (BC) is one of the most common malignancies that could threaten female health. As the molecular mechanism of BC has not yet been completely discovered, identification of related genes of this disease is an important area of research that could provide new insights into gene function as well as potential treatment targets. Here we used subnetwork extraction algorithms to identify novel BC related genes based on the known BC genes (seed genes), gene co-expression profiles and protein-protein interaction network. We computationally predicted seven key genes (EPHX2, GHRH, PPYR1, ALPP, KNG1, GSK3A and TRIT1) as putative genes of BC. Further analysis shows that six of these have been reported as breast cancer associated genes, and one (PPYR1) as a cancer associated gene. Lastly, we developed an expression signature using these seven key genes which significantly stratified 1660 BC patients according to relapse free survival (hazard ratio [HR], 0.55; 95% confidence interval [CI], 0.46-0.65; Logrank p = 5.5e-13). The 7-gene signature could be established as a useful predictor of disease prognosis in BC patients. Overall, the identified seven genes might be useful prognostic and predictive molecular markers to predict the clinical outcome of BC patients.

  10. PLANEX: the plant co-expression database.

    PubMed

    Yim, Won Cheol; Yu, Yongbin; Song, Kitae; Jang, Cheol Seong; Lee, Byung-Moo

    2013-05-20

    The PLAnt co-EXpression database (PLANEX) is a new internet-based database for plant gene analysis. PLANEX (http://planex.plantbioinformatics.org) contains publicly available GeneChip data obtained from the Gene Expression Omnibus (GEO) of the National Center for Biotechnology Information (NCBI). PLANEX is a genome-wide co-expression database, which allows for the functional identification of genes from a wide variety of experimental designs. It can be used for the characterization of genes for functional identification and analysis of a gene's dependency among other genes. Gene co-expression databases have been developed for other species, but gene co-expression information for plants is currently limited. We constructed PLANEX as a list of co-expressed genes and functional annotations for Arabidopsis thaliana, Glycine max, Hordeum vulgare, Oryza sativa, Solanum lycopersicum, Triticum aestivum, Vitis vinifera and Zea mays. PLANEX reports Pearson's correlation coefficients (PCCs; r-values) that distribute from a gene of interest for a given microarray platform set corresponding to a particular organism. To support PCCs, PLANEX performs an enrichment test of Gene Ontology terms and Cohen's Kappa value to compare functional similarity for all genes in the co-expression database. PLANEX draws a cluster network with co-expressed genes, which is estimated using the k-means method. To construct PLANEX, a variety of datasets were interpreted by the IBM supercomputer Advanced Interactive eXecutive (AIX) in a supercomputing center. PLANEX provides a correlation database, a cluster network and an interpretation of enrichment test results for eight plant species. A typical co-expressed gene generates lists of co-expression data that contain hundreds of genes of interest for enrichment analysis. Also, co-expressed genes can be identified and cataloged in terms of comparative genomics by using the 'Co-expression gene compare' feature. This type of analysis will help interpret

  11. Analysis of functional and pathway association of differential co-expressed genes: a case study in drug addiction.

    PubMed

    Li, Zi-hui; Liu, Yu-feng; Li, Ke-ning; Duanmu, Hui-zi; Chang, Zhi-qiang; Li, Zhen-qi; Zhang, Shan-zhen; Xu, Yan

    2012-02-01

    Drug addiction is considered a chronic relapsing brain disease influenced by both genetic and environmental factors. At present, many causative genes and pathways related to diverse kinds of drug addiction have been discovered, while less attention has been paid to common mechanisms shared by different drugs underlying addiction. By applying a co-expression meta-analysis method to mRNA expression profiles of alcohol, cocaine, heroin addicted and normal samples, we identified significant gene co-expression pairs. After constructing co-expression networks for the drug group and the control group, we discovered associated function term pairs and pathway pairs reflected by co-expression pattern changes by integrating functional and pathway information, respectively. The results indicated that respiratory electron transport chain, synaptic transmission, mitochondrial electron transport, signal transduction, locomotory behavior, response to amphetamine, negative regulation of cell migration, glucose regulation of insulin secretion, signaling by NGF, diabetes pathways, integration of energy metabolism, and dopamine receptors may play an important role in drug addiction. In addition, the results can provide theoretical support for studies of addiction mechanisms.

  12. Utilizing RNA-Seq data for de novo coexpression network inference.

    PubMed

    Iancu, Ovidiu D; Kawane, Sunita; Bottomly, Daniel; Searles, Robert; Hitzemann, Robert; McWeeney, Shannon

    2012-06-15

    RNA-Seq experiments have shown great potential for transcriptome profiling. While sequencing increases the level of biological detail, integrative data analysis is also important. One avenue is the construction of coexpression networks. Because the capacity of RNA-Seq data for network construction has not been previously evaluated, we constructed a coexpression network using striatal samples, derived its network properties and compared it with microarray-based networks. The RNA-Seq coexpression network displayed scale-free, hierarchical network structure. We detected transcript groups (modules) with correlated profiles; modules overlap distinct ontology categories. Neuroanatomical data from the Allen Brain Atlas reveal several modules with spatial colocalization. The network was compared with microarray-derived networks; correlations from RNA-Seq data were higher, likely because of greater sensitivity and dynamic range. Higher correlations result in higher network connectivity, heterogeneity and centrality. For transcripts present across platforms, network structure appeared largely preserved. From this study, we present the first RNA-Seq data de novo network inference.
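
    The link between correlation strength and connectivity can be made concrete with a minimal sketch. It assumes a weighted network whose edge weight is the absolute Pearson correlation raised to a power and defines a transcript's connectivity as the sum of its edge weights; the abstract does not specify the adjacency function, so this definition, the `power` parameter and the toy profiles are assumptions.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equally long numeric lists."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


def connectivity(expr, power=1):
    """Per-transcript sum of |r| ** power over all other transcripts."""
    genes = list(expr)
    return {
        g: sum(abs(pearson(expr[g], expr[h])) ** power for h in genes if h != g)
        for g in genes
    }


# Hypothetical expression profiles of three transcripts across four samples.
expr = {
    "t1": [1, 2, 3, 4],
    "t2": [2, 4, 6, 8],
    "t3": [1, 3, 2, 4],
}
k = connectivity(expr)
```

    Under this definition any uniform increase in pairwise correlations, such as the one reported for RNA-Seq relative to microarrays, raises every transcript's connectivity.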

  13. A specific variant of the PHR1 binding site is highly enriched in the Arabidopsis phosphate-responsive phospholipase DZ2 coexpression network.

    PubMed

    Acevedo-Hernández, Gustavo; Oropeza-Aburto, Araceli; Herrera-Estrella, Luis

    2012-08-01

    PLDZ2 is a member of the Arabidopsis phospholipase D gene family that is induced in both shoot and root in response to phosphate (Pi) starvation. Recently, through deletion and gain-of-function analyses of the PLDZ2 promoter, we identified a 65 bp region (denominated enhancer EZ2) capable of conferring tissue-specific and low-Pi responses to a minimal inactive promoter. The EZ2 element contains two P1BS motifs, each of which is the binding site for PHR1 and related transcription factors. This structural organization is evolutionarily conserved in orthologous promoters within the rosid clade. To determine whether EZ2 is significantly over-represented in Arabidopsis genes coexpressed with PLDZ2, we constructed a PLDZ2 coexpression network containing 26 genes, almost half of them encoding enzymes or regulatory proteins involved in Pi recycling. A variant of the P1BS motif was found to be highly enriched in the promoter regions of these coexpressed genes, showing an EZ2-like arrangement in seven of them. No other motifs were significantly enriched. The over-representation of the EZ2 arrangement of P1BS motifs in the promoters of genes coexpressed with PLDZ2, suggests this unit has a particularly important role as a regulatory element in a coexpression network involved in the release of Pi from phospholipids and other molecules under Pi-limiting conditions.

  14. A specific variant of the PHR1 binding site is highly enriched in the Arabidopsis phosphate-responsive phospholipase DZ2 coexpression network

    PubMed Central

    Acevedo-Hernández, Gustavo; Oropeza-Aburto, Araceli; Herrera-Estrella, Luis

    2012-01-01

    PLDZ2 is a member of the Arabidopsis phospholipase D gene family that is induced in both shoot and root in response to phosphate (Pi) starvation. Recently, through deletion and gain-of-function analyses of the PLDZ2 promoter, we identified a 65 bp region (denominated enhancer EZ2) capable of conferring tissue-specific and low-Pi responses to a minimal inactive promoter. The EZ2 element contains two P1BS motifs, each of which is the binding site for PHR1 and related transcription factors. This structural organization is evolutionarily conserved in orthologous promoters within the rosid clade. To determine whether EZ2 is significantly over-represented in Arabidopsis genes coexpressed with PLDZ2, we constructed a PLDZ2 coexpression network containing 26 genes, almost half of them encoding enzymes or regulatory proteins involved in Pi recycling. A variant of the P1BS motif was found to be highly enriched in the promoter regions of these coexpressed genes, showing an EZ2-like arrangement in seven of them. No other motifs were significantly enriched. The over-representation of the EZ2 arrangement of P1BS motifs in the promoters of genes coexpressed with PLDZ2, suggests this unit has a particularly important role as a regulatory element in a coexpression network involved in the release of Pi from phospholipids and other molecules under Pi-limiting conditions. PMID:22836502

  15. EXPath tool-a system for comprehensively analyzing regulatory pathways and coexpression networks from high-throughput transcriptome data.

    PubMed

    Zheng, Han-Qin; Wu, Nai-Yun; Chow, Chi-Nga; Tseng, Kuan-Chieh; Chien, Chia-Hung; Hung, Yu-Cheng; Li, Guan-Zhen; Chang, Wen-Chi

    2017-03-13

    Next generation sequencing (NGS) has become the mainstream approach for monitoring gene expression levels in parallel with various experimental treatments. Unfortunately, there is no systematic web server to comprehensively perform further analysis based on the huge amount of preliminary data that is obtained after finishing the process of gene annotation. Therefore, a user-friendly and effective system is required to mine important genes and regulatory pathways under specific conditions from high-throughput transcriptome data. EXPath Tool (available at: http://expathtool.itps.ncku.edu.tw/) was developed for the pathway annotation and comparative analysis of user-customized gene expression profiles derived from microarray or NGS platforms under various conditions to infer metabolic pathways for all organisms in the KEGG database. EXPath Tool contains several functions: access the gene expression patterns and the candidate co-expressed genes; dissect differentially expressed genes (DEGs) between two conditions (DEGs search); perform functional grouping with pathway and GO (Pathway/GO enrichment analysis); examine correlation networks (co-expression analysis); and view the expression patterns of genes involved in specific pathways to infer the effects of the treatment. Additionally, the effectiveness of EXPath Tool was demonstrated by a case study on IAA-responsive genes. The results demonstrated that critical hub genes under IAA treatment could be efficiently identified.

  16. Human gene correlation analysis (HGCA): a tool for the identification of transcriptionally co-expressed genes.

    PubMed

    Michalopoulos, Ioannis; Pavlopoulos, Georgios A; Malatras, Apostolos; Karelas, Alexandros; Kostadima, Myrto-Areti; Schneider, Reinhard; Kossida, Sophia

    2012-06-06

    Bioinformatics and high-throughput technologies such as microarray studies allow the measure of the expression levels of large numbers of genes simultaneously, thus helping us to understand the molecular mechanisms of various biological processes in a cell. We calculate the Pearson Correlation Coefficient (r-value) between probe set signal values from Affymetrix Human Genome Microarray samples and cluster the human genes according to the r-value correlation matrix using the Neighbour Joining (NJ) clustering method. A hyper-geometric distribution is applied on the text annotations of the probe sets to quantify the term overrepresentations. The aim of the tool is the identification of closely correlated genes for a given gene of interest and/or the prediction of its biological function, which is based on the annotations of the respective gene cluster. Human Gene Correlation Analysis (HGCA) is a tool to classify human genes according to their coexpression levels and to identify overrepresented annotation terms in correlated gene groups. It is available at: http://biobank-informatics.bioacademy.gr/coexpression/.
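
    The two statistics named in this abstract, the Pearson correlation coefficient between expression profiles and a hypergeometric test for over-represented annotation terms, can be sketched as follows. This is a minimal illustration only, not the HGCA implementation: the function names, the toy inputs, and the choice of an upper-tail p-value are assumptions, and the Neighbour Joining clustering step is not shown.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def hypergeom_upper_tail(k, population, annotated, drawn):
    """P(X >= k) when `drawn` genes are sampled without replacement from a
    population of `population` genes, `annotated` of which carry the term.

    Hypothetical helper: a small p-value suggests the term is
    over-represented in the drawn gene cluster.
    """
    total = math.comb(population, drawn)
    upper = min(annotated, drawn)
    favourable = sum(
        math.comb(annotated, i) * math.comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Toy profiles (illustrative values, not real probe-set signals).
    probe_a = [1.0, 2.0, 3.0, 4.0]
    probe_b = [2.1, 3.9, 6.2, 8.0]
    print(pearson_r(probe_a, probe_b))
    # 5 of 10 cluster genes carry a term found in 20 of 1000 genes overall.
    print(hypergeom_upper_tail(5, 1000, 20, 10))
```

    In practice the correlation would be computed for every pair of probe sets to fill the r-value matrix, and the enrichment p-values would normally be corrected for multiple testing; neither step is specified in the abstract.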

  17. Co-expression network analysis reveals transcription factors associated to cell wall biosynthesis in sugarcane.

    PubMed

    Ferreira, Savio Siqueira; Hotta, Carlos Takeshi; Poelking, Viviane Guzzo de Carli; Leite, Debora Chaves Coelho; Buckeridge, Marcos Silveira; Loureiro, Marcelo Ehlers; Barbosa, Marcio Henrique Pereira; Carneiro, Monalisa Sampaio; Souza, Glaucia Mendes

    2016-05-01

    Sugarcane is a hybrid of Saccharum officinarum and Saccharum spontaneum, with minor contributions from other species in Saccharum and other genera. Understanding the molecular basis of cell wall metabolism in sugarcane may allow for rational changes in fiber quality and content when designing new energy crops. This work describes a comparative expression profiling of sugarcane ancestral genotypes: S. officinarum, S. spontaneum and S. robustum and a commercial hybrid: RB867515, linking gene expression to phenotypes to identify genes for sugarcane improvement. Oligoarray experiments of leaves, immature and intermediate internodes, detected 12,621 sense and 995 antisense transcripts. Amino acid metabolism was particularly evident among pathways showing natural antisense transcripts expression. For all tissues sampled, expression analysis revealed 831, 674 and 648 differentially expressed genes in S. officinarum, S. robustum and S. spontaneum, respectively, using RB867515 as reference. Expression of sugar transporters might explain sucrose differences among genotypes, but an unexpected differential expression of histones were also identified between high and low Brix° genotypes. Lignin biosynthetic genes and bioenergetics-related genes were up-regulated in the high lignin genotype, suggesting that these genes are important for S. spontaneum to allocate carbon to lignin, while S. officinarum allocates it to sucrose storage. Co-expression network analysis identified 18 transcription factors possibly related to cell wall biosynthesis while in silico analysis detected cis-elements involved in cell wall biosynthesis in their promoters. Our results provide information to elucidate regulatory networks underlying traits of interest that will allow the improvement of sugarcane for biofuel and chemicals production.

  18. From Coexpression to Coregulation: An Approach to Inferring Transcriptional Regulation Among Gene Classes from Large-Scale Expression Data

    NASA Technical Reports Server (NTRS)

    Mjolsness, Eric; Castano, Rebecca; Mann, Tobias; Wold, Barbara

    2000-01-01

    We provide preliminary evidence that existing algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (1) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continuous-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determining transcriptional regulatory relationships such as coregulation.

  19. Uncovering robust patterns of microRNA co-expression across cancers using Bayesian Relevance Networks

    PubMed Central

    2017-01-01

    Co-expression networks have long been used as a tool for investigating the molecular circuitry governing biological systems. However, most algorithms for constructing co-expression networks were developed in the microarray era, before high-throughput sequencing—with its unique statistical properties—became the norm for expression measurement. Here we develop Bayesian Relevance Networks, an algorithm that uses Bayesian reasoning about expression levels to account for the differing levels of uncertainty in expression measurements between highly- and lowly-expressed entities, and between samples with different sequencing depths. It combines data from groups of samples (e.g., replicates) to estimate group expression levels and confidence ranges. It then computes uncertainty-moderated estimates of cross-group correlations between entities, and uses permutation testing to assess their statistical significance. Using large scale miRNA data from The Cancer Genome Atlas, we show that our Bayesian update of the classical Relevance Networks algorithm provides improved reproducibility in co-expression estimates and lower false discovery rates in the resulting co-expression networks. Software is available at www.perkinslab.ca. PMID:28817636

  20. Uncovering robust patterns of microRNA co-expression across cancers using Bayesian Relevance Networks.

    PubMed

    Ramachandran, Parameswaran; Sánchez-Taltavull, Daniel; Perkins, Theodore J

    2017-01-01

    Co-expression networks have long been used as a tool for investigating the molecular circuitry governing biological systems. However, most algorithms for constructing co-expression networks were developed in the microarray era, before high-throughput sequencing-with its unique statistical properties-became the norm for expression measurement. Here we develop Bayesian Relevance Networks, an algorithm that uses Bayesian reasoning about expression levels to account for the differing levels of uncertainty in expression measurements between highly- and lowly-expressed entities, and between samples with different sequencing depths. It combines data from groups of samples (e.g., replicates) to estimate group expression levels and confidence ranges. It then computes uncertainty-moderated estimates of cross-group correlations between entities, and uses permutation testing to assess their statistical significance. Using large scale miRNA data from The Cancer Genome Atlas, we show that our Bayesian update of the classical Relevance Networks algorithm provides improved reproducibility in co-expression estimates and lower false discovery rates in the resulting co-expression networks. Software is available at www.perkinslab.ca.

  1. Genomic positions of co-expressed genes: echoes of chromosome organisation in gene expression data.

    PubMed

    Szczepińska, Teresa; Pawłowski, Krzysztof

    2013-06-13

    The relationships between gene expression and nuclear structure, chromosome territories in particular, are currently being elucidated experimentally. Each chromosome occupies an individual, spatially-limited space with a preferential position relative to the nuclear centre that may be specific to the cell and tissue type. We sought to discover whether patterns in gene expression databases might exist that would mirror prevailing or recurring nuclear structure patterns, chromosome territory interactions in particular. We used human gene expression datasets, both from a tissue expression atlas and from a large set including diverse types of perturbations. We identified groups of positional gene clusters over-represented in gene expression clusters. We show that some pairs of chromosomes and pairs of 10 Mbp long chromosome regions are significantly enriched in the expression clusters. The functions of genes involved in inter-chromosome co-expression relationships are non-random and predominantly related to cell-cell communication and reaction to external stimuli. We suggest that inter-chromosomal gene co-expression can be interpreted in the context of nuclear structure, and that even expression datasets that include very diverse conditions and cell types show consistent relationships.

  2. Differentially co-expressed genes in postmortem prefrontal cortex of individuals with alcohol use disorders: Influence on alcohol metabolism-related pathways

    PubMed Central

    Zhang, Huiping; Wang, Fan; Xu, Hongqin; Liu, Yawen; Liu, Jin; Zhao, Hongyu; Gelernter, Joel

    2014-01-01

    Chronic alcohol consumption may induce gene expression alterations in brain reward regions such as the prefrontal cortex (PFC), modulating the risk of alcohol use disorders (AUDs). Transcriptome profiles of 23 AUD cases and 23 matched controls (16 pairs of males and 7 pairs of females) in postmortem PFC were generated using Illumina’s HumanHT-12 v4 Expression BeadChip. Probe-level differentially expressed genes and gene modules in AUD subjects were identified using multiple linear regression and weighted gene co-expression network analyses. The enrichment of differentially co-expressed genes in alcohol dependence-associated genes identified by genome-wide association studies (GWAS) was examined using gene set enrichment analysis. Biological pathways overrepresented by differentially co-expressed genes were uncovered using DAVID bioinformatics resources. Three AUD-associated gene modules in males [Module 1 (561 probes mapping to 505 genes): r=0.42, Pcorrelation=0.020; Module 2 (815 probes mapping to 713 genes): r=0.41, Pcorrelation=0.020; Module 3 (1,446 probes mapping to 1,305 genes): r=−0.38, Pcorrelation=0.030] and one AUD-associated gene module in females [Module 4 (683 probes mapping to 652 genes): r=0.64, Pcorrelation=0.010] were identified. Differentially expressed genes mapped by significant expression probes (Pnominal≤0.05) clustered in Modules 1 and 2 were enriched in GWAS-identified alcohol dependence-associated genes [Module 1 (134 genes): P=0.028; Module 2 (243 genes): P=0.004]. These differentially expressed genes, including ALDH2, ALDH7A1, and ALDH9A1, are involved in cellular functions such as aldehyde detoxification, mitochondrial function, and fatty acid metabolism. Our study revealed differentially co-expressed genes in postmortem PFC of AUD subjects and demonstrated that some of these differentially co-expressed genes participate in alcohol metabolism. PMID:25073604

  3. Understanding developmental and adaptive cues in pine through metabolite profiling and co-expression network analysis

    PubMed Central

    Cañas, Rafael A.; Canales, Javier; Muñoz-Hernández, Carmen; Granados, Jose M.; Ávila, Concepción; García-Martín, María L.; Cánovas, Francisco M.

    2015-01-01

    Conifers include long-lived evergreen trees of great economic and ecological importance, including pines and spruces. During their long lives conifers must respond to seasonal environmental changes, adapt to unpredictable environmental stresses, and co-ordinate their adaptive adjustments with internal developmental programmes. To gain insights into these responses, we examined metabolite and transcriptomic profiles of needles from naturally growing 25-year-old maritime pine (Pinus pinaster L. Aiton) trees over a year. The effect of environmental parameters such as temperature and rain on needle development were studied. Our results show that seasonal changes in the metabolite profiles were mainly affected by the needles’ age and acclimation for winter, but changes in transcript profiles were mainly dependent on climatic factors. The relative abundance of most transcripts correlated well with temperature, particularly for genes involved in photosynthesis or winter acclimation. Gene network analysis revealed relationships between 14 co-expressed gene modules and development and adaptation to environmental stimuli. Novel Myb transcription factors were identified as candidate regulators during needle development. Our systems-based analysis provides integrated data of the seasonal regulation of maritime pine growth, opening new perspectives for understanding the complex regulatory mechanisms underlying conifers’ adaptive responses. Taken together, our results suggest that the environment regulates the transcriptome for fine tuning of the metabolome during development. PMID:25873654

  4. Integration of Metabolic Modeling with Gene Co-expression Reveals Transcriptionally Programmed Reactions Explaining Robustness in Mycobacterium tuberculosis

    PubMed Central

    Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Mittal, Inna; Mobeen, Ahmed; Ramachandran, Srinivasan

    2016-01-01

    Robustness of metabolic networks is accomplished by gene regulation, modularity, re-routing of metabolites and plasticity. Here, we probed robustness against perturbations of biochemical reactions of M. tuberculosis in the form of predicting compensatory trends. In order to investigate the transcriptional programming of genes associated with correlated fluxes, we integrated metabolic modeling with a gene co-expression network. Knock down of the reactions NADH2r and ATPS responsible for producing the hub metabolites, and Central carbon metabolism had the highest proportion of their associated genes under transcriptional co-expression with genes of their flux correlated reactions. Reciprocal gene expression correlations were observed among compensatory routes, fresh activation of alternative routes and in the multi-copy genes of Cysteine synthase and of Phosphate transporter. Knock down of 46 reactions caused the activation of Isocitrate lyase or Malate synthase or both reactions, which are central to the persistent state of M. tuberculosis. A total of 30 new freshly activated routes including Cytochrome c oxidase, Lactate dehydrogenase, and Glycine cleavage system were predicted, which could be responsible for switching into dormant or persistent state. Thus, our integrated approach of exploring transcriptional programming of flux correlated reactions has the potential to unravel features of system architecture conferring robustness. PMID:27000948

  5. Functional architecture and evolution of transcriptional elements that drive gene coexpression.

    PubMed

    Brown, Christopher D; Johnson, David S; Sidow, Arend

    2007-09-14

    Transcriptional coexpression of interacting gene products is required for complex molecular processes; however, the function and evolution of cis-regulatory elements that orchestrate coexpression remain largely unexplored. We mutagenized 19 regulatory elements that drive coexpression of Ciona muscle genes and obtained quantitative estimates of the cis-regulatory activity of the 77 motifs that comprise these elements. We found that individual motif activity ranges broadly within and among elements, and among different instantiations of the same motif type. The activity of orthologous motifs is strongly constrained, although motif arrangement, type, and activity vary greatly among the elements of different co-regulated genes. Thus, the syntactical rules governing this regulatory function are flexible but become highly constrained evolutionarily once they are established in a particular element.

  6. Gene differential coexpression analysis based on biweight correlation and maximum clique.

    PubMed

    Zheng, Chun-Hou; Yuan, Lin; Sha, Wen; Sun, Zhan-Li

    2014-01-01

    Differential coexpression analysis usually requires the definition of 'distance' or 'similarity' between measured datasets. Until now, the most common choice has been the Pearson correlation coefficient. However, the Pearson correlation coefficient is sensitive to outliers. Biweight midcorrelation is considered to be a good alternative to Pearson correlation since it is more robust to outliers. In this paper, we propose using biweight midcorrelation to measure 'similarity' between gene expression profiles, and provide a new approach for gene differential coexpression analysis. Firstly, we calculate the biweight midcorrelation coefficients between all gene pairs. Then, we filter out non-informative correlation pairs using the 'half-thresholding' strategy and calculate the differential coexpression value of each gene. The experimental results on simulated data show that the new approach performed better than three previously published differential coexpression analysis (DCEA) methods. Moreover, we apply maximum clique analysis to a gene subset comprising genes identified by our approach and previously reported T2D-related genes; many additional discoveries can be found through our method.
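
    The biweight midcorrelation mentioned in this abstract can be sketched as below. This is a minimal illustration using the standard textbook definition (deviations from the median, down-weighted by a biweight function scaled by 9 times the median absolute deviation); it is not the authors' code, the function names are hypothetical, and the 'half-thresholding' and maximum clique steps are not shown.

```python
import math
from statistics import median


def _weighted_deviations(values):
    """Return (x_i - median) * w_i, where w_i is the biweight weight.

    Points further than 9 * MAD from the median get weight zero, which is
    what makes the measure robust to outliers.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Assumption of this sketch: treat a zero MAD as an error rather
        # than falling back to another correlation measure.
        raise ValueError("MAD is zero; biweight midcorrelation is undefined")
    result = []
    for v in values:
        u = (v - med) / (9 * mad)
        weight = (1 - u * u) ** 2 if abs(u) < 1 else 0.0
        result.append((v - med) * weight)
    return result


def bicor(x, y):
    """Biweight midcorrelation between two equal-length profiles."""
    a = _weighted_deviations(x)
    b = _weighted_deviations(y)
    numerator = sum(p * q for p, q in zip(a, b))
    denominator = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return numerator / denominator


if __name__ == "__main__":
    gene_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    # Same profile with one extreme outlier in the last sample.
    gene_2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    print(bicor(gene_1, gene_2))
```

    In the toy data the outlying value lies more than 9 MADs from the median, so it receives zero weight and contributes nothing to the numerator, whereas it would contribute heavily to a Pearson correlation.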

  7. EPIG-Seq: extracting patterns and identifying co-expressed genes from RNA-Seq data.

    PubMed

    Li, Jianying; Bushel, Pierre R

    2016-03-22

    RNA sequencing (RNA-Seq) measures genome-wide gene expression. RNA-Seq data is count-based rendering normal distribution models for analysis inappropriate. Normalization of RNA-Seq data to transform the data has limitations which can adversely impact the analysis. Furthermore, there are a few count-based methods for analysis of RNA-Seq data but they are essentially for pairwise analysis of treatment groups or multiclasses but not pattern-based to identify co-expressed genes. We adapted our extracting patterns and identifying genes methodology for RNA-Seq (EPIG-Seq) count data. The software uses count-based correlation to measure similarity between genes, quasi-Poisson modelling to estimate dispersion in the data and a location parameter to indicate magnitude of differential expression. EPIG-Seq is different than any other software currently available for pattern analysis of RNA-Seq data in that EPIG-Seq 1) uses count level data and supports cases of inflated zeros, 2) identifies statistically significant clusters of genes that are co-expressed across experimental conditions, 3) takes into account dispersion in the replicate data and 4) provides reliable results even with small sample sizes. EPIG-Seq operates in two steps: 1) extract the pattern profiles from data as seeds for clustering co-expressed genes and 2) cluster the genes to the pattern seeds and compute statistical significance of the pattern of co-expressed genes. EPIG-Seq provides a table of the genes with bootstrapped p-values and profile plots of the patterns of co-expressed genes. In addition, EPIG-Seq provides a heat map and principal component dimension reduction plot of the clustered genes as visual aids. We demonstrate the utility of EPIG-Seq through the analysis of toxicogenomics and cancer data sets to identify biologically relevant co-expressed genes. EPIG-Seq is available at: sourceforge.net/projects/epig-seq. EPIG-Seq is unlike any other software currently available for pattern analysis of

  8. G-NEST: A gene neighborhood scoring tool to identify co-conserved, co-expressed genes

    USDA-ARS's Scientific Manuscript database

    In previous studies, gene neighborhoods--spatial clusters of co-expressed genes in the genome--have been defined using arbitrary rules such as requiring adjacency, a minimum number of genes, a fixed window size, or a minimum expression level. In the current study, we developed a Gene Neighborhood Sc...

  9. A GPU-accelerated algorithm for biclustering analysis and detection of condition-dependent coexpression network modules.

    PubMed

    Bhattacharya, Anindya; Cui, Yan

    2017-06-23

    In the analysis of large-scale gene expression data, it is important to identify groups of genes with common expression patterns under certain conditions. Many biclustering algorithms have been developed to address this problem. However, comprehensive discovery of functionally coherent biclusters from large datasets remains a challenging problem. Here we propose a GPU-accelerated biclustering algorithm, based on searching for the largest Condition-dependent Correlation Subgroups (CCS) for each gene in the gene expression dataset. We compared CCS with thirteen widely used biclustering algorithms. CCS consistently outperformed all the thirteen biclustering algorithms on both synthetic and real gene expression datasets. As a correlation-based biclustering method, CCS can also be used to find condition-dependent coexpression network modules. We implemented the CCS algorithm using C and implemented the parallelized CCS algorithm using CUDA C for GPU computing. The source code of CCS is available from https://github.com/abhatta3/Condition-dependent-Correlation-Subgroups-CCS.

  10. Normalized lmQCM: An Algorithm for Detecting Weak Quasi-Cliques in Weighted Graph with Applications in Gene Co-Expression Module Discovery in Cancers

    PubMed Central

    Zhang, Jie; Huang, Kun

    2014-01-01

    In this paper, we present a new approach for mining weighted networks to identify densely connected modules such as quasi-cliques. Quasi-cliques are densely connected subnetworks in a network. Detecting quasi-cliques is an important topic in data mining, with applications such as social network study and biomedicine. Our approach has two major improvements upon previous work. The first is the use of local maximum edges to initialize the search in order to avoid excessive overlaps among the modules, thereby greatly reducing the computing time. The second is the inclusion of a weight normalization procedure to enable discovery of “subtle” modules with more balanced sizes. We carried out careful tests on multiple parameters and settings using two large cancer datasets. This approach allowed us to identify a large number of gene modules enriched in both biological functions and chromosomal bands in cancer data, suggesting potential roles of copy number variations (CNVs) involved in the cancer development. We then tested the genes in selected modules with enriched chromosomal bands using The Cancer Genome Atlas data, and the results strongly support our hypothesis that the coexpression in these modules are associated with CNVs. While gene coexpression network analyses have been widely adopted in disease studies, most of them focus on the functional relationships of coexpressed genes. The relationship between coexpression gene modules and CNVs are much less investigated despite the potential advantage that we can infer from such relationship without genotyping data. Our new approach thus provides a means to carry out deep mining of the gene coexpression network to obtain both functional and genetic information from the expression data. PMID:27486298

  11. WeGET: predicting new genes for molecular systems by weighted co-expression

    PubMed Central

    Szklarczyk, Radek; Megchelenbrink, Wout; Cizek, Pavel; Ledent, Marie; Velemans, Gonny; Szklarczyk, Damian; Huynen, Martijn A.

    2016-01-01

    We have developed the Weighted Gene Expression Tool and database (WeGET, http://weget.cmbi.umcn.nl) for the prediction of new genes of a molecular system by correlated gene expression. WeGET utilizes a compendium of 465 human and 560 murine gene expression datasets that have been collected from multiple tissues under a wide range of experimental conditions. It exploits this abundance of expression data by assigning a high weight to datasets in which the known genes of a molecular system are harmoniously up- and down-regulated. WeGET ranks new candidate genes by calculating their weighted co-expression with that system. A weighted rank is calculated for human genes and their mouse orthologs. Then, an integrated gene rank and p-value is computed using a rank-order statistic. We applied our method to predict novel genes that have a high degree of co-expression with Gene Ontology terms and pathways from KEGG and Reactome. For each query set we provide a list of predicted novel genes, computed weights for transcription datasets used and cell and tissue types that contributed to the final predictions. The performance for each query set is assessed by 10-fold cross-validation. Finally, users can use the WeGET to predict novel genes that co-express with a custom query set. PMID:26582928
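
    The idea of weighting datasets by how coherently the known genes of a system behave, and then scoring candidates by weighted co-expression, can be sketched as follows. Everything specific here is an assumption made for illustration and is not taken from WeGET: the weight is the mean pairwise Pearson correlation among query genes floored at zero, the score is a weight-averaged mean correlation, and the function names and toy data are hypothetical. The rank integration across human and mouse and the cross-validation described in the abstract are not shown.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation; assumes neither profile is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def dataset_weight(expr, query):
    """Hypothetical weight: mean pairwise correlation of the query genes
    within one dataset, floored at zero so incoherent datasets are ignored."""
    pairs = list(combinations(query, 2))
    mean_r = sum(pearson_r(expr[a], expr[b]) for a, b in pairs) / len(pairs)
    return max(mean_r, 0.0)


def weighted_coexpression(datasets, query, candidate):
    """Weight-averaged mean correlation of a candidate with the query set."""
    numerator = 0.0
    total_weight = 0.0
    for expr in datasets:
        weight = dataset_weight(expr, query)
        mean_r = sum(pearson_r(expr[candidate], expr[q]) for q in query) / len(query)
        numerator += weight * mean_r
        total_weight += weight
    return numerator / total_weight if total_weight else 0.0


if __name__ == "__main__":
    # Each dataset maps gene name -> expression profile (toy values).
    coherent = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8],
                "X": [2, 3, 4, 5], "Y": [4, 3, 2, 1]}
    incoherent = {"A": [1, 2, 3, 4], "B": [4, 3, 2, 1],
                  "X": [1, 3, 2, 4], "Y": [2, 1, 4, 3]}
    query_genes = ["A", "B"]
    for gene in ("X", "Y"):
        print(gene, weighted_coexpression([coherent, incoherent], query_genes, gene))
```

    Candidates would then be ranked by this score; in the toy data the second dataset receives zero weight because the two query genes are anti-correlated in it, so only the first dataset influences the ranking.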

  12. Constitutive and inducible co-expression systems for non-viral osteoinductive gene therapy.

    PubMed

    Feichtinger, G A; Hacobian, A; Hofmann, A T; Wassermann, K; Zimmermann, A; van Griensven, M; Redl, H

    2014-02-19

    Tissue regenerative gene therapy requires expression strategies that deliver therapeutically effective amounts of transgenes. As physiological expression patterns are more complex than high-level expression of a singular therapeutic gene, we aimed at constitutive or inducible co-expression of 2 transgenes simultaneously. Co-expression of human bone morphogenetic protein 2 and 7 (BMP2/7) from constitutively expressing and doxycycline inducible plasmids was evaluated in vitro in C2C12 cells with osteocalcin reporter gene assays and standard assays for osteogenic differentiation. The constitutive systems were additionally tested in an in vivo pilot for ectopic bone formation after repeated naked DNA injection to murine muscle tissue. Inductor controlled differentiation was demonstrated in vitro for inducible co-expression. Both co-expression systems, inducible and constitutive, achieved significantly better osteogenic differentiation than single factor expression. The potency of the constitutive co-expression systems was dependent on relative expression cassette topology. In vivo, ectopic bone formation was demonstrated in 6/13 animals (46% bone formation efficacy) at days 14 and 28 in hind limb muscles as proven by in vivo µCT and histological evaluation. In vitro findings demonstrated that the devised single vector BMP2/7 co-expression strategy mediates superior osteoinduction, can be applied in an inductor controlled fashion and that its efficiency is dependent on expression cassette topology. In vivo results indicate that co-expression of BMP2/7 applied by non-viral naked DNA gene transfer effectively mediates bone formation without the application of biomaterials, cells or recombinant growth factors, offering a promising alternative to current treatment strategies with potential for clinical translation in the future.

  13. Use of transcriptomics and co-expression networks to analyze the interconnections between nitrogen assimilation and photorespiratory metabolism

    PubMed Central

    Pérez-Delgado, Carmen M.; Moyano, Tomás C.; García-Calderón, Margarita; Canales, Javier; Gutiérrez, Rodrigo A.; Márquez, Antonio J.; Betti, Marco

    2016-01-01

    Nitrogen is one of the most important nutrients for plants and, in natural soils, its availability is often a major limiting factor for plant growth. Here we examine the effect of different forms of nitrogen nutrition and of photorespiration on gene expression in the model legume Lotus japonicus with the aim of identifying regulatory candidate genes co-ordinating primary nitrogen assimilation and photorespiration. The transcriptomic changes produced by the use of different nitrogen sources in leaves of L. japonicus plants combined with the transcriptomic changes produced in the same tissue by different photorespiratory conditions were examined. The results obtained provide novel information on the possible role of plastidic glutamine synthetase in the response to different nitrogen sources and in the C/N balance of L. japonicus plants. The use of gene co-expression networks establishes a clear relationship between photorespiration and primary nitrogen assimilation and identifies possible transcription factors connected to the genes of both routes. PMID:27117340

  14. Construction of a promoter collection for genes co-expression in filamentous fungus Trichoderma reesei.

    PubMed

    Wang, Wei; Meng, Fanju; Liu, Pei; Yang, Shengli; Wei, Dongzhi

    2014-11-01

    Trichoderma reesei is the preferred organism for producing industrial cellulases. However, cellulases derived from T. reesei have their highest activity at acidic pH. When the pH value increased above 7, the enzyme activities almost disappeared, thereby limiting the application of fungal cellulases under neutral or alkaline conditions. A lot of heterologous alkaline cellulases have been successfully expressed in T. reesei to improve its cellulolytic profile. To our knowledge, there are few reports describing the co-expression of two or more heterologous cellulases in T. reesei. We designed and constructed a promoter collection for gene expression and co-expression in T. reesei. Taking alkaline cellulase as a reporter gene, we assessed our promoters with strengths ranging from 4 to 106 % as compared to the pWEF31 expression vector (Lv D, Wang W, Wei D (2012) Construction of two vectors for gene expression in Trichoderma reesei. Plasmid 67(1):67-71). The promoter collection was used in a proof-of-principle approach to achieve the co-expression of an alkaline endoglucanase and an alkaline cellobiohydrolase. We observed higher activities of both cellulose degradation and biostoning by the co-expression of an endoglucanase and a cellobiohydrolase than the activities obtained by the expression of only endoglucanase or cellobiohydrolase. This study makes the process of engineering expression of multiple genes easier in T. reesei.

  15. Characterization of Chemically Induced Liver Injuries Using Gene Co-Expression Modules

    PubMed Central

    Tawa, Gregory J.; AbdulHameed, Mohamed Diwan M.; Yu, Xueping; Kumar, Kamal; Ippolito, Danielle L.; Lewis, John A.; Stallings, Jonathan D.; Wallqvist, Anders

    2014-01-01

    Liver injuries due to ingestion or exposure to chemicals and industrial toxicants pose a serious health risk that may be hard to assess due to a lack of non-invasive diagnostic tests. Mapping chemical injuries to organ-specific damage and clinical outcomes via biomarkers or biomarker panels will provide the foundation for highly specific and robust diagnostic tests. Here, we have used DrugMatrix, a toxicogenomics database containing organ-specific gene expression data matched to dose-dependent chemical exposures and adverse clinical pathology assessments in Sprague Dawley rats, to identify groups of co-expressed genes (modules) specific to injury endpoints in the liver. We identified 78 such gene co-expression modules associated with 25 diverse injury endpoints categorized from clinical pathology, organ weight changes, and histopathology. Using gene expression data associated with an injury condition, we showed that these modules exhibited different patterns of activation characteristic of each injury. We further showed that specific module genes mapped to 1) known biochemical pathways associated with liver injuries and 2) clinically used diagnostic tests for liver fibrosis. As such, the gene modules have characteristics of both generalized and specific toxic response pathways. Using these results, we proposed three gene signature sets characteristic of liver fibrosis, steatosis, and general liver injury based on genes from the co-expression modules. Out of all 92 identified genes, 18 (20%) genes have well-documented relationships with liver disease, whereas the rest are novel and have not previously been associated with liver disease. In conclusion, identifying gene co-expression modules associated with chemically induced liver injuries aids in generating testable hypotheses and has the potential to identify putative biomarkers of adverse health effects. PMID:25226513

  16. Coexpression of two closely linked avian genes for purine nucleotide synthesis from a bidirectional promoter.

    PubMed Central

    Gavalas, A; Dixon, J E; Brayton, K A; Zalkin, H

    1993-01-01

    Two avian genes encoding essential steps in the purine nucleotide biosynthetic pathway are transcribed divergently from a bidirectional promoter element. The bidirectional promoter, embedded in a CpG island, directs coexpression of GPAT and AIRC genes from distinct transcriptional start sites 229 bp apart. The bidirectional promoter can be divided in half, with each half retaining partial activity towards the cognate gene. GPAT and AIRC genes encode the enzymes that catalyze step 1 and steps 6 plus 7, respectively, in the de novo purine biosynthetic pathway. This is the first report of genes coding for structurally unrelated enzymes of the same pathway that are tightly linked and transcribed divergently from a bidirectional promoter. This arrangement has the potential to provide for regulated coexpression comparable to that in a prokaryotic operon. PMID:8336716

  17. Resolving stem and progenitor cells in the adult mouse incisor through gene co-expression analysis

    PubMed Central

    Seidel, Kerstin; Marangoni, Pauline; Tang, Cynthia; Houshmand, Bahar; Du, Wen; Maas, Richard L; Murray, Steven; Oldham, Michael C; Klein, Ophir D

    2017-01-01

    Investigations into stem cell-fueled renewal of an organ benefit from an inventory of cell type-specific markers and a deep understanding of the cellular diversity within stem cell niches. Using the adult mouse incisor as a model for a continuously renewing organ, we performed an unbiased analysis of gene co-expression relationships to identify modules of co-expressed genes that represent differentiated cells, transit-amplifying cells, and residents of stem cell niches. Through in vivo lineage tracing, we demonstrated the power of this approach by showing that co-expression module members Lrig1 and Igfbp5 define populations of incisor epithelial and mesenchymal stem cells. We further discovered that two adjacent mesenchymal tissues, the periodontium and dental pulp, are maintained by distinct pools of stem cells. These findings reveal novel mechanisms of incisor renewal and illustrate how gene co-expression analysis of intact biological systems can provide insights into the transcriptional basis of cellular identity. DOI: http://dx.doi.org/10.7554/eLife.24712.001 PMID:28475038

  18. Gene co-expression analyses: an overview from microarray collections in Arabidopsis thaliana.

    PubMed

    Di Salle, Pasquale; Incerti, Guido; Colantuono, Chiara; Chiusano, Maria Luisa

    2017-03-01

    Bioinformatics web-based resources and databases are precious references for most biological laboratories worldwide. However, the quality and reliability of the information they provide depends on them being used in an appropriate way that takes into account their specific features. Huge collections of gene expression data are currently publicly available, ready to support the understanding of gene and genome functionalities. In this context, tools and resources for gene co-expression analyses have flourished to exploit the 'guilty by association' principle, which assumes that genes with correlated expression profiles are functionally related. In the case of Arabidopsis thaliana, the reference species in plant biology, the resources available mainly consist of microarray results. After a general overview of such resources, we tested and compared the results they offer for gene co-expression analysis. We also discuss the effect on the results when using different data sets, as well as different data normalization approaches and parameter settings, which often consider different metrics for establishing co-expression. A dedicated example analysis of different gene pools, implemented by including/excluding mutant samples in a reference data set, showed significant variation of gene co-expression occurrence, magnitude and direction. We conclude that, as the heterogeneity of the resources and methods may produce different results for the same query genes, the exploration of more than one of the available resources is strongly recommended. The aim of this article is to show how best to integrate data sources and/or merge outputs to achieve robust analyses and reliable interpretations, thereby making use of diverse data resources an opportunity for added value. © The Author 2016. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
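
    The abstract above notes that co-expression results depend on the metric chosen and on which samples are included. A minimal Python sketch of that effect follows. The expression values are hypothetical, the outlier sample only stands in for something like a mutant line, and none of this reproduces the resources or settings compared in the article.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Rank values from 1..n (assumes no ties, to keep the sketch short)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical expression of two genes across six samples; the last
# sample is an extreme value for gene_b.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [1.1, 2.3, 2.9, 4.2, 5.1, 60.0]

r_pearson = pearson(gene_a, gene_b)                 # pulled down by the outlier
r_spearman = spearman(gene_a, gene_b)               # rank-based, unaffected
r_pearson_trimmed = pearson(gene_a[:5], gene_b[:5]) # outlier sample excluded

print(f"Pearson, all samples:      {r_pearson:.3f}")
print(f"Spearman, all samples:     {r_spearman:.3f}")
print(f"Pearson, outlier excluded: {r_pearson_trimmed:.3f}")
```

    The same gene pair looks weakly or strongly co-expressed depending on the metric and the sample set, which is the kind of variation the authors recommend checking across more than one resource.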

  19. Identify signature regulatory network for glioblastoma prognosis by integrative mRNA and miRNA co-expression analysis.

    PubMed

    Bing, Zhi-Tong; Yang, Guang-Hui; Xiong, Jie; Guo, Ling; Yang, Lei

    2016-12-01

    Glioblastoma multiforme (GBM) is the most common and aggressive type of primary brain tumor in adults. Patients with this disease have a poor prognosis. The objective of this study is to identify survival-related individual genes (or miRNAs) and miRNA-mRNA pairs in GBM using a multi-step approach. First, weighted gene co-expression network analysis and survival analysis are applied to identify survival-related modules from mRNA and miRNA expression profiles, respectively. Subsequently, the role of individual genes (or miRNAs) within these modules in GBM prognosis is highlighted using survival analysis. Finally, integration analysis of miRNA and mRNA expression, together with miRNA target prediction, is used to identify a survival-related miRNA-mRNA regulatory network. In this study, five genes and two miRNA modules were found to correlate significantly with patient survival. In addition, many individual genes (or miRNAs) assigned to these modules were found to be closely linked with survival. For instance, increased expression of the neuropilin-1 gene (a member of module turquoise) indicated poor prognosis for patients. The analysis also yielded a group of miRNA-mRNA regulatory networks comprising 38 survival-related miRNA-mRNA pairs. These findings provide new insight into the underlying molecular regulatory mechanisms of GBM.

  20. ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression.

    PubMed

    Aoki, Yuichi; Okamura, Yasunobu; Tadaka, Shu; Kinoshita, Kengo; Obayashi, Takeshi

    2016-01-01

    ATTED-II (http://atted.jp) is a coexpression database for plant species with parallel views of multiple coexpression data sets and network analysis tools. The user can efficiently find functional gene relationships and design experiments to identify gene functions by reverse genetics and general molecular biology techniques. Here, we report updates to ATTED-II (version 8.0), including new and updated coexpression data and analysis tools. ATTED-II now includes eight microarray- and six RNA sequencing-based coexpression data sets for seven dicot species (Arabidopsis, field mustard, soybean, barrel medick, poplar, tomato and grape) and two monocot species (rice and maize). Stand-alone coexpression analyses tend to have low reliability. Therefore, examining evolutionarily conserved coexpression is a more effective approach from the viewpoints of reliability and evolutionary importance. In contrast, the reliability of species-specific coexpression data remains poor. Our assessment scores for individual coexpression data sets indicated that the quality of the new coexpression data sets in ATTED-II is higher than for any previous coexpression data set. In addition, five species (Arabidopsis, soybean, tomato, rice and maize) in ATTED-II are now supported by both microarray- and RNA sequencing-based coexpression data, which has increased the reliability. Consequently, ATTED-II can now provide lineage-specific coexpression information. As an example of the use of ATTED-II to explore lineage-specific coexpression, we demonstrate monocot- and dicot-specific coexpression of cell wall genes. With the expanded coexpression data for multilevel evaluation, ATTED-II provides new opportunities to investigate lineage-specific evolution in plants. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  1. Co-expression of FBN1 with mesenchyme-specific genes in mouse cell lines: implications for phenotypic variability in Marfan syndrome

    PubMed Central

    Summers, Kim M; Raza, Sobia; van Nimwegen, Erik; Freeman, Thomas C; Hume, David A

    2010-01-01

    Mutations in the human FBN1 gene cause Marfan syndrome, a complex disease affecting connective tissues but with a highly variable phenotype. To identify genes that might participate in epistatic interactions with FBN1, and could therefore explain the observed phenotypic variability, we have looked for genes that are co-expressed with Fbn1 in the mouse. Microarray expression data derived from a range of primary mouse cells and cell lines were analysed using the network analysis tool BioLayout Express3D. A cluster of 205 genes, including Fbn1, were selectively expressed by mouse cell lines of different mesenchymal lineages and by mouse primary mesenchymal cells (preadipocytes, myoblasts, fibroblasts, osteoblasts). Promoter analysis of this gene set identified several candidate transcriptional regulators. Genes within this co-expressed cluster are candidate genetic modifiers for Marfan syndrome and for other connective tissue diseases. PMID:20551991

  2. Overlapping gene coexpression patterns in human medullary thymic epithelial cells generate self-antigen diversity.

    PubMed

    Pinto, Sheena; Michel, Chloé; Schmidt-Glenewinkel, Hannah; Harder, Nathalie; Rohr, Karl; Wild, Stefan; Brors, Benedikt; Kyewski, Bruno

    2013-09-10

    Promiscuous expression of numerous tissue-restricted self-antigens (TRAs) in medullary thymic epithelial cells (mTECs) is essential to safeguard self-tolerance. A distinct feature of promiscuous gene expression is its mosaic pattern (i.e., at a given time, each self-antigen is expressed only in 1-3% of mTECs). How this mosaic pattern is generated at the single-cell level is currently not understood. Here, we show that subsets of human mTECs expressing a particular TRA coexpress distinct sets of genes. We identified three coexpression groups comprising overlapping and complementary gene sets, which preferentially mapped to certain chromosomes and intrachromosomal gene clusters. Coexpressed gene loci tended to colocalize to the same nuclear subdomain. The TRA subsets aligned along progressive differentiation stages within the mature mTEC subset and, in vitro, interconverted along this sequence. Our data suggest that single mTECs shift through distinct gene pools, thus scanning a sizeable fraction of the overall repertoire of promiscuously expressed self-antigens. These findings have implications for the temporal and spatial (re)presentation of self-antigens in the medulla in the context of tolerance induction.

  3. Coexpression Network Analysis in Abdominal and Gluteal Adipose Tissue Reveals Regulatory Genetic Loci for Metabolic Syndrome and Related Phenotypes

    PubMed Central

    Min, Josine L.; Nicholson, George; Halgrimsdottir, Ingileif; Almstrup, Kristian; Petri, Andreas; Barrett, Amy; Travers, Mary; Rayner, Nigel W.; Mägi, Reedik; Pettersson, Fredrik H.; Broxholme, John; Neville, Matt J.; Wills, Quin F.; Cheeseman, Jane; Allen, Maxine; Holmes, Chris C.; Spector, Tim D.; Fleckner, Jan; McCarthy, Mark I.; Karpe, Fredrik; Lindgren, Cecilia M.; Zondervan, Krina T.

    2012-01-01

    Metabolic Syndrome (MetS) is highly prevalent and has considerable public health impact, but its underlying genetic factors remain elusive. To identify gene networks involved in MetS, we conducted whole-genome expression and genotype profiling on abdominal (ABD) and gluteal (GLU) adipose tissue, and whole blood (WB), from 29 MetS cases and 44 controls. Co-expression network analysis for each tissue independently identified nine, six, and zero MetS–associated modules of coexpressed genes in ABD, GLU, and WB, respectively. Of 8,992 probesets expressed in ABD or GLU, 685 (7.6%) were expressed in ABD and 51 (0.6%) in GLU only. Differential eigengene network analysis of 8,256 shared probesets detected 22 shared modules with high preservation across adipose depots (D(ABD-GLU) = 0.89), seven of which were associated with MetS (FDR P<0.01). The strongest associated module, significantly enriched for immune response–related processes, contained 94/620 (15%) genes with inter-depot differences. In an independent cohort of 145/141 twins with ABD and WB longitudinal expression data, median variability in ABD due to familiality was greater for MetS–associated versus un-associated modules (ABD: 0.48 versus 0.18, P = 0.08; GLU: 0.54 versus 0.20, P = 7.8×10(-4)). Cis-eQTL analysis of probesets associated with MetS (FDR P<0.01) and/or inter-depot differences (FDR P<0.01) provided evidence for 32 eQTLs. Corresponding eSNPs were tested for association with MetS–related phenotypes in two GWAS of >100,000 individuals; rs10282458, affecting expression of RARRES2 (encoding chemerin), was associated with body mass index (BMI) (P = 6.0×10(-4)); and rs2395185, affecting inter-depot differences of HLA-DRB1 expression, was associated with high-density lipoprotein (P = 8.7×10(-4)) and BMI–adjusted waist-to-hip ratio (P = 2.4×10(-4)). Since many genes and their interactions influence complex traits such as MetS, integrated analysis of genotypes and

  4. Increased co-expression of genes harboring the damaging de novo mutations in Chinese schizophrenic patients during prenatal development.

    PubMed

    Wang, Qiang; Li, Miaoxin; Yang, Zhenxing; Hu, Xun; Wu, Hei-Man; Ni, Peiyan; Ren, Hongyan; Deng, Wei; Li, Mingli; Ma, Xiaohong; Guo, Wanjun; Zhao, Liansheng; Wang, Yingcheng; Xiang, Bo; Lei, Wei; Sham, Pak C; Li, Tao

    2015-12-15

    Schizophrenia is a heritable, heterogeneous, and common psychiatric disorder. In this study, we evaluated the hypothesis that de novo variants (DNVs) contribute to the pathogenesis of schizophrenia. We performed exome sequencing in Chinese patients (N = 45) with schizophrenia and their unaffected parents (N = 90). Forty genes were found to contain DNVs. These genes had an enriched transcriptional co-expression profile in prenatal frontal cortex (Bonferroni corrected p < 9.1 × 10(-3)), and in prenatal temporal and parietal regions (Bonferroni corrected p < 0.03). Also, four prenatal anatomical subregions (VCF, MFC, OFC and ITC) showed significant enrichment of connectedness in co-expression networks. Moreover, four genes (LRP1, MACF1, DICER1 and ABCA2) harboring damaging de novo mutations are strongly prioritized as susceptibility genes by multiple lines of evidence. Our findings in Chinese schizophrenic patients indicate a pathogenic role of DNVs, supporting the hypothesis that schizophrenia is a neurodevelopmental disease.

  5. Transcriptional coexpression network reveals the involvement of varying stem cell features with different dysregulations in different gastric cancer subtypes.

    PubMed

    Kalamohan, Kalaivani; Periasamy, Jayaprakash; Bhaskar Rao, Divya; Barnabas, Georgina D; Ponnaiyan, Srigayatri; Ganesan, Kumaresan

    2014-10-01

    Despite advancements in cancer therapeutics, gastric cancer ranks as the second most common cancer, with a high global mortality rate. Integrative functional genomic investigation is a powerful approach to understand the major dysregulations and to identify potential targets for the development of targeted therapeutics for various cancers. Intestinal and diffuse type gastric tumors remain the major subtypes, and the molecular determinants and drivers of these distinct subtypes remain unidentified. In this investigation, by exploring the network of gene coexpression association in gastric tumors, mRNA expressions of 20,318 genes across 200 gastric tumors were categorized into 21 modules. The genes and the hub genes of the modules show gastric cancer subtype specific expression. The expression patterns of the modules were correlated with intestinal and diffuse subtypes as well as with the differentiation status of gastric tumors. Among these, the G1 module has been identified as a major driving force of diffuse type gastric tumors with the features of (i) enriched mesenchymal, mesenchymal stem cell like, and mesenchymal derived multiple lineages, (ii) elevated OCT1 mediated transcription, (iii) involvement of Notch activation, and (iv) reduced polycomb mediated epigenetic repression. The G13 module has been identified as a key factor in intestinal type gastric tumors and found to have the characteristic features of (i) involvement of embryonic stem cell like properties, (ii) Wnt, MYC and E2F mediated transcription programs, and (iii) involvement of polycomb mediated repression. Thus the differential transcription programs, differential epigenetic regulation and varying stem cell features involved in the two major subtypes of gastric cancer were delineated by exploring the gene coexpression network. The identified subtype specific dysregulations could be optimally employed in developing subtype specific therapeutic targeting strategies for gastric cancer.

  6. Co-Expression Network and Pathway Analyses Reveal Important Modules of miRNAs Regulating Milk Yield and Component Traits.

    PubMed

    Do, Duy N; Dudemaine, Pier-Luc; Li, Ran; Ibeagha-Awemu, Eveline M

    2017-07-18

    Co-expression network analyses provide insights into the molecular interactions underlying complex traits and diseases. In this study, co-expression network analysis was performed to detect expression patterns (modules or clusters) of microRNAs (miRNAs) during lactation, and to identify miRNA regulatory mechanisms for milk yield and component traits (fat, protein, somatic cell count (SCC), lactose, and milk urea nitrogen (MUN)) via miRNA target gene enrichment analysis. miRNA expression (713 miRNAs), and milk yield and components (Fat%, Protein%, lactose, SCC, MUN) data of nine cows at each of six different time points (day 30 (D30), D70, D130, D170, D230 and D290) of an entire lactation curve were used. Four modules or clusters (GREEN, BLUE, RED and TURQUOISE) of miRNAs were identified as important for milk yield and component traits. The GREEN and BLUE modules were significantly correlated (|r| > 0.5) with milk yield and lactose, respectively. The RED and TURQUOISE modules were significantly correlated (|r| > 0.5) with both SCC and lactose. In the GREEN module, three abundantly expressed miRNAs (miR-148a, miR-186 and miR-200a) were most significantly correlated to milk yield, and are probably the most important miRNAs for this trait. DDR1 and DDHX1 are hub genes for miRNA regulatory networks controlling milk yield, while HHEX is an important transcription regulator for these networks. miR-18a, miR-221/222 cluster, and transcription factors HOXA7, and NOTCH 3 and 4, are important for the regulation of lactose. miR-142, miR-146a, and miR-EIA17-14144 (a novel miRNA), and transcription factors in the SMAD family and MYB, are important for the regulation of SCC. Important signaling pathways enriched for target genes of miRNAs of significant modules included protein kinase A and PTEN signaling for milk yield, eNOS and Notch signaling for lactose, and TGF β, HIPPO, Wnt/β-catenin and cell cycle signaling for SCC. Relevant enriched gene ontology (GO)-terms related to

  7. Co-Expression Network and Pathway Analyses Reveal Important Modules of miRNAs Regulating Milk Yield and Component Traits

    PubMed Central

    Do, Duy N.; Dudemaine, Pier-Luc; Li, Ran; Ibeagha-Awemu, Eveline M.

    2017-01-01

    Co-expression network analyses provide insights into the molecular interactions underlying complex traits and diseases. In this study, co-expression network analysis was performed to detect expression patterns (modules or clusters) of microRNAs (miRNAs) during lactation, and to identify miRNA regulatory mechanisms for milk yield and component traits (fat, protein, somatic cell count (SCC), lactose, and milk urea nitrogen (MUN)) via miRNA target gene enrichment analysis. miRNA expression (713 miRNAs), and milk yield and components (Fat%, Protein%, lactose, SCC, MUN) data of nine cows at each of six different time points (day 30 (D30), D70, D130, D170, D230 and D290) of an entire lactation curve were used. Four modules or clusters (GREEN, BLUE, RED and TURQUOISE) of miRNAs were identified as important for milk yield and component traits. The GREEN and BLUE modules were significantly correlated (|r| > 0.5) with milk yield and lactose, respectively. The RED and TURQUOISE modules were significantly correlated (|r| > 0.5) with both SCC and lactose. In the GREEN module, three abundantly expressed miRNAs (miR-148a, miR-186 and miR-200a) were most significantly correlated to milk yield, and are probably the most important miRNAs for this trait. DDR1 and DDHX1 are hub genes for miRNA regulatory networks controlling milk yield, while HHEX is an important transcription regulator for these networks. miR-18a, miR-221/222 cluster, and transcription factors HOXA7, and NOTCH 3 and 4, are important for the regulation of lactose. miR-142, miR-146a, and miR-EIA17-14144 (a novel miRNA), and transcription factors in the SMAD family and MYB, are important for the regulation of SCC. Important signaling pathways enriched for target genes of miRNAs of significant modules included protein kinase A and PTEN signaling for milk yield, eNOS and Notch signaling for lactose, and TGF β, HIPPO, Wnt/β-catenin and cell cycle signaling for SCC. Relevant enriched gene ontology (GO)-terms related to

  8. Identification of Drosophila Mitotic Genes by Combining Co-Expression Analysis and RNA Interference

    PubMed Central

    Somma, Maria Patrizia; Ceprani, Francesca; Bucciarelli, Elisabetta; Naim, Valeria; De Arcangelis, Valeria; Piergentili, Roberto; Palena, Antonella; Ciapponi, Laura; Giansanti, Maria Grazia; Pellacani, Claudia; Petrucci, Romano; Cenci, Giovanni; Vernì, Fiammetta; Fasulo, Barbara; Goldberg, Michael L.; Di Cunto, Ferdinando; Gatti, Maurizio

    2008-01-01

    RNAi screens have, to date, identified many genes required for mitotic divisions of Drosophila tissue culture cells. However, the inventory of such genes remains incomplete. We have combined the powers of bioinformatics and RNAi technology to detect novel mitotic genes. We found that Drosophila genes involved in mitosis tend to be transcriptionally co-expressed. We thus constructed a co-expression–based list of 1,000 genes that are highly enriched in mitotic functions, and we performed RNAi for each of these genes. By limiting the number of genes to be examined, we were able to perform a very detailed phenotypic analysis of RNAi cells. We examined dsRNA-treated cells for possible abnormalities in both chromosome structure and spindle organization. This analysis allowed the identification of 142 mitotic genes, which were subdivided into 18 phenoclusters. Seventy of these genes have not previously been associated with mitotic defects; 30 of them are required for spindle assembly and/or chromosome segregation, and 40 are required to prevent spontaneous chromosome breakage. We note that the latter type of genes has never been detected in previous RNAi screens in any system. Finally, we found that RNAi against genes encoding kinetochore components or highly conserved splicing factors results in identical defects in chromosome segregation, highlighting an unanticipated role of splicing factors in centromere function. These findings indicate that our co-expression–based method for the detection of mitotic functions works remarkably well. We can foresee that elaboration of co-expression lists using genes in the same phenocluster will provide many candidate genes for small-scale RNAi screens aimed at completing the inventory of mitotic proteins. PMID:18797514
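
    The co-expression-based candidate list described above can be illustrated with a small guilt-by-association ranking. This is a minimal Python sketch, assuming that candidates are scored by their mean Pearson correlation with a set of known (seed) genes; the gene names and expression values are hypothetical, and the scoring rule is an illustrative assumption, not the procedure used by the authors.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_candidates(expression, seed_genes):
    """Rank non-seed genes by mean correlation with the seed genes (highest first)."""
    scores = {}
    for gene, profile in expression.items():
        if gene in seed_genes:
            continue
        cors = [pearson(profile, expression[seed]) for seed in seed_genes]
        scores[gene] = sum(cors) / len(cors)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical expression profiles across five conditions.
expression = {
    "seed1": [1.0, 2.0, 3.0, 4.0, 5.0],  # known gene of interest
    "seed2": [2.0, 2.9, 4.1, 5.0, 6.2],  # known gene of interest
    "candA": [0.9, 2.1, 3.2, 3.9, 5.1],  # tracks the seeds closely
    "candB": [5.0, 1.0, 4.0, 2.0, 3.0],  # no clear relationship
    "candC": [5.0, 4.1, 3.0, 2.2, 1.0],  # anti-correlated with the seeds
}

ranking = rank_candidates(expression, ["seed1", "seed2"])
for gene, score in ranking:
    print(f"{gene}: mean r = {score:.2f}")
```

    Taking the top of such a ranking gives a shortlist that can be carried into a follow-up screen, which is the role the 1,000-gene list plays in the study.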

  9. Phenotype-Dependent Coexpression Gene Clusters: Application to Normal and Premature Ageing.

    PubMed

    Wang, Kun; Das, Avinash; Xiong, Zheng-Mei; Cao, Kan; Hannenhalli, Sridhar

    2015-01-01

    Hutchinson Gilford progeria syndrome (HGPS) is a rare genetic disease with symptoms of aging at a very early age. Its molecular basis is not entirely clear, although profound gene expression changes have been reported, and there are some known and other presumed overlaps with the normal aging process. Identification of genes with aging- or HGPS-associated expression changes is thus an important problem. However, standard regression approaches are currently unsuitable for this task due to limited sample sizes, thus motivating development of alternative approaches. Here, we report a novel iterative multiple regression approach that leverages co-expressed gene clusters to identify gene clusters whose expression co-varies with age and/or HGPS. We have applied our approach to novel RNA-seq profiles in fibroblast cell cultures at three different cellular ages, both from HGPS patients and normal samples. After establishing the robustness of our approach, we perform a comparative investigation of biological processes underlying normal aging and HGPS. Our results recapitulate previously known processes underlying aging as well as suggest numerous unique processes underlying aging and HGPS. The approach could also be useful in detecting phenotype-dependent co-expression gene clusters in other contexts with limited sample sizes.

  10. A Systems Approach Implicates a Brain Mitochondrial Oxidative Homeostasis Co-expression Network in Genetic Vulnerability to Alcohol Withdrawal

    PubMed Central

    Walter, Nicole A. R.; Denmark, DeAunne L.; Kozell, Laura B.; Buck, Kari J.

    2017-01-01

    Genetic factors significantly affect vulnerability to alcohol dependence (alcoholism). We previously identified quantitative trait loci on distal mouse chromosome 1 with large effects on predisposition to alcohol physiological dependence and associated withdrawal following both chronic and acute alcohol exposure in mice (Alcdp1 and Alcw1, respectively). We fine-mapped these loci to a 1.1–1.7 Mb interval syntenic with human 1q23.2-23.3. Alcw1/Alcdp1 interval genes show remarkable genetic variation among mice derived from the C57BL/6J and DBA/2J strains, the two most widely studied genetic animal models for alcohol-related traits. Here, we report the creation of a novel recombinant Alcw1/Alcdp1 congenic model (R2) in which the Alcw1/Alcdp1 interval from a donor C57BL/6J strain is introgressed onto a uniform, inbred DBA/2J genetic background. As expected, R2 mice demonstrate significantly less severe alcohol withdrawal compared to wild-type littermates. Additionally, comparing R2 and background strain animals, as well as reciprocal congenic (R8) and appropriate background strain animals, we assessed Alcw1/Alcdp1 dependent brain gene expression using microarray and quantitative PCR analyses. To our knowledge this includes the first Weighted Gene Co-expression Network Analysis using reciprocal congenic models. Importantly, this allows detection of co-expression patterns limited to one or common to both genetic backgrounds with high or low predisposition to alcohol withdrawal severity. The gene expression patterns (modules) in common contain genes related to oxidative phosphorylation, building upon human and animal model studies that implicate involvement of oxidative phosphorylation in alcohol use disorders (AUDs). Finally, we demonstrate that administration of N-acetylcysteine, an FDA-approved antioxidant, significantly reduces symptoms of alcohol withdrawal (convulsions) in mice, thus validating a phenotypic role for this network. Taken together, these studies

  11. Coexpression Analysis Reveals Key Gene Modules and Pathway of Human Coronary Heart Disease.

    PubMed

    Tang, Yu; Ke, Zun-Ping; Peng, Yi-Gen; Cai, Ping-Tai

    2017-08-31

    Coronary heart disease causes great harm to people worldwide. Although gene expression analyses have been performed previously, to the best of our knowledge, systematic co-expression analysis for this disease is still lacking. Microarray data of coronary heart disease were downloaded from NCBI under the accession number GSE20681. Co-expression modules were constructed by WGCNA. In addition, the connectivity degree of eigengenes was analyzed. Furthermore, GO and KEGG enrichment analysis was performed on these eigengenes in the constructed modules. A total of 11 co-expression modules were constructed from the 3,000 up-regulated genes in the 99 samples with coronary heart disease. The average number of genes in these modules was 270. The interaction analysis indicated the relative independence of gene expression in these modules. The functional enrichment analysis showed that there was a significant difference in the enriched terms and degree among these 11 modules. The results showed that module 9 and module 10 played critical roles in the occurrence of coronary disease. The pathways hsa00190 (Oxidative phosphorylation) and hsa01130 (Biosynthesis of antibiotics) were thought to be closely related to the occurrence and development of coronary heart disease. Our results demonstrated that module 9 and module 10 were the most critical modules in the occurrence of coronary heart disease, and that the pathways hsa00190 (Oxidative phosphorylation) and hsa01130 (Biosynthesis of antibiotics) had the potential to serve as prognostic and predictive markers of coronary heart disease. This article is protected by copyright. All rights reserved.
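
    The network construction step behind WGCNA-style analyses such as the one above can be sketched briefly. WGCNA builds an unsigned weighted network by raising the absolute pairwise correlation to a soft-thresholding power, a_ij = |cor(i, j)|^beta, and a gene's connectivity is the sum of its adjacencies. The following Python sketch assumes hypothetical gene names and expression values and an illustrative beta of 6; it covers only the adjacency and connectivity step, not module detection, and it is not the WGCNA R package itself.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(i, j)| ** beta for every gene pair."""
    genes = list(expression)
    adj = {gene: {} for gene in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            weight = abs(pearson(expression[gi], expression[gj])) ** beta
            adj[gi][gj] = weight
            adj[gj][gi] = weight
    return adj

def connectivity(adj):
    """Connectivity of each gene: the sum of its adjacencies to all other genes."""
    return {gene: sum(neighbours.values()) for gene, neighbours in adj.items()}

# Hypothetical expression profiles across five samples.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [1.2, 1.9, 3.1, 4.2, 4.8],
    "g3": [2.1, 3.0, 3.9, 5.2, 6.0],
    "g4": [5.0, 1.0, 4.0, 2.0, 3.0],  # weakly related to the others
}

adj = soft_threshold_adjacency(expression)
k = connectivity(adj)
for gene in sorted(k, key=k.get, reverse=True):
    print(f"{gene}: connectivity = {k[gene]:.3f}")
```

    Raising correlations to a power suppresses weak links far more than strong ones, so the loosely related gene ends up with near-zero connectivity while the tightly correlated genes stay strongly connected.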

  12. Analysis of differentially co-expressed genes based on microarray data of hepatocellular carcinoma.

    PubMed

    Wang, Y; Jiang, T; Li, Z; Lu, L; Zhang, R; Zhang, D; Wang, X; Tan, J

    2017-01-01

    Hepatocellular carcinoma (HCC) is the third leading cause of cancer-related death worldwide. Although great progress in the diagnosis and management of HCC has been made, the exact molecular mechanisms remain poorly understood. This study aims to identify potential biomarkers for HCC progression, mainly at the transcription level. Chip data set GSE29721, which contains 10 HCC samples and 10 normal adjacent tissue samples, was used. Differentially expressed genes (DEGs) between the two sample types were selected by t-test. Next, the differentially co-expressed genes (DCGs) and differentially co-expressed links (DCLs) were identified with the DCGL package in R at a threshold of q < 0.25. Pathway enrichment analysis of the DCGs was then carried out with DAVID, and DCLs were mapped to the TRANSFAC database to reveal associations between relevant transcription factors (TFs) and their target genes. Quantitative real-time RT-PCR was performed for TFs or genes of interest. In total, 388 DCGs and 35,771 DCLs were obtained. The predominant pathways enriched by these genes were cytokine-cytokine receptor interaction, ECM-receptor interaction and the TGF-β signaling pathway. Three TF-target interactions, LEF1-NCAM1, EGR1-FN1 and FOS-MT2A, were predicted. Compared with control, expression of the TF genes EGR1, FOS and ETS2 was up-regulated in the HCC cell line HepG2, while LEF1 was down-regulated. Except for NCAM1, all the target genes were up-regulated in HepG2. Our findings suggest these TFs and genes might play important roles in the pathogenesis of HCC and may be used as therapeutic targets for HCC management.

  13. Use of transcriptomics and co-expression networks to analyze the interconnections between nitrogen assimilation and photorespiratory metabolism.

    PubMed

    Pérez-Delgado, Carmen M; Moyano, Tomás C; García-Calderón, Margarita; Canales, Javier; Gutiérrez, Rodrigo A; Márquez, Antonio J; Betti, Marco

    2016-05-01

    Nitrogen is one of the most important nutrients for plants and, in natural soils, its availability is often a major limiting factor for plant growth. Here we examine the effect of different forms of nitrogen nutrition and of photorespiration on gene expression in the model legume Lotus japonicus with the aim of identifying regulatory candidate genes co-ordinating primary nitrogen assimilation and photorespiration. The transcriptomic changes produced by the use of different nitrogen sources in leaves of L. japonicus plants combined with the transcriptomic changes produced in the same tissue by different photorespiratory conditions were examined. The results obtained provide novel information on the possible role of plastidic glutamine synthetase in the response to different nitrogen sources and in the C/N balance of L. japonicus plants. The use of gene co-expression networks establishes a clear relationship between photorespiration and primary nitrogen assimilation and identifies possible transcription factors connected to the genes of both routes. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  14. Identifying the optimal gene and gene set in hepatocellular carcinoma based on differential expression and differential co-expression algorithm.

    PubMed

    Dong, Li-Yang; Zhou, Wei-Zhong; Ni, Jun-Wei; Xiang, Wei; Hu, Wen-Hao; Yu, Chang; Li, Hai-Yan

    2017-02-01

    The objective of this study was to identify the optimal gene and gene set for hepatocellular carcinoma (HCC) using the differential expression and differential co-expression (DEDC) algorithm. The DEDC algorithm consisted of four parts: calculating differential expression (DE) as the absolute t-value of the t-statistic; computing differential co-expression (DC) based on a Z-test; determining optimal thresholds by Chi-squared (χ²) maximization, with the corresponding gene taken as the optimal gene; and evaluating the functional relevance of genes categorized into different partitions to determine the optimal gene set with the highest mean minimum functional information (FI) gain (Δ*G). The optimal thresholds divided genes into four partitions: high DE and high DC (HDE-HDC), high DE and low DC (HDE-LDC), low DE and high DC (LDE-HDC), and low DE and low DC (LDE-LDC). In addition, the optimal gene was validated by reverse transcription-polymerase chain reaction (RT-PCR) assay. The optimal thresholds for DC and DE were 1.032 and 1.911, respectively. Using the optimal gene, the genes were divided into four partitions: HDE-HDC (2,053 genes), HDE-LDC (2,822 genes), LDE-HDC (2,622 genes), and LDE-LDC (6,169 genes). The optimal gene was microtubule-associated protein RP/EB family member 1 (MAPRE1), and the RT-PCR assay validated the significant difference between the HCC and normal states. The optimal gene set was nucleoside metabolic process (GO:0009116), with Δ*G = 18.681 and 24 HDE-HDC partitions in total. In conclusion, we identified the optimal gene, MAPRE1, and gene set, nucleoside metabolic process, which may be potential biomarkers for targeted therapy and may provide significant insight into the pathological mechanism underlying HCC.
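
    The DE/DC partitioning described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract only says that DC is "based on Z-test", so the Fisher z-transform comparison of two correlations used here is an assumption, as are the function names and the choice to treat a score equal to the threshold as "high". The default thresholds (1.911 for DE, 1.032 for DC) are the values reported in the abstract.

```python
import math


def fisher_z_difference(r_case, r_control, n_case, n_control):
    """Z statistic for the difference between two correlation coefficients.

    Assumption: a Fisher z-transform test; the abstract does not specify
    which Z-test the DEDC algorithm uses.
    """
    z_case = math.atanh(r_case)
    z_control = math.atanh(r_control)
    standard_error = math.sqrt(1.0 / (n_case - 3) + 1.0 / (n_control - 3))
    return (z_case - z_control) / standard_error


def assign_partition(de_score, dc_score, de_threshold=1.911, dc_threshold=1.032):
    """Label a gene as HDE/LDE and HDC/LDC from its DE and DC scores."""
    de_label = "HDE" if de_score >= de_threshold else "LDE"
    dc_label = "HDC" if dc_score >= dc_threshold else "LDC"
    return f"{de_label}-{dc_label}"


if __name__ == "__main__":
    # Hypothetical scores for three genes: (absolute t-value, DC score).
    scores = {"gene_a": (2.5, 1.5), "gene_b": (2.5, 0.2), "gene_c": (0.5, 0.2)}
    for gene, (de, dc) in scores.items():
        print(gene, assign_partition(de, dc))
```

    The gene names and scores in the usage block are invented for illustration only.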

  15. MGMT enrichment and second gene co-expression in hematopoietic progenitor cells using separate or dual-gene lentiviral vectors.

    PubMed

    Roth, Justin C; Alberti, Michael O; Ismail, Mourad; Lingas, Karen T; Reese, Jane S; Gerson, Stanton L

    2015-01-22

    The DNA repair gene O(6)-methylguanine-DNA methyltransferase (MGMT) allows efficient in vivo enrichment of transduced hematopoietic stem cells (HSC). Thus, linking this selection strategy to therapeutic gene expression offers the potential to reconstitute diseased hematopoietic tissue with gene-corrected cells. However, different dual-gene expression vector strategies are limited by poor expression of one or both transgenes. To evaluate different co-expression strategies in the context of MGMT-mediated HSC enrichment, we compared selection and expression efficacies in cells cotransduced with separate single-gene MGMT and GFP lentivectors to those obtained with dual-gene vectors employing either encephalomyocarditis virus (EMCV) internal ribosome entry site (IRES) or foot and mouth disease virus (FMDV) 2A elements for co-expression strategies. Each strategy was evaluated in vitro and in vivo using equivalent multiplicities of infection (MOI) to transduce 5-fluorouracil (5-FU) or Lin(-)Sca-1(+)c-kit(+) (LSK)-enriched murine bone marrow cells (BMCs). The highest dual-gene expression (MGMT(+)GFP(+)) percentages were obtained with the FMDV-2A dual-gene vector, but half of the resulting gene products existed as fusion proteins. Following selection, dual-gene expression percentages in single-gene vector cotransduced and dual-gene vector transduced populations were similar. Equivalent MGMT expression levels were obtained with each strategy, but GFP expression levels derived from the IRES dual-gene vector were significantly lower. In mice, vector-insertion averages were similar among cells enriched after dual-gene vectors and those cotransduced with single-gene vectors. These data demonstrate the limitations and advantages of each strategy in the context of MGMT-mediated selection, and may provide insights into vector design with respect to a particular therapeutic gene or hematologic defect.

  16. VSNL1 Co-Expression Networks in Aging Include Calcium Signaling, Synaptic Plasticity, and Alzheimer’s Disease Pathways

    PubMed Central

    Lin, Chien-Wei; Chang, Lun-Ching; Tseng, George C.; Kirkwood, Caitlin M.; Sibille, Etienne L.; Sweet, Robert A.

    2015-01-01

    The visinin-like 1 (VSNL1) gene encodes visinin-like protein 1, a peripheral biomarker for Alzheimer disease (AD). Little is known, however, about normal VSNL1 expression in brain and the biologic networks in which it participates. Frontal cortex gray matter obtained from 209 subjects without neurodegenerative or psychiatric illness, ranging in age from 16 to 91, was processed on Affymetrix GeneChip 1.1 ST and Human SNP Array 6.0. VSNL1 expression was unaffected by age and sex, and not significantly associated with SNPs in cis or trans. VSNL1 was significantly co-expressed with genes in pathways for calcium signaling, AD, long-term potentiation, long-term depression, and trafficking of AMPA receptors. The association with AD was driven, in part, by correlation with amyloid precursor protein (APP) expression. These findings provide an unbiased link between VSNL1 and molecular mechanisms of AD, including pathways implicated in synaptic pathology in AD. Whether APP may drive increased VSNL1 expression, VSNL1 drives increased APP expression, or both are downstream of common pathogenic regulators will need to be evaluated in model systems. PMID:25806004

  17. TF-Cluster: A pipeline for identifying functionally coordinated transcription factors via network decomposition of the shared coexpression connectivity matrix (SCCM)

    PubMed Central

    2011-01-01

    Background: Identifying the key transcription factors (TFs) controlling a biological process is the first step toward a better understanding of the underpinning regulatory mechanisms. However, due to the involvement of a large number of genes and complex interactions in gene regulatory networks, identifying TFs involved in a biological process remains particularly difficult. The challenges include: (1) Most eukaryotic genomes encode thousands of TFs, which are organized in gene families of various sizes and in many cases with poor sequence conservation, making it difficult to recognize TFs for a biological process; (2) Transcription usually involves several hundred genes that generate a combination of intrinsic noise from upstream signaling networks and lead to fluctuations in transcription; (3) A TF can function in different cell types or developmental stages. Currently, the methods available for identifying TFs involved in biological processes are still very scarce, and the development of novel, more powerful methods is desperately needed. Results: We developed a computational pipeline called TF-Cluster for identifying functionally coordinated TFs in two steps: (1) Construction of a shared coexpression connectivity matrix (SCCM), in which each entry represents the number of shared coexpressed genes between two TFs. This sparse and symmetric matrix embodies a new concept of coexpression networks in which genes are associated in the context of other shared coexpressed genes; (2) Decomposition of the SCCM using a novel heuristic algorithm termed "Triple-Link", which searches for the highest connectivity in the SCCM and then uses the two connected TFs as a primer for growing a TF cluster with a number of linking criteria. We applied TF-Cluster to microarray data from human stem cells and Arabidopsis roots, and then demonstrated that many of the resulting TF clusters contain functionally coordinated TFs that, based on existing literature, accurately represent a biological process
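
    The SCCM construction described in step (1) can be sketched as follows. This is a minimal illustration under stated assumptions: each TF is assumed to already have a set of coexpressed genes (how that set is chosen is not given in the abstract), the diagonal is left at zero, and the function names are hypothetical. Only the starting point of step (2), finding the most strongly connected TF pair, is shown; the Triple-Link linking criteria are not specified in the abstract and are not reproduced.

```python
from itertools import combinations


def build_sccm(coexpressed):
    """Build a symmetric matrix of shared coexpressed-gene counts.

    `coexpressed` maps each TF name to the set of genes coexpressed with it.
    The result is a nested dict: sccm[tf_a][tf_b] = number of shared genes.
    """
    tfs = sorted(coexpressed)
    sccm = {tf: {other: 0 for other in tfs} for tf in tfs}
    for tf_a, tf_b in combinations(tfs, 2):
        shared = len(coexpressed[tf_a] & coexpressed[tf_b])
        sccm[tf_a][tf_b] = shared
        sccm[tf_b][tf_a] = shared
    return sccm


def seed_pair(sccm):
    """Return the TF pair with the highest shared connectivity."""
    tfs = sorted(sccm)
    return max(combinations(tfs, 2), key=lambda pair: sccm[pair[0]][pair[1]])


if __name__ == "__main__":
    # Hypothetical TFs and coexpressed gene sets.
    coexpressed = {
        "TF_A": {"g1", "g2", "g3"},
        "TF_B": {"g2", "g3", "g4"},
        "TF_C": {"g5"},
    }
    sccm = build_sccm(coexpressed)
    print(sccm["TF_A"]["TF_B"])
    print(seed_pair(sccm))
```

    A nested dictionary is used here for readability; for thousands of TFs a sparse matrix representation would likely be more appropriate.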

  18. MIrExpress: A Database for Gene Coexpression Correlation in Immune Cells Based on Mutual Information and Pearson Correlation.

    PubMed

    Wang, Luman; Mo, Qiaochu; Wang, Jianxin

    2015-01-01

    Most current gene coexpression databases support analysis of linear correlation between gene pairs but not nonlinear correlation, which hinders precise evaluation of gene-gene coexpression strength. Here, we report a new database, MIrExpress, which uses information theory as well as the Pearson linear correlation method to measure the linear correlation, nonlinear correlation, and their hybrid for cell-specific gene coexpression in immune cells. For a given gene pair or probe set pair entered by a web user, both mutual information (MI) and the Pearson correlation coefficient (r) are calculated, and several corresponding values are reported to reflect the nature of the coexpression correlation, including the MI and r values, their respective rank orderings, their rank comparison, and their hybrid correlation value. Furthermore, for a given gene, the top 10 most relevant genes are displayed from the MI, r, or hybrid perspective. Currently, the database includes a total of 16 human cell groups, involving 20,283 human genes. The expression data and the calculated correlation results from the database are interactively accessible on the web page and can be used for other related applications and research.
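
    The contrast between Pearson correlation and mutual information can be illustrated with a small sketch. This is not MIrExpress code: the equal-width binning used to estimate MI, the bin count, and the function names are assumptions chosen for brevity, and the database's hybrid correlation value is not reproduced because the abstract does not define it. The example uses a purely quadratic relationship, for which r is zero while the MI estimate is positive.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def discretize(values, bins):
    """Assign each value to one of `bins` equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / bins or 1.0  # guard against constant input
    return [min(int((v - low) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram-based mutual information estimate, in bits."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        (count / n) * math.log2((count / n) / ((px[i] / n) * (py[j] / n)))
        for (i, j), count in joint.items()
    )


if __name__ == "__main__":
    x = [-3, -2, -1, 0, 1, 2, 3]
    y = [v * v for v in x]  # nonlinear, symmetric dependence
    print("r  =", pearson(x, y))
    print("MI =", mutual_information(x, y))
```

    Histogram estimates of MI are sensitive to the number of bins and the sample size, so the absolute value should be interpreted with care.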

  19. Gene co-expression modules as clinically relevant hallmarks of breast cancer diversity.

    PubMed

    Wolf, Denise M; Lenburg, Marc E; Yau, Christina; Boudreau, Aaron; van 't Veer, Laura J

    2014-01-01

    Co-expression modules are groups of genes with highly correlated expression patterns. In cancer, differences in module activity potentially represent the heterogeneity of phenotypes important in carcinogenesis, progression, or treatment response. To find gene expression modules active in breast cancer subpopulations, we assembled 72 breast cancer-related gene expression datasets containing ∼5,700 samples altogether. Per dataset, we identified genes with bimodal expression and used mixture-model clustering to ultimately define 11 modules of genes that are consistently co-regulated across multiple datasets. Functionally, these modules reflected estrogen signaling, development/differentiation, immune signaling, histone modification, ERBB2 signaling, the extracellular matrix (ECM) and stroma, and cell proliferation. The Tcell/Bcell immune modules appeared tumor-extrinsic, with coherent expression in tumors but not cell lines; whereas most other modules, interferon and ECM included, appeared intrinsic. Only four of the eleven modules were represented in the PAM50 intrinsic subtype classifier and other well-established prognostic signatures; although the immune modules were highly correlated to previously published immune signatures. As expected, the proliferation module was highly associated with decreased recurrence-free survival (RFS). Interestingly, the immune modules appeared associated with RFS even after adjustment for receptor subtype and proliferation; and in a multivariate analysis, the combination of Tcell/Bcell immune module down-regulation and proliferation module upregulation strongly associated with decreased RFS. Immune modules are unusual in that their upregulation is associated with a good prognosis without chemotherapy and a good response to chemotherapy, suggesting the paradox of high immune patients who respond to chemotherapy but would do well without it. Other findings concern the ECM/stromal modules, which despite common themes were associated

  20. Module Based Differential Coexpression Analysis Method for Type 2 Diabetes

    PubMed Central

    Yuan, Lin; Zheng, Chun-Hou; Xia, Jun-Feng; Huang, De-Shuang

    2015-01-01

    A growing number of studies have shown that many complex diseases result from the joint alteration of numerous genes. Genes often act together as a functional biological pathway or network and are highly correlated. Differential coexpression analysis, a more comprehensive technique than differential expression analysis, was introduced to study the gene regulatory networks and biological pathways underlying phenotypic changes by measuring changes in gene correlation between disease and normal conditions. In this paper, we propose a gene differential coexpression analysis algorithm at the level of gene sets and apply it to a publicly available type 2 diabetes (T2D) expression dataset. First, we calculate biweight midcorrelation coefficients between all gene pairs. Then, we select informative correlation pairs using the “differential coexpression threshold” strategy. Finally, we identify the differential coexpression gene modules using the maximum clique concept and the k-clique algorithm. We apply the proposed method to simulated data and T2D data. Two differential coexpression gene modules related to T2D were detected, which should be useful for exploring the biological function of the related genes. PMID:26339648
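
    The first step above, the biweight midcorrelation, can be sketched as follows. This is a minimal sketch of the standard definition (median-centred values, down-weighted by distance from the median in units of 9 times the median absolute deviation), not the authors' code. The `differential_coexpression` helper and its absolute-difference form are assumptions, since the abstract does not spell out the "differential coexpression threshold" strategy.

```python
import math
from statistics import median


def _weighted_deviations(values):
    """Median-centred, biweight-weighted deviations scaled to unit norm."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("median absolute deviation is zero")
    weighted = []
    for v in values:
        u = (v - med) / (9.0 * mad)
        weight = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        weighted.append((v - med) * weight)
    norm = math.sqrt(sum(t * t for t in weighted))
    return [t / norm for t in weighted]


def bicor(x, y):
    """Biweight midcorrelation of two equal-length expression vectors."""
    return sum(a * b for a, b in zip(_weighted_deviations(x), _weighted_deviations(y)))


def differential_coexpression(x_disease, y_disease, x_normal, y_normal):
    """Absolute change in bicor between conditions (hypothetical helper)."""
    return abs(bicor(x_disease, y_disease) - bicor(x_normal, y_normal))


if __name__ == "__main__":
    # Hypothetical expression values for one gene pair in two conditions.
    gene_1 = [1, 2, 3, 4, 5]
    print(bicor(gene_1, [2, 4, 6, 8, 10]))
    print(differential_coexpression(gene_1, [2, 4, 6, 8, 10], gene_1, [10, 8, 6, 4, 2]))
```

    Because values far from the median receive zero weight, a single outlying sample has much less influence on this coefficient than on the Pearson correlation.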

  1. Different substrate regimes determine transcriptional profiles and gene co-expression in Methanosarcina barkeri (DSM 800).

    PubMed

    Lin, Qiang; Fang, Xiaoyu; Ho, Adrian; Li, Jiaying; Yan, Xuefeng; Tu, Bo; Li, Chaonan; Li, Jiabao; Yao, Minjie; Li, Xiangzhen

    2017-08-21

    Methanosarcina barkeri (DSM 800) is a metabolically versatile methanogen that shows distinct metabolic states under different substrate regimes. However, the mechanisms underlying its distinct transcriptional profiles under different substrate regimes remain elusive. In this study, the growth performance and gene expression of M. barkeri fed on acetate, H2 + CO2, or methanol were investigated by transcriptional analysis. M. barkeri showed the highest growth performance under methanol, followed by H2 + CO2 and acetate, which corresponded well with the variation in gene expression. The α diversity (evenness) of gene expression was highest under the acetate regime, followed by H2 + CO2 and methanol, and was significantly and negatively correlated with growth performance. The gene co-expression analysis showed that "Energy production and conversion," "Coenzyme transport and metabolism," and "Translation, ribosomal structure, and biogenesis" showed deterministic cooperation patterns within and between functional classes. However, "Posttranslational modification, protein turnover, chaperones" showed exclusion with other functional classes. Gene expression, and especially the relationships among expressed genes, potentially drove the shifts in metabolic status under different substrate regimes. Consequently, this study revealed diversity-related ecological strategies, in which a high α diversity probably provides more fitness and tolerance in natural environments whereas a low α diversity strengthens some specific physiological functions, as well as the co-responses of gene expression to different substrate regimes.

  2. Evolutionary conserved gene co-expression drives generation of self-antigen diversity in medullary thymic epithelial cells.

    PubMed

    Rattay, Kristin; Meyer, Hannah Verena; Herrmann, Carl; Brors, Benedikt; Kyewski, Bruno

    2016-02-01

    Promiscuous expression of a plethora of tissue-restricted antigens (TRAs) in medullary thymic epithelial cells (mTECs) is essential for central tolerance. This promiscuous gene expression (pGE) is characterized by inclusion of a broad range of TRAs and by its mosaic expression patterns, i.e. each antigen is only expressed in 1-3% of mTECs. It is currently unclear to what extent random and/or deterministic mechanisms are involved in the regulation of pGE. To address this issue, we deconstructed the transcriptional heterogeneity in mTECs into minor subsets expressing a particular TRA. We identified six delineable co-expression groups in mouse mTECs. These co-expression groups displayed a variable degree of mutual overlap and mapped to different stages of mTEC development. Co-expressed genes showed chromosomal preference and clustered within delimited genomic regions. Moreover, co-expression groups in mice and humans selected by a pair of orthologous genes preferentially co-expressed sets of orthologous genes, attesting to the conservation of pGE between mouse and human. Furthermore, co-expressed genes were enriched for specific transcription factor binding motifs concomitant with up-regulation of the corresponding transcription factors, implicating additional factors in the regulation of pGE besides the Autoimmune Regulator (Aire). Thus promiscuous transcription of self-antigens in mTECs entails a highly coordinated process, which is strictly evolutionarily conserved between species. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. FlyExpress 7: An Integrated Discovery Platform To Study Coexpressed Genes Using in Situ Hybridization Images in Drosophila.

    PubMed

    Kumar, Sudhir; Konikoff, Charlotte; Sanderford, Maxwell; Liu, Li; Newfeld, Stuart; Ye, Jieping; Kulathinal, Rob J

    2017-08-07

    Gene expression patterns assayed across development can offer key clues about a gene's function and regulatory role. Drosophila melanogaster is ideal for such investigations as multiple individual and high-throughput efforts have captured the spatiotemporal patterns of thousands of embryonic expressed genes in the form of in situ images. FlyExpress (www.flyexpress.net), a knowledgebase based on a massive and unique digital library of standardized images and a simple search engine to find coexpressed genes, was created to facilitate the analytical and visual mining of these patterns. Here, we introduce the next generation of FlyExpress resources to facilitate the integrative analysis of sequence data and spatiotemporal patterns of expression from images. FlyExpress 7 now includes over 100,000 standardized in situ images and implements a more efficient, user-defined search algorithm to identify coexpressed genes via Genomewide Expression Maps (GEMs). Shared motifs found in the upstream 5' regions of any pair of coexpressed genes can be visualized in an interactive dotplot. Additional webtools and link-outs to assist in the downstream validation of candidate motifs are also provided. Together, FlyExpress 7 represents our largest effort yet to accelerate discovery via the development and dispersal of new webtools that allow researchers to perform data-driven analyses of coexpression (image) and genomic (sequence) data. Copyright © 2017 Kumar et al.

  4. Novel role of ZmaNAC36 in co-expression of starch synthetic genes in maize endosperm.

    PubMed

    Zhang, Junjie; Chen, Jiang; Yi, Qiang; Hu, Yufeng; Liu, Hanmei; Liu, Yinghong; Huang, Yubi

    2014-02-01

    Starch is an essential commodity that is widely used as food, feed, fuel and in industry. However, its mechanism of synthesis is not fully understood, especially in terms of the expression and regulation of the starch synthetic genes. It was reported that the starch synthetic genes were co-expressed during maize endosperm development; however, the mechanism of the co-expression was not reported. In this paper, the ZmaNAC36 gene was amplified by homology-based cloning, and its expression vector was constructed for transient expression. The nuclear localization, transcriptional activation and target sites of the ZmaNAC36 protein were identified. The expression profile of ZmaNAC36 showed that it was strongly expressed in the maize endosperm and was co-expressed with most of the starch synthetic genes. Moreover, the expressions of many starch synthesis genes in the endosperm were upregulated when ZmaNAC36 was transiently overexpressed. All our results indicated that NAC36 might be a transcription factor and play a potential role in the co-expression of starch synthetic genes in the maize endosperm.

  5. Harnessing gene expression networks to prioritize candidate epileptic encephalopathy genes.

    PubMed

    Oliver, Karen L; Lukic, Vesna; Thorne, Natalie P; Berkovic, Samuel F; Scheffer, Ingrid E; Bahlo, Melanie

    2014-01-01

    We apply a novel gene expression network analysis to a cohort of 182 recently reported candidate Epileptic Encephalopathy genes to identify those most likely to be true Epileptic Encephalopathy genes. These candidate genes were identified as having single variants of likely pathogenic significance discovered in a large-scale massively parallel sequencing study. Candidate Epileptic Encephalopathy genes were prioritized according to their co-expression with 29 known Epileptic Encephalopathy genes. We utilized developing brain and adult brain gene expression data from the Allen Human Brain Atlas (AHBA) and compared this to data from Celsius: a large, heterogeneous gene expression data warehouse. We show replicable prioritization results using these three independent gene expression resources, two of which are brain-specific, with small sample size, and the third derived from a heterogeneous collection of tissues with large sample size. Of the nineteen genes that we predicted with the highest likelihood to be true Epileptic Encephalopathy genes, two (GNAO1 and GRIN2B) have recently been independently reported and confirmed. We compare our results to those produced by an established in silico prioritization approach called Endeavour, and finally present gene expression networks for the known and candidate Epileptic Encephalopathy genes. This highlights sub-networks of gene expression, particularly in the network derived from the adult AHBA gene expression dataset. These networks give clues to the likely biological interactions between Epileptic Encephalopathy genes, potentially highlighting underlying mechanisms and avenues for therapeutic targets.

  6. Harnessing Gene Expression Networks to Prioritize Candidate Epileptic Encephalopathy Genes

    PubMed Central

    Oliver, Karen L.; Lukic, Vesna; Thorne, Natalie P.; Berkovic, Samuel F.; Scheffer, Ingrid E.; Bahlo, Melanie

    2014-01-01

    We apply a novel gene expression network analysis to a cohort of 182 recently reported candidate Epileptic Encephalopathy genes to identify those most likely to be true Epileptic Encephalopathy genes. These candidate genes were identified as having single variants of likely pathogenic significance discovered in a large-scale massively parallel sequencing study. Candidate Epileptic Encephalopathy genes were prioritized according to their co-expression with 29 known Epileptic Encephalopathy genes. We utilized developing brain and adult brain gene expression data from the Allen Human Brain Atlas (AHBA) and compared this to data from Celsius: a large, heterogeneous gene expression data warehouse. We show replicable prioritization results using these three independent gene expression resources, two of which are brain-specific, with small sample size, and the third derived from a heterogeneous collection of tissues with large sample size. Of the nineteen genes that we predicted with the highest likelihood to be true Epileptic Encephalopathy genes, two (GNAO1 and GRIN2B) have recently been independently reported and confirmed. We compare our results to those produced by an established in silico prioritization approach called Endeavour, and finally present gene expression networks for the known and candidate Epileptic Encephalopathy genes. This highlights sub-networks of gene expression, particularly in the network derived from the adult AHBA gene expression dataset. These networks give clues to the likely biological interactions between Epileptic Encephalopathy genes, potentially highlighting underlying mechanisms and avenues for therapeutic targets. PMID:25014031

  7. Characterization of Tusc5, an adipocyte gene co-expressed in peripheral neurons.

    PubMed

    Oort, Pieter J; Warden, Craig H; Baumann, Thomas K; Knotts, Trina A; Adams, Sean H

    2007-09-30

    Tumor suppressor candidate 5 (Tusc5, also termed brain endothelial cell derived gene-1 or BEC-1), a CD225 domain-containing, cold-repressed gene identified during brown adipose tissue (BAT) transcriptome analyses, was found to be robustly expressed in mouse white adipose tissue (WAT) and BAT, with similarly high expression in human adipocytes. Tusc5 mRNA was markedly increased from trace levels in pre-adipocytes to significant levels in developing 3T3-L1 adipocytes, coincident with several mature adipocyte markers (phosphoenolpyruvate carboxykinase 1, GLUT4, adipsin, leptin). The Tusc5 transcript levels were increased by the peroxisome proliferator activated receptor-gamma (PPARγ) agonist GW1929 (1 µg/mL, 18 h) by >10-fold (pre-adipocytes) to approximately 1.5-fold (mature adipocytes) versus controls (p<0.0001). Taken together, these results suggest an important role for Tusc5 in maturing adipocytes. Intriguingly, we discovered robust co-expression of the gene in peripheral nerves (primary somatosensory neurons). In light of the marked repression of the gene observed after cold exposure, these findings may point to participation of Tusc5 in shared adipose-nervous system functions linking environmental cues, CNS signals, and WAT-BAT physiology. Characterization of such links is important for clarifying the molecular basis for adipocyte proliferation and could have implications for understanding the biology of metabolic disease-related neuropathies.

  8. Integrative Network Biology: Graph Prototyping for Co-Expression Cancer Networks

    PubMed Central

    Kugler, Karl G.; Mueller, Laurin A. J.; Graber, Armin; Dehmer, Matthias

    2011-01-01

    Network-based analysis has been proven useful in biologically-oriented areas, e.g., to explore the dynamics and complexity of biological networks. Investigating a set of networks allows deriving general knowledge about the underlying topological and functional properties. The integrative analysis of networks typically combines networks from different studies that investigate the same or similar research questions. In order to perform an integrative analysis it is often necessary to compare the properties of matching edges across the data set. This identification of common edges is often burdensome and computational intensive. Here, we present an approach that is different from inferring a new network based on common features. Instead, we select one network as a graph prototype, which then represents a set of comparable network objects, as it has the least average distance to all other networks in the same set. We demonstrate the usefulness of the graph prototyping approach on a set of prostate cancer networks and a set of corresponding benign networks. We further show that the distances within the cancer group and the benign group are statistically different depending on the utilized distance measure. PMID:21829532
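
    The prototype selection described above, choosing the network with the least average distance to all other networks in the set, can be sketched as follows. This is an illustration only: the abstract notes that results depend on the distance measure used and does not name one here, so the Jaccard distance on edge sets is an assumption, as are the function names and the toy networks.

```python
def jaccard_distance(edges_a, edges_b):
    """Distance between two networks given as sets of undirected edges."""
    union = edges_a | edges_b
    if not union:
        return 0.0
    return 1.0 - len(edges_a & edges_b) / len(union)


def graph_prototype(networks, distance=jaccard_distance):
    """Return the name of the network with the smallest mean distance to the rest.

    `networks` maps a network name to its edge set and must hold at least two
    networks.
    """
    def mean_distance(name):
        others = [other for other in networks if other != name]
        return sum(distance(networks[name], networks[o]) for o in others) / len(others)

    return min(networks, key=mean_distance)


def edge(u, v):
    """Undirected edge represented as a frozenset of its two endpoints."""
    return frozenset((u, v))


if __name__ == "__main__":
    # Three hypothetical co-expression networks over genes a..e.
    networks = {
        "study_1": {edge("a", "b"), edge("b", "c")},
        "study_2": {edge("a", "b"), edge("b", "c"), edge("c", "d")},
        "study_3": {edge("a", "b"), edge("c", "d"), edge("d", "e")},
    }
    print(graph_prototype(networks))
```

    Passing a different `distance` function makes it easy to check how sensitive the chosen prototype is to the distance measure.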

  9. Gene Co-Expression Modules as Clinically Relevant Hallmarks of Breast Cancer Diversity

    PubMed Central

    Yau, Christina; Boudreau, Aaron; van ‘t Veer, Laura J.

    2014-01-01

    Co-expression modules are groups of genes with highly correlated expression patterns. In cancer, differences in module activity potentially represent the heterogeneity of phenotypes important in carcinogenesis, progression, or treatment response. To find gene expression modules active in breast cancer subpopulations, we assembled 72 breast cancer-related gene expression datasets containing ∼5,700 samples altogether. Per dataset, we identified genes with bimodal expression and used mixture-model clustering to ultimately define 11 modules of genes that are consistently co-regulated across multiple datasets. Functionally, these modules reflected estrogen signaling, development/differentiation, immune signaling, histone modification, ERBB2 signaling, the extracellular matrix (ECM) and stroma, and cell proliferation. The Tcell/Bcell immune modules appeared tumor-extrinsic, with coherent expression in tumors but not cell lines; whereas most other modules, interferon and ECM included, appeared intrinsic. Only four of the eleven modules were represented in the PAM50 intrinsic subtype classifier and other well-established prognostic signatures; although the immune modules were highly correlated to previously published immune signatures. As expected, the proliferation module was highly associated with decreased recurrence-free survival (RFS). Interestingly, the immune modules appeared associated with RFS even after adjustment for receptor subtype and proliferation; and in a multivariate analysis, the combination of Tcell/Bcell immune module down-regulation and proliferation module upregulation strongly associated with decreased RFS. Immune modules are unusual in that their upregulation is associated with a good prognosis without chemotherapy and a good response to chemotherapy, suggesting the paradox of high immune patients who respond to chemotherapy but would do well without it. Other findings concern the ECM/stromal modules, which despite common themes were associated

  10. Mining Temporal Protein Complex Based on the Dynamic PIN Weighted with Connected Affinity and Gene Co-Expression.

    PubMed

    Shen, Xianjun; Yi, Li; Jiang, Xingpeng; He, Tingting; Hu, Xiaohua; Yang, Jincai

    2016-01-01

    The identification of temporal protein complexes would make a great contribution to our knowledge of the dynamic organization characteristics in protein interaction networks (PINs). Recent studies have focused on integrating gene expression data into a static PIN to construct a dynamic PIN, which reveals the dynamic evolutionary procedure of protein interactions, but they fail in practice to recognize the active time points of proteins with low or high expression levels. We construct a Time-Evolving PIN (TEPIN) with a novel method called Deviation Degree, which is designed to identify the active time points of proteins based on the deviation degree of their own expression values. Moreover, owing to the differences between protein interactions, we weight TEPIN with connected affinity and gene co-expression to quantify the degree of these interactions. To validate the efficiency of our methods, the ClusterONE, CAMSE and MCL algorithms are applied to the TEPIN, DPIN (a dynamic PIN constructed with the state-of-the-art three-sigma method) and SPIN (the original static PIN) to detect temporal protein complexes. Each algorithm on our TEPIN outperforms that on the other networks in terms of match degree, sensitivity, specificity, F-measure, function enrichment, etc. In conclusion, our Deviation Degree method successfully eliminates the disadvantages which exist in the previous state-of-the-art dynamic PIN construction methods. Moreover, the biological nature of protein interactions can be well described in our weighted network. Weighted TEPIN is a useful approach for detecting temporal protein complexes and revealing the dynamic protein assembly process for cellular organization.
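
    The idea of marking a protein as "active" at certain time points can be illustrated with a minimal sketch. The abstract gives neither the Deviation Degree formula nor the exact three-sigma variant, so the rule below (a time point is active when expression exceeds the mean plus k standard deviations of that protein's own profile) is an assumption made purely for illustration; the function name and the parameter k are hypothetical.

```python
from statistics import mean, pstdev

def active_time_points(profile, k=1.0):
    """Return indices of time points where expression exceeds mean + k * sd.

    ASSUMPTION: this generic threshold stands in for the Deviation Degree
    and three-sigma rules, whose exact definitions are not in the abstract.
    """
    mu = mean(profile)
    sigma = pstdev(profile)          # population standard deviation
    threshold = mu + k * sigma
    return [t for t, value in enumerate(profile) if value > threshold]

# Toy expression profile over five time points (made up for illustration).
profile = [1.0, 1.0, 1.0, 1.0, 6.0]
lenient = active_time_points(profile, k=1.0)   # threshold = 2 + 1 * 2 = 4
strict = active_time_points(profile, k=3.0)    # threshold = 2 + 3 * 2 = 8
```

    With the toy profile, the lenient threshold flags the last time point while the strict one flags none, which shows how a fixed high threshold can miss active time points in short time series.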

  11. Influence networks based on coexpression improve drug target discovery for the development of novel cancer therapeutics

    PubMed Central

    2014-01-01

    Background: The demand for novel molecularly targeted drugs will continue to rise as we move forward toward the goal of personalizing cancer treatment to the molecular signature of individual tumors. However, the identification of targets and combinations of targets that can be safely and effectively modulated is one of the greatest challenges facing the drug discovery process. A promising approach is to use biological networks to prioritize targets based on their relative positions to one another, a property that affects their ability to maintain network integrity and propagate information-flow. Here, we introduce influence networks and demonstrate how they can be used to generate influence scores as a network-based metric to rank genes as potential drug targets. Results: We use this approach to prioritize genes as drug target candidates in a set of ER+ breast tumor samples collected during the course of neoadjuvant treatment with the aromatase inhibitor letrozole. We show that influential genes, those with high influence scores, tend to be essential and include a higher proportion of essential genes than those prioritized based on their position (i.e. hubs or bottlenecks) within the same network. Additionally, we show that influential genes represent novel biologically relevant drug targets for the treatment of ER+ breast cancers. Moreover, we demonstrate that gene influence differs between untreated tumors and residual tumors that have adapted to drug treatment. In this way, influence scores capture the context-dependent functions of genes and present the opportunity to design combination treatment strategies that take advantage of the tumor adaptation process. Conclusions: Influence networks efficiently find essential genes as promising drug targets and combinations of targets to inform the development of molecularly targeted drugs and their use. PMID:24495353

  12. Co-expression network analysis of toxin-antitoxin loci in Mycobacterium tuberculosis reveals key modulators of cellular stress.

    PubMed

    Gupta, Amita; Venkataraman, Balaji; Vasudevan, Madavan; Gopinath Bankar, Kiran

    2017-07-19

    Research on toxin-antitoxin loci (TA loci) is gaining impetus due to their ubiquitous presence in bacterial genomes and their observed roles in stress survival, persistence and drug tolerance. The present study investigates the expression profile of all the seventy-nine TA loci found in Mycobacterium tuberculosis. The bacterium was subjected to multiple stress conditions to identify key players of cellular stress response and elucidate a TA-coexpression network. This study provides direct experimental evidence for transcriptional activation of each of the seventy-nine TA loci following mycobacterial exposure to growth-limiting environments, clearly establishing TA loci as stress-responsive modules in M. tuberculosis. TA locus activation was found to be stress-specific, with multiple loci activated in a duration-based response to a particular stress. Conditions resulting in arrest of cellular translation led to greater up-regulation of TA genes, suggesting that TA loci have a primary role in arresting translation in the cell. Our study identified higBA2 and vapBC46 as key loci that were activated in all the conditions tested. In addition, relBE1, higBA3, vapBC35, vapBC22 and higBA1 were also upregulated in multiple stresses. Certain TA modules exhibited co-activation across multiple conditions, suggestive of a common regulatory mechanism.

  13. Multi-tissue analysis of co-expression networks by higher-order generalized singular value decomposition identifies functionally coherent transcriptional modules.

    PubMed

    Xiao, Xiaolin; Moreno-Moral, Aida; Rotival, Maxime; Bottolo, Leonardo; Petretto, Enrico

    2014-01-01

    Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data in multiple conditions (e.g., cell-types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet, genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally-efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network-modules simultaneously across multiple conditions. Our method can accommodate weighted (and unweighted) networks of any size and can similarly use co-expression or raw gene expression input data, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, we demonstrated distinctive advantages of our method over existing methods, which was able to recover accurately both common and condition-specific network-modules without entailing ad-hoc input parameters as required by other approaches. We applied our method to genome-scale and multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance, which were not detected by other methods. In humans we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interactions processes in ventricular zones during neocortex expansion and further, we uncovered pathways related to development of later cognitive functions in the cortical plate of the developing brain which were previously unappreciated. Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed

  14. Nearest hyperplane distance neighbor clustering algorithm applied to gene co-expression analysis in Alzheimer's disease.

    PubMed

    Pasluosta, Cristian F; Dua, Prerna; Lukiw, Walter J

    2011-01-01

    Microarray analysis can contribute considerably to the understanding of biologically significant cellular mechanisms that yield novel information regarding co-regulated sets of gene patterns. Clustering is one of the most popular tools for analyzing DNA microarray data. In this paper, we present an unsupervised clustering algorithm based on the K-local hyperplane distance nearest-neighbor classifier (HKNN). We adapted the well-known nearest neighbor clustering algorithm for use with hyperplane distance. The result is a simple and computationally inexpensive unsupervised clustering algorithm that can be applied to high-dimensional data. It has been reported that the NFkB1 gene is progressively over-expressed in moderate-to-severe Alzheimer's disease (AD) cases, and that the NF-kB complex plays a key role in neuroinflammatory responses in AD pathogenesis. In this study, we apply the proposed clustering algorithm to identify co-expression patterns with the NFkB1 in gene expression data from hippocampal tissue samples. Finally, we validate our experiments with biomedical literature search.

  15. lncRNA co-expression network model for the prognostic analysis of acute myeloid leukemia

    PubMed Central

    Pan, Jia-Qi; Zhang, Yan-Qing; Wang, Jing-Hua; Xu, Ping; Wang, Wei

    2017-01-01

    Acute myeloid leukemia (AML) is a highly heterogeneous hematologic malignancy with great variability of prognostic behaviors. Previous studies have reported that long non-coding RNAs (lncRNAs) play an important role in AML and may thus be used as potential prognostic biomarkers. However, the use of lncRNAs as prognostic biomarkers in AML and their detailed mechanisms of action in this disease have not yet been well characterized. For this purpose, in the present study, the expression levels of lncRNAs and mRNAs were calculated using the RNA-seq V2 data for AML, following which a lncRNA-lncRNA co-expression network (LLCN) was constructed. A total of 8 AML prognosis-related lncRNA modules were identified, which displayed a significant correlation with patient survival (p≤0.05). Subsequently, a prognosis-related lncRNA module pathway network was constructed to interpret the functional mechanism of the prognostic modules in AML. The results indicated that these prognostic modules were involved in the AML pathway, chemokine signaling pathway and WNT signaling pathway, all of which play important roles in AML. Furthermore, the investigation of lncRNAs in these prognostic modules suggested that an lncRNA (ZNF571-AS1) may be involved in AML via the Janus kinase (JAK)/signal transducer and activator of transcription (STAT) signaling pathway by regulating KIT and STAT5. The results of the present study not only provide potential lncRNA modules as prognostic biomarkers, but also provide further insight into the molecular mechanisms of action of lncRNAs. PMID:28204819

  16. Coexpression of bile salt hydrolase gene and catalase gene remarkably improves oxidative stress and bile salt resistance in Lactobacillus casei.

    PubMed

    Wang, Guohong; Yin, Sheng; An, Haoran; Chen, Shangwu; Hao, Yanling

    2011-08-01

    Lactic acid bacteria (LAB) encounter various types of stress during industrial processes and gastrointestinal transit. Catalase (CAT) and bile salt hydrolase (BSH) can protect bacteria from oxidative stress or damage caused by bile salts by decomposing hydrogen peroxide (H2O2) or deconjugating the bile salts, respectively. Lactobacillus casei is a valuable probiotic strain and is often deficient in both CAT and BSH. In order to improve the resistance of L. casei to both oxidative and bile salts stress, the catalase gene katA from L. sakei and the bile salt hydrolase gene bsh1 from L. plantarum were coexpressed in L. casei HX01. The enzyme activities of CAT and BSH were 2.41 μmol H2O2/min/10^8 colony-forming units (CFU) and 2.11 μmol glycine/min/ml in the recombinant L. casei CB, respectively. After incubation with 8 mM H2O2, the survival ratio of L. casei CB was 40-fold higher than that of L. casei CK. Treatment of L. casei CB with various concentrations of sodium glycodeoxycholate (GDCA) showed that ~10^5 CFU/ml cells survived after incubation with 0.5% GDCA, whereas almost all the L. casei CK cells were killed when treated with 0.4% GDCA. These results indicate that the coexpression of CAT and BSH confers high-level resistance to both oxidative and bile salts stress conditions in L. casei HX01.

  17. Gene co-expression analysis identifies brain regions and cell types involved in migraine pathophysiology: a GWAS-based study using the Allen Human Brain Atlas.

    PubMed

    Eising, Else; Huisman, Sjoerd M H; Mahfouz, Ahmed; Vijfhuizen, Lisanne S; Anttila, Verneri; Winsvold, Bendik S; Kurth, Tobias; Ikram, M Arfan; Freilinger, Tobias; Kaprio, Jaakko; Boomsma, Dorret I; van Duijn, Cornelia M; Järvelin, Marjo-Riitta R; Zwart, John-Anker; Quaye, Lydia; Strachan, David P; Kubisch, Christian; Dichgans, Martin; Davey Smith, George; Stefansson, Kari; Palotie, Aarno; Chasman, Daniel I; Ferrari, Michel D; Terwindt, Gisela M; de Vries, Boukje; Nyholt, Dale R; Lelieveldt, Boudewijn P F; van den Maagdenberg, Arn M J M; Reinders, Marcel J T

    2016-04-01

    Migraine is a common disabling neurovascular brain disorder typically characterised by attacks of severe headache and associated with autonomic and neurological symptoms. Migraine is caused by an interplay of genetic and environmental factors. Genome-wide association studies (GWAS) have identified over a dozen genetic loci associated with migraine. Here, we integrated migraine GWAS data with high-resolution spatial gene expression data of normal adult brains from the Allen Human Brain Atlas to identify specific brain regions and molecular pathways that are possibly involved in migraine pathophysiology. To this end, we used two complementary methods. In GWAS data from 23,285 migraine cases and 95,425 controls, we first studied modules of co-expressed genes that were calculated based on human brain expression data for enrichment of genes that showed association with migraine. Enrichment of a migraine GWAS signal was found for five modules that suggest involvement in migraine pathophysiology of: (i) neurotransmission, protein catabolism and mitochondria in the cortex; (ii) transcription regulation in the cortex and cerebellum; and (iii) oligodendrocytes and mitochondria in subcortical areas. Second, we used the high-confidence genes from the migraine GWAS as a basis to construct local migraine-related co-expression gene networks. Signatures of all brain regions and pathways that were prominent in the first method also surfaced in the second method, thus providing support that these brain regions and pathways are indeed involved in migraine pathophysiology.
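
    The first method above, testing whether a module of co-expressed genes is enriched for migraine-associated genes, can be sketched as an over-representation test. The abstract does not state which enrichment statistic was used, so the hypergeometric tail probability below is a swapped-in, commonly used choice rather than the authors' method; the function name, parameter names and all counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_associated, n_module, n_overlap):
    """P(X >= n_overlap) for a module drawn without replacement.

    n_background: genes in the background set
    n_associated: background genes associated with the trait
    n_module:     genes in the co-expression module
    n_overlap:    module genes that are trait-associated
    ASSUMPTION: a hypergeometric test stands in for the unspecified
    enrichment method of the study.
    """
    total = comb(n_background, n_module)
    upper = min(n_module, n_associated)
    tail = sum(
        comb(n_associated, k) * comb(n_background - n_associated, n_module - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy counts (made up): 20 background genes, 5 associated, module of 5, 3 overlap.
p_value = hypergeom_enrichment_p(20, 5, 5, 3)
```

    A small p-value would indicate that the module contains more associated genes than expected by chance; in practice such values would also need multiple-testing correction across modules.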

  18. Dynamic co-expression network analysis of lncRNAs and mRNAs associated with venous congestion

    PubMed Central

    Li, Jinshun; Xu, Yuqin; Xu, Jia; Wang, Jinhua; Wu, Liying

    2016-01-01

    Venous congestion and volume overload are important in cardiorenal syndromes, in which multiple regulated factors are involved, including long non-coding RNAs (lncRNAs). To investigate the underlying role of lncRNAs in regulating the development of venous congestion, an Affymetrix microarray associated with peripheral venous congestion was annotated, then a bipartite dynamic lncRNA-mRNA co-expression network was constructed in which nodes indicated lncRNAs or mRNAs. The nodes were connected when the lncRNAs or mRNAs were dynamically co-expressed. Following functional analysis of this network, several dynamic alternative pathways were identified, including the calcium signaling pathway during venous congestion development. Additionally, certain lncRNAs (LINC00523, LINC01210 and RP11-435O5.5) were identified that may potentially dynamically regulate certain proteins, including plasma membrane calcium ATPase (PMCA) and G protein-coupled receptor (GPCR), in the calcium signaling pathway. Particularly, the dynamically regulated switch of LINC00523 from co-expression with PMCA to GPCR may be involved in damage to steady state intracellular calcium. In brief, the current study demonstrated a potential novel mechanism of lncRNA function during venous congestion. PMID:27431002

  19. CD8 single-cell gene coexpression reveals three different effector types present at distinct phases of the immune response

    PubMed Central

    Peixoto, António; Evaristo, César; Munitic, Ivana; Monteiro, Marta; Charbit, Alain; Rocha, Benedita; Veiga-Fernandes, Henrique

    2007-01-01

    To study in vivo CD8 T cell differentiation, we quantified the coexpression of multiple genes in single cells throughout immune responses. After in vitro activation, CD8 T cells rapidly express effector molecules and cease their expression when the antigen is removed. Gene behavior after in vivo activation, in contrast, was quite heterogeneous. Different mRNAs were induced at very different time points of the response, were transcribed during different time periods, and could decline or persist independently of the antigen load. Consequently, distinct gene coexpression patterns/different cell types were generated at the various phases of the immune responses. During primary stimulation, inflammatory molecules were induced and down-regulated shortly after activation, generating early cells that only mediated inflammation. Cytotoxic T cells were generated at the peak of the primary response, when individual cells simultaneously expressed multiple killer molecules, whereas memory cells lost killer capacity because they no longer coexpressed killer genes. Surprisingly, during secondary responses gene transcription became permanent. Secondary cells recovered after antigen elimination were more efficient killers than cytotoxic T cells present at the peak of the primary response. Thus, primary responses produced two transient effector types. However, after boosting, CD8 T cells differentiate into long-lived killer cells that persist in vivo in the absence of antigen. PMID:17485515

  20. Inferring gene correlation networks from transcription factor binding sites.

    PubMed

    Mahdevar, Ghasem; Nowzari-Dalini, Abbas; Sadeghi, Mehdi

    2013-01-01

    Gene expression is a highly regulated biological process that is fundamental to the existence of phenotypes of any living organism. The regulatory relations are usually modeled as a network; simply, every gene is modeled as a node and relations are shown as edges between two related genes. This paper presents a novel method for inferring correlation networks, networks constructed by connecting co-expressed genes, through predicting co-expression level from genes' promoter sequences. According to the results, this method works well on biological data and its outcome is comparable to the methods that use microarray as input. The method is written in C++ and is available upon request from the corresponding author.

  1. Ubiquinone-10 production using Agrobacterium tumefaciens dps gene in Escherichia coli by coexpression system.

    PubMed

    Zhang, Dawei; Shrestha, Binaya; Li, Zhaopeng; Tan, Tianwei

    2007-01-01

    Ubiquinone (Coenzyme Q; abbreviation, UQ) acts as a mobile component of the respiratory chain by playing an essential role in the electron transport system, and has been widely used in pharmaceuticals. The biosynthesis of UQ involves 10 sequential reactions brought about by various enzymes. In this study, we cloned and expressed the decaprenyl diphosphate synthase gene (designated dps) from Agrobacterium tumefaciens and succeeded in detecting UQ-10 in addition to innate UQ-8 in Escherichia coli. Furthermore, the production of UQ-10 was higher than that of UQ-8. To establish an efficient expression system for UQ-10 production, we used genes involved in UQ biosynthesis in E. coli, including ubiC, ubiA, and ubiG, to construct a better co-expression system. Coupled expression of dps and ubiCA increased UQ-10 production fivefold compared with expressing the dps gene alone in shake flask culture. To study large-scale production of UQ-10 in E. coli, fed-batch fermentations were implemented to achieve a high cell density culture. Cell concentrations of 85.40 g/L and 94.58 g/L dry cell weight (DCW) and UQ-10 contents of 50.29 mg/L and 45.86 mg/L were obtained after 32.5 h and 27.5 h of cultivation, subsequent to isopropyl-beta-D-thiogalactopyranoside and lactose induction, respectively. In addition, plasmid stability was maintained at a high level throughout the fermentation.

  2. FAD2-DGAT2 Genes Coexpressed in Endophytic Aspergillus fumigatus Derived from Tung Oilseeds

    PubMed Central

    Chen, Yi-Cun; Wang, Yang-Dong; Cui, Qin-Qin; Zhan, Zhi-Yong

    2012-01-01

    Recent efforts to genetically engineer plants that contain fatty acid desaturases to produce valuable fatty acids have made only modest progress. Diacylglycerol acyltransferase 2 (DGAT2), which catalyzes the final step in triacylglycerol (TAG) assembly, might potentially regulate the biosynthesis of desired fatty acids in TAGs. To study the effects of tung tree (Vernicia fordii) vfDGAT2 in channeling the desired fatty acids into TAG, vfDGAT2 combined with the tung tree fatty acid desaturase-2 (vfFAD2) gene was co-introduced into Aspergillus fumigatus, an endophytic fungus isolated from healthy tung oilseed. Two transformants coexpressing vfFAD2 and vfDGAT2 showed a more than 6-fold increase in linoleic acid production compared to the original A. fumigatus strain, while a nearly 2-fold increase was found in the transformant expressing only vfFAD2. Our data suggest that vfDGAT2 plays a pivotal role in promoting linoleic acid accumulation in TAGs. This holds great promise for further genetic engineering aimed at producing valuable fatty acids. PMID:22919314

  3. Preservation Analysis of Macrophage Gene Coexpression Between Human and Mouse Identifies PARK2 as a Genetically Controlled Master Regulator of Oxidative Phosphorylation in Humans

    PubMed Central

    Codoni, Veronica; Blum, Yuna; Civelek, Mete; Proust, Carole; Franzén, Oscar; Björkegren, Johan L. M.; Le Goff, Wilfried; Cambien, Francois; Lusis, Aldons J.; Trégouët, David-Alexandre

    2016-01-01

    Macrophages are key players involved in numerous pathophysiological pathways and an in-depth characterization of their gene regulatory networks can help in better understanding how their dysfunction may impact on human diseases. We here conducted a cross-species network analysis of macrophage gene expression data between human and mouse to identify conserved networks across both species, and assessed whether such networks could reveal new disease-associated regulatory mechanisms. From a sample of 684 individuals processed for genome-wide macrophage gene expression profiling, we identified 27 groups of coexpressed genes (modules). Six modules were found preserved (P < 10^-4) in macrophages from 86 mice of the Hybrid Mouse Diversity Panel. One of these modules was significantly [false discovery rate (FDR) = 8.9 × 10^-11] enriched for genes belonging to the oxidative phosphorylation (OXPHOS) pathway. This pathway was also found significantly (FDR < 10^-4) enriched in susceptibility genes for Alzheimer, Parkinson, and Huntington diseases. We further conducted an expression quantitative trait loci analysis to identify SNPs that could regulate macrophage OXPHOS gene expression in humans. This analysis identified the PARK2 rs192804963 as a trans-acting variant influencing (minimal P-value = 4.3 × 10^-8) the expression of most OXPHOS genes in humans. Further experimental work demonstrated that PARK2 knockdown expression was associated with increased OXPHOS gene expression in THP1 human macrophages. This work provided strong new evidence that PARK2 participates in the regulatory networks associated with oxidative phosphorylation and suggested that PARK2 genetic variations could act as a trans regulator of OXPHOS gene macrophage expression in humans. PMID:27558669

  4. Coexpression Network Analysis of Benign and Malignant Phenotypes of SIV-Infected Sooty Mangabey and Rhesus Macaque.

    PubMed

    Yang, Zhao-Wan; Jiang, Yan-Hua; Ma, Chuang; Silvestri, Guido; Bosinger, Steven E; Li, Bai-Lian; Jong, Ambrose; Zhou, Yan-Hong; Huang, Sheng-He

    2016-01-01

    To explore the differences between the extreme SIV infection phenotypes, nonprogression (BEN: benign) to AIDS in sooty mangabeys (SMs) and progression to AIDS (MAL: malignant) in rhesus macaques (RMs), we performed an integrated dual positive-negative connectivity (DPNC) analysis of gene coexpression networks (GCN) based on publicly available big data sets in the GEO database of NCBI. The microarray-based gene expression data sets were generated, respectively, from the peripheral blood of SMs and RMs at several time points of SIV infection. Significant differences of GCN changes in DPNC values were observed in SIV-infected SMs and RMs. There are three groups of enriched genes or pathways (EGPs) that are associated with three SIV infection phenotypes (BEN+, MAL+ and mixed BEN+/MAL+). The MAL+ phenotype in SIV-infected RMs is specifically associated with eight EGPs, including the protein ubiquitin proteasome system, p53, granzyme A, granzyme B, polo-like kinase, Glucocorticoid receptor, oxidative phosphorylation and mitochondrial signaling. Mitochondrial (endosymbiotic) dysfunction is solely present in RMs. Specific BEN+ pattern changes in four EGPs are identified in SIV-infected SMs, including the pathways contributing to interferon signaling, BRCA1/DNA damage response, PKR/INF induction and LGALS8. There are three enriched pathways (PRR-activated IRF signaling, RIG1-like receptor and PRR pathway) contributing to the mixed (BEN+/MAL+) phenotypes of SIV infections in RMs and SMs, suggesting that these pathways play a dual role in the host defense against viral infections. Further analysis of Hub genes in these GCNs revealed that the genes LGALS8 and IL-17RA, which positively regulate the barrier function of the gut mucosa and the immune homeostasis with the gut microbiota (exosymbiosis), were significantly differentially expressed in RMs and SMs. Our data suggest that there exists an exo- (dysbiosis of the gut microbiota) and endo- (mitochondrial dysfunction

  5. Coexpression of chitinase and the cry11Aa1 toxin genes in Bacillus thuringiensis serovar israelensis.

    PubMed

    Sirichotpakorn, N; Rongnoparut, P; Choosang, K; Panbangred, W

    2001-10-01

    At the spore stage, a cloned chitinase gene was coexpressed with the regulatory gene p19 and the toxin gene cry11Aa1 in the hosts Bacillus thuringiensis serovar israelensis strains 4Q2-72 and c4Q2-72. The chitinase gene was derived from a high-chitinase producer, Bacillus licheniformis TP-1. Two transcriptional fusion plasmids between the p19 or p19-cry11Aa1 genes and the promoterless chitinase gene were constructed. In transcription order, the p16-19CHI construct contained the p19 gene together with the chitinase gene only while the p16-1968CHI construct contained p19 together with the toxin gene cry11Aa1 and the chitinase gene. The inserted sequences were regulated by a spore-specific promoter located upstream of p19. The recombinant chitinase of all transformed B. thuringiensis serovar israelensis strains was initially synthesized at low level at about 9 h of growth when a portion of the cells started to sporulate. It increased thereafter and reached maximum levels of 5.5, 4.9, and 4.7 mU/ml at 48 h, for strain 4Q2-72 transformed with p16-19CHI and p16-1968CHI and strain c4Q2-72 transformed with p16-19CHI, respectively. This activity was approximately 2 times higher than the maximum activity (2.7 mU/ml) of the parental strain, B. licheniformis TP-1. Although crude chitinase alone from B. thuringiensis serovar israelensis c4Q2-72 (p16-19CHI) at 4.5 mU/ml caused 40% mortality in second instar Aedes aegypti larvae, transformants containing the chitinase alone or in combination with cry11Aa1 resulted in lower toxicity to A. aegypti larvae than the untransformed 4Q2-72 host. For example, the LC50 for the transformed 4Q2-72 harboring the chitinase gene only (p16-19CHI) was 5.6 × 10^4 ± 0.7 × 10^4 cells, 40 times higher than that of the untransformed host at 1.4 × 10^3 ± 0.19 × 10^3. The lower toxicity correlated with poor sporulation in the transformants (i.e., 35 times lower than that in the untransformed host). However, the transformed 4Q2-72 strain

  6. Enhanced production of ε-caprolactone by coexpression of bacterial hemoglobin gene in recombinant Escherichia coli expressing cyclohexanone monooxygenase gene.

    PubMed

    Lee, Won-Heong; Park, Eun-Hee; Kim, Myoung-Dong

    2014-12-28

    Baeyer-Villiger (BV) oxidation of cyclohexanone to epsilon-caprolactone in a microbial system expressing cyclohexanone monooxygenase (CHMO) can be influenced by not only the efficient regeneration of NADPH but also a sufficient supply of oxygen. In this study, the bacterial hemoglobin gene from Vitreoscilla stercoraria (vhb) was introduced into the recombinant Escherichia coli expressing CHMO to investigate the effects of an oxygen-carrying protein on microbial BV oxidation of cyclohexanone. Coexpression of Vhb allowed the recombinant E. coli strain to produce a maximum epsilon-caprolactone concentration of 15.7 g/l in a fed-batch BV oxidation of cyclohexanone, which corresponded to a 43% improvement compared with the control strain expressing CHMO only under the same conditions.

  7. Gene expression patterns combined with network analysis identify hub genes associated with bladder cancer.

    PubMed

    Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia

    2015-06-01

    To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC.
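
    Selecting hub genes "according to analysis of degree" can be illustrated with a minimal sketch: build a co-expression network by thresholding absolute Pearson correlation, then rank genes by how many edges they have. The threshold of 0.8, the gene names and the expression values are all assumptions made for illustration; the study itself used STRING and MCODE, which are not reproduced here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(vx * vy)

def hub_genes(expr, threshold=0.8, top=3):
    """Rank genes by degree in a thresholded co-expression network.

    expr maps gene name -> list of expression values (one per sample).
    ASSUMPTION: an edge exists when |r| >= threshold.
    """
    genes = sorted(expr)
    degree = {g: 0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                degree[g] += 1
                degree[h] += 1
    # Highest degree first; ties broken alphabetically for reproducibility.
    ranked = sorted(genes, key=lambda g: (-degree[g], g))
    return [(g, degree[g]) for g in ranked[:top]]

# Toy data (made up for illustration; not from the study).
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}
hubs = hub_genes(toy)
```

    In the toy data, geneA, geneB and geneC are mutually correlated (or anti-correlated) and each reaches degree 2, while geneD stays unconnected.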

  8. Interaction network of coexpressed mRNA, miRNA, and lncRNA activated by TGF‑β1 regulates EMT in human pulmonary epithelial cell.

    PubMed

    Liu, Huizhu; Zhao, Xueying; Xiang, Jing; Zhang, Jie; Meng, Chao; Zhang, Jinjin; Li, Minge; Song, Xiaodong; Lv, Changjun

    2017-09-28

    Noncoding RNAs (ncRNAs), such as microRNAs (miRNAs) and long noncoding RNAs (lncRNAs), play increasingly important roles in pathological processes involved in disease development. However, whether mRNAs interact with miRNAs and lncRNAs to form an interacting regulatory network in diseases remains unknown. In this study, the interaction of coexpressed mRNAs, miRNAs and lncRNAs during transforming growth factor‑β1 (TGF‑β1)‑activated epithelial‑mesenchymal transition (EMT) was systematically analyzed in human alveolar epithelial cells. For EMT regulation, 24 mRNAs, 11 miRNAs and 33 lncRNAs were coexpressed and interacted with one another. The interactions among coexpressed mRNAs, miRNAs and lncRNAs were further analyzed, and the results showed the lack of competing endogenous RNAs (ceRNAs) among them. The mutual regulation may be correlated with other modes, such as histone modification and transcription factor recruitment. However, the possibility of ceRNA existence cannot be ignored because of the generally low abundance of lncRNAs and frequent promiscuity of protein‑RNA interactions. Thus, conclusions need further experimental identification and validation. In this context, disrupting many altered disease pathways remains one of the challenges in obtaining effective pathway‑based therapy. The reason is that one specific mRNA, miRNA or lncRNA may target multiple genes that are potentially implicated in a disease. Nevertheless, the results of the present study provide basic mechanistic information, possible biomarkers and novel treatment strategies for diseases, particularly pulmonary tumor and fibrosis.

  9. Differential co-expression analysis of a microarray gene expression profiles of pulmonary adenocarcinoma.

    PubMed

    Fu, Shijie; Pan, Xufeng; Fang, Wentao

    2014-08-01

    Lung cancer severely reduces the quality of life worldwide and causes high socioeconomic burdens. However, key genes leading to the generation of pulmonary adenocarcinoma remain elusive despite intensive research efforts. The present study aimed to identify the potential associations between transcription factors (TFs) and differentially co‑expressed genes (DCGs) in the regulation of transcription in pulmonary adenocarcinoma. Gene expression profiles of pulmonary adenocarcinoma were downloaded from the Gene Expression Omnibus, and gene expression was analyzed using a computational method. A total of 37,094 differentially co‑expressed links (DCLs) and 251 DCGs were identified, which were significantly enriched in 10 pathways. The construction of the regulatory network and the analysis of the regulatory impact factors revealed eight crucial TFs in the regulatory network. These TFs regulated the expression of DCGs by promoting or inhibiting their expression. In addition, certain TFs and target genes associated with DCGs did not appear in the DCLs, which indicated that those TFs could be synergistic with other factors. This is likely to provide novel insights for research into pulmonary adenocarcinoma. In conclusion, the present study may enhance the understanding of disease mechanisms and lead to an improved diagnosis of lung cancer. However, further studies are required to confirm these observations.
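
    The idea of a differentially co-expressed link (DCL) can be sketched as a gene pair whose correlation changes strongly between two conditions. The example below is only an illustration under that assumption, not the computational method used in the study; the function name, the threshold of 1.0 and the toy expression values are all hypothetical.

```python
import numpy as np


def differential_coexpression_links(expr_a, expr_b, threshold=1.0):
    """List gene pairs whose Pearson correlation differs between conditions.

    expr_a, expr_b: arrays of shape (genes, samples) with the same gene order.
    Returns (i, j, delta) tuples with |r_a - r_b| >= threshold.
    """
    delta = np.corrcoef(expr_a) - np.corrcoef(expr_b)
    n_genes = delta.shape[0]
    links = []
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            if abs(delta[i, j]) >= threshold:
                links.append((i, j, float(delta[i, j])))
    return links


# Invented toy data: gene 1 follows gene 0 in condition A and opposes it in B.
condition_a = np.array([[1, 2, 3, 4], [2, 4, 6, 8], [1, 3, 2, 4]], dtype=float)
condition_b = np.array([[1, 2, 3, 4], [8, 6, 4, 2], [1, 3, 2, 4]], dtype=float)
links = differential_coexpression_links(condition_a, condition_b)
```

    Genes that take part in many such links would be candidate differentially co-expressed genes (DCGs); published methods additionally assess statistical significance, which this sketch omits.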

  10. Enhanced production of shikimic acid using a multi-gene co-expression system in Escherichia coli.

    PubMed

    Liu, Xiang-Lei; Lin, Jun; Hu, Hai-Feng; Zhou, Bin; Zhu, Bao-Quan

    2016-04-01

    Shikimic acid (SA) is the key synthetic material for the chemical synthesis of Oseltamivir, which is prescribed as the front-line treatment for serious cases of influenza. A multi-gene expression vector can be used to express several genes from one plasmid, so it is widely applied to increase the yield of metabolites. In the present study, on the basis of the shikimate kinase-deficient strain Escherichia coli BL21 (ΔaroL/aroK, DE3), the key enzyme genes aroG, aroB, tktA and aroE of the SA pathway were co-expressed and compared systematically by constructing a series of multi-gene expression vectors. The results showed that different gene co-expression combinations (two, three or four genes) or gene orders had different effects on the production of SA. SA production by the recombinant BL21-GBAE reached 886.38 mg·L⁻¹, which was 17-fold (P < 0.05) that of the parent strain BL21 (ΔaroL/aroK, DE3).

  11. Bioinformatics and co-expression network analysis of differentially expressed lncRNAs and mRNAs in hippocampus of APP/PS1 transgenic mice with Alzheimer disease

    PubMed Central

    Fang, Min; Zhang, Pei; Zhao, Yanxin; Liu, Xueyuan

    2017-01-01

    APP/PS1 transgenic mice with Alzheimer disease (AD) are widely used as a reliable animal model in studies of the behavior, physiology, biochemistry and histomorphology of AD, but few studies have investigated the role of lncRNAs in this model. In this study, an lncRNA microarray was employed to detect the gene expression profile and lncRNA expression profile in the mouse brain. Then, bioinformatics was used to predict the differentially expressed genes related to AD (n=20). Among the differentially expressed lncRNAs (n=249), 99 were downregulated and 150 upregulated. A co-expression network was used to analyze the co-expression of the differentially expressed lncRNAs and genes. In the network, lncRNA Gm13498 and lncRNA 1700030L20Rik correlated with the most genes, with degrees of 6 and 5, respectively. Then, the functions and signal transduction pathways related to the differentially co-expressed lncRNAs were analyzed with bioinformatics, and the results showed that these lncRNAs were involved in the systemic development of neurons, intercellular communication, regulation of action potential of neurons, development and differentiation of oligodendrocytes, neurotransmitter transmission, and neuronal regeneration. Real-time PCR was employed to detect the expression of relevant lncRNAs and differentially expressed RNAs in 10 samples, and the results were consistent with the above microarray findings. PMID:28386363

  12. Transcriptomics of the late gestation ovine fetal brain: modeling the co-expression of immune marker genes.

    PubMed

    Rabaglino, Maria B; Keller-Wood, Maureen; Wood, Charles E

    2014-11-19

    Major changes in gene expression occur in the fetal brain to modulate the function of this organ postnatally. Thus, factors can alter the genomics of the fetal brain, predisposing to neurological disorders later in life. We hypothesized that the physiological dynamics of the immune system transcriptome of the fetal brain during the last stage of gestation will reveal patterns of immune function and development in the developing brain. In this study we applied weighted gene co-expression analysis of microarrays performed on ovine fetal brain samples, to model the changes in gene expression throughout the second half of gestation. Clusters of co-expressed genes that strongly increase in expression toward the first day of extra-uterine life are related to the hematopoietic lineage, while activation of immune pathways is induced after birth. Moreover, the pattern of gene expression suggests induction of tolerance mechanisms, probably necessary to protect highly produced proteins (such as myelin basic protein) from an autoimmune attack. This study provides insight into the dramatic changes in gene expression that take place in the brain during the fetal life, especially during the last stage of gestation, and suggests that the immune system may have an important role in maturation of the fetal brain, which if disrupted or altered, could have negative consequences in postnatal life.
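
    Weighted gene co-expression analysis typically builds its network by raising absolute pairwise correlations to a soft-thresholding power. The sketch below shows that one step only, assuming an unsigned network and a power of 6; the abstract does not state which settings were used, and the function name and toy values are hypothetical.

```python
import numpy as np


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    expr: array of shape (genes, samples). The diagonal is set to zero so
    that row sums give each gene's connectivity to the other genes.
    """
    adjacency = np.abs(np.corrcoef(expr)) ** beta
    np.fill_diagonal(adjacency, 0.0)
    return adjacency


# Invented toy expression matrix: 3 genes measured in 4 samples.
toy_expr = np.array([[1, 2, 3, 4], [2, 4, 6, 8], [1, 3, 2, 4]], dtype=float)
adjacency = soft_threshold_adjacency(toy_expr)
connectivity = adjacency.sum(axis=1)
```

    Raising correlations to a power keeps strong relationships close to 1 and pushes moderate ones toward 0 (a correlation of 0.8 becomes about 0.26 at a power of 6), which is why modules are then detected by clustering on this weighted matrix rather than on a hard-thresholded graph.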

  13. Tetrahymena Gene Expression Database (TGED): a resource of microarray data and co-expression analyses for Tetrahymena.

    PubMed

    Xiong, Jie; Lu, XingYi; Lu, YuMing; Zeng, HongHui; Yuan, DongXia; Feng, LiFang; Chang, Yue; Bowen, Josephine; Gorovsky, Martin; Fu, ChengJie; Miao, Wei

    2011-01-01

    Tetrahymena thermophila is a model eukaryotic organism. Functional genomic analyses in Tetrahymena present rich opportunities to address fundamental questions of cell and molecular biology. The Tetrahymena Gene Expression Database (TGED; available at http://tged.ihb.ac.cn) is the first expression database of a ciliated protozoan. It covers three major physiological and developmental states: growth, starvation, and conjugation, and can be accessed through a user-friendly web interface. The gene expression profiles and candidate co-expressed genes for each gene can be retrieved using Gene ID or Gene description searches. Descriptions of standardized methods of sample preparation and the opportunity to add new Tetrahymena microarray data will be of great interest to the Tetrahymena research community. TGED is intended to be a resource for all members of the scientific research community who are interested in Tetrahymena and other ciliates.

  14. Matrix factorization reveals aging-specific co-expression gene modules in the fat and muscle tissues in nonhuman primates

    PubMed Central

    Wang, Yongcui; Zhao, Weiling; Zhou, Xiaobo

    2016-01-01

    Accurate identification of coherent transcriptional modules (subnetworks) in adipose and muscle tissues is important for revealing the related mechanisms and co-regulated pathways involved in the development of aging-related diseases. Here, we proposed a systematic computational approach, called ICEGM, to Identify the Co-Expression Gene Modules through a novel mathematical framework of Higher-Order Generalized Singular Value Decomposition (HO-GSVD). ICEGM was applied to the adipose, heart, and skeletal muscle tissues of old and young female African green vervet monkeys. The genes associated with the development of inflammation, cardiovascular and skeletal disorders, and cancer were revealed by ICEGM. Meanwhile, genes in the ICEGM modules were also enriched in adipocytes, smooth muscle cells, cardiac myocytes, and immune cells. Comprehensive disease annotation and canonical pathway analysis indicated that immune cells, adipocytes, cardiomyocytes, and smooth muscle cells played a synergistic role in cardiac and physical functions in the aged monkeys by regulation of the biological processes associated with metabolism, inflammation, and atherosclerosis. In conclusion, ICEGM provides an efficient, systematic framework for decoding the co-expression gene modules in multiple tissues. Analysis of genes in the ICEGM modules yielded important insights into the cooperative role of multiple tissues in the development of diseases. PMID:27703186

  15. Matrix factorization reveals aging-specific co-expression gene modules in the fat and muscle tissues in nonhuman primates

    NASA Astrophysics Data System (ADS)

    Wang, Yongcui; Zhao, Weiling; Zhou, Xiaobo

    2016-10-01

    Accurate identification of coherent transcriptional modules (subnetworks) in adipose and muscle tissues is important for revealing the related mechanisms and co-regulated pathways involved in the development of aging-related diseases. Here, we proposed a systematic computational approach, called ICEGM, to Identify the Co-Expression Gene Modules through a novel mathematical framework of Higher-Order Generalized Singular Value Decomposition (HO-GSVD). ICEGM was applied to the adipose, heart, and skeletal muscle tissues of old and young female African green vervet monkeys. The genes associated with the development of inflammation, cardiovascular and skeletal disorders, and cancer were revealed by ICEGM. Meanwhile, genes in the ICEGM modules were also enriched in adipocytes, smooth muscle cells, cardiac myocytes, and immune cells. Comprehensive disease annotation and canonical pathway analysis indicated that immune cells, adipocytes, cardiomyocytes, and smooth muscle cells played a synergistic role in cardiac and physical functions in the aged monkeys by regulation of the biological processes associated with metabolism, inflammation, and atherosclerosis. In conclusion, ICEGM provides an efficient, systematic framework for decoding the co-expression gene modules in multiple tissues. Analysis of genes in the ICEGM modules yielded important insights into the cooperative role of multiple tissues in the development of diseases.

  16. The impact of microRNAs on transcriptional heterogeneity and gene co-expression across single embryonic stem cells

    PubMed Central

    Gambardella, Gennaro; Carissimo, Annamaria; Chen, Amy; Cutillo, Luisa; Nowakowski, Tomasz J.; di Bernardo, Diego; Blelloch, Robert

    2017-01-01

    MicroRNAs act posttranscriptionally to suppress multiple target genes within a cell population. To what extent this multi-target suppression occurs in individual cells and how it impacts transcriptional heterogeneity and gene co-expression remains unknown. Here we used single-cell sequencing combined with introduction of individual microRNAs. miR-294 and let-7c were introduced into otherwise microRNA-deficient Dgcr8 knockout mouse embryonic stem cells. Both microRNAs induce suppression and correlated expression of their respective gene targets. The two microRNAs had opposing effects on transcriptional heterogeneity within the cell population, with let-7c increasing and miR-294 decreasing the heterogeneity between cells. Furthermore, let-7c promotes, whereas miR-294 suppresses, the phasing of cell cycle genes. These results show at the individual cell level how a microRNA simultaneously has impacts on its many targets and how that in turn can influence a population of cells. The findings have important implications in the understanding of how microRNAs influence the co-expression of genes and pathways, and thus ultimately cell fate. PMID:28102192

  17. UNCLES: method for the identification of genes differentially consistently co-expressed in a specific subset of datasets.

    PubMed

    Abu-Jamous, Basel; Fa, Rui; Roberts, David J; Nandi, Asoke K

    2015-06-04

    Collective analysis of the increasing number of emerging gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth-knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods, and of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets as it produces focused clusters that conform well to known biological facts. Furthermore, in-silico-based hypotheses regarding the function of a few

  18. Co-expression of two heterologous lactate dehydrogenases genes in Kluyveromyces marxianus for l-lactic acid production.

    PubMed

    Lee, Jae Won; In, Jung Hoon; Park, Joon-Bum; Shin, Jonghyeok; Park, Jin Hwan; Sung, Bong Hyun; Sohn, Jung-Hoon; Seo, Jin-Ho; Park, Jin-Byoung; Kim, Soo Rin; Kweon, Dae-Hyuk

    2017-01-10

    Lactic acid (LA) is a versatile compound used in the food, pharmaceutical, textile, leather, and chemical industries. Biological production of LA is possible by yeast strains expressing a bacterial gene encoding l-lactate dehydrogenase (LDH). Kluyveromyces marxianus is an emerging non-conventional yeast with various phenotypes of industrial interest. However, it has not been extensively studied for LA production. In this study, K. marxianus was engineered to express and co-express various heterologous LDH enzymes that were reported to have different pH optima. Specifically, three LDH enzymes originating from Staphylococcus epidermidis (SeLDH; optimal at pH 5.6), Lactobacillus acidophilus (LaLDH; optimal at pH 5.3), and Bos taurus (BtLDH; optimal at pH 9.8) were functionally expressed individually and in combination in K. marxianus, and the resulting strains were compared in terms of LA production. A strain co-expressing SeLDH and LaLDH (KM5 La+SeLDH) produced 16.0 g/L LA, whereas the strains expressing those enzymes individually produced only 8.4 and 6.8 g/L, respectively. This co-expressing strain produced 24.0 g/L LA with a yield of 0.48 g/g glucose in the presence of CaCO3. Our results suggest that co-expression of LDH enzymes with different pH optima provides sufficient LDH activity under dynamic intracellular pH conditions, leading to enhanced production of LA compared to individual expression of the LDH enzymes.

  19. Coexpression of Nuclear Receptors and Histone Methylation Modifying Genes in the Testis: Implications for Endocrine Disruptor Modes of Action

    PubMed Central

    Anderson, Alison M.; Carter, Kim W.; Anderson, Denise; Wise, Michael J.

    2012-01-01

    Background: Endocrine disruptor chemicals elicit adverse health effects by perturbing nuclear receptor signalling systems. It has been speculated that these compounds may also perturb epigenetic mechanisms and thus contribute to the early origin of adult onset disease. We hypothesised that histone methylation may be a component of the epigenome that is susceptible to perturbation. We used coexpression analysis of publicly available data to investigate the combinatorial actions of nuclear receptors and genes involved in histone methylation in normal testis and when faced with endocrine disruptor compounds. Methodology/Principal Findings: The expression patterns of a set of genes were profiled across testis tissue in human, rat and mouse, plus control and exposed samples from four toxicity experiments in the rat. Our results indicate that histone methylation events are a more general component of nuclear receptor mediated transcriptional regulation in the testis than previously appreciated. Coexpression patterns support the role of a gatekeeper mechanism involving the histone methylation modifiers Kdm1, Prdm2, and Ehmt1 and indicate that this mechanism is a common determinant of transcriptional integrity for genes critical to diverse physiological endpoints relevant to endocrine disruption. Coexpression patterns following exposure to vinclozolin and dibutyl phthalate suggest that coactivity of the demethylase Kdm1 in particular warrants further investigation in relation to endocrine disruptor mode of action. Conclusions/Significance: This study provides proof of concept that a bioinformatics approach that profiles genes related to a specific hypothesis across multiple biological settings can provide powerful insight into coregulatory activity that would be difficult to discern at an individual experiment level or by traditional differential expression analysis methods. PMID:22496781

  20. The Brassica rapa FLC homologue FLC2 is a key regulator of flowering time, identified through transcriptional co-expression networks.

    PubMed

    Xiao, Dong; Zhao, Jian J; Hou, Xi L; Basnet, Ram K; Carpio, Dunia P D; Zhang, Ning W; Bucher, Johan; Lin, Ke; Cheng, Feng; Wang, Xiao W; Bonnema, Guusje

    2013-11-01

    The role of many genes and interactions among genes involved in flowering time have been studied extensively in Arabidopsis, and the purpose of this study was to investigate how effectively results obtained with the model species Arabidopsis can be applied to the Brassicaceae with often larger and more complex genomes. Brassica rapa represents a very close relative, with its triplicated genome, with subgenomes having evolved by genome fractionation. The question of whether this genome fractionation is a random process, or whether specific genes are preferentially retained, such as flowering time (Ft) genes that play a role in the extreme morphological variation within the B. rapa species (displayed by the diverse morphotypes), is addressed. Data are presented showing that indeed Ft genes are preferentially retained, so the next intriguing question is whether these different orthologues of Arabidopsis Ft genes play similar roles compared with Arabidopsis, and what is the role of these different orthologues in B. rapa. Using a genetical-genomics approach, co-location of flowering quantitative trait loci (QTLs) and expression QTLs (eQTLs) resulted in identification of candidate genes for flowering QTLs and visualization of co-expression networks of Ft genes and flowering time. A major flowering QTL on A02 at the BrFLC2 locus co-localized with cis eQTLs for BrFLC2, BrSSR1, and BrTCP11, and trans eQTLs for the photoperiod gene BrCO and two paralogues of the floral integrator genes BrSOC1 and BrFT. It is concluded that the BrFLC2 Ft gene is a major regulator of flowering time in the studied doubled haploid population.

  1. The Brassica rapa FLC homologue FLC2 is a key regulator of flowering time, identified through transcriptional co-expression networks

    PubMed Central

    Xiao, Dong; Zhao, Jian J.; Bonnema, Guusje

    2013-01-01

    The role of many genes and interactions among genes involved in flowering time have been studied extensively in Arabidopsis, and the purpose of this study was to investigate how effectively results obtained with the model species Arabidopsis can be applied to the Brassicaceae with often larger and more complex genomes. Brassica rapa represents a very close relative, with its triplicated genome, with subgenomes having evolved by genome fractionation. The question of whether this genome fractionation is a random process, or whether specific genes are preferentially retained, such as flowering time (Ft) genes that play a role in the extreme morphological variation within the B. rapa species (displayed by the diverse morphotypes), is addressed. Data are presented showing that indeed Ft genes are preferentially retained, so the next intriguing question is whether these different orthologues of Arabidopsis Ft genes play similar roles compared with Arabidopsis, and what is the role of these different orthologues in B. rapa. Using a genetical–genomics approach, co-location of flowering quantitative trait loci (QTLs) and expression QTLs (eQTLs) resulted in identification of candidate genes for flowering QTLs and visualization of co-expression networks of Ft genes and flowering time. A major flowering QTL on A02 at the BrFLC2 locus co-localized with cis eQTLs for BrFLC2, BrSSR1, and BrTCP11, and trans eQTLs for the photoperiod gene BrCO and two paralogues of the floral integrator genes BrSOC1 and BrFT. It is concluded that the BrFLC2 Ft gene is a major regulator of flowering time in the studied doubled haploid population. PMID:24078668

  2. Network properties of human disease genes with pleiotropic effects

    PubMed Central

    2010-01-01

    Background: The ability of a gene to cause a disease is known to be associated with the topological position of its protein product in the molecular interaction network. Pleiotropy, in human genetic diseases, refers to the ability of different mutations within the same gene to cause different pathological effects. Here, we hypothesized that the ability of human disease genes to cause pleiotropic effects would be associated with their network properties. Results: Shared genes, with pleiotropic effects, were more central than specific genes that were associated with one disease, in the protein interaction network. Furthermore, shared genes associated with phenotypically divergent diseases (phenodiv genes) were more central than those associated with phenotypically similar diseases. Shared genes had a higher number of disease gene interactors compared to specific genes, implying higher likelihood of finding a novel disease gene in their network neighborhood. Shared genes had a relatively restricted tissue co-expression with interactors, contrary to specific genes. This could be a function of shared genes leading to pleiotropy. Essential and phenodiv genes had comparable connectivities and hence we investigated for differences in network attributes conferring lethality and pleiotropy, respectively. Essential and phenodiv genes were found to be intra-modular and inter-modular hubs with the former being highly co-expressed with their interactors contrary to the latter. Essential genes were predominantly nuclear proteins with transcriptional regulation activities while phenodiv genes were cytoplasmic proteins involved in signal transduction. Conclusion: The properties of a disease gene in molecular interaction network determine its role in manifesting different and divergent diseases. PMID:20525321

  3. Microarray profiling of microRNAs reveals frequent coexpression with neighboring miRNAs and host genes.

    PubMed

    Baskerville, Scott; Bartel, David P

    2005-03-01

    MicroRNAs (miRNAs) are short endogenous RNAs known to post-transcriptionally repress gene expression in animals and plants. A microarray profiling survey revealed the expression patterns of 175 human miRNAs across 24 different human organs. Our results show that proximal pairs of miRNAs are generally coexpressed. In addition, an abrupt transition in the correlation between pairs of expressed miRNAs occurs at a distance of 50 kb, implying that miRNAs separated by <50 kb typically derive from a common transcript. Some microRNAs are within the introns of host genes. Intronic miRNAs are usually coordinately expressed with their host gene mRNA, implying that they also generally derive from a common transcript, and that in situ analyses of host gene expression can be used to probe the spatial and temporal localization of intronic miRNAs.
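
    The comparison behind the 50 kb observation can be sketched by splitting same-chromosome miRNA pairs at a distance cutoff and averaging their expression correlations on each side. This is a minimal illustration, not the authors' analysis; the function name, coordinates and expression values are invented, and only the 50 kb cutoff comes from the abstract.

```python
import numpy as np


def mean_correlation_by_distance(positions, expr, cutoff=50_000):
    """Average pairwise correlation for pairs closer than / at least `cutoff` bp apart.

    positions: list of (chromosome, coordinate) per miRNA.
    expr: array of shape (miRNAs, tissues), rows in the same order.
    Returns (mean_near, mean_far); a value is None if no pair falls in that bin.
    """
    r = np.corrcoef(expr)
    near, far = [], []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            chrom_i, pos_i = positions[i]
            chrom_j, pos_j = positions[j]
            if chrom_i != chrom_j:
                continue  # genomic distance is undefined across chromosomes
            bucket = near if abs(pos_i - pos_j) < cutoff else far
            bucket.append(r[i, j])
    mean_near = float(np.mean(near)) if near else None
    mean_far = float(np.mean(far)) if far else None
    return mean_near, mean_far


# Invented toy data: two clustered miRNAs and one distant miRNA on one chromosome.
toy_positions = [("chr1", 1_000), ("chr1", 5_000), ("chr1", 900_000)]
toy_expr = np.array([[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]], dtype=float)
mean_near, mean_far = mean_correlation_by_distance(toy_positions, toy_expr)
```

    A clearly higher average for the near bin than for the far bin is the kind of pattern the abstract interprets as evidence that closely spaced miRNAs derive from a common transcript.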

  4. Genome-Wide Identification, Evolution, and Co-expression Network Analysis of Mitogen-Activated Protein Kinase Kinase Kinases in Brachypodium distachyon

    PubMed Central

    Feng, Kewei; Liu, Fuyan; Zou, Jinwei; Xing, Guangwei; Deng, Pingchuan; Song, Weining; Tong, Wei; Nie, Xiaojun

    2016-01-01

    Mitogen-activated protein kinase (MAPK) cascades are conserved and universal signal transduction modules in all eukaryotes, which play vital roles in plant growth, development, and responses to multiple stresses. In this study, we used bioinformatics methods to identify 86 MAPKKK proteins encoded by 73 MAPKKK genes in Brachypodium. Phylogenetic analysis of the MAPKKK family from Arabidopsis, rice, and Brachypodium classified them into three subfamilies, of which 28 belonged to the MEKK, 52 to the Raf, and 6 to the ZIK subfamily. Conserved protein motif, exon-intron organization, and splicing intron phase in kinase domains supported the evolutionary relationships inferred from the phylogenetic analysis. Gene duplication analysis suggested that the chromosomal segment duplications happened before the divergence of rice and Brachypodium, while all three tandem duplicated gene pairs arose after their divergence. We further demonstrated that the MAPKKKs have evolved under strong purifying selection, implying their conservation. The splicing transcript expression analysis showed that the splice variant encoding the longest protein tended to be adopted. Furthermore, the expression analysis of BdMAPKKKs in different organs and developmental stages as well as under heat, virus and drought stresses revealed that the MAPKKK genes were involved in various signaling pathways. The circadian analysis suggested that 41 MAPKKK genes in Brachypodium showed cycled expression in at least one condition, of which seven MAPKKK genes were expressed in all conditions, and the promoter analysis indicated that these genes possessed many cis-acting regulatory elements involved in circadian and light response. Finally, the co-expression network of MAPK, MAPKK, and MAPKKK in Brachypodium was constructed using 144 microarray and RNA-seq datasets, and ten potential MAPK cascade pathways were predicted. To conclude, our study provided the important information for evolutionary and

  5. Gene Coexpression Analysis Reveals Complex Metabolism of the Monoterpene Alcohol Linalool in Arabidopsis Flowers

    PubMed Central

    Ginglinger, Jean-François; Boachon, Benoit; Höfer, René; Paetz, Christian; Köllner, Tobias G.; Miesch, Laurence; Lugan, Raphael; Baltenweck, Raymonde; Mutterer, Jérôme; Ullmann, Pascaline; Beran, Franziska; Claudel, Patricia; Verstappen, Francel; Fischer, Marc J.C.; Karst, Francis; Bouwmeester, Harro; Miesch, Michel; Schneider, Bernd; Gershenzon, Jonathan; Ehlting, Jürgen; Werck-Reichhart, Danièle

    2013-01-01

    The cytochrome P450 family encompasses the largest family of enzymes in plant metabolism, and the functions of many of its members in Arabidopsis thaliana are still unknown. Gene coexpression analysis pointed to two P450s that were coexpressed with two monoterpene synthases in flowers and were thus predicted to be involved in monoterpenoid metabolism. We show that all four selected genes, the two terpene synthases (TPS10 and TPS14) and the two cytochrome P450s (CYP71B31 and CYP76C3), are simultaneously expressed at anthesis, mainly in upper anther filaments and in petals. Upon transient expression in Nicotiana benthamiana, the TPS enzymes colocalize in vesicular structures associated with the plastid surface, whereas the P450 proteins were detected in the endoplasmic reticulum. Whether they were expressed in Saccharomyces cerevisiae or in N. benthamiana, the TPS enzymes formed two different enantiomers of linalool: (−)-(R)-linalool for TPS10 and (+)-(S)-linalool for TPS14. Both P450 enzymes metabolize the two linalool enantiomers to form different but overlapping sets of hydroxylated or epoxidized products. These oxygenated products are not emitted into the floral headspace, but accumulate in floral tissues as further converted or conjugated metabolites. This work reveals complex linalool metabolism in Arabidopsis flowers, the ecological role of which remains to be determined. PMID:24285789

  6. The biosynthetic gene cluster for the cyanogenic glucoside dhurrin in Sorghum bicolor contains its co-expressed vacuolar MATE transporter

    PubMed Central

    Darbani, Behrooz; Motawia, Mohammed Saddik; Olsen, Carl Erik; Nour-Eldin, Hussam H.; Møller, Birger Lindberg; Rook, Fred

    2016-01-01

    Genomic gene clusters for the biosynthesis of chemical defence compounds are increasingly identified in plant genomes. We previously reported the independent evolution of biosynthetic gene clusters for cyanogenic glucoside biosynthesis in three plant lineages. Here we report that the gene cluster for the cyanogenic glucoside dhurrin in Sorghum bicolor additionally contains a gene, SbMATE2, encoding a transporter of the multidrug and toxic compound extrusion (MATE) family, which is co-expressed with the biosynthetic genes. The predicted localisation of SbMATE2 to the vacuolar membrane was demonstrated experimentally by transient expression of a SbMATE2-YFP fusion protein and confocal microscopy. Transport studies in Xenopus laevis oocytes demonstrate that SbMATE2 is able to transport dhurrin. In addition, SbMATE2 was able to transport non-endogenous cyanogenic glucosides, but not the anthocyanin cyanidin 3-O-glucoside or the glucosinolate indol-3-yl-methyl glucosinolate. The genomic co-localisation of a transporter gene with the biosynthetic genes producing the transported compound is discussed in relation to the role self-toxicity of chemical defence compounds may play in the formation of gene clusters. PMID:27841372

  7. Use of the growing environment as a source of variation to identify the quantitative trait transcripts and modules of co-expressed genes that determine chlorogenic acid accumulation

    PubMed Central

    Joët, Thierry; Salmona, Jordi; Laffargue, Andréina; Descroix, Frédéric; Dussert, Stéphane

    2010-01-01

    Developing Coffea arabica seeds accumulate large amounts of chlorogenic acids (CGAs) as a storage form of phenylpropanoid derivatives, making coffee a valuable model to investigate the metabolism of these widespread plant phenolics. However, developmental and environmental regulations of CGA metabolism are poorly understood. In the present work, the expression of selected phenylpropanoid genes, together with CGA isomer profiles, was monitored throughout seed development across a wide set of contrasted natural environments. Although CGA metabolism was controlled by major developmental factors, the mean temperature during seed development had a direct impact on the time-window of CGA biosynthesis, as well as on final CGA isomer composition through subtle transcriptional regulations. We provide evidence that the variability induced by the environment is a useful tool to test whether CGA accumulation is quantitatively modulated at the transcriptional level, hence enabling detection of rate-limiting transcriptional steps [quantitative trait transcripts (QTTs)] for CGA biosynthesis. Variations induced by the environment also enabled a better description of the phenylpropanoid gene transcriptional network throughout seed development, as well as the detection of three temporally distinct modules of quantitatively co-expressed genes. Finally, analysis of metabolite-to-metabolite relationships revealed new biochemical characteristics of the isomerization steps that remain uncharacterized at the gene level. PMID:20199615

  8. Co-expression of the Na(+)/H(+)-antiporter and H(+)-ATPase genes of the salt-tolerant yeast Zygosaccharomyces rouxii in Saccharomyces cerevisiae.

    PubMed

    Watanabe, Yasuo; Oshima, Naoko; Tamai, Youichi

    2005-02-01

    We cloned two genes from the salt-tolerant yeast Zygosaccharomyces rouxii: ZrSOD2 for the cell membrane Na(+)/H(+)-antiporter and ZrPMA1 for the cell membrane H(+)-ATPase. The products of these genes play cooperative roles in the salt-tolerance of Z. rouxii, and the function of the ZrPMA1 product is regulated at the transcription level. We constructed a yeast expression vector that is able to co-express the ZrSOD2 and ZrPMA1 genes. Single expression of ZrSOD2 was effective in conferring salt-tolerance, and although a slight synergic effect was observed with co-expression of ZrSOD2 and ZrPMA1, the usefulness of this co-expression is likely to be minimal with regard to salt-tolerance.

  9. Co-expression of Cyanobacterial Genes for Arsenic Methylation and Demethylation in Escherichia coli Offers Insights into Arsenic Resistance

    PubMed Central

    Yan, Yu; Xue, Xi-Mei; Guo, Yu-Qing; Zhu, Yong-Guan; Ye, Jun

    2017-01-01

    Arsenite [As(III)] and methylarsenite [MAs(III)] are the most toxic inorganic and methylated arsenicals, respectively. As(III) and MAs(III) can be interconverted in the unicellular cyanobacterium Nostoc sp. PCC 7120 (Nostoc), which has both the arsM gene (NsarsM), which is responsible for arsenic methylation, and the arsI gene (NsarsI), which is responsible for MAs(III) demethylation. It is not clear how the cells prevent a futile cycle of methylation and demethylation. To investigate the relationship between arsenic methylation and demethylation, we constructed strains of Escherichia coli AW3110 (ΔarsRBC) expressing NsarsM or/and NsarsI. Expression of NsarsI conferred MAs(III) resistance through MAs(III) demethylation. Compared to NsArsI, NsArsM conferred higher resistance to As(III) and lower resistance to MAs(III) by methylating both As(III) and MAs(III). The major species found in solution was dimethylarsenate [DMAs(V)]. Co-expression of NsarsM and NsarsI conferred As(III) resistance at levels similar to that with NsarsM alone, although the main species found in solution after As(III) biotransformation was methylarsenate [MAs(V)] rather than DMAs(V). Co-expression of NsarsM and NsarsI conferred a higher level of resistance to MAs(III) than found with expression of NsarsM alone but lower than expression of only NsarsI. Cells co-expressing both genes converted MAs(III) to a mixture of As(III) and DMAs(V). In Nostoc NsarsM is constitutively expressed, while NsarsI is inducible by either As(III) or MAs(III). Thus, our results suggest that at low concentrations of arsenic, NsArsM activity predominates, while NsArsI activity predominates at high concentrations. We propose that coexistence of arsM and arsI genes in Nostoc could be advantageous for several reasons. First, it confers a broader spectrum of resistance to both As(III) and MAs(III). Second, at low concentrations of arsenic, the MAs(III) produced by NsArsM will possibly have antibiotic-like properties and

  10. Genome-Wide Identification, Phylogenetic and Co-Expression Analysis of OsSET Gene Family in Rice

    PubMed Central

    Lu, Zhanhua; Huang, Xiaolong; Ouyang, Yidan; Yao, Jialing

    2013-01-01

    Background: The SET domain is responsible for the catalytic activity of histone lysine methyltransferases (HKMTs) during developmental processes. Histone lysine methylation plays a crucial and diverse regulatory function in chromatin organization and genome function. Although several SET genes have been identified and characterized in plants, the understanding of the OsSET gene family in rice is still very limited. Methodology/Principal Findings: In this study, a systematic analysis was performed and revealed the presence of at least 43 SET genes in the rice genome. Phylogenetic and structural analysis grouped SET proteins into five classes, and suggested that the domains outside the SET domain were significant for the specificity of histone lysine methylation, as well as the recognition of methylated histone lysine. Based on the global microarray, the gene expression profile revealed that the transcripts of OsSET genes accumulated differentially during vegetative and reproductive developmental stages and were preferentially up- or down-regulated in different tissues. Cis-element identification, co-expression analysis and GO analysis of expression correlation of 12 OsSET genes suggested that OsSET genes might be involved in cell cycle regulation and feedback. Conclusions/Significance: This study will facilitate further studies on the OsSET family and provide useful clues for functional validation of OsSETs. PMID:23762371

  11. Co-expression of G2-EPSPS and glyphosate acetyltransferase GAT genes conferring high tolerance to glyphosate in soybean

    PubMed Central

    Guo, Bingfu; Guo, Yong; Hong, Huilong; Jin, Longguo; Zhang, Lijuan; Chang, Ru-Zhen; Lu, Wei; Lin, Min; Qiu, Li-Juan

    2015-01-01

    Glyphosate is a widely used non-selective herbicide with a broad spectrum of weed control around the world. At present, most of the commercial glyphosate-tolerant soybeans utilize the glyphosate tolerance gene CP4-EPSPS or the glyphosate acetyltransferase gene GAT separately. In this study, both the glyphosate tolerance gene G2-EPSPS and the glyphosate-degrading gene GAT were co-transferred into soybean, and transgenic plants showed high tolerance to glyphosate. Molecular analysis including PCR, Southern blot, qRT-PCR, and Western blot revealed that the target genes had been integrated into the genome and expressed effectively at both mRNA and protein levels. Furthermore, the glyphosate tolerance analysis showed that no typical symptom was observed when compared with a glyphosate-tolerant line HJ06-698 derived from GR1 transgenic soybean, even at a fourfold labeled rate of Roundup. Chlorophyll and shikimic acid content analysis of transgenic plants also revealed that these two indexes were not significantly altered after glyphosate application. These results indicated that co-expression of G2-EPSPS and GAT conferred high tolerance to the herbicide glyphosate in soybean. Therefore, the combination of tolerance and degradation genes provides a new strategy for developing glyphosate-tolerant transgenic crops. PMID:26528311

  12. Co-expression of G2-EPSPS and glyphosate acetyltransferase GAT genes conferring high tolerance to glyphosate in soybean.

    PubMed

    Guo, Bingfu; Guo, Yong; Hong, Huilong; Jin, Longguo; Zhang, Lijuan; Chang, Ru-Zhen; Lu, Wei; Lin, Min; Qiu, Li-Juan

    2015-01-01

    Glyphosate is a widely used non-selective herbicide with a broad spectrum of weed control around the world. At present, most of the commercial glyphosate-tolerant soybeans utilize the glyphosate tolerance gene CP4-EPSPS or the glyphosate acetyltransferase gene GAT separately. In this study, both the glyphosate tolerance gene G2-EPSPS and the glyphosate-degrading gene GAT were co-transferred into soybean, and transgenic plants showed high tolerance to glyphosate. Molecular analysis including PCR, Southern blot, qRT-PCR, and Western blot revealed that the target genes had been integrated into the genome and expressed effectively at both mRNA and protein levels. Furthermore, the glyphosate tolerance analysis showed that no typical symptom was observed when compared with a glyphosate-tolerant line HJ06-698 derived from GR1 transgenic soybean, even at a fourfold labeled rate of Roundup. Chlorophyll and shikimic acid content analysis of transgenic plants also revealed that these two indexes were not significantly altered after glyphosate application. These results indicated that co-expression of G2-EPSPS and GAT conferred high tolerance to the herbicide glyphosate in soybean. Therefore, the combination of tolerance and degradation genes provides a new strategy for developing glyphosate-tolerant transgenic crops.

  13. SiBIC: a web server for generating gene set networks based on biclusters obtained by maximal frequent itemset mining.

    PubMed

    Takahashi, Kei-ichiro; Takigawa, Ichigaku; Mamitsuka, Hiroshi

    2013-01-01

    Detecting biclusters from expression data is useful, since biclusters are coexpressed genes under only part of all given experimental conditions. We present a software called SiBIC, which from a given expression dataset, first exhaustively enumerates biclusters, which are then merged into rather independent biclusters, which finally are used to generate gene set networks, in which a gene set assigned to one node has coexpressed genes. We evaluated each step of this procedure: 1) significance of the generated biclusters biologically and statistically, 2) biological quality of merged biclusters, and 3) biological significance of gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp.

  14. A conserved BDNF, glutamate- and GABA-enriched gene module related to human depression identified by coexpression meta-analysis and DNA variant genome-wide association studies.

    PubMed

    Chang, Lun-Ching; Jamain, Stephane; Lin, Chien-Wei; Rujescu, Dan; Tseng, George C; Sibille, Etienne

    2014-01-01

    Large scale gene expression (transcriptome) analysis and genome-wide association studies (GWAS) for single nucleotide polymorphisms have generated a considerable amount of gene- and disease-related information, but heterogeneity and various sources of noise have limited the discovery of disease mechanisms. As systematic dataset integration is becoming essential, we developed methods and performed meta-clustering of gene coexpression links in 11 transcriptome studies from postmortem brains of human subjects with major depressive disorder (MDD) and non-psychiatric control subjects. We next sought enrichment in the top 50 meta-analyzed coexpression modules for genes otherwise identified by GWAS for various sets of disorders. One coexpression module of 88 genes was consistently and significantly associated with GWAS for MDD, other neuropsychiatric disorders and brain functions, and for medical illnesses with elevated clinical risk of depression, but not for other diseases. In support of the superior discriminative power of this novel approach, we observed no significant enrichment for GWAS-related genes in coexpression modules extracted from single studies or in meta-modules using gene expression data from non-psychiatric control subjects. Genes in the identified module encode proteins implicated in neuronal signaling and structure, including glutamate metabotropic receptors (GRM1, GRM7), GABA receptors (GABRA2, GABRA4), and neurotrophic and development-related proteins [BDNF, reelin (RELN), Ephrin receptors (EPHA3, EPHA5)]. These results are consistent with the current understanding of molecular mechanisms of MDD and provide a set of putative interacting molecular partners, potentially reflecting components of a functional module across cells and biological pathways that are synchronously recruited in MDD, other brain disorders and MDD-related illnesses. Collectively, this study demonstrates the importance of integrating transcriptome data, gene coexpression modules

  15. Gene transcriptional networks integrate microenvironmental signals in human breast cancer.

    PubMed

    Xu, Ren; Mao, Jian-Hua

    2011-04-01

    A significant amount of evidence shows that microenvironmental signals generated from extracellular matrix (ECM) molecules, soluble factors, and cell-cell adhesion complexes cooperate at the extra- and intracellular level. This synergetic action of microenvironmental cues is crucial for normal mammary gland development and breast malignancy. To explore how the microenvironmental genes coordinate in human breast cancer at the genome level, we have performed gene co-expression network analysis in three independent microarray datasets and identified two microenvironment networks in human breast cancer tissues. Network I represents crosstalk and cooperation of ECM microenvironment and soluble factors during breast malignancy. The correlated expression of cytokines, chemokines, and cell adhesion proteins in Network II implicates the coordinated action of these molecules in modulating the immune response in breast cancer tissues. These results suggest that microenvironmental cues are integrated with gene transcriptional networks to promote breast cancer development.

  16. Evolution of akirin family in gene and genome levels and coexpressed patterns among family members and rel gene in croaker.

    PubMed

    Liu, Tianxing; Gao, Yunhang; Xu, Tianjun

    2015-09-01

    Akirins, which are highly conserved nuclear proteins, are present throughout the metazoan and regulate innate immunity, embryogenesis, myogenesis, and carcinogenesis. This study reports all akirin genes from miiuy croaker and analyzes comprehensively the akirin gene family combined with akirin genes from other species. A second nuclear localization signal (NLS) is observed in akirin2 homologues, which is not in akirin1 homologues in all teleosts and most other vertebrates. Thus, we deduced that the loss of second NLS in akirin1 homologues in teleosts likely occurred in an ancestor to all Osteichthyes after splitting with cartilaginous fish. Significantly, the akirin2(2) gene included six exons interrupted by five introns in the miiuy croaker, which may be caused by the intron insertion event as a novel evidence for the variation of akirin gene structure in some species. In addition, comparison of the genomic neighborhood genes of akirin1, akirin2(1), and akirin2(2) demonstrates a strong level of conserved synteny across the teleost classes, which further proved the deduction of Macqueen and Johnston 2009 that the produce of akirin paralogues can be attributed to whole-genome duplications and the loss of some akirin paralogues after genome duplications. Furthermore, akirin gene family members and relish gene are ubiquitously expressed across all tissues, and their expression levels are increased in three immune tissues after infection with Vibrio anguillarum. Combined with the expression patterns of LEAP-1 and LEAP-2 from miiuy croaker, an intricate network of co-regulation among family members is established. Thus, it is further proved that akirins acted in concert with the relish protein to induce the expression of a subset of downstream pathway elements in the NF-kB dependent signaling pathway. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. Gene cloning and soluble expression of Aspergillus niger phytase in E. coli cytosol via chaperone co-expression.

    PubMed

    Ushasree, Mrudula Vasudevan; Vidya, Jalaja; Pandey, Ashok

    2014-01-01

    A phytase gene from Aspergillus niger was isolated and two Escherichia coli expression systems, based on T7 RNA polymerase promoter and tac promoter, were used for its recombinant expression. Co-expression of molecular chaperone, GroES/EL, aided functional cytosolic expression of the phytase in E. coli BL21 (DE3). Untagged and maltose-binding protein-tagged recombinant phytase showed an activity band of ~49 and 92 kDa, respectively, on a zymogram. Heterologously-expressed phytase was fractionated from endogenous E. coli phytase by (NH4)2SO4 precipitation. The enzyme had optimum activity at 50 °C and pH 6.5.

  18. Inferring gene regression networks with model trees

    PubMed Central

    2010-01-01

    Background: Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes, building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful for determining whether two genes have a strong global similarity, but they do not detect local similarities. Results: We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph from all the relationships among output and input genes is built, taking into account whether the pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: the Saccharomyces cerevisiae and E. coli data sets. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results to those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions: REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. They are very often more precise than linear regression models because they can add just different linear regressions to separate
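
The abstract above says that candidate gene pairs are filtered by "a statistical procedure to control the false discovery rate" but does not name the procedure. As a rough illustration only, the sketch below assumes the Benjamini-Hochberg step-up procedure, which is one common choice; the function name `benjamini_hochberg` and the `alpha` default are hypothetical and are not taken from REGNET.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Assumption: Benjamini-Hochberg is used here as a stand-in; the
    abstract does not say which FDR procedure REGNET applies.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            k_max = rank

    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected
```

In a network setting, each p-value would correspond to one candidate gene-gene edge, and only edges flagged `True` would be kept in the graph.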

  19. Co-expression of perforin and granzyme B genes induces apoptosis and inhibits the tumorigenicity of laryngeal cancer cell line Hep-2

    PubMed Central

    Li, Xiu-Ying; Li, Zhi; An, Gui-Jie; Liu, Sha; Lai, Yan-Dong

    2014-01-01

    Granzyme B and perforin, two of the most important components, have shown anticancer properties in various cancers, but their effects in laryngeal cancer remain unexplored. Here we decided to examine the effects of Granzyme B and perforin in Hep-2 cells and clarify the role of perforin and granzyme B in the tumorigenicity of laryngeal cancer cell line. Hep-2 cells were transfected with pVAX1-PIG co-expression vector (comprising perforin and granzyme B genes), and then the growth and apoptosis of these Hep-2 cells were evaluated. The tumorigenicity of Hep-2 cell line co-expressing perforin and granzyme B genes was tested in BALB/c nu/nu mice. We found that the co-expression of perforin and granzyme B genes could obviously inhibit cell focus formation and induce cell apoptosis in Hep-2 cells. Furthermore, after subcutaneous injection of Hep-2 cells transfected with pVAX1-PIG, an extensive delay in tumor growth was observed in BALB/c-nu/nu mice. Moreover, our studies demonstrated that the anticancer activity of perforin and granzyme B was sustainable in vivo as tumor development by inducing cell apoptosis. Taken together, our data indicate that the co-expression of perforin and granzyme B genes exhibits anticancer potential, and hopefully provide potential therapeutic applications in laryngeal cancer. PMID:24696715

  20. Co-expression of perforin and granzyme B genes induces apoptosis and inhibits the tumorigenicity of laryngeal cancer cell line Hep-2.

    PubMed

    Li, Xiu-Ying; Li, Zhi; An, Gui-Jie; Liu, Sha; Lai, Yan-Dong

    2014-01-01

    Granzyme B and perforin, two of the most important components, have shown anticancer properties in various cancers, but their effects in laryngeal cancer remain unexplored. Here we decided to examine the effects of Granzyme B and perforin in Hep-2 cells and clarify the role of perforin and granzyme B in the tumorigenicity of laryngeal cancer cell line. Hep-2 cells were transfected with pVAX1-PIG co-expression vector (comprising perforin and granzyme B genes), and then the growth and apoptosis of these Hep-2 cells were evaluated. The tumorigenicity of Hep-2 cell line co-expressing perforin and granzyme B genes was tested in BALB/c nu/nu mice. We found that the co-expression of perforin and granzyme B genes could obviously inhibit cell focus formation and induce cell apoptosis in Hep-2 cells. Furthermore, after subcutaneous injection of Hep-2 cells transfected with pVAX1-PIG, an extensive delay in tumor growth was observed in BALB/c-nu/nu mice. Moreover, our studies demonstrated that the anticancer activity of perforin and granzyme B was sustainable in vivo as tumor development by inducing cell apoptosis. Taken together, our data indicate that the co-expression of perforin and granzyme B genes exhibits anticancer potential, and hopefully provide potential therapeutic applications in laryngeal cancer.

  1. RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond.

    PubMed

    Gama-Castro, Socorro; Salgado, Heladia; Santos-Zavaleta, Alberto; Ledezma-Tejeida, Daniela; Muñiz-Rascado, Luis; García-Sotelo, Jair Santiago; Alquicira-Hernández, Kevin; Martínez-Flores, Irma; Pannier, Lucia; Castro-Mondragón, Jaime Abraham; Medina-Rivera, Alejandra; Solano-Lira, Hilda; Bonavides-Martínez, César; Pérez-Rueda, Ernesto; Alquicira-Hernández, Shirley; Porrón-Sotelo, Liliana; López-Fuentes, Alejandra; Hernández-Koutoucheva, Anastasia; Del Moral-Chávez, Víctor; Rinaldi, Fabio; Collado-Vides, Julio

    2016-01-04

    RegulonDB (http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to maintaining curation up-to-date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for 'neighborhood' genes to known operons and regulons, and computational developments.

  2. RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond

    PubMed Central

    Gama-Castro, Socorro; Salgado, Heladia; Santos-Zavaleta, Alberto; Ledezma-Tejeida, Daniela; Muñiz-Rascado, Luis; García-Sotelo, Jair Santiago; Alquicira-Hernández, Kevin; Martínez-Flores, Irma; Pannier, Lucia; Castro-Mondragón, Jaime Abraham; Medina-Rivera, Alejandra; Solano-Lira, Hilda; Bonavides-Martínez, César; Pérez-Rueda, Ernesto; Alquicira-Hernández, Shirley; Porrón-Sotelo, Liliana; López-Fuentes, Alejandra; Hernández-Koutoucheva, Anastasia; Moral-Chávez, Víctor Del; Rinaldi, Fabio; Collado-Vides, Julio

    2016-01-01

    RegulonDB (http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to maintaining curation up-to-date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for ‘neighborhood’ genes to known operons and regulons, and computational developments. PMID:26527724

  3. Nearest Hyperplane Distance Neighbor Clustering algorithm Applied to Gene Co-Expression Analysis in Alzheimer’s Disease

    PubMed Central

    Pasluosta, Cristian F.; Dua, Prerna; Lukiw, Walter J.

    2013-01-01

    Microarray analysis can contribute considerably to the understanding of biologically significant cellular mechanisms that yield novel information regarding co-regulated sets of gene patterns. Clustering is one of the most popular tools for analyzing DNA microarray data. In this paper, we present an unsupervised clustering algorithm based on the K-local hyperplane distance nearest-neighbor classifier (HKNN). We adapted the well-known nearest neighbor clustering algorithm for use with hyperplane distance. The result is a simple and computationally inexpensive unsupervised clustering algorithm that can be applied to high-dimensional data. It has been reported that the NFkB1 gene is progressively over-expressed in moderate-to-severe Alzheimer’s disease (AD) cases, and that the NF-kB complex plays a key role in neuroinflammatory responses in AD pathogenesis. In this study, we apply the proposed clustering algorithm to identify co-expression patterns with the NFkB1 in gene expression data from hippocampal tissue samples. Finally, we validate our experiments with biomedical literature search. PMID:22255598

  4. CorSig: A General Framework for Estimating Statistical Significance of Correlation and Its Application to Gene Co-Expression Analysis

    PubMed Central

    Wang, Hong-Qiang; Tsai, Chung-Jui

    2013-01-01

    With the rapid increase of omics data, correlation analysis has become an indispensable tool for inferring meaningful associations from a large number of observations. Pearson correlation coefficient (PCC) and its variants are widely used for such purposes. However, it remains challenging to test whether an observed association is reliable both statistically and biologically. We present here a new method, CorSig, for statistical inference of correlation significance. CorSig is based on a biology-informed null hypothesis, i.e., testing whether the true PCC (ρ) between two variables is statistically larger than a user-specified PCC cutoff (τ), as opposed to the simple null hypothesis of ρ = 0 in existing methods, i.e., testing whether an association can be declared without a threshold. CorSig incorporates Fisher's Z transformation of the observed PCC (r), which facilitates use of standard techniques for p-value computation and multiple testing corrections. We compared CorSig against two methods: one uses a minimum PCC cutoff while the other (Zhu's procedure) controls correlation strength and statistical significance in two discrete steps. CorSig consistently outperformed these methods in various simulation data scenarios by balancing between false positives and false negatives. When tested on real-world Populus microarray data, CorSig effectively identified co-expressed genes in the flavonoid pathway, and discriminated between closely related gene family members for their differential association with flavonoid and lignin pathways. The p-values obtained by CorSig can be used as a stand-alone parameter for stratification of co-expressed genes according to their correlation strength in lieu of an arbitrary cutoff. CorSig requires one single tunable parameter, and can be readily extended to other correlation measures. Thus, CorSig should be useful for a wide range of applications, particularly for network analysis of high-dimensional genomic data. Software
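
The core idea described above, testing whether the true correlation exceeds a cutoff τ by way of Fisher's Z transformation, can be sketched with the textbook one-sided test. This is a minimal illustration under the usual bivariate-normal approximation (standard error 1/sqrt(n - 3)); it is not the CorSig software, and the function name `fisher_z_pvalue` is hypothetical. Multiple-testing correction is left out.

```python
import math
from statistics import NormalDist


def fisher_z_pvalue(r, n, tau):
    """One-sided p-value for H0: rho <= tau against H1: rho > tau.

    r   : observed Pearson correlation coefficient
    n   : number of samples used to compute r (must exceed 3)
    tau : user-specified correlation cutoff
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not (-1.0 < r < 1.0 and -1.0 < tau < 1.0):
        raise ValueError("r and tau must lie strictly between -1 and 1")

    # Fisher's Z transformation: atanh(r) is approximately normal with
    # mean atanh(rho) and standard deviation 1 / sqrt(n - 3).
    z = (math.atanh(r) - math.atanh(tau)) * math.sqrt(n - 3)

    # Upper-tail probability under the standard normal distribution.
    return 1.0 - NormalDist().cdf(z)
```

For example, `fisher_z_pvalue(0.9, 50, 0.5)` asks whether an observed correlation of 0.9 over 50 samples is convincingly above a cutoff of 0.5, while setting `tau=0` recovers the conventional one-sided test against no correlation.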

  5. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

    PubMed

    Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

    2012-07-15

    Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E<10(-5)) are included in 27 clusters. Five clusters are associated with metabolism, containing P450 genes restricted to the Brassica family and predicted to be involved in secondary metabolism. Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary
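
The comparison of neighboring-gene correlations against randomly selected gene pairs can be sketched as an empirical null test. The code below is a simplified illustration, not the authors' method: the function names, the `n_random` and `seed` parameters, and the empirical p-value formula are all assumptions, random pairs are not prevented from being adjacent, and the FDR step mentioned in the abstract is omitted.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def neighbor_pvalues(profiles, n_random=1000, seed=0):
    """Score each adjacent gene pair against a random-pair null.

    profiles : list of expression vectors, ordered by chromosomal position
    Returns a list of (index, r, p) for each pair (index, index + 1),
    where p is the fraction of random pairs with correlation >= r.
    """
    rng = random.Random(seed)
    n = len(profiles)

    # Null distribution: correlations of randomly chosen gene pairs.
    null = []
    for _ in range(n_random):
        i, j = rng.sample(range(n), 2)
        null.append(pearson(profiles[i], profiles[j]))

    results = []
    for i in range(n - 1):
        r = pearson(profiles[i], profiles[i + 1])
        exceed = sum(1 for v in null if v >= r)
        # Add-one correction keeps the empirical p-value above zero.
        p = (exceed + 1) / (n_random + 1)
        results.append((i, r, p))
    return results
```

Runs of consecutive adjacent pairs with small p-values would then be candidate co-expressed clusters, to be filtered further by an FDR threshold.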

  6. Searching for coexpressed genes in three-color cDNA microarray data using a probabilistic model-based Hough Transform.

    PubMed

    Tino, Peter; Zhao, Hongya; Yan, Hong

    2011-01-01

    The effects of a drug on the genomic scale can be assessed in a three-color cDNA microarray with the three color intensities represented through the so-called hexaMplot. In our recent study, we have shown that the Hough Transform (HT) applied to the hexaMplot can be used to detect groups of coexpressed genes in the normal-disease-drug samples. However, the standard HT is not well suited for the purpose because 1) the assayed genes need first to be hard-partitioned into equally and differentially expressed genes, with HT ignoring possible information in the former group; 2) the hexaMplot coordinates are negatively correlated and there is no direct way of expressing this in the standard HT; and 3) it is not clear how to quantify the association of coexpressed genes with the line along which they cluster. We address these deficiencies by formulating a dedicated probabilistic model-based HT. The approach is demonstrated by assessing effects of the drug Rg1 on homocysteine-treated human umbilical vein endothelial cells. Compared with our previous study, we robustly detect stronger natural groupings of coexpressed genes. Moreover, the gene groups show coherent biological functions with high significance, as detected by the Gene Ontology analysis.

  7. Coexpression of pyruvate decarboxylase and alcohol dehydrogenase genes in Lactobacillus brevis.

    PubMed

    Liu, Siqing; Dien, Bruce S; Nichols, Nancy N; Bischoff, Kenneth M; Hughes, Stephen R; Cotta, Michael A

    2007-09-01

    Lactobacillus brevis ATCC 367 was engineered to express pyruvate decarboxylase (PDC) and alcohol dehydrogenase (ADH) genes in order to increase ethanol fermentation from biomass-derived residues. First, a Gram-positive Sarcina ventriculi PDC gene (Svpdc) was introduced into L. brevis ATCC 367 to obtain L. brevis bbc03. The SvPDC was detected by immunoblot using an SvPDC oligopeptide antiserum, but no increased ethanol was detected in L. brevis bbc03. Then, an ADH gene from L. brevis (Bradh) was cloned behind the Svpdc gene to generate a pdc/adh-coupled ethanol cassette, pBBC04. The pBBC04 restored anaerobic growth and conferred ethanol production on Escherichia coli NZN111 (a fermentation-defective strain incapable of growing anaerobically). Recombinant proteins of approximately 58 kDa (SvPDC) and 28 kDa (BrADH) were observed in L. brevis bbc04. These results indicated that the Gram-positive ethanol production genes can be expressed in L. brevis using a Gram-positive promoter and the pTRKH2 shuttle vector. This work provides evidence that expressing Gram-positive ethanol genes in pentose-utilizing L. brevis will further aid manipulation of this microbe toward biomass-to-ethanol production.

  8. Co-expression of plasmid-mediated quinolone resistance-qnrA1 and blaVEB-1 gene in a Providencia stuartii strain.

    PubMed

    Nazik, Hasan; Bektöre, Bayhan; Öngen, Betigül; Özyurt, Mustafa; Baylan, Orhan; Haznedaroğlu, Tunçer

    2011-04-01

    An extended-spectrum β-lactamase (ESBL)-producing Providencia stuartii isolate was studied. A qnrA1 gene co-expressed with the blaVEB-1 gene was detected. Both genes were transferred to the recipient strain. The ciprofloxacin MIC of the recipient strain increased tenfold. The blaVEB-1 gene persisted in microorganisms in Turkey, but it also spread with PMQR genes to other species. The combination of PMQR with multidrug-resistant isolates producing ESBLs may compromise the use of valuable antibiotics. Serious efforts are necessary to detect PMQR determinants not only with common β-lactamases in widespread pathogens but also with uncommon forms that are encountered infrequently.

  9. Gene Co-Expression Analysis Inferring the Crosstalk of Ethylene and Gibberellin in Modulating the Transcriptional Acclimation of Cassava Root Growth in Different Seasons

    PubMed Central

    Saithong, Treenut; Saerue, Samorn; Kalapanulak, Saowalak; Sojikul, Punchapat; Narangajavana, Jarunya; Bhumiratana, Sakarindr

    2015-01-01

    Cassava is a crop of hope for the 21st century. The great advantages of cassava over other crops are not only its carbohydrate capacity but also that it is easily grown and develops quickly. As a plant that is highly tolerant of poor environments, cassava is believed to possess an effective acclimation process, an intelligent mechanism behind its survival and sustainability in a wide range of climates. Herein, we aimed to investigate the transcriptional regulation underlying the adaptive development of the cassava root to different seasonal cultivation climates. Gene co-expression analysis suggests that the AP2-EREBP transcription factor (ERF1) orthologue (D142) played a pivotal role in regulating the cellular response to exposure to wet and dry seasons. The ERF shows crosstalk with gibberellin, via ent-Kaurene synthase (D106), in the transcriptional regulatory network that was proposed to modulate the downstream regulatory system through a distinct signaling mechanism. While sulfur assimilation is likely to be a signaling regulation for the dry crop growth response, calmodulin-binding protein is responsible for regulation in the wet crop. With this initial study, we hope that our findings will pave the way towards sustainable cassava production under various kinds of stress, considering future global climate change. PMID:26366737

  10. Ethanol production by Escherichia coli strains co-expressing Zymomonas PDC and ADH genes

    DOEpatents

    Ingram, Lonnie O.; Conway, Tyrrell; Alterthum, Flavio

    1991-01-01

    A novel operon and plasmids comprising genes which code for the alcohol dehydrogenase and pyruvate decarboxylase activities of Zymomonas mobilis are described. Also disclosed are methods for increasing the growth of microorganisms or eukaryotic cells and methods for reducing the accumulation of undesirable metabolic products in the growth medium of microorganisms or cells.

  11. Systems toxicology of chemically induced liver and kidney injuries: histopathology-associated gene co-expression modules.

    PubMed

    Te, Jerez A; AbdulHameed, Mohamed Diwan M; Wallqvist, Anders

    2016-09-01

    Organ injuries caused by environmental chemical exposures or use of pharmaceutical drugs pose a serious health risk that may be difficult to assess because of a lack of non-invasive diagnostic tests. Mapping chemical injuries to organ-specific histopathology outcomes via biomarkers will provide a foundation for designing precise and robust diagnostic tests. We identified co-expressed genes (modules) specific to injury endpoints using the Open Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System (TG-GATEs) - a toxicogenomics database containing organ-specific gene expression data matched to dose- and time-dependent chemical exposures and adverse histopathology assessments in Sprague-Dawley rats. We proposed a protocol for selecting gene modules associated with chemical-induced injuries that classify 11 liver and eight kidney histopathology endpoints based on dose-dependent activation of the identified modules. We showed that the activation of the modules for a particular chemical exposure condition, i.e., chemical-time-dose combination, correlated with the severity of histopathological damage in a dose-dependent manner. Furthermore, the modules could distinguish different types of injuries caused by chemical exposures as well as determine whether the injury module activation was specific to the tissue of origin (liver and kidney). The generated modules provide a link between toxic chemical exposures, different molecular initiating events among underlying molecular pathways and resultant organ damage. Published 2016. This article is a U.S. Government work and is in the public domain in the USA. Journal of Applied Toxicology published by John Wiley & Sons, Ltd.

  12. Gene selection and cloning approaches for co-expression and production of recombinant protein-protein complexes.

    PubMed

    Babnigg, György; Jedrzejczak, Robert; Nocek, Boguslaw; Stein, Adam; Eschenfeldt, William; Stols, Lucy; Marshall, Norman; Weger, Alicia; Wu, Ruiying; Donnelly, Mark; Joachimiak, Andrzej

    2015-12-01

    Multiprotein complexes play essential roles in all cells and X-ray crystallography can provide unparalleled insight into their structure and function. Many of these complexes are believed to be sufficiently stable for structural biology studies, but the production of protein-protein complexes using recombinant technologies is still labor-intensive. We have explored several strategies for the identification and cloning of heterodimers and heterotrimers that are compatible with the high-throughput (HTP) structural biology pipeline developed for single proteins. Two approaches are presented and compared which resulted in co-expression of paired genes from a single expression vector. Native operons encoding predicted interacting proteins were selected from a repertoire of genomes, and cloned directly to expression vector. In an alternative approach, Helicobacter pylori proteins predicted to interact strongly were cloned, each associated with translational control elements, then linked into an artificial operon. Proteins were then expressed and purified by standard HTP protocols, resulting to date in the structure determination of two H. pylori complexes.

  13. Recombinant flavin-dependent halogenases are functional in tobacco chloroplasts without co-expression of flavin reductase genes.

    PubMed

    Fräbel, Sabine; Krischke, Markus; Staniek, Agata; Warzecha, Heribert

    2016-12-01

    Halogenation of natural compounds in planta is rare. Herein, a successful engineering of tryptophan 6-halogenation into the plant context by heterologous expression of the Streptomyces toxytricini Stth gene and localization of its enzymatic product in various tobacco cell compartments is described. When co-expressed with the flavin reductase rebF from Lechevalieria aerocolonigenes, Stth efficiently produced chlorinated tryptophan in the cytosol. Further, supplementation of KBr yielded the brominated metabolite. More strikingly, targeting of the protein to the chloroplasts enabled effective halogenation of tryptophan even in absence of the partner reductase, providing crucial evidence for sufficient, organelle-specific supply of the FADH2 cofactor to drive halogen integration. Incorporation of an alternative enzyme, the 7-halogenase RebH from L. aerocolonigenes, into the metabolic set-up resulted in the formation of 6,7-dichlorotryptophan. Finally, expression of tryptophan decarboxylase (tdc) in concert with stth led to the generation of 6-chlorotryptamine, a new-to-nature precursor of monoterpenoid indole alkaloids. In sum, the report highlights the tremendous application potential of plants as a unique chassis for the engineering of rare and valuable halogenated natural products, with chloroplasts as the cache of reduction equivalents driving metabolic reactions. Copyright © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Molecular Cloning and Co-Expression of Phytoene Synthase Gene from Kocuria gwangalliensis in Escherichia coli.

    PubMed

    Seo, Yong Bae; Choi, Seong-Seok; Lee, Jong Kyu; Kim, Nan-Hee; Choi, Mi Jin; Kim, Jong-Myoung; Jeong, Tae Hyug; Nam, Soo-Wan; Lim, Han Kyu; Kim, Gun-Do

    2015-11-01

    A phytoene synthase gene, crtB, was isolated from Kocuria gwangalliensis. The crtB, with a full length of 1,092 bp, has a coding sequence of 948 bp and encodes a 316-amino-acid protein. The deduced amino acid sequence showed 70.9% identity with a putative phytoene synthase from K. rhizophila. An expression plasmid, pCcrtB, containing the crtB gene was constructed, and E. coli cells containing this plasmid produced a recombinant protein of approximately 34 kDa, corresponding to the molecular mass of phytoene synthase. Biosynthesis of lycopene was confirmed when the plasmid pCcrtB was co-transformed into E. coli containing pRScrtEI, which carries the crtE and crtI genes encoding lycopene biosynthetic pathway enzymes. The results obtained from this study will provide a base of knowledge about the phytoene synthase of K. gwangalliensis and can be applied to the production of carotenoids in a non-carotenoid-producing host.

  15. Informed walks: whispering hints to gene hunters inside networks' jungle.

    PubMed

    Bourdakou, Marilena M; Spyrou, George M

    2017-10-11

    Systemic approaches offer a different point of view on the analysis of several types of molecular associations as well as on the identification of specific gene communities in several cancer types. However, due to the lack of sufficient data needed to construct networks based on experimental evidence, statistical gene co-expression networks are widely used instead. Many efforts have been made to exploit the information hidden in these networks. However, these approaches still need to capitalize comprehensively on the prior knowledge encoded in molecular pathway associations and to improve their efficiency in discovering both exclusive subnetworks as candidate biomarkers and conserved subnetworks that may uncover common origins of several cancer types. In this study we present the development of the Informed Walks model, based on random walks that incorporate information from molecular pathways to mine candidate genes and gene-gene links. The proposed model has been applied to TCGA (The Cancer Genome Atlas) datasets from seven different cancer types, exploring the reconstructed co-expression networks of the whole set of genes and leading to highlighted subnetworks for each cancer type. Subsequently, we elucidated the impact of each subnetwork on the indication of underlying exclusive and common molecular mechanisms as well as on the short-listing of drugs that have the potential to suppress the corresponding cancer type through a drug-repurposing pipeline. We have developed a method of gene subnetwork highlighting based on prior knowledge, capable of giving fruitful insights regarding the underlying molecular mechanisms and valuable input to drug-repurposing pipelines for a variety of cancer types.
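
    The general idea of a random walk on a co-expression network that is biased by pathway knowledge can be illustrated with a generic random walk with restart. This is not the Informed Walks model itself, whose details the abstract does not give; the edge re-weighting rule, the function name informed_walk, and the parameters restart and boost are assumptions made only for this sketch.

```python
import numpy as np

def informed_walk(adj, pathway_prior, seeds, restart=0.3, boost=1.0,
                  tol=1e-10, max_iter=1000):
    """Random walk with restart on a pathway-weighted co-expression graph.

    adj           : (n, n) symmetric non-negative co-expression weights
    pathway_prior : (n, n) matrix, 1 where two genes share a pathway, else 0
    seeds         : indices of genes the walker restarts from
    Edge weights are multiplied by (1 + boost * prior); this re-weighting
    is a hypothetical choice, not the published model.
    """
    adj = np.asarray(adj, dtype=float)
    prior = np.asarray(pathway_prior, dtype=float)
    n = adj.shape[0]

    w = adj * (1.0 + boost * prior)
    col_sums = w.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # leave isolated nodes as zero columns
    p_trans = w / col_sums                 # column-stochastic transition matrix

    p0 = np.zeros(n)
    p0[list(seeds)] = 1.0 / len(seeds)     # uniform restart distribution
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (p_trans @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p
```

    Genes with high stationary probability lie close to the seed genes along strongly co-expressed, pathway-supported edges, which is one simple way to rank candidate genes and gene-gene links.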

  16. Coexpression and Secretion of Endoglucanase and Phytase Genes in Lactobacillus reuteri

    PubMed Central

    Wang, Lei; Yang, Yuxin; Cai, Bei; Cao, Pinghua; Yang, Mingming; Chen, Yulin

    2014-01-01

    A multifunctional transgenic Lactobacillus with probiotic characteristics and an ability to degrade β-glucan and phytic acid (phytate) was engineered to improve nutrient utilization, increase production performance and decrease digestive diseases in broiler chickens. The Bacillus subtilis WL001 endoglucanase gene (celW) and Aspergillus fumigatus WL002 phytase gene (phyW) mature peptide (phyWM) were cloned into an expression vector with the lactate dehydrogenase promoter of Lactobacillus casei and the secretion signal peptide of the Lactococcus lactis usp45 gene. This construct was then transformed into Lactobacillus reuteri XC1 that had been isolated from the gastrointestinal tract of broilers. Heterologous enzyme production and feed effectiveness of this genetically modified L. reuteri strain were investigated and evaluated. Sodium dodecyl sulfate polyacrylamide gel electrophoresis analysis showed that the molecular mass of phyWM and celW was approximately 48.2 and 55 kDa, respectively, consistent with their predicted molecular weights. Endoglucanase and phytase activities in the extracellular fraction of the transformed L. reuteri culture were 0.68 and 0.42 U/mL, respectively. Transformed L. reuteri improved the feed conversion ratio of broilers from 21 to 42 days of age and over the whole feeding period. However, there was no effect on body weight gain and feed intake of chicks. Transformed L. reuteri supplementation improved levels of ash, calcium and phosphorus in tibiae at day 21 and of phosphorus at day 42. In addition, populations of Escherichia coli, Veillonella spp. and Bacteroides vulgatus were decreased, while populations of Bifidobacterium genus and Lactobacillus spp. were increased in the cecum at day 21. PMID:25050780

  17. Coexpression and secretion of endoglucanase and phytase genes in Lactobacillus reuteri.

    PubMed

    Wang, Lei; Yang, Yuxin; Cai, Bei; Cao, Pinghua; Yang, Mingming; Chen, Yulin

    2014-07-21

    A multifunctional transgenic Lactobacillus with probiotic characteristics and an ability to degrade β-glucan and phytic acid (phytate) was engineered to improve nutrient utilization, increase production performance and decrease digestive diseases in broiler chickens. The Bacillus subtilis WL001 endoglucanase gene (celW) and Aspergillus fumigatus WL002 phytase gene (phyW) mature peptide (phyWM) were cloned into an expression vector with the lactate dehydrogenase promoter of Lactobacillus casei and the secretion signal peptide of the Lactococcus lactis usp45 gene. This construct was then transformed into Lactobacillus reuteri XC1 that had been isolated from the gastrointestinal tract of broilers. Heterologous enzyme production and feed effectiveness of this genetically modified L. reuteri strain were investigated and evaluated. Sodium dodecyl sulfate polyacrylamide gel electrophoresis analysis showed that the molecular mass of phyWM and celW was approximately 48.2 and 55 kDa, respectively, consistent with their predicted molecular weights. Endoglucanase and phytase activities in the extracellular fraction of the transformed L. reuteri culture were 0.68 and 0.42 U/mL, respectively. Transformed L. reuteri improved the feed conversion ratio of broilers from 21 to 42 days of age and over the whole feeding period. However, there was no effect on body weight gain and feed intake of chicks. Transformed L. reuteri supplementation improved levels of ash, calcium and phosphorus in tibiae at day 21 and of phosphorus at day 42. In addition, populations of Escherichia coli, Veillonella spp. and Bacteroides vulgatus were decreased, while populations of Bifidobacterium genus and Lactobacillus spp. were increased in the cecum at day 21.

  18. Production of natural fragrance aromatic acids by coexpression of trans-anethole oxygenase and p-anisaldehyde dehydrogenase genes of Pseudomonas putida JYR-1 in Escherichia coli.

    PubMed

    Han, Dongfei; Kurusarttra, Somwang; Ryu, Ji-Young; Kanaly, Robert A; Hur, Hor-Gil

    2012-12-05

    A gene encoding p-anisaldehyde dehydrogenase (PAADH), which catalyzes the oxidation of p-anisaldehyde to p-anisic acid, was identified to be clustered with the trans-anethole oxygenase (tao) gene in Pseudomonas putida JYR-1. Heterologously expressed PAADH in Escherichia coli catalyzed the oxidation of vanillin, veratraldehyde, and piperonal to the corresponding aromatic acids vanillic acid, veratric acid, and piperonylic acid, respectively. Coexpression of trans-anethole oxygenase (TAO) and PAADH in E. coli also resulted in the successful transformation of trans-anethole, isoeugenol, O-methyl isoeugenol, and isosafrole to p-anisic acid, vanillic acid, veratric acid, and piperonylic acid, respectively, which are compounds found in plants as secondary metabolites. Because of the relaxed substrate specificity and high transformation rates by coexpressed TAO and PAADH in E. coli, the engineered strain has potential to be applied in the fragrance industry.

  19. Genome-scale co-expression network comparison across Escherichia coli and Salmonella enterica serovar Typhimurium reveals significant conservation at the regulon level of local regulators despite their dissimilar lifestyles.

    PubMed

    Zarrineh, Peyman; Sánchez-Rodríguez, Aminael; Hosseinkhan, Nazanin; Narimani, Zahra; Marchal, Kathleen; Masoudi-Nejad, Ali

    2014-01-01

    Availability of genome-wide gene expression datasets provides the opportunity to study gene expression across different organisms under a plethora of experimental conditions. In our previous work, we developed an algorithm called COMODO (COnserved MODules across Organisms) that identifies conserved expression modules between two species. In the present study, we expanded COMODO to detect the co-expression conservation across three organisms by adapting the statistics behind it. We applied COMODO to study expression conservation/divergence between Escherichia coli, Salmonella enterica, and Bacillus subtilis. We observed that some parts of the regulatory interaction networks were conserved between E. coli and S. enterica, especially in the regulon of local regulators. However, such conservation was not observed between the regulatory interaction networks of B. subtilis and the two other species. We found co-expression conservation on a number of genes involved in quorum sensing, but almost no conservation for genes involved in pathogenicity across E. coli and S. enterica, which could partially explain their different lifestyles. We concluded that despite their different lifestyles, no significant rewiring has occurred at the level of local regulons involved for instance, and notable conservation can be detected in signaling pathways and stress sensing in the phylogenetically close species S. enterica and E. coli. Moreover, conservation of local regulons seems to depend on the evolutionary time of divergence across species, disappearing at larger distances as shown by the comparison with B. subtilis. Global regulons follow a different trend and show major rewiring even at the limited evolutionary distance that separates E. coli and S. enterica.

  20. Genome-Scale Co-Expression Network Comparison across Escherichia coli and Salmonella enterica Serovar Typhimurium Reveals Significant Conservation at the Regulon Level of Local Regulators Despite Their Dissimilar Lifestyles

    PubMed Central

    Zarrineh, Peyman; Sánchez-Rodríguez, Aminael; Hosseinkhan, Nazanin; Narimani, Zahra; Marchal, Kathleen; Masoudi-Nejad, Ali

    2014-01-01

    Availability of genome-wide gene expression datasets provides the opportunity to study gene expression across different organisms under a plethora of experimental conditions. In our previous work, we developed an algorithm called COMODO (COnserved MODules across Organisms) that identifies conserved expression modules between two species. In the present study, we expanded COMODO to detect the co-expression conservation across three organisms by adapting the statistics behind it. We applied COMODO to study expression conservation/divergence between Escherichia coli, Salmonella enterica, and Bacillus subtilis. We observed that some parts of the regulatory interaction networks were conserved between E. coli and S. enterica, especially in the regulon of local regulators. However, such conservation was not observed between the regulatory interaction networks of B. subtilis and the two other species. We found co-expression conservation on a number of genes involved in quorum sensing, but almost no conservation for genes involved in pathogenicity across E. coli and S. enterica, which could partially explain their different lifestyles. We concluded that despite their different lifestyles, no significant rewiring has occurred at the level of local regulons involved for instance, and notable conservation can be detected in signaling pathways and stress sensing in the phylogenetically close species S. enterica and E. coli. Moreover, conservation of local regulons seems to depend on the evolutionary time of divergence across species, disappearing at larger distances as shown by the comparison with B. subtilis. Global regulons follow a different trend and show major rewiring even at the limited evolutionary distance that separates E. coli and S. enterica. PMID:25101984

  1. Enhanced polyhydroxybutyrate (PHB) production via the coexpressed phaCAB and vgb genes controlled by arabinose P(BAD) promoter in Escherichia coli.

    PubMed

    Horng, Y-T; Chang, K-C; Chien, C-C; Wei, Y-H; Sun, Y-M; Soo, P-C

    2010-02-01

    To develop an approach to enhance polyhydroxybutyrate (PHB) production via the coexpressed phaCAB and vgb genes controlled by the arabinose P(BAD) promoter in Escherichia coli. The polyhydroxyalkanoate (PHA) synthesis operon (phaCAB) from Ralstonia eutropha was overexpressed under the regulation of the arabinose P(BAD) promoter in Escherichia coli, and the vgb gene encoding bacterial haemoglobin from Vitreoscilla stercoraria (VHb) was further cloned downstream of phaCAB to form an artificial operon. The cell dry weight (CDW), PHB content and PHB concentration were enhanced around 1.23-, 1.57-, and 1.93-fold, respectively, in the engineered cell harbouring phaCAB-vgb (SY-2) upon 1% arabinose induction compared with noninduction (0% arabinose). Furthermore, by using a recombinant strain harbouring a P(BAD) promoter-vgb construct along with a native promoter-phaCAB construct, the vgb expression level was shown to be positively correlated with PHB biosynthesis. The results explore the possibility of improving PHB production by fusing the genes phaCAB-vgb from different species under the arabinose regulation system in E. coli. They also demonstrate that an increase in VHb level enhances PHB production. We were successful in providing a new coexpression system for PHB synthesis in E. coli. This coexpression system can be regulated by the arabinose inducer, and is more stable and cheaper than other inducible systems (e.g. IPTG). Furthermore, it could be applied in many biotechnology or fermentation processes.

  2. Intraisolate Mitochondrial Genetic Polymorphism and Gene Variants Coexpression in Arbuscular Mycorrhizal Fungi

    PubMed Central

    Beaudet, Denis; de la Providencia, Ivan Enrique; Labridy, Manuel; Roy-Bolduc, Alice; Daubois, Laurence; Hijri, Mohamed

    2015-01-01

    Arbuscular mycorrhizal fungi (AMF) are multinucleated and coenocytic organisms, in which the extent of the intraisolate nuclear genetic variation has been a source of debate. Conversely, their mitochondrial genomes (mtDNAs) have appeared to be homogeneous within isolates in all next generation sequencing (NGS)-based studies. Although several lines of evidence have challenged mtDNA homogeneity in AMF, an extensive survey to investigate intraisolate allelic diversity has not previously been undertaken. In this study, we used a conventional polymerase chain reaction-based approach on selected mitochondrial regions with a high-fidelity DNA polymerase, followed by cloning and Sanger sequencing. Two isolates of Rhizophagus irregularis were used, one cultivated in vitro for several generations (DAOM-197198) and the other recently isolated from the field (DAOM-242422). At different loci in both isolates, we found intraisolate allelic variation within the mtDNA and in a single copy nuclear marker, which highlighted the presence of several nonsynonymous mutations in protein coding genes. We confirmed that some of this variation persisted in the transcriptome, giving rise to at least four distinct nad4 transcripts in DAOM-197198. We also detected the presence of numerous mitochondrial DNA copies within nuclear genomes (numts), providing insights to understand this important evolutionary process in AMF. Our study reveals that genetic variation in Glomeromycota is higher than what had been previously assumed and also suggests that it could have been grossly underestimated in most NGS-based AMF studies, both in mitochondrial and nuclear genomes, due to the presence of low-level mutations. PMID:25527836

  3. Identification of co-expressed gene signatures in mouse B1, marginal zone and B2 B-cell populations

    PubMed Central

    Mabbott, Neil A; Gray, David

    2014-01-01

    In mice, three major B-cell subsets have been identified with distinct functionalities: B1 B cells, marginal zone B cells and follicular B2 B cells. Here, we used the growing body of publicly available transcriptomics data to create an expression atlas of 84 gene expression microarray data sets of distinct mouse B-cell subsets. These data were subjected to network-based cluster analysis using BioLayout Express3D. Using this analysis tool, genes with related functions clustered together in discrete regions of the network graph and enabled the identification of transcriptional networks that underpinned the functional activity of distinct cell populations. Some gene clusters were expressed highly by most of the cell populations included in this analysis (such as those with activity related to house-keeping functions). Others contained genes with expression patterns specific to distinct B-cell subsets. While these clusters contained many genes typically associated with the activity of the cells they were specifically expressed in, many novel B-cell-subset-specific candidate genes were identified. A large number of uncharacterized genes were also represented in these B-cell lineage-specific clusters. Further analysis of the activities of these uncharacterized candidate genes will lead to the identification of novel B-cell lineage-specific transcription factors and regulators of B-cell function. We also analysed 36 microarray data sets from distinct human B-cell populations. These data showed that mouse and human germinal centre B cells shared similar transcriptional features, whereas mouse B1 B cells were distinct from proposed human B1 B cells. PMID:24032749

  4. Introduction: Cancer Gene Networks.

    PubMed

    Clarke, Robert

    2017-01-01

    Constructing, evaluating, and interpreting gene networks generally sits within the broader field of systems biology, which continues to emerge rapidly, particularly with respect to its application to understanding the complexity of signaling in the context of cancer biology. For the purposes of this volume, we take a broad definition of systems biology. Considering an organism or disease within an organism as a system, systems biology is the study of the integrated and coordinated interactions of the network(s) of genes, their variants both natural and mutated (e.g., polymorphisms, rearrangements, alternate splicing, mutations), their proteins and isoforms, and the organic and inorganic molecules with which they interact, to execute the biochemical reactions (e.g., as enzymes, substrates, products) that reflect the function of that system. Central to systems biology, and perhaps the only approach that can effectively manage the complexity of such systems, is the building of quantitative multiscale predictive models. The predictions of the models can vary substantially depending on the nature of the model and its input-output relationships. For example, a model may predict the outcome of a specific molecular reaction(s), a cellular phenotype (e.g., alive, dead, growth arrest, proliferation, and motility), a change in the respective prevalence of cell or subpopulations, a patient or patient subgroup outcome(s). Such models necessarily require computers. Computational modeling can be thought of as using machine learning and related tools to integrate the very high dimensional data generated from modern, high throughput omics technologies including genomics (next generation sequencing), transcriptomics (gene expression microarrays; RNAseq), metabolomics and proteomics (ultra high performance liquid chromatography, mass spectrometry), and "subomic" technologies to study the kinome, methylome, and others. Mathematical modeling can be thought of as the use of ordinary

  5. Enhanced resistance to Sclerotinia sclerotiorum in Brassica napus by co-expression of defensin and chimeric chitinase genes.

    PubMed

    Zarinpanjeh, Nasim; Motallebi, Mostafa; Zamani, Mohammad Reza; Ziaei, Mahboobeh

    2016-11-01

    Sclerotinia stem rot caused by Sclerotinia sclerotiorum is one of the major fungal diseases of Brassica napus L. To develop resistance against this fungal disease, the defensin gene from Raphanus sativus and chimeric chit42 from Trichoderma atroviride with a C-terminal fused chitin-binding domain from Serratia marcescens were co-expressed in canola via Agrobacterium-mediated transformation. Twenty transformants were confirmed to carry the two transgenes as detected by polymerase chain reaction (PCR), with 4.8 % transformation efficiency. The chitinase activity of PCR-positive transgenic plants was measured in the presence of colloidal chitin, and the five transgenic lines showing the highest chitinase activity were selected for checking the copy number of the transgenes through Southern blot hybridisation. Two plants carried a single copy of the transgenes, while the remainder carried either two or three copies of the transgenes. The antifungal activity of the two transgenic lines that carried a single copy of the transgenes (T4 and T10) was studied by a radial diffusion assay. It was observed that the constitutive expression of these transgenes in the T4 and T10 transgenic lines suppressed the growth of S. sclerotiorum by 49 % and 47 %, respectively. The two transgenic lines were then allowed to self-pollinate to produce the T2 generation. Greenhouse bioassays were performed on young leaves of the transgenic T2 plants by challenging them with S. sclerotiorum, and the results revealed that the expression of defensin and chimeric chitinase from a heterologous source in canola enhanced resistance against sclerotinia stem rot disease.

  6. Linking Gene Expression and Functional Network Data in Human Heart Failure

    PubMed Central

    Camargo, Anyela; Azuaje, Francisco

    2007-01-01

    Background Gene expression profiling and the analysis of protein-protein interaction (PPI) networks may support the identification of disease bio-markers and potential drug targets. Thus, a step forward in the development of systems approaches to medicine is the integrative analysis of these data sources in specific pathological conditions. We report such an integrative bioinformatics analysis in human heart failure (HF). A global PPI network in HF was assembled, which by itself represents a useful compendium of the current status of human HF-relevant interactions. This provided the basis for the analysis of interaction connectivity patterns in relation to a HF gene expression data set. Results Relationships between the significance of the differentiation of gene expression and connectivity degrees in the PPI network were established. In addition, relationships between gene co-expression and PPI network connectivity were analysed. Highly-connected proteins are not necessarily encoded by genes significantly differentially expressed. Genes that are not significantly differentially expressed may encode proteins that exhibit diverse network connectivity patterns. Furthermore, genes that were not defined as significantly differentially expressed may encode proteins with many interacting partners. Genes encoding network hubs may exhibit weak co-expression with the genes encoding their interacting protein partners. We also found that hubs and superhubs display a significant diversity of co-expression patterns in comparison to peripheral nodes. Gene Ontology (GO) analysis established that highly-connected proteins are likely to be engaged in higher level GO biological process terms, while low-connectivity proteins tend to be engaged in more specific disease-related processes. 
    Conclusion This investigation supports the hypothesis that the integrative analysis of differential gene expression and PPI network analysis may facilitate a better understanding of functional roles

  7. Dynamic Visualization of Co-expression in Systems Genetics Data

    SciTech Connect

    New, Joshua Ryan; Huang, Jian; Chesler, Elissa J

    2008-01-01

    Biologists hope to address grand scientific challenges by exploring the abundance of data made available through modern microarray technology and other high-throughput techniques. The impact of this data, however, is limited unless researchers can effectively assimilate such complex information and integrate it into their daily research; interactive visualization tools are called for to support the effort. Specifically, typical studies of gene co-expression require novel visualization tools that enable the dynamic formulation and fine-tuning of hypotheses to aid the process of evaluating sensitivity of key parameters. These tools should allow biologists to develop an intuitive understanding of the structure of biological networks and discover genes which reside in critical positions in networks and pathways. By using a graph as a universal data representation of correlation in gene expression data, our novel visualization tool employs several techniques that when used in an integrated manner provide innovative analytical capabilities. Our tool for interacting with gene co-expression data integrates techniques such as: graph layout, qualitative subgraph extraction through a novel 2D user interface, quantitative subgraph extraction using graph-theoretic algorithms or by querying an optimized b-tree, dynamic level-of-detail graph abstraction, and template-based fuzzy classification using neural networks. We demonstrate our system using a real-world workflow from a large-scale, systems genetics study of mammalian gene co-expression.
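
    The graph representation and graph-theoretic subgraph extraction mentioned above can be illustrated with a small sketch. It is not the authors' tool: the function names, the Pearson correlation measure, the 0.9 cutoff, and the toy expression values are all assumptions chosen for illustration.

```python
from collections import deque

import numpy as np


def correlation_graph(expression, names, threshold):
    """Adjacency sets linking genes whose |Pearson r| meets the threshold."""
    corr = np.corrcoef(expression)  # rows are genes, columns are samples
    graph = {name: set() for name in names}
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                graph[names[i]].add(names[j])
                graph[names[j]].add(names[i])
    return graph


def connected_subgraph(graph, seed):
    """Breadth-first search: every gene reachable from the seed gene."""
    seen, queue = {seed}, deque([seed])
    while queue:
        for neighbor in graph[queue.popleft()]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical expression values (4 genes x 5 samples), not study data.
names = ["a", "b", "c", "d"]
expression = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],    # perfectly correlated with "a"
    [1.0, -1.0, 0.0, -1.0, 1.0],   # uncorrelated with "a" and "b"
    [-1.0, 1.0, 0.0, 1.0, -1.0],   # perfectly anti-correlated with "c"
])
graph = correlation_graph(expression, names, threshold=0.9)
component_a = connected_subgraph(graph, "a")
component_c = connected_subgraph(graph, "c")
print(component_a, component_c)
```

    Taking the absolute value of the correlation treats strongly anti-correlated genes as connected; an interactive tool would let the user adjust the cutoff, which is fixed here.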

  8. Stress-induced co-expression of two alternative oxidase (VuAox1 and 2b) genes in Vigna unguiculata.

    PubMed

    Costa, José Hélio; Mota, Erika Freitas; Cambursano, Mariana Virginia; Lauxmann, Martin Alexander; de Oliveira, Luciana Maia Nogueira; Silva Lima, Maria da Guia; Orellano, Elena Graciela; Fernandes de Melo, Dirce

    2010-05-01

    Cowpea (Vigna unguiculata) alternative oxidase is encoded by a small multigene family (Aox1, 2a and 2b) that is orthologous to the soybean Aox family. Like most of the identified Aox genes in plants, VuAox1 and VuAox2 consist of 4 exons interrupted by 3 introns. Alignment of the orthologous Aox genes revealed high identity of exons and intron variability, which is more prevalent in Aox1. In order to determine Aox gene expression in V. unguiculata, a steady-state analysis of transcripts involved in seed development (flowers, pods and dry seeds) and germination (soaked seeds) was performed and systemic co-expression of VuAox1 and VuAox2b was observed during germination. The analysis of Aox transcripts in leaves from seedlings under different stress conditions (cold, PEG, salicylate and H2O2) revealed stress-induced co-expression of both VuAox genes. Transcripts of VuAox2a and 2b were detected in all control seedlings, which was not the case for VuAox1 mRNA. Estimation of the primary transcript lengths of V. unguiculata and soybean Aox genes showed an intron length reduction for VuAox1 and 2b, suggesting that the two genes have converged in transcribed sequence length. Indeed, a bioinformatics analysis of VuAox1 and 2b promoters revealed a conserved region related to a cis-element that is responsive to oxidative stress. Taken together, the data provide evidence for co-expression of Aox1 and Aox2b in response to stress and also during the early phase of seed germination. The dual nature of VuAox2b expression (constitutive and induced) suggests that the constitutive Aox2b gene of V. unguiculata has acquired inducible regulatory elements.

  9. A SoxC gene related to larval shell development and co-expression analysis of different shell formation genes in early larvae of oyster.

    PubMed

    Liu, Gang; Huan, Pin; Liu, Baozhong

    2017-06-01

    Among the potential larval shell formation genes in mollusks, most are expressed in cells surrounding the shell field during the early phase of shell formation. The only exception (cgi-tyr1) is expressed in the whole larval mantle and thus represents a novel type of expression pattern. This study reports another gene with such an expression pattern. The gene encoded a SoxC homolog of the Pacific oyster Crassostrea gigas and was named cgi-soxc. Whole-mount in situ hybridization revealed that the gene was highly expressed in the whole larval mantle of early larvae. Based on its spatiotemporal expression, cgi-soxc is hypothesized to be involved in periostracum biogenesis, biomineralization, and regulation of cell proliferation. Furthermore, we investigated the interrelationship between cgi-soxc expression and two additional potential shell formation genes, cgi-tyr1 and cgi-gata2/3. The results confirmed co-expression of the three genes in the larval mantle of early D-veliger. Nevertheless, cgi-gata2/3 was only expressed in the mantle edge, and the other two genes were expressed in all mantle cells. Based on the spatial expression patterns of the three genes, two cell groups were identified from the larval mantle (tyr1 (+)/soxc (+)/gata2/3 (+) cells and tyr1 (+)/soxc (+)/gata2/3 (-) cells) and are important to study the differentiation and function of this tissue. The results of this study enrich our knowledge on the structure and function of larval mantle and provide important information to understand the molecular mechanisms of larval shell formation.

  10. Co-expressed differentially expressed genes and long non-coding RNAs involved in the celecoxib treatment of gastric cancer: An RNA sequencing analysis

    PubMed Central

    Song, Bin; Du, Juan; Feng, Ye; Gao, Yong-Jian; Zhao, Ji-Sheng

    2016-01-01

    The aim of the present study was to investigate the mechanisms of long non-coding RNAs (lncRNAs) in a gastric cancer cell line treated with celecoxib. The human gastric carcinoma cell line NCI-N87 was treated with 15 µM celecoxib for 72 h (celecoxib group) and an equal volume of dimethylsulfoxide (control group), respectively. Libraries were constructed by NEBNext Ultra RNA Library Prep kit for Illumina. Paired-end RNA sequencing reads were aligned to a human hg19 reference genome using TopHat2. Differentially expressed genes (DEGs) and lncRNAs were identified using Cuffdiff. Enrichment analysis was performed using GO-function package and KEGG profile in Bioconductor. A protein-protein interaction network was constructed using STRING database and module analysis was performed using ClusterONE plugin of Cytoscape. ATP5G1, ATP5G3, COX8A, CYC1, NDUFS3, UQCRC1, UQCRC2 and UQCRFS1 were enriched in the oxidative phosphorylation pathway. CXCL1, CXCL3, CXCL5 and CXCL8 were enriched in the chemokine signaling and cytokine-cytokine receptor interaction pathways. ITGA3, ITGA6, ITGB4, ITGB5, ITGB6 and ITGB8 were enriched in the integrin-mediated signaling pathway. DEGs co-expressed with lnc-SCD-1:13, lnc-LRR1-1:2, lnc-PTMS-1:3, lnc-S100P-3:1, lnc-AP000974.1-1:1 and lnc-RAB3IL1-2:1 were enriched in the pathways associated with cancer, such as the basal cell carcinoma pathway in cancer. In conclusion, these DEGs and differentially expressed lncRNAs may be important in the celecoxib treatment of gastric cancer. PMID:27698747

  11. Gene co-expression network analysis identifies porcine genes associated with variation in Salmonella shedding

    USDA-ARS's Scientific Manuscript database

    Salmonella enterica serovar Typhimurium is a gram-negative bacterium that can colonize the gut of humans and several species of food producing farm animals to cause enteric or septicaemic salmonellosis. While many studies have looked into the host genetic response to Salmonella infection, relatively...

  12. Co-expression of interleukin 12 enhances antitumor effects of a novel chimeric promoter-mediated suicide gene therapy in an immunocompetent mouse model

    SciTech Connect

    Xu, Yu; Liu, Zhengchun; Kong, Haiyan; Sun, Wenjie; Liao, Zhengkai; Zhou, Fuxiang; Xie, Conghua; and others

    2011-09-09

    Highlights: A novel chimeric promoter consisting of CArG element and hTERT promoter was developed. The promoter was characterized with radiation-inducibility and tumor-specificity. Suicide gene system driven by the promoter showed remarkable cytotoxicity in vitro. Co-expression of IL12 enhanced the promoter mediated suicide gene therapy in vivo. Abstract: The human telomerase reverse transcriptase (hTERT) promoter has been widely used in target gene therapy of cancer. However, low transcriptional activity limited its clinical application. Here, we designed a novel dual radiation-inducible and tumor-specific promoter system consisting of CArG elements and the hTERT promoter, resulting in increased expression of reporter genes after gamma-irradiation. Therapeutic and side effects of adenovirus-mediated horseradish peroxidase (HRP)/indole-3-acetic acid (IAA) system downstream of the chimeric promoter were evaluated in mice bearing Lewis lung carcinoma, combined with or without adenovirus-mediated interleukin 12 (IL12) gene driven by the cytomegalovirus promoter. The combination treatment showed more effective suppression of tumor growth than either agent alone, being associated with pronounced intratumoral T-lymphocyte infiltration and minor side effects. Our results suggest that the combination treatment with HRP/IAA system driven by the novel chimeric promoter and the co-expression of IL12 might be an effective and safe target gene therapy strategy of cancer.

  13. How difficult is inference of mammalian causal gene regulatory networks?

    PubMed

    Djordjevic, Djordje; Yang, Andrian; Zadoorian, Armella; Rungrugeecharoen, Kevin; Ho, Joshua W K

    2014-01-01

    Gene regulatory networks (GRNs) play a central role in systems biology, especially in the study of mammalian organ development. One key question remains largely unanswered: Is it possible to infer mammalian causal GRNs using observable gene co-expression patterns alone? We assembled two mouse GRN datasets (embryonic tooth and heart) and matching microarray gene expression profiles to systematically investigate the difficulties of mammalian causal GRN inference. The GRNs were assembled based on > 2,000 pieces of experimental genetic perturbation evidence from manually reading > 150 primary research articles. Each piece of perturbation evidence records the qualitative change of the expression of one gene following knock-down or over-expression of another gene. Our data have thorough annotation of tissue types and embryonic stages, as well as the type of regulation (activation, inhibition and no effect), which uniquely allows us to estimate both sensitivity and specificity of the inference of tissue specific causal GRN edges. Using these unprecedented datasets, we found that gene co-expression does not reliably distinguish true positive from false positive interactions, making inference of GRN in mammalian development very difficult. Nonetheless, if we have expression profiling data from genetic or molecular perturbation experiments, such as gene knock-out or signalling stimulation, it is possible to use the set of differentially expressed genes to recover causal regulatory relationships with good sensitivity and specificity. Our result supports the importance of using perturbation experimental data in causal network reconstruction. Furthermore, we showed that causal gene regulatory relationship can be highly cell type or developmental stage specific, suggesting the importance of employing expression profiles from homogeneous cell populations. This study provides essential datasets and empirical evidence to guide the development of new GRN inference methods for

  14. Single-nucleotide polymorphism-gene intermixed networking reveals co-linkers connected to multiple gene expression phenotypes

    PubMed Central

    Gong, Bin-Sheng; Zhang, Qing-Pu; Zhang, Guang-Mei; Zhang, Shao-Jun; Zhang, Wei; Lv, Hong-Chao; Zhang, Fan; Lv, Sa-Li; Li, Chuan-Xing; Rao, Shao-Qi; Li, Xia

    2007-01-01

    Gene expression profiles and single-nucleotide polymorphism (SNP) profiles are modern data for genetic analysis. It is possible to use the two types of information to analyze the relationships among genes by some genetical genomics approaches. In this study, gene expression profiles were used as expression traits, and relationships among genes that were co-linked to a common SNP(s) were identified by integrating the two types of information. Further research on the co-expressions among the co-linked genes was carried out after the gene-SNP relationships were established using the Haseman-Elston sib-pair regression. The results showed that the co-expressions among the co-linked genes were significantly higher if the number of connections between the genes and a SNP(s) was more than six. Then, the genes were interconnected via one or more SNP co-linkers to construct a gene-SNP intermixed network. The genes sharing more SNPs tended to have a stronger correlation. Finally, a gene-gene network was constructed with their intensities of relationships (the number of SNP co-linkers shared) as the weights for the edges. PMID:18466544
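
    The final step described above, a gene-gene network whose edge weights count the SNP co-linkers two genes share, can be sketched as follows. The function name and the (gene, SNP) pairs are hypothetical; the linkage step itself (Haseman-Elston regression) is not reproduced here and its output is simply assumed as input.

```python
from collections import defaultdict
from itertools import combinations


def build_gene_network(gene_snp_links):
    """Return {(gene_a, gene_b): number of SNP co-linkers shared}."""
    # Group genes by the SNP they are linked to.
    genes_by_snp = defaultdict(set)
    for gene, snp in gene_snp_links:
        genes_by_snp[snp].add(gene)

    # Each SNP shared by a pair of genes adds one unit of edge weight.
    weights = defaultdict(int)
    for genes in genes_by_snp.values():
        for a, b in combinations(sorted(genes), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical linkage results as (gene, SNP) pairs; not data from the study.
links = [
    ("G1", "snp1"), ("G2", "snp1"), ("G3", "snp1"),
    ("G1", "snp2"), ("G2", "snp2"),
    ("G4", "snp3"),
]
network = build_gene_network(links)
print(network)
```

    Gene pairs are stored in sorted order so that each undirected edge appears once; a gene linked only to an unshared SNP (G4 here) contributes no edge.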

  15. Reverse engineering transcriptional gene networks.

    PubMed

    Belcastro, Vincenzo; di Bernardo, Diego

    2014-01-01

    The aim of this chapter is a step-by-step guide on how to infer gene networks from gene expression profiles. The definition of a gene network is given in Subheading 1, where the different types of networks are discussed. The chapter then guides the readers through a data-gathering process in order to build a compendium of gene expression profiles from a public repository. Gene expression profiles are then discretized and a statistical relationship between genes, called mutual information (MI), is computed. Gene pairs with insignificant MI scores are then discarded by applying one of the described pruning steps. The retained relationships are then used to build up a Boolean adjacency matrix used as input for a clustering algorithm to divide the network into modules (or communities). The gene network can then be used as a hypothesis generator for discovering gene function and analyzing gene signatures. Some case studies are presented, and an online web-tool called Netview is described.
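
    The pipeline outlined above (discretize profiles, compute pairwise mutual information, prune weak pairs, build a Boolean adjacency matrix) can be sketched in a few lines. This is a minimal illustration, not the chapter's procedure: equal-width binning into three bins and a fixed MI cutoff stand in for the discretization and pruning steps the chapter describes, and all function names and values are assumptions.

```python
import numpy as np


def discretize(profile, n_bins=3):
    """Map a continuous expression profile to equal-width bin indices."""
    edges = np.linspace(profile.min(), profile.max(), n_bins + 1)
    # Use interior edges only so that indices fall in 0..n_bins-1.
    return np.digitize(profile, edges[1:-1])


def mutual_information(x, y, n_bins=3):
    """Mutual information (in bits) between two discretized profiles."""
    joint = np.zeros((n_bins, n_bins))
    for i, j in zip(x, y):
        joint[i, j] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of x, shape (n_bins, 1)
    py = joint.sum(axis=0, keepdims=True)  # marginal of y, shape (1, n_bins)
    nonzero = joint > 0
    ratio = joint[nonzero] / (px @ py)[nonzero]
    return float(np.sum(joint[nonzero] * np.log2(ratio)))


def mi_adjacency(expression, threshold, n_bins=3):
    """Boolean adjacency matrix: True where pairwise MI exceeds threshold."""
    binned = [discretize(row, n_bins) for row in expression]
    n = len(binned)
    adj = np.zeros((n, n), dtype=bool)
    for a in range(n):
        for b in range(a + 1, n):
            if mutual_information(binned[a], binned[b], n_bins) > threshold:
                adj[a, b] = adj[b, a] = True
    return adj


# Hypothetical compendium: 3 genes x 9 samples, not data from the chapter.
expression = np.array([
    [0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 10.0, 10.0, 10.0],
    [1.0, 1.0, 1.0, 11.0, 11.0, 11.0, 21.0, 21.0, 21.0],  # tracks gene 0
    [0.0, 5.0, 10.0, 0.0, 5.0, 10.0, 0.0, 5.0, 10.0],     # independent of gene 0
])
adjacency = mi_adjacency(expression, threshold=0.5)
print(adjacency)
```

    The resulting matrix is the kind of input a clustering algorithm would take to split the network into modules; that step is left out of the sketch.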

  16. Dynamic visualization of coexpression in systems genetics data.

    PubMed

    New, Joshua; Kendall, Wesley; Huang, Jian; Chesler, Elissa

    2008-01-01

    Biologists hope to address grand scientific challenges by exploring the abundance of data made available through modern microarray technology and other high-throughput techniques. The impact of this data, however, is limited unless researchers can effectively assimilate such complex information and integrate it into their daily research; interactive visualization tools are called for to support the effort. Specifically, typical studies of gene co-expression require novel visualization tools that enable the dynamic formulation and fine-tuning of hypotheses to aid the process of evaluating sensitivity of key parameters. These tools should allow biologists to develop an intuitive understanding of the structure of biological networks and discover genes residing in critical positions in networks and pathways. By using a graph as a universal representation of correlation in gene expression, our system employs several techniques that when used in an integrated manner provide innovative analytical capabilities. Our tool for interacting with gene co-expression data integrates techniques such as: graph layout, qualitative subgraph extraction through a novel 2D user interface, quantitative subgraph extraction using graph-theoretic algorithms or by compound queries, dynamic level-of-detail abstraction, and template-based fuzzy classification. We demonstrate our system using a real-world workflow from a large-scale, systems genetics study of mammalian gene co-expression.

  17. Discovering Functional Modules across Diverse Maize Transcriptomes Using COB, the Co-Expression Browser

    PubMed Central

    Schaefer, Robert J.; Briskine, Roman; Springer, Nathan M.; Myers, Chad L.

    2014-01-01

    Tools that provide improved ability to relate genotype to phenotype have the potential to accelerate breeding for desired traits and to improve our understanding of the molecular variants that underlie phenotypes. The availability of large-scale gene expression profiles in maize provides an opportunity to advance our understanding of complex traits in this agronomically important species. We built co-expression networks based on genome-wide expression data from a variety of maize accessions as well as an atlas of different tissues and developmental stages. We demonstrate that these networks reveal clusters of genes that are enriched for known biological function and contain extensive structure which has yet to be characterized. Furthermore, we found that co-expression networks derived from developmental or tissue atlases as compared to expression variation across diverse accessions capture unique functions. To provide convenient access to these networks, we developed a public, web-based Co-expression Browser (COB), which enables interactive queries of the genome-wide networks. We illustrate the utility of this system through two specific use cases: one in which gene-centric queries are used to provide functional context for previously characterized metabolic pathways, and a second where lists of genes produced by mapping studies are further resolved and validated using co-expression networks. PMID:24922320

  18. A Genome-Wide Association Study for Culm Cellulose Content in Barley Reveals Candidate Genes Co-Expressed with Members of the CELLULOSE SYNTHASE A Gene Family

    PubMed Central

    Houston, Kelly; Burton, Rachel A.; Sznajder, Beata; Rafalski, Antoni J.; Dhugga, Kanwarpal S.; Mather, Diane E.; Taylor, Jillian; Steffenson, Brian J.; Waugh, Robbie; Fincher, Geoffrey B.

    2015-01-01

    Cellulose is a fundamentally important component of cell walls of higher plants. It provides a scaffold that allows the development and growth of the plant to occur in an ordered fashion. Cellulose also provides mechanical strength, which is crucial for both normal development and to enable the plant to withstand both abiotic and biotic stresses. We quantified the cellulose concentration in the culm of 288 two-rowed and 288 six-rowed spring type barley accessions that were part of the USDA funded barley Coordinated Agricultural Project (CAP) program in the USA. When the population structure of these accessions was analysed we identified six distinct populations, four of which we considered to be comprised of a sufficient number of accessions to be suitable for genome-wide association studies (GWAS). These lines had been genotyped with 3072 SNPs so we combined the trait and genetic data to carry out GWAS. The analysis allowed us to identify regions of the genome containing significant associations between molecular markers and cellulose concentration data, including one region cross-validated in multiple populations. To identify candidate genes we assembled the gene content of these regions and used these to query a comprehensive RNA-seq based gene expression atlas. This provided us with gene annotations and associated expression data across multiple tissues, which allowed us to formulate a supported list of candidate genes that regulate cellulose biosynthesis. Several regions identified by our analysis contain genes that are co-expressed with CELLULOSE SYNTHASE A (HvCesA) across a range of tissues and developmental stages. These genes are involved in both primary and secondary cell wall development. In addition, genes that have been previously linked with cellulose synthesis by biochemical methods, such as HvCOBRA, a gene of unknown function, were also associated with cellulose levels in the association panel. Our analyses provide new insights into the

  19. A network of genes regulated by light in cyanobacteria.

    PubMed

    Aurora, Rajeev; Hihara, Yukako; Singh, Abhay K; Pakrasi, Himadri B

    2007-01-01

    Oxygenic photosynthetic organisms require light for their growth and development. However, exposure to high light is detrimental to them. Using time series microarray data from a model cyanobacterium, Synechocystis 6803 transferred from low to high light, we generated a gene co-expression network. The network has twelve sub-networks connected hierarchically, each consisting of an interconnected hub-and-spoke architecture. Within each sub-network, edges formed between genes that recapitulate known pathways. Analysis of the expression profiles shows that the cells undergo a phase transition 6 hours post-shift to high light, characterized by a core sub-network. The core sub-network is enriched in proteins that (putatively) bind Fe-S clusters and proteins that mediate iron and sulfate homeostasis. At the center of this core is a sulfate permease, suggesting sulfate is rate limiting for cells grown in high light. To validate this novel finding, we demonstrate that cell growth is limited in sulfate-depleted medium in high light. This study highlights how understanding the organization of the networks can provide insights into the coordination of physiologic responses.

  20. A Consensus Network of Gene Regulatory Factors in the Human Frontal Lobe

    PubMed Central

    Berto, Stefano; Perdomo-Sabogal, Alvaro; Gerighausen, Daniel; Qin, Jing; Nowick, Katja

    2016-01-01

    Cognitive abilities, such as memory, learning, language, problem solving, and planning, involve the frontal lobe and other brain areas. Not much is known yet about the molecular basis of cognitive abilities, but it seems clear that cognitive abilities are determined by the interplay of many genes. One approach for analyzing the genetic networks involved in cognitive functions is to study the coexpression networks of genes with known importance for proper cognitive functions, such as genes that have been associated with cognitive disorders like intellectual disability (ID) or autism spectrum disorders (ASD). Because many of these genes are gene regulatory factors (GRFs) we aimed to provide insights into the gene regulatory networks active in the human frontal lobe. Using genome wide human frontal lobe expression data from 10 independent data sets, we first derived 10 individual coexpression networks for all GRFs including their potential target genes. We observed a high level of variability among these 10 independently derived networks, pointing out that relying on results from a single study can only provide limited biological insights. To instead focus on the most confident information from these 10 networks we developed a method for integrating such independently derived networks into a consensus network. This consensus network revealed robust GRF interactions that are conserved across the frontal lobes of different healthy human individuals. Within this network, we detected a strong central module that is enriched for 166 GRFs known to be involved in brain development and/or cognitive disorders. Interestingly, several hubs of the consensus network encode for GRFs that have not yet been associated with brain functions. Their central role in the network suggests them as excellent new candidates for playing an essential role in the regulatory network of the human frontal lobe, which should be investigated in future studies. PMID:27014338
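
    One simple way to picture the integration of independently derived networks is an edge-support vote. The sketch below is an assumption made for illustration and is not the integration method the authors developed: it keeps an undirected edge only if it appears in at least `min_support` of the input networks. The gene names and the three toy networks are placeholders (the study integrated 10).

```python
from collections import Counter


def consensus_network(networks, min_support):
    """Keep undirected edges present in at least `min_support` networks."""
    counts = Counter()
    for edges in networks:
        # frozenset makes (a, b) and (b, a) the same undirected edge;
        # the set comprehension counts each edge once per network.
        counts.update({frozenset(edge) for edge in edges})
    return {edge for edge, n in counts.items() if n >= min_support}


# Three hypothetical coexpression networks given as edge lists.
networks = [
    [("GRF1", "GRF2"), ("GRF2", "GRF3")],
    [("GRF2", "GRF1"), ("GRF3", "GRF4")],
    [("GRF1", "GRF2"), ("GRF2", "GRF3"), ("GRF4", "GRF5")],
]
consensus = consensus_network(networks, min_support=2)
print(consensus)
```

    Raising `min_support` trades coverage for confidence, which mirrors the abstract's point that edges seen in a single data set offer limited biological insight.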

  1. A Consensus Network of Gene Regulatory Factors in the Human Frontal Lobe.

    PubMed

    Berto, Stefano; Perdomo-Sabogal, Alvaro; Gerighausen, Daniel; Qin, Jing; Nowick, Katja

    2016-01-01

    Cognitive abilities, such as memory, learning, language, problem solving, and planning, involve the frontal lobe and other brain areas. Not much is known yet about the molecular basis of cognitive abilities, but it seems clear that cognitive abilities are determined by the interplay of many genes. One approach for analyzing the genetic networks involved in cognitive functions is to study the coexpression networks of genes with known importance for proper cognitive functions, such as genes that have been associated with cognitive disorders like intellectual disability (ID) or autism spectrum disorders (ASD). Because many of these genes are gene regulatory factors (GRFs) we aimed to provide insights into the gene regulatory networks active in the human frontal lobe. Using genome wide human frontal lobe expression data from 10 independent data sets, we first derived 10 individual coexpression networks for all GRFs including their potential target genes. We observed a high level of variability among these 10 independently derived networks, pointing out that relying on results from a single study can only provide limited biological insights. To instead focus on the most confident information from these 10 networks we developed a method for integrating such independently derived networks into a consensus network. This consensus network revealed robust GRF interactions that are conserved across the frontal lobes of different healthy human individuals. Within this network, we detected a strong central module that is enriched for 166 GRFs known to be involved in brain development and/or cognitive disorders. Interestingly, several hubs of the consensus network encode for GRFs that have not yet been associated with brain functions. Their central role in the network suggests them as excellent new candidates for playing an essential role in the regulatory network of the human frontal lobe, which should be investigated in future studies.

  2. Coexpression of platelet-derived growth factor (PDGF) and PDGF-receptor genes by primary human astrocytomas may contribute to their development and maintenance.

    PubMed Central

    Maxwell, M; Naber, S P; Wolfe, H J; Galanopoulos, T; Hedley-Whyte, E T; Black, P M; Antoniades, H N

    1990-01-01

    The present studies investigated the expression of the two PDGF genes (c-sis/PDGF-2 and PDGF-1) and the PDGF-receptor b gene (PDGF-R) in 34 primary human astrocytomas. Northern blot analysis demonstrated the coexpression of the c-sis/PDGF-2 protooncogene and the PDGF-R gene in all astrocytomas examined. The majority of the tumors also expressed the PDGF-1 gene. There was no correlation between the expression of the two PDGF genes. Nonmalignant human brain tissue expressed the PDGF-R and PDGF-1 genes but not the c-sis/PDGF-2 protooncogene. In situ hybridization of astrocytoma tissue localized the expression of the c-sis and PDGF-R mRNAs in tumor cells. Capillary endothelial cells also expressed c-sis mRNA. In contrast, nonmalignant human brain tissue expressed only PDGF-R mRNA but not c-sis/PDGF-2 mRNA. The coexpression of a potent mitogenic growth factor protooncogene (c-sis) and its receptor gene in astrocytoma tumor cells suggests the presence of an autocrine mechanism that may contribute to the development and maintenance of astrocytomas. The expression of c-sis mRNA in tumor cells but not in nonmalignant brain cells may serve as an additional diagnostic criterion for the detection of astrocytomas in small tissue specimens using in situ hybridization for the detection of c-sis mRNA and/or immunostaining for the recognition of its protein product. PMID:2164040

  3. Identification and Network-Enabled Characterization of Auxin Response Factor Genes in Medicago truncatula

    PubMed Central

    Burks, David J.; Azad, Rajeev K.

    2016-01-01

    The Auxin Response Factor (ARF) family of transcription factors is an important regulator of environmental response and symbiotic nodulation in the legume Medicago truncatula. While previous studies have identified members of this family, a recent spurt in gene expression data coupled with genome update and reannotation calls for a reassessment of the prevalence of ARF genes and their interaction networks in M. truncatula. We performed a comprehensive analysis of the M. truncatula genome and transcriptome that entailed search for novel ARF genes and the co-expression networks. Our investigation revealed 8 novel M. truncatula ARF (MtARF) genes, of the total 22 identified, and uncovered novel gene co-expression networks as well. Furthermore, the topological clustering and single enrichment analysis of several network models revealed the roles of individual members of the MtARF family in nitrogen regulation, nodule initiation, and post-embryonic development through a specialized protein packaging and secretory pathway. In summary, this study not just shines new light on an important gene family, but also provides a guideline for identification of new members of gene families and their functional characterization through network analyses. PMID:28018393

  4. Identification and Network-Enabled Characterization of Auxin Response Factor Genes in Medicago truncatula.

    PubMed

    Burks, David J; Azad, Rajeev K

    2016-01-01

    The Auxin Response Factor (ARF) family of transcription factors is an important regulator of environmental response and symbiotic nodulation in the legume Medicago truncatula. While previous studies have identified members of this family, a recent spurt in gene expression data coupled with genome update and reannotation calls for a reassessment of the prevalence of ARF genes and their interaction networks in M. truncatula. We performed a comprehensive analysis of the M. truncatula genome and transcriptome that entailed search for novel ARF genes and the co-expression networks. Our investigation revealed 8 novel M. truncatula ARF (MtARF) genes, of the total 22 identified, and uncovered novel gene co-expression networks as well. Furthermore, the topological clustering and single enrichment analysis of several network models revealed the roles of individual members of the MtARF family in nitrogen regulation, nodule initiation, and post-embryonic development through a specialized protein packaging and secretory pathway. In summary, this study not just shines new light on an important gene family, but also provides a guideline for identification of new members of gene families and their functional characterization through network analyses.

  5. Identification of a gene module associated with BMD through the integration of network analysis and genome-wide association data.

    PubMed

    Farber, Charles R

    2010-11-01

    Bone mineral density (BMD) is influenced by a complex network of gene interactions; therefore, elucidating the relationships between genes and how those genes, in turn, influence BMD is critical for developing a comprehensive understanding of osteoporosis. To investigate the role of transcriptional networks in the regulation of BMD, we performed a weighted gene coexpression network analysis (WGCNA) using microarray expression data on monocytes from young individuals with low or high BMD. WGCNA groups genes into modules based on patterns of gene coexpression, and our analysis identified 11 gene modules. We observed that the overall expression of one module (referred to as module 9) was significantly higher in the low-BMD group (p = .03). Module 9 was highly enriched for genes belonging to the immune system-related gene ontology (GO) category "response to virus" (p = 7.6 × 10⁻¹¹). Using publicly available genome-wide association study data, we independently validated the importance of module 9 by demonstrating that highly connected module 9 hubs were more likely, relative to less highly connected genes, to be genetically associated with BMD. This study highlights the advantages of systems-level analyses to uncover coexpression modules associated with bone mass and suggests that particular monocyte expression patterns may mediate differences in BMD.
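
    The abstract above describes grouping genes by coexpression and ranking them by how highly connected they are. The following is a minimal sketch of that idea, not the study's pipeline: it builds an unsigned soft-threshold adjacency (absolute Pearson correlation raised to a power) and sums each gene's edge weights to find a hub. The gene names, expression values, and the power beta = 6 are illustrative assumptions, and the module-detection step itself (clustering) is omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(i, j)| ** beta (beta is assumed)."""
    genes = list(expr)
    return {
        g: {h: abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g}
        for g in genes
    }


def connectivity(adj):
    """Connectivity of each gene: the sum of its edge weights."""
    return {gene: sum(neighbours.values()) for gene, neighbours in adj.items()}


# Hypothetical toy expression profiles (five samples per gene).
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.1, 2.9, 4.2, 5.1],
    "geneC": [2.0, 4.1, 6.0, 8.1, 9.9],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}

adj = soft_threshold_adjacency(toy_expr)
k = connectivity(adj)
hub = max(k, key=k.get)  # the most highly connected gene in the toy network
```

    Raising the correlation to a power suppresses weak correlations without imposing a hard cutoff, so in this toy data the loosely correlated geneD ends up with the lowest connectivity.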

  6. Neutralization of Bacterial YoeBSpn Toxicity and Enhanced Plant Growth in Arabidopsis thaliana via Co-Expression of the Toxin-Antitoxin Genes

    PubMed Central

    Abu Bakar, Fauziah; Yeo, Chew Chieng; Harikrishna, Jennifer Ann

    2016-01-01

    Bacterial toxin-antitoxin (TA) systems have various cellular functions, including as part of the general stress response. The genome of the Gram-positive human pathogen Streptococcus pneumoniae harbors several putative TA systems, including yefM-yoeBSpn, which is one of four systems that had been demonstrated to be biologically functional. Overexpression of the yoeBSpn toxin gene resulted in cell stasis and eventually cell death in its native host, as well as in Escherichia coli. Our previous work showed that induced expression of a yoeBSpn toxin-Green Fluorescent Protein (GFP) fusion gene apparently triggered apoptosis and was lethal in the model plant, Arabidopsis thaliana. In this study, we investigated the effects of co-expression of the yefMSpn antitoxin and yoeBSpn toxin-GFP fusion in transgenic A. thaliana. When co-expressed in Arabidopsis, the YefMSpn antitoxin was found to neutralize the toxicity of YoeBSpn-GFP. Interestingly, the inducible expression of both yefMSpn antitoxin and yoeBSpn toxin-GFP fusion in transgenic hybrid Arabidopsis resulted in larger rosette leaves and taller plants with a higher number of inflorescence stems and increased silique production. To our knowledge, this is the first demonstration of a prokaryotic antitoxin neutralizing its cognate toxin in plant cells. PMID:27104531

  7. Detection of gene communities in multi-networks reveals cancer drivers

    NASA Astrophysics Data System (ADS)

    Cantini, Laura; Medico, Enzo; Fortunato, Santo; Caselle, Michele

    2015-12-01

    We propose a new multi-network-based strategy to integrate different layers of genomic information and use them in a coordinated way to identify driving cancer genes. The multi-networks that we consider combine transcription factor co-targeting, microRNA co-targeting, protein-protein interaction and gene co-expression networks. The rationale behind this choice is that gene co-expression and protein-protein interactions require a tight coregulation of the partners and that such a fine-tuned regulation can be obtained only by combining both the transcriptional and post-transcriptional layers of regulation. To extract the relevant biological information from the multi-network we studied its partition into communities. To this end we applied a consensus clustering algorithm based on state-of-the-art community detection methods. Even if our procedure is valid in principle for any pathology, in this work we concentrate on gastric, lung, pancreas and colorectal cancer and identified from the enrichment analysis of the multi-network communities a set of candidate driver cancer genes. Some of them were already known oncogenes while a few are new. The combination of the different layers of information allowed us to extract from the multi-network indications on the regulatory pattern and functional role of both the already known and the new candidate driver genes.

  8. Functional interaction between co-expressed MAGE-A proteins

    PubMed Central

    Laiseca, Julieta E.; Ladelfa, María F.; Cotignola, Javier; Peche, Leticia Y.; Pascucci, Franco A.; Castaño, Bryan A.; Galigniana, Mario D.; Schneider, Claudio

    2017-01-01

    MAGE-A (Melanoma Antigen Genes-A) are tumor-associated proteins with expression in a broad spectrum of human tumors and normal germ cells. MAGE-A gene expression and function are being increasingly investigated to better understand the mechanisms by which MAGE proteins collaborate in tumorigenesis and whether their detection could be useful for disease prognosis purposes. Alterations in epigenetic mechanisms involved in MAGE gene silencing cause their frequent co-expression in tumor cells. Here, we have analyzed the effect of MAGE-A gene co-expression and our results suggest that MageA6 can potentiate the androgen receptor (AR) co-activation function of MageA11. Database search confirmed that MageA11 and MageA6 are co-expressed in human prostate cancer samples. We demonstrate that MageA6 and MageA11 form a protein complex resulting in the stabilization of MageA11 and consequently the enhancement of AR activity. The mechanism involves association of the Mage A6-MHD domain to MageA11, prevention of MageA11 ubiquitinylation on lysines 240 and 245 and decreased proteasome-dependent degradation. We experimentally demonstrate here for the first time that two MAGE-A proteins can act together in a non-redundant way to potentiate a specific oncogenic function. Overall, our results highlight the complexity of the MAGE gene networking in regulating cancer cell behavior. PMID:28542476

  9. Functional interaction between co-expressed MAGE-A proteins.

    PubMed

    Laiseca, Julieta E; Ladelfa, María F; Cotignola, Javier; Peche, Leticia Y; Pascucci, Franco A; Castaño, Bryan A; Galigniana, Mario D; Schneider, Claudio; Monte, Martin

    2017-01-01

    MAGE-A (Melanoma Antigen Genes-A) are tumor-associated proteins with expression in a broad spectrum of human tumors and normal germ cells. MAGE-A gene expression and function are being increasingly investigated to better understand the mechanisms by which MAGE proteins collaborate in tumorigenesis and whether their detection could be useful for disease prognosis purposes. Alterations in epigenetic mechanisms involved in MAGE gene silencing cause their frequent co-expression in tumor cells. Here, we have analyzed the effect of MAGE-A gene co-expression and our results suggest that MageA6 can potentiate the androgen receptor (AR) co-activation function of MageA11. Database search confirmed that MageA11 and MageA6 are co-expressed in human prostate cancer samples. We demonstrate that MageA6 and MageA11 form a protein complex resulting in the stabilization of MageA11 and consequently the enhancement of AR activity. The mechanism involves association of the Mage A6-MHD domain to MageA11, prevention of MageA11 ubiquitinylation on lysines 240 and 245 and decreased proteasome-dependent degradation. We experimentally demonstrate here for the first time that two MAGE-A proteins can act together in a non-redundant way to potentiate a specific oncogenic function. Overall, our results highlight the complexity of the MAGE gene networking in regulating cancer cell behavior.

  10. Enhancement of γ-aminobutyric acid production in recombinant Corynebacterium glutamicum by co-expressing two glutamate decarboxylase genes from Lactobacillus brevis.

    PubMed

    Shi, Feng; Jiang, Junjun; Li, Yongfu; Li, Youxin; Xie, Yilong

    2013-11-01

    γ-Aminobutyric acid (GABA), a non-protein amino acid, is a bioactive component in the food, feed and pharmaceutical fields. To establish an effective single-step production system for GABA, a recombinant Corynebacterium glutamicum strain co-expressing two glutamate decarboxylase (GAD) genes (gadB1 and gadB2) derived from Lactobacillus brevis Lb85 was constructed. Compared with the GABA production of the gadB1 or gadB2 single-expressing strains, GABA production by the gadB1-gadB2 co-expressing strain increased more than twofold. By optimising urea supplementation, the total production of L-glutamate and GABA increased from 22.57 ± 1.24 to 30.18 ± 1.33 g L⁻¹, and GABA production increased from 4.02 ± 0.95 to 18.66 ± 2.11 g L⁻¹ after 84-h cultivation. Under optimal urea supplementation, L-glutamate continued to be consumed, GABA continued to accumulate after 36 h of fermentation, and the pH level fluctuated. GABA production increased to a maximum level of 27.13 ± 0.54 g L⁻¹ after 120-h flask cultivation and 26.32 g L⁻¹ after 60-h fed-batch fermentation. The conversion ratio of L-glutamate to GABA reached 0.60-0.74 mol mol⁻¹. By co-expressing gadB1 and gadB2 and optimising the urea addition method, C. glutamicum was genetically improved for de novo biosynthesis of GABA from its own accumulated L-glutamate.

  11. Co-expressed immune and metabolic genes in visceral and subcutaneous adipose tissue from severely obese individuals are associated with plasma HDL and glucose levels: a microarray study.

    PubMed

    Wolfs, Marcel G M; Rensen, Sander S; Bruin-Van Dijk, Elinda J; Verdam, Froukje J; Greve, Jan-Willem; Sanjabi, Bahram; Bruinenberg, Marcel; Wijmenga, Cisca; van Haeften, Timon W; Buurman, Wim A; Franke, Lude; Hofker, Marten H

    2010-08-05

    Excessive accumulation of body fat, in particular in the visceral fat depot, is a major risk factor to develop a variety of diseases such as type 2 diabetes. The mechanisms underlying the increased risk of obese individuals to develop co-morbid diseases are largely unclear. We aimed to identify genes expressed in subcutaneous adipose tissue (SAT) and visceral adipose tissue (VAT) that are related to blood parameters involved in obesity co-morbidity, such as plasma lipid and glucose levels, and to compare gene expression between the fat depots. Whole-transcriptome SAT and VAT gene expression levels were determined in 75 individuals with a BMI > 35 kg/m². Modules of co-expressed genes likely to be functionally related were identified and correlated with BMI, plasma levels of glucose, insulin, HbA1c, triglycerides, non-esterified fatty acids, ALAT, ASAT, C-reactive protein, and LDL- and HDL cholesterol. Of the approximately 70 modules identified in SAT and VAT, three SAT modules were inversely associated with plasma HDL-cholesterol levels, and a fourth module was inversely associated with both plasma glucose and plasma triglyceride levels (p < 5.33 × 10⁻⁵). These modules were markedly enriched in immune and metabolic genes. In VAT, one module was associated with both BMI and insulin, and another with plasma glucose (p < 4.64 × 10⁻⁵). This module was also enriched in inflammatory genes and showed a marked overlap in gene content with the SAT modules related to HDL. Several genes differentially expressed in SAT and VAT were identified. In obese subjects, groups of co-expressed genes were identified that correlated with lipid and glucose metabolism parameters; they were enriched with immune genes. A number of genes were identified of which the expression in SAT correlated with plasma HDL cholesterol, while their expression in VAT correlated with plasma glucose. This underlines both the singular importance of these genes for lipid and glucose metabolism and the specific

  12. Buffering in cyclic gene networks

    NASA Astrophysics Data System (ADS)

    Glyzin, S. D.; Kolesov, A. Yu.; Rozov, N. Kh.

    2016-06-01

    We consider cyclic chains of unidirectionally coupled delay differential-difference equations that are mathematical models of artificial oscillating gene networks. We establish that the buffering phenomenon is realized in these systems for an appropriate choice of the parameters: any given finite number of stable periodic motions of a special type, the so-called traveling waves, coexist.

  13. Gene networks controlling petal organogenesis.

    PubMed

    Huang, Tengbo; Irish, Vivian F

    2016-01-01

    One of the biggest unanswered questions in developmental biology is how growth is controlled. Petals are an excellent organ system for investigating growth control in plants: petals are dispensable, have a simple structure, and are largely refractory to environmental perturbations that can alter their size and shape. In recent studies, a number of genes controlling petal growth have been identified. The overall picture of how such genes function in petal organogenesis is beginning to be elucidated. This review will focus on studies using petals as a model system to explore the underlying gene networks that control organ initiation, growth, and final organ morphology.

  14. Reconstruction of Gene Networks of Iron Response in Shewanella oneidensis

    SciTech Connect

    Yang, Yunfeng; Harris, Daniel P; Luo, Feng; Joachimiak, Marcin; Wu, Liyou; Dehal, Paramvir; Jacobsen, Janet; Yang, Zamin Koo; Gao, Haichun; Arkin, Adam; Palumbo, Anthony Vito; Zhou, Jizhong

    2009-01-01

    It is of great interest to study the iron response of the γ-proteobacterium Shewanella oneidensis since it possesses a high content of iron and is capable of utilizing iron for anaerobic respiration. We report here that the iron response in S. oneidensis is a rapid process. To gain more insights into the bacterial response to iron, temporal gene expression profiles were examined for iron depletion and repletion, resulting in identification of iron-responsive biological pathways in a gene co-expression network. Iron acquisition systems, including genes unique to S. oneidensis, were rapidly and strongly induced by iron depletion, and repressed by iron repletion. Some were required for iron depletion, as exemplified by the mutational analysis of the putative siderophore biosynthesis protein SO3032. Unexpectedly, a number of genes related to anaerobic energy metabolism were repressed by iron depletion and induced by repletion, which might be due to the iron storage potential of their protein products. Other iron-responsive biological pathways include protein degradation, aerobic energy metabolism and protein synthesis. Furthermore, sequence motifs enriched in gene clusters as well as their corresponding DNA-binding proteins (Fur, CRP and RpoH) were identified, resulting in a regulatory network of iron response in S. oneidensis. Together, this work provides an overview of iron response and reveals novel features in S. oneidensis, including Shewanella-specific iron acquisition systems, and suggests the intimate relationship between anaerobic energy metabolism and iron response.

  15. The Gene Network Underlying Hypodontia.

    PubMed

    Yin, W; Bian, Z

    2015-07-01

    Mammalian tooth development is a precise and complicated procedure. Several signaling pathways, such as nuclear factor (NF)-κB and WNT, are key regulators of tooth development. Any disturbance of these signaling pathways can potentially affect or block normal tooth development, and presently, there are more than 150 syndromes and 80 genes known to be related to tooth agenesis. Clarifying the interaction and crosstalk among these genes will provide important information regarding the mechanisms underlying missing teeth. In the current review, we summarize recently published findings on genes related to isolated and syndromic tooth agenesis; most of these genes function as positive regulators of cell proliferation or negative regulators of cell differentiation and apoptosis. Furthermore, we explore the corresponding networks involving these genes in addition to their implications for the clinical management of tooth agenesis. We conclude that this requires further study to improve patients' quality of life in the future. © International & American Associations for Dental Research 2015.

  16. A moth pheromone brewery: production of (Z)-11-hexadecenol by heterologous co-expression of two biosynthetic genes from a noctuid moth in a yeast cell factory

    PubMed Central

    2013-01-01

    Background: Moths (Lepidoptera) are highly dependent on chemical communication to find a mate. Compared to conventional unselective insecticides, synthetic pheromones have successfully served to lure male moths as a specific and environmentally friendly way to control important pest species. However, the chemical synthesis and purification of the sex pheromone components in large amounts is a difficult and costly task. The repertoire of enzymes involved in moth pheromone biosynthesis in insecta can be seen as a library of specific catalysts that can be used to facilitate the synthesis of a particular chemical component. In this study, we present a novel approach to effectively aid in the preparation of semi-synthetic pheromone components using an engineered vector co-expressing two key biosynthetic enzymes in a simple yeast cell factory. Results: We first identified and functionally characterized a ∆11 Fatty-Acyl Desaturase and a Fatty-Acyl Reductase from the Turnip moth, Agrotis segetum. The ∆11-desaturase produced predominantly Z11-16:acyl, a common pheromone component precursor, from the abundant yeast palmitic acid and the FAR transformed a series of saturated and unsaturated fatty acids into their corresponding alcohols which may serve as pheromone components in many moth species. Secondly, when we co-expressed the genes in the Brewer’s yeast Saccharomyces cerevisiae, a set of long-chain fatty acids and alcohols that are not naturally occurring in yeast were produced from inherent yeast fatty acids, and the presence of (Z)-11-hexadecenol (Z11-16:OH), demonstrated that both heterologous enzymes were active in concert. A 100 ml batch yeast culture produced on average 19.5 μg Z11-16:OH. Finally, we demonstrated that oxidized extracts from the yeast cells containing (Z)-11-hexadecenal and other aldehyde pheromone compounds elicited specific electrophysiological activity from male antennae of the Tobacco budworm, Heliothis virescens, supporting the idea that

  17. A moth pheromone brewery: production of (Z)-11-hexadecenol by heterologous co-expression of two biosynthetic genes from a noctuid moth in a yeast cell factory.

    PubMed

    Hagström, Åsa K; Wang, Hong-Lei; Liénard, Marjorie A; Lassance, Jean-Marc; Johansson, Tomas; Löfstedt, Christer

    2013-12-13

    Moths (Lepidoptera) are highly dependent on chemical communication to find a mate. Compared to conventional unselective insecticides, synthetic pheromones have successfully served to lure male moths as a specific and environmentally friendly way to control important pest species. However, the chemical synthesis and purification of the sex pheromone components in large amounts is a difficult and costly task. The repertoire of enzymes involved in moth pheromone biosynthesis in insecta can be seen as a library of specific catalysts that can be used to facilitate the synthesis of a particular chemical component. In this study, we present a novel approach to effectively aid in the preparation of semi-synthetic pheromone components using an engineered vector co-expressing two key biosynthetic enzymes in a simple yeast cell factory. We first identified and functionally characterized a ∆11 Fatty-Acyl Desaturase and a Fatty-Acyl Reductase from the Turnip moth, Agrotis segetum. The ∆11-desaturase produced predominantly Z11-16:acyl, a common pheromone component precursor, from the abundant yeast palmitic acid and the FAR transformed a series of saturated and unsaturated fatty acids into their corresponding alcohols which may serve as pheromone components in many moth species. Secondly, when we co-expressed the genes in the Brewer's yeast Saccharomyces cerevisiae, a set of long-chain fatty acids and alcohols that are not naturally occurring in yeast were produced from inherent yeast fatty acids, and the presence of (Z)-11-hexadecenol (Z11-16:OH), demonstrated that both heterologous enzymes were active in concert. A 100 ml batch yeast culture produced on average 19.5 μg Z11-16:OH. Finally, we demonstrated that oxidized extracts from the yeast cells containing (Z)-11-hexadecenal and other aldehyde pheromone compounds elicited specific electrophysiological activity from male antennae of the Tobacco budworm, Heliothis virescens, supporting the idea that genes from different

  18. Plant Evolution: Evolving Antagonistic Gene Regulatory Networks.

    PubMed

    Cooper, Endymion D

    2016-06-20

    Developing a structurally complex phenotype requires a complex regulatory network. A new study shows how gene duplication provides a potential source of antagonistic interactions, an important component of gene regulatory networks.

  19. Gene networks and liar paradoxes.

    PubMed

    Isalan, Mark

    2009-10-01

    Network motifs are small patterns of connections, found over-represented in gene regulatory networks. An example is the negative feedback loop (e.g. factor A represses itself). This opposes its own state so that when 'on' it tends towards 'off' - and vice versa. Here, we argue that such self-opposition, if considered dimensionlessly, is analogous to the liar paradox: 'This statement is false'. When 'true' it implies 'false' - and vice versa. Such logical constructs have provided philosophical consternation for over 2000 years. Extending the analogy, other network topologies give strikingly varying outputs over different dimensions. For example, the motif 'A activates B and A. B inhibits A' can give switches or oscillators with time only, or can lead to Turing-type patterns with both space and time (spots, stripes or waves). It is argued here that the dimensionless form reduces to a variant of 'The following statement is true. The preceding statement is false'. Thus, merely having a static topological description of a gene network can lead to a liar paradox. Network diagrams are only snapshots of dynamic biological processes and apparent paradoxes can reveal important biological mechanisms that are far from paradoxical when considered explicitly in time and space.
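
    The self-repression example in the abstract above can be illustrated with a one-variable Boolean model. This is a minimal sketch under assumed simplifications (a single factor A, synchronous updates, one discrete time step per update), not a model taken from the paper. Read statically, "A is on exactly when A is off" has no consistent solution; read as an update rule over time, the same statement gives an oscillation.

```python
def consistent_static_states():
    """States satisfying A == (not A) when the loop is read without time."""
    return [a for a in (True, False) if a == (not a)]


def simulate_negative_feedback(initial_state, steps):
    """Read the loop as an update rule: A(t + 1) = not A(t)."""
    states = [initial_state]
    for _ in range(steps):
        states.append(not states[-1])
    return states


static_solutions = consistent_static_states()       # no consistent assignment
trajectory = simulate_negative_feedback(True, 5)    # alternates on/off over time
```

    The empty list of static solutions is the "paradox"; the alternating trajectory is what the same loop does once time is made explicit.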

  20. Gene networks and liar paradoxes

    PubMed Central

    Isalan, Mark

    2009-01-01

    Network motifs are small patterns of connections, found over-represented in gene regulatory networks. An example is the negative feedback loop (e.g. factor A represses itself). This opposes its own state so that when ‘on’ it tends towards ‘off’ – and vice versa. Here, we argue that such self-opposition, if considered dimensionlessly, is analogous to the liar paradox: ‘This statement is false’. When ‘true’ it implies ‘false’ – and vice versa. Such logical constructs have provided philosophical consternation for over 2000 years. Extending the analogy, other network topologies give strikingly varying outputs over different dimensions. For example, the motif ‘A activates B and A. B inhibits A’ can give switches or oscillators with time only, or can lead to Turing-type patterns with both space and time (spots, stripes or waves). It is argued here that the dimensionless form reduces to a variant of ‘The following statement is true. The preceding statement is false’. Thus, merely having a static topological description of a gene network can lead to a liar paradox. Network diagrams are only snapshots of dynamic biological processes and apparent paradoxes can reveal important biological mechanisms that are far from paradoxical when considered explicitly in time and space. PMID:19722183

  1. Co-expression analysis identifies CRC and AP1 the regulator of Arabidopsis fatty acid biosynthesis.

    PubMed

    Han, Xinxin; Yin, Linlin; Xue, Hongwei

    2012-07-01

    Fatty acids (FAs) play crucial roles in signal transduction and plant development; however, the regulation of FA metabolism is still poorly understood. To study the relevant regulatory network, fifty-eight FA biosynthesis genes including de novo synthases, desaturases and elongases were selected as "guide genes" to construct the co-expression network. Calculation of the correlation between all Arabidopsis thaliana (L.) genes with each guide gene by Arabidopsis co-expression data mining tools (ACT) identified 797 candidate FA-correlated genes. Gene ontology (GO) analysis of these co-expressed genes showed they are tightly correlated to photosynthesis and carbohydrate metabolism, and function in many processes. Interestingly, 63 transcription factors (TFs) were identified as candidate FA biosynthesis regulators and 8 TF families are enriched. Two TF genes, CRC and AP1, both correlating with 8 FA guide genes, were further characterized. Analyses of the ap1 and crc mutants showed altered total FA composition of mature seeds. The contents of palmitoleic acid, stearic acid, arachidic acid and eicosadienoic acid are decreased, whereas that of oleic acid is increased in ap1 and crc seeds, which is consistent with the qRT-PCR analysis revealing the suppressed expression of the corresponding guide genes. In addition, yeast one-hybrid analysis and electrophoretic mobility shift assay (EMSA) revealed that CRC can bind to the promoter regions of KCS7 and KCS15, indicating that CRC may directly regulate FA biosynthesis. © 2012 Institute of Botany, Chinese Academy of Sciences.
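
    The guide-gene screen described above (correlate every gene with each guide gene, then count how many guide genes a candidate correlates with) can be sketched as follows. This is an illustration only and does not reproduce the ACT tool: the gene names, expression values, and the 0.8 correlation cutoff are hypothetical, and only positive correlations are counted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm = sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    return cov / norm


def count_correlated_guides(candidates, guides, threshold=0.8):
    """For each candidate gene, count guide genes with correlation >= threshold."""
    return {
        name: sum(pearson(profile, g) >= threshold for g in guides.values())
        for name, profile in candidates.items()
    }


# Hypothetical expression profiles across five conditions.
guide_genes = {
    "guide1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "guide2": [2.0, 1.0, 4.0, 3.0, 6.0],
}
candidate_tfs = {
    "tfX": [1.5, 1.4, 3.6, 3.4, 5.6],  # tracks both guide genes
    "tfY": [5.0, 1.0, 4.0, 2.0, 3.0],  # tracks neither
}

guide_counts = count_correlated_guides(candidate_tfs, guide_genes)
```

    Ranking candidates by this count mirrors the abstract's selection of transcription factors that correlate with several guide genes at once.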

  2. HEN1 and HEN2: a subgroup of basic helix-loop-helix genes that are coexpressed in a human neuroblastoma.

    PubMed Central

    Brown, L; Espinosa, R; Le Beau, M M; Siciliano, M J; Baer, R

    1992-01-01

    An important family of regulatory molecules is made up of proteins that possess the DNA-binding and dimerization motif known as the basic helix-loop-helix (bHLH) domain. The bHLH family includes subgroups of closely related proteins that share common functional properties and overlapping patterns of expression (e.g., the MyoD1 and achaete-scute subgroups). In this report we describe HEN1 and HEN2, mammalian genes that encode a distinct subgroup of bHLH proteins. The HEN1 gene was identified on the basis of cross-hybridization with TAL1, a known bHLH gene implicated in T-cell acute lymphoblastic leukemia. In situ fluorescence hybridization was used to localize the human HEN1 gene to chromosome band 1q22. HEN1 and HEN2 are coexpressed in the IMR-32 human neuroblastoma cell line, and they encode highly related proteins of 133 and 135 residues, respectively, that share 98% amino acid identity in their bHLH domains. These data imply that the bHLH protein subgroup encoded by HEN1 and HEN2 may serve important regulatory functions in the developing nervous system. PMID:1528853

  3. Statistical mechanics of scale-free gene expression networks

    NASA Astrophysics Data System (ADS)

    Gross, Eitan

    2012-12-01

    The gene co-expression networks of many organisms including bacteria, mice and man exhibit scale-free distribution. This heterogeneous distribution of connections decreases the vulnerability of the network to random attacks and thus may confer the genetic replication machinery an intrinsic resilience to such attacks, triggered by changing environmental conditions that the organism may be subject to during evolution. This resilience to random attacks comes at an energetic cost, however, reflected by the lower entropy of the scale-free distribution compared to the more homogeneous, random network. In this study we found that the cell cycle-regulated gene expression pattern of the yeast Saccharomyces cerevisiae obeys a power-law distribution with an exponent α = 2.1 and an entropy of 1.58. The latter is very close to the maximal value of 1.65 obtained from linear optimization of the entropy function under the constraint of a constant cost function, determined by the average degree connectivity. We further show that the yeast's gene expression network can achieve scale-free distribution in a process that does not involve growth but rather via re-wiring of the connections between nodes of an ordered network. Our results support the idea of an evolutionary selection, which acts at the level of the protein sequence, and is compatible with the notion of greater biological importance of highly connected nodes in the protein interaction network. Our constrained re-wiring model provides a theoretical framework for a putative thermodynamically driven evolutionary selection process.
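
    As a rough illustration of the quantities named in the abstract above, the sketch below computes the Shannon entropy of a truncated power-law degree distribution P(k) ∝ k^(−α). Only α = 2.1 comes from the abstract; the degree cutoff K_MAX = 100 and the use of natural logarithms are assumptions, so the result is not expected to reproduce the reported values of 1.58 or 1.65. A uniform distribution over the same degrees is included as the maximum-entropy reference.

```python
from math import log


def power_law_distribution(alpha, k_max):
    """Normalized P(k) proportional to k ** -alpha for k = 1 .. k_max."""
    weights = [k ** -alpha for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def shannon_entropy(probs):
    """Shannon entropy H = -sum(p * ln p), in nats."""
    return -sum(p * log(p) for p in probs if p > 0)


ALPHA = 2.1   # exponent reported in the abstract
K_MAX = 100   # assumed degree cutoff, chosen only for illustration

p_k = power_law_distribution(ALPHA, K_MAX)
h_power_law = shannon_entropy(p_k)
h_uniform = log(K_MAX)  # entropy of a uniform distribution over the same degrees
```

    Because most of the probability mass sits on low degrees, the power-law entropy falls well below the uniform bound, which is the qualitative point the abstract makes about heterogeneous degree distributions.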

  4. Uncovering co-expression gene network regulating fruit acidity in divers