Science.gov

Sample records for coexpressed gene networks

  1. Multiscale Embedded Gene Co-expression Network Analysis

    PubMed Central

    Song, Won-Min; Zhang, Bin

    2015-01-01

    Gene co-expression network analysis has been shown to be effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as a predefined number of clusters or numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution or small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computational complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778

  2. Arabidopsis gene co-expression network and its functional modules

    PubMed Central

    Mao, Linyong; Van Hemert, John L; Dash, Sudhansu; Dickerson, Julie A

    2009-01-01

    Background Biological networks characterize the interactions of biomolecules at a systems-level. One important property of biological networks is the modular structure, in which nodes are densely connected with each other, but between which there are only sparse connections. In this report, we attempted to find the relationship between the network topology and formation of modular structure by comparing gene co-expression networks with random networks. The organization of gene functional modules was also investigated. Results We constructed a genome-wide Arabidopsis gene co-expression network (AGCN) by using 1094 microarrays. We then analyzed the topological properties of AGCN and partitioned the network into modules by using an efficient graph clustering algorithm. In the AGCN, 382 hub genes formed a clique, and they were densely connected only to a small subset of the network. At the module level, the network clustering results provide a systems-level understanding of the gene modules that coordinate multiple biological processes to carry out specific biological functions. For instance, the photosynthesis module in AGCN involves a very large number (> 1000) of genes which participate in various biological processes including photosynthesis, electron transport, pigment metabolism, chloroplast organization and biogenesis, cofactor metabolism, protein biosynthesis, and vitamin metabolism. The cell cycle module orchestrated the coordinated expression of hundreds of genes involved in cell cycle, DNA metabolism, and cytoskeleton organization and biogenesis. We also compared the AGCN constructed in this study with a graphical Gaussian model (GGM) based Arabidopsis gene network. The photosynthesis, protein biosynthesis, and cell cycle modules identified from the GGM network had much smaller module sizes compared with the modules found in the AGCN, respectively. Conclusion This study reveals new insight into the topological properties of biological networks. The

  3. Gene Coexpression Network Analysis as a Source of Functional Annotation for Rice Genes

    PubMed Central

    Childs, Kevin L.; Davidson, Rebecca M.; Buell, C. Robin

    2011-01-01

    With the existence of large publicly available plant gene expression data sets, many groups have undertaken data analyses to construct gene coexpression networks and functionally annotate genes. Often, a large compendium of unrelated or condition-independent expression data is used to construct gene networks. Condition-dependent expression experiments consisting of well-defined conditions/treatments have also been used to create coexpression networks to help examine particular biological processes. Gene networks derived from either condition-dependent or condition-independent data can be difficult to interpret if a large number of genes and connections are present. However, algorithms exist to identify modules of highly connected and biologically relevant genes within coexpression networks. In this study, we have used publicly available rice (Oryza sativa) gene expression data to create gene coexpression networks using both condition-dependent and condition-independent data and have identified gene modules within these networks using the Weighted Gene Coexpression Network Analysis method. We compared the number of genes assigned to modules and the biological interpretability of gene coexpression modules to assess the utility of condition-dependent and condition-independent gene coexpression networks. For the purpose of providing functional annotation to rice genes, we found gene modules identified by coexpression analysis of condition-dependent gene expression experiments to be more useful than gene modules identified by analysis of a condition-independent data set. We have incorporated our results into the MSU Rice Genome Annotation Project database as additional expression-based annotation for 13,537 genes, 2,980 of which lack a functional annotation description. These results provide two new types of functional annotation for our database. Genes in modules are now associated with groups of genes that constitute a collective functional annotation of those

  4. Random matrix analysis of localization properties of gene coexpression network

    NASA Astrophysics Data System (ADS)

    Jalan, Sarika; Solymosi, Norbert; Vattay, Gábor; Li, Baowen

    2010-04-01

    We analyze a gene coexpression network under the random matrix theory framework. The nearest-neighbor spacing distribution of the adjacency matrix of this network follows Gaussian orthogonal statistics of random matrix theory (RMT). The spectral rigidity test follows the random matrix prediction for a certain range and deviates afterwards. Eigenvector analysis of the network using the inverse participation ratio (IPR) suggests that the statistics of the bulk of the eigenvalues of the network are consistent with those of a real symmetric random matrix, whereas a few eigenvalues are localized. Based on these IPR calculations, we can divide the eigenvalues into three sets: (a) The nondegenerate part that follows RMT. (b) The nondegenerate part, at both ends and at intermediate eigenvalues, which deviates from RMT and is expected to contain information about important nodes in the network. (c) The degenerate part with zero eigenvalue, which fluctuates around the RMT-predicted value. We identify nodes corresponding to the dominant modes of the corresponding eigenvectors and analyze their structural properties.
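
    The inverse participation ratio used in this record can be illustrated with a small sketch. It assumes the common definition IPR = sum_i v_i^4 for an eigenvector v normalized to unit length; the function name and the toy vectors are hypothetical and are not taken from the paper.

```python
def inverse_participation_ratio(vec):
    """Return sum_i v_i^4 for vec rescaled to unit Euclidean length.

    Assumed definition: values near 1 indicate a vector localized on a
    few nodes, values near 1/N indicate a vector spread over all N nodes.
    """
    norm_sq = sum(v * v for v in vec)
    if norm_sq == 0:
        raise ValueError("eigenvector must be non-zero")
    # Dividing by norm_sq**2 is equivalent to normalizing vec first.
    return sum(v ** 4 for v in vec) / (norm_sq ** 2)


# Toy vectors (hypothetical): one fully localized, one fully delocalized.
localized = inverse_participation_ratio([1.0, 0.0, 0.0, 0.0])
delocalized = inverse_participation_ratio([1.0, 1.0, 1.0, 1.0])
print(localized, delocalized)
```

    In this toy case the localized vector gives 1 and the uniform vector over four nodes gives 1/4, which is the contrast the abstract relies on when it separates localized eigenvectors from the RMT-like bulk.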

  5. Analysis of bHLH coding genes using gene co-expression network approach.

    PubMed

    Srivastava, Swati; Sanchita; Singh, Garima; Singh, Noopur; Srivastava, Gaurava; Sharma, Ashok

    2016-07-01

    Network analysis provides a powerful framework for the interpretation of data. It uses novel reference network-based metrics for module evolution. These could be used to identify modules of highly connected genes showing variation in the co-expression network. In this study, a co-expression network-based approach was used for analyzing the genes from microarray data. Our approach consists of a simple but robust rank-based network construction. The publicly available gene expression data of Solanum tuberosum under cold and heat stresses were considered to create and analyze a gene co-expression network. The analysis provides a highly co-expressed module of bHLH coding genes based on correlation values. Our approach was to analyze the variation of gene expression according to the time period of stress through a co-expression network approach. As a result, seed genes showing multiple connections with other genes in the same cluster were identified. Seed genes were found to vary across different time periods of stress. These seed genes may be utilized further as marker genes for developing stress-tolerant plant species. PMID:27178572

  6. Investigating the Combinatory Effects of Biological Networks on Gene Co-expression

    PubMed Central

    Zhang, Cheng; Lee, Sunjae; Mardinoglu, Adil; Hua, Qiang

    2016-01-01

    Co-expressed genes often share similar functions, and gene co-expression networks have been widely used in studying the functionality of gene modules. Previous analysis indicated that genes are more likely to be co-expressed if they are regulated by the same transcription factors, form protein complexes, or share similar topological properties in protein-protein interaction networks. Here, we reconstructed transcriptional regulatory and protein-protein networks for Saccharomyces cerevisiae using well-established databases, and we evaluated their co-expression activities using publicly available gene expression data. Based on our network-dependent analysis, we found that genes that were co-regulated in the transcription regulatory networks and shared similar neighbors in the protein-protein networks were more likely to be co-expressed. Moreover, their biological functions were closely related. PMID:27445830
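
    One way to quantify "shared similar neighbors" between two genes in a protein-protein network is sketched below. The Jaccard index is an assumption made here for illustration; the abstract does not state which similarity measure the authors used, and the gene labels are hypothetical.

```python
def neighbor_jaccard(network, a, b):
    """Jaccard similarity of the neighbor sets of nodes a and b.

    `network` maps each node to the set of its interaction partners.
    The two nodes themselves are excluded so that a direct a-b edge
    does not count as a shared neighbor.
    """
    na = network.get(a, set()) - {b}
    nb = network.get(b, set()) - {a}
    union = na | nb
    if not union:
        return 0.0
    return len(na & nb) / len(union)


# Hypothetical toy network: G1 and G2 share partners G3 and G4.
ppi = {
    "G1": {"G3", "G4", "G5"},
    "G2": {"G3", "G4", "G6"},
}
similarity = neighbor_jaccard(ppi, "G1", "G2")
print(similarity)
```

    Gene pairs with a high score under such a measure would be the candidates expected, per this abstract, to show stronger co-expression.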

  7. Annotation of gene function in citrus using gene expression information and co-expression networks

    PubMed Central

    2014-01-01

    Background The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. Results We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. Conclusions Integration of citrus gene co-expression networks

  8. Identification of hub genes and pathways associated with retinoblastoma based on co-expression network analysis.

    PubMed

    Wang, Q L; Chen, X; Zhang, M H; Shen, Q H; Qin, Z M

    2015-01-01

    The objective of this paper was to identify hub genes and pathways associated with retinoblastoma using centrality analysis of the co-expression network and pathway-enrichment analysis. The co-expression network of retinoblastoma was constructed by weighted gene co-expression network analysis (WGCNA) based on differentially expressed (DE) genes, and clusters were obtained through the molecular complex detection (MCODE) algorithm. Degree centrality analysis of the co-expression network was performed to explore hub genes present in retinoblastoma. Pathway-enrichment analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Validation of hub gene expression in retinoblastoma was performed by reverse transcription-polymerase chain reaction (RT-PCR) analysis. The co-expression network based on 221 DE genes between retinoblastoma and normal controls consisted of 210 nodes and 3965 edges, and 5 clusters of the network were evaluated. By assessing the centrality analysis of the co-expression network, 21 hub genes were identified, such as SNORD115-41, RASSF2, and SNORD115-44. According to RT-PCR analysis, 16 of the 21 hub genes were differentially expressed, including RASSF2 and CDCA7, and 5 were not differentially expressed in retinoblastoma compared to normal controls. Pathway analysis showed that genes in 2 clusters were enriched in 3 pathways: purine metabolism, p53 signaling pathway, and melanogenesis. In this study, we successfully identified 16 hub genes and 3 pathways associated with retinoblastoma, which may be potential biomarkers for early detection and therapy for retinoblastoma. PMID:26662407
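
    Degree centrality, as used for hub detection in this record, can be sketched as follows. The edge list, the node labels and the choice of returning the top k nodes are hypothetical; the abstract does not give the cutoff the authors applied to call a gene a hub.

```python
from collections import Counter


def top_hubs(edges, k):
    """Rank nodes of an undirected network by degree and return the top k.

    `edges` is an iterable of (u, v) pairs, each undirected edge listed
    once. Ties are broken alphabetically so the result is deterministic.
    """
    degree = Counter()
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        degree[u] += 1
        degree[v] += 1
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical co-expression edges among five genes.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
hubs = top_hubs(edges, 2)
print(hubs)
```

    Gene A has three edges and is ranked first; B, C and D tie with two edges each, and the alphabetical tie-break selects B.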

  9. A novel method to identify pathways associated with renal cell carcinoma based on a gene co-expression network.

    PubMed

    Ruan, Xiyun; Li, Hongyun; Liu, Bo; Chen, Jie; Zhang, Shibao; Sun, Zeqiang; Liu, Shuangqing; Sun, Fahai; Liu, Qingyong

    2015-08-01

    The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson's correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842, 371, 2,883 and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson's correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify the feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient in identifying pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425

  10. Elucidating gene function and function evolution through comparison of co-expression networks of plants

    PubMed Central

    Hansen, Bjoern O.; Vaid, Neha; Musialak-Lange, Magdalena; Janowski, Marcin; Mutwil, Marek

    2014-01-01

    The analysis of gene expression data has shown that transcriptionally coordinated (co-expressed) genes are often functionally related, enabling scientists to use expression data in gene function prediction. This Focused Review discusses our original paper (Large-scale co-expression approach to dissect secondary cell wall formation across plant species, Frontiers in Plant Science 2:23). In this paper we applied cross-species analysis to co-expression networks of genes involved in cellulose biosynthesis. We showed that the co-expression networks from different species are highly similar, indicating that whole biological pathways are conserved across species. This finding has two important implications. First, the analysis can transfer gene function annotation from well-studied plants, such as Arabidopsis, to other, uncharacterized plant species. As the analysis finds genes that have similar sequence and similar expression pattern across different organisms, functionally equivalent genes can be identified. Second, since co-expression analyses are often noisy, a comparative analysis should have higher performance, as parts of co-expression networks that are conserved are more likely to be functionally relevant. In this Focused Review, we outline the comparative analysis done in the original paper and comment on the recent advances and approaches that allow comparative analyses of co-function networks. We hypothesize that in comparison to simple co-expression analysis, comparative analysis would yield more accurate gene function predictions. Finally, by combining comparative analysis with genomic information of green plants, we propose a possible composition of cellulose biosynthesis machinery during earlier stages of plant evolution. PMID:25191328
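
    The idea of keeping only co-expression edges that are conserved across species can be sketched as below. This is an illustration under simplifying assumptions (a one-to-one ortholog map and unweighted edges), not the procedure of the original paper; all gene identifiers are hypothetical.

```python
def conserved_edges(edges_a, edges_b, orthologs):
    """Return edges of species A whose ortholog pair is also an edge in B.

    `orthologs` maps a gene of species A to its (assumed unique)
    ortholog in species B. Edges are undirected.
    """
    b_set = {frozenset(edge) for edge in edges_b}
    kept = []
    for u, v in edges_a:
        if u in orthologs and v in orthologs:
            mapped = frozenset((orthologs[u], orthologs[v]))
            if mapped in b_set:
                kept.append((u, v))
    return kept


# Hypothetical toy data for two species.
edges_a = [("a1", "a2"), ("a2", "a3"), ("a1", "a4")]
edges_b = [("b2", "b1"), ("b3", "b4")]
orthologs = {"a1": "b1", "a2": "b2", "a3": "b3"}
shared = conserved_edges(edges_a, edges_b, orthologs)
print(shared)
```

    Only the a1-a2 edge survives, because its ortholog pair b1-b2 is co-expressed in the second species; a1-a4 is dropped since a4 has no ortholog in the map.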

  11. Construction of citrus gene coexpression networks from microarray data using random matrix theory

    PubMed Central

    Du, Dongliang; Rawat, Nidhi; Deng, Zhanao; Gmitter, Fred G.

    2015-01-01

    After the sequencing of citrus genomes, gene function annotation is becoming a new challenge. Gene coexpression analysis can be employed for function annotation using publicly available microarray data sets. In this study, 230 sweet orange (Citrus sinensis) microarrays were used to construct seven coexpression networks, including one condition-independent and six condition-dependent (Citrus canker, Huanglongbing, leaves, flavedo, albedo, and flesh) networks. In total, these networks contain 37 633 edges among 6256 nodes (genes), which account for 52.11% of the measurable genes of the citrus microarray. Then, these networks were partitioned into functional modules using the Markov Cluster Algorithm. Significantly enriched Gene Ontology biological process terms and KEGG pathway terms were detected for 343 and 60 modules, respectively. Finally, independent verification of these networks was performed using another expression data set of 371 genes. This study provides new targets for further functional analyses in citrus. PMID:26504573
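
    The Markov Cluster Algorithm named in this record alternates an expansion step (matrix squaring) with an inflation step (element-wise power followed by column normalization). The following is a simplified sketch of that scheme on a dense matrix; the added self-loops, the inflation value of 2, the convergence tolerance and the toy network are assumptions for illustration and do not reproduce the settings of the study.

```python
def normalize_columns(m):
    """Scale each column of a square matrix so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def mcl(adj, inflation=2.0, max_iter=100, tol=1e-9):
    """Simplified Markov clustering of an undirected adjacency matrix."""
    n = len(adj)
    # Add self-loops (assumed convention) and make columns stochastic.
    m = normalize_columns(
        [[float(adj[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iter):
        # Expansion: random-walk flow over two steps.
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]
        # Inflation: strengthen strong flows, weaken weak ones.
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded]
        )
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Rows with a positive diagonal act as attractors; the columns they
    # reach form one cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6)
                for i in range(n) if m[i][i] > 1e-6}
    return sorted(sorted(c) for c in clusters)


# Hypothetical network: two disconnected triangles of genes 0-2 and 3-5.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
modules = mcl(adj)
print(modules)
```

    On this toy input the two triangles are returned as separate modules. Real coexpression networks of thousands of genes would call for a sparse implementation rather than this dense one.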

  12. Construction of citrus gene coexpression networks from microarray data using random matrix theory.

    PubMed

    Du, Dongliang; Rawat, Nidhi; Deng, Zhanao; Gmitter, Fred G

    2015-01-01

    After the sequencing of citrus genomes, gene function annotation is becoming a new challenge. Gene coexpression analysis can be employed for function annotation using publicly available microarray data sets. In this study, 230 sweet orange (Citrus sinensis) microarrays were used to construct seven coexpression networks, including one condition-independent and six condition-dependent (Citrus canker, Huanglongbing, leaves, flavedo, albedo, and flesh) networks. In total, these networks contain 37 633 edges among 6256 nodes (genes), which account for 52.11% of the measurable genes of the citrus microarray. Then, these networks were partitioned into functional modules using the Markov Cluster Algorithm. Significantly enriched Gene Ontology biological process terms and KEGG pathway terms were detected for 343 and 60 modules, respectively. Finally, independent verification of these networks was performed using another expression data set of 371 genes. This study provides new targets for further functional analyses in citrus. PMID:26504573

  13. Evaluation of Gene Association Methods for Coexpression Network Construction and Biological Knowledge Discovery

    PubMed Central

    Kumari, Sapna; Nie, Jeff; Chen, Huann-Sheng; Ma, Hao; Stewart, Ron; Li, Xiang; Lu, Meng-Zhu; Taylor, William M.; Wei, Hairong

    2012-01-01

    Background Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. Methods and Results In this study, we compared eight gene association methods – Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson – and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined the behaviors of different methods on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. Conclusions We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave distinctly from the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best performing method for gene association and coexpression network construction. PMID:23226279
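
    The difference between two of the compared association measures, Pearson and Spearman, can be seen on a toy pair of expression profiles with a monotonic but non-linear relationship. The profiles below are hypothetical and the sketch only illustrates the two coefficients; it says nothing about the evaluation protocol of the study.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical profiles: gene_b rises exponentially with gene_a.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [math.exp(v) for v in gene_a]
r_pearson = pearson(gene_a, gene_b)
r_spearman = spearman(gene_a, gene_b)
print(r_pearson, r_spearman)
```

    Because the relationship is perfectly monotonic, the rank-based coefficient is 1 up to rounding, while the Pearson coefficient is noticeably lower since it measures linear association only.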

  14. Context Specific and Differential Gene Co-expression Networks via Bayesian Biclustering

    PubMed Central

    McDowell, Ian C.; Zhao, Shiwen; Brown, Christopher D.; Engelhardt, Barbara E.

    2016-01-01

    Identifying latent structure in high-dimensional genomic data is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-regulated genes that covary in all of the samples or in only a subset of the samples. Our biclustering method, BicMix, allows overcomplete representations of the data, computational tractability, and joint modeling of unknown confounders and biological signals. BicMix recovers latent structure with higher precision across diverse simulation scenarios than state-of-the-art biclustering methods. Further, we develop a principled method to recover context specific gene co-expression networks from the estimated sparse biclustering matrices. We apply BicMix to breast cancer gene expression data and to gene expression data from a cardiovascular study cohort, and we recover gene co-expression networks that are differential across ER+ and ER- samples and across male and female samples. We apply BicMix to the Genotype-Tissue Expression (GTEx) pilot data, and we find tissue specific gene networks. We validate these findings by using our tissue specific networks to identify trans-eQTLs specific to one of four primary tissues. PMID:27467526

  15. Context Specific and Differential Gene Co-expression Networks via Bayesian Biclustering.

    PubMed

    Gao, Chuan; McDowell, Ian C; Zhao, Shiwen; Brown, Christopher D; Engelhardt, Barbara E

    2016-07-01

    Identifying latent structure in high-dimensional genomic data is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-regulated genes that covary in all of the samples or in only a subset of the samples. Our biclustering method, BicMix, allows overcomplete representations of the data, computational tractability, and joint modeling of unknown confounders and biological signals. BicMix recovers latent structure with higher precision across diverse simulation scenarios than state-of-the-art biclustering methods. Further, we develop a principled method to recover context specific gene co-expression networks from the estimated sparse biclustering matrices. We apply BicMix to breast cancer gene expression data and to gene expression data from a cardiovascular study cohort, and we recover gene co-expression networks that are differential across ER+ and ER- samples and across male and female samples. We apply BicMix to the Genotype-Tissue Expression (GTEx) pilot data, and we find tissue specific gene networks. We validate these findings by using our tissue specific networks to identify trans-eQTLs specific to one of four primary tissues. PMID:27467526

  16. Reconstruction of gene co-expression network from microarray data using local expression patterns

    PubMed Central

    2014-01-01

    Background Biological networks connect genes and gene products to one another. A network of co-regulated genes may form gene clusters that can encode proteins and take part in common biological processes. A gene co-expression network describes inter-relationships among genes. Existing techniques generally depend on proximity measures based on global similarity to draw the relationship between genes. It has been observed that expression profiles share local similarity rather than global similarity. We propose an expression pattern based method called GeCON to extract Gene CO-expression Network from microarray data. Pair-wise supports are computed for each pair of genes based on changing tendencies and regulation patterns of the gene expression. Gene pairs showing negative or positive co-regulation under a given number of conditions are used to construct such a gene co-expression network. We construct a co-expression network with signed edges to reflect up- and down-regulation between pairs of genes. Most existing techniques do not emphasize computational efficiency. We exploit a fast correlogram matrix based technique for capturing the support of each gene pair to construct the network. Results We apply GeCON to both real and synthetic gene expression data. We compare our results using the DREAM (Dialogue for Reverse Engineering Assessments and Methods) Challenge data with three well known algorithms, viz., ARACNE, CLR and MRNET. Our method outperforms other algorithms based on in silico regulatory network reconstruction. Experimental results show that GeCON can extract functionally enriched network modules from real expression data. Conclusions In view of the results over several in-silico and real expression datasets, the proposed GeCON shows satisfactory performance in predicting co-expression networks in a computationally inexpensive way. We further establish that a simple expression pattern matching is helpful in finding biologically relevant gene networks. In

  17. Massive-Scale Gene Co-Expression Network Construction and Robustness Testing Using Random Matrix Theory

    PubMed Central

    Isaacson, Sven; Luo, Feng; Feltus, Frank A.; Smith, Melissa C.

    2013-01-01

    The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. For such a test, hundreds of networks are needed and hence a tool to rapidly construct these networks. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust. PMID:23409071

  18. Discovery of core biotic stress responsive genes in Arabidopsis by weighted gene co-expression network analysis.

    PubMed

    Amrine, Katherine C H; Blanco-Ulate, Barbara; Cantu, Dario

    2015-01-01

    Intricate signal networks and transcriptional regulators translate the recognition of pathogens into defense responses. In this study, we carried out a gene co-expression analysis of all currently publicly available microarray data, which were generated in experiments that studied the interaction of the model plant Arabidopsis thaliana with microbial pathogens. This work was conducted to identify (i) modules of functionally related co-expressed genes that are differentially expressed in response to multiple biotic stresses, and (ii) hub genes that may function as core regulators of disease responses. Using Weighted Gene Co-expression Network Analysis (WGCNA) we constructed an undirected network leveraging a rich curated expression dataset comprising 272 microarrays that involved microbial infections of Arabidopsis plants with a wide array of fungal and bacterial pathogens with biotrophic, hemibiotrophic, and necrotrophic lifestyles. WGCNA produced a network with scale-free and small-world properties composed of 205 distinct clusters of co-expressed genes. Modules of functionally related co-expressed genes that are differentially regulated in response to multiple pathogens were identified by integrating differential gene expression testing with functional enrichment analyses of gene ontology terms, known disease associated genes, transcriptional regulators, and cis-regulatory elements. The significance of functional enrichments was validated by comparisons with randomly generated networks. Network topology was then analyzed to identify intra- and inter-modular gene hubs. Based on high connectivity, and centrality in meta-modules that are clearly enriched in defense responses, we propose a list of 66 target genes for reverse genetic experiments to further dissect the Arabidopsis immune system. Our results show that statistical-based data trimming prior to network analysis allows the integration of expression datasets generated by different groups, under different

  19. Discovery of Core Biotic Stress Responsive Genes in Arabidopsis by Weighted Gene Co-Expression Network Analysis

    PubMed Central

    Amrine, Katherine C. H.; Blanco-Ulate, Barbara; Cantu, Dario

    2015-01-01

    Intricate signal networks and transcriptional regulators translate the recognition of pathogens into defense responses. In this study, we carried out a gene co-expression analysis of all currently publicly available microarray data, which were generated in experiments that studied the interaction of the model plant Arabidopsis thaliana with microbial pathogens. This work was conducted to identify (i) modules of functionally related co-expressed genes that are differentially expressed in response to multiple biotic stresses, and (ii) hub genes that may function as core regulators of disease responses. Using Weighted Gene Co-expression Network Analysis (WGCNA) we constructed an undirected network leveraging a rich curated expression dataset comprising 272 microarrays that involved microbial infections of Arabidopsis plants with a wide array of fungal and bacterial pathogens with biotrophic, hemibiotrophic, and necrotrophic lifestyles. WGCNA produced a network with scale-free and small-world properties composed of 205 distinct clusters of co-expressed genes. Modules of functionally related co-expressed genes that are differentially regulated in response to multiple pathogens were identified by integrating differential gene expression testing with functional enrichment analyses of gene ontology terms, known disease associated genes, transcriptional regulators, and cis-regulatory elements. The significance of functional enrichments was validated by comparisons with randomly generated networks. Network topology was then analyzed to identify intra- and inter-modular gene hubs. Based on high connectivity, and centrality in meta-modules that are clearly enriched in defense responses, we propose a list of 66 target genes for reverse genetic experiments to further dissect the Arabidopsis immune system. Our results show that statistical-based data trimming prior to network analysis allows the integration of expression datasets generated by different groups, under different

  20. Characterization of Genes for Beef Marbling Based on Applying Gene Coexpression Network

    PubMed Central

    Lim, Dajeong; Kim, Nam-Kuk; Lee, Seung-Hwan; Park, Hye-Sun; Cho, Yong-Min; Chai, Han-Ha; Kim, Heebal

    2014-01-01

    Marbling is an important trait in characterizing beef quality and a major factor in determining the price of beef in the Korean beef market. In particular, marbling is a complex trait and needs a system-level approach for identifying candidate genes related to the trait. To find candidate genes associated with marbling, we used a weighted gene coexpression network analysis based on the expression values of bovine genes. Hub genes were identified; they were topologically centered with large degree and betweenness centrality (BC) values in the global network. We performed gene expression analysis to detect candidate genes in M. longissimus with divergent marbling phenotype (marbling scores 2 to 7) using qRT-PCR. The results demonstrate that transmembrane protein 60 (TMEM60) and dihydropyrimidine dehydrogenase (DPYD) are associated with increasing marbling fat. We suggest that the network-based approach in livestock may be an important method for analyzing the complex effects of candidate genes associated with complex traits like marbling or tenderness. PMID:24624372
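
    As a rough illustration of what ranking hub genes by degree and BC can look like, the following Python sketch computes node degree and betweenness centrality (reading BC as betweenness centrality, and using Brandes' algorithm for unweighted, undirected graphs) on a small toy network. The gene labels and edges are hypothetical and are not taken from the study, and the study's own software and thresholds are not described in this record.

```python
from collections import deque


def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj maps each node to the set of its neighbours.
    Returns unnormalised betweenness (number of shortest paths through a node,
    counting each unordered pair of endpoints once).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                        # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}      # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}       # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                      # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}     # dependency of s on each node
        while order:                      # accumulate from the farthest node back
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical toy network (labels are placeholders, not real bovine genes).
toy_network = {
    "geneA": {"geneB", "geneC", "geneD"},
    "geneB": {"geneA"},
    "geneC": {"geneA"},
    "geneD": {"geneA", "geneE"},
    "geneE": {"geneD"},
}

degree = {g: len(nb) for g, nb in toy_network.items()}
bc = betweenness_centrality(toy_network)

# Rank candidate hubs by degree first, then by betweenness.
hubs = sorted(toy_network, key=lambda g: (degree[g], bc[g]), reverse=True)
print(hubs[:2])
```

    In this toy network the node that connects the most neighbours also lies on the most shortest paths, so both measures agree on the top hub; in real networks the two rankings can differ, which is presumably why both are reported.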

  1. A novel method to identify pathways associated with renal cell carcinoma based on a gene co-expression network

    PubMed Central

    RUAN, XIYUN; LI, HONGYUN; LIU, BO; CHEN, JIE; ZHANG, SHIBAO; SUN, ZEQIANG; LIU, SHUANGQING; SUN, FAHAI; LIU, QINGYONG

    2015-01-01

    The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson’s correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842, 371, 2,883 and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson’s correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient to identify pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425

  2. Meta-Analysis of Differential Connectivity in Gene Co-Expression Networks in Multiple Sclerosis

    PubMed Central

    Creanza, Teresa Maria; Liguori, Maria; Liuni, Sabino; Nuzziello, Nicoletta; Ancona, Nicola

    2016-01-01

    Differential gene expression analyses to investigate multiple sclerosis (MS) molecular pathogenesis cannot detect genes harboring genetic and/or epigenetic modifications that change the gene functions without affecting their expression. Differential co-expression network approaches may capture changes in functional interactions resulting from these alterations. We re-analyzed 595 mRNA arrays from publicly available datasets by studying changes in gene co-expression networks in MS and in response to interferon (IFN)-β treatment. Interestingly, MS networks show a reduced connectivity relative to the healthy condition, and the treatment activates the transcription of genes and increases their connectivity in MS patients. Importantly, the analysis of changes in gene connectivity in MS patients provides new evidence of association for genes already implicated in MS by single-nucleotide polymorphism studies and that do not show differential expression. This is the case of amiloride-sensitive cation channel 1 neuronal (ACCN1) that shows a reduced number of interacting partners in MS networks, and it is known for its role in synaptic transmission and central nervous system (CNS) development. Furthermore, our study confirms a deregulation of the vitamin D system: among the transcription factors that potentially regulate the deregulated genes, we find TCF3 and SP1 that are both involved in vitamin D3-induced p27Kip1 expression. Unveiling differential network properties allows us to gain systems-level insights into disease mechanisms and may suggest putative targets for the treatment. PMID:27314336
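
    The idea of differential connectivity can be sketched in a few lines of Python. The example below counts, for each gene, how many partners it is correlated with in each of two conditions and reports the difference. The hard correlation threshold, the function names, and the expression values are assumptions made for illustration; the record does not state which connectivity measure or meta-analysis statistic the authors used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def connectivity(expr, threshold=0.8):
    """Number of partners per gene with |r| >= threshold (assumed hard threshold).

    expr maps a gene label to its expression values across samples.
    """
    genes = list(expr)
    k = {g: 0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                k[g] += 1
                k[h] += 1
    return k


def differential_connectivity(expr_ref, expr_case, threshold=0.8):
    """Connectivity in the case condition minus connectivity in the reference."""
    k_ref = connectivity(expr_ref, threshold)
    k_case = connectivity(expr_case, threshold)
    return {g: k_case[g] - k_ref[g] for g in k_ref if g in k_case}


# Made-up expression values for three placeholder genes in two conditions.
healthy = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10],
    "g3": [5, 4, 3, 2, 1],
}
disease = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [3, 1, 4, 2, 5],
    "g3": [5, 4, 3, 2, 1],
}

diff = differential_connectivity(healthy, disease)
print(diff)
```

    Negative values mark genes that lose partners in the case condition, which is the kind of change a differential expression test alone would not detect when a gene's mean expression is unchanged.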

  3. Meta-Analysis of Differential Connectivity in Gene Co-Expression Networks in Multiple Sclerosis.

    PubMed

    Creanza, Teresa Maria; Liguori, Maria; Liuni, Sabino; Nuzziello, Nicoletta; Ancona, Nicola

    2016-01-01

    Differential gene expression analyses to investigate multiple sclerosis (MS) molecular pathogenesis cannot detect genes harboring genetic and/or epigenetic modifications that change the gene functions without affecting their expression. Differential co-expression network approaches may capture changes in functional interactions resulting from these alterations. We re-analyzed 595 mRNA arrays from publicly available datasets by studying changes in gene co-expression networks in MS and in response to interferon (IFN)-β treatment. Interestingly, MS networks show a reduced connectivity relative to the healthy condition, and the treatment activates the transcription of genes and increases their connectivity in MS patients. Importantly, the analysis of changes in gene connectivity in MS patients provides new evidence of association for genes already implicated in MS by single-nucleotide polymorphism studies and that do not show differential expression. This is the case of amiloride-sensitive cation channel 1 neuronal (ACCN1) that shows a reduced number of interacting partners in MS networks, and it is known for its role in synaptic transmission and central nervous system (CNS) development. Furthermore, our study confirms a deregulation of the vitamin D system: among the transcription factors that potentially regulate the deregulated genes, we find TCF3 and SP1 that are both involved in vitamin D3-induced p27Kip1 expression. Unveiling differential network properties allows us to gain systems-level insights into disease mechanisms and may suggest putative targets for the treatment. PMID:27314336

  4. Identification of crucial genes in intracranial aneurysm based on weighted gene coexpression network analysis.

    PubMed

    Zheng, X; Xue, C; Luo, G; Hu, Y; Luo, W; Sun, X

    2015-05-01

    The rupture of intracranial aneurysm (IA) is the leading cause for devastating subarachnoid hemorrhage. This study aimed to investigate genes related to IA and potential diagnosis targets. Two data sets (GSE15629 and GSE54083) were downloaded from Gene Expression Omnibus database. GSE15629 contained eight RI (ruptured IA), six UI (unruptured IA) and five control IA samples. GSE54083 included 8 RI, 5 UI and 10 superficial temporal artery samples. In total, 452 differentially expressed genes (DEGs) between RI and control, and 570 DEGs between UI and control, were identified. Protein-protein interaction networks for two kinds of DEGs related to RI and UI were constructed, respectively. Module networks were searched for DEGs related to RI or UI based on WGCNA (weighted gene coexpression network analysis). In the significant modules, FOS, CCL2, COL4A2 and CXCL5 were screened as crucial nodes with high degrees. Among them, FOS and CCL2 were enriched in immune response and COL4A2 was involved in the ECM (extracellular matrix) pathway, whereas CXCL5 was related to cytokine-cytokine receptor pathway. Taken together, FOS, CCL2, COL4A2 and CXCL5 might participate in the pathogenesis of RI or UI, and could serve as potential diagnosis targets. PMID:25721208

  5. Gene Coexpression Analyses Differentiate Networks Associated with Diverse Cancers Harboring TP53 Missense or Null Mutations

    PubMed Central

    Oros Klein, Kathleen; Oualkacha, Karim; Lafond, Marie-Hélène; Bhatnagar, Sahir; Tonin, Patricia N.; Greenwood, Celia M. T.

    2016-01-01

    In a variety of solid cancers, missense mutations in the well-established TP53 tumor suppressor gene may lead to the presence of a partially-functioning protein molecule, whereas mutations affecting the protein encoding reading frame, often referred to as null mutations, result in the absence of p53 protein. Both types of mutations have been observed in the same cancer type. As the resulting tumor biology may be quite different between these two groups, we used RNA-sequencing data from The Cancer Genome Atlas (TCGA) from four different cancers with poor prognosis, namely ovarian, breast, lung and skin cancers, to compare the patterns of coexpression of genes in tumors grouped according to their TP53 missense or null mutation status. We used Weighted Gene Coexpression Network analysis (WGCNA) and a new test statistic built on differences between groups in the measures of gene connectivity. For each cancer, our analysis identified a set of genes showing differential coexpression patterns between the TP53 missense- and null mutation-carrying groups that was robust to the choice of the tuning parameter in WGCNA. After comparing these sets of genes across the four cancers, one gene (KIR3DL2) consistently showed differential coexpression patterns between the null and missense groups. KIR3DL2 is known to play an important role in regulating the immune response, which is consistent with our observation that this gene's strongly-correlated partners implicated many immune-related pathways. Examining mutation-type-related changes in correlations between sets of genes may provide new insight into tumor biology. PMID:27536319

  6. Chronic Ethanol Exposure Produces Time- and Brain Region-Dependent Changes in Gene Coexpression Networks

    PubMed Central

    Osterndorff-Kahanek, Elizabeth A.; Becker, Howard C.; Lopez, Marcelo F.; Farris, Sean P.; Tiwari, Gayatri R.; Nunez, Yury O.; Harris, R. Adron; Mayfield, R. Dayne

    2015-01-01

    Repeated ethanol exposure and withdrawal in mice increases voluntary drinking and represents an animal model of physical dependence. We examined time- and brain region-dependent changes in gene coexpression networks in amygdala (AMY), nucleus accumbens (NAC), prefrontal cortex (PFC), and liver after four weekly cycles of chronic intermittent ethanol (CIE) vapor exposure in C57BL/6J mice. Microarrays were used to compare gene expression profiles at 0, 8, and 120 hours following the last ethanol exposure. Each brain region exhibited a large number of differentially expressed genes (2,000-3,000) at the 0- and 8-hour time points, but fewer changes were detected at the 120-hour time point (400-600). Within each region, there was little gene overlap across time (~20%). All brain regions were significantly enriched with differentially expressed immune-related genes at the 8-hour time point. Weighted gene correlation network analysis identified modules that were highly enriched with differentially expressed genes at the 0- and 8-hour time points with virtually no enrichment at 120 hours. Modules enriched for both ethanol-responsive and cell-specific genes were identified in each brain region. These results indicate that chronic alcohol exposure causes global ‘rewiring’ of coexpression systems involving glial and immune signaling as well as neuronal genes. PMID:25803291

  7. Gene co-expression networks shed light into diseases of brain iron accumulation

    PubMed Central

    Bettencourt, Conceição; Forabosco, Paola; Wiethoff, Sarah; Heidari, Moones; Johnstone, Daniel M.; Botía, Juan A.; Collingwood, Joanna F.; Hardy, John; Milward, Elizabeth A.; Ryten, Mina; Houlden, Henry

    2016-01-01

    Aberrant brain iron deposition is observed in both common and rare neurodegenerative disorders, including those categorized as Neurodegeneration with Brain Iron Accumulation (NBIA), which are characterized by focal iron accumulation in the basal ganglia. Two NBIA genes are directly involved in iron metabolism, but whether other NBIA-related genes also regulate iron homeostasis in the human brain, and whether aberrant iron deposition contributes to neurodegenerative processes remains largely unknown. This study aims to expand our understanding of these iron overload diseases and identify relationships between known NBIA genes and their main interacting partners by using a systems biology approach. We used whole-transcriptome gene expression data from human brain samples originating from 101 neuropathologically normal individuals (10 brain regions) to generate weighted gene co-expression networks and cluster the 10 known NBIA genes in an unsupervised manner. We investigated NBIA-enriched networks for relevant cell types and pathways, and whether they are disrupted by iron loading in NBIA diseased tissue and in an in vivo mouse model. We identified two basal ganglia gene co-expression modules significantly enriched for NBIA genes, which resemble neuronal and oligodendrocytic signatures. These NBIA gene networks are enriched for iron-related genes, and implicate synapse and lipid metabolism related pathways. Our data also indicates that these networks are disrupted by excessive brain iron loading. We identified multiple cell types in the origin of NBIA disorders. We also found unforeseen links between NBIA networks and iron-related processes, and demonstrate convergent pathways connecting NBIAs and phenotypically overlapping diseases. Our results are of further relevance for these diseases by providing candidates for new causative genes and possible points for therapeutic intervention. PMID:26707700

  8. Gene co-expression networks shed light into diseases of brain iron accumulation.

    PubMed

    Bettencourt, Conceição; Forabosco, Paola; Wiethoff, Sarah; Heidari, Moones; Johnstone, Daniel M; Botía, Juan A; Collingwood, Joanna F; Hardy, John; Milward, Elizabeth A; Ryten, Mina; Houlden, Henry

    2016-03-01

    Aberrant brain iron deposition is observed in both common and rare neurodegenerative disorders, including those categorized as Neurodegeneration with Brain Iron Accumulation (NBIA), which are characterized by focal iron accumulation in the basal ganglia. Two NBIA genes are directly involved in iron metabolism, but whether other NBIA-related genes also regulate iron homeostasis in the human brain, and whether aberrant iron deposition contributes to neurodegenerative processes remains largely unknown. This study aims to expand our understanding of these iron overload diseases and identify relationships between known NBIA genes and their main interacting partners by using a systems biology approach. We used whole-transcriptome gene expression data from human brain samples originating from 101 neuropathologically normal individuals (10 brain regions) to generate weighted gene co-expression networks and cluster the 10 known NBIA genes in an unsupervised manner. We investigated NBIA-enriched networks for relevant cell types and pathways, and whether they are disrupted by iron loading in NBIA diseased tissue and in an in vivo mouse model. We identified two basal ganglia gene co-expression modules significantly enriched for NBIA genes, which resemble neuronal and oligodendrocytic signatures. These NBIA gene networks are enriched for iron-related genes, and implicate synapse and lipid metabolism related pathways. Our data also indicates that these networks are disrupted by excessive brain iron loading. We identified multiple cell types in the origin of NBIA disorders. We also found unforeseen links between NBIA networks and iron-related processes, and demonstrate convergent pathways connecting NBIAs and phenotypically overlapping diseases. Our results are of further relevance for these diseases by providing candidates for new causative genes and possible points for therapeutic intervention. PMID:26707700

  9. New Method for Joint Network Analysis Reveals Common and Different Coexpression Patterns among Genes and Proteins in Breast Cancer

    PubMed Central

    2016-01-01

    We focus on characterizing common and different coexpression patterns among RNAs and proteins in breast cancer tumors. To address this problem, we introduce Joint Random Forest (JRF), a novel nonparametric algorithm to simultaneously estimate multiple coexpression networks by effectively borrowing information across protein and gene expression data. The performance of JRF was evaluated through extensive simulation studies using different network topologies and data distribution functions. Advantages of JRF over other algorithms that estimate class-specific networks separately were observed across all simulation settings. JRF also outperformed a competing method based on Gaussian graphic models. We then applied JRF to simultaneously construct gene and protein coexpression networks based on protein and RNAseq data from CPTAC-TCGA breast cancer study. We identified interesting common and differential coexpression patterns among genes and proteins. This information can help to cast light on the potential disease mechanisms of breast cancer. PMID:26733076

  10. The Structure of a Gene Co-Expression Network Reveals Biological Functions Underlying eQTLs

    PubMed Central

    Villa-Vialaneix, Nathalie; Liaubet, Laurence; Laurent, Thibault; Cherel, Pierre; Gamot, Adrien; SanCristobal, Magali

    2013-01-01

    What are the commonalities between genes, whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations, and for inferring knowledge about as yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found to be correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology. PMID:23577081

  11. The structure of a gene co-expression network reveals biological functions underlying eQTLs.

    PubMed

    Villa-Vialaneix, Nathalie; Liaubet, Laurence; Laurent, Thibault; Cherel, Pierre; Gamot, Adrien; SanCristobal, Magali

    2013-01-01

    What are the commonalities between genes, whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations, and for inferring knowledge about as yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found to be correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology. PMID:23577081

  12. A contribution to the study of plant development evolution based on gene co-expression networks

    PubMed Central

    Romero-Campero, Francisco J.; Lucas-Reina, Eva; Said, Fatima E.; Romero, José M.; Valverde, Federico

    2013-01-01

    Phototrophic eukaryotes are among the most successful organisms on Earth due to their unparalleled efficiency at capturing light energy and fixing carbon dioxide to produce organic molecules. A conserved and efficient network of light-dependent regulatory modules could be the basis of this success. This regulatory system conferred early advantages to phototrophic eukaryotes that allowed for specialization, complex developmental processes and modern plant characteristics. We have studied light-dependent gene regulatory modules from algae to plants employing integrative-omics approaches based on gene co-expression networks. Our study reveals some remarkably conserved ways in which eukaryotic phototrophs deal with day length and light signaling. Here we describe how a family of Arabidopsis transcription factors involved in photoperiod response has evolved from a single algal gene according to the innovation, amplification and divergence theory of gene evolution by duplication. These modifications of the gene co-expression networks from the ancient unicellular green alga Chlamydomonas reinhardtii to the modern brassica Arabidopsis thaliana may hint at the evolution and specialization of plants and other organisms. PMID:23935602

  13. Identification of common regulators of genes in co-expression networks affecting muscle and meat properties.

    PubMed

    Ponsuksili, Siriluck; Siengdee, Puntita; Du, Yang; Trakooljul, Nares; Murani, Eduard; Schwerin, Manfred; Wimmers, Klaus

    2015-01-01

    Understanding the genetic contributions behind skeletal muscle composition and metabolism is of great interest in medicine and agriculture. Attempts to dissect these complex traits combine genome-wide genotyping, expression data analyses and network analyses. Weighted gene co-expression network analysis (WGCNA) groups genes into modules based on patterns of co-expression, which can be linked to phenotypes by correlation analysis of trait values and the module eigengenes, i.e. the first principal component of a given module. Network hub genes and regulators of the genes in the modules are likely to play an important role in the emergence of respective traits. In order to detect common regulators of genes in modules showing association with meat quality traits, we identified eQTL for each of these genes, including the highly connected hub genes. Additionally, the module eigengene values were used for association analyses in order to derive a joint eQTL for the respective module. Thereby major sites of orchestrated regulation of genes within trait-associated modules were detected as hotspots of eQTL of many genes of a module and of its eigengene. These sites likely harbor common regulators of genes in the modules. As an example, we showed the consistent impact of candidate common regulators on the expression of members of respective modules by RNAi knockdown experiments. In fact, Cxcr7 was identified and validated as a regulator of genes in a module that is involved in defense response in muscle cells. Zfp36l2 was confirmed as a regulator of genes of a module related to cell death or apoptosis pathways. The integration of eQTL in module networks made it possible to interpret the differentially regulated genes from a systems perspective. By integrating genome-wide genomic and transcriptomic data, employing co-expression and eQTL analyses, the study revealed likely regulators that are involved in the fine-tuning and synchronization of genes with trait

  14. Identification of Common Regulators of Genes in Co-Expression Networks Affecting Muscle and Meat Properties

    PubMed Central

    Ponsuksili, Siriluck; Siengdee, Puntita; Du, Yang; Trakooljul, Nares; Murani, Eduard; Schwerin, Manfred; Wimmers, Klaus

    2015-01-01

    Understanding the genetic contributions behind skeletal muscle composition and metabolism is of great interest in medicine and agriculture. Attempts to dissect these complex traits combine genome-wide genotyping, expression data analyses and network analyses. Weighted gene co-expression network analysis (WGCNA) groups genes into modules based on patterns of co-expression, which can be linked to phenotypes by correlation analysis of trait values and the module eigengenes, i.e. the first principal component of a given module. Network hub genes and regulators of the genes in the modules are likely to play an important role in the emergence of respective traits. In order to detect common regulators of genes in modules showing association with meat quality traits, we identified eQTL for each of these genes, including the highly connected hub genes. Additionally, the module eigengene values were used for association analyses in order to derive a joint eQTL for the respective module. Thereby major sites of orchestrated regulation of genes within trait-associated modules were detected as hotspots of eQTL of many genes of a module and of its eigengene. These sites likely harbor common regulators of genes in the modules. As an example, we showed the consistent impact of candidate common regulators on the expression of members of respective modules by RNAi knockdown experiments. In fact, Cxcr7 was identified and validated as a regulator of genes in a module that is involved in defense response in muscle cells. Zfp36l2 was confirmed as a regulator of genes of a module related to cell death or apoptosis pathways. The integration of eQTL in module networks made it possible to interpret the differentially regulated genes from a systems perspective. By integrating genome-wide genomic and transcriptomic data, employing co-expression and eQTL analyses, the study revealed likely regulators that are involved in the fine-tuning and synchronization of genes with trait

  15. Signed weighted gene co-expression network analysis of transcriptional regulation in murine embryonic stem cells

    PubMed Central

    Mason, Mike J; Fan, Guoping; Plath, Kathrin; Zhou, Qing; Horvath, Steve

    2009-01-01

    Background Recent work has revealed that a core group of transcription factors (TFs) regulates the key characteristics of embryonic stem (ES) cells: pluripotency and self-renewal. Current efforts focus on identifying genes that play important roles in maintaining pluripotency and self-renewal in ES cells and aim to understand the interactions among these genes. To that end, we investigated the use of unsigned and signed network analysis to identify pluripotency and differentiation related genes. Results We show that signed networks provide a better systems level understanding of the regulatory mechanisms of ES cells than unsigned networks, using two independent murine ES cell expression data sets. Specifically, using signed weighted gene co-expression network analysis (WGCNA), we found a pluripotency module and a differentiation module, which are not identified in unsigned networks. We confirmed the importance of these modules by incorporating genome-wide TF binding data for key ES cell regulators. Interestingly, we find that the pluripotency module is enriched with genes related to DNA damage repair and mitochondrial function in addition to transcriptional regulation. Using a connectivity measure of module membership, we not only identify known regulators of ES cells but also show that Mrpl15, Msh6, Nrf1, Nup133, Ppif, Rbpj, Sh3gl2, and Zfp39, among other genes, have important roles in maintaining ES cell pluripotency and self-renewal. We also report highly significant relationships between module membership and epigenetic modifications (histone modifications and promoter CpG methylation status), which are known to play a role in controlling gene expression during ES cell self-renewal and differentiation. Conclusion Our systems biologic re-analysis of gene expression, transcription factor binding, epigenetic and gene ontology data provides a novel integrative view of ES cell biology. PMID:19619308
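
    The difference between unsigned and signed networks comes down to how a correlation is turned into a connection strength. The Python sketch below uses the usual WGCNA soft-thresholding forms, |r|^beta for unsigned and ((1 + r) / 2)^beta for signed networks. The power beta = 6 and the example correlations are illustrative choices, not values reported in this record.

```python
def unsigned_adjacency(r, beta=6):
    """Unsigned soft-threshold adjacency: the sign of the correlation is discarded."""
    return abs(r) ** beta


def signed_adjacency(r, beta=6):
    """Signed soft-threshold adjacency: negative correlations map close to zero."""
    return ((1 + r) / 2) ** beta


# Illustrative correlations between a gene and two partners.
for r in (0.9, -0.9):
    print(r, unsigned_adjacency(r), signed_adjacency(r))
```

    With the unsigned form, a strongly anti-correlated pair is connected as tightly as a strongly co-expressed pair, so genes that move in opposite directions can land in the same module. The signed form keeps them apart, which is consistent with the abstract's observation that separate pluripotency and differentiation modules appear only in signed networks.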

  16. Key genes for modulating information flow play a temporal role as breast tumor coexpression networks are dynamically rewired by letrozole

    PubMed Central

    2013-01-01

    Background Genes do not act in isolation but instead as part of complex regulatory networks. To understand how breast tumors adapt to the presence of the drug letrozole, at the molecular level, it is necessary to consider how the expression levels of genes in these networks change relative to one another. Methods Using transcriptomic data generated from sequential tumor biopsy samples, taken at diagnosis, following 10-14 days and following 90 days of letrozole treatment, and a pairwise partial correlation statistic, we build temporal gene coexpression networks. We characterize the structure of each network and identify genes that hold prominent positions for maintaining network integrity and controlling information-flow. Results Letrozole treatment leads to extensive rewiring of the breast tumor coexpression network. Approximately 20% of gene-gene relationships are conserved over time in the presence of letrozole while 80% of relationships are condition dependent. The positions of influence within the networks are transiently held with few genes stably maintaining high centrality scores across the three time points. Conclusions Genes integral for maintaining network integrity and controlling information flow are dynamically changing as the breast tumor coexpression network adapts to perturbation by the drug letrozole. PMID:23819860
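
    The notion of gene-gene relationships being "conserved over time" in the record above can be sketched as follows. The abstract does not say how conservation was computed, so this example assumes one simple definition: the share of all observed edges that appear in every time-point network. The gene names and edge lists are hypothetical.

```python
def edge_set(pairs):
    """Store undirected edges so that (A, B) and (B, A) compare equal."""
    return {frozenset(p) for p in pairs}

def conserved_fraction(networks):
    """Fraction of all observed edges that are present in every network."""
    sets = [edge_set(n) for n in networks]
    union = set().union(*sets)
    common = set.intersection(*sets)
    return len(common) / len(union) if union else 0.0

# Hypothetical edge lists for three sampling points.
diagnosis = [("A", "B"), ("B", "C"), ("C", "D")]
early = [("B", "A"), ("C", "D"), ("D", "E")]
late = [("A", "B"), ("D", "C"), ("E", "A")]

print(conserved_fraction([diagnosis, early, late]))
```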

  17. Identification of key genes for laryngeal squamous cell carcinoma using weighted co-expression network analysis

    PubMed Central

    LI, XIAO-TIAN

    2016-01-01

    Laryngeal squamous cell carcinoma (LSCC) is the most common malignant tumor in the head and neck, and can seriously affect the daily life of patients. To study the mechanisms of LSCC, the microarray of GSE51958 was analyzed in the present study. GSE51958 was downloaded from Gene Expression Omnibus, and included a collection of LSCC tissue samples and matched adjacent non-cancerous tissue samples from 10 patients. Differentially-expressed genes (DEGs) were identified using limma package. Next, a weighted co-expression network was constructed for the DEGs by WGCNA package in R. Modules of the weighted co-expression network were obtained through constructing a hierarchical clustering tree using the hybrid dynamic shear tree method. Using the clusterProfiler package, the potential functions of DEGs in the modules correlated with LSCC were predicted by pathway enrichment analysis. In total, 959 DEGs were screened from the LSCC samples compared with the adjacent non-cancerous samples, including 553 upregulated and 406 downregulated genes. The appointed black, brown, gray, pink and yellow modules were screened for the DEGs in the weighted co-expression network. For the DEGs in the brown and yellow modules, the enriched pathways were cytokine-cytokine receptor interaction and metabolic pathways, respectively. The DEGs in the pink module were involved in the majority of pathways. With high connectivity degrees in the pink module, TPX2, microtubule-associated (TPX2; degree, 25), minichromosome maintenance complex component 2 (MCM2; degree, 25), ubiquitin-like with PHD and ring finger domains 1 (UHRF1; degree, 22), cyclin-dependent kinase 2 (CDK2; degree, 20) and protein regulator of cytokinesis 1 (PRC1; degree, 20) may be involved in LSCC. 
    In conclusion, from the integrated bioinformatics analysis of genes that may be associated with LSCC, 959 DEGs were obtained from LSCC samples compared with adjacent non-cancerous samples, and TPX2, MCM2, UHRF1, CDK2 and PRC1 were

  18. Integrated gene co-expression network analysis in the growth phase of Mycobacterium tuberculosis reveals new potential drug targets.

    PubMed

    Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Verma, Srikant Prasad; Kumar, Sanjiv; Ramachandran, Srinivasan

    2013-11-01

    We have carried out weighted gene co-expression network analysis of Mycobacterium tuberculosis to gain insights into gene expression architecture during log phase growth. The differentially expressed genes between at least one pair of 11 different M. tuberculosis strains as source of biological variability were used for co-expression network analysis. This data included genes with highest coefficient of variation in expression. Five distinct modules were identified using topological overlap based clustering. All the modules together showed significant enrichment in biological processes: fatty acid biosynthesis, cell membrane, intracellular membrane bound organelle, DNA replication, Quinone biosynthesis, cell shape and peptidoglycan biosynthesis, ribosome and structural constituents of ribosome and transposition. We then extracted the co-expressed connections which were supported either by transcriptional regulatory network or STRING database or high edge weight of topological overlap. The genes trpC, nadC, pitA, Rv3404c, atpA, pknA, Rv0996, purB, Rv2106 and Rv0796 emerged as top hub genes. After overlaying this network on the iNJ661 metabolic network, the reactions catalyzed by 15 highly connected metabolic genes were knocked down in silico and evaluated by Flux Balance Analysis. The results showed that in 12 out of 15 cases, in 11 more than 50% of reactions catalyzed by genes connected through co-expressed connections also had altered fluxes. The modules 'Turquoise', 'Blue' and 'Red' also showed enrichment in essential genes. We could map 152 of the previously known or proposed drug targets in these modules and identified 15 new potential drug targets based on their high degree of co-expressed connections and strong correlation with module eigengenes. PMID:24056838
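
    The "topological overlap" used for clustering in the record above can be sketched with the standard topological overlap measure: TOM(i, j) = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where l_ij sums a_iu * a_uj over shared neighbours u and k_i is the connectivity of gene i. The abstract does not give the exact variant or parameters used, so this is an assumed textbook form applied to a hypothetical toy adjacency matrix.

```python
def topological_overlap(adj, i, j):
    """Topological overlap of nodes i and j in a symmetric adjacency matrix
    with entries in [0, 1] (diagonal entries are ignored)."""
    n = len(adj)
    others = [u for u in range(n) if u not in (i, j)]
    shared = sum(adj[i][u] * adj[u][j] for u in others)
    k_i = sum(adj[i][u] for u in range(n) if u != i)
    k_j = sum(adj[j][u] for u in range(n) if u != j)
    return (shared + adj[i][j]) / (min(k_i, k_j) + 1 - adj[i][j])

# Hypothetical unweighted path 0 - 1 - 2: nodes 0 and 2 are not directly
# linked but share neighbour 1.
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
print(topological_overlap(path, 0, 2))
```

    Nodes 0 and 2 receive a non-zero overlap even though their direct adjacency is zero, which is the property that makes topological overlap more robust than raw correlation for module detection.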

  19. ALCOdb: Gene Coexpression Database for Microalgae.

    PubMed

    Aoki, Yuichi; Okamura, Yasunobu; Ohta, Hiroyuki; Kinoshita, Kengo; Obayashi, Takeshi

    2016-01-01

    In the era of energy and food shortage, microalgae have gained much attention as promising sources of biofuels and food ingredients. However, only a small fraction of microalgal genes have been functionally characterized. Here, we have developed the Algae Gene Coexpression database (ALCOdb; http://alcodb.jp), which provides gene coexpression information to survey gene modules for a function of interest. ALCOdb currently supports two model algae: the green alga Chlamydomonas reinhardtii and the red alga Cyanidioschyzon merolae. Users can retrieve coexpression information for genes of interest through three unique data pages: (i) Coexpressed Gene List; (ii) Gene Information; and (iii) Coexpressed Gene Network. In addition to the basal coexpression information, ALCOdb also provides several advanced functionalities such as an expression profile viewer and a differentially expressed gene search tool. Using these user interfaces, we demonstrated that our gene coexpression data have the potential to detect functionally related genes and are useful in extrapolating the biological roles of uncharacterized genes. ALCOdb will facilitate molecular and biochemical studies of microalgal biological phenomena, such as lipid metabolism and organelle development, and promote the evolutionary understanding of plant cellular systems. PMID:26644461

  20. ALCOdb: Gene Coexpression Database for Microalgae

    PubMed Central

    Aoki, Yuichi; Okamura, Yasunobu; Ohta, Hiroyuki; Kinoshita, Kengo; Obayashi, Takeshi

    2016-01-01

    In the era of energy and food shortage, microalgae have gained much attention as promising sources of biofuels and food ingredients. However, only a small fraction of microalgal genes have been functionally characterized. Here, we have developed the Algae Gene Coexpression database (ALCOdb; http://alcodb.jp), which provides gene coexpression information to survey gene modules for a function of interest. ALCOdb currently supports two model algae: the green alga Chlamydomonas reinhardtii and the red alga Cyanidioschyzon merolae. Users can retrieve coexpression information for genes of interest through three unique data pages: (i) Coexpressed Gene List; (ii) Gene Information; and (iii) Coexpressed Gene Network. In addition to the basal coexpression information, ALCOdb also provides several advanced functionalities such as an expression profile viewer and a differentially expressed gene search tool. Using these user interfaces, we demonstrated that our gene coexpression data have the potential to detect functionally related genes and are useful in extrapolating the biological roles of uncharacterized genes. ALCOdb will facilitate molecular and biochemical studies of microalgal biological phenomena, such as lipid metabolism and organelle development, and promote the evolutionary understanding of plant cellular systems. PMID:26644461

  1. Microarray and Co-expression Network Analysis of Genes Associated with Acute Doxorubicin Cardiomyopathy in Mice.

    PubMed

    Wei, Sheng-Nan; Zhao, Wen-Jie; Zeng, Xiang-Jun; Kang, Yu-Ming; Du, Jie; Li, Hui-Hua

    2015-10-01

    Clinical use of doxorubicin (DOX) in cancer therapy is limited by its dose-dependent cardiotoxicity. But molecular mechanisms underlying this phenomenon have not been well defined. This study was to investigate the effect of DOX on the changes of global genomics in hearts. Acute cardiotoxicity was induced by giving C57BL/6J mice a single intraperitoneal injection of DOX (15 mg/kg). Cardiac function and apoptosis were monitored using echocardiography and TUNEL assay at days 1, 3 and 5. Myocardial glucose and ATP levels were measured. Microarray assays were used to screen gene expression profiles in the hearts at day 5, and the results were confirmed with qPCR analysis. DOX administration caused decreased cardiac function, increased cardiomyocyte apoptosis and decreased glucose and ATP levels. Microarrays showed 747 up-regulated genes and 438 down-regulated genes involved in seven main functional categories. Among them, metabolic pathway was the most affected by DOX. Several key genes, including 2,3-bisphosphoglycerate mutase (Bpgm), hexokinase 2, pyruvate dehydrogenase kinase, isoenzyme 4 and fructose-2,6-bisphosphate 2-phosphatase, are closely related to glucose metabolism. Gene co-expression networks suggested the core role of Bpgm in DOX cardiomyopathy. These results obtained in mice were further confirmed in cultured cardiomyocytes. In conclusion, genes involved in glucose metabolism, especially Bpgm, may play a central role in the pathogenesis of DOX-induced cardiotoxicity. PMID:25575753

  2. Shared Pathways Among Autism Candidate Genes Determined by Co-expression Network Analysis of the Developing Human Brain Transcriptome.

    PubMed

    Mahfouz, Ahmed; Ziats, Mark N; Rennert, Owen M; Lelieveldt, Boudewijn P F; Reinders, Marcel J T

    2015-12-01

    Autism spectrum disorder (ASD) is a neurodevelopmental syndrome known to have a significant but complex genetic etiology. Hundreds of diverse genes have been implicated in ASD; yet understanding how many genes, each with disparate function, can all be linked to a single clinical phenotype remains unclear. We hypothesized that understanding functional relationships between autism candidate genes during normal human brain development may provide convergent mechanistic insight into the genetic heterogeneity of ASD. We analyzed the co-expression relationships of 455 genes previously implicated in autism using the BrainSpan human transcriptome database, across 16 anatomical brain regions spanning prenatal life through adulthood. We discovered modules of ASD candidate genes with biologically relevant temporal co-expression dynamics, which were enriched for functional ontologies related to synaptogenesis, apoptosis, and GABA-ergic neurons. Furthermore, we also constructed co-expression networks from the entire transcriptome and found that ASD candidate genes were enriched in modules related to mitochondrial function, protein translation, and ubiquitination. Hub genes central to these ASD-enriched modules were further identified, and their functions supported these ontological findings. Overall, our multi-dimensional co-expression analysis of ASD candidate genes in the normal developing human brain suggests the heterogeneous set of ASD candidates share transcriptional networks related to synapse formation and elimination, protein turnover, and mitochondrial function. PMID:26399424

  3. Comparison of low and high dose ionising radiation using topological analysis of gene coexpression networks

    PubMed Central

    2012-01-01

    Background The growing use of imaging procedures in medicine has raised concerns about exposure to low-dose ionising radiation (LDIR). While the disastrous effects of high dose ionising radiation (HDIR) are well documented, the detrimental effects of LDIR are not well understood and have been a topic of much debate. Since little is known about the effects of LDIR, various kinds of wet-lab and computational analyses are required to advance knowledge in this domain. In this paper we carry out an “upside-down pyramid” form of systems biology analysis of microarray data. We characterised the global genomic response following 10 cGy (low dose) and 100 cGy (high dose) doses of X-ray ionising radiation at four time points by analysing the topology of gene coexpression networks. This study includes a rich experimental design and state-of-the-art computational systems biology methods of analysis to study the differences in the transcriptional response of skin cells exposed to low and high doses of radiation. Results Using this method we found important genes that have been linked to immune response, cell survival and apoptosis. Furthermore, we were able to identify genes such as BRCA1, ABCA1, TNFRSF1B and MLLT11 that have been associated with various types of cancers. We were also able to detect many genes known to be associated with various medical conditions. Conclusions Our method of applying network topological differences can aid in identifying the differences among similar (e.g., radiation effect) yet very different biological conditions (e.g., different dose and time) to generate testable hypotheses. This is the first study where a network level analysis was performed across two different radiation doses at various time points, thereby illustrating changes in the cellular response over time. PMID:22594378

  4. Identification of candidate genes in Populus cell wall biosynthesis using text-mining, co-expression network and comparative genomics

    SciTech Connect

    Yang, Xiaohan; Ye, Chuyu; Bisaria, Anjali; Tuskan, Gerald A; Kalluri, Udaya C

    2011-01-01

    Populus is an important bioenergy crop for bioethanol production. A greater understanding of cell wall biosynthesis processes is critical in reducing biomass recalcitrance, a major hindrance in efficient generation of ethanol from lignocellulosic biomass. Here, we report the identification of candidate cell wall biosynthesis genes through the development and application of a novel bioinformatics pipeline. As a first step, via text-mining of PubMed publications, we obtained 121 Arabidopsis genes that had the experimental evidences supporting their involvement in cell wall biosynthesis or remodeling. The 121 genes were then used as bait genes to query an Arabidopsis co-expression database and additional genes were identified as neighbors of the bait genes in the network, increasing the number of genes to 548. The 548 Arabidopsis genes were then used to re-query the Arabidopsis co-expression database and re-construct a network that captured additional network neighbors, expanding to a total of 694 genes. The 694 Arabidopsis genes were computationally divided into 22 clusters. Queries of the Populus genome using the Arabidopsis genes revealed 817 Populus orthologs. Functional analysis of gene ontology and tissue-specific gene expression indicated that these Arabidopsis and Populus genes are high likelihood candidates for functional genomics in relation to cell wall biosynthesis.
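
    The bait-gene expansion described in the record above (query with a gene set, add network neighbours, then re-query with the enlarged set) can be sketched as a repeated one-step neighbourhood expansion. The co-expression database, its query interface and any thresholds used by the authors are not given in the abstract. The dictionary-based network and the gene identifiers below are hypothetical stand-ins.

```python
def expand_once(network, genes):
    """Return the query genes plus all of their direct network neighbours."""
    expanded = set(genes)
    for gene in genes:
        expanded |= network.get(gene, set())
    return expanded

# Hypothetical co-expression neighbourhoods (symmetric adjacency lists).
network = {
    "bait1": {"n1", "n2"},
    "bait2": set(),
    "n1": {"bait1", "n3"},
    "n2": {"bait1"},
    "n3": {"n1"},
}

baits = {"bait1", "bait2"}
round1 = expand_once(network, baits)   # baits plus their neighbours
round2 = expand_once(network, round1)  # re-query with the enlarged set
print(len(baits), len(round1), len(round2))
```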

  5. A Predictive Coexpression Network Identifies Novel Genes Controlling the Seed-to-Seedling Phase Transition in Arabidopsis thaliana

    PubMed Central

    Silva, Anderson Tadeu; Ribone, Pamela A.

    2016-01-01

    The transition from a quiescent dry seed to an actively growing photoautotrophic seedling is a complex and crucial trait for plant propagation. This study provides a detailed description of global gene expression in seven successive developmental stages of seedling establishment in Arabidopsis (Arabidopsis thaliana). Using the transcriptome signature from these developmental stages, we obtained a coexpression gene network that highlights interactions between known regulators of the seed-to-seedling transition and predicts the functions of uncharacterized genes in seedling establishment. The coexpressed gene data sets together with the transcriptional module indicate biological functions related to seedling establishment. Characterization of the homeodomain leucine zipper I transcription factor AtHB13, which is expressed during the seed-to-seedling transition, demonstrated that this gene regulates some of the network nodes and affects late seedling establishment. Knockout mutants for athb13 showed increased primary root length as compared with wild-type (Columbia-0) seedlings, suggesting that this transcription factor is a negative regulator of early root growth, possibly repressing cell division and/or cell elongation or the length of time that cells elongate. The signal transduction pathways present during the early phases of the seed-to-seedling transition anticipate the control of important events for a vigorous seedling, such as root growth. This study demonstrates that a gene coexpression network together with transcriptional modules can provide insights that are not derived from comparative transcript profiling alone. PMID:26888061

  6. Topological and functional discovery in a gene coexpression meta-network of gastric cancer.

    PubMed

    Aggarwal, Amit; Guo, Dong Li; Hoshida, Yujin; Yuen, Siu Tsan; Chu, Kent-Man; So, Samuel; Boussioutas, Alex; Chen, Xin; Bowtell, David; Aburatani, Hiroyuki; Leung, Suet Yi; Tan, Patrick

    2006-01-01

    Gastric cancer is a leading cause of global cancer mortality, but comparatively little is known about the cellular pathways regulating different aspects of the gastric cancer phenotype. To achieve a better understanding of gastric cancer at the levels of systems topology, functional modules, and constituent genes, we assembled and systematically analyzed a consensus gene coexpression meta-network of gastric cancer incorporating >300 tissue samples from four independent patient populations (the "gastrome"). We find that the gastrome exhibits a hierarchical scale-free architecture, with an internal structure comprising multiple deeply embedded modules associated with diverse cellular functions. Individual modules display distinct subtopologies, with some (cellular proliferation) being integrated within the primary network, and others (ribosomal biosynthesis) being relatively isolated. One module associated with intestinal differentiation exhibited a remarkably high degree of autonomy, raising the possibility that its specific topological features may contribute towards the frequent occurrence of intestinal metaplasia in gastric cancer. At the single-gene level, we discovered a novel conserved interaction between the PLA2G2A prognostic marker and the EphB2 receptor, and used tissue microarrays to validate the PLA2G2A/EphB2 association. Finally, because EphB2 is a known target of the Wnt signaling pathway, we tested and provide evidence that the Wnt pathway may also similarly regulate PLA2G2A. Many of these findings were not discernible by studying the single patient populations in isolation. Thus, besides enhancing our knowledge of gastric cancer, our results show the broad utility of applying meta-analytic approaches to genome-wide data for the purposes of biological discovery. PMID:16397236

  7. Discovering gene re-ranking efficiency and conserved gene-gene relationships derived from gene co-expression network analysis on breast cancer data

    PubMed Central

    Bourdakou, Marilena M.; Athanasiadis, Emmanouil I.; Spyrou, George M.

    2016-01-01

    Systemic approaches are essential in the discovery of disease-specific genes, offering a different perspective and new tools on the analysis of several types of molecular relationships, such as gene co-expression or protein-protein interactions. However, due to lack of experimental information, this analysis is not fully applicable. The aim of this study is to reveal the multi-potent contribution of statistical network inference methods in highlighting significant genes and interactions. We have investigated the ability of statistical co-expression networks to highlight and prioritize genes for breast cancer subtypes and stages in terms of: (i) classification efficiency, (ii) gene network pattern conservation, (iii) indication of involved molecular mechanisms and (iv) systems level momentum to drug repurposing pipelines. We have found that statistical network inference methods are advantageous in gene prioritization, are capable to contribute to meaningful network signature discovery, give insights regarding the disease-related mechanisms and boost drug discovery pipelines from a systems point of view. PMID:26892392

  8. Gene co-expression network analysis in Rhodobacter capsulatus and application to comparative expression analysis of Rhodobacter sphaeroides

    SciTech Connect

    Pena-Castillo, Lourdes; Mercer, Ryan; Gurinovich, Anastasia; Callister, Stephen J.; Wright, Aaron T.; Westbye, Alexander; Beatty, J. T.; Lang, Andrew S.

    2014-08-28

    The genus Rhodobacter contains purple nonsulfur bacteria found mostly in freshwater environments. Representative strains of two Rhodobacter species, R. capsulatus and R. sphaeroides, have had their genomes fully sequenced and both have been the subject of transcriptional profiling studies. Gene co-expression networks can be used to identify modules of genes with similar expression profiles. Functional analysis of gene modules can then associate co-expressed genes with biological pathways, and network statistics can determine the degree of module preservation in related networks. In this paper, we constructed an R. capsulatus gene co-expression network, performed functional analysis of identified gene modules, and investigated preservation of these modules in R. capsulatus proteomics data and in R. sphaeroides transcriptomics data. Results: The analysis identified 40 gene co-expression modules in R. capsulatus. Investigation of the module gene contents and expression profiles revealed patterns that were validated based on previous studies supporting the biological relevance of these modules. We identified two R. capsulatus gene modules preserved in the protein abundance data. We also identified several gene modules preserved between both Rhodobacter species, which indicate that these cellular processes are conserved between the species and are candidates for functional information transfer between species. Many gene modules were non-preserved, providing insight into processes that differentiate the two species. In addition, using Local Network Similarity (LNS), a recently proposed metric for expression divergence, we assessed the expression conservation of between-species pairs of orthologs, and within-species gene-protein expression profiles. Conclusions: Our analyses provide new sources of information for functional annotation in R. capsulatus because uncharacterized genes in modules are now connected with groups of genes that constitute a joint functional

  9. Identification of hub genes of pneumocyte senescence induced by thoracic irradiation using weighted gene co-expression network analysis

    PubMed Central

    XING, YONGHUA; ZHANG, JUNLING; LU, LU; LI, DEGUAN; WANG, YUEYING; HUANG, SONG; LI, CHENGCHENG; ZHANG, ZHUBO; LI, JIANGUO; MENG, AIMIN

    2016-01-01

    Irradiation commonly causes pneumocyte senescence, which may lead to severe fatal lung injury characterized by pulmonary dysfunction and respiratory failure. However, the molecular mechanism underlying the induction of pneumocyte senescence by irradiation remains to be elucidated. In the present study, weighted gene co-expression network analysis (WGCNA) was used to screen for differentially expressed genes, and to identify the hub genes and gene modules, which may be critical for senescence. A total of 2,916 differentially expressed genes were identified between the senescence and non-senescence groups following thoracic irradiation. In total, 10 gene modules associated with cell senescence were detected, and six hub genes were identified, including B-cell scaffold protein with ankyrin repeats 1, translocase of outer mitochondrial membrane 70 homolog A, actin filament-associated protein 1, Cd84, Nuf2 and nuclear factor erythroid 2. These genes were markedly associated with cell proliferation, cell division and cell cycle arrest. The results of the present study demonstrated that WGCNA of microarray data may provide further insight into the molecular mechanism underlying pneumocyte senescence. PMID:26572216

  10. Gene Co-Expression Network Analysis for Identifying Modules and Functionally Enriched Pathways in Type 1 Diabetes.

    PubMed

    Riquelme Medina, Ignacio; Lubovac-Pilav, Zelmina

    2016-01-01

    Type 1 diabetes (T1D) is a complex disease, caused by the autoimmune destruction of the insulin producing pancreatic beta cells, resulting in the body's inability to produce insulin. While great efforts have been put into understanding the genetic and environmental factors that contribute to the etiology of the disease, the exact molecular mechanisms are still largely unknown. T1D is a heterogeneous disease, and previous research in this field is mainly focused on the analysis of single genes, or using traditional gene expression profiling, which generally does not reveal the functional context of a gene associated with a complex disorder. However, network-based analysis does take into account the interactions between the diabetes specific genes or proteins and contributes to new knowledge about disease modules, which in turn can be used for identification of potential new biomarkers for T1D. In this study, we analyzed public microarray data of T1D patients and healthy controls by applying a systems biology approach that combines network-based Weighted Gene Co-Expression Network Analysis (WGCNA) with functional enrichment analysis. Novel co-expression gene network modules associated with T1D were elucidated, which in turn provided a basis for the identification of potential pathways and biomarker genes that may be involved in development of T1D. PMID:27257970

  11. Gene Co-Expression Network Analysis for Identifying Modules and Functionally Enriched Pathways in Type 1 Diabetes

    PubMed Central

    Riquelme Medina, Ignacio; Lubovac-Pilav, Zelmina

    2016-01-01

    Type 1 diabetes (T1D) is a complex disease, caused by the autoimmune destruction of the insulin producing pancreatic beta cells, resulting in the body’s inability to produce insulin. While great efforts have been put into understanding the genetic and environmental factors that contribute to the etiology of the disease, the exact molecular mechanisms are still largely unknown. T1D is a heterogeneous disease, and previous research in this field is mainly focused on the analysis of single genes, or using traditional gene expression profiling, which generally does not reveal the functional context of a gene associated with a complex disorder. However, network-based analysis does take into account the interactions between the diabetes specific genes or proteins and contributes to new knowledge about disease modules, which in turn can be used for identification of potential new biomarkers for T1D. In this study, we analyzed public microarray data of T1D patients and healthy controls by applying a systems biology approach that combines network-based Weighted Gene Co-Expression Network Analysis (WGCNA) with functional enrichment analysis. Novel co-expression gene network modules associated with T1D were elucidated, which in turn provided a basis for the identification of potential pathways and biomarker genes that may be involved in development of T1D. PMID:27257970

  12. Assessing the Biological Significance of Gene Expression Signatures and Co-Expression Modules by Studying Their Network Properties

    PubMed Central

    Minguez, Pablo; Dopazo, Joaquin

    2011-01-01

    Microarray experiments have been extensively used to define signatures, which are sets of genes that can be considered markers of experimental conditions (typically diseases). Paradoxically, in spite of the apparent functional role that might be attributed to such gene sets, signatures do not seem to be reproducible across experiments. Given the close relationship between function and protein interaction, network properties can be used to study to what extent signatures are composed of genes whose resulting proteins show a considerable level of interaction (and consequently a putative common functional role). We have analysed 618 signatures and 507 modules of co-expression in cancer looking for significant values of four main protein-protein interaction (PPI) network parameters: connection degree, cluster coefficient, betweenness and number of components. A total of 3904 gene ontology (GO) modules, 146 KEGG pathways, and 263 Biocarta pathways have been used as functional modules of reference. Co-expression modules found in microarray experiments display a high level of connectivity, similar to the one shown by conventional modules based on functional definitions (GO, KEGG and Biocarta). A general observation for all the classes studied is that the networks formed by the modules improve their topological parameters when an external protein is allowed to be introduced within the paths (up to the 70% of GO modules show network parameters beyond the random expectation). This fact suggests that functional definitions are incomplete and some genes might still be missing. Conversely, signatures are clearly not capturing the altered functions in the corresponding studies. This is probably because the way in which the genes have been selected in the signatures is too conservative. These results suggest that gene selection methods which take into account relationships among genes should be superior to methods that assume independence among genes outside their functional

  13. Screening genes crucial for pediatric pilocytic astrocytoma using weighted gene coexpression network analysis combined with methylation data analysis.

    PubMed

    Zhao, H; Cai, W; Su, S; Zhi, D; Lu, J; Liu, S

    2014-10-01

    This study aimed to identify novel genes associated with pediatric pilocytic astrocytoma (PA) to better understand the molecular mechanism underlying pediatric PA pathogenesis. Gene expression profile data of GSE50161 and GSE44971 and the methylation data of GSE44684 were downloaded from Gene Expression Omnibus. The differentially expressed genes (DEGs) between PA and normal control samples were screened using the limma package in R, and then used to construct a weighted gene coexpression network (WGCN) using the WGCN analysis (WGCNA) package in R. Significant modules of DEGs were selected using clustering analysis. Functional enrichment analysis of the DEGs in significant modules was performed using the WGCNA package and the clusterProfiler package in R. Correlation between methylation sites of DEGs and PA was analyzed using the CpGassoc package in R. In total, 3479 DEGs were screened in PA samples. Of these, 3424 DEGs were used to construct the WGCN. Several significant modules of DEGs were selected based on the WGCN, in which the turquoise module was positively related to PA, whereas the blue module was negatively related to PA. DEGs (for example, DOCK2 (dedicator of cytokinesis 2), DOCK8 and FCGR2A (Fc fragment of IgG, low affinity IIa)) in the blue module were mainly involved in the Fc gamma R-mediated phagocytosis pathway and the natural killer cell-mediated cytotoxicity pathway. Methylation of 14 DEGs among the top 30 genes in the blue module was related to PA. Our data suggest that DOCK2, DOCK8 and FCGR2A may represent potential therapeutic targets in PA that merit further investigation. PMID:25257306

  14. A genome-wide cis-regulatory element discovery method based on promoter sequences and gene co-expression networks

    PubMed Central

    2013-01-01

    Background Deciphering cis-regulatory networks has become an attractive yet challenging task. This paper presents a simple method for cis-regulatory network discovery which aims to avoid some of the common problems of previous approaches. Results Using promoter sequences and gene expression profiles as input, rather than clustering the genes by the expression data, our method utilizes co-expression neighborhood information for each individual gene, thereby overcoming the disadvantages of current clustering based models which may miss specific information for individual genes. In addition, rather than using a motif database as an input, it implements a simple motif count table for each enumerated k-mer for each gene promoter sequence. Thus, it can be used for species where previous knowledge of cis-regulatory motifs is unknown and has the potential to discover new transcription factor binding sites. Applications on Saccharomyces cerevisiae and Arabidopsis have shown that our method has a good prediction accuracy and outperforms a phylogenetic footprinting approach. Furthermore, the top ranked gene-motif regulatory clusters are evidently functionally co-regulated, and the regulatory relationships between the motifs and the enriched biological functions can often be confirmed by literature. Conclusions Since this method is simple and gene-specific, it can be readily utilized for insufficiently studied species or flexibly used as an additional step or data source for previous transcription regulatory networks discovery models. PMID:23368633
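    The motif count table described in the abstract above can be illustrated with a minimal Python sketch. It assumes overlapping k-mers over the alphabet A, C, G, T and ignores windows containing ambiguous bases; the function name `kmer_count_table`, the toy promoter sequences and the choice k = 3 are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product


def kmer_count_table(promoters, k):
    """Return {gene: {kmer: count}} over every enumerated DNA k-mer.

    Counts overlapping occurrences; windows containing characters
    other than A, C, G, T are ignored (an assumption of this sketch).
    """
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    table = {}
    for gene, seq in promoters.items():
        seq = seq.upper()
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        # Every enumerated k-mer gets a column, with an explicit zero if absent.
        table[gene] = {kmer: counts.get(kmer, 0) for kmer in all_kmers}
    return table


# Hypothetical toy promoter sequences, for illustration only.
promoters = {"geneA": "ACGTACGT", "geneB": "TTTTACGN"}
table = kmer_count_table(promoters, k=3)
```

    Each row of `table` can then be compared across a gene's co-expression neighborhood; how the paper scores such counts is not stated in the abstract, so that step is left out here.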

  15. Gene Co-Expression Network Analysis Provides Novel Insights into Myostatin Regulation at Three Different Mouse Developmental Timepoints

    PubMed Central

    Yang, Xuerong; Koltes, James E.; Park, Carissa A.; Chen, Daiwen; Reecy, James M.

    2015-01-01

    Myostatin (Mstn) knockout mice exhibit large increases in skeletal muscle mass. However, relatively few of the genes that mediate or modify MSTN effects are known. In this study, we performed co-expression network analysis using whole transcriptome microarray data from MSTN-null and wild-type mice to identify genes involved in important biological processes and pathways related to skeletal muscle and adipose development. Genes differentially expressed between wild-type and MSTN-null mice were further analyzed for shared DNA motifs using DREME. Differentially expressed genes were identified at 13.5 d.p.c. during primary myogenesis and at d35 during postnatal muscle development, but not at 17.5 d.p.c. during secondary myogenesis. In total, 283 and 2034 genes were differentially expressed at 13.5 d.p.c. and d35, respectively. Over-represented transcription factor binding sites in differentially expressed genes included SMAD3, SP1, ZFP187, and PLAGL1. The use of regulatory (RIF) and phenotypic (PIF) impact factor and differential hubbing co-expression analyses identified both known and potentially novel regulators of skeletal muscle growth, including Apobec2, Atp2a2, and Mmp13 at d35 and Sox2, Tmsb4x, and Vdac1 at 13.5 d.p.c. Among the genes with the highest PIF scores were many fiber type specifying genes. The use of RIF, PIF, and differential hubbing analyses identified both known and potentially novel regulators of muscle development. These results provide new details of how MSTN may mediate transcriptional regulation as well as insight into novel regulators of MSTN signal transduction that merit further study regarding their physiological roles in muscle and adipose development. PMID:25695797

  16. Coexpression network analysis of the genes regulated by two types of resistance responses to powdery mildew in wheat.

    PubMed

    Zhang, Juncheng; Zheng, Hongyuan; Li, Yiwen; Li, Hongjie; Liu, Xin; Qin, Huanju; Dong, Lingli; Wang, Daowen

    2016-01-01

    Powdery mildew disease caused by Blumeria graminis f. sp. tritici (Bgt) inflicts severe economic losses in wheat crops. A systematic understanding of the molecular mechanisms involved in wheat resistance to Bgt is essential for effectively controlling the disease. Here, using the diploid wheat Triticum urartu as a host, the genes regulated by immune (IM) and hypersensitive reaction (HR) resistance responses to Bgt were investigated through transcriptome sequencing. Four gene coexpression networks (GCNs) were developed using transcriptomic data generated for 20 T. urartu accessions showing IM, HR or susceptible responses. The powdery mildew resistance regulated (PMRR) genes whose expression was significantly correlated with Bgt resistance were identified, and they tended to be hubs and enriched in six major modules. A wide occurrence of negative regulation of PMRR genes was observed. Three new candidate immune receptor genes (TRIUR3_13045, TRIUR3_01037 and TRIUR3_06195) positively associated with Bgt resistance were discovered. Finally, the involvement of TRIUR3_01037 in Bgt resistance was tentatively verified through cosegregation analysis in a F2 population and functional expression assay in Bgt susceptible leaf cells. This research provides insights into the global network properties of PMRR genes. Potential molecular differences between IM and HR resistance responses to Bgt are discussed. PMID:27033636

  17. Coexpression network analysis of the genes regulated by two types of resistance responses to powdery mildew in wheat

    PubMed Central

    Zhang, Juncheng; Zheng, Hongyuan; Li, Yiwen; Li, Hongjie; Liu, Xin; Qin, Huanju; Dong, Lingli; Wang, Daowen

    2016-01-01

    Powdery mildew disease caused by Blumeria graminis f. sp. tritici (Bgt) inflicts severe economic losses in wheat crops. A systematic understanding of the molecular mechanisms involved in wheat resistance to Bgt is essential for effectively controlling the disease. Here, using the diploid wheat Triticum urartu as a host, the genes regulated by immune (IM) and hypersensitive reaction (HR) resistance responses to Bgt were investigated through transcriptome sequencing. Four gene coexpression networks (GCNs) were developed using transcriptomic data generated for 20 T. urartu accessions showing IM, HR or susceptible responses. The powdery mildew resistance regulated (PMRR) genes whose expression was significantly correlated with Bgt resistance were identified, and they tended to be hubs and enriched in six major modules. A wide occurrence of negative regulation of PMRR genes was observed. Three new candidate immune receptor genes (TRIUR3_13045, TRIUR3_01037 and TRIUR3_06195) positively associated with Bgt resistance were discovered. Finally, the involvement of TRIUR3_01037 in Bgt resistance was tentatively verified through cosegregation analysis in a F2 population and functional expression assay in Bgt susceptible leaf cells. This research provides insights into the global network properties of PMRR genes. Potential molecular differences between IM and HR resistance responses to Bgt are discussed. PMID:27033636

  18. Learning from Co-expression Networks: Possibilities and Challenges

    PubMed Central

    Serin, Elise A. R.; Nijveen, Harm; Hilhorst, Henk W. M.; Ligterink, Wilco

    2016-01-01

    Plants are fascinating and complex organisms. A comprehensive understanding of the organization, function and evolution of plant genes is essential to disentangle important biological processes and to advance crop engineering and breeding strategies. The ultimate aim in deciphering complex biological processes is the discovery of causal genes and regulatory mechanisms controlling these processes. The recent surge of omics data has opened the door to a system-wide understanding of the flow of biological information underlying complex traits. However, dealing with the corresponding large data sets represents a challenging endeavor that calls for the development of powerful bioinformatics methods. A popular approach is the construction and analysis of gene networks. Such networks are often used for genome-wide representation of the complex functional organization of biological systems. Networks based on similarity in gene expression are called (gene) co-expression networks. One of the major applications of gene co-expression networks is the functional annotation of unknown genes. Constructing co-expression networks is generally straightforward. In contrast, the resulting network of connected genes can become very complex, which limits its biological interpretation. Several strategies can be employed to enhance the interpretation of the networks. A strategy in coherence with the biological question addressed needs to be established to infer reliable networks. Additional benefits can be gained from network-based strategies using prior knowledge and data integration to further enhance the elucidation of gene regulatory relationships. As a result, biological networks provide many more applications beyond the simple visualization of co-expressed genes. In this study we review the different approaches for co-expression network inference in plants. We analyse integrative genomics strategies used in recent studies that successfully identified candidate genes taking advantage of

  19. Time ordering of gene coexpression.

    PubMed

    Leng, Xiaoyan; Müller, Hans-Georg

    2006-10-01

    Temporal microarray gene expression profiles allow characterization of gene function through time dynamics of gene coexpression within the same genetic pathway. In this paper, we define and estimate a global time shift characteristic for each gene via least squares, inferred from pairwise curve alignments. These time shift characteristics of individual genes reflect a time ordering that is derived from observed temporal gene expression profiles. Once these time shift characteristics are obtained for each gene, they can be entered into further analyses, such as clustering. We illustrate the proposed methodology using Drosophila embryonic development and yeast cell-cycle gene expression profiles, as well as simulations. Feasibility is demonstrated through the successful recovery of time ordering. Estimated time shifts for Drosophila maternal and zygotic genes provide excellent discrimination between these two categories and confirm known genetic pathways through the time order of gene expression. The application to yeast cell-cycle data establishes a natural time order of genes that is in line with cell-cycle phases. The method does not require periodicity of gene expression profiles. Asymptotic justifications are also provided. PMID:16495429
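    One way to read "a global time shift for each gene via least squares, inferred from pairwise curve alignments" is sketched below in Python. It assumes a complete, antisymmetric matrix of pairwise shift estimates with a zero diagonal, where entry (i, j) approximates s_i - s_j, and it fixes the overall offset by requiring the shifts to sum to zero. Under those assumptions the least-squares solution is the row mean. The abstract does not give the paper's actual estimator, so the function name and the toy data are hypothetical.

```python
def global_time_shifts(pairwise):
    """Least-squares shifts s with pairwise[i][j] ~ s[i] - s[j] and sum(s) == 0.

    Assumes `pairwise` is a complete antisymmetric matrix with a zero
    diagonal; in that case minimizing the squared residuals gives
    s[i] = mean of row i.
    """
    n = len(pairwise)
    return [sum(row) / n for row in pairwise]


# Hypothetical noise-free pairwise shifts built from true shifts (-1, 0, 1).
true_shifts = [-1.0, 0.0, 1.0]
pairwise = [[si - sj for sj in true_shifts] for si in true_shifts]
shifts = global_time_shifts(pairwise)

# Sorting genes by estimated shift yields a time ordering.
ordering = sorted(range(len(shifts)), key=lambda i: shifts[i])
```

    With noisy pairwise estimates the row means average out inconsistent alignments, which is the practical reason for fitting one shift per gene instead of using the pairwise values directly.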

  20. Co-expression network analysis of differentially expressed genes associated with metastasis in prolactin pituitary tumors.

    PubMed

    Zhang, Wei; Zang, Zhenle; Song, Yechun; Yang, Hui; Yin, Qing

    2014-07-01

    The aim of the present study was to construct a co‑expression network of differentially expressed genes (DEGs) in prolactin pituitary (PRL) tumor metastasis. The gene expression profile, GSE22812, was downloaded from the Gene Expression Omnibus database, and includes five non‑invasive, two invasive and six aggressive‑invasive PRL tumor samples. Compared with non‑invasive samples, DEGs were identified in invasive and aggressive‑invasive samples using the limma package in R. The expression values of DEGs were hierarchically clustered. Next, Gene Ontology (GO) function enrichment and Kyoto Encyclopedia of Genes and Genomes pathway analysis of DEGs were performed via The Database for Annotation, Visualization and Integrated Discovery. Finally, gene pairs of DEGs between non‑invasive and aggressive‑invasive samples were identified using the Spearman cor( ) function in R. Compared with the non‑invasive samples, 61 and 89 DEGs were obtained from invasive and aggressive‑invasive samples, respectively. Cluster analysis showed that four genes were shared by the two sample groups, including upregulated solute carrier family 2, facilitated glucose transporter member 11 (SLC2A11) and teneurin transmembrane protein 1 (TENM1) and downregulated importin 7 (IPO7) and chromogranin B (CHGB). In the invasive samples, the most significant GO terms were response to cyclic adenosine monophosphate and response to a glucocorticoid stimulus, whereas in aggressive‑invasive samples they were cell cycle and response to hormone stimulation. The co‑expression network of DEGs showed different gene pairs and modules, and SLC2A11 and CHGB occurred in two co‑expression networks within different co‑expressed pairs. In the present study, the co‑expression network was constructed using bioinformatics methods. SLC2A11, TENM1, IPO7 and CHGB are hypothesized to be closely associated with metastasis of PRL. Furthermore, CHGB and SLC2A11 may be significant in PRL

  1. Coexpression analysis of human genes across many microarray data sets.

    PubMed

    Lee, Homin K; Hsu, Amy K; Sajdak, Jon; Qin, Jie; Pavlidis, Paul

    2004-06-01

    We present a large-scale analysis of mRNA coexpression based on 60 large human data sets containing a total of 3924 microarrays. We sought pairs of genes that were reliably coexpressed (based on the correlation of their expression profiles) in multiple data sets, establishing a high-confidence network of 8805 genes connected by 220,649 "coexpression links" that are observed in at least three data sets. Confirmed positive correlations between genes were much more common than confirmed negative correlations. We show that confirmation of coexpression in multiple data sets is correlated with functional relatedness, and show how cluster analysis of the network can reveal functionally coherent groups of genes. Our findings demonstrate how the large body of accumulated microarray data can be exploited to increase the reliability of inferences about gene function. PMID:15173114
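    The idea of keeping only coexpression links that are observed in at least three data sets can be sketched in Python as follows. This is an illustration under stated assumptions and not the authors' pipeline: a link is called within a data set when the Pearson correlation reaches a fixed cutoff, only positive correlations are counted, and the names `confirmed_links`, `r_min` and `min_datasets`, the cutoff 0.8 and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def confirmed_links(datasets, r_min=0.8, min_datasets=3):
    """Return gene pairs whose correlation is >= r_min in >= min_datasets sets."""
    support = Counter()
    for data in datasets:
        for g1, g2 in combinations(sorted(data), 2):
            if pearson(data[g1], data[g2]) >= r_min:
                support[(g1, g2)] += 1
    return {pair for pair, n in support.items() if n >= min_datasets}


# Hypothetical toy data: three data sets, each mapping gene -> expression profile.
datasets = [
    {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]},
    {"A": [1, 3, 2, 5], "B": [2, 6, 4, 10], "C": [1, 3, 2, 5]},
    {"A": [5, 1, 4, 2], "B": [10, 2, 8, 4], "C": [1, 2, 1, 2]},
]
links = confirmed_links(datasets)
```

    In the toy data, genes A and C correlate strongly in only one data set, so that link is dropped, while the A-B link is supported by all three and is kept.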

  2. Immuno-Navigator, a batch-corrected coexpression database, reveals cell type-specific gene networks in the immune system

    PubMed Central

    Vandenbon, Alexis; Dinh, Viet H.; Mikami, Norihisa; Kitagawa, Yohko; Teraguchi, Shunsuke; Ohkura, Naganari; Sakaguchi, Shimon

    2016-01-01

    High-throughput gene expression data are one of the primary resources for exploring complex intracellular dynamics in modern biology. The integration of large amounts of public data may allow us to examine general dynamical relationships between regulators and target genes. However, obstacles for such analyses are study-specific biases or batch effects in the original data. Here we present Immuno-Navigator, a batch-corrected gene expression and coexpression database for 24 cell types of the mouse immune system. We systematically removed batch effects from the underlying gene expression data and showed that this removal considerably improved the consistency between inferred correlations and prior knowledge. The data revealed widespread cell type-specific correlation of expression. Integrated analysis tools allow users to use this correlation of expression for the generation of hypotheses about biological networks and candidate regulators in specific cell types. We show several applications of Immuno-Navigator as examples. In one application we successfully predicted known regulators of importance in naturally occurring Treg cells from their expression correlation with a set of Treg-specific genes. For one high-scoring gene, integrin β8 (Itgb8), we confirmed an association between Itgb8 expression in forkhead box P3 (Foxp3)-positive T cells and Treg-specific epigenetic remodeling. Our results also suggest that the regulation of Treg-specific genes within Treg cells is relatively independent of Foxp3 expression, supporting recent results pointing to a Foxp3-independent component in the development of Treg cells. PMID:27078110

  3. Functional Analysis and Characterization of Differential Coexpression Networks

    PubMed Central

    Hsu, Chia-Lang; Juan, Hsueh-Fen; Huang, Hsuan-Cheng

    2015-01-01

    Differential coexpression analysis is emerging as a complement to conventional differential gene expression analysis. The identified differential coexpression links can be assembled into a differential coexpression network (DCEN) in response to environmental stresses or genetic changes. Differential coexpression analyses have been successfully used to identify condition-specific modules; however, the structural properties and biological significance of general DCENs have not been well investigated. Here, we analyzed two independent Saccharomyces cerevisiae DCENs constructed from large-scale time-course gene expression profiles in response to different situations. Topological analyses show that DCENs are tree-like networks possessing scale-free characteristics, but not small-world. Functional analyses indicate that differentially coexpressed gene pairs in DCEN tend to link different biological processes, achieving complementary or synergistic effects. Furthermore, the gene pairs lacking common transcription factors are sensitive to perturbation and hence lead to differential coexpression. Based on these observations, we integrated transcriptional regulatory information into DCEN and identified transcription factors that might cause differential coexpression by gain or loss of activation in response to different situations. Collectively, our results not only uncover the unique structural characteristics of DCEN but also provide new insights into interpretation of DCEN to reveal its biological significance and infer the underlying gene regulatory dynamics. PMID:26282208

  4. Rat Hepatocytes Weighted Gene Co-Expression Network Analysis Identifies Specific Modules and Hub Genes Related to Liver Regeneration after Partial Hepatectomy

    PubMed Central

    Zhou, Yun; Xu, Jiucheng; Liu, Yunqing; Li, Juntao; Chang, Cuifang; Xu, Cunshuan

    2014-01-01

    The recovery of liver mass is mainly mediated by proliferation of hepatocytes after 2/3 partial hepatectomy (PH) in rats. Studying the gene expression profiles of hepatocytes after 2/3 PH will be helpful to investigate the molecular mechanisms of liver regeneration (LR). We report here the first application of weighted gene co-expression network analysis (WGCNA) to analyze the biological implications of gene expression changes associated with LR. WGCNA identifies 12 specific gene modules and some hub genes from hepatocyte genome-scale microarray data in rat LR. The results suggest that upregulated MCM5 may promote hepatocyte proliferation during LR; BCL3 may play an important role by activating or inhibiting the NF-κB pathway; MAPK9 may play a permissible role in DNA replication by p38 MAPK inactivation in the hepatocyte proliferation stage. Thus, WGCNA can provide novel insight into understanding the molecular mechanisms of LR. PMID:24743545

  5. A Null Model for Pearson Coexpression Networks

    PubMed Central

    Gobbi, Andrea; Jurman, Giuseppe

    2015-01-01

    Gene coexpression networks inferred by correlation from high-throughput profiling such as microarray data represent simple but effective structures for discovering and interpreting linear gene relationships. In recent years, several approaches have been proposed to tackle the problem of deciding when the resulting correlation values are statistically significant. This is most crucial when the number of samples is small, yielding a non-negligible chance that even high correlation values are due to random effects. Here we introduce a novel hard thresholding solution based on the assumption that a coexpression network inferred by randomly generated data is expected to be empty. The threshold is theoretically derived by means of an analytic approach and, as a deterministic independent null model, it depends only on the dimensions of the starting data matrix, with assumptions on the skewness of the data distribution compatible with the structure of gene expression levels data. We show, on synthetic and array datasets, that the proposed threshold is effective in eliminating all false positive links, with an offsetting cost in terms of false negative detected edges. PMID:26030917

  6. A Gene Co-Expression Network in Whole Blood of Schizophrenia Patients Is Independent of Antipsychotic-Use and Enriched for Brain-Expressed Genes

    PubMed Central

    de Jong, Simone; Boks, Marco P. M.; Fuller, Tova F.; Strengman, Eric; Janson, Esther; de Kovel, Carolien G. F.; Ori, Anil P. S.; Vi, Nancy; Mulder, Flip; Blom, Jan Dirk; Glenthøj, Birte; Schubart, Chris D.; Cahn, Wiepke; Kahn, René S.; Horvath, Steve; Ophoff, Roel A.

    2012-01-01

    Despite large-scale genome-wide association studies (GWAS), the underlying genes for schizophrenia are largely unknown. Additional approaches are therefore required to identify the genetic background of this disorder. Here we report findings from a large gene expression study in peripheral blood of schizophrenia patients and controls. We applied a systems biology approach to genome-wide expression data from whole blood of 92 medicated and 29 antipsychotic-free schizophrenia patients and 118 healthy controls. We show that gene expression profiling in whole blood can identify twelve large gene co-expression modules associated with schizophrenia. Several of these disease related modules are likely to reflect expression changes due to antipsychotic medication. However, two of the disease modules could be replicated in an independent second data set involving antipsychotic-free patients and controls. One of these robustly defined disease modules is significantly enriched with brain-expressed genes and with genetic variants that were implicated in a GWAS study, which could imply a causal role in schizophrenia etiology. The most highly connected intramodular hub gene in this module (ABCF1), is located in, and regulated by the major histocompatibility (MHC) complex, which is intriguing in light of the fact that common allelic variants from the MHC region have been implicated in schizophrenia. This suggests that the MHC increases schizophrenia susceptibility via altered gene expression of regulatory genes in this network. PMID:22761806

  7. From SNP co-association to RNA co-expression: Novel insights into gene networks for intramuscular fatty acid composition in porcine

    PubMed Central

    2014-01-01

    Background Fatty acids (FA) play a critical role in energy homeostasis and metabolic diseases; in the context of livestock species, their profile also impacts on meat quality for healthy human consumption. Molecular pathways controlling lipid metabolism are highly interconnected and are not fully understood. Elucidating these molecular processes will aid technological development towards improvement of pork meat quality and increased knowledge of FA metabolism, underpinning metabolic diseases in humans. Results The results from genome-wide association studies (GWAS) across 15 phenotypes were subjected to an Association Weight Matrix (AWM) approach to predict a network of 1,096 genes related to intramuscular FA composition in pigs. To identify the key regulators of FA metabolism, we focused on the minimal set of transcription factors (TF) that explored the majority of the network topology. Pathway and network analyses pointed towards a trio of TF as key regulators of FA metabolism: NCOA2, FHL2 and EP300. Promoter sequence analyses confirmed that these TF have binding sites for some well-known regulators of lipid and carbohydrate metabolism. For the first time in a non-model species, some of the co-associations observed at the genetic level were validated through co-expression at the transcriptomic level based on real-time PCR of 40 genes in adipose tissue, and a further 55 genes in liver. In particular, liver expression of NCOA2 and EP300 differed between pig breeds (Iberian and Landrace) extreme in terms of fat deposition. Highly clustered co-expression networks in both liver and adipose tissues were observed. EP300 and NCOA2 showed centrality parameters above average in both networks. Over all genes, co-expression analyses confirmed 28.9% of the AWM predicted gene-gene interactions in liver and 33.0% in adipose tissue. The magnitude of this validation varied across genes, with up to 60.8% of the connections of NCOA2 in adipose tissue being validated via co-expression

  8. Transcriptome comparison and gene coexpression network analysis provide a systems view of citrus response to ‘Candidatus Liberibacter asiaticus’ infection

    PubMed Central

    2013-01-01

    Background Huanglongbing (HLB) is arguably the most destructive disease for the citrus industry. HLB is caused by infection of the bacterium, Candidatus Liberibacter spp. Several citrus GeneChip studies have revealed thousands of genes that are up- or down-regulated by infection with Ca. Liberibacter asiaticus. However, whether and how these host genes act to protect against HLB remains poorly understood. Results As a first step towards a mechanistic view of citrus in response to the HLB bacterial infection, we performed a comparative transcriptome analysis and found that a total of 21 Probesets are commonly up-regulated by the HLB bacterial infection. In addition, a number of genes are likely regulated specifically at early, late or very late stages of the infection. Furthermore, using Pearson correlation coefficient-based gene coexpression analysis, we constructed a citrus HLB response network consisting of 3,507 Probesets and 56,287 interactions. Genes involved in carbohydrate and nitrogen metabolic processes, transport, defense, signaling and hormone response were overrepresented in the HLB response network and the subnetworks for these processes were constructed. Analysis of the defense and hormone response subnetworks indicates that hormone response is interconnected with defense response. In addition, mapping the commonly up-regulated HLB responsive genes into the HLB response network resulted in a core subnetwork where transport plays a key role in the citrus response to the HLB bacterial infection. Moreover, analysis of a phloem protein subnetwork indicates a role for this protein and zinc transporters or zinc-binding proteins in the citrus HLB defense response. Conclusion Through integrating transcriptome comparison and gene coexpression network analysis, we have provided for the first time a systems view of citrus in response to the Ca. Liberibacter spp. infection causing HLB. PMID:23324561

  9. Protein Co-Expression Network Analysis (ProCoNA)

    SciTech Connect

    Gibbs, David L.; Baratt, Arie; Baric, Ralph; Kawaoka, Yoshihiro; Smith, Richard D.; Orwoll, Eric S.; Katze, Michael G.; Mcweeney, Shannon K.

    2013-06-01

    Biological networks are important for elucidating disease etiology due to their ability to model complex high dimensional data and biological systems. Proteomics provides a critical data source for such models, but currently lacks robust de novo methods for network construction, which could bring important insights in systems biology. We have evaluated the construction of network models using methods derived from weighted gene co-expression network analysis (WGCNA). We show that approximately scale-free peptide networks, composed of statistically significant modules, are feasible and biologically meaningful using two mouse lung experiments and one human plasma experiment. Within each network, peptides derived from the same protein are shown to have a statistically higher topological overlap and concordance in abundance, which is potentially important for inferring protein abundance. The module representatives, called eigenpeptides, correlate significantly with biological phenotypes. Furthermore, within modules, we find significant enrichment for biological function and known interactions (gene ontology and protein-protein interactions). Biological networks are important tools in the analysis of complex systems. In this paper we evaluate the application of weighted co-expression network analysis to quantitative proteomics data. Protein co-expression networks allow novel approaches for biological interpretation, quality control, inference of protein abundance, a framework for potentially resolving degenerate peptide-protein mappings, and a biomarker signature discovery.

  10. Gene coexpression measures in large heterogeneous samples using count statistics.

    PubMed

    Wang, Y X Rachel; Waterman, Michael S; Huang, Haiyan

    2014-11-18

    With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the "big data" challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance. PMID:25288767
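    The notion of counting local patterns of expression ranks can be illustrated with a deliberately simplified Python sketch. It slides a contiguous window of length k along two profiles and counts the windows in which both genes show the same rank ordering. This is not the statistics proposed in the paper, whose definitions are not given in the abstract; the function names, the window length k = 3 and the toy profiles are hypothetical.

```python
def rank_pattern(values):
    """Indices of `values` in increasing order (ties broken by position)."""
    return tuple(sorted(range(len(values)), key=lambda i: values[i]))


def local_pattern_count(x, y, k=3):
    """Count contiguous length-k windows where x and y share a rank pattern."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    matches = 0
    for start in range(len(x) - k + 1):
        if rank_pattern(x[start:start + k]) == rank_pattern(y[start:start + k]):
            matches += 1
    return matches


# Hypothetical profiles that agree only over the first three samples.
x = [1, 2, 3, 9, 8, 7]
y = [10, 20, 30, 5, 6, 7]
count = local_pattern_count(x, y, k=3)
```

    Because only rank orderings within each window are compared, a single extreme value cannot dominate the count, which is in the spirit of the robustness to outliers mentioned in the abstract.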

  11. Gene coexpression measures in large heterogeneous samples using count statistics

    PubMed Central

    Wang, Y. X. Rachel; Waterman, Michael S.; Huang, Haiyan

    2014-01-01

    With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the “big data” challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance. PMID:25288767

  12. Mining the bladder cancer-associated genes by an integrated strategy for the construction and analysis of differential co-expression networks

    PubMed Central

    2015-01-01

    Background Bladder cancer is the most common malignant tumor of the urinary system and it is a heterogeneous disease with both superficial and invasive growth. However, its aetiological agent is still unclear, and it is essential to find the key genes or modules that cause bladder cancer. Constructing differential co-expression networks (DCNs) from gene expression microarray datasets is an important method for investigating diseases, and relevant tools exist, such as the R packages 'WGCNA' and 'DCGL'. Results Employing an integrated strategy, 36 up-regulated differentially expressed genes (DEGs) and 356 down-regulated DEGs were selected. The main functions of those DEGs are cellular physiological process (24 up-regulated DEGs; 167 down-regulated DEGs) and cellular metabolism (19 up-regulated DEGs; 104 down-regulated DEGs). The up-regulated DEGs are mainly involved in the pathways related to "metabolism". By comparing two DCNs between the normal and cancer states, we found substantial changes in hub genes and topological structure, which suggest that the modules of the two DCNs differ considerably. In particular, we screened some hub genes of a differential subnetwork between the normal and the cancer states and then performed bioinformatics analysis on them. Conclusions Through constructing and analyzing two differential co-expression networks at different states using the screened DEGs, we found some hub genes associated with bladder cancer. The results of the bioinformatics analysis for those hub genes will support biological experiments and the further treatment of bladder cancer. PMID:25707808

  13. Network-Based Identification of Biomarkers Coexpressed with Multiple Pathways

    PubMed Central

    Guo, Nancy Lan; Wan, Ying-Wooi

    2014-01-01

    Unraveling complex molecular interactions and networks and incorporating clinical information in modeling will present a paradigm shift in molecular medicine. Embedding biological relevance via modeling molecular networks and pathways has become increasingly important for biomarker identification in cancer susceptibility and metastasis studies. Here, we give a comprehensive overview of computational methods used for biomarker identification, and provide a performance comparison of several network models used in studies of cancer susceptibility, disease progression, and prognostication. Specifically, we evaluated implication networks, Boolean networks, Bayesian networks, and Pearson’s correlation networks in constructing gene coexpression networks for identifying lung cancer diagnostic and prognostic biomarkers. The results show that implication networks, implemented in Genet package, identified sets of biomarkers that generated an accurate prediction of lung cancer risk and metastases; meanwhile, implication networks revealed more biologically relevant molecular interactions than Boolean networks, Bayesian networks, and Pearson’s correlation networks when evaluated with MSigDB database. PMID:25392692

  14. Integrating mRNA and miRNA Weighted Gene Co-Expression Networks with eQTLs in the Nucleus Accumbens of Subjects with Alcohol Dependence

    PubMed Central

    Blevins, Tana; Aliev, Fazil; Adkins, Amy; Hack, Laura; Bigdeli, Tim; D. van der Vaart, Andrew; Web, Bradley Todd; Bacanu, Silviu-Alin; Kalsi, Gursharan; Kendler, Kenneth S.; Miles, Michael F.; Dick, Danielle; Riley, Brien P.; Dumur, Catherine; Vladimirov, Vladimir I.

    2015-01-01

    Alcohol consumption is known to lead to gene expression changes in the brain. After performing weighted gene co-expression network analyses (WGCNA) on genome-wide mRNA and microRNA (miRNA) expression in Nucleus Accumbens (NAc) of subjects with alcohol dependence (AD; N = 18) and of matched controls (N = 18), six mRNA and three miRNA modules significantly correlated with AD were identified (Bonferroni-adj. p ≤ 0.05). Cell-type-specific transcriptome analyses revealed two of the mRNA modules to be enriched for neuronal specific marker genes and downregulated in AD, whereas the remaining four mRNA modules were enriched for astrocyte and microglial specific marker genes and upregulated in AD. Gene set enrichment analysis demonstrated that neuronal specific modules were enriched for genes involved in oxidative phosphorylation, mitochondrial dysfunction and MAPK signaling. Glial-specific modules were predominantly enriched for genes involved in processes related to immune functions, i.e. cytokine signaling (all adj. p ≤ 0.05). In mRNA and miRNA modules, 461 and 25 candidate hub genes were identified, respectively. In contrast to the expected biological functions of miRNAs, correlation analyses between mRNA and miRNA hub genes revealed a higher number of positive than negative correlations (χ² test p ≤ 0.0001). Integration of hub gene expression with genome-wide genotypic data resulted in 591 mRNA cis-eQTLs and 62 miRNA cis-eQTLs. mRNA cis-eQTLs were significantly enriched for AD diagnosis and AD symptom counts (adj. p = 0.014 and p = 0.024, respectively) in AD GWAS signals in a large, independent genetic sample from the Collaborative Study on Genetics of Alcohol (COGA). In conclusion, our study identified putative gene network hubs coordinating mRNA and miRNA co-expression changes in the NAc of AD subjects, and our genetic (cis-eQTL) analysis provides novel insights into the etiological mechanisms of AD. PMID:26381263
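
    WGCNA, as used in the study above, builds its network from a soft-thresholded correlation matrix. The following is a minimal sketch of that one step for an unsigned network, assuming a samples × genes expression matrix. The function names and the power β = 6 are illustrative choices, not values reported in this abstract, and module detection is not shown.

```python
import numpy as np

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)|**beta.

    expr: array of shape (samples, genes). beta is an assumed example value.
    """
    expr = np.asarray(expr, dtype=float)
    corr = np.corrcoef(expr, rowvar=False)   # genes x genes Pearson correlations
    adjacency = np.abs(corr) ** beta         # soft thresholding keeps weights in [0, 1]
    np.fill_diagonal(adjacency, 0.0)         # ignore self-connections
    return adjacency

def connectivity(adjacency):
    """Sum of connection weights per gene; high values indicate candidate hubs."""
    return adjacency.sum(axis=1)
```

    Raising the absolute correlation to a power suppresses weak correlations without imposing a hard cut-off, so every gene pair keeps a weight and hub genes can be ranked by their summed connectivity.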

  15. Integrating mRNA and miRNA Weighted Gene Co-Expression Networks with eQTLs in the Nucleus Accumbens of Subjects with Alcohol Dependence.

    PubMed

    Mamdani, Mohammed; Williamson, Vernell; McMichael, Gowon O; Blevins, Tana; Aliev, Fazil; Adkins, Amy; Hack, Laura; Bigdeli, Tim; van der Vaart, Andrew D; Web, Bradley Todd; Bacanu, Silviu-Alin; Kalsi, Gursharan; Kendler, Kenneth S; Miles, Michael F; Dick, Danielle; Riley, Brien P; Dumur, Catherine; Vladimirov, Vladimir I

    2015-01-01

    Alcohol consumption is known to lead to gene expression changes in the brain. After performing weighted gene co-expression network analyses (WGCNA) on genome-wide mRNA and microRNA (miRNA) expression in Nucleus Accumbens (NAc) of subjects with alcohol dependence (AD; N = 18) and of matched controls (N = 18), six mRNA and three miRNA modules significantly correlated with AD were identified (Bonferroni-adj. p ≤ 0.05). Cell-type-specific transcriptome analyses revealed two of the mRNA modules to be enriched for neuronal specific marker genes and downregulated in AD, whereas the remaining four mRNA modules were enriched for astrocyte and microglial specific marker genes and upregulated in AD. Gene set enrichment analysis demonstrated that neuronal specific modules were enriched for genes involved in oxidative phosphorylation, mitochondrial dysfunction and MAPK signaling. Glial-specific modules were predominantly enriched for genes involved in processes related to immune functions, i.e. cytokine signaling (all adj. p ≤ 0.05). In mRNA and miRNA modules, 461 and 25 candidate hub genes were identified, respectively. In contrast to the expected biological functions of miRNAs, correlation analyses between mRNA and miRNA hub genes revealed a higher number of positive than negative correlations (χ² test p ≤ 0.0001). Integration of hub gene expression with genome-wide genotypic data resulted in 591 mRNA cis-eQTLs and 62 miRNA cis-eQTLs. mRNA cis-eQTLs were significantly enriched for AD diagnosis and AD symptom counts (adj. p = 0.014 and p = 0.024, respectively) in AD GWAS signals in a large, independent genetic sample from the Collaborative Study on Genetics of Alcohol (COGA). In conclusion, our study identified putative gene network hubs coordinating mRNA and miRNA co-expression changes in the NAc of AD subjects, and our genetic (cis-eQTL) analysis provides novel insights into the etiological mechanisms of AD. PMID:26381263

  16. Understanding the progression of atherosclerosis through gene profiling and co-expression network analysis in Apob(tm2Sgy)Ldlr(tm1Her) double knockout mice.

    PubMed

    Deshpande, Vrushali; Sharma, Ankit; Mukhopadhyay, Rupak; Thota, Lakshmi Narasimha Rao; Ghatge, Madankumar; Vangala, Rajani Kanth; Kakkar, Vijay V; Mundkur, Lakshmi

    2016-06-01

    The objective of the study was to gain molecular insights into the progression of atherosclerosis in Apob(tm2Sgy)Ldlr(tm1Her) mice, using transcriptome profiles. Weighted gene co-expression network analysis (WGCNA) and time course analysis using limma were used to study disease progression from 0 to 20 weeks. Five co-expression modules were identified by WGCNA using the expression values of 2153 genes. Genes associated with autophagy, endoplasmic reticulum stress, inflammation and lipid metabolism were differentially expressed at early stages of atherosclerosis. Time course analysis highlighted activation of inflammatory gene signaling at 4 weeks, cell proliferation and calcification at 8 weeks, amyloid-like structures and oxidative stress at 14 weeks and enhanced production of inflammatory cytokines at 20 weeks. Our results suggest that maximum gene perturbations occur during early atherosclerosis, which could be the danger signals associated with subclinical disease. Understanding these genes and associated pathways can help improve diagnostic and therapeutic targets for atherosclerosis. PMID:27133569

  17. Map3k1, Il6st, Gzmk, and Hspb3 gene coexpression network in the mechanism of freezing reaction in mice.

    PubMed

    Kondaurova, Elena M; Naumenko, Vladimir S; Sinyakova, Nadezda A; Kulikov, Alexander V

    2011-02-01

    Freezing reaction (catalepsy) is a natural passive defensive strategy in animals. An exaggerated form of catalepsy is a symptom of grave brain dysfunction. Catalepsy in mice was shown to be linked to the Map3k1, Il6st, Gzmk, and Hspb3 genes as potential candidates for a high predisposition to catalepsy. The study sought to test the hypothesis of an association between catalepsy and expression of these genes in the brain. The genes' mRNA levels were measured in the hypothalamus, hippocampus, frontal cortex, striatum, and midbrain of the catalepsy-resistant AKR/J strain and the catalepsy-prone strains CBA/Lac, ASC (antidepressant-sensitive cataleptic) and the congenic line AKR.CBA-D13M76C. No association between expression of any of the investigated genes and predisposition to catalepsy was found. At the same time, multivariate analysis revealed interactions among the expressions of the Map3k1, Il6st, Gzmk, and Hspb3 genes in the brain structures. A factor analysis of all variables produced two independent factors explaining 76.2% of the total variance. The catalepsy-resistant AKR strain was distinguished from the catalepsy-prone strains CBA, ASC, and AKR.CBA-D13M76C by factor 1. It was suggested that a high predisposition to catalepsy in mice can be defined by the Map3k1, Il6st, Gzmk, and Hspb3 genes' coexpression network. PMID:21162133

  18. Discriminative gene co-expression network analysis uncovers novel modules involved in the formation of phosphate deficiency-induced root hairs in Arabidopsis

    PubMed Central

    Salazar-Henao, Jorge E.; Lin, Wen-Dar; Schmidt, Wolfgang

    2016-01-01

    Cell fate and differentiation in the Arabidopsis root epidermis are genetically defined but remain plastic to environmental signals such as limited availability of inorganic phosphate (Pi). Root hairs of Pi-deficient plants are more frequent and longer than those of plants grown under Pi-replete conditions. To dissect genes involved in Pi deficiency-induced root hair morphogenesis, we constructed a co-expression network of Pi-responsive genes against a customized database that was assembled from experiments in which differentially expressed genes that encode proteins with validated functions in root hair development were over-represented. To further filter out less relevant genes, we combined this procedure with a search for common cis-regulatory elements in the promoters of the selected genes. In addition to well-described players and processes such as auxin signalling and modifications of primary cell walls, we discovered several novel aspects in the biology of root hairs induced by Pi deficiency, including cell cycle control, putative plastid-to-nucleus signalling, pathogen defence, reprogramming of cell wall-related carbohydrate metabolism, and chromatin remodelling. This approach allows the discovery of novel aspects of a biological process from transcriptional profiles with high sensitivity and accuracy. PMID:27220366

  19. Discriminative gene co-expression network analysis uncovers novel modules involved in the formation of phosphate deficiency-induced root hairs in Arabidopsis.

    PubMed

    Salazar-Henao, Jorge E; Lin, Wen-Dar; Schmidt, Wolfgang

    2016-01-01

    Cell fate and differentiation in the Arabidopsis root epidermis are genetically defined but remain plastic to environmental signals such as limited availability of inorganic phosphate (Pi). Root hairs of Pi-deficient plants are more frequent and longer than those of plants grown under Pi-replete conditions. To dissect genes involved in Pi deficiency-induced root hair morphogenesis, we constructed a co-expression network of Pi-responsive genes against a customized database that was assembled from experiments in which differentially expressed genes that encode proteins with validated functions in root hair development were over-represented. To further filter out less relevant genes, we combined this procedure with a search for common cis-regulatory elements in the promoters of the selected genes. In addition to well-described players and processes such as auxin signalling and modifications of primary cell walls, we discovered several novel aspects in the biology of root hairs induced by Pi deficiency, including cell cycle control, putative plastid-to-nucleus signalling, pathogen defence, reprogramming of cell wall-related carbohydrate metabolism, and chromatin remodelling. This approach allows the discovery of novel aspects of a biological process from transcriptional profiles with high sensitivity and accuracy. PMID:27220366

  20. Anesthetic Propofol-Induced Gene Expression Changes in Patients Undergoing Coronary Artery Bypass Graft Surgery Based on Dynamical Differential Coexpression Network Analysis

    PubMed Central

    Huang, Li-Jun; Chen, Na-Mi

    2016-01-01

    We aimed to determine the influence of the anesthetic propofol on gene expression in patients treated by coronary artery bypass graft (CABG) surgery based on a differential coexpression network (DCN) and to further reveal the novel mechanisms of the cardioprotective effects of propofol. Firstly, we constructed the DCN for the disease condition based on the Pearson correlation coefficient (PCC) and weight value. Secondly, module inference was applied to search the DCN for modules with the same members but varied connectivity. Furthermore, we measured the statistical significance of the modules to select differential modules (DMs). Finally, the attract method was used for DM analysis to select key modules. Based on the δ value, 11928 edges and 2956 nodes were chosen to construct DCNs. A total of 29 seed genes were selected. Moreover, by quantifying connectivity changes in shared gene modules across different conditions, 8 DMs with higher connectivity dynamics were identified. Then, we extracted key modules using the attract method; there were 8 key modules, and the top 3 were modules 1, 2, and 3. Furthermore, GCG, PPY, and PON1 were the initial seed genes of these 3 key modules, respectively. Accordingly, GCG and PON1 might play important roles in the cardioprotective effects of propofol during CABG. PMID:27437027
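
    The abstract does not define its weight value or δ, so the following is only a sketch of one common way to build a differential coexpression network from Pearson correlations. It assumes that an edge is kept when the absolute change in PCC between two conditions reaches a threshold `delta`; the function name, the threshold, and the gene labels are hypothetical.

```python
import numpy as np

def differential_coexpression_edges(expr_a, expr_b, genes, delta=0.8):
    """List gene pairs whose Pearson correlation changes by at least delta.

    expr_a, expr_b: arrays of shape (samples, genes) for two conditions.
    The |PCC_a - PCC_b| >= delta rule is an assumed, simplified definition.
    """
    expr_a = np.asarray(expr_a, dtype=float)
    expr_b = np.asarray(expr_b, dtype=float)
    if expr_a.shape[1] != len(genes) or expr_b.shape[1] != len(genes):
        raise ValueError("each expression matrix needs one column per gene")
    corr_a = np.corrcoef(expr_a, rowvar=False)
    corr_b = np.corrcoef(expr_b, rowvar=False)
    diff = np.abs(corr_a - corr_b)
    edges = []
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            if diff[i, j] >= delta:          # NaN (constant gene) never passes
                edges.append((genes[i], genes[j], float(diff[i, j])))
    return edges
```

    Modules can then be sought among the retained edges; that step, and the attract method mentioned in the abstract, are outside the scope of this sketch.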

  1. Anesthetic Propofol-Induced Gene Expression Changes in Patients Undergoing Coronary Artery Bypass Graft Surgery Based on Dynamical Differential Coexpression Network Analysis.

    PubMed

    Yu, Da; Huang, Li-Jun; Chen, Na-Mi

    2016-01-01

    We aimed to determine the influence of the anesthetic propofol on gene expression in patients treated by coronary artery bypass graft (CABG) surgery based on a differential coexpression network (DCN) and to further reveal the novel mechanisms of the cardioprotective effects of propofol. Firstly, we constructed the DCN for the disease condition based on the Pearson correlation coefficient (PCC) and weight value. Secondly, module inference was applied to search the DCN for modules with the same members but varied connectivity. Furthermore, we measured the statistical significance of the modules to select differential modules (DMs). Finally, the attract method was used for DM analysis to select key modules. Based on the δ value, 11928 edges and 2956 nodes were chosen to construct DCNs. A total of 29 seed genes were selected. Moreover, by quantifying connectivity changes in shared gene modules across different conditions, 8 DMs with higher connectivity dynamics were identified. Then, we extracted key modules using the attract method; there were 8 key modules, and the top 3 were modules 1, 2, and 3. Furthermore, GCG, PPY, and PON1 were the initial seed genes of these 3 key modules, respectively. Accordingly, GCG and PON1 might play important roles in the cardioprotective effects of propofol during CABG. PMID:27437027

  2. Genes Frequently Coexpressed with Hoxc8 Provide Insight into the Discovery of Target Genes.

    PubMed

    Kalyani, Ruthala; Lee, Ji-Yeon; Min, Hyehyun; Yoon, Heejei; Kim, Myoung Hee

    2016-05-31

    Identifying Hoxc8 target genes is at the crux of understanding the Hoxc8-mediated regulatory networks underlying its roles during development. However, identification of these genes remains difficult due to intrinsic factors of Hoxc8, such as low DNA binding specificity, context-dependent regulation, and unknown cofactors. Therefore, as an alternative, the present study attempted to test whether the roles of Hoxc8 could be inferred by simply analyzing genes frequently coexpressed with Hoxc8, and whether these genes include putative target genes. Using archived gene expression datasets in which Hoxc8 was differentially expressed, we identified a total of 567 genes that were positively coexpressed with Hoxc8 in at least four out of eight datasets. Among these, 23 genes were coexpressed in six datasets. Gene sets associated with extracellular matrix and cell adhesion were most significantly enriched, followed by gene sets for skeletal system development, morphogenesis, cell motility, and transcriptional regulation. In particular, transcriptional regulators, including paralogs of Hoxc8, known Hox co-factors, and transcriptional remodeling factors were enriched. We randomly selected Adam19, Ptpn13, Prkd1, Tgfbi, and Aldh1a3, and validated their coexpression in mouse embryonic tissues and cell lines following TGF-β2 treatment or ectopic Hoxc8 expression. Except for Aldh1a3, all genes showed concordant expression with that of Hoxc8, suggesting that the coexpressed genes might include direct or indirect target genes. Collectively, we suggest that the coexpressed genes provide a resource for constructing Hoxc8-mediated regulatory networks. PMID:27025388
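
    The selection rule described above (genes coexpressed with Hoxc8 in at least four of eight datasets) amounts to a frequency count across datasets. A minimal sketch, assuming each dataset has already been reduced to a collection of positively coexpressed gene names; the function name and the placeholder gene labels in the usage example are hypothetical:

```python
from collections import Counter

def frequently_coexpressed(datasets, min_datasets=4):
    """Return {gene: number of datasets} for genes seen in >= min_datasets.

    datasets: iterable of collections of gene names, one per dataset.
    """
    counts = Counter()
    for gene_list in datasets:
        counts.update(set(gene_list))   # count each gene once per dataset
    return {gene: n for gene, n in counts.items() if n >= min_datasets}

# Usage with made-up placeholder genes across eight toy datasets
toy = [{"geneA", "geneB"}, {"geneA"}, {"geneA", "geneC"}, {"geneA", "geneB"},
       {"geneB"}, {"geneC"}, {"geneB"}, set()]
print(frequently_coexpressed(toy))
```

    Converting each dataset to a set first prevents a gene that appears twice in one dataset from being counted twice.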

  3. Genes Frequently Coexpressed with Hoxc8 Provide Insight into the Discovery of Target Genes

    PubMed Central

    Kalyani, Ruthala; Lee, Ji-Yeon; Min, Hyehyun; Yoon, Heejei; Kim, Myoung Hee

    2016-01-01

    Identifying Hoxc8 target genes is at the crux of understanding the Hoxc8-mediated regulatory networks underlying its roles during development. However, identification of these genes remains difficult due to intrinsic factors of Hoxc8, such as low DNA binding specificity, context-dependent regulation, and unknown cofactors. Therefore, as an alternative, the present study attempted to test whether the roles of Hoxc8 could be inferred by simply analyzing genes frequently coexpressed with Hoxc8, and whether these genes include putative target genes. Using archived gene expression datasets in which Hoxc8 was differentially expressed, we identified a total of 567 genes that were positively coexpressed with Hoxc8 in at least four out of eight datasets. Among these, 23 genes were coexpressed in six datasets. Gene sets associated with extracellular matrix and cell adhesion were most significantly enriched, followed by gene sets for skeletal system development, morphogenesis, cell motility, and transcriptional regulation. In particular, transcriptional regulators, including paralogs of Hoxc8, known Hox co-factors, and transcriptional remodeling factors were enriched. We randomly selected Adam19, Ptpn13, Prkd1, Tgfbi, and Aldh1a3, and validated their coexpression in mouse embryonic tissues and cell lines following TGF-β2 treatment or ectopic Hoxc8 expression. Except for Aldh1a3, all genes showed concordant expression with that of Hoxc8, suggesting that the coexpressed genes might include direct or indirect target genes. Collectively, we suggest that the coexpressed genes provide a resource for constructing Hoxc8-mediated regulatory networks. PMID:27025388

  4. Transcriptional modules related to hepatocellular carcinoma survival: coexpression network analysis.

    PubMed

    Xu, Xinsen; Zhou, Yanyan; Miao, Runchen; Chen, Wei; Qu, Kai; Pang, Qing; Liu, Chang

    2016-06-01

    We performed weighted gene coexpression network analysis (WGCNA) to gain insights into the molecular aspects of hepatocellular carcinoma (HCC). Raw microarray datasets (including 488 samples) were downloaded from the Gene Expression Omnibus (GEO) website. Data were normalized using the RMA algorithm. We utilized the WGCNA to identify the coexpressed genes (modules) after non-specific filtering. Correlation and survival analyses were conducted using the modules, and gene ontology (GO) enrichment was applied to explore the possible mechanisms. Eight distinct modules were identified by the WGCNA. Pink and red modules were associated with liver function, whereas turquoise and black modules were inversely correlated with tumor staging. Poor outcomes were found in the low expression group in the turquoise module and in the high expression group in the red module. In addition, GO enrichment analysis suggested that inflammation, immune, virus-related, and interferon-mediated pathways were enriched in the turquoise module. Several potential biomarkers, such as cyclin-dependent kinase 1 (CDK1), topoisomerase 2α (TOP2A), and serpin peptidase inhibitor clade C (antithrombin) member 1 (SERPINC1), were also identified. In conclusion, gene signatures identified from the genome-based assays could contribute to HCC stratification. WGCNA was able to identify significant groups of genes associated with cancer prognosis. PMID:27052251

  5. Weighted gene co-expression network analysis of colorectal cancer liver metastasis genome sequencing data and screening of anti-metastasis drugs.

    PubMed

    Gao, Bo; Shao, Qin; Choudhry, Hani; Marcus, Victoria; Dong, Kung; Ragoussis, Jiannis; Gao, Zu-Hua

    2016-09-01

    Approximately 9% of cancer-related deaths are caused by colorectal cancer (CRC). CRC patients are prone to liver metastasis, which is the most important cause of the high CRC mortality rate. Understanding the molecular mechanism of CRC liver metastasis could help us to find novel targets for the effective treatment of this deadly disease. Using weighted gene co-expression network analysis on the sequencing data of CRC with and without metastasis, we identified 5 colorectal cancer liver metastasis-related modules, which were labeled as brown, blue, grey, yellow and turquoise. In the brown module, which represents the metastatic tumor in the liver, gene ontology (GO) analysis revealed functions including the G-protein coupled receptor protein signaling pathway, epithelial cell differentiation and cell surface receptor linked signal transduction. In the blue module, which represents the primary CRC that has metastasized, GO analysis showed that the genes were mainly enriched in GO terms including G-protein coupled receptor protein signaling pathway, cell surface receptor linked signal transduction, and negative regulation of cell differentiation. In the yellow and turquoise modules, which represent the primary non-metastatic CRC, 13 downregulated CRC liver metastasis-related candidate miRNAs were identified (e.g. hsa-miR-204, hsa-miR-455, etc.). Furthermore, analyzing the DrugBank database and mining the literature identified 25 and 12 candidate drugs that could potentially block the metastatic processes of the primary tumor and inhibit the progression of metastatic tumors in the liver, respectively. Data generated from this study not only further our understanding of the genetic alterations that drive the metastatic process, but also guide the development of molecular-targeted therapy of colorectal cancer liver metastasis. PMID:27571956

  6. ImmuCo: a database of gene co-expression in immune cells.

    PubMed

    Wang, Pingzhang; Qi, Huiying; Song, Shibin; Li, Shuang; Huang, Ningyu; Han, Wenling; Ma, Dalong

    2015-01-01

    Current gene co-expression databases and correlation networks do not support cell-specific analysis. Gene co-expression and expression correlation are subtly different phenomena, although both are likely to be functionally significant. Here, we report a new database, ImmuCo (http://immuco.bjmu.edu.cn), which is a cell-specific database that contains information about gene co-expression in immune cells, identifying co-expression and correlation between any two genes. The strength of co-expression of queried genes is indicated by signal values and detection calls, whereas expression correlation and strength are reflected by Pearson correlation coefficients. A scatter plot of the signal values is provided to directly illustrate the extent of co-expression and correlation. In addition, the database allows the analysis of cell-specific gene expression profile across multiple experimental conditions and can generate a list of genes that are highly correlated with the queried genes. Currently, the database covers 18 human cell groups and 10 mouse cell groups, including 20,283 human genes and 20,963 mouse genes. More than 8.6 × 10^8 and 7.4 × 10^8 probe set combinations are provided for querying each human and mouse cell group, respectively. Sample applications support the distinctive advantages of the database. PMID:25326331

  7. Co-expression network-based analysis of hippocampal expression data associated with Alzheimer's disease using a novel algorithm

    PubMed Central

    YUE, HONG; YANG, BO; YANG, FANG; HU, XIAO-LI; KONG, FAN-BIN

    2016-01-01

    Recent progress in bioinformatics has facilitated the clarification of biological processes associated with complex diseases. Numerous methods of co-expression analysis have been proposed for use in the study of pairwise relationships among genes. In the present study, a combined network based on gene pairs was constructed following the conversion and combination of gene pair score values using a novel algorithm across multiple approaches. Three hippocampal expression profiles of patients with Alzheimer's disease (AD) and normal controls were extracted from the ArrayExpress database, and a total of 144 differentially expressed (DE) genes across multiple studies were identified by a rank product (RP) method. Five groups of co-expression gene pairs and five networks were identified and constructed using four existing methods [weighted gene co-expression network analysis (WGCNA), empirical Bayesian (EB), differentially co-expressed genes and links (DCGL), search tool for the retrieval of interacting genes/proteins database (STRING)] and a novel rank-based algorithm with combined score, respectively. Topological analysis indicated that the co-expression network constructed by the WGCNA method had the tendency to exhibit small-world characteristics, and the combined co-expression network was confirmed to be a scale-free network. Functional analysis of the co-expression gene pairs was conducted by Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. The co-expression gene pairs were mostly enriched in five pathways, namely proteasome, oxidative phosphorylation, Parkinson's disease, Huntington's disease and AD. This study provides a new perspective on co-expression analysis. Since different methods of analysis often present varying abilities, the novel combination algorithm may provide a more credible and robust outcome, and could be used to complement traditional co-expression analysis. PMID:27168792
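
    The rank product (RP) method mentioned above scores each gene by the geometric mean of its ranks across studies. The following sketch shows only that core computation, assuming a genes × studies matrix in which larger values mean stronger up-regulation. The function name is hypothetical, ties are broken by position, and the permutation-based significance step normally used with rank products is omitted.

```python
import numpy as np

def rank_product(scores):
    """Geometric mean of per-study ranks for each gene (rank 1 = largest score).

    scores: array of shape (genes, studies), e.g. log fold changes.
    """
    scores = np.asarray(scores, dtype=float)
    # Double argsort converts values to 0-based ranks within each study column;
    # negating the scores makes the largest value receive rank 1.
    ranks = np.argsort(np.argsort(-scores, axis=0, kind="stable"), axis=0, kind="stable") + 1
    return np.exp(np.log(ranks).mean(axis=1))
```

    Genes with a small rank product are consistently near the top of the ranking in every study, which is why the measure suits combining several independent expression profiles.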

  8. Genomic Complexity Places Less Restrictions on the Evolution of Young Coexpression Networks than Protein–Protein Interactions

    PubMed Central

    Wei, Wen; Jin, Yan-Ting; Du, Meng-Ze; Wang, Ju; Rao, Nini; Guo, Feng-Biao

    2016-01-01

    The differences in evolutionary patterns of young protein–protein interactions (PPIs) among distinct species have long been a puzzle. However, based on our genome-wide analysis of available integrated experimental data, we confirm that young genes preferentially integrate into ancestral PPI networks, and that this pattern is consistent across all six model organisms, which differ widely in phenotypic complexity. We demonstrate that the level of restrictions placed on the evolution of biological networks declines with a decrease of phenotypic complexity. Compared with young PPI networks, new co-expression links have fewer evolutionary restrictions, so a young gene with a high probability of being coexpressed with other young genes emerges relatively frequently in the four simpler genomes among the six studied. However, such young–young coexpression is not favorable for a young gene evolving into a coexpression hub, so the coexpression pattern could gradually decline. To explain this apparent contradiction, we suggest that young genes that are initially peripheral to networks are temporarily coexpressed with other young genes, driving functional evolution because of low selective pressure. However, as the expression levels of genes increase and they gradually develop a greater effect on fitness, young genes start to be coexpressed more with members of ancestral networks and less with other young genes. Our findings provide new insights into the evolution of biological networks. PMID:27521813

  9. Pathways of lipid metabolism in marine algae, co-expression network, bottlenecks and candidate genes for enhanced production of EPA and DHA in species of Chromista.

    PubMed

    Mühlroth, Alice; Li, Keshuai; Røkke, Gunvor; Winge, Per; Olsen, Yngvar; Hohmann-Marriott, Martin F; Vadstein, Olav; Bones, Atle M

    2013-11-01

    The importance of n-3 long chain polyunsaturated fatty acids (LC-PUFAs) for human health has received more focus in the last decades, and the global consumption of n-3 LC-PUFA has increased. Seafood, the natural n-3 LC-PUFA source, is harvested beyond a sustainable capacity, and it is therefore imperative to develop alternative n-3 LC-PUFA sources for both eicosapentaenoic acid (EPA, 20:5n-3) and docosahexaenoic acid (DHA, 22:6n-3). Genera of algae such as Nannochloropsis, Schizochytrium, Isochrysis and Phaeodactylum within the kingdom Chromista have received attention due to their ability to produce n-3 LC-PUFAs. Knowledge of LC-PUFA synthesis and its regulation in algae at the molecular level is fragmentary and represents a bottleneck for attempts to enhance the n-3 LC-PUFA levels for industrial production. In the present review, Phaeodactylum tricornutum has been used to exemplify the synthesis and compartmentalization of n-3 LC-PUFAs. Based on recent transcriptome data, a co-expression network of 106 genes involved in lipid metabolism has been created. Together with recent molecular biological and metabolic studies, a model pathway for n-3 LC-PUFA synthesis in P. tricornutum has been proposed, and is compared to industrialized species of Chromista. Limitations of the n-3 LC-PUFA synthesis by enzymes such as thioesterases, elongases, acyl-CoA synthetases and acyltransferases are discussed, and metabolic bottlenecks are hypothesized, such as the supply of acetyl-CoA and NADPH. Future industrialization will depend on optimization of chemical compositions and increased biomass production, which can be achieved by exploitation of the physiological potential, by selective breeding and by genetic engineering. PMID:24284429

  10. Pathways of Lipid Metabolism in Marine Algae, Co-Expression Network, Bottlenecks and Candidate Genes for Enhanced Production of EPA and DHA in Species of Chromista

    PubMed Central

    Mühlroth, Alice; Li, Keshuai; Røkke, Gunvor; Winge, Per; Olsen, Yngvar; Hohmann-Marriott, Martin F.; Vadstein, Olav; Bones, Atle M.

    2013-01-01

    The importance of n-3 long chain polyunsaturated fatty acids (LC-PUFAs) for human health has received more focus in the last decades, and the global consumption of n-3 LC-PUFA has increased. Seafood, the natural n-3 LC-PUFA source, is harvested beyond a sustainable capacity, and it is therefore imperative to develop alternative n-3 LC-PUFA sources for both eicosapentaenoic acid (EPA, 20:5n-3) and docosahexaenoic acid (DHA, 22:6n-3). Genera of algae such as Nannochloropsis, Schizochytrium, Isochrysis and Phaeodactylum within the kingdom Chromista have received attention due to their ability to produce n-3 LC-PUFAs. Knowledge of LC-PUFA synthesis and its regulation in algae at the molecular level is fragmentary and represents a bottleneck for attempts to enhance the n-3 LC-PUFA levels for industrial production. In the present review, Phaeodactylum tricornutum has been used to exemplify the synthesis and compartmentalization of n-3 LC-PUFAs. Based on recent transcriptome data, a co-expression network of 106 genes involved in lipid metabolism has been created. Together with recent molecular biological and metabolic studies, a model pathway for n-3 LC-PUFA synthesis in P. tricornutum has been proposed, and is compared to industrialized species of Chromista. Limitations of the n-3 LC-PUFA synthesis by enzymes such as thioesterases, elongases, acyl-CoA synthetases and acyltransferases are discussed, and metabolic bottlenecks are hypothesized, such as the supply of acetyl-CoA and NADPH. Future industrialization will depend on optimization of chemical compositions and increased biomass production, which can be achieved by exploitation of the physiological potential, by selective breeding and by genetic engineering. PMID:24284429

  11. Hi-C Chromatin Interaction Networks Predict Co-expression in the Mouse Cortex

    PubMed Central

    Hulsman, Marc; Lelieveldt, Boudewijn P. F.; de Ridder, Jeroen; Reinders, Marcel

    2015-01-01

    The three dimensional conformation of the genome in the cell nucleus influences important biological processes such as gene expression regulation. Recent studies have shown a strong correlation between chromatin interactions and gene co-expression. However, predicting gene co-expression from frequent long-range chromatin interactions remains challenging. We address this by characterizing the topology of the cortical chromatin interaction network using scale-aware topological measures. We demonstrate that based on these characterizations it is possible to accurately predict spatial co-expression between genes in the mouse cortex. Consistent with previous findings, we find that the chromatin interaction profile of a gene-pair is a good predictor of their spatial co-expression. However, the accuracy of the prediction can be substantially improved when chromatin interactions are described using scale-aware topological measures of the multi-resolution chromatin interaction network. We conclude that, for co-expression prediction, it is necessary to take into account different levels of chromatin interactions ranging from direct interaction between genes (i.e. small-scale) to chromatin compartment interactions (i.e. large-scale). PMID:25965262

  12. Differential Regulatory Analysis Based on Coexpression Network in Cancer Research.

    PubMed

    Li, Junyi; Li, Yi-Xue; Li, Yuan-Yuan

    2016-01-01

    With the rapid development of high-throughput techniques and the accumulation of big transcriptomic data, many computational methods and algorithms, such as differential analysis and network analysis, have been proposed to explore genome-wide gene expression characteristics. These efforts aim to transform underlying genomic information into valuable knowledge in biological and medical research fields. Recently, numerous integrative research methods have been dedicated to interpreting the development and progression of neoplastic diseases, and differential regulatory analysis (DRA) based on gene coexpression network (GCN) increasingly serves as a robust complement to regular differential expression analysis in revealing regulatory functions of cancer related genes, such as evading growth suppressors and resisting cell death. Differential regulatory analysis based on GCN is promising and plays an essential role in discovering the system properties of carcinogenesis features. Here we briefly review the paradigm of differential regulatory analysis based on GCN. We also focus on the applications of differential regulatory analysis based on GCN in cancer research and point out that DRA is necessary and valuable for revealing the underlying molecular mechanisms in large-scale carcinogenesis studies. PMID:27597964

  13. Differential Regulatory Analysis Based on Coexpression Network in Cancer Research

    PubMed Central

    2016-01-01

    With the rapid development of high-throughput techniques and the accumulation of big transcriptomic data, many computational methods and algorithms, such as differential analysis and network analysis, have been proposed to explore genome-wide gene expression characteristics. These efforts aim to transform underlying genomic information into valuable knowledge in biological and medical research fields. Recently, numerous integrative research methods have been dedicated to interpreting the development and progression of neoplastic diseases, and differential regulatory analysis (DRA) based on gene coexpression network (GCN) increasingly serves as a robust complement to regular differential expression analysis in revealing regulatory functions of cancer related genes, such as evading growth suppressors and resisting cell death. Differential regulatory analysis based on GCN is promising and plays an essential role in discovering the system properties of carcinogenesis features. Here we briefly review the paradigm of differential regulatory analysis based on GCN. We also focus on the applications of differential regulatory analysis based on GCN in cancer research and point out that DRA is necessary and valuable for revealing the underlying molecular mechanisms in large-scale carcinogenesis studies. PMID:27597964

  14. ComPlEx: conservation and divergence of co-expression networks in A. thaliana, Populus and O. sativa

    PubMed Central

    2014-01-01

    Background Divergence in gene regulation has emerged as a key mechanism underlying species differentiation. Comparative analysis of co-expression networks across species can reveal conservation and divergence in the regulation of genes. Results We inferred co-expression networks of A. thaliana, Populus spp. and O. sativa using state-of-the-art methods based on mutual information and context likelihood of relatedness, and conducted a comprehensive comparison of these networks across a range of co-expression thresholds. In addition to quantifying gene-gene link and network neighbourhood conservation, we also applied recent advancements in network analysis to do cross-species comparisons of network properties such as scale free characteristics and gene centrality as well as network motifs. We found that in all species the networks emerged as scale free only above a certain co-expression threshold, and that the high-centrality genes upholding this organization tended to be conserved. Network motifs, in particular the feed-forward loop, were found to be significantly enriched in specific functional subnetworks but were much less conserved across species than gene centrality. Although individual gene-gene co-expression had massively diverged, up to ~80% of the genes still had a significantly conserved network neighbourhood. For genes with multiple predicted orthologs, about half had one ortholog with conserved regulation and another ortholog with diverged or non-conserved regulation. Furthermore, the most sequence similar ortholog was not the one with the most conserved gene regulation in over half of the cases. Conclusions We have provided a comprehensive analysis of gene regulation evolution in plants and built a web tool for Comparative analysis of Plant co-Expression networks (ComPlEx, http://complex.plantgenie.org/). The tool can be particularly useful for identifying the ortholog with the most conserved regulation among several sequence-similar alternatives and

  15. MIClique: An algorithm to identify differentially coexpressed disease gene subset from microarray data.

    PubMed

    Zhang, Huanping; Song, Xiaofeng; Wang, Huinan; Zhang, Xiaobai

    2009-01-01

    Computational analysis of microarray data has provided an effective way to identify disease-related genes. Traditional methods for selecting disease genes from microarray data, such as statistical tests, typically focus on genes that are differentially expressed between samples and prioritize each gene individually. These traditional methods might miss differentially coexpressed (DCE) gene subsets because they ignore the interaction between genes. In this paper, the MIClique algorithm is proposed to identify DCE gene subsets based on mutual information and clique analysis. Mutual information is used to measure the coexpression relationship between each pair of genes in two different kinds of samples. Clique analysis is a commonly used method in biological networks, where a clique generally represents a biological module of similar function. By applying the MIClique algorithm to real gene expression data, some DCE gene subsets that are correlated under one experimental condition but uncorrelated under another are detected from the graphs of the colon dataset and the leukemia dataset. PMID:20169000

  16. VTCdb: a gene co-expression database for the crop species Vitis vinifera (grapevine)

    PubMed Central

    2013-01-01

    Background Gene expression datasets in model plants such as Arabidopsis have contributed to our understanding of gene function and how a single underlying biological process can be governed by a diverse network of genes. The accumulation of publicly available microarray data encompassing a wide range of biological and environmental conditions has enabled the development of additional capabilities including gene co-expression analysis (GCA). GCA is based on the understanding that genes encoding proteins involved in similar and/or related biological processes may exhibit comparable expression patterns over a range of experimental conditions, developmental stages and tissues. We present an open access database for the investigation of gene co-expression networks within the cultivated grapevine, Vitis vinifera. Description The new gene co-expression database, VTCdb (http://vtcdb.adelaide.edu.au/Home.aspx), offers an online platform for transcriptional regulatory inference in the cultivated grapevine. Using condition-independent and condition-dependent approaches, grapevine co-expression networks were constructed using the latest publicly available microarray datasets from diverse experimental series, utilising the Affymetrix Vitis vinifera GeneChip (16 K) and the NimbleGen Grape Whole-genome microarray chip (29 K), thus making it possible to profile approximately 29,000 genes (95% of the predicted grapevine transcriptome). Applications available with the online platform include the use of gene names, probesets, modules or biological processes to query the co-expression networks, with the option to choose between Affymetrix or Nimblegen datasets and between multiple co-expression measures. Alternatively, the user can browse existing network modules using interactive network visualisation and analysis via CytoscapeWeb. To demonstrate the utility of the database, we present examples from three fundamental biological processes (berry development, photosynthesis and

  17. Co-expression analysis reveals a group of genes potentially involved in regulation of plant response to iron-deficiency.

    PubMed

    Li, Hua; Wang, Lei; Yang, Zhi Min

    2015-01-01

    Iron (Fe) is an essential element for plant growth and development. Iron deficiency results in abnormal metabolism from respiration to photosynthesis. Exploration of Fe-deficient responsive genes and their networks is critically important to understand molecular mechanisms leading to the plant adaptation to soil Fe-limitation. Co-expression genes are a cluster of genes that share a similar expression pattern and execute related biological functions at a stage of development or under a certain environmental condition. They may share a common regulatory mechanism. In this study, we investigated Fe-starvation-related co-expression genes from Arabidopsis. From the biological process GO annotation of TAIR (The Arabidopsis Information Resource), 180 iron-deficient responsive genes were detected. Using the ATTED-II database, we generated six gene co-expression networks. Among these, two modules of PYE and IRT1 were successfully constructed. There are 30 co-expression genes that are incorporated in the two modules (12 in PYE-module and 18 in IRT1-module). Sixteen of the co-expression genes were well characterized. The remaining genes (14) are poorly or not functionally identified with iron stress. Validation of the 14 genes using real-time PCR showed differential expression under iron-deficiency. Most of the co-expression genes (23/30) could be validated in pye and fit mutant plants with iron-deficiency. We further identified iron-responsive cis-elements upstream of the co-expression genes and found that 22 out of 30 genes contain the iron-responsive motif IDE1. Furthermore, some auxin and ethylene-responsive elements were detected in the promoters of the co-expression genes. These results suggest that some of the genes can be also involved in iron stress response through the phytohormone-responsive pathways. PMID:25300251

  18. Co-expression network analysis identifies transcriptional modules in the mouse liver.

    PubMed

    Liu, Wei; Ye, Hua

    2014-10-01

    The mouse liver transcriptome has been extensively studied but little is known about the global hepatic gene network of the mouse under normal physiological conditions. Understanding this will help reveal the transcriptional organization of the liver and elucidate its functional complexity. Here, weighted gene co-expression network analysis (WGCNA) was carried out to explore gene co-expression networks using large-scale microarray data from normal mouse livers. A total of 7,203 genes were parsed into 16 gene modules associated with protein catabolism, RNA processing, muscle contraction, transcriptional regulation, oxidation reduction, sterol biosynthesis, translation, fatty acid metabolism, immune response and others. The modules were organized into higher order co-expression groups. Hub genes in each module were found to be critical for module function. In sum, the analyses revealed the gene modular map of the mouse liver under normal physiological condition. These results provide a systems-level framework to help understand the complexity of the mouse liver at the molecular level, and should be beneficial in annotating uncharacterized genes. PMID:24816893
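
    As background for the WGCNA step mentioned in this record: an unsigned weighted co-expression network is commonly built by soft thresholding, a_ij = |cor(x_i, x_j)|^beta, and hub genes are then those with high summed connectivity. The sketch below illustrates only that construction; it is not the authors' pipeline, `beta=6` is an illustrative choice, and the expression values are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(x_i, x_j)| ** beta for every gene pair."""
    return {(g, h): abs(pearson(expr[g], expr[h])) ** beta
            for g, h in combinations(sorted(expr), 2)}


def connectivity(adjacency):
    """Sum of edge weights per gene; high values mark candidate hub genes."""
    k = {}
    for (g, h), w in adjacency.items():
        k[g] = k.get(g, 0.0) + w
        k[h] = k.get(h, 0.0) + w
    return k


# Invented toy expression profiles (gene -> values across five samples).
expr = {"A": [1, 2, 3, 4, 5], "B": [2, 4, 6, 8, 10], "D": [3, 1, 4, 1, 5]}
adj = soft_threshold_adjacency(expr)
k = connectivity(adj)
```

    Raising the correlation to a power keeps strong links near 1 while pushing weak ones toward 0, which is why no hard cut-off is needed before module detection.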

  19. Sharing and Specificity of Co-expression Networks across 35 Human Tissues

    PubMed Central

    Pierson, Emma; Koller, Daphne; Battle, Alexis; Mostafavi, Sara

    2015-01-01

    To understand the regulation of tissue-specific gene expression, the GTEx Consortium generated RNA-seq expression data for more than thirty distinct human tissues. This data provides an opportunity for deriving shared and tissue specific gene regulatory networks on the basis of co-expression between genes. However, a small number of samples are available for a majority of the tissues, and therefore statistical inference of networks in this setting is highly underpowered. To address this problem, we infer tissue-specific gene co-expression networks for 35 tissues in the GTEx dataset using a novel algorithm, GNAT, that uses a hierarchy of tissues to share data between related tissues. We show that this transfer learning approach increases the accuracy with which networks are learned. Analysis of these networks reveals that tissue-specific transcription factors are hubs that preferentially connect to genes with tissue specific functions. Additionally, we observe that genes with tissue-specific functions lie at the peripheries of our networks. We identify numerous modules enriched for Gene Ontology functions, and show that modules conserved across tissues are especially likely to have functions common to all tissues, while modules that are upregulated in a particular tissue are often instrumental to tissue-specific function. Finally, we provide a web tool, available at mostafavilab.stat.ubc.ca/GNAT, which allows exploration of gene function and regulation in a tissue-specific manner. PMID:25970446

  20. CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool

    PubMed Central

    Tzfadia, Oren; Diels, Tim; De Meyer, Sam; Vandepoele, Klaas; Aharoni, Asaph; Van de Peer, Yves

    2016-01-01

    Motivation: Comparative transcriptomics is a common approach in functional gene discovery efforts. It allows for finding conserved co-expression patterns between orthologous genes in closely related plant species, suggesting that these genes potentially share similar function and regulation. Several efficient co-expression-based tools have been commonly used in plant research but most of these pipelines are limited to data from model systems, which greatly limits their utility. Moreover, none of the existing pipelines allow plant researchers to make use of their own unpublished gene expression data for performing a comparative co-expression analysis and generating multi-species co-expression networks. Results: We introduce CoExpNetViz, a computational tool that uses a set of query or “bait” genes as an input (chosen by the user) and a minimum of one pre-processed gene expression dataset. The CoExpNetViz algorithm proceeds in three main steps: (i) for every bait gene submitted, co-expression values are calculated using mutual information and Pearson correlation coefficients, (ii) non-bait (or target) genes are grouped based on cross-species orthology, and (iii) output files are generated and results can be visualized as network graphs in Cytoscape. Availability: The CoExpNetViz tool is freely available both as a PHP web server (link: http://bioinformatics.psb.ugent.be/webtools/coexpr/) (implemented in C++) and as a Cytoscape plugin (implemented in Java). Both versions of the CoExpNetViz tool support LINUX and Windows platforms. PMID:26779228
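
    Steps (i) and (ii) of the procedure described in this record can be pictured with a small sketch. It is not the CoExpNetViz code: only Pearson correlation is used (the mutual information part is omitted), the signed cut-off `threshold` is an assumption, and the gene names, expression values and the `family_of` orthology map are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def bait_targets(expr, baits, threshold=0.8):
    """Step (i): for each bait, list non-bait genes whose correlation meets the cut-off."""
    return {
        bait: [g for g in sorted(expr)
               if g not in baits and pearson(expr[bait], expr[g]) >= threshold]
        for bait in sorted(baits)
    }


def group_by_family(targets, family_of):
    """Step (ii): collapse target genes into orthology families and record their baits."""
    grouped = {}
    for bait, genes in targets.items():
        for g in genes:
            grouped.setdefault(family_of.get(g, g), set()).add(bait)
    return grouped


# Hypothetical toy inputs.
expr = {"bait1": [1, 2, 3, 4, 5], "t1": [2, 4, 6, 8, 10],
        "t2": [3, 1, 4, 1, 5], "t3": [5, 4, 3, 2, 1]}
family_of = {"t1": "famX", "t3": "famX"}  # invented orthology assignment
targets = bait_targets(expr, {"bait1"})
families = group_by_family(targets, family_of)
```

    In this toy run only `t1` passes the cut-off for `bait1`; the anti-correlated `t3` is dropped because the sketch uses a signed threshold, which is a design choice of the example rather than something stated in the record.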

  1. Novel structural co-expression analysis linking the NPM1-associated ribosomal biogenesis network to chronic myelogenous leukemia

    PubMed Central

    Chan, Lawrence WC; Lin, Xihong; Yung, Godwin; Lui, Thomas; Chiu, Ya Ming; Wang, Fengfeng; Tsui, Nancy BY; Cho, William CS; Yip, SP; Siu, Parco M.; Wong, SC Cesar; Yung, Benjamin YM

    2015-01-01

    Co-expression analysis reveals useful dysregulation patterns of gene cooperativeness for understanding cancer biology and identifying new targets for treatment. We developed a structural strategy to identify co-expressed gene networks that are important for chronic myelogenous leukemia (CML). This strategy compared the distributions of expressional correlations between CML and normal states, and it identified a data-driven threshold to classify strongly co-expressed networks that had the best coherence with CML. Using this strategy, we found a transcriptome-wide reduction of co-expression connectivity in CML, reflecting potentially loosened molecular regulation. Conversely, when we focused on nucleophosmin 1 (NPM1) associated networks, NPM1 established more co-expression linkages with BCR-ABL pathways and ribosomal protein networks in CML than normal. This finding implicates a new role of NPM1 in conveying tumorigenic signals from the BCR-ABL oncoprotein to ribosome biogenesis, affecting cellular growth. Transcription factors may be regulators of the differential co-expression patterns between CML and normal. PMID:26205693

  2. Genetic architecture of wood properties based on association analysis and co-expression networks in white spruce.

    PubMed

    Lamara, Mebarek; Raherison, Elie; Lenz, Patrick; Beaulieu, Jean; Bousquet, Jean; MacKay, John

    2016-04-01

    Association studies are widely utilized to analyze complex traits but their ability to disclose genetic architectures is often limited by statistical constraints, and functional insights are usually minimal in nonmodel organisms like forest trees. We developed an approach to integrate association mapping results with co-expression networks. We tested single nucleotide polymorphisms (SNPs) in 2652 candidate genes for statistical associations with wood density, stiffness, microfibril angle and ring width in a population of 1694 white spruce trees (Picea glauca). Association mapping identified 229-292 genes per wood trait using a statistical significance level of P < 0.05 to maximize discovery. Over-representation of genes associated with nearly all traits was found in a xylem preferential co-expression group developed in independent experiments. A xylem co-expression network was reconstructed with 180 wood associated genes and several known MYB and NAC regulators were identified as network hubs. The network revealed a link between the gene PgNAC8, wood stiffness and microfibril angle, as well as considerable within-season variation for both genetic control of wood traits and gene expression. Trait associations were distributed throughout the network suggesting complex interactions and pleiotropic effects. Our findings indicate that integration of association mapping and co-expression networks enhances our understanding of complex wood traits. PMID:26619072

  3. A co-expression modules based gene selection for cancer recognition.

    PubMed

    Lu, Xinguo; Deng, Yong; Huang, Lei; Feng, Bingtao; Liao, Bo

    2014-12-01

    Gene expression profiles are used to recognize patient samples for cancer diagnosis and therapy. Gene selection is crucial to high recognition performance. In the usual gene selection methods the genes are considered as independent individuals and the correlation among genes is not used efficiently. Here, a co-expression modules based gene selection method for cancer recognition is proposed. First, in the cancer dataset a weighted correlation network is constructed according to the correlation between each pair of genes, different modules from this network are identified, and the significant modules are selected for further exploration. Second, based on these informative modules, information gain is applied to select the feature genes for cancer recognition. Then, using LOOCV, experiments with different classification algorithms are conducted and the results show that the proposed method achieves better classification accuracy than traditional gene selection methods. Finally, the biological significance of the co-expressed genes in specific modules was verified via gene ontology enrichment analysis. PMID:24440175

  4. Genome-Wide Tissue-Specific Gene Expression, Co-expression and Regulation of Co-expressed Genes in Adult Nematode Ascaris suum

    PubMed Central

    Rosa, Bruce A.; Jasmer, Douglas P.; Mitreva, Makedonka

    2014-01-01

    Background Caenorhabditis elegans has traditionally been used as a model for studying nematode biology, but its small size limits the ability of researchers to perform some experiments such as high-throughput tissue-specific gene expression studies. However, the dissection of individual tissues is possible in the parasitic nematode Ascaris suum due to its relatively large size. Here, we take advantage of the recent genome sequencing of Ascaris suum and the ability to physically dissect its separate tissues to produce wide-scale tissue-specific nematode RNA-seq datasets, including data on three non-reproductive tissues (head, pharynx, and intestine) in both male and female worms, as well as four reproductive tissues (testis, seminal vesicle, ovary, and uterus). We obtained fundamental information about the biology of diverse cell types and potential interactions among tissues within this multicellular organism. Methodology/Principal Findings Overexpression and functional enrichment analyses identified many putative biological functions enriched in each tissue studied, including functions which have not been previously studied in detail in nematodes. Putative tissue-specific transcriptional factors and corresponding binding motifs that regulate expression in each tissue were identified, including the intestine-enriched ELT-2 motif/transcription factor previously described in nematode intestines. Constitutively expressed and novel genes were also characterized, with the largest number of novel genes found to be overexpressed in the testis. Finally, a putative acetylcholine-mediated transcriptional network connecting biological activity in the head to the male reproductive system is described using co-expression networks, along with a similar ecdysone-mediated system in the female. Conclusions/Significance The expression profiles, co-expression networks and co-expression regulation of the 10 tissues studied and the tissue-specific analysis presented here are a

  5. Uncovering the liver's role in immunity through RNA co-expression networks.

    PubMed

    Harrall, Kylie K; Kechris, Katerina J; Tabakoff, Boris; Hoffman, Paula L; Hines, Lisa M; Tsukamoto, Hidekazu; Pravenec, Michal; Printz, Morton; Saba, Laura M

    2016-10-01

    Gene co-expression analysis has proven to be a powerful tool for ascertaining the organization of gene products into networks that are important for organ function. An organ, such as the liver, engages in a multitude of functions important for the survival of humans, rats, and other animals; these liver functions include energy metabolism, metabolism of xenobiotics, immune system function, and hormonal homeostasis. With the availability of organ-specific transcriptomes, we can now examine the role of RNA transcripts (both protein-coding and non-coding) in these functions. A systems genetic approach for identifying and characterizing liver gene networks within a recombinant inbred panel of rats was used to identify genetically regulated transcriptional networks (modules). For these modules, biological consensus was found between functional enrichment analysis and publicly available phenotypic quantitative trait loci (QTL). In particular, the biological function of two liver modules could be linked to immune response. The eigengene QTLs for these co-expression modules were located at genomic regions coincident with highly significant phenotypic QTLs; these phenotypes were related to rheumatoid arthritis, food preference, and basal corticosterone levels in rats. Our analysis illustrates that genetically and biologically driven RNA-based networks, such as the ones identified as part of this research, provide insight into the genetic influences on organ functions. These networks can pinpoint phenotypes that manifest through the interaction of many organs/tissues and can identify unannotated or under-annotated RNA transcripts that play a role in these phenotypes. PMID:27401171

  6. DTW-MIC Coexpression Networks from Time-Course Data.

    PubMed

    Riccadonna, Samantha; Jurman, Giuseppe; Visintainer, Roberto; Filosi, Michele; Furlanello, Cesare

    2016-01-01

    When modeling coexpression networks from high-throughput time course data, Pearson Correlation Coefficient (PCC) is one of the most effective and popular similarity functions. However, its reliability is limited since it cannot capture non-linear interactions and time shifts. Here we propose to overcome these two issues by employing a novel similarity function, Dynamic Time Warping Maximal Information Coefficient (DTW-MIC), combining a measure taking care of functional interactions of signals (MIC) and a measure identifying time lag (DTW). By using the Hamming-Ipsen-Mikhailov (HIM) metric to quantify network differences, the effectiveness of the DTW-MIC approach is demonstrated on a set of four synthetic and one transcriptomic datasets, also in comparison to TimeDelay ARACNE and Transfer Entropy. PMID:27031641
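
    To illustrate why a time-lag-aware measure helps where plain correlation does not, here is a minimal sketch of classic dynamic time warping (DTW), one of the two ingredients named in this record. It is the textbook dynamic-programming form with an absolute-difference cost, not the DTW-MIC similarity itself (the MIC component and the way the two are combined are not reproduced), and the example signals are invented.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j]; each step may
    advance in a, in b, or in both, so shifted copies of a signal can align.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Invented example: the second profile is the first one delayed by one time point.
profile = [0, 1, 2, 1, 0]
delayed = [0, 0, 1, 2, 1, 0]
lag_cost = dtw_distance(profile, delayed)
```

    The delayed copy aligns at zero cost because warping absorbs the shift, whereas a point-by-point comparison of the same two profiles would report a mismatch.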

  7. DTW-MIC Coexpression Networks from Time-Course Data

    PubMed Central

    Riccadonna, Samantha; Jurman, Giuseppe; Visintainer, Roberto; Filosi, Michele; Furlanello, Cesare

    2016-01-01

    When modeling coexpression networks from high-throughput time course data, Pearson Correlation Coefficient (PCC) is one of the most effective and popular similarity functions. However, its reliability is limited since it cannot capture non-linear interactions and time shifts. Here we propose to overcome these two issues by employing a novel similarity function, Dynamic Time Warping Maximal Information Coefficient (DTW-MIC), combining a measure taking care of functional interactions of signals (MIC) and a measure identifying time lag (DTW). By using the Hamming-Ipsen-Mikhailov (HIM) metric to quantify network differences, the effectiveness of the DTW-MIC approach is demonstrated on a set of four synthetic and one transcriptomic datasets, also in comparison to TimeDelay ARACNE and Transfer Entropy. PMID:27031641

  8. Co-expression of mitosis-regulating genes contributes to malignant progression and prognosis in oligodendrogliomas.

    PubMed

    Liu, Yanwei; Hu, Huimin; Zhang, Chuanbao; Wang, Haoyuan; Zhang, Wenlong; Wang, Zheng; Li, Mingyang; Zhang, Wei; Zhou, Dabiao; Jiang, Tao

    2015-11-10

    The clinical prognosis of patients with glioma is determined by tumor grades, but tumors of different subtypes with equal malignancy grade usually have different prognoses that are largely determined by genetic abnormalities. Oligodendrogliomas (ODs) are the second most common type of gliomas. In this study, integrative analyses found that distribution of TCGA transcriptomic subtypes was associated with grade progression in ODs. To identify critical gene(s) associated with tumor grades and TCGA subtypes, we analyzed 34 normal brain tissues (NBTs), 146 WHO grade II and 130 grade III ODs by microarray and RNA sequencing, and identified a co-expression network of six genes (AURKA, NDC80, CENPK, KIAA0101, TIMELESS and MELK) that was associated with tumor grades and TCGA subtypes as well as Ki-67 expression. Validation of the six genes was performed by qPCR in an additional 28 ODs. Importantly, these genes also were validated in four high-grade recurrent gliomas and the initial lower-grade gliomas resected from the same patients. Finally, the RNA data on two genes with the highest discrimination potential (AURKA and NDC80) and Ki-67 were validated on an independent cohort (5 NBTs and 86 ODs) by immunohistochemistry. Knockdown of AURKA and NDC80 by siRNAs suppressed Ki-67 expression and proliferation of glioma cells. Survival analysis showed that high expression of the six genes collectively indicated a poor survival outcome. Correlation and protein interaction analysis provided further evidence for this co-expression network. These data suggest that the co-expression of the six mitosis-regulating genes was associated with malignant progression and prognosis in ODs. PMID:26468983

  9. Co-expression of mitosis-regulating genes contributes to malignant progression and prognosis in oligodendrogliomas

    PubMed Central

    Liu, Yanwei; Hu, Huimin; Zhang, Chuanbao; Wang, Haoyuan; Zhang, Wenlong; Wang, Zheng; Li, Mingyang; Zhang, Wei; Zhou, Dabiao; Jiang, Tao

    2015-01-01

    The clinical prognosis of patients with glioma is determined by tumor grades, but tumors of different subtypes with equal malignancy grade usually have different prognoses that are largely determined by genetic abnormalities. Oligodendrogliomas (ODs) are the second most common type of gliomas. In this study, integrative analyses found that distribution of TCGA transcriptomic subtypes was associated with grade progression in ODs. To identify critical gene(s) associated with tumor grades and TCGA subtypes, we analyzed 34 normal brain tissues (NBTs), 146 WHO grade II and 130 grade III ODs by microarray and RNA sequencing, and identified a co-expression network of six genes (AURKA, NDC80, CENPK, KIAA0101, TIMELESS and MELK) that was associated with tumor grades and TCGA subtypes as well as Ki-67 expression. Validation of the six genes was performed by qPCR in an additional 28 ODs. Importantly, these genes also were validated in four high-grade recurrent gliomas and the initial lower-grade gliomas resected from the same patients. Finally, the RNA data on two genes with the highest discrimination potential (AURKA and NDC80) and Ki-67 were validated on an independent cohort (5 NBTs and 86 ODs) by immunohistochemistry. Knockdown of AURKA and NDC80 by siRNAs suppressed Ki-67 expression and proliferation of glioma cells. Survival analysis showed that high expression of the six genes collectively indicated a poor survival outcome. Correlation and protein interaction analysis provided further evidence for this co-expression network. These data suggest that the co-expression of the six mitosis-regulating genes was associated with malignant progression and prognosis in ODs. PMID:26468983

  10. Construction and application of a co-expression network in Mycobacterium tuberculosis.

    PubMed

    Jiang, Jun; Sun, Xian; Wu, Wei; Li, Li; Wu, Hai; Zhang, Lu; Yu, Guohua; Li, Yao

    2016-01-01

    Because of its high pathogenicity and infectivity, tuberculosis is a serious threat to human health. Some information about the functions of the genes in the Mycobacterium tuberculosis genome is currently available, but it is not enough to explore transcriptional regulatory mechanisms. Here, we applied the WGCNA (Weighted Gene Correlation Network Analysis) algorithm to mine pooled microarray datasets for the M. tuberculosis H37Rv strain. We constructed a co-expression network that was subdivided into 78 co-expression gene modules. The different responses to two kinds of in vitro models (a constant 0.2% oxygen hypoxia model and a Wayne model) were explained based on these modules. We identified potential transcription factors based on high Pearson's correlation coefficients between the modules and genes. Three modules that may be associated with hypoxic stimulation were identified, and their potential transcription factors were predicted. In the validation experiment, we determined the expression levels of genes in the modules under hypoxic conditions and under overexpression of potential transcription factors (Rv0081, furA (Rv1909c), Rv0324, Rv3334, and Rv3833). The experimental results showed that the three identified modules were related to hypoxia and that the overexpression of transcription factors could significantly change the expression levels of genes in the corresponding modules. PMID:27328747

  11. Construction and application of a co-expression network in Mycobacterium tuberculosis

    PubMed Central

    Jiang, Jun; Sun, Xian; Wu, Wei; Li, Li; Wu, Hai; Zhang, Lu; Yu, Guohua; Li, Yao

    2016-01-01

    Because of its high pathogenicity and infectivity, tuberculosis is a serious threat to human health. Some information about the functions of the genes in the Mycobacterium tuberculosis genome is currently available, but it is not enough to explore transcriptional regulatory mechanisms. Here, we applied the WGCNA (Weighted Gene Correlation Network Analysis) algorithm to mine pooled microarray datasets for the M. tuberculosis H37Rv strain. We constructed a co-expression network that was subdivided into 78 co-expression gene modules. The different responses to two kinds of in vitro models (a constant 0.2% oxygen hypoxia model and a Wayne model) were explained based on these modules. We identified potential transcription factors based on high Pearson’s correlation coefficients between the modules and genes. Three modules that may be associated with hypoxic stimulation were identified, and their potential transcription factors were predicted. In the validation experiment, we determined the expression levels of genes in the modules under hypoxic conditions and under overexpression of potential transcription factors (Rv0081, furA (Rv1909c), Rv0324, Rv3334, and Rv3833). The experimental results showed that the three identified modules were related to hypoxia and that the overexpression of transcription factors could significantly change the expression levels of genes in the corresponding modules. PMID:27328747

  12. Integrated genome-wide association, coexpression network, and expression single nucleotide polymorphism analysis identifies novel pathway in allergic rhinitis

    PubMed Central

    2014-01-01

    Background Allergic rhinitis is a common disease whose genetic basis is incompletely explained. We report an integrated genomic analysis of allergic rhinitis. Methods We performed genome-wide association studies (GWAS) of allergic rhinitis in 5633 ethnically diverse North American subjects. Next, we profiled gene expression in disease-relevant tissue (peripheral blood CD4+ lymphocytes) collected from subjects who had been genotyped. We then integrated the GWAS and gene expression data using expression single nucleotide polymorphism (eSNP), coexpression network, and pathway approaches to identify the biologic relevance of our GWAS. Results GWAS revealed ethnicity-specific findings, with 4 genome-wide significant loci among Latinos and 1 genome-wide significant locus in the GWAS meta-analysis across ethnic groups. To identify biologic context for these results, we constructed a coexpression network to define modules of genes with similar patterns of CD4+ gene expression (coexpression modules) that could serve as constructs of broader gene expression. Six of the 22 GWAS loci with P-value ≤ 1 × 10^−6 tagged one particular coexpression module (4.0-fold enrichment, P-value 0.0029), and this module also had the greatest enrichment (3.4-fold enrichment, P-value 2.6 × 10^−24) for allergic rhinitis-associated eSNPs (genetic variants associated with both gene expression and allergic rhinitis). The integrated GWAS, coexpression network, and eSNP results therefore supported this coexpression module as an allergic rhinitis module. Pathway analysis revealed that the module was enriched for mitochondrial pathways (8.6-fold enrichment, P-value 4.5 × 10^−72). Conclusions Our results highlight mitochondrial pathways as a target for further investigation of allergic rhinitis mechanism and treatment. Our integrated approach can be applied to provide biologic context for GWAS of other diseases. PMID:25085501
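
    The enrichment figures in this abstract (for example, 6 of 22 loci tagging one module at 4.0-fold enrichment) follow a common pattern that can be sketched in Python. This is only an illustration, not the authors' procedure: the abstract does not say which test produced the P-values, so the hypergeometric upper tail used here is an assumption, and the module size (150) and background size (2200) are hypothetical values chosen so that the fold enrichment comes out at 4.0.

```python
from math import comb


def fold_enrichment(hits, drawn, module_size, background):
    """Observed fraction of hits divided by the fraction expected by chance."""
    return (hits / drawn) / (module_size / background)


def hypergeom_upper_tail(hits, drawn, module_size, background):
    """P(X >= hits) when `drawn` items are sampled without replacement
    from `background` items, of which `module_size` belong to the module."""
    total = comb(background, drawn)
    upper = min(drawn, module_size)
    favourable = sum(
        comb(module_size, k) * comb(background - module_size, drawn - k)
        for k in range(hits, upper + 1)
    )
    return favourable / total


# Hypothetical sizes: 22 loci tested, 6 fall in a 150-gene module,
# against an assumed background of 2200 genes.
fold = fold_enrichment(6, 22, 150, 2200)
p_value = hypergeom_upper_tail(6, 22, 150, 2200)
print(f"fold enrichment = {fold:.1f}, upper-tail P = {p_value:.4g}")
```

    The fold value depends only on the two proportions, while the P-value also depends on the absolute sizes, which is why the same fold enrichment can carry very different significance levels in the abstract.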

  13. Protein-protein interaction and gene co-expression maps of ARFs and Aux/IAAs in Arabidopsis

    PubMed Central

    Piya, Sarbottam; Shrestha, Sandesh K.; Binder, Brad; Stewart, C. Neal; Hewezi, Tarek

    2014-01-01

    The phytohormone auxin regulates nearly all aspects of plant growth and development. Based on the current model in Arabidopsis thaliana, Auxin/indole-3-acetic acid (Aux/IAA) proteins repress auxin-inducible genes by inhibiting auxin response transcription factors (ARFs). Experimental evidence suggests that heterodimerization between Aux/IAA and ARF proteins is related to their unique biological functions. The objective of this study was to generate the Aux/IAA-ARF protein-protein interaction map using full-length sequences and locate the interacting protein pairs to specific gene co-expression networks in order to define tissue-specific responses of the Aux/IAA-ARF interactome. Pairwise interactions between 19 ARFs and 29 Aux/IAAs resulted in the identification of 213 specific interactions, of which 79 were previously unknown. The incorporation of co-expression profiles with protein-protein interaction data revealed a strong correlation of gene co-expression for 70% of the ARF-Aux/IAA interacting pairs in at least one tissue/organ, indicative of the biological significance of these interactions. Importantly, ARF4-8 and 19, which were found to interact with almost all Aux-Aux/IAA, showed broad co-expression relationships with Aux/IAA genes and thus formed the central hubs of the co-expression network. Our analyses provide new insights into the biological significance of ARF-Aux/IAA associations in the morphogenesis and development of various plant tissues and organs. PMID:25566309

  14. Co-expression networks in generation of induced pluripotent stem cells

    PubMed Central

    Paul, Sharan; Pflieger, Lance; Dansithong, Warunee; Figueroa, Karla P.; Gao, Fuying; Coppola, Giovanni; Pulst, Stefan M.

    2016-01-01

    We developed an adenoviral vector, in which Yamanaka's four reprogramming factors (RFs) were controlled by individual CMV promoters in a single cassette (Ad-SOcMK). This permitted coordinated expression of RFs (SOX2, OCT3/4, c-MYC and KLF4) in a cell for a transient period of time, synchronizing the reprogramming process with the majority of transduced cells assuming induced pluripotent stem cell (iPSC)-like characteristics as early as three days post-transduction. These reprogrammed cells resembled human embryonic stem cells (ESCs) with regard to morphology and biomarker expression, and could be differentiated into cells of the germ layers in vitro and in vivo. These iPSC-like cells, however, failed to expand into larger iPSC colonies. The short and synchronized reprogramming process allowed us to study global transcription changes within short time intervals. Weighted gene co-expression network analysis (WGCNA) identified sixteen large gene co-expression modules, each including members of gene ontology categories involved in cell differentiation and development. In particular, the brown module contained a significant number of ESC marker genes, whereas the turquoise module contained cell-cycle-related genes that were downregulated in contrast to upregulation in human ESCs. Strong coordinated expression of all four RFs via adenoviral transduction may constrain stochastic processes and lead to silencing of genes important for cellular proliferation. PMID:26892236

  15. Co-expression networks in generation of induced pluripotent stem cells.

    PubMed

    Paul, Sharan; Pflieger, Lance; Dansithong, Warunee; Figueroa, Karla P; Gao, Fuying; Coppola, Giovanni; Pulst, Stefan M

    2016-01-01

    We developed an adenoviral vector, in which Yamanaka's four reprogramming factors (RFs) were controlled by individual CMV promoters in a single cassette (Ad-SOcMK). This permitted coordinated expression of RFs (SOX2, OCT3/4, c-MYC and KLF4) in a cell for a transient period of time, synchronizing the reprogramming process with the majority of transduced cells assuming induced pluripotent stem cell (iPSC)-like characteristics as early as three days post-transduction. These reprogrammed cells resembled human embryonic stem cells (ESCs) with regard to morphology and biomarker expression, and could be differentiated into cells of the germ layers in vitro and in vivo. These iPSC-like cells, however, failed to expand into larger iPSC colonies. The short and synchronized reprogramming process allowed us to study global transcription changes within short time intervals. Weighted gene co-expression network analysis (WGCNA) identified sixteen large gene co-expression modules, each including members of gene ontology categories involved in cell differentiation and development. In particular, the brown module contained a significant number of ESC marker genes, whereas the turquoise module contained cell-cycle-related genes that were downregulated in contrast to upregulation in human ESCs. Strong coordinated expression of all four RFs via adenoviral transduction may constrain stochastic processes and lead to silencing of genes important for cellular proliferation. PMID:26892236

  16. Computational, Integrative, and Comparative Methods for the Elucidation of Genetic Coexpression Networks

    PubMed Central

    2005-01-01

    Gene expression microarray data can be used for the assembly of genetic coexpression network graphs. Using mRNA samples obtained from recombinant inbred Mus musculus strains, it is possible to integrate allelic variation with molecular and higher-order phenotypes. The depth of quantitative genetic analysis of microarray data can be vastly enhanced utilizing this mouse resource in combination with powerful computational algorithms, platforms, and data repositories. The resulting network graphs transect many levels of biological scale. This approach is illustrated with the extraction of cliques of putatively coregulated genes and their annotation using gene ontology analysis and cis-regulatory element discovery. The causal basis for coregulation is detected through the use of quantitative trait locus mapping. PMID:16046823

  17. Computational, Integrative, and Comparative Methods for the Elucidation of Genetic Coexpression Networks

    DOE PAGESBeta

    Baldwin, Nicole E.; Chesler, Elissa J.; Kirov, Stefan; Langston, Michael A.; Snoddy, Jay R.; Williams, Robert W.; Zhang, Bing

    2005-01-01

    Gene expression microarray data can be used for the assembly of genetic coexpression network graphs. Using mRNA samples obtained from recombinant inbred Mus musculus strains, it is possible to integrate allelic variation with molecular and higher-order phenotypes. The depth of quantitative genetic analysis of microarray data can be vastly enhanced utilizing this mouse resource in combination with powerful computational algorithms, platforms, and data repositories. The resulting network graphs transect many levels of biological scale. This approach is illustrated with the extraction of cliques of putatively co-regulated genes and their annotation using gene ontology analysis and cis-regulatory element discovery. The causal basis for co-regulation is detected through the use of quantitative trait locus mapping.

  18. Gene Co-Expression Analysis Predicts Genetic Variants Associated with Drug Responsiveness in Lung Cancer

    PubMed Central

    Shroff, Sanaya; Zhang, Jie; Huang, Kun

    2016-01-01

    Responsiveness to drugs is an important concern in designing personalized treatment for cancer patients. Currently, genetic markers are often used to guide targeted therapy. However, a deeper understanding of the molecular basis for drug responses and the discovery of new predictive biomarkers for drug sensitivity are much needed. In this paper, we present a workflow for identifying condition-specific gene co-expression networks associated with responses to the tyrosine kinase inhibitor, Erlotinib, in lung adenocarcinoma cell lines using data from the Cancer Cell Line Encyclopedia by combining network mining and statistical analysis. In particular, we have identified multiple gene modules specifically co-expressed in the drug-responsive cell lines but not in the unresponsive group. Interestingly, most of these modules are enriched on specific cytobands, suggesting potential copy number variation events on these loci. Our results therefore imply that there are multiple genetic loci with copy number variations associated with the Erlotinib responses. The existence of CNVs in these loci is also confirmed in lung cancer tissue samples using the TCGA data. Since these structural variations are inferred from functional genomics data, these CNVs are functional variations. These results suggest that the condition-specific gene co-expression network mining approach is an effective approach in predicting candidate biomarkers for drug responses. PMID:27570645

  19. Gene Co-Expression Analysis Predicts Genetic Variants Associated with Drug Responsiveness in Lung Cancer.

    PubMed

    Shroff, Sanaya; Zhang, Jie; Huang, Kun

    2016-01-01

    Responsiveness to drugs is an important concern in designing personalized treatment for cancer patients. Currently, genetic markers are often used to guide targeted therapy. However, a deeper understanding of the molecular basis for drug responses and the discovery of new predictive biomarkers for drug sensitivity are much needed. In this paper, we present a workflow for identifying condition-specific gene co-expression networks associated with responses to the tyrosine kinase inhibitor, Erlotinib, in lung adenocarcinoma cell lines using data from the Cancer Cell Line Encyclopedia by combining network mining and statistical analysis. In particular, we have identified multiple gene modules specifically co-expressed in the drug-responsive cell lines but not in the unresponsive group. Interestingly, most of these modules are enriched on specific cytobands, suggesting potential copy number variation events on these loci. Our results therefore imply that there are multiple genetic loci with copy number variations associated with the Erlotinib responses. The existence of CNVs in these loci is also confirmed in lung cancer tissue samples using the TCGA data. Since these structural variations are inferred from functional genomics data, these CNVs are functional variations. These results suggest that the condition-specific gene co-expression network mining approach is an effective approach in predicting candidate biomarkers for drug responses. PMID:27570645

  20. Ligand Similarity Complements Sequence, Physical Interaction, and Co-Expression for Gene Function Prediction

    PubMed Central

    Shoichet, Brian K.; Gillis, Jesse

    2016-01-01

    The expansion of protein-ligand annotation databases has enabled large-scale networking of proteins by ligand similarity. These ligand-based protein networks, which implicitly predict the ability of neighboring proteins to bind related ligands, may complement biologically-oriented gene networks, which are used to predict functional or disease relevance. To quantify the degree to which such ligand-based protein associations might complement functional genomic associations, including sequence similarity, physical protein-protein interactions, co-expression, and disease gene annotations, we calculated a network based on the Similarity Ensemble Approach (SEA: sea.docking.org), where protein neighbors reflect the similarity of their ligands. We also measured the similarity with functional genomic networks over a common set of 1,131 genes, and found that the networks had only small overlaps, which were significant only due to the large scale of the data. Consistent with the view that the networks contain different information, combining them substantially improved Molecular Function prediction within GO (from AUROC~0.63–0.75 for the individual data modalities to AUROC~0.8 in the aggregate). We investigated the boost in guilt-by-association gene function prediction when the networks are combined and describe underlying properties that can be further exploited. PMID:27467773

  1. KaPPA-View4: a metabolic pathway database for representation and analysis of correlation networks of gene co-expression and metabolite co-accumulation and omics data.

    PubMed

    Sakurai, Nozomu; Ara, Takeshi; Ogata, Yoshiyuki; Sano, Ryosuke; Ohno, Takashi; Sugiyama, Kenjiro; Hiruta, Atsushi; Yamazaki, Kiyoshi; Yano, Kentaro; Aoki, Koh; Aharoni, Asaph; Hamada, Kazuki; Yokoyama, Koji; Kawamura, Shingo; Otsuka, Hirofumi; Tokimatsu, Toshiaki; Kanehisa, Minoru; Suzuki, Hideyuki; Saito, Kazuki; Shibata, Daisuke

    2011-01-01

    Correlations of gene-to-gene co-expression and metabolite-to-metabolite co-accumulation calculated from large amounts of transcriptome and metabolome data are useful for uncovering unknown functions of genes, functional diversities of gene family members and regulatory mechanisms of metabolic pathway flows. Many databases and tools are available to interpret quantitative transcriptome and metabolome data, but only a few connect correlation data to biological knowledge and can be used to find its biological significance. We report here a new metabolic pathway database, KaPPA-View4 (http://kpv.kazusa.or.jp/kpv4/), which is able to overlay gene-to-gene and/or metabolite-to-metabolite relationships as curves on a metabolic pathway map, or on a combination of up to four maps. This representation would help to discover, for example, novel functions of a transcription factor that regulates genes on a metabolic pathway. Pathway maps of the Kyoto Encyclopedia of Genes and Genomes (KEGG) and maps generated from their gene classifications are available at KaPPA-View4 KEGG version (http://kpv.kazusa.or.jp/kpv4-kegg/). At present, gene co-expression data from the databases ATTED-II, COXPRESdb, CoP and MiBASE for human, mouse, rat, Arabidopsis, rice, tomato and other plants are available. PMID:21097783

  2. ATTED-II: a database of co-expressed genes and cis elements for identifying co-regulated gene groups in Arabidopsis

    PubMed Central

    Obayashi, Takeshi; Kinoshita, Kengo; Nakai, Kenta; Shibaoka, Masayuki; Hayashi, Shinpei; Saeki, Motoshi; Shibata, Daisuke; Saito, Kazuki; Ohta, Hiroyuki

    2007-01-01

    A publicly available database of co-expressed gene sets would be a valuable tool for a wide variety of experimental designs, including targeting of genes for functional identification or for regulatory investigation. Here, we report the construction of an Arabidopsis thaliana trans-factor and cis-element prediction database (ATTED-II) that provides co-regulated gene relationships based on co-expressed genes deduced from microarray data and the predicted cis elements. ATTED-II includes the following features: (i) lists and networks of co-expressed genes calculated from 58 publicly available experimental series, which are composed of 1388 GeneChip data in A. thaliana; (ii) prediction of cis-regulatory elements in the 200 bp region upstream of the transcription start site to predict co-regulated genes amongst the co-expressed genes; and (iii) visual representation of expression patterns for individual genes. ATTED-II can thus help researchers to clarify the function and regulation of particular genes and gene networks. PMID:17130150

  3. Hybrid coexpression link similarity graph clustering for mining biological modules from multiple gene expression datasets

    PubMed Central

    2014-01-01

    Background Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. Results We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways. PMID:25221624

  4. Recursive Indirect-Paths Modularity (RIP-M) for Detecting Community Structure in RNA-Seq Co-expression Networks.

    PubMed

    Rahmani, Bahareh; Zimmermann, Michael T; Grill, Diane E; Kennedy, Richard B; Oberg, Ann L; White, Bill C; Poland, Gregory A; McKinney, Brett A

    2016-01-01

    Clusters of genes in co-expression networks are commonly used as functional units for gene set enrichment detection and increasingly as features (attribute construction) for statistical inference and sample classification. One of the practical challenges of clustering for these purposes is to identify an optimal partition of the network where the individual clusters are neither too large, prohibiting interpretation, nor too small, precluding general inference. Newman Modularity is a spectral clustering algorithm that automatically finds the number of clusters, but for many biological networks the cluster sizes are suboptimal. In this work, we generalize Newman Modularity to incorporate information from indirect paths in RNA-Seq co-expression networks. We implement a merge-and-split algorithm that allows the user to constrain the range of cluster sizes: large enough to capture genes in relevant pathways, yet small enough to resolve distinct functions. We investigate the properties of our recursive indirect-pathways modularity (RIP-M) and compare it with other clustering methods using simulated co-expression networks and RNA-seq data from an influenza vaccine response study. RIP-M had higher cluster assignment accuracy than Newman Modularity for finding clusters in simulated co-expression networks for all scenarios, and RIP-M had comparable accuracy to Weighted Gene Correlation Network Analysis (WGCNA). RIP-M was more accurate than WGCNA for modest hard thresholds and comparable for high, while WGCNA was slightly more accurate for soft thresholds. In the vaccine study data, RIP-M and WGCNA enriched for a comparable number of immunologically relevant pathways. PMID:27242890
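
    The quantity that Newman Modularity optimizes can be written down in a few lines of Python. The sketch below computes the standard modularity score Q for a given partition of an unweighted, undirected graph; it does not implement RIP-M, the indirect-path extension, or the merge-and-split step described in the abstract, and the toy graph and node names are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition.

    edges:     list of (u, v) pairs, undirected, no self-loops or duplicates
    community: dict mapping each node to a community label
    Q = sum over communities of (fraction of edges inside the community)
        minus (fraction of edge ends attached to the community) squared.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Number of edges whose two ends share a community.
    inside = sum(1 for u, v in edges if community[u] == community[v])

    # Total degree attached to each community.
    degree_sum = {}
    for node, k in degree.items():
        label = community[node]
        degree_sum[label] = degree_sum.get(label, 0) + k

    return inside / m - sum((d / (2 * m)) ** 2 for d in degree_sum.values())


# Toy graph: two triangles joined by a single bridge edge (c, d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_cluster = {node: 0 for node in two_clusters}

print(modularity(edges, two_clusters))  # positive: triangles are real clusters
print(modularity(edges, one_cluster))   # zero: a single cluster carries no structure
```

    Because Q rewards any partition with more within-cluster edges than expected by chance, maximizing it alone gives no direct control over cluster size, which is the limitation the abstract sets out to address.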

  5. Recursive Indirect-Paths Modularity (RIP-M) for Detecting Community Structure in RNA-Seq Co-expression Networks

    PubMed Central

    Rahmani, Bahareh; Zimmermann, Michael T.; Grill, Diane E.; Kennedy, Richard B.; Oberg, Ann L.; White, Bill C.; Poland, Gregory A.; McKinney, Brett A.

    2016-01-01

    Clusters of genes in co-expression networks are commonly used as functional units for gene set enrichment detection and increasingly as features (attribute construction) for statistical inference and sample classification. One of the practical challenges of clustering for these purposes is to identify an optimal partition of the network where the individual clusters are neither too large, prohibiting interpretation, nor too small, precluding general inference. Newman Modularity is a spectral clustering algorithm that automatically finds the number of clusters, but for many biological networks the cluster sizes are suboptimal. In this work, we generalize Newman Modularity to incorporate information from indirect paths in RNA-Seq co-expression networks. We implement a merge-and-split algorithm that allows the user to constrain the range of cluster sizes: large enough to capture genes in relevant pathways, yet small enough to resolve distinct functions. We investigate the properties of our recursive indirect-pathways modularity (RIP-M) and compare it with other clustering methods using simulated co-expression networks and RNA-seq data from an influenza vaccine response study. RIP-M had higher cluster assignment accuracy than Newman Modularity for finding clusters in simulated co-expression networks for all scenarios, and RIP-M had comparable accuracy to Weighted Gene Correlation Network Analysis (WGCNA). RIP-M was more accurate than WGCNA for modest hard thresholds and comparable for high, while WGCNA was slightly more accurate for soft thresholds. In the vaccine study data, RIP-M and WGCNA enriched for a comparable number of immunologically relevant pathways. PMID:27242890

  6. Coexpression Pattern Analysis of NPM1-Associated Genes in Chronic Myelogenous Leukemia

    PubMed Central

    Wong, S. C. Cesar; Siu, Parco M.; Yung, Benjamin Y. M.

    2015-01-01

    Background. Nucleophosmin 1 (NPM1) plays an important role in ribosomal synthesis and malignancies, but NPM1 mutations occur rarely in blast-crisis and chronic-phase chronic myelogenous leukemia (CML) patients. The NPM1-associated gene set (GCM_NPM1), in total 116 genes including NPM1, was chosen as the candidate gene set for the coexpression analysis. We asked whether NPM1-associated genes can affect the ribosomal synthesis and translation process in CML. Results. We presented a distribution-based approach for gene pair classification by identifying a disease-specific cutoff point that classified the coexpressed gene pairs into strong and weak coexpression structures. The differences in the coexpression patterns between the normal and the CML groups were assessed at the level of the overall structure by performing a two-sample Kolmogorov-Smirnov test. Our method effectively identified the coexpression pattern differences from the overall structure: P value = 1.71 × 10^−22 (< 0.05) for the maximum deviation D = 0.109. Moreover, we found that genes involved in the ribosomal synthesis and translation process tended to be coexpressed in the CML group. Conclusion. Our method can identify the coexpression difference between two different groups. Dysregulation of the ribosomal synthesis and translation process may be related to CML. Our findings may provide useful information for exploring novel CML mechanisms and for cancer treatment. PMID:25961029
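
    The maximum deviation D reported in this abstract is the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical cumulative distribution functions. A minimal pure-Python sketch of that statistic follows; the correlation values are hypothetical, and the P-value calculation is left out (a library routine such as `scipy.stats.ks_2samp` returns both D and a P-value).

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D = max |F_a(x) - F_b(x)|.

    Both empirical CDFs are step functions that only change at observed
    values, so checking every observed value is enough to find the maximum.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical pairwise coexpression values for the same gene pairs
# measured in a normal group and in a disease group.
normal_group = [0.1, 0.2, 0.3, 0.4]
disease_group = [0.3, 0.5, 0.6, 0.7]

print(ks_statistic(normal_group, disease_group))
```

    With thousands of gene pairs per group, even a modest D such as the 0.109 quoted above can correspond to a very small P-value, because the test's sensitivity grows with sample size.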

  7. GLITTER: a web-based application for gene link inspection through tissue-specific coexpression.

    PubMed

    Liu, Xiangtao; Yu, Pengfei; Cheng, Chao; Potash, James B; Han, Shizhong

    2016-01-01

    Accumulating evidence supports the polygenic nature of most complex diseases, suggesting the involvement of many susceptibility genes with small effect sizes. Although hundreds of genes may underlie the genetic architecture of complex diseases, those involved in a given disease are probably not randomly distributed, but likely to be functionally related. Protein-protein interaction networks have been used to evaluate the functional relatedness of susceptibility genes. However, these networks do not account for tissue specificity, are limited to protein-coding genes, and are typically biased by incomplete biological knowledge. Here, we present Gene Link Inspector Through Tissue-specific coExpRession (GLITTER), a web-based application for assessing the functional relatedness of susceptibility genes, either coding or noncoding, according to tissue-specific gene expression profiles. GLITTER can also shed light on the specific tissues in which susceptibility genes might exert their functions. We further demonstrate examples of how GLITTER can evaluate the functional relatedness of susceptibility genes underlying schizophrenia and breast cancer, and provide clues about etiology. PMID:27623690

  8. Analysis of the dynamic co-expression network of heart regeneration in the zebrafish.

    PubMed

    Rodius, Sophie; Androsova, Ganna; Götz, Lou; Liechti, Robin; Crespo, Isaac; Merz, Susanne; Nazarov, Petr V; de Klein, Niek; Jeanty, Céline; González-Rosa, Juan M; Muller, Arnaud; Bernardin, Francois; Niclou, Simone P; Vallar, Laurent; Mercader, Nadia; Ibberson, Mark; Xenarios, Ioannis; Azuaje, Francisco

    2016-01-01

    The zebrafish has the capacity to regenerate its heart after severe injury. While the function of a few genes during this process has been studied, we are far from fully understanding how genes interact to coordinate heart regeneration. To enable systematic insights into this phenomenon, we generated and integrated a dynamic co-expression network of heart regeneration in the zebrafish and linked systems-level properties to the underlying molecular events. Across multiple post-injury time points, the network displays topological attributes of biological relevance. We show that regeneration steps are mediated by modules of transcriptionally coordinated genes, and by genes acting as network hubs. We also established direct associations between hubs and validated drivers of heart regeneration with murine and human orthologs. The resulting models and interactive analysis tools are available at http://infused.vital-it.ch. Using a worked example, we demonstrate the usefulness of this unique open resource for hypothesis generation and in silico screening for genes involved in heart regeneration. PMID:27241320

  9. Analysis of the dynamic co-expression network of heart regeneration in the zebrafish

    PubMed Central

    Rodius, Sophie; Androsova, Ganna; Götz, Lou; Liechti, Robin; Crespo, Isaac; Merz, Susanne; Nazarov, Petr V.; de Klein, Niek; Jeanty, Céline; González-Rosa, Juan M.; Muller, Arnaud; Bernardin, Francois; Niclou, Simone P.; Vallar, Laurent; Mercader, Nadia; Ibberson, Mark; Xenarios, Ioannis; Azuaje, Francisco

    2016-01-01

    The zebrafish has the capacity to regenerate its heart after severe injury. While the function of a few genes during this process has been studied, we are far from fully understanding how genes interact to coordinate heart regeneration. To enable systematic insights into this phenomenon, we generated and integrated a dynamic co-expression network of heart regeneration in the zebrafish and linked systems-level properties to the underlying molecular events. Across multiple post-injury time points, the network displays topological attributes of biological relevance. We show that regeneration steps are mediated by modules of transcriptionally coordinated genes, and by genes acting as network hubs. We also established direct associations between hubs and validated drivers of heart regeneration with murine and human orthologs. The resulting models and interactive analysis tools are available at http://infused.vital-it.ch. Using a worked example, we demonstrate the usefulness of this unique open resource for hypothesis generation and in silico screening for genes involved in heart regeneration. PMID:27241320

  10. RiceArrayNet: a database for correlating gene expression from transcriptome profiling, and its application to the analysis of coexpressed genes in rice.

    PubMed

    Lee, Tae-Ho; Kim, Yeon-Ki; Pham, Thu Thi Minh; Song, Sang Ik; Kim, Ju-Kon; Kang, Kyu Young; An, Gynheung; Jung, Ki-Hong; Galbraith, David W; Kim, Minkyun; Yoon, Ung-Han; Nahm, Baek Hie

    2009-09-01

    Microarray data can be used to derive understanding of the relationships between the genes involved in various biological systems of an organism, given the availability of databases of gene expression measurements from the complete spectrum of experimental conditions and materials. However, there have been no reports, to date, of such a database being constructed for rice (Oryza sativa). Here, we describe the construction of such a database, called RiceArrayNet (RAN; http://www.ggbio.com/arraynet/), which provides information on coexpression between genes in terms of correlation coefficients (r values). The average number of coexpressed genes is 214, with an SD of 440, at r ≥ 0.5. Given the correlation between genes in a gene pair, the degrees of closeness between genes can be visualized in a relational tree and a relational network. The distribution of correlated genes according to degree of stringency shows how each gene is related to other genes. As an application of RAN, the 16-member L7Ae ribosomal protein family was explored for coexpressed genes and gene expression values within and between rice and Arabidopsis (Arabidopsis thaliana), and common and unique features in coexpression partners and expression patterns were observed for these family members. We observed a correlation pattern between Os01g0968800, a drought-responsive element-binding transcription factor, Os02g0790500, a trehalose-6-phosphate synthase, and Os06g0219500, a small heat shock factor, reflecting the fact that genes responding to the same biological stresses are regulated together. The RAN database can be used as a tool to gain insight into a particular gene by examining its coexpression partners. PMID:19605550
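
    Counting coexpression partners at a correlation cutoff, as in the r ≥ 0.5 figure quoted in this abstract, can be sketched as follows. This is an illustration under stated assumptions rather than the RAN implementation: Pearson correlation is assumed as the r value, and the gene names and expression profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def coexpression_partners(expr, cutoff=0.5):
    """Map each gene to the genes whose profile correlates at r >= cutoff."""
    genes = list(expr)
    partners = {gene: [] for gene in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if pearson(expr[g], expr[h]) >= cutoff:
                partners[g].append(h)
                partners[h].append(g)
    return partners


# Hypothetical expression profiles across four conditions.
expr = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [2.0, 4.0, 6.0, 8.0],
    "gene3": [1.0, 3.0, 2.0, 4.0],
    "gene4": [4.0, 3.0, 2.0, 1.0],  # anti-correlated with the others
}
partners = coexpression_partners(expr, cutoff=0.5)
for gene, linked in partners.items():
    print(gene, len(linked), linked)
```

    Averaging the partner counts over all genes gives the kind of summary reported above; raising the cutoff makes the resulting network sparser.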

  11. Co-expression network analysis of Down's syndrome based on microarray data

    PubMed Central

    Zhao, Jianping; Zhang, Zhengguo; Ren, Shumin; Zong, Yanan; Kong, Xiangdong

    2016-01-01

    Down's syndrome (DS) is a type of chromosome disease. The present study aimed to explore the underlying molecular mechanisms of DS. GSE5390 microarray data downloaded from the Gene Expression Omnibus database were used to identify differentially expressed genes (DEGs) in DS. Pathway enrichment analysis of the DEGs was performed, followed by co-expression network construction. Significant differential modules were mined by mutual information, followed by functional analysis. The accuracy of sample classification for the significant differential modules of DEGs was evaluated by leave-one-out cross-validation. A total of 997 DEGs, including 638 upregulated and 359 downregulated genes, were identified. Upregulated DEGs were enriched in 15 pathways, such as cell adhesion molecules, whereas downregulated DEGs were enriched in maturity onset diabetes of the young. Three significant differential modules with the highest discriminative scores (mutual information>0.35) were selected from a co-expression network. The classification accuracy of GSE16677 expression profile samples was 54.55% and 72.73% when characterized by 12 DEGs and 3 significant differential modules, respectively. Genes in significant differential modules were significantly enriched in 5 functions, including the endoplasmic reticulum (P=0.018) and regulation of apoptosis (P=0.061). The identified DEGs, in particular the 12 DEGs in the significant differential modules, such as B-cell lymphoma 2-associated transcription factor 1, heat shock protein 90 kDa beta member 1, UBX domain-containing protein 2 and transmembrane protein 50B, may serve important roles in the pathogenesis of DS. PMID:27588071
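
    Leave-one-out cross-validation, as mentioned above, holds out each sample in turn, trains on the rest, and reports the fraction of held-out samples classified correctly. The sketch below is only an illustration: the abstract does not name the classifier, so a nearest-centroid rule is assumed here, and the feature values and labels are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Return the label whose class centroid is closest to the sample."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        # Squared Euclidean distance is enough for ranking centroids.
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(x, y):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    correct = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_x, train_y, x[i]) == y[i]:
            correct += 1
    return correct / len(x)


# Hypothetical two-feature samples (e.g. two module summary scores).
x = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [3.0, 3.1], [3.2, 2.9], [2.9, 3.3]]
y = ["control", "control", "control", "DS", "DS", "DS"]
accuracy = loocv_accuracy(x, y)
```

    With only a handful of samples, each held-out prediction shifts the accuracy by a large step, which is worth keeping in mind when reading percentages such as those reported in the abstract.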

  12. A co-expression network analysis reveals lncRNA abnormalities in peripheral blood in early-onset schizophrenia.

    PubMed

    Ren, Yan; Cui, Yuehua; Li, Xinrong; Wang, Binhong; Na, Long; Shi, Junyan; Wang, Liang; Qiu, Lixia; Zhang, Kerang; Liu, Guifen; Xu, Yong

    2015-12-01

    Long non-coding RNAs (lncRNAs) are emerging as important regulators of gene expression and disease processes especially in neuropsychiatric disorders. To explore the potential regulatory roles of lncRNAs in schizophrenia, we performed an integrated co-expression network analysis on lncRNA and mRNA microarray profiles generated from the peripheral blood samples in 19 drug-naïve first-episode early-onset schizophrenia (EOS) patients and 18 demographically matched typically developing controls (TDCs). Using weighted gene co-expression network analysis (WGCNA), we showed that the lncRNAs were organized into co-expressed modules, and two lncRNA modules were associated with EOS. The mRNA networks were constructed and three disease-associated modules were identified. Gene Ontology (GO) analysis indicated that the mRNAs were highly enriched for mitochondrion and related biological processes. Moreover, our results revealed a significant correlation between lncRNAs and mRNAs using the canonical correlation analysis (CCA). Our results suggest that the convergent lncRNA alteration may be involved in the etiologies of EOS, and mitochondrial dysfunction participates in the pathological process of the disease. Our findings may shed light on the pathogenesis of schizophrenia and facilitate future diagnosis and therapeutic strategies. PMID:25967042

  13. Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering

    NASA Technical Reports Server (NTRS)

    Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland

    2000-01-01

    Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review data mining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.

  14. Correlated mRNAs and miRNAs from co-expression and regulatory networks affect porcine muscle and finally meat properties

    PubMed Central

    2013-01-01

    Background Physiological processes aiding the conversion of muscle to meat involve many genes associated with muscle structure and metabolic processes. MicroRNAs regulate networks of genes to orchestrate cellular functions, in turn regulating phenotypes. Results We applied weighted gene co-expression network analysis to identify co-expression modules that correlated to meat quality phenotypes and were highly enriched for genes involved in glucose metabolism, response to wounding, mitochondrial ribosome, mitochondrion, and extracellular matrix. Negative correlation of miRNA with mRNA and target prediction were used to select transcripts out of the modules of trait-associated mRNAs to further identify those genes that are correlated with post mortem traits. Conclusions Porcine muscle co-expression transcript networks that correlated to post mortem traits were identified. The integration of miRNA and mRNA expression analyses, as well as network analysis, enabled us to interpret the differentially-regulated genes from a systems perspective. Linking co-expression networks of transcripts and hierarchically organized pairs of miRNAs and mRNAs to meat properties yields new insight into several biological pathways underlying phenotype differences. These pathways may also be diagnostic for many myopathies, which are accompanied by deficient nutrient and oxygen supply of muscle fibers. PMID:23915301

  15. Coexpression networks implicate human midfetal deep cortical projection neurons in the pathogenesis of autism

    PubMed Central

    Willsey, A. Jeremy; Sanders, Stephan J.; Li, Mingfeng; Dong, Shan; Tebbenkamp, Andrew T.; Muhle, Rebecca A.; Reilly, Steven K.; Lin, Leon; Fertuzinhos, Sofia; Miller, Jeremy A.; Murtha, Michael T.; Bichsel, Candace; Niu, Wei; Cotney, Justin; Ercan-Sencicek, A. Gulhan; Gockley, Jake; Gupta, Abha; Han, Wenqi; He, Xin; Hoffman, Ellen; Klei, Lambertus; Lei, Jing; Liu, Wenzhong; Liu, Li; Lu, Cong; Xu, Xuming; Zhu, Ying; Mane, Shrikant M.; Lein, Edward S.; Wei, Liping; Noonan, James P.; Roeder, Kathryn; Devlin, Bernie; Šestan, Nenad; State, Matthew W.

    2013-01-01

    SUMMARY Autism spectrum disorder (ASD) is a complex developmental syndrome of unknown etiology. Recent studies employing exome- and genome-wide sequencing have identified nine high-confidence ASD (hcASD) genes. Working from the hypothesis that ASD-associated mutations in these biologically pleiotropic genes will disrupt intersecting developmental processes to contribute to a common phenotype, we have attempted to identify time periods, brain regions, and cell types in which these genes converge. We have constructed coexpression networks based on the hcASD “seed” genes, leveraging a rich expression data set encompassing multiple human brain regions across human development and into adulthood. By assessing enrichment of an independent set of probable ASD (pASD) genes, derived from the same sequencing studies, we demonstrate a key point of convergence in midfetal layer 5/6 cortical projection neurons. This approach informs when, where, and in what cell types mutations in these specific genes may be productively studied to clarify ASD pathophysiology. PMID:24267886

  16. From Coexpression to Coregulation: An Approach to Inferring Transcriptional Regulation Among Gene Classes from Large-Scale Expression Data

    NASA Technical Reports Server (NTRS)

    Mjolsness, Eric; Castano, Rebecca; Mann, Tobias; Wold, Barbara

    2000-01-01

    We provide preliminary evidence that existing algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (1) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continuous-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determining transcriptional regulatory relationships such as coregulation.

  17. Human transcriptional interactome of chromatin contribute to gene co-expression

    PubMed Central

    2010-01-01

    Background The transcriptional interactome of chromatin is one of the important mechanisms in gene transcription regulation. Through chromatin conformation capture and 3D FISH experiments, several cases of chromatin interaction among sequence-distant genes, or even inter-chromatin genes, have been reported. However, at the genomic level, there is still little evidence to support these mechanisms. Recently, based on Hi-C experiments, a genome-wide picture of chromatin interactions in human cells was presented. It provides useful material for analysing whether the mechanism of the transcriptional interactome is common. Results The main work here is to test whether the effects of the transcriptional interactome on gene co-expression exist at the genomic level. While controlling for the effects of transcription factor control similarities (TCS), we tested the correlation between Hi-C interaction and the mutual ranks of gene co-expression rates (provided by COXPRESdb) of intra-chromatin gene pairs. We used 6,084 genes with both TF annotation and co-expression information, and matched them into 273,458 pairs with similar Hi-C interaction ranks in different cell types. The results illustrate that co-expression is strongly associated with chromatin interaction. Further analysis using GO annotation reveals a potential correlation between gene function similarity, Hi-C interaction and co-expression. Conclusions According to the results of this research, the intra-chromatin interactome may be related to gene function and associated with co-expression. This study provides evidence illustrating the effect of the transcriptional interactome on transcription regulation. PMID:21156067

  18. Integration of Metabolic Modeling with Gene Co-expression Reveals Transcriptionally Programmed Reactions Explaining Robustness in Mycobacterium tuberculosis

    PubMed Central

    Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Mittal, Inna; Mobeen, Ahmed; Ramachandran, Srinivasan

    2016-01-01

    Robustness of metabolic networks is accomplished by gene regulation, modularity, re-routing of metabolites and plasticity. Here, we probed robustness against perturbations of biochemical reactions of M. tuberculosis in the form of predicting compensatory trends. In order to investigate the transcriptional programming of genes associated with correlated fluxes, we integrated the metabolic model with a gene co-expression network. Knock down of the reactions NADH2r and ATPS responsible for producing the hub metabolites, and Central carbon metabolism had the highest proportion of their associated genes under transcriptional co-expression with genes of their flux correlated reactions. Reciprocal gene expression correlations were observed among compensatory routes, fresh activation of alternative routes and in the multi-copy genes of Cysteine synthase and of Phosphate transporter. Knock down of 46 reactions caused the activation of Isocitrate lyase or Malate synthase or both reactions, which are central to the persistent state of M. tuberculosis. A total of 30 new freshly activated routes including Cytochrome c oxidase, Lactate dehydrogenase, and Glycine cleavage system were predicted, which could be responsible for switching into dormant or persistent state. Thus, our integrated approach of exploring transcriptional programming of flux correlated reactions has the potential to unravel features of system architecture conferring robustness. PMID:27000948

  19. Integration of Metabolic Modeling with Gene Co-expression Reveals Transcriptionally Programmed Reactions Explaining Robustness in Mycobacterium tuberculosis.

    PubMed

    Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Mittal, Inna; Mobeen, Ahmed; Ramachandran, Srinivasan

    2016-01-01

    Robustness of metabolic networks is accomplished by gene regulation, modularity, re-routing of metabolites and plasticity. Here, we probed robustness against perturbations of biochemical reactions of M. tuberculosis in the form of predicting compensatory trends. In order to investigate the transcriptional programming of genes associated with correlated fluxes, we integrated the metabolic model with a gene co-expression network. Knock down of the reactions NADH2r and ATPS responsible for producing the hub metabolites, and Central carbon metabolism had the highest proportion of their associated genes under transcriptional co-expression with genes of their flux correlated reactions. Reciprocal gene expression correlations were observed among compensatory routes, fresh activation of alternative routes and in the multi-copy genes of Cysteine synthase and of Phosphate transporter. Knock down of 46 reactions caused the activation of Isocitrate lyase or Malate synthase or both reactions, which are central to the persistent state of M. tuberculosis. A total of 30 new freshly activated routes including Cytochrome c oxidase, Lactate dehydrogenase, and Glycine cleavage system were predicted, which could be responsible for switching into dormant or persistent state. Thus, our integrated approach of exploring transcriptional programming of flux correlated reactions has the potential to unravel features of system architecture conferring robustness. PMID:27000948

  20. Cell-type–based model explaining coexpression patterns of genes in the brain

    PubMed Central

    Grange, Pascal; Bohland, Jason W.; Okaty, Benjamin W.; Sugino, Ken; Bokil, Hemant; Nelson, Sacha B.; Ng, Lydia; Hawrylycz, Michael; Mitra, Partha P.

    2014-01-01

    Spatial patterns of gene expression in the vertebrate brain are not independent, as pairs of genes can exhibit complex patterns of coexpression. Two genes may be similarly expressed in one region, but differentially expressed in other regions. These correlations have been studied quantitatively, particularly for the Allen Atlas of the adult mouse brain, but their biological meaning remains obscure. We propose a simple model of the coexpression patterns in terms of spatial distributions of underlying cell types and establish its plausibility using independently measured cell-type–specific transcriptomes. The model allows us to predict the spatial distribution of cell types in the mouse brain. PMID:24706869

  1. Co-expression network analysis reveals transcription factors associated to cell wall biosynthesis in sugarcane.

    PubMed

    Ferreira, Savio Siqueira; Hotta, Carlos Takeshi; Poelking, Viviane Guzzo de Carli; Leite, Debora Chaves Coelho; Buckeridge, Marcos Silveira; Loureiro, Marcelo Ehlers; Barbosa, Marcio Henrique Pereira; Carneiro, Monalisa Sampaio; Souza, Glaucia Mendes

    2016-05-01

    Sugarcane is a hybrid of Saccharum officinarum and Saccharum spontaneum, with minor contributions from other species in Saccharum and other genera. Understanding the molecular basis of cell wall metabolism in sugarcane may allow for rational changes in fiber quality and content when designing new energy crops. This work describes a comparative expression profiling of sugarcane ancestral genotypes: S. officinarum, S. spontaneum and S. robustum and a commercial hybrid: RB867515, linking gene expression to phenotypes to identify genes for sugarcane improvement. Oligoarray experiments of leaves, immature and intermediate internodes, detected 12,621 sense and 995 antisense transcripts. Amino acid metabolism was particularly evident among pathways showing natural antisense transcripts expression. For all tissues sampled, expression analysis revealed 831, 674 and 648 differentially expressed genes in S. officinarum, S. robustum and S. spontaneum, respectively, using RB867515 as reference. Expression of sugar transporters might explain sucrose differences among genotypes, but unexpected differential expression of histones was also identified between high and low Brix° genotypes. Lignin biosynthetic genes and bioenergetics-related genes were up-regulated in the high lignin genotype, suggesting that these genes are important for S. spontaneum to allocate carbon to lignin, while S. officinarum allocates it to sucrose storage. Co-expression network analysis identified 18 transcription factors possibly related to cell wall biosynthesis while in silico analysis detected cis-elements involved in cell wall biosynthesis in their promoters. Our results provide information to elucidate regulatory networks underlying traits of interest that will allow the improvement of sugarcane for biofuel and chemicals production. PMID:26820137

  2. G-NEST: A gene neighborhood scoring tool to identify co-conserved, co-expressed genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In previous studies, gene neighborhoods--spatial clusters of co-expressed genes in the genome--have been defined using arbitrary rules such as requiring adjacency, a minimum number of genes, a fixed window size, or a minimum expression level. In the current study, we developed a Gene Neighborhood Sc...

  3. Gene differential coexpression analysis based on biweight correlation and maximum clique.

    PubMed

    Zheng, Chun-Hou; Yuan, Lin; Sha, Wen; Sun, Zhan-Li

    2014-01-01

    Differential coexpression analysis usually requires the definition of 'distance' or 'similarity' between measured datasets. Until now, the most common choice has been the Pearson correlation coefficient. However, the Pearson correlation coefficient is sensitive to outliers. Biweight midcorrelation is considered to be a good alternative to Pearson correlation since it is more robust to outliers. In this paper, we propose using biweight midcorrelation to measure 'similarity' between gene expression profiles, and provide a new approach for gene differential coexpression analysis. Firstly, we calculate the biweight midcorrelation coefficients between all gene pairs. Then, we filter out non-informative correlation pairs using the 'half-thresholding' strategy and calculate the differential coexpression value of each gene. The experimental results on simulated data show that the new approach performed better than three previously published differential coexpression analysis (DCEA) methods. Moreover, when we apply maximum clique analysis to a gene subset comprising genes identified by our approach and previously reported T2D-related genes, many additional discoveries can be found through our method. PMID:25474074
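
    Biweight midcorrelation, the similarity measure named above, can be sketched as follows. This is a minimal illustration of the standard definition (median-centred values, weighted by a biweight function of their distance from the median in units of 9 × MAD), not the authors' code; the function names and the toy profiles are hypothetical.

```python
from math import sqrt
from statistics import median


def _biweight_terms(x):
    """Median-centred, biweight-weighted, unit-normalised values of x."""
    med = median(x)
    mad = median([abs(v - med) for v in x])  # median absolute deviation
    if mad == 0:
        raise ValueError("MAD is zero: biweight midcorrelation is undefined")
    terms = []
    for v in x:
        u = (v - med) / (9.0 * mad)
        # Points with |u| >= 1 are treated as outliers and get weight zero.
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        terms.append((v - med) * w)
    norm = sqrt(sum(t * t for t in terms))
    return [t / norm for t in terms]


def bicor(x, y):
    """Biweight midcorrelation between two equal-length profiles."""
    return sum(a * b for a, b in zip(_biweight_terms(x), _biweight_terms(y)))


# Toy profiles (hypothetical): y matches x except for one extreme value.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 70.0]
similarity = bicor(x, y)
```

    In this toy case the extreme value in y lies far more than 9 × MAD from the median, so it receives zero weight and the remaining points determine the result. A Pearson coefficient, by contrast, lets that single value enter both the covariance and the variance terms in full.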

  4. Understanding developmental and adaptive cues in pine through metabolite profiling and co-expression network analysis

    PubMed Central

    Cañas, Rafael A.; Canales, Javier; Muñoz-Hernández, Carmen; Granados, Jose M.; Ávila, Concepción; García-Martín, María L.; Cánovas, Francisco M.

    2015-01-01

    Conifers include long-lived evergreen trees of great economic and ecological importance, including pines and spruces. During their long lives conifers must respond to seasonal environmental changes, adapt to unpredictable environmental stresses, and co-ordinate their adaptive adjustments with internal developmental programmes. To gain insights into these responses, we examined metabolite and transcriptomic profiles of needles from naturally growing 25-year-old maritime pine (Pinus pinaster L. Aiton) trees over a year. The effect of environmental parameters such as temperature and rain on needle development were studied. Our results show that seasonal changes in the metabolite profiles were mainly affected by the needles’ age and acclimation for winter, but changes in transcript profiles were mainly dependent on climatic factors. The relative abundance of most transcripts correlated well with temperature, particularly for genes involved in photosynthesis or winter acclimation. Gene network analysis revealed relationships between 14 co-expressed gene modules and development and adaptation to environmental stimuli. Novel Myb transcription factors were identified as candidate regulators during needle development. Our systems-based analysis provides integrated data of the seasonal regulation of maritime pine growth, opening new perspectives for understanding the complex regulatory mechanisms underlying conifers’ adaptive responses. Taken together, our results suggest that the environment regulates the transcriptome for fine tuning of the metabolome during development. PMID:25873654

  5. Gene expression networks.

    PubMed

    Thomas, Reuben; Portier, Christopher J

    2013-01-01

    With the advent of microarrays and next-generation biotechnologies, the use of gene expression data has become ubiquitous in biological research. One potential drawback of these data is that they are very rich in features or genes though cost considerations allow for the use of only relatively small sample sizes. A useful way of getting at biologically meaningful interpretations of the environmental or toxicological condition of interest would be to make inferences at the level of a priori defined biochemical pathways or networks of interacting genes or proteins that are known to perform certain biological functions. This chapter describes approaches taken in the literature to make such inferences at the biochemical pathway level. In addition, this chapter describes approaches to create hypotheses on genes playing important roles in response to a treatment, using organism-level gene coexpression or protein-protein interaction networks. Also, approaches to reverse engineer gene networks or methods that seek to identify novel interactions between genes are described. Given the relatively small sample numbers typically available, these reverse engineering approaches are generally useful in inferring interactions only among a relatively small number of genes, on the order of 10. Finally, given the vast amounts of publicly available gene expression data from different sources, this chapter summarizes the important sources of these data and characteristics of these sources or databases. In line with the overall aims of this book of providing practical knowledge to a researcher interested in analyzing gene expression data from a network perspective, the chapter provides convenient publicly accessible tools for performing the analyses described, and in addition describes three motivating examples taken from the published literature that illustrate some of the relevant analyses. PMID:23086841

  6. Normalized lmQCM: An Algorithm for Detecting Weak Quasi-Cliques in Weighted Graph with Applications in Gene Co-Expression Module Discovery in Cancers

    PubMed Central

    Zhang, Jie; Huang, Kun

    2014-01-01

    In this paper, we present a new approach for mining weighted networks to identify densely connected modules such as quasi-cliques. Quasi-cliques are densely connected subnetworks in a network. Detecting quasi-cliques is an important topic in data mining, with applications such as social network study and biomedicine. Our approach has two major improvements upon previous work. The first is the use of local maximum edges to initialize the search in order to avoid excessive overlaps among the modules, thereby greatly reducing the computing time. The second is the inclusion of a weight normalization procedure to enable discovery of “subtle” modules with more balanced sizes. We carried out careful tests on multiple parameters and settings using two large cancer datasets. This approach allowed us to identify a large number of gene modules enriched in both biological functions and chromosomal bands in cancer data, suggesting potential roles of copy number variations (CNVs) involved in the cancer development. We then tested the genes in selected modules with enriched chromosomal bands using The Cancer Genome Atlas data, and the results strongly support our hypothesis that the coexpression in these modules is associated with CNVs. While gene coexpression network analyses have been widely adopted in disease studies, most of them focus on the functional relationships of coexpressed genes. The relationship between coexpression gene modules and CNVs is much less investigated despite the potential advantage that we can infer from such a relationship without genotyping data. Our new approach thus provides a means to carry out deep mining of the gene coexpression network to obtain both functional and genetic information from the expression data. PMID:27486298

  7. WeGET: predicting new genes for molecular systems by weighted co-expression.

    PubMed

    Szklarczyk, Radek; Megchelenbrink, Wout; Cizek, Pavel; Ledent, Marie; Velemans, Gonny; Szklarczyk, Damian; Huynen, Martijn A

    2016-01-01

    We have developed the Weighted Gene Expression Tool and database (WeGET, http://weget.cmbi.umcn.nl) for the prediction of new genes of a molecular system by correlated gene expression. WeGET utilizes a compendium of 465 human and 560 murine gene expression datasets that have been collected from multiple tissues under a wide range of experimental conditions. It exploits this abundance of expression data by assigning a high weight to datasets in which the known genes of a molecular system are harmoniously up- and down-regulated. WeGET ranks new candidate genes by calculating their weighted co-expression with that system. A weighted rank is calculated for human genes and their mouse orthologs. Then, an integrated gene rank and p-value is computed using a rank-order statistic. We applied our method to predict novel genes that have a high degree of co-expression with Gene Ontology terms and pathways from KEGG and Reactome. For each query set we provide a list of predicted novel genes, computed weights for transcription datasets used and cell and tissue types that contributed to the final predictions. The performance for each query set is assessed by 10-fold cross-validation. Finally, users can use the WeGET to predict novel genes that co-express with a custom query set. PMID:26582928

  8. WeGET: predicting new genes for molecular systems by weighted co-expression

    PubMed Central

    Szklarczyk, Radek; Megchelenbrink, Wout; Cizek, Pavel; Ledent, Marie; Velemans, Gonny; Szklarczyk, Damian; Huynen, Martijn A.

    2016-01-01

    We have developed the Weighted Gene Expression Tool and database (WeGET, http://weget.cmbi.umcn.nl) for the prediction of new genes of a molecular system by correlated gene expression. WeGET utilizes a compendium of 465 human and 560 murine gene expression datasets that have been collected from multiple tissues under a wide range of experimental conditions. It exploits this abundance of expression data by assigning a high weight to datasets in which the known genes of a molecular system are harmoniously up- and down-regulated. WeGET ranks new candidate genes by calculating their weighted co-expression with that system. A weighted rank is calculated for human genes and their mouse orthologs. Then, an integrated gene rank and p-value is computed using a rank-order statistic. We applied our method to predict novel genes that have a high degree of co-expression with Gene Ontology terms and pathways from KEGG and Reactome. For each query set we provide a list of predicted novel genes, computed weights for transcription datasets used and cell and tissue types that contributed to the final predictions. The performance for each query set is assessed by 10-fold cross-validation. Finally, users can use the WeGET to predict novel genes that co-express with a custom query set. PMID:26582928

  9. Co-expression analysis of differentially expressed genes in hepatitis C virus-induced hepatocellular carcinoma.

    PubMed

    Song, Qingfeng; Zhao, Chang; Ou, Shengqiu; Meng, Zhibin; Kang, Ping; Fan, Liwei; Qi, Feng; Ma, Yilong

    2015-01-01

    The aim of the current study was to investigate the molecular mechanisms underlying hepatitis C virus (HCV)-induced hepatocellular carcinoma (HCC) using the expression profiles of HCV-infected Huh7 cells at different time points. The differentially expressed genes (DEGs) were identified with the Samr package in R software once the data were normalized. Functional and pathway enrichment analysis of the identified DEGs was also performed. Subsequently, MCODE in Cytoscape software was applied to conduct module analysis of the constructed co-expression networks. A total of 1,100 DEGs were identified between the HCV-infected and control samples at 12, 18, 24 and 48 h post-infection. DEGs at 24 and 48 h were involved in the same signaling pathways and biological processes, including sterol biosynthetic processes and tRNA amino-acylation. There were 22 time series genes which were clustered into 3 expression patterns, and the demarcation point of the 2 expression patterns that 401 overlapping DEGs at 24 and 48 h clustered into was 24 h post-infection. tRNA synthesis-related biological processes emerged at 24 and 48 h. Replication and assembly of HCV in HCV-infected Huh7 cells occurred mainly at 24 h post-infection. In view of this, the screened time series genes have the potential to become candidate target molecules for monitoring, diagnosing and treating HCV-induced HCC. PMID:25339452

  10. Co-expression analysis of differentially expressed genes in hepatitis C virus-induced hepatocellular carcinoma

    PubMed Central

    SONG, QINGFENG; ZHAO, CHANG; OU, SHENGQIU; MENG, ZHIBIN; KANG, PING; FAN, LIWEI; QI, FENG; MA, YILONG

    2015-01-01

    The aim of the current study was to investigate the molecular mechanisms underlying hepatitis C virus (HCV)-induced hepatocellular carcinoma (HCC) using the expression profiles of HCV-infected Huh7 cells at different time points. The differentially expressed genes (DEGs) were identified with the Samr package in R software once the data were normalized. Functional and pathway enrichment analysis of the identified DEGs was also performed. Subsequently, MCODE in Cytoscape software was applied to conduct module analysis of the constructed co-expression networks. A total of 1,100 DEGs were identified between the HCV-infected and control samples at 12, 18, 24 and 48 h post-infection. DEGs at 24 and 48 h were involved in the same signaling pathways and biological processes, including sterol biosynthetic processes and tRNA amino-acylation. There were 22 time series genes which were clustered into 3 expression patterns, and the demarcation point of the 2 expression patterns that 401 overlapping DEGs at 24 and 48 h clustered into was 24 h post-infection. tRNA synthesis-related biological processes emerged at 24 and 48 h. Replication and assembly of HCV in HCV-infected Huh7 cells occurred mainly at 24 h post-infection. In view of this, the screened time series genes have the potential to become candidate target molecules for monitoring, diagnosing and treating HCV-induced HCC. PMID:25339452

  11. Motif discovery in promoters of genes co-localized and co-expressed during myeloid cells differentiation

    PubMed Central

    Coppe, Alessandro; Ferrari, Francesco; Bisognin, Andrea; Danieli, Gian Antonio; Ferrari, Sergio; Bicciato, Silvio; Bortoluzzi, Stefania

    2009-01-01

    Co-expressed genes may be under similar promoter-based and/or position-based regulation. Although data on expression, position and function of human genes are available, their true integration still represents a challenge for computational biology, hampering the identification of regulatory mechanisms. We carried out an integrative analysis of genomic position, functional annotation and promoters of genes expressed in myeloid cells. Promoter analysis was conducted by a novel multi-step method for discovering putative regulatory elements, i.e. over-represented motifs, in a selected set of promoters, as compared with a background model. The combination of transcriptional, structural and functional data allowed the identification of sets of promoters pertaining to groups of genes co-expressed and co-localized in regions of the human genome. The application of motif discovery to 26 groups of genes co-expressed during myeloid cell differentiation and co-localized in the genome showed that there are more over-represented motifs in promoters of co-expressed and co-localized genes than in promoters of simply co-expressed genes (CEG). Motifs that are similar to the binding sequences of known transcription factors, non-uniformly distributed along promoter sequences and/or occurring in a highly co-expressed subset of genes were identified. Co-expressed and co-localized gene sets were grouped in two co-expressed genomic meta-regions, putatively representing functional domains of a high-level expression regulation. PMID:19059999

  12. Co-expression of soybean Dicer-like genes in response to stress and development.

    PubMed

    Curtin, Shaun J; Kantar, Michael B; Yoon, Han W; Whaley, Adam M; Schlueter, Jessica A; Stupar, Robert M

    2012-11-01

    Regulation of gene transcription and post-transcriptional processes is critical for proper development, genome integrity, and stress responses in plants. Many genes involved in the key processes of transcriptional and post-transcriptional regulation have been well studied in model diploid organisms. However, gene and genome duplication may alter the function of the genes involved in these processes. To address this question, we assayed the stress-induced transcription patterns of duplicated gene pairs involved in RNAi and DNA methylation processes in the paleopolyploid soybean. Real-time quantitative PCR and Sequenom MassARRAY expression assays were used to profile the relative expression ratios of eight gene pairs across eight different biotic and abiotic stress conditions. The transcriptional responses to stress for genes involved in DNA methylation, RNAi processing, and miRNA processing were compared. The strongest evidence for pairwise co-expression in response to stresses was exhibited by non-paralogous Dicer-like (DCL) genes GmDCL2a-GmDCL3a and GmDCL1b-GmDCL2b, most profoundly in root tissues. Among homoeologous or paralogous DCL genes, the Dicer-like 2 (DCL2) gene pair exhibited the strongest response to stress and most conserved co-expression pattern. This was surprising because the DCL2 duplication event is more ancient than the other DCL duplications. Possible mechanisms that may be driving the DCL2 co-expression are discussed. PMID:22527487

  13. Age gene expression and coexpression progressive signatures in peripheral blood leukocytes.

    PubMed

    Irizar, Haritz; Goñi, Joaquín; Alzualde, Ainhoa; Castillo-Triviño, Tamara; Olascoaga, Javier; Lopez de Munain, Adolfo; Otaegui, David

    2015-12-01

    Both cellular senescence and organismic aging are known to be dynamic processes that start early in life and progress constantly during the whole life of the individual. In this work, with the objective of identifying signatures of age-related progressive change at the transcriptomic level, we have performed a whole-genome gene expression analysis of peripheral blood leukocytes in a group of healthy individuals with ages ranging from 14 to 93 years. A set of genes with progressively changing gene expression (either increase or decrease with age) has been identified and contextualized in a coexpression network. A modularity analysis has been performed on this network and biological-term and pathway enrichment analyses have been used for biological interpretation of each module. In summary, the results of the present work reveal the existence of a transcriptomic component that shows progressive expression changes associated with age in peripheral blood leukocytes, highlighting both the dynamic nature of the process and the need to complement young vs. elderly studies with longitudinal studies that include middle-aged individuals. From the transcriptional point of view, immunosenescence seems to occur from a relatively early age, at least from the late 20s/early 30s, and the 49-56 year age range appears to be critical. In general, the genes that, according to our results, show progressive expression changes with aging are involved in pathogenic/cellular processes that have classically been linked to aging in humans: cancer, immune processes and cellular growth vs. maintenance. PMID:26362218

  14. Whole genome co-expression analysis of soybean cytochrome P450 genes identifies nodulation-specific P450 monooxygenases

    PubMed Central

    2010-01-01

    co-expressed with several genes encoding isoflavonoid-related metabolic enzymes. We then focused on nodulation-induced P450s and found that CYP728H1 was co-expressed with the genes involved in phenylpropanoid metabolism. Similarly, CYP736A34 was highly co-expressed with lipoxygenase, lectin and CYP83D1, all of which are involved in root and nodule development. Conclusions: The genome-scale analysis of P450s in soybean reveals many unique features of these important enzymes in this crop, although the functions of most of them are largely unknown. Gene co-expression analysis proves to be a useful tool to infer the function of uncharacterized genes. Our work presented here could provide important leads toward functional genomics studies of soybean P450s and their regulatory network through the integration of reverse genetics, biochemistry, and metabolic profiling tools. The identification of nodule-specific P450s and their further exploitation may help us to better understand the intriguing process of soybean and rhizobium interaction. PMID:21062474

  15. Predicting targeted drug combinations based on Pareto optimal patterns of coexpression network connectivity

    PubMed Central

    2014-01-01

    Background: Molecularly targeted drugs promise a safer and more effective treatment modality than conventional chemotherapy for cancer patients. However, tumors are dynamic systems that readily adapt to these agents, activating alternative survival pathways as they evolve resistant phenotypes. Combination therapies can overcome resistance but finding the optimal combinations efficiently presents a formidable challenge. Here we introduce a new paradigm for the design of combination therapy treatment strategies that exploits the tumor adaptive process to identify context-dependent essential genes as druggable targets. Methods: We have developed a framework to mine high-throughput transcriptomic data, based on differential coexpression and Pareto optimization, to investigate drug-induced tumor adaptation. We use this approach to identify tumor-essential genes as druggable candidates. We apply our method to a set of ER+ breast tumor samples, collected before (n = 58) and after (n = 60) neoadjuvant treatment with the aromatase inhibitor letrozole, to prioritize genes as targets for combination therapy with letrozole treatment. We validate letrozole-induced tumor adaptation through coexpression and pathway analyses in an independent data set (n = 18). Results: We find pervasive differential coexpression between the untreated and letrozole-treated tumor samples as evidence of letrozole-induced tumor adaptation. Based on patterns of coexpression, we identify ten genes as potential candidates for combination therapy with letrozole including EPCAM, a letrozole-induced essential gene and a target to which drugs have already been developed as cancer therapeutics. Through replication, we validate six letrozole-induced coexpression relationships and confirm the epithelial-to-mesenchymal transition as a process that is upregulated in the residual tumor samples following letrozole treatment. Conclusions: To derive the greatest benefit from molecularly targeted drugs it is

  16. The Detection of Metabolite-Mediated Gene Module Co-Expression Using Multivariate Linear Models

    PubMed Central

    Padayachee, Trishanta; Khamiakova, Tatsiana; Shkedy, Ziv; Perola, Markus; Salo, Perttu; Burzykowski, Tomasz

    2016-01-01

    Investigating whether metabolites regulate the co-expression of a predefined gene module is one of the relevant questions posed in the integrative analysis of metabolomic and transcriptomic data. This article concerns the integrative analysis of the two high-dimensional datasets by means of multivariate models and statistical tests for the dependence between metabolites and the co-expression of a gene module. The general linear model (GLM) for correlated data that we propose models the dependence between adjusted gene expression values through a block-diagonal variance-covariance structure formed by metabolic-subset specific general variance-covariance blocks. The performance of statistical tests for the inference of conditional co-expression is evaluated through a simulation study. The proposed methodology is applied to the gene expression data of the previously characterized lipid-leukocyte module. Our results show that the GLM approach improves on a previous approach by being less prone to the detection of spurious conditional co-expression. PMID:26918614

  17. Differential co-expression analysis of venous thromboembolism based on gene expression profile data

    PubMed Central

    MING, ZHIBING; DING, WENBIN; YUAN, RUIFAN; JIN, JIE; LI, XIAOQIANG

    2016-01-01

    The aim of the present study was to screen differentially co-expressed genes and the involved transcription factors (TFs) and microRNAs (miRNAs) in venous thromboembolism (VTE). Microarray data of GSE19151 were downloaded from Gene Expression Omnibus, including 70 patients with VTE and 63 healthy controls. Principal component analysis (PCA) was performed using R software. Differential co-expression analysis was performed using R, followed by screening of modules using Cytoscape. Functional annotation was performed using the Database for Annotation, Visualization, and Integrated Discovery. Moreover, the Fisher test was used to screen key TFs and miRNAs for the modules. PCA revealed that the disease and healthy samples could not be distinguished at the gene expression level. A total of 4,796 upregulated differentially co-expressed genes (e.g. zinc finger protein 264, electron-transfer-flavoprotein, beta polypeptide and Janus kinase 2) and 3,629 downregulated differentially co-expressed genes (e.g. adenylate cyclase 7 and single-stranded DNA binding protein 2) were identified, which were further mined to obtain 17 and 8 modules, respectively. Functional annotation revealed that the largest upregulated module was primarily associated with acetylation and the largest downregulated module was mainly associated with the mitochondrion. Moreover, 48 TFs and 62 miRNA families were screened for the 17 upregulated modules, such as E2F transcription factor 4, miR-30 and miR-135 regulating the largest module. Conversely, 35 TFs and 18 miRNA families were identified for the 8 downregulated modules, including mitochondrial ribosomal protein S12 and miR-23 regulating the largest module. Differentially co-expressed genes regulated by TFs and miRNAs may jointly contribute to the abnormal acetylation and mitochondrion presentation in the progression of VTE. PMID:27284300
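
    The abstract above does not say how differential co-expression was scored, so the following is only a minimal sketch of one common, simple definition: the change in Pearson correlation of a gene pair between cases and controls. The gene names, the toy expression values and the function names are all hypothetical, and this is not claimed to be the authors' procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def differential_coexpression(cases, controls):
    """Return {(gene1, gene2): r_cases - r_controls} for every gene pair.

    `cases` and `controls` map a gene name to its expression values
    across the samples of that group.
    """
    genes = sorted(set(cases) & set(controls))
    scores = {}
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r_case = pearson(cases[g1], cases[g2])
            r_ctrl = pearson(controls[g1], controls[g2])
            scores[(g1, g2)] = r_case - r_ctrl
    return scores


# Hypothetical toy data: geneA and geneB are positively correlated in
# cases but negatively correlated in controls.
cases = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
controls = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [8.0, 6.0, 4.0, 2.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

scores = differential_coexpression(cases, controls)
for pair, delta in sorted(scores.items(), key=lambda kv: -abs(kv[1])):
    print(pair, round(delta, 3))
```

    In this toy setting the geneA/geneC pair keeps the same correlation in both groups and scores zero, while the two pairs involving geneB flip sign and score the maximum magnitude of 2. A real analysis would also need a significance test and multiple-testing correction, which this sketch omits.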

  18. Characterization of Chemically Induced Liver Injuries Using Gene Co-Expression Modules

    PubMed Central

    Tawa, Gregory J.; AbdulHameed, Mohamed Diwan M.; Yu, Xueping; Kumar, Kamal; Ippolito, Danielle L.; Lewis, John A.; Stallings, Jonathan D.; Wallqvist, Anders

    2014-01-01

    Liver injuries due to ingestion or exposure to chemicals and industrial toxicants pose a serious health risk that may be hard to assess due to a lack of non-invasive diagnostic tests. Mapping chemical injuries to organ-specific damage and clinical outcomes via biomarkers or biomarker panels will provide the foundation for highly specific and robust diagnostic tests. Here, we have used DrugMatrix, a toxicogenomics database containing organ-specific gene expression data matched to dose-dependent chemical exposures and adverse clinical pathology assessments in Sprague Dawley rats, to identify groups of co-expressed genes (modules) specific to injury endpoints in the liver. We identified 78 such gene co-expression modules associated with 25 diverse injury endpoints categorized from clinical pathology, organ weight changes, and histopathology. Using gene expression data associated with an injury condition, we showed that these modules exhibited different patterns of activation characteristic of each injury. We further showed that specific module genes mapped to 1) known biochemical pathways associated with liver injuries and 2) clinically used diagnostic tests for liver fibrosis. As such, the gene modules have characteristics of both generalized and specific toxic response pathways. Using these results, we proposed three gene signature sets characteristic of liver fibrosis, steatosis, and general liver injury based on genes from the co-expression modules. Out of all 92 identified genes, 18 (20%) genes have well-documented relationships with liver disease, whereas the rest are novel and have not previously been associated with liver disease. In conclusion, identifying gene co-expression modules associated with chemically induced liver injuries aids in generating testable hypotheses and has the potential to identify putative biomarkers of adverse health effects. PMID:25226513

  19. Coexpression of two closely linked avian genes for purine nucleotide synthesis from a bidirectional promoter.

    PubMed Central

    Gavalas, A; Dixon, J E; Brayton, K A; Zalkin, H

    1993-01-01

    Two avian genes encoding essential steps in the purine nucleotide biosynthetic pathway are transcribed divergently from a bidirectional promoter element. The bidirectional promoter, embedded in a CpG island, directs coexpression of GPAT and AIRC genes from distinct transcriptional start sites 229 bp apart. The bidirectional promoter can be divided in half, with each half retaining partial activity towards the cognate gene. GPAT and AIRC genes encode the enzymes that catalyze step 1 and steps 6 plus 7, respectively, in the de novo purine biosynthetic pathway. This is the first report of genes coding for structurally unrelated enzymes of the same pathway that are tightly linked and transcribed divergently from a bidirectional promoter. This arrangement has the potential to provide for regulated coexpression comparable to that in a prokaryotic operon. PMID:8336716

  20. Use of transcriptomics and co-expression networks to analyze the interconnections between nitrogen assimilation and photorespiratory metabolism

    PubMed Central

    Pérez-Delgado, Carmen M.; Moyano, Tomás C.; García-Calderón, Margarita; Canales, Javier; Gutiérrez, Rodrigo A.; Márquez, Antonio J.; Betti, Marco

    2016-01-01

    Nitrogen is one of the most important nutrients for plants and, in natural soils, its availability is often a major limiting factor for plant growth. Here we examine the effect of different forms of nitrogen nutrition and of photorespiration on gene expression in the model legume Lotus japonicus with the aim of identifying regulatory candidate genes co-ordinating primary nitrogen assimilation and photorespiration. The transcriptomic changes produced by the use of different nitrogen sources in leaves of L. japonicus plants combined with the transcriptomic changes produced in the same tissue by different photorespiratory conditions were examined. The results obtained provide novel information on the possible role of plastidic glutamine synthetase in the response to different nitrogen sources and in the C/N balance of L. japonicus plants. The use of gene co-expression networks establishes a clear relationship between photorespiration and primary nitrogen assimilation and identifies possible transcription factors connected to the genes of both routes. PMID:27117340

  1. Use of transcriptomics and co-expression networks to analyze the interconnections between nitrogen assimilation and photorespiratory metabolism.

    PubMed

    Pérez-Delgado, Carmen M; Moyano, Tomás C; García-Calderón, Margarita; Canales, Javier; Gutiérrez, Rodrigo A; Márquez, Antonio J; Betti, Marco

    2016-05-01

    Nitrogen is one of the most important nutrients for plants and, in natural soils, its availability is often a major limiting factor for plant growth. Here we examine the effect of different forms of nitrogen nutrition and of photorespiration on gene expression in the model legume Lotus japonicus with the aim of identifying regulatory candidate genes co-ordinating primary nitrogen assimilation and photorespiration. The transcriptomic changes produced by the use of different nitrogen sources in leaves of L. japonicus plants combined with the transcriptomic changes produced in the same tissue by different photorespiratory conditions were examined. The results obtained provide novel information on the possible role of plastidic glutamine synthetase in the response to different nitrogen sources and in the C/N balance of L. japonicus plants. The use of gene co-expression networks establishes a clear relationship between photorespiration and primary nitrogen assimilation and identifies possible transcription factors connected to the genes of both routes. PMID:27117340

  2. ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression

    PubMed Central

    Aoki, Yuichi; Okamura, Yasunobu; Tadaka, Shu; Kinoshita, Kengo; Obayashi, Takeshi

    2016-01-01

    ATTED-II (http://atted.jp) is a coexpression database for plant species with parallel views of multiple coexpression data sets and network analysis tools. The user can efficiently find functional gene relationships and design experiments to identify gene functions by reverse genetics and general molecular biology techniques. Here, we report updates to ATTED-II (version 8.0), including new and updated coexpression data and analysis tools. ATTED-II now includes eight microarray- and six RNA sequencing-based coexpression data sets for seven dicot species (Arabidopsis, field mustard, soybean, barrel medick, poplar, tomato and grape) and two monocot species (rice and maize). Stand-alone coexpression analyses tend to have low reliability. Therefore, examining evolutionarily conserved coexpression is a more effective approach from the viewpoints of reliability and evolutionary importance. In contrast, the reliability of species-specific coexpression data remains poor. Our assessment scores for individual coexpression data sets indicated that the quality of the new coexpression data sets in ATTED-II is higher than for any previous coexpression data set. In addition, five species (Arabidopsis, soybean, tomato, rice and maize) in ATTED-II are now supported by both microarray- and RNA sequencing-based coexpression data, which has increased the reliability. Consequently, ATTED-II can now provide lineage-specific coexpression information. As an example of the use of ATTED-II to explore lineage-specific coexpression, we demonstrate monocot- and dicot-specific coexpression of cell wall genes. With the expanded coexpression data for multilevel evaluation, ATTED-II provides new opportunities to investigate lineage-specific evolution in plants. PMID:26546318

  3. Increased co-expression of genes harboring the damaging de novo mutations in Chinese schizophrenic patients during prenatal development

    PubMed Central

    Wang, Qiang; Li, Miaoxin; Yang, Zhenxing; Hu, Xun; Wu, Hei-Man; Ni, Peiyan; Ren, Hongyan; Deng, Wei; Li, Mingli; Ma, Xiaohong; Guo, Wanjun; Zhao, Liansheng; Wang, Yingcheng; Xiang, Bo; Lei, Wei; Sham, Pak C; Li, Tao

    2015-01-01

    Schizophrenia is a heritable, heterogeneous common psychiatric disorder. In this study, we evaluated the hypothesis that de novo variants (DNVs) contribute to the pathogenesis of schizophrenia. We performed exome sequencing in Chinese patients (N = 45) with schizophrenia and their unaffected parents (N = 90). Forty genes were found to contain DNVs. These genes had an enriched transcriptional co-expression profile in prenatal frontal cortex (Bonferroni corrected p < 9.1 × 10⁻³), and in prenatal temporal and parietal regions (Bonferroni corrected p < 0.03). Also, four prenatal anatomical subregions (VCF, MFC, OFC and ITC) showed significant enrichment of connectedness in co-expression networks. Moreover, four genes (LRP1, MACF1, DICER1 and ABCA2) harboring the damaging de novo mutations are strongly prioritized as susceptibility genes by multiple lines of evidence. Our findings in Chinese schizophrenic patients indicate the pathogenic role of DNVs, supporting the hypothesis that schizophrenia is a neurodevelopmental disease. PMID:26666178

  4. In silico prioritization based on coexpression can aid epileptic encephalopathy gene discovery

    PubMed Central

    Oliver, Karen L.; Lukic, Vesna; Freytag, Saskia; Scheffer, Ingrid E.; Berkovic, Samuel F.

    2016-01-01

    Objective: To evaluate the performance of an in silico prioritization approach that was applied to 179 epileptic encephalopathy candidate genes in 2013 and to expand the application of this approach to the whole genome based on expression data from the Allen Human Brain Atlas. Methods: PubMed searches determined which of the 179 epileptic encephalopathy candidate genes had been validated. For validated genes, it was noted whether they were 1 of the 19 of 179 candidates prioritized in 2013. The in silico prioritization approach was applied genome-wide; all genes were ranked according to their coexpression strength with a reference set (i.e., 51 established epileptic encephalopathy genes) in both adult and developing human brain expression data sets. Candidate genes ranked in the top 10% for both data sets were cross-referenced with genes previously implicated in the epileptic encephalopathies due to a de novo variant. Results: Five of 6 validated epileptic encephalopathy candidate genes were among the 19 prioritized in 2013 (odds ratio = 54, 95% confidence interval [7,∞], p = 4.5 × 10⁻⁵, Fisher exact test); one gene was false negative. A total of 297 genes ranked in the top 10% for both the adult and developing brain data sets based on coexpression with the reference set. Of these, 9 had been previously implicated in the epileptic encephalopathies (FBXO41, PLXNA1, ACOT4, PAK6, GABBR2, YWHAG, NBEA, KNDC1, and SELRC1). Conclusions: We conclude that brain gene coexpression data can be used to assist epileptic encephalopathy gene discovery and propose 9 genes as strong epileptic encephalopathy candidates worthy of further investigation. PMID:27066588
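
    The Fisher exact test quoted in the Results above can be checked from the counts given in the abstract. The sketch below assumes a 2x2 table built from those counts (179 candidates, 19 prioritized, 6 validated, 5 both) and a one-sided test, neither of which the abstract spells out; the variable and function names are illustrative only.

```python
from math import comb

# Counts taken from the abstract; the 2x2 layout is our reading of them.
TOTAL_CANDIDATES = 179          # candidate genes assessed in 2013
PRIORITIZED = 19                # candidates prioritized in 2013
VALIDATED = 6                   # candidates validated since then
VALIDATED_AND_PRIORITIZED = 5   # validated genes that had been prioritized


def hypergeom_upper_tail(total, marked, drawn, observed):
    """P(X >= observed) when `drawn` items are sampled without replacement
    from `total` items, `marked` of which are of interest."""
    upper = min(marked, drawn)
    favourable = sum(
        comb(marked, k) * comb(total - marked, drawn - k)
        for k in range(observed, upper + 1)
    )
    return favourable / comb(total, drawn)


# One-sided Fisher exact p-value (assumed sidedness).
p_one_sided = hypergeom_upper_tail(
    TOTAL_CANDIDATES, PRIORITIZED, VALIDATED, VALIDATED_AND_PRIORITIZED
)

# Sample (cross-product) odds ratio from the same 2x2 table.
a = VALIDATED_AND_PRIORITIZED             # prioritized, validated
b = PRIORITIZED - a                       # prioritized, not validated
c = VALIDATED - a                         # not prioritized, validated
d = TOTAL_CANDIDATES - PRIORITIZED - c    # not prioritized, not validated
sample_odds_ratio = (a * d) / (b * c)

print(f"one-sided p = {p_one_sided:.2e}")
print(f"sample odds ratio = {sample_odds_ratio:.1f}")
```

    Under these assumptions the one-sided p-value comes out at about 4.5 × 10⁻⁵, in line with the reported value. The cross-product odds ratio is about 56.8 rather than the reported 54; the gap presumably reflects a different estimator (for example a conditional maximum-likelihood estimate), but that is an assumption, not something the abstract states.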

  5. Coexpression Network Analysis in Abdominal and Gluteal Adipose Tissue Reveals Regulatory Genetic Loci for Metabolic Syndrome and Related Phenotypes

    PubMed Central

    Min, Josine L.; Nicholson, George; Halgrimsdottir, Ingileif; Almstrup, Kristian; Petri, Andreas; Barrett, Amy; Travers, Mary; Rayner, Nigel W.; Mägi, Reedik; Pettersson, Fredrik H.; Broxholme, John; Neville, Matt J.; Wills, Quin F.; Cheeseman, Jane; Allen, Maxine; Holmes, Chris C.; Spector, Tim D.; Fleckner, Jan; McCarthy, Mark I.; Karpe, Fredrik; Lindgren, Cecilia M.; Zondervan, Krina T.

    2012-01-01

    Metabolic Syndrome (MetS) is highly prevalent and has considerable public health impact, but its underlying genetic factors remain elusive. To identify gene networks involved in MetS, we conducted whole-genome expression and genotype profiling on abdominal (ABD) and gluteal (GLU) adipose tissue, and whole blood (WB), from 29 MetS cases and 44 controls. Co-expression network analysis for each tissue independently identified nine, six, and zero MetS–associated modules of coexpressed genes in ABD, GLU, and WB, respectively. Of 8,992 probesets expressed in ABD or GLU, 685 (7.6%) were expressed in ABD and 51 (0.6%) in GLU only. Differential eigengene network analysis of 8,256 shared probesets detected 22 shared modules with high preservation across adipose depots (D_ABD-GLU = 0.89), seven of which were associated with MetS (FDR P < 0.01). The strongest associated module, significantly enriched for immune response–related processes, contained 94/620 (15%) genes with inter-depot differences. In an independent cohort of 145/141 twins with ABD and WB longitudinal expression data, median variability in ABD due to familiality was greater for MetS–associated versus un-associated modules (ABD: 0.48 versus 0.18, P = 0.08; GLU: 0.54 versus 0.20, P = 7.8 × 10⁻⁴). Cis-eQTL analysis of probesets associated with MetS (FDR P < 0.01) and/or inter-depot differences (FDR P < 0.01) provided evidence for 32 eQTLs. Corresponding eSNPs were tested for association with MetS–related phenotypes in two GWAS of >100,000 individuals; rs10282458, affecting expression of RARRES2 (encoding chemerin), was associated with body mass index (BMI) (P = 6.0 × 10⁻⁴); and rs2395185, affecting inter-depot differences of HLA-DRB1 expression, was associated with high-density lipoprotein (P = 8.7 × 10⁻⁴) and BMI–adjusted waist-to-hip ratio (P = 2.4 × 10⁻⁴). Since many genes and their interactions influence complex traits such as MetS, integrated analysis of genotypes and

  6. Computational gene expression profiling under salt stress reveals patterns of co-expression.

    PubMed

    Sanchita; Sharma, Ashok

    2016-03-01

    Plants respond differently to environmental conditions. Among various abiotic stresses, salt stress is a condition where excess salt in soil causes inhibition of plant growth. To understand the response of plants to the stress conditions, identification of the responsible genes is required. Clustering is a data mining technique used to group the genes with similar expression. The genes of a cluster show similar expression and function. We applied clustering algorithms on gene expression data of Solanum tuberosum showing differential expression in Capsicum annuum under salt stress. The clusters that were common to multiple algorithms were taken forward for analysis. Principal component analysis (PCA) further validated the findings of the other clustering algorithms by visualizing their clusters in three-dimensional space. Functional annotation results revealed that most of the genes were involved in stress-related responses. Our findings suggest that these algorithms may be helpful in the prediction of the function of co-expressed genes. PMID:26981411

  7. Datasets of genes coexpressed with FBN1 in mouse adipose tissue and during human adipogenesis.

    PubMed

    Davis, Margaret R; Arner, Erik; Duffy, Cairnan R E; De Sousa, Paul A; Dahlman, Ingrid; Arner, Peter; Summers, Kim M

    2016-09-01

    This article contains data related to the research article entitled "Expression of FBN1 during adipogenesis: relevance to the lipodystrophy phenotype in Marfan syndrome and related conditions" [1]. The article concerns the expression of FBN1, the gene encoding the extracellular matrix protein fibrillin-1, during adipogenesis in vitro and in relation to adipose tissue in vivo. The encoded protein has recently been shown to produce a short glucogenic peptide hormone (Romere et al., 2016) [2], and this gene is therefore a key gene for regulating blood glucose levels. FBN1 and coexpressed genes were examined in mouse strains and in human cells undergoing adipogenesis. The data show the genes that were coexpressed with FBN1, including genes coding for other connective tissue proteins and the proteases that modify them and for the transcription factors that control their expression. The data analysed were derived from datasets available in the public domain, and the analysis highlights the utility of such datasets for ongoing analysis and hence reduction in the use of experimental animals. PMID:27508231

  8. Salmonid genomes have a remarkably expanded akirin family, coexpressed with genes from conserved pathways governing skeletal muscle growth and catabolism

    PubMed Central

    Macqueen, Daniel J.; Kristjánsson, Bjarni K.; Johnston, Ian A.

    2010-01-01

    Metazoan akirin genes regulate innate immunity, myogenesis, and carcinogenesis. Invertebrates typically have one family member, while most tetrapod and teleost vertebrates have one to three. We demonstrate an expanded repertoire of eight family members in genomes of four salmonid fishes, owing to paralog preservation after three tetraploidization events. Retention of paralogs secondarily lost in other teleosts may be related to functional diversification and posttranslational regulation. We hypothesized that salmonid akirins would be transcriptionally regulated in fast-twitch skeletal muscle during activation of conserved pathways governing catabolism and growth. The in vivo nutritional state of Arctic charr (Salvelinus alpinus L.) was experimentally manipulated, and transcript levels for akirin family members and 26 other genes were measured by quantitative real-time PCR (qPCR), allowing the establishment of a similarity network of expression profiles. In fasted muscle, a class of akirins was upregulated, with one family member showing high coexpression with catabolic genes coding the NF-κB p65 subunit, E2 ubiquitin-conjugating enzymes, E3 ubiquitin ligases, and IGF-I receptors. Another class of akirin was upregulated with subsequent feeding, coexpressed with 14-3-3 protein genes. There was no similarity between expression profiles of akirins with IGF hormones or binding protein genes. The level of phylogenetic relatedness of akirin family members was not a strong predictor of transcriptional responses to nutritional state, or differences in transcript abundance levels, indicating a complex pattern of regulatory evolution. The salmonid akirins epitomize the complexity linking the genome to physiological phenotypes of vertebrates with a history of tetraploidization. PMID:20388840

  9. Salmonid genomes have a remarkably expanded akirin family, coexpressed with genes from conserved pathways governing skeletal muscle growth and catabolism.

    PubMed

    Macqueen, Daniel J; Kristjánsson, Bjarni K; Johnston, Ian A

    2010-06-01

    Metazoan akirin genes regulate innate immunity, myogenesis, and carcinogenesis. Invertebrates typically have one family member, while most tetrapod and teleost vertebrates have one to three. We demonstrate an expanded repertoire of eight family members in genomes of four salmonid fishes, owing to paralog preservation after three tetraploidization events. Retention of paralogs secondarily lost in other teleosts may be related to functional diversification and posttranslational regulation. We hypothesized that salmonid akirins would be transcriptionally regulated in fast-twitch skeletal muscle during activation of conserved pathways governing catabolism and growth. The in vivo nutritional state of Arctic charr (Salvelinus alpinus L.) was experimentally manipulated, and transcript levels for akirin family members and 26 other genes were measured by quantitative real-time PCR (qPCR), allowing the establishment of a similarity network of expression profiles. In fasted muscle, a class of akirins was upregulated, with one family member showing high coexpression with catabolic genes coding the NF-kappaB p65 subunit, E2 ubiquitin-conjugating enzymes, E3 ubiquitin ligases, and IGF-I receptors. Another class of akirin was upregulated with subsequent feeding, coexpressed with 14-3-3 protein genes. There was no similarity between expression profiles of akirins with IGF hormones or binding protein genes. The level of phylogenetic relatedness of akirin family members was not a strong predictor of transcriptional responses to nutritional state, or differences in transcript abundance levels, indicating a complex pattern of regulatory evolution. The salmonid akirins epitomize the complexity linking the genome to physiological phenotypes of vertebrates with a history of tetraploidization. PMID:20388840

  10. An expression quantitative trait loci-guided co-expression analysis for constructing regulatory network using a rice recombinant inbred line population.

    PubMed

    Wang, Jia; Yu, Huihui; Weng, Xiaoyu; Xie, Weibo; Xu, Caiguo; Li, Xianghua; Xiao, Jinghua; Zhang, Qifa

    2014-03-01

    The ability to reveal the regulatory architecture of genes at the whole-genome level by constructing a regulatory network is critical for understanding the biological processes and developmental programmes of organisms. Here, we conducted an eQTL-guided function-related co-expression analysis to identify putative regulators and construct a gene regulatory network. We performed an eQTL analysis of 210 recombinant inbred lines (RILs) derived from a cross between two indica rice lines, Zhenshan 97 and Minghui 63, the parents of an elite hybrid, using data obtained by hybridizing RNA samples of flag leaves at the heading stage with Affymetrix whole-genome arrays. Making use of an ultrahigh-density single-nucleotide polymorphism bin map constructed by population sequencing, 13 647 eQTLs for 10 725 e-traits were detected, comprising 5079 cis-eQTLs (37.2%) and 8568 trans-eQTLs (62.8%). The analysis revealed 138 trans-eQTL hotspots, each of which apparently regulates the expression variations of many genes. Co-expression analysis of functionally related genes within the framework of regulator-target relationships outlined by the eQTLs led to the identification of putative regulators in the system. The usefulness of the strategy was demonstrated with genes known to be involved in flowering. We also applied this strategy to the analysis of QTLs for yield traits, which suggested likely candidate genes. eQTL-guided co-expression analysis may provide a promising solution for outlining a framework for the complex regulatory network of an organism. PMID:24420573
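
    The cis/trans split quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the three counts stated in the abstract; the variable and function names are illustrative and do not come from the paper.

```python
# Counts as reported in the abstract (13 647 eQTLs in total).
total_eqtls = 13647
cis_eqtls = 5079
trans_eqtls = 8568


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# The two classes should partition the full set of detected eQTLs.
counts_add_up = (cis_eqtls + trans_eqtls == total_eqtls)

cis_pct = percent(cis_eqtls, total_eqtls)
trans_pct = percent(trans_eqtls, total_eqtls)

print(counts_add_up, cis_pct, trans_pct)
```

    The counts sum to the stated total and the rounded shares agree with the 37.2% and 62.8% given in the abstract.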

  11. MGMT enrichment and second gene co-expression in hematopoietic progenitor cells using separate or dual-gene lentiviral vectors.

    PubMed

    Roth, Justin C; Alberti, Michael O; Ismail, Mourad; Lingas, Karen T; Reese, Jane S; Gerson, Stanton L

    2015-01-22

    The DNA repair gene O(6)-methylguanine-DNA methyltransferase (MGMT) allows efficient in vivo enrichment of transduced hematopoietic stem cells (HSC). Thus, linking this selection strategy to therapeutic gene expression offers the potential to reconstitute diseased hematopoietic tissue with gene-corrected cells. However, different dual-gene expression vector strategies are limited by poor expression of one or both transgenes. To evaluate different co-expression strategies in the context of MGMT-mediated HSC enrichment, we compared selection and expression efficacies in cells cotransduced with separate single-gene MGMT and GFP lentivectors to those obtained with dual-gene vectors employing either encephalomyocarditis virus (EMCV) internal ribosome entry site (IRES) or foot and mouth disease virus (FMDV) 2A elements for co-expression strategies. Each strategy was evaluated in vitro and in vivo using equivalent multiplicities of infection (MOI) to transduce 5-fluorouracil (5-FU) or Lin(-)Sca-1(+)c-kit(+) (LSK)-enriched murine bone marrow cells (BMCs). The highest dual-gene expression (MGMT(+)GFP(+)) percentages were obtained with the FMDV-2A dual-gene vector, but half of the resulting gene products existed as fusion proteins. Following selection, dual-gene expression percentages in single-gene vector cotransduced and dual-gene vector transduced populations were similar. Equivalent MGMT expression levels were obtained with each strategy, but GFP expression levels derived from the IRES dual-gene vector were significantly lower. In mice, vector-insertion averages were similar among cells enriched after dual-gene vectors and those cotransduced with single-gene vectors. These data demonstrate the limitations and advantages of each strategy in the context of MGMT-mediated selection, and may provide insights into vector design with respect to a particular therapeutic gene or hematologic defect. PMID:25479595

  12. MIrExpress: A Database for Gene Coexpression Correlation in Immune Cells Based on Mutual Information and Pearson Correlation.

    PubMed

    Wang, Luman; Mo, Qiaochu; Wang, Jianxin

    2015-01-01

    Most current gene coexpression databases support the analysis of linear correlation between gene pairs, but not of nonlinear correlation, which hinders precise evaluation of gene-gene coexpression strengths. Here, we report a new database, MIrExpress, which takes advantage of information theory, as well as the Pearson linear correlation method, to measure the linear correlation, the nonlinear correlation, and their hybrid for cell-specific gene coexpression in immune cells. For a given gene pair or probe set pair input by web users, both mutual information (MI) and the Pearson correlation coefficient (r) are calculated, and several corresponding values are reported to reflect the nature of their coexpression correlation, including the MI and r values, their respective rank orderings, their rank comparison, and their hybrid correlation value. Furthermore, for a given gene, the 10 most relevant genes are displayed from the MI, r, or hybrid perspective. Currently, the database includes a total of 16 human cell groups, involving 20,283 human genes. The expression data and the calculated correlation results from the database are interactively accessible on the web page and can be used for other related applications and research. PMID:26881263
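
    A small sketch may help show why a database would report both MI and r for a gene pair. It is not the MIrExpress implementation: the equal-width binning, the bin count, and the toy profiles are all assumptions chosen for illustration, and the abstract does not say how MI is estimated.

```python
import math
from collections import Counter


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def equal_width_bins(values, n_bins):
    """Discretize values into n_bins equal-width bins (an assumed estimator choice)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against constant input
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=4):
    """Plug-in estimate of mutual information (in bits) from binned values."""
    bx, by = equal_width_bins(x, n_bins), equal_width_bins(y, n_bins)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log2((c / n) / ((px[i] / n) * (py[j] / n)))
        for (i, j), c in pxy.items()
    )


# Hypothetical profiles: y depends on x, but not linearly.
x = [v / 10 for v in range(-20, 21)]  # symmetric around zero
y = [v * v for v in x]

r = pearson_r(x, y)
mi = mutual_information(x, y)
print(r, mi)
```

    For this symmetric, quadratic relationship the linear coefficient is essentially zero while the binned MI is clearly positive, which is the kind of nonlinear dependence a Pearson-only database would miss.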

  13. MIrExpress: A Database for Gene Coexpression Correlation in Immune Cells Based on Mutual Information and Pearson Correlation

    PubMed Central

    Wang, Luman; Mo, Qiaochu; Wang, Jianxin

    2015-01-01

    Most current gene coexpression databases support the analysis of linear correlation between gene pairs, but not of nonlinear correlation, which hinders precise evaluation of gene-gene coexpression strengths. Here, we report a new database, MIrExpress, which takes advantage of information theory, as well as the Pearson linear correlation method, to measure the linear correlation, the nonlinear correlation, and their hybrid for cell-specific gene coexpression in immune cells. For a given gene pair or probe set pair input by web users, both mutual information (MI) and the Pearson correlation coefficient (r) are calculated, and several corresponding values are reported to reflect the nature of their coexpression correlation, including the MI and r values, their respective rank orderings, their rank comparison, and their hybrid correlation value. Furthermore, for a given gene, the 10 most relevant genes are displayed from the MI, r, or hybrid perspective. Currently, the database includes a total of 16 human cell groups, involving 20,283 human genes. The expression data and the calculated correlation results from the database are interactively accessible on the web page and can be used for other related applications and research. PMID:26881263

  14. Co-Expression Analysis of Fetal Weight-Related Genes in Ovine Skeletal Muscle during Mid and Late Fetal Development Stages

    PubMed Central

    Xu, Lingyang; Zhao, Fuping; Ren, Hangxing; Li, Li; Lu, Jian; Liu, Jiasen; Zhang, Shifang; Liu, George E.; Song, Jiuzhou; Zhang, Li; Wei, Caihong; Du, Lixin

    2014-01-01

    Background: Muscle development and lipid metabolism play important roles during fetal development stages. The commercial Texel sheep are more muscular than the indigenous Ujumqin sheep. Results: We performed serial transcriptomics assays and systems biology analyses to investigate the dynamics of gene expression changes associated with fetal longissimus muscles during different fetal stages in two sheep breeds. In total, we identified 1472 differentially expressed genes during various fetal stages using time-series expression analysis. A systems biology approach, weighted gene co-expression network analysis (WGCNA), was used to detect modules of correlated genes among these 1472 genes. Dramatically different gene modules were identified in four merged datasets, corresponding to the mid fetal stage in Texel and Ujumqin sheep and the late fetal stage in Texel and Ujumqin sheep, respectively. We further detected gene modules significantly correlated with fetal weight, and constructed networks and pathways using genes with high significance. In these gene modules, we identified genes such as TADA3, LMNB1, TGF-β3, EEF1A2, FGFR1, MYOZ1, and FBP2 that correlated with fetal weight. Conclusion: Our study revealed the complex network characteristics involved in muscle development and lipid metabolism during fetal development stages. Diverse patterns of the network connections observed between breeds and fetal stages could involve some hub genes, which play central roles in fetal development, correlating with fetal weight. Our findings could provide potentially valuable biomarkers for selection of body weight-related traits in sheep and other livestock. PMID:25285036

  15. Modified Logistic Regression Models Using Gene Coexpression and Clinical Features to Predict Prostate Cancer Progression

    PubMed Central

    Zhao, Hongya; Logothetis, Christopher J.; Gorlov, Ivan P.; Zeng, Jia; Dai, Jianguo

    2013-01-01

    Predicting disease progression is one of the most challenging problems in prostate cancer research. Adding gene expression data to prediction models that are based on clinical features has been proposed to improve accuracy. In the current study, we applied a logistic regression (LR) model combining clinical features and gene co-expression data to improve the accuracy of the prediction of prostate cancer progression. The top-scoring pair (TSP) method was used to select genes for the model. The proposed models not only preserved the basic properties of the TSP algorithm but also incorporated the clinical features into the prognostic models. Based on the statistical inference with the iterative cross validation, we demonstrated that prediction LR models that included genes selected by the TSP method provided better predictions of prostate cancer progression than those using clinical variables only and/or those that included genes selected by the one-gene-at-a-time approach. Thus, we conclude that TSP selection is a useful tool for feature (and/or gene) selection to use in prognostic models and our model also provides an alternative for predicting prostate cancer progression. PMID:24367394
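
    The gene-selection step named in this abstract, the top-scoring pair (TSP) method, can be sketched briefly. The version below uses the standard TSP score, the absolute difference between classes in how often one gene is expressed below the other. The data, gene names, and function name are hypothetical, and the paper's later step of combining the selected genes with clinical features in a logistic regression model is not shown.

```python
from itertools import combinations


def top_scoring_pair(expr, labels):
    """Return the gene pair with the largest TSP score, and that score.

    expr:   dict mapping gene name -> list of expression values, one per sample
    labels: list of 0/1 class labels, one per sample
    """
    def frac_less(gi, gj, cls):
        # Fraction of samples in class `cls` where gene gi is below gene gj.
        idx = [k for k, lab in enumerate(labels) if lab == cls]
        return sum(expr[gi][k] < expr[gj][k] for k in idx) / len(idx)

    best_pair, best_score = None, -1.0
    for gi, gj in combinations(sorted(expr), 2):
        score = abs(frac_less(gi, gj, 0) - frac_less(gi, gj, 1))
        if score > best_score:
            best_pair, best_score = (gi, gj), score
    return best_pair, best_score


# Hypothetical toy data: geneA and geneB swap order between the two classes,
# while geneC stays above both in every sample.
expr = {
    "geneA": [1.0, 1.2, 0.9, 3.1, 3.0, 2.8],
    "geneB": [2.0, 2.1, 1.8, 1.0, 1.1, 0.9],
    "geneC": [5.0, 5.1, 4.9, 5.2, 5.0, 4.8],
}
labels = [0, 0, 0, 1, 1, 1]

pair, score = top_scoring_pair(expr, labels)
print(pair, score)
```

    Because the score depends only on the relative ordering of two genes within each sample, it is unaffected by per-sample scaling, which is one reason rank-based pair selection is attractive for expression data.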

  16. Module Based Differential Coexpression Analysis Method for Type 2 Diabetes

    PubMed Central

    Yuan, Lin; Zheng, Chun-Hou; Xia, Jun-Feng; Huang, De-Shuang

    2015-01-01

    A growing number of studies have shown that many complex diseases arise jointly from alterations in numerous genes. Genes often act together as a functional biological pathway or network and are highly correlated. Differential coexpression analysis, a more comprehensive technique than differential expression analysis, was developed to study the gene regulatory networks and biological pathways underlying phenotypic changes by measuring changes in gene correlation between disease and normal conditions. In this paper, we propose a gene differential coexpression analysis algorithm at the level of gene sets and apply the algorithm to a publicly available type 2 diabetes (T2D) expression dataset. First, we calculate coexpression biweight midcorrelation coefficients between all gene pairs. Then, we select informative correlation pairs using the “differential coexpression threshold” strategy. Finally, we identify the differential coexpression gene modules using the maximum clique concept and the k-clique algorithm. We apply the proposed differential coexpression analysis method to simulated data and T2D data. Two differential coexpression gene modules related to T2D were detected, which should be useful for exploring the biological function of the related genes. PMID:26339648
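
    The first two steps described in this abstract can be sketched as follows. The biweight midcorrelation follows its usual definition (median-centred values, weights from the median absolute deviation scaled by 9). The pair-selection rule is an assumption: the abstract names a “differential coexpression threshold” strategy without defining it, so the sketch simply keeps pairs whose correlation changes by at least a chosen amount between conditions. All names and data are hypothetical, and the clique-based module detection is not shown.

```python
import math
from statistics import median


def _bicor_terms(x):
    """Median-centred, biweight-weighted deviations (assumes a nonzero MAD)."""
    med = median(x)
    mad = median(abs(v - med) for v in x)
    terms = []
    for v in x:
        u = (v - med) / (9.0 * mad)
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        terms.append((v - med) * w)
    return terms


def bicor(x, y):
    """Biweight midcorrelation of two equal-length lists."""
    a, b = _bicor_terms(x), _bicor_terms(y)
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return num / den


def differential_pairs(normal, disease, threshold):
    """Keep gene pairs whose bicor differs by >= threshold between conditions.

    This selection rule is an illustrative assumption, not the paper's rule.
    """
    genes = sorted(normal)
    kept = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            delta = abs(bicor(disease[g], disease[h]) - bicor(normal[g], normal[h]))
            if delta >= threshold:
                kept.append((g, h, delta))
    return kept


# Hypothetical toy data: only g2 changes its behaviour in the disease samples.
normal = {"g1": [1, 2, 3, 4, 5], "g2": [2, 4, 6, 8, 10], "g3": [1, 3, 2, 5, 4]}
disease = {"g1": [1, 2, 3, 4, 5], "g2": [10, 8, 6, 4, 2], "g3": [1, 3, 2, 5, 4]}

kept = differential_pairs(normal, disease, threshold=1.0)
kept_pairs = {(g, h) for g, h, _ in kept}
print(kept_pairs)
```

    In this toy case the pairs involving g2 are retained and the unchanged g1/g3 pair is dropped; in the approach the abstract describes, such retained pairs would then feed the clique-based module search.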

  17. Novel role of ZmaNAC36 in co-expression of starch synthetic genes in maize endosperm.

    PubMed

    Zhang, Junjie; Chen, Jiang; Yi, Qiang; Hu, Yufeng; Liu, Hanmei; Liu, Yinghong; Huang, Yubi

    2014-02-01

    Starch is an essential commodity that is widely used as food, feed, fuel and in industry. However, its mechanism of synthesis is not fully understood, especially in terms of the expression and regulation of the starch synthetic genes. It was reported that the starch synthetic genes were co-expressed during maize endosperm development; however, the mechanism of the co-expression was not reported. In this paper, the ZmaNAC36 gene was amplified by homology-based cloning, and its expression vector was constructed for transient expression. The nuclear localization, transcriptional activation and target sites of the ZmaNAC36 protein were identified. The expression profile of ZmaNAC36 showed that it was strongly expressed in the maize endosperm and was co-expressed with most of the starch synthetic genes. Moreover, the expressions of many starch synthesis genes in the endosperm were upregulated when ZmaNAC36 was transiently overexpressed. All our results indicated that NAC36 might be a transcription factor and play a potential role in the co-expression of starch synthetic genes in the maize endosperm. PMID:24235061

  18. Mining Temporal Protein Complex Based on the Dynamic PIN Weighted with Connected Affinity and Gene Co-Expression

    PubMed Central

    Shen, Xianjun; Jiang, Xingpeng; He, Tingting; Hu, Xiaohua; Yang, Jincai

    2016-01-01

    The identification of temporal protein complexes would contribute greatly to our knowledge of the dynamic organization characteristics of protein interaction networks (PINs). Recent studies have focused on integrating gene expression data into a static PIN to construct a dynamic PIN that reveals the dynamic evolutionary procedure of protein interactions, but in practice they fail to recognize the active time points of proteins with low or high expression levels. We construct a Time-Evolving PIN (TEPIN) with a novel method called Deviation Degree, which is designed to identify the active time points of proteins based on the deviation degree of their own expression values. Moreover, owing to the differences between protein interactions, we weight the TEPIN with connected affinity and gene co-expression to quantify the degree of these interactions. To validate the efficiency of our methods, the ClusterONE, CAMSE and MCL algorithms are applied to the TEPIN, the DPIN (a dynamic PIN constructed with the state-of-the-art three-sigma method) and the SPIN (the original static PIN) to detect temporal protein complexes. Each algorithm performs better on our TEPIN than on the other networks in terms of match degree, sensitivity, specificity, F-measure and function enrichment. In conclusion, our Deviation Degree method successfully eliminates the disadvantages of previous state-of-the-art dynamic PIN construction methods. Moreover, the biological nature of protein interactions can be well described in our weighted network. The weighted TEPIN is a useful approach for detecting temporal protein complexes and revealing the dynamic protein assembly process for cellular organization. PMID:27100396

  19. Gene co-expression analysis identifies brain regions and cell types involved in migraine pathophysiology: a GWAS-based study using the Allen Human Brain Atlas.

    PubMed

    Eising, Else; Huisman, Sjoerd M H; Mahfouz, Ahmed; Vijfhuizen, Lisanne S; Anttila, Verneri; Winsvold, Bendik S; Kurth, Tobias; Ikram, M Arfan; Freilinger, Tobias; Kaprio, Jaakko; Boomsma, Dorret I; van Duijn, Cornelia M; Järvelin, Marjo-Riitta R; Zwart, John-Anker; Quaye, Lydia; Strachan, David P; Kubisch, Christian; Dichgans, Martin; Davey Smith, George; Stefansson, Kari; Palotie, Aarno; Chasman, Daniel I; Ferrari, Michel D; Terwindt, Gisela M; de Vries, Boukje; Nyholt, Dale R; Lelieveldt, Boudewijn P F; van den Maagdenberg, Arn M J M; Reinders, Marcel J T

    2016-04-01

    Migraine is a common disabling neurovascular brain disorder typically characterised by attacks of severe headache and associated with autonomic and neurological symptoms. Migraine is caused by an interplay of genetic and environmental factors. Genome-wide association studies (GWAS) have identified over a dozen genetic loci associated with migraine. Here, we integrated migraine GWAS data with high-resolution spatial gene expression data of normal adult brains from the Allen Human Brain Atlas to identify specific brain regions and molecular pathways that are possibly involved in migraine pathophysiology. To this end, we used two complementary methods. In GWAS data from 23,285 migraine cases and 95,425 controls, we first studied modules of co-expressed genes that were calculated based on human brain expression data for enrichment of genes that showed association with migraine. Enrichment of a migraine GWAS signal was found for five modules that suggest involvement in migraine pathophysiology of: (i) neurotransmission, protein catabolism and mitochondria in the cortex; (ii) transcription regulation in the cortex and cerebellum; and (iii) oligodendrocytes and mitochondria in subcortical areas. Second, we used the high-confidence genes from the migraine GWAS as a basis to construct local migraine-related co-expression gene networks. Signatures of all brain regions and pathways that were prominent in the first method also surfaced in the second method, thus providing support that these brain regions and pathways are indeed involved in migraine pathophysiology. PMID:26899160

  20. Multi-tissue Analysis of Co-expression Networks by Higher-Order Generalized Singular Value Decomposition Identifies Functionally Coherent Transcriptional Modules

    PubMed Central

    Bottolo, Leonardo; Petretto, Enrico

    2014-01-01

    Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data in multiple conditions (e.g., cell-types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet, genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally-efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network-modules simultaneously across multiple conditions. Our method can accommodate weighted (and unweighted) networks of any size and can similarly use co-expression or raw gene expression input data, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, we demonstrated distinctive advantages of our method over existing methods, which was able to recover accurately both common and condition-specific network-modules without entailing ad-hoc input parameters as required by other approaches. We applied our method to genome-scale and multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance, which were not detected by other methods. In humans we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interactions processes in ventricular zones during neocortex expansion and further, we uncovered pathways related to development of later cognitive functions in the cortical plate of the developing brain which were previously unappreciated. 
Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed

  1. Multi-tissue analysis of co-expression networks by higher-order generalized singular value decomposition identifies functionally coherent transcriptional modules.

    PubMed

    Xiao, Xiaolin; Moreno-Moral, Aida; Rotival, Maxime; Bottolo, Leonardo; Petretto, Enrico

    2014-01-01

    Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data in multiple conditions (e.g., cell-types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet, genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally-efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network-modules simultaneously across multiple conditions. Our method can accommodate weighted (and unweighted) networks of any size and can similarly use co-expression or raw gene expression input data, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, we demonstrated distinctive advantages of our method over existing methods, which was able to recover accurately both common and condition-specific network-modules without entailing ad-hoc input parameters as required by other approaches. We applied our method to genome-scale and multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance, which were not detected by other methods. In humans we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interactions processes in ventricular zones during neocortex expansion and further, we uncovered pathways related to development of later cognitive functions in the cortical plate of the developing brain which were previously unappreciated. 
Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed

  2. Co-expression networks revealed potential core lncRNAs in the triple-negative breast cancer.

    PubMed

    Yang, Fan; Liu, Ye-Huan; Dong, Si-Yang; Yao, Zhi-Han; Lv, Lin; Ma, Rui-Min; Dai, Xuan-Xuan; Wang, Jiao; Zhang, Xiao-Hua; Wang, Ou-Chen

    2016-10-15

    Triple-negative breast cancer (TNBC) is an aggressive type of breast cancer with unfavorable outcome. It is urgent to explore novel biomarkers and potential therapeutic targets in this malignancy. Increasing knowledge of long noncoding RNAs (lncRNAs) significantly deepens our understanding of cancer biology. Here, we sequenced eight paired TNBC tumor tissues and non-cancerous tissues, and validated significantly differentially expressed lncRNAs. Gene ontology (GO) and pathway analysis were used to investigate the function of differentially expressed mRNAs. Further, potential core lncRNAs in TNBC were identified by co-expression networks. Kaplan-Meier analysis also indicated that breast cancer patients with lower expression level of rhabdomyosarcoma 2 associated transcript (RMST), one of the potential core lncRNAs, had worse overall survival. To the best of our knowledge, it was the first report that RMST was involved in breast cancer. Our research provided a rich resource to the research community for further investigating lncRNAs functions and identifying lncRNAs with diagnostic and therapeutic potentials in TNBC. PMID:27380926

  3. Gene Coexpression and Evolutionary Conservation Analysis of the Human Preimplantation Embryos.

    PubMed

    Liu, Tiancheng; Yu, Lin; Ding, Guohui; Wang, Zhen; Liu, Lei; Li, Hong; Li, Yixue

    2015-01-01

    Evolutionary developmental biology (EVO-DEVO) tries to decode evolutionary constraints on the stages of embryonic development. Two models, the "funnel-like" model and the "hourglass" model, have been proposed by investigators to illustrate the fluctuation of selective pressure on these stages. However, selective indices of stages corresponding to mammalian preimplantation embryonic development (PED) were undetected in previous studies. Based on single-cell RNA sequencing of stages during human PED, we used a coexpression method to identify gene modules activated in each of these stages. By measuring the evolutionary indices of the gene modules belonging to each stage, we observed the pattern of change in selective constraints on PED for the first time. The selective pressure decreases from the zygote stage to the 4-cell stage, increases at the 8-cell stage, and then decreases again from the 8-cell stage to the late blastocyst stages. Previous EVO-DEVO studies concerning whole-embryo development neglected the fluctuation of selective pressure in these earlier stages, and the fluctuation was potentially correlated with events of earlier stages, such as zygote genome activation (ZGA). Such oscillation in an earlier stage would further affect models of the evolutionary constraints on whole-embryo development. Therefore, these earlier stages should be measured intensively in future EVO-DEVO studies. PMID:26273607

  4. Influence networks based on coexpression improve drug target discovery for the development of novel cancer therapeutics

    PubMed Central

    2014-01-01

    Background: The demand for novel molecularly targeted drugs will continue to rise as we move forward toward the goal of personalizing cancer treatment to the molecular signature of individual tumors. However, the identification of targets and combinations of targets that can be safely and effectively modulated is one of the greatest challenges facing the drug discovery process. A promising approach is to use biological networks to prioritize targets based on their relative positions to one another, a property that affects their ability to maintain network integrity and propagate information-flow. Here, we introduce influence networks and demonstrate how they can be used to generate influence scores as a network-based metric to rank genes as potential drug targets. Results: We use this approach to prioritize genes as drug target candidates in a set of ER+ breast tumor samples collected during the course of neoadjuvant treatment with the aromatase inhibitor letrozole. We show that influential genes, those with high influence scores, tend to be essential and include a higher proportion of essential genes than those prioritized based on their position (i.e. hubs or bottlenecks) within the same network. Additionally, we show that influential genes represent novel biologically relevant drug targets for the treatment of ER+ breast cancers. Moreover, we demonstrate that gene influence differs between untreated tumors and residual tumors that have adapted to drug treatment. In this way, influence scores capture the context-dependent functions of genes and present the opportunity to design combination treatment strategies that take advantage of the tumor adaptation process. Conclusions: Influence networks efficiently find essential genes as promising drug targets and combinations of targets to inform the development of molecularly targeted drugs and their use. PMID:24495353

  5. [Construction of recombinant adenovirus co-expressing M1 and HA genes of influenza virus type A].

    PubMed

    Guo, Jian-Qiang; Yao, Li-Hong; Chen, Ai-Jun; Xu, Yi; Jia, Run-Qing; Bo, Hong; Dong, Jie; Zhou, Jian-Fang; Shu, Yue-Long; Zhang, Zhi-Qing

    2009-03-01

    Based on the human H5N1 influenza virus strain A/Anhui/1/2005, recombinant adenovirus co-expressing M1 and HA genes of H5N1 influenza virus was constructed using an internal ribosome entry site (IRES) sequence to link the two genes. The M1 and HA genes of H5N1 influenza virus were amplified by PCR and subcloned into pStar vector separately. Then the M1-IRES-HA fragment was amplified and subcloned into pShuttle-CMV vector, the shuttle plasmid was then linearized and transformed into BJ5183 bacteria which contained backbone vector pAd-Easy. The recombinant vector pAd-Easy was packaged in 293 cells to get recombinant adenovirus Ad-M1/HA. CPE was observed after 293 cells were transfected by Ad-M1/HA. The co-expression of M1 and HA genes was confirmed by Western-blot and IFA (immunofluorescence assay). The IRES containing recombinant adenovirus allowed functional co-expression of M1 and HA genes and provided the foundation for developing new influenza vaccines with adenoviral vector. PMID:19678564

  6. Bioinformatics Data Mining Approach Suggests Coexpression of AGTPBP1 with an ALS-linked Gene C9orf72

    PubMed Central

    Kitano, Shouta; Kino, Yoshihiro; Yamamoto, Yoji; Takitani, Mika; Miyoshi, Junko; Ishida, Tsuyoshi; Saito, Yuko; Arima, Kunimasa; Satoh, Jun-ichi

    2015-01-01

    BACKGROUND: Expanded GGGGCC hexanucleotide repeats located in the noncoding region of the chromosome 9 open reading frame 72 (C9orf72) gene represent the most common genetic abnormality for familial and sporadic amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD). Formation of nuclear RNA foci, accumulation of repeat-associated non-ATG-translated dipeptide-repeat proteins, and haploinsufficiency of C9orf72 are proposed for pathological mechanisms of C9ALS/FTD. However, at present, the physiological function of C9orf72 remains largely unknown. METHODS: By searching on a bioinformatics database named COXPRESdb composed of the comprehensive gene coexpression data, we studied potential C9orf72 interactors. RESULTS: We identified the ATP/GTP binding protein 1 (AGTPBP1) gene alternatively named NNA1 encoding a cytosolic carboxypeptidase whose mutation is causative of the degeneration of Purkinje cells and motor neurons as the most significant gene coexpressed with C9orf72. We verified coexpression and interaction of AGTPBP1 and C9orf72 in transfected cells by immunoprecipitation and in neurons of the human brain by double-labeling immunohistochemistry. Furthermore, we found a positive correlation between AGTPBP1 and C9orf72 mRNA expression levels in the set of 21 human brains examined. CONCLUSIONS: These results suggest that AGTPBP1 serves as a C9orf72 interacting partner that plays a role in the regulation of neuronal function in a coordinated manner within the central nervous system. PMID:26106267

  7. α-skeletal and α-cardiac actin genes are coexpressed in adult human skeletal muscle and heart

    SciTech Connect

    Gunning, P.; Ponte, P.; Blau, H.; Kedes, L.

    1983-11-01

    The authors determined the actin isotypes encoded by 30 actin cDNA clones previously isolated from an adult human muscle cDNA library. Using 3' untranslated region probes, derived from α-skeletal, β- and γ-actin cDNAs and from an α-cardiac actin genomic clone, they showed that 28 of the cDNAs correspond to α-skeletal actin transcripts. Unexpectedly, however, the remaining two cDNA clones proved to derive from α-cardiac actin mRNA. Sequence analysis confirmed that the two skeletal muscle α-cardiac actin cDNAs are derived from transcripts of the cloned α-cardiac actin gene. Comparison of total actin mRNA levels in adult skeletal muscle and adult heart revealed that the steady-state levels in skeletal muscle are about twofold greater, per microgram of total cellular RNA, than those in heart. Thus, in skeletal muscle and in heart, both of the sarcomeric actin mRNA isotypes are quite abundant transcripts. They conclude that α-skeletal and α-cardiac actin genes are coexpressed as an actin pair in human adult striated muscles. Since the smooth-muscle actins (aortic and stomach) and the cytoplasmic actins (β and γ) are known to be coexpressed in smooth muscle and nonmuscle cells, respectively, they postulate that coexpression of actin pairs may be a common feature of mammalian actin gene expression in all tissues.

  8. Coexpression Network Analysis of Benign and Malignant Phenotypes of SIV-Infected Sooty Mangabey and Rhesus Macaque

    PubMed Central

    Silvestri, Guido; Bosinger, Steven E.; Li, Bai-Lian; Jong, Ambrose; Zhou, Yan-Hong; Huang, Sheng-He

    2016-01-01

    To explore the differences between the extreme SIV infection phenotypes, nonprogression (BEN: benign) to AIDS in sooty mangabeys (SMs) and progression to AIDS (MAL: malignant) in rhesus macaques (RMs), we performed an integrated dual positive-negative connectivity (DPNC) analysis of gene coexpression networks (GCN) based on publicly available big data sets in the GEO database of NCBI. The microarray-based gene expression data sets were generated, respectively, from the peripheral blood of SMs and RMs at several time points of SIV infection. Significant differences in GCN changes in DPNC values were observed in SIV-infected SMs and RMs. There are three groups of enriched genes or pathways (EGPs) that are associated with three SIV infection phenotypes (BEN+, MAL+ and mixed BEN+/MAL+). The MAL+ phenotype in SIV-infected RMs is specifically associated with eight EGPs, including the protein ubiquitin proteasome system, p53, granzyme A, granzyme B, polo-like kinase, glucocorticoid receptor, oxidative phosphorylation and mitochondrial signaling. Mitochondrial (endosymbiotic) dysfunction is solely present in RMs. Specific BEN+ pattern changes in four EGPs are identified in SIV-infected SMs, including the pathways contributing to interferon signaling, BRCA1/DNA damage response, PKR/IFN induction and LGALS8. There are three enriched pathways (PRR-activated IRF signaling, RIG1-like receptor and PRR pathway) contributing to the mixed (BEN+/MAL+) phenotypes of SIV infections in RMs and SMs, suggesting that these pathways play a dual role in the host defense against viral infections. Further analysis of hub genes in these GCNs revealed that the genes LGALS8 and IL-17RA, which positively regulate the barrier function of the gut mucosa and the immune homeostasis with the gut microbiota (exosymbiosis), were significantly differentially expressed in RMs and SMs. Our data suggest that there exists an exo- (dysbiosis of the gut microbiota) and endo- (mitochondrial dysfunction

  9. Enhanced production of shikimic acid using a multi-gene co-expression system in Escherichia coli.

    PubMed

    Liu, Xiang-Lei; Lin, Jun; Hu, Hai-Feng; Zhou, Bin; Zhu, Bao-Quan

    2016-04-01

    Shikimic acid (SA) is the key starting material for the chemical synthesis of oseltamivir, which is prescribed as the front-line treatment for serious cases of influenza. A multi-gene expression vector can express several genes from a single plasmid, so such vectors are widely used to increase metabolite yields. In the present study, in the shikimate kinase-deficient strain Escherichia coli BL21 (ΔaroL/aroK, DE3), the key SA pathway enzyme genes aroG, aroB, tktA and aroE were co-expressed and compared systematically by constructing a series of multi-gene expression vectors. The results showed that different gene co-expression combinations (two, three or four genes) and gene orders had different effects on SA production. SA production of the recombinant BL21-GBAE reached 886.38 mg·L⁻¹, 17-fold (P < 0.05) that of the parent strain BL21 (ΔaroL/aroK, DE3). PMID:27114316

  10. GENE EXPRESSION NETWORKS

    EPA Science Inventory

    "Gene expression network" is the term used to describe the interplay, simple or complex, between two or more gene products in performing a specific cellular function. Although the delineation of such networks is complicated by the existence of multiple and subtle types of intera...

  11. Identifying Gene Interaction Networks

    PubMed Central

    Bebek, Gurkan

    2016-01-01

    In this chapter, we introduce interaction networks by describing how they are generated, where they are stored, and how they are shared. We focus on publicly available interaction networks and describe a simple way of utilizing these resources. As a case study, we used Cytoscape, an open source and easy-to-use network visualization and analysis tool to first gather and visualize a small network. We have analyzed this network’s topological features and have looked at functional enrichment of the network nodes by integrating the gene ontology database. The methods described are applicable to larger networks that can be collected from various resources. PMID:22307715

  12. Coexpression of Nuclear Receptors and Histone Methylation Modifying Genes in the Testis: Implications for Endocrine Disruptor Modes of Action

    PubMed Central

    Anderson, Alison M.; Carter, Kim W.; Anderson, Denise; Wise, Michael J.

    2012-01-01

    Background Endocrine disruptor chemicals elicit adverse health effects by perturbing nuclear receptor signalling systems. It has been speculated that these compounds may also perturb epigenetic mechanisms and thus contribute to the early origin of adult onset disease. We hypothesised that histone methylation may be a component of the epigenome that is susceptible to perturbation. We used coexpression analysis of publicly available data to investigate the combinatorial actions of nuclear receptors and genes involved in histone methylation in normal testis and when faced with endocrine disruptor compounds. Methodology/Principal Findings The expression patterns of a set of genes were profiled across testis tissue in human, rat and mouse, plus control and exposed samples from four toxicity experiments in the rat. Our results indicate that histone methylation events are a more general component of nuclear receptor mediated transcriptional regulation in the testis than previously appreciated. Coexpression patterns support the role of a gatekeeper mechanism involving the histone methylation modifiers Kdm1, Prdm2, and Ehmt1 and indicate that this mechanism is a common determinant of transcriptional integrity for genes critical to diverse physiological endpoints relevant to endocrine disruption. Coexpression patterns following exposure to vinclozolin and dibutyl phthalate suggest that coactivity of the demethylase Kdm1 in particular warrants further investigation in relation to endocrine disruptor mode of action. Conclusions/Significance This study provides proof of concept that a bioinformatics approach that profiles genes related to a specific hypothesis across multiple biological settings can provide powerful insight into coregulatory activity that would be difficult to discern at an individual experiment level or by traditional differential expression analysis methods. PMID:22496781

  13. Genes and gene networks implicated in aggression related behaviour.

    PubMed

    Malki, Karim; Pain, Oliver; Du Rietz, Ebba; Tosto, Maria Grazia; Paya-Cano, Jose; Sandnabba, Kenneth N; de Boer, Sietse; Schalkwyk, Leonard C; Sluyter, Frans

    2014-10-01

    Aggressive behaviour is a major cause of mortality and morbidity. Despite moderate heritability estimates, progress in identifying the genetic factors underlying aggressive behaviour has been limited. There are currently three genetic mouse models of high and low aggression created using selective breeding. This is the first study to offer a global transcriptomic characterization of the prefrontal cortex across all three genetic mouse models of aggression. A systems biology approach has been applied to transcriptomic data across the three pairs of selected inbred mouse strains (Turku Aggressive (TA) and Turku Non-Aggressive (TNA), Short Attack Latency (SAL) and Long Attack Latency (LAL) mice and North Carolina Aggressive (NC900) and North Carolina Non-Aggressive (NC100)), providing novel insight into the neurobiological mechanisms and genetics underlying aggression. First, weighted gene co-expression network analysis (WGCNA) was performed to identify modules of highly correlated genes associated with aggression. Probe sets belonging to gene modules uncovered by WGCNA were carried forward for network analysis using Ingenuity Pathway Analysis (IPA). The RankProd non-parametric algorithm was then used to statistically evaluate expression differences across the genes belonging to modules significantly associated with aggression. IPA uncovered two pathways, involving NF-κB and MAPKs. The secondary RankProd analysis yielded 14 differentially expressed genes, some of which have previously been implicated in pathways associated with aggressive behaviour, such as Adrbk2. The results highlighted plausible candidate genes and gene networks implicated in aggression-related behaviour. PMID:25142712
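
    As a rough illustration of the WGCNA step mentioned above, the sketch below builds an unsigned soft-threshold adjacency (absolute Pearson correlation raised to a power beta) and the topological overlap between two genes. The gene names, the toy expression values, and beta = 6 are assumptions chosen for illustration; the abstract does not state the settings the authors used, and this is not the WGCNA R package itself.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(i, j)| ** beta for every gene pair.

    expr maps a gene name to its expression values across samples.
    beta=6 is an assumed, commonly quoted setting, not taken from the abstract.
    """
    genes = sorted(expr)
    return {a: {b: abs(pearson(expr[a], expr[b])) ** beta
                for b in genes if b != a}
            for a in genes}

def topological_overlap(adj, i, j):
    """Topological overlap: shared-neighbour weight plus the direct link,
    normalised by the smaller connectivity of the two genes."""
    shared = sum(adj[i][u] * adj[j][u] for u in adj if u not in (i, j))
    k_i = sum(adj[i].values())
    k_j = sum(adj[j].values())
    return (shared + adj[i][j]) / (min(k_i, k_j) + 1 - adj[i][j])

# Hypothetical genes measured in four samples.
toy = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, -1.0, 1.0, -1.0],
}
adj = soft_adjacency(toy)
tom_12 = topological_overlap(adj, "g1", "g2")
tom_13 = topological_overlap(adj, "g1", "g3")
```

    In a full analysis, modules would then be obtained by hierarchical clustering on 1 minus the topological overlap; that step is omitted here to keep the sketch short.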

  14. Identification of Crowding Stress Tolerance Co-Expression Networks Involved in Sweet Corn Yield.

    PubMed

    Choe, Eunsoo; Drnevich, Jenny; Williams, Martin M

    2016-01-01

    Tolerance to crowding stress has played a crucial role in improving agronomic productivity in field corn; however, commercial sweet corn hybrids vary greatly in crowding stress tolerance. The objectives were to 1) explore transcriptional changes among sweet corn hybrids with differential yield under crowding stress, 2) identify relationships between phenotypic responses and gene expression patterns, and 3) identify groups of genes associated with yield and crowding stress tolerance. Under conditions of crowding stress, three high-yielding and three low-yielding sweet corn hybrids were grouped for transcriptional and phenotypic analyses. Transcriptional analyses identified from 372 to 859 common differentially expressed genes (DEGs) for each hybrid. Large gene expression pattern variation among hybrids and only 26 common DEGs across all hybrid comparisons were identified, suggesting each hybrid has a unique response to crowding stress. Over-represented biological functions of DEGs also differed among hybrids. Strong correlation was observed between: 1) modules with up-regulation in high-yielding hybrids and yield traits, and 2) modules with up-regulation in low-yielding hybrids and plant/ear traits. Modules linked with yield traits may be important crowding stress response mechanisms influencing crop yield. Functional analysis of the modules and common DEGs identified candidate crowding stress tolerant processes in photosynthesis, glycolysis, cell wall, carbohydrate/nitrogen metabolic process, chromatin, and transcription regulation. Moreover, these biological functions were greatly inter-connected, indicating the importance of improving the mechanisms as a network. PMID:26796516

  15. Identification of Crowding Stress Tolerance Co-Expression Networks Involved in Sweet Corn Yield

    PubMed Central

    Choe, Eunsoo; Drnevich, Jenny; Williams, Martin M.

    2016-01-01

    Tolerance to crowding stress has played a crucial role in improving agronomic productivity in field corn; however, commercial sweet corn hybrids vary greatly in crowding stress tolerance. The objectives were to 1) explore transcriptional changes among sweet corn hybrids with differential yield under crowding stress, 2) identify relationships between phenotypic responses and gene expression patterns, and 3) identify groups of genes associated with yield and crowding stress tolerance. Under conditions of crowding stress, three high-yielding and three low-yielding sweet corn hybrids were grouped for transcriptional and phenotypic analyses. Transcriptional analyses identified from 372 to 859 common differentially expressed genes (DEGs) for each hybrid. Large gene expression pattern variation among hybrids and only 26 common DEGs across all hybrid comparisons were identified, suggesting each hybrid has a unique response to crowding stress. Over-represented biological functions of DEGs also differed among hybrids. Strong correlation was observed between: 1) modules with up-regulation in high-yielding hybrids and yield traits, and 2) modules with up-regulation in low-yielding hybrids and plant/ear traits. Modules linked with yield traits may be important crowding stress response mechanisms influencing crop yield. Functional analysis of the modules and common DEGs identified candidate crowding stress tolerant processes in photosynthesis, glycolysis, cell wall, carbohydrate/nitrogen metabolic process, chromatin, and transcription regulation. Moreover, these biological functions were greatly inter-connected, indicating the importance of improving the mechanisms as a network. PMID:26796516

  16. Genes2FANs: connecting genes through functional association networks

    PubMed Central

    2012-01-01

    Background Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent. Results Genes2FANs is a web-based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user’s PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting disease genes through FANs, separating diseases into two categories. Conclusions Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our finding that disease genes in
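
    To make the idea of "intermediate genes that connect the query list" concrete, here is a minimal sketch. It assumes a simple scoring rule (an intermediate is any non-query gene adjacent to at least two query genes, ranked by how many query genes it touches); the abstract does not describe how Genes2FANs actually ranks intermediates, and the network and gene names below are hypothetical.

```python
def rank_intermediates(network, query):
    """Rank non-query genes by the number of query genes they link.

    network maps each gene to the set of its neighbours (undirected).
    Returns a list of (intermediate, sorted_query_neighbours) pairs,
    best-connected first; ties are broken alphabetically.
    """
    query = set(query)
    linked = {}
    for gene, neighbours in network.items():
        if gene in query:
            continue
        hits = neighbours & query
        if len(hits) >= 2:  # must bridge at least two query genes
            linked[gene] = sorted(hits)
    return sorted(linked.items(), key=lambda kv: (-len(kv[1]), kv[0]))

# Hypothetical undirected association network.
toy_network = {
    "A": {"X", "Y", "Z"},
    "B": {"X"},
    "C": {"X", "Y"},
    "X": {"A", "B", "C"},
    "Y": {"A", "C"},
    "Z": {"A"},
}
ranking = rank_intermediates(toy_network, ["A", "B", "C"])
```

    Here X bridges all three query genes and Y bridges two, while Z touches only one query gene and is therefore not reported.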

  17. The gap gene network

    PubMed Central

    2010-01-01

    Gap genes are involved in segment determination during the early development of the fruit fly Drosophila melanogaster as well as in other insects. This review attempts to synthesize the current knowledge of the gap gene network through a comprehensive survey of the experimental literature. I focus on genetic and molecular evidence, which provides us with an almost-complete picture of the regulatory interactions responsible for trunk gap gene expression. I discuss the regulatory mechanisms involved, and highlight the remaining ambiguities and gaps in the evidence. This is followed by a brief discussion of molecular regulatory mechanisms for transcriptional regulation, as well as precision and size-regulation provided by the system. Finally, I discuss evidence on the evolution of gap gene expression from species other than Drosophila. My survey concludes that studies of the gap gene system continue to reveal interesting and important new insights into the role of gene regulatory networks in development and evolution. PMID:20927566

  18. Coexpression of multiple genes reconstitutes two pathways of very long-chain polyunsaturated fatty acid biosynthesis in Pichia pastoris.

    PubMed

    Kim, Sun Hee; Roh, Kyung Hee; Kim, Kwang-Soo; Kim, Hyun Uk; Lee, Kyeong-Ryeol; Kang, Han-Chul; Kim, Jong-Bum

    2014-09-01

    The introduction of novel traits to cells often requires the stable coexpression of multiple genes within the same cell. Herein, we report that C22 very long-chain polyunsaturated fatty acids (VLC-PUFAs) were synthesized from C18 precursors by reactions catalyzed by delta 6-desaturase, an ELOVL5 involved in VLC-PUFA elongation, and delta 5-desaturase. The coexpression of McD6DES, AsELOVL5, and PtD5DES encoding the corresponding enzymes, produced docosatetraenoic acid (C22:4 n-6) and docosapentaenoic acid (C22:5 n-3), as well as arachidonic acid (C20:4 n-6) and eicosapentaenoic acid (C20:5 n-3) in the methylotrophic yeast Pichia pastoris. The expression of each gene increased within 24 h, with high transcript levels after induction with 0.5 or 1 % methanol. High levels of the newly expressed VLC-PUFAs occurred after 144 h. This expression system exemplifies the recent progress and future possibilities of the metabolic engineering of VLC-PUFAs in oilseed crops. PMID:24863294

  19. Gene Coexpression Analysis Reveals Complex Metabolism of the Monoterpene Alcohol Linalool in Arabidopsis Flowers

    PubMed Central

    Ginglinger, Jean-François; Boachon, Benoit; Höfer, René; Paetz, Christian; Köllner, Tobias G.; Miesch, Laurence; Lugan, Raphael; Baltenweck, Raymonde; Mutterer, Jérôme; Ullmann, Pascaline; Beran, Franziska; Claudel, Patricia; Verstappen, Francel; Fischer, Marc J.C.; Karst, Francis; Bouwmeester, Harro; Miesch, Michel; Schneider, Bernd; Gershenzon, Jonathan; Ehlting, Jürgen; Werck-Reichhart, Danièle

    2013-01-01

    The cytochrome P450 family encompasses the largest family of enzymes in plant metabolism, and the functions of many of its members in Arabidopsis thaliana are still unknown. Gene coexpression analysis pointed to two P450s that were coexpressed with two monoterpene synthases in flowers and were thus predicted to be involved in monoterpenoid metabolism. We show that all four selected genes, the two terpene synthases (TPS10 and TPS14) and the two cytochrome P450s (CYP71B31 and CYP76C3), are simultaneously expressed at anthesis, mainly in upper anther filaments and in petals. Upon transient expression in Nicotiana benthamiana, the TPS enzymes colocalize in vesicular structures associated with the plastid surface, whereas the P450 proteins were detected in the endoplasmic reticulum. Whether they were expressed in Saccharomyces cerevisiae or in N. benthamiana, the TPS enzymes formed two different enantiomers of linalool: (−)-(R)-linalool for TPS10 and (+)-(S)-linalool for TPS14. Both P450 enzymes metabolize the two linalool enantiomers to form different but overlapping sets of hydroxylated or epoxidized products. These oxygenated products are not emitted into the floral headspace, but accumulate in floral tissues as further converted or conjugated metabolites. This work reveals complex linalool metabolism in Arabidopsis flowers, the ecological role of which remains to be determined. PMID:24285789

  20. Comprehensive Network Analysis of Anther-Expressed Genes in Rice by the Combination of 33 Laser Microdissection and 143 Spatiotemporal Microarrays

    PubMed Central

    Takahashi, Hirokazu; Shiono, Katsuhiro; Yano, Kentaro; Tsutsumi, Nobuhiro; Nakazono, Mikio; Nagamura, Yoshiaki; Matsuoka, Makoto; Watanabe, Masao

    2011-01-01

    Co-expression networks systematically constructed from large-scale transcriptome data reflect the interactions and functions of genes with similar expression patterns and are a powerful tool for the comprehensive understanding of biological events and mining of novel genes. In Arabidopsis (a model dicot plant), high-resolution co-expression networks have been constructed from very large microarray datasets and these are publicly available as online information resources. However, the available transcriptome data of rice (a model monocot plant) have been limited so far, making it difficult for rice researchers to achieve reliable co-expression analysis. In this study, we performed co-expression network analysis by using combined 44K Agilent microarray datasets of rice, which consisted of 33 laser microdissection (LM)-microarray datasets of anthers, and 143 spatiotemporal transcriptome datasets deposited in RiceXPro. The entire data of the rice co-expression network, which was generated from the 176 microarray datasets by the Pearson correlation coefficient (PCC) method with the mutual rank (MR)-based cut-off, contained 24,258 genes and 60,441 gene pairs. Using these datasets, we constructed high-resolution co-expression subnetworks of two specific biological events in the anther, “meiosis” and “pollen wall synthesis”. The meiosis network contained many known or putative meiotic genes, including genes related to meiosis initiation and recombination. In the pollen wall synthesis network, several candidate genes involved in the sporopollenin biosynthesis pathway were efficiently identified. Hence, these two subnetworks are important demonstrations of the efficiency of co-expression network analysis in rice. Our co-expression analysis included the separated transcriptomes of pollen and tapetum cells in the anther, which are able to provide precise information on transcriptional regulation during male gametophyte development in rice. The co-expression network

  1. Comprehensive network analysis of anther-expressed genes in rice by the combination of 33 laser microdissection and 143 spatiotemporal microarrays.

    PubMed

    Aya, Koichiro; Suzuki, Go; Suwabe, Keita; Hobo, Tokunori; Takahashi, Hirokazu; Shiono, Katsuhiro; Yano, Kentaro; Tsutsumi, Nobuhiro; Nakazono, Mikio; Nagamura, Yoshiaki; Matsuoka, Makoto; Watanabe, Masao

    2011-01-01

    Co-expression networks systematically constructed from large-scale transcriptome data reflect the interactions and functions of genes with similar expression patterns and are a powerful tool for the comprehensive understanding of biological events and mining of novel genes. In Arabidopsis (a model dicot plant), high-resolution co-expression networks have been constructed from very large microarray datasets and these are publicly available as online information resources. However, the available transcriptome data of rice (a model monocot plant) have been limited so far, making it difficult for rice researchers to achieve reliable co-expression analysis. In this study, we performed co-expression network analysis by using combined 44K Agilent microarray datasets of rice, which consisted of 33 laser microdissection (LM)-microarray datasets of anthers, and 143 spatiotemporal transcriptome datasets deposited in RiceXPro. The entire data of the rice co-expression network, which was generated from the 176 microarray datasets by the Pearson correlation coefficient (PCC) method with the mutual rank (MR)-based cut-off, contained 24,258 genes and 60,441 gene pairs. Using these datasets, we constructed high-resolution co-expression subnetworks of two specific biological events in the anther, "meiosis" and "pollen wall synthesis". The meiosis network contained many known or putative meiotic genes, including genes related to meiosis initiation and recombination. In the pollen wall synthesis network, several candidate genes involved in the sporopollenin biosynthesis pathway were efficiently identified. Hence, these two subnetworks are important demonstrations of the efficiency of co-expression network analysis in rice. Our co-expression analysis included the separated transcriptomes of pollen and tapetum cells in the anther, which are able to provide precise information on transcriptional regulation during male gametophyte development in rice. The co-expression network data
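
    The PCC method with a mutual rank (MR) cut-off named in this abstract can be sketched as follows. The sketch assumes the usual definition of mutual rank as the geometric mean of the two directional correlation ranks (the rank of gene B among gene A's partners and the rank of A among B's partners); the cut-off value, the gene names, and the expression values are illustrative assumptions, not figures from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mutual_rank_edges(expr, mr_cutoff):
    """Return {(gene_a, gene_b): MR} for pairs with mutual rank <= mr_cutoff.

    expr maps a gene name to its expression values across samples.
    """
    genes = sorted(expr)
    pcc = {a: {b: pearson(expr[a], expr[b]) for b in genes if b != a}
           for a in genes}
    # rank[a][b]: position of b when a's partners are sorted by
    # descending PCC (1 = most strongly correlated partner).
    rank = {}
    for a in genes:
        ordered = sorted(pcc[a], key=lambda b: pcc[a][b], reverse=True)
        rank[a] = {b: pos + 1 for pos, b in enumerate(ordered)}
    edges = {}
    for idx, a in enumerate(genes):
        for b in genes[idx + 1:]:
            mr = sqrt(rank[a][b] * rank[b][a])  # geometric mean of ranks
            if mr <= mr_cutoff:
                edges[(a, b)] = mr
    return edges

# Hypothetical genes measured in five samples.
toy = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [1.1, 2.1, 2.9, 4.2, 5.1],
    "g3": [5.0, 4.0, 3.0, 2.0, 1.0],
    "g4": [2.0, 1.0, 4.0, 3.0, 5.0],
}
edges = mutual_rank_edges(toy, mr_cutoff=1.5)
```

    Ranking both directions before thresholding means a pair is kept only when each gene is among the other's closest partners, which is less sensitive to a gene's overall correlation level than a fixed PCC threshold.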

  2. SiBIC: a web server for generating gene set networks based on biclusters obtained by maximal frequent itemset mining.

    PubMed

    Takahashi, Kei-ichiro; Takigawa, Ichigaku; Mamitsuka, Hiroshi

    2013-01-01

    Detecting biclusters in expression data is useful because a bicluster is a set of genes coexpressed under only a subset of the given experimental conditions. We present software called SiBIC which, from a given expression dataset, first exhaustively enumerates biclusters, then merges them into largely independent biclusters, and finally uses these to generate gene set networks, in which the gene set assigned to each node consists of coexpressed genes. We evaluated each step of this procedure: 1) the biological and statistical significance of the generated biclusters, 2) the biological quality of the merged biclusters, and 3) the biological significance of the gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp. PMID:24386124
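
    The link between biclusters and maximal frequent itemsets can be illustrated with a brute-force sketch. It assumes the expression data have already been discretized to "on/off" per gene and condition, treats each gene as a transaction and each condition as an item, and enumerates all condition sets exhaustively, which is only feasible for tiny inputs. The function name, thresholds, and data are hypothetical; SiBIC's own discretization and mining algorithm are not described in the abstract.

```python
from itertools import combinations

def maximal_biclusters(binary, min_genes=2, min_conditions=2):
    """Enumerate maximal frequent condition sets and their supporting genes.

    binary maps each gene to the set of conditions in which it is 'on'.
    A condition set is frequent if at least min_genes genes are 'on' in
    all of its conditions; it is maximal if no frequent proper superset
    exists. Returns a list of (conditions, genes) frozenset pairs.
    """
    conditions = sorted(set().union(*binary.values()))
    frequent = []
    for size in range(min_conditions, len(conditions) + 1):
        for combo in combinations(conditions, size):
            cset = frozenset(combo)
            genes = frozenset(g for g, on in binary.items() if cset <= on)
            if len(genes) >= min_genes:
                frequent.append((cset, genes))
    # Keep only condition sets that have no frequent proper superset.
    return [(c, g) for c, g in frequent
            if not any(c < other for other, _ in frequent)]

# Hypothetical discretized expression data.
toy = {
    "g1": {"c1", "c2", "c3"},
    "g2": {"c1", "c2", "c3"},
    "g3": {"c1", "c2"},
    "g4": {"c4"},
    "g5": {"c3", "c4"},
}
biclusters = maximal_biclusters(toy)
```

    With the default thresholds the only maximal bicluster is genes g1 and g2 under conditions c1 to c3; raising min_genes to 3 yields g1, g2, and g3 under c1 and c2 instead.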

  3. SiBIC: A Web Server for Generating Gene Set Networks Based on Biclusters Obtained by Maximal Frequent Itemset Mining

    PubMed Central

    Takahashi, Kei-ichiro; Takigawa, Ichigaku; Mamitsuka, Hiroshi

    2013-01-01

    Detecting biclusters in expression data is useful because a bicluster is a set of genes coexpressed under only a subset of the given experimental conditions. We present software called SiBIC which, from a given expression dataset, first exhaustively enumerates biclusters, then merges them into largely independent biclusters, and finally uses these to generate gene set networks, in which the gene set assigned to each node consists of coexpressed genes. We evaluated each step of this procedure: 1) the biological and statistical significance of the generated biclusters, 2) the biological quality of the merged biclusters, and 3) the biological significance of the gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp. PMID:24386124

  4. Use of the growing environment as a source of variation to identify the quantitative trait transcripts and modules of co-expressed genes that determine chlorogenic acid accumulation

    PubMed Central

    Joët, Thierry; Salmona, Jordi; Laffargue, Andréina; Descroix, Frédéric; Dussert, Stéphane

    2010-01-01

    Developing Coffea arabica seeds accumulate large amounts of chlorogenic acids (CGAs) as a storage form of phenylpropanoid derivatives, making coffee a valuable model to investigate the metabolism of these widespread plant phenolics. However, developmental and environmental regulations of CGA metabolism are poorly understood. In the present work, the expression of selected phenylpropanoid genes, together with CGA isomer profiles, was monitored throughout seed development across a wide set of contrasted natural environments. Although CGA metabolism was controlled by major developmental factors, the mean temperature during seed development had a direct impact on the time-window of CGA biosynthesis, as well as on final CGA isomer composition through subtle transcriptional regulations. We provide evidence that the variability induced by the environment is a useful tool to test whether CGA accumulation is quantitatively modulated at the transcriptional level, hence enabling detection of rate-limiting transcriptional steps [quantitative trait transcripts (QTTs)] for CGA biosynthesis. Variations induced by the environment also enabled a better description of the phenylpropanoid gene transcriptional network throughout seed development, as well as the detection of three temporally distinct modules of quantitatively co-expressed genes. Finally, analysis of metabolite-to-metabolite relationships revealed new biochemical characteristics of the isomerization steps that remain uncharacterized at the gene level. PMID:20199615

  5. Functional Analysis of Prognostic Gene Expression Network Genes in Metastatic Breast Cancer Models

    PubMed Central

    Geiger, Thomas R.; Ha, Ngoc-Han; Faraji, Farhoud; Michael, Helen T.; Rodriguez, Loren; Walker, Renard C.; Green, Jeffery E.; Simpson, R. Mark; Hunter, Kent W.

    2014-01-01

    Identification of conserved co-expression networks is a useful tool for clustering groups of genes enriched for common molecular or cellular functions [1]. The relative importance of genes within networks can frequently be inferred by the degree of connectivity, with those displaying high connectivity being significantly more likely to be associated with specific molecular functions [2]. Previously we utilized cross-species network analysis to identify two network modules that were significantly associated with distant metastasis-free survival in breast cancer. Here, we validate one of the highly connected genes as a metastasis-associated gene. Tpx2, the most highly connected gene within a proliferation network specifically prognostic for estrogen receptor positive (ER+) breast cancers, enhances metastatic disease, but in a tumor-autonomous, proliferation-independent manner. Histologic analysis suggests instead that variation of TPX2 levels within disseminated tumor cells may influence the transition from dormant to actively proliferating cells in the secondary site. These results support the co-expression network approach for identification of new metastasis-associated genes to provide new information regarding the etiology of breast cancer progression and metastatic disease. PMID:25368990
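
    The notion of ranking genes by "degree of connectivity" can be shown with a short sketch. It assumes an unweighted network given as a list of edges and uses plain degree as the connectivity measure; the edge list and gene names are hypothetical and are not data from the study, which may have used a weighted measure.

```python
from collections import Counter

def rank_by_connectivity(edges):
    """Rank genes by degree (number of co-expression partners).

    edges is an iterable of (gene_a, gene_b) pairs from an undirected,
    unweighted co-expression network. Returns (gene, degree) pairs,
    highest degree first; ties are broken alphabetically.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical module in which "hubA" has the most partners.
toy_edges = [
    ("hubA", "g1"),
    ("hubA", "g2"),
    ("hubA", "g3"),
    ("g1", "g2"),
    ("g3", "g4"),
]
ranking = rank_by_connectivity(toy_edges)
```

    The first entry of the ranking is the candidate hub gene, which is the kind of gene the abstract describes carrying forward for functional validation.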

  6. Significant enhancement of methionol production by co-expression of the aminotransferase gene ARO8 and the decarboxylase gene ARO10 in Saccharomyces cerevisiae.

    PubMed

    Yin, Sheng; Lang, Tiandan; Xiao, Xiao; Liu, Li; Sun, Baoguo; Wang, Chengtao

    2015-03-01

    Methionol is an important volatile sulfur flavor compound that can be produced via the Ehrlich pathway in Saccharomyces cerevisiae. Aminotransferase and decarboxylase are essential enzymes catalyzing methionol biosynthesis. In this work, two aminotransferase genes, ARO8 and ARO9, and one decarboxylase gene, ARO10, were each introduced into S. cerevisiae S288c via an expression vector. Over-expression of ARO8 resulted in higher aminotransferase activity than over-expression of ARO9, and over-expression of ARO10 remarkably increased cellular decarboxylase activity. A co-expression vector carrying both ARO8 and ARO10 was then constructed to generate the recombinant strain S810. Shaking flask experiments showed that the methionol yield from S810 reached 1.27 g L⁻¹, 51.8% and 68.8% higher than that of the wild-type strain and of the control strain harboring the empty vector, respectively. Fed-batch fermentation by strain S810 produced 3.24 g L⁻¹ of methionol after 72 h of cultivation in a bioreactor. These results demonstrate that co-expression of ARO8 and ARO10 significantly boosted methionol production. This is the first report of a genetically engineered yeast strain producing more than 3.0 g L⁻¹ of methionol through co-expression of the aminotransferase and decarboxylase of the Ehrlich pathway. PMID:25743068

  7. Co-expression of G2-EPSPS and glyphosate acetyltransferase GAT genes conferring high tolerance to glyphosate in soybean

    PubMed Central

    Guo, Bingfu; Guo, Yong; Hong, Huilong; Jin, Longguo; Zhang, Lijuan; Chang, Ru-Zhen; Lu, Wei; Lin, Min; Qiu, Li-Juan

    2015-01-01

    Glyphosate is a non-selective, broad-spectrum herbicide widely used for weed control around the world. At present, most commercial glyphosate-tolerant soybeans use either the glyphosate tolerance gene CP4-EPSPS or the glyphosate acetyltransferase gene GAT alone. In this study, both the glyphosate tolerance gene G2-EPSPS and the glyphosate-degrading gene GAT were co-transferred into soybean, and the transgenic plants showed high tolerance to glyphosate. Molecular analysis including PCR, Southern blot, qRT-PCR, and Western blot revealed that the target genes had been integrated into the genome and were expressed effectively at both the mRNA and protein levels. Furthermore, the glyphosate tolerance analysis showed no typical injury symptoms, in comparison with the glyphosate-tolerant line HJ06-698 derived from GR1 transgenic soybean, even at four times the labeled rate of Roundup. Chlorophyll and shikimic acid content analysis of the transgenic plants also revealed that these two indexes were not significantly altered after glyphosate application. These results indicate that co-expression of G2-EPSPS and GAT conferred high tolerance to the herbicide glyphosate in soybean. Therefore, combining tolerance and degradation genes provides a new strategy for developing glyphosate-tolerant transgenic crops. PMID:26528311

  8. Evolution of akirin family in gene and genome levels and coexpressed patterns among family members and rel gene in croaker.

    PubMed

    Liu, Tianxing; Gao, Yunhang; Xu, Tianjun

    2015-09-01

    Akirins are highly conserved nuclear proteins that are present throughout the metazoans and regulate innate immunity, embryogenesis, myogenesis, and carcinogenesis. This study reports all akirin genes from miiuy croaker and comprehensively analyzes the akirin gene family together with akirin genes from other species. A second nuclear localization signal (NLS) is observed in akirin2 homologues but is absent from akirin1 homologues in all teleosts and most other vertebrates. Thus, we deduced that the loss of the second NLS in akirin1 homologues in teleosts likely occurred in an ancestor to all Osteichthyes after the split from cartilaginous fish. Significantly, the akirin2(2) gene included six exons interrupted by five introns in the miiuy croaker, which may have been caused by an intron insertion event and provides novel evidence for variation of akirin gene structure in some species. In addition, comparison of the genomic neighborhood genes of akirin1, akirin2(1), and akirin2(2) demonstrates a strong level of conserved synteny across the teleost classes, which further supports the deduction of Macqueen and Johnston (2009) that the origin of akirin paralogues can be attributed to whole-genome duplications and the loss of some akirin paralogues after genome duplications. Furthermore, akirin gene family members and the relish gene are ubiquitously expressed across all tissues, and their expression levels are increased in three immune tissues after infection with Vibrio anguillarum. Combined with the expression patterns of LEAP-1 and LEAP-2 from miiuy croaker, an intricate network of co-regulation among family members is established. This further supports the view that akirins act in concert with the relish protein to induce the expression of a subset of downstream pathway elements in the NF-kB dependent signaling pathway. PMID:25912355

  9. Oppositely imprinted genes H19 and insulin-like growth factor 2 are coexpressed in human androgenetic trophoblast.

    PubMed Central

    Mutter, G L; Stewart, C L; Chaponot, M L; Pomponio, R J

    1993-01-01

    Human uniparental gestations such as gynogenetic ovarian teratomas and androgenetic complete hydatidiform moles provide a model to evaluate the integrity of parent-specific gene expression--i.e., imprinting--in the absence of a complementary parental genetic contribution. We studied expression, in these tissues, of the oppositely imprinted genes H19, which is an embryonic nontranslated RNA, and insulin-like growth factor type 2 (IGF2). Normal gestations only express H19 from the maternal allele and express IGF2 from the paternal allele, whereas neither is expressed from the maternal genome of gynogenetic gestations, and both are expressed from the paternal genome of androgenetic gestations. Coexpression of H19 and IGF2 in the androgenetic tissues was in a single population of cells, mononuclear trophoblast--the same cell type expressing these genes in biparental placentas. These results demonstrate that a biparental genome may be required for expression of the reciprocal IGF2/H19 imprint. Alternatively, biparental expression may be a normal feature of some imprinted genes in specific cell types. Additional experiments with other imprinted genes will clarify whether this reflects global failure of the imprinting process or a change specific to the IGF2/H19 locus. PMID:7692725

  10. Enhancement of heavy metal accumulation by tissue specific co-expression of iaaM and ACC deaminase genes in plants.

    PubMed

    Zhang, Yong; Zhao, Lihong; Wang, Yao; Yang, Baoyu; Chen, Shiyun

    2008-06-01

    1-Aminocyclopropane-1-carboxylate (ACC) deaminase and tryptophan monooxygenase are two enzymes involved in senescence-inhibiting and growth-promoting regulation in plants, respectively. In this study, two binary vectors were constructed in which the Agrobacterium iaaM gene was under the transcriptional control of a xylem-specific glycine-rich protein promoter, either alone or co-expressed with the bacterial ACC deaminase gene driven by the constitutive CaMV 35S promoter. Transgenic petunia shoots co-expressing both genes were able to root on medium supplemented with 7.5 mg l(-1) CoCl2. When T1 transgenic tobacco plants were grown in sand supplemented with Cu2+ and Co2+, plants with tissue-specific co-expression of the iaaM and ACC deaminase genes grew faster, produced more biomass and a more extensive root system, and accumulated a greater amount of heavy metals than the empty vector control plants. When T1 transgenic tobacco plants were grown in soil watered with different concentrations of CuSO4, xylem-specific expression of the iaaM gene caused the accumulation of more Cu2+ than in the empty vector control at lower CuSO4 concentrations, but the plants showed severe toxic symptoms at 100 mg l(-1) CuSO4. T1 transgenic plants co-expressing both genes accumulated more heavy metals in the shoots and tolerated CuSO4 at 150 mg l(-1). In addition, plants co-expressing these two genes grew well in a complex contaminated soil containing both inorganic and organic pollutants, while the growth of the control plants was greatly inhibited. PMID:18471863

  11. Co-expression of perforin and granzyme B genes induces apoptosis and inhibits the tumorigenicity of laryngeal cancer cell line Hep-2

    PubMed Central

    Li, Xiu-Ying; Li, Zhi; An, Gui-Jie; Liu, Sha; Lai, Yan-Dong

    2014-01-01

    Granzyme B and perforin, two of the most important components, have shown anticancer properties in various cancers, but their effects in laryngeal cancer remain unexplored. Here we examined the effects of granzyme B and perforin in Hep-2 cells to clarify their role in the tumorigenicity of this laryngeal cancer cell line. Hep-2 cells were transfected with the pVAX1-PIG co-expression vector (comprising the perforin and granzyme B genes), and the growth and apoptosis of these Hep-2 cells were then evaluated. The tumorigenicity of the Hep-2 cell line co-expressing the perforin and granzyme B genes was tested in BALB/c nu/nu mice. We found that co-expression of the perforin and granzyme B genes markedly inhibited cell focus formation and induced apoptosis in Hep-2 cells. Furthermore, after subcutaneous injection of Hep-2 cells transfected with pVAX1-PIG, an extensive delay in tumor growth was observed in BALB/c nu/nu mice. Moreover, our studies demonstrated that the anticancer activity of perforin and granzyme B was sustained in vivo during tumor development through the induction of cell apoptosis. Taken together, our data indicate that co-expression of the perforin and granzyme B genes exhibits anticancer potential and may provide therapeutic applications in laryngeal cancer. PMID:24696715

  12. Differential Co-Expression between α-Synuclein and IFN-γ Signaling Genes across Development and in Parkinson’s Disease

    PubMed Central

    Liscovitch, Noa; French, Leon

    2014-01-01

    Expression patterns of the alpha-synuclein gene (SNCA) were studied across anatomy, development, and disease to better characterize its role in the brain. In this postmortem study, negative spatial co-expression between SNCA and 73 interferon-γ (IFN-γ) signaling genes was observed across many brain regions. Recent animal studies have demonstrated that IFN-γ induces loss of dopamine neurons and nigrostriatal degeneration. This opposing pattern between SNCA and IFN-γ signaling genes increases with age (rho = −0.78). In contrast, a meta-analysis of four microarray experiments representing 126 substantia nigra samples reveals a switch to positive co-expression in Parkinson’s disease (p<0.005). Use of genome-wide testing demonstrates this relationship is specific to SNCA (p<0.002). This change in co-expression suggests an immunomodulatory role of SNCA that may provide insight into neurodegeneration. Genes showing similar co-expression patterns have been previously linked to Alzheimer’s (ANK1) and Parkinson’s disease (UBE2E2, PCMT1, HPRT1 and RIT2). PMID:25493648

  13. Yeast gene CMR1/YDL156W is consistently co-expressed with genes participating in DNA-metabolic processes in a variety of stringent clustering experiments

    PubMed Central

    Abu-Jamous, Basel; Fa, Rui; Roberts, David J.; Nandi, Asoke K.

    2013-01-01

    The binarization of consensus partition matrices (Bi-CoPaM) method has, among its unique features, the ability to perform ensemble clustering over the same set of genes from multiple microarray datasets by using various clustering methods in order to generate tunable tight clusters. Therefore, we applied the Bi-CoPaM method to the 500 most synchronized cell-cycle-regulated yeast genes from different microarray datasets to produce four tight, specific and exclusive clusters of co-expressed genes. We found that 19 genes formed the tightest of the four clusters, and this included the gene CMR1/YDL156W, which was an uncharacterized gene at the time of our investigations. Two very recent proteomic and biochemical studies have independently revealed many facets of the CMR1 protein, although the precise functions of the protein remain to be elucidated. Our computational results complement these biological results and add more evidence to their recent findings of CMR1 as potentially participating in many of the DNA-metabolism processes such as replication, repair and transcription. Interestingly, our results demonstrate the close co-expression of CMR1 with the replication protein A (RPA), the cohesion complex and the DNA polymerases α, δ and ɛ, and suggest functional relationships between CMR1 and the respective proteins. In addition, the analysis provides further substantial evidence that the expression of the CMR1 gene could be regulated by the MBF complex. In summary, the application of a novel analytic technique to large biological datasets has provided supporting evidence for a gene of previously unknown function, further hypotheses to test, and a more general demonstration of the value of sophisticated methods to explore new large datasets now so readily generated in biological experiments. PMID:23349438

  14. RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond

    PubMed Central

    Gama-Castro, Socorro; Salgado, Heladia; Santos-Zavaleta, Alberto; Ledezma-Tejeida, Daniela; Muñiz-Rascado, Luis; García-Sotelo, Jair Santiago; Alquicira-Hernández, Kevin; Martínez-Flores, Irma; Pannier, Lucia; Castro-Mondragón, Jaime Abraham; Medina-Rivera, Alejandra; Solano-Lira, Hilda; Bonavides-Martínez, César; Pérez-Rueda, Ernesto; Alquicira-Hernández, Shirley; Porrón-Sotelo, Liliana; López-Fuentes, Alejandra; Hernández-Koutoucheva, Anastasia; Moral-Chávez, Víctor Del; Rinaldi, Fabio; Collado-Vides, Julio

    2016-01-01

    RegulonDB (http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to keeping curation up to date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for ‘neighborhood’ genes to known operons and regulons, and computational developments. PMID:26527724

  15. RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond.

    PubMed

    Gama-Castro, Socorro; Salgado, Heladia; Santos-Zavaleta, Alberto; Ledezma-Tejeida, Daniela; Muñiz-Rascado, Luis; García-Sotelo, Jair Santiago; Alquicira-Hernández, Kevin; Martínez-Flores, Irma; Pannier, Lucia; Castro-Mondragón, Jaime Abraham; Medina-Rivera, Alejandra; Solano-Lira, Hilda; Bonavides-Martínez, César; Pérez-Rueda, Ernesto; Alquicira-Hernández, Shirley; Porrón-Sotelo, Liliana; López-Fuentes, Alejandra; Hernández-Koutoucheva, Anastasia; Del Moral-Chávez, Víctor; Rinaldi, Fabio; Collado-Vides, Julio

    2016-01-01

    RegulonDB (http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to keeping curation up to date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for 'neighborhood' genes to known operons and regulons, and computational developments. PMID:26527724

  16. CorSig: A General Framework for Estimating Statistical Significance of Correlation and Its Application to Gene Co-Expression Analysis

    PubMed Central

    Wang, Hong-Qiang; Tsai, Chung-Jui

    2013-01-01

    With the rapid increase of omics data, correlation analysis has become an indispensable tool for inferring meaningful associations from a large number of observations. Pearson correlation coefficient (PCC) and its variants are widely used for such purposes. However, it remains challenging to test whether an observed association is reliable both statistically and biologically. We present here a new method, CorSig, for statistical inference of correlation significance. CorSig is based on a biology-informed null hypothesis, i.e., testing whether the true PCC (ρ) between two variables is statistically larger than a user-specified PCC cutoff (τ), as opposed to the simple null hypothesis of ρ = 0 in existing methods, i.e., testing whether an association can be declared without a threshold. CorSig incorporates Fisher's Z transformation of the observed PCC (r), which facilitates use of standard techniques for p-value computation and multiple testing corrections. We compared CorSig against two methods: one uses a minimum PCC cutoff while the other (Zhu's procedure) controls correlation strength and statistical significance in two discrete steps. CorSig consistently outperformed these methods in various simulation data scenarios by balancing between false positives and false negatives. When tested on real-world Populus microarray data, CorSig effectively identified co-expressed genes in the flavonoid pathway, and discriminated between closely related gene family members for their differential association with flavonoid and lignin pathways. The p-values obtained by CorSig can be used as a stand-alone parameter for stratification of co-expressed genes according to their correlation strength in lieu of an arbitrary cutoff. CorSig requires one single tunable parameter, and can be readily extended to other correlation measures. Thus, CorSig should be useful for a wide range of applications, particularly for network analysis of high-dimensional genomic data. Software
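
    The thresholded test described in this abstract can be illustrated with the textbook Fisher Z approximation. The sketch below is not the CorSig software; the function name fisher_z_pvalue and its parameters are assumptions made for illustration. It assumes that, for n samples, atanh(r) is approximately normal with mean atanh(rho) and variance 1/(n - 3), and returns the one-sided p-value for the null hypothesis rho <= tau.

```python
import math


def fisher_z_pvalue(r: float, n: int, tau: float = 0.0) -> float:
    """One-sided p-value for H0: rho <= tau against H1: rho > tau.

    Illustrative sketch only (hypothetical helper, not the CorSig code).
    r   : observed Pearson correlation coefficient
    n   : number of samples used to compute r
    tau : user-specified correlation cutoff (tau = 0 gives the usual test)
    """
    if n <= 3:
        raise ValueError("Fisher's Z approximation needs n > 3")
    if not (-1.0 < r < 1.0) or not (-1.0 < tau < 1.0):
        raise ValueError("r and tau must lie strictly between -1 and 1")
    # Fisher's Z transformation: atanh(r) is roughly N(atanh(rho), 1/(n-3)).
    z_stat = (math.atanh(r) - math.atanh(tau)) * math.sqrt(n - 3)
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z_stat / math.sqrt(2.0))


if __name__ == "__main__":
    # Same observed correlation, tested against two different cutoffs.
    print(fisher_z_pvalue(0.85, n=30, tau=0.0))
    print(fisher_z_pvalue(0.85, n=30, tau=0.7))
```

    Raising tau makes the test stricter: a correlation that is clearly different from zero may still fail to be significantly larger than the chosen cutoff. The resulting p-values could then be passed to any standard multiple-testing correction.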

  17. Gene Co-Expression Analysis Inferring the Crosstalk of Ethylene and Gibberellin in Modulating the Transcriptional Acclimation of Cassava Root Growth in Different Seasons

    PubMed Central

    Saithong, Treenut; Saerue, Samorn; Kalapanulak, Saowalak; Sojikul, Punchapat; Narangajavana, Jarunya; Bhumiratana, Sakarindr

    2015-01-01

    Cassava is a crop of hope for the 21st century. Its great advantages over other crops are not only its carbohydrate capacity but also its ease of cultivation and fast development. As a plant that is highly tolerant of poor environments, cassava is believed to possess an effective acclimation process, an intelligent mechanism behind its survival and sustainability in a wide range of climates. Herein, we aimed to investigate the transcriptional regulation underlying the adaptive development of cassava roots under different seasonal cultivation climates. Gene co-expression analysis suggests that an AP2-EREBP transcription factor (ERF1) orthologue (D142) played a pivotal role in regulating the cellular response to wet and dry seasons. The ERF shows crosstalk with gibberellin, via ent-Kaurene synthase (D106), in the transcriptional regulatory network that was proposed to modulate the downstream regulatory system through a distinct signaling mechanism. While sulfur assimilation is likely to be a signaling regulation for the dry-crop growth response, calmodulin-binding protein is responsible for regulation in the wet crop. With this initial study, we hope that our findings will pave the way towards sustainable cassava production under various kinds of stress in light of future global climate change. PMID:26366737

  18. Computation in gene networks

    NASA Astrophysics Data System (ADS)

    Ben-Hur, Asa; Siegelmann, Hava T.

    2004-03-01

    Genetic regulatory networks have the complex task of controlling all aspects of life. Using a model of gene expression by piecewise linear differential equations, we show that this process can be considered as a process of computation. This is demonstrated by showing that this model can simulate memory-bounded Turing machines. The simulation is robust with respect to perturbations of the system, an important property for both analog computers and biological systems. Robustness is achieved using a condition that ensures that the model equations, which are generally chaotic, follow predictable dynamics.
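
    As a rough illustration of what a piecewise linear model of gene expression looks like, the sketch below integrates a two-gene mutual-repression switch in which each gene decays linearly and is produced at a constant rate only while its repressor is below a threshold. This is a generic toy example, not the construction used in the paper; the function names, the threshold of 0.5, and the unit production and decay rates are all assumptions.

```python
def is_on(level: float, theta: float = 0.5) -> bool:
    """A gene counts as 'on' when its product exceeds the threshold theta."""
    return level > theta


def simulate_toggle(a: float, b: float, dt: float = 0.01,
                    steps: int = 2000, theta: float = 0.5):
    """Forward-Euler integration of a two-gene mutual-repression switch.

    Within each region of state space the equations are linear:
        da/dt = production_a - a,   db/dt = production_b - b
    where production_a is 1 if b is off and 0 otherwise (and vice versa).
    All rates and the threshold are assumed values for illustration.
    """
    for _ in range(steps):
        production_a = 0.0 if is_on(b, theta) else 1.0
        production_b = 0.0 if is_on(a, theta) else 1.0
        a, b = a + dt * (production_a - a), b + dt * (production_b - b)
    return a, b


if __name__ == "__main__":
    # The final state depends on which gene started above threshold,
    # so the switch stores one bit of information.
    print(simulate_toggle(0.8, 0.3))
    print(simulate_toggle(0.3, 0.8))
```

    The two stable states act as a one-bit memory, which gives a small-scale intuition for how networks of this type can hold state.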

  19. Use of Semisupervised Clustering and Feature-Selection Techniques for Identification of Co-expressed Genes.

    PubMed

    Saha, Sriparna; Alok, Abhay Kumar; Ekbal, Asif

    2016-07-01

    Studying the patterns hidden in gene-expression data helps to understand the functionality of genes. In general, clustering techniques are widely used for the identification of natural partitionings in gene-expression data. To constrain dimensionality, feature selection is a key issue because not all features are important from a clustering point of view. Moreover, a limited amount of supervised information can help to fine-tune the obtained clustering solution. In this paper, the problem of simultaneous feature selection and semisupervised clustering is formulated as a multiobjective optimization (MOO) task. A modern simulated annealing-based MOO technique, AMOSA, is utilized as the underlying optimization methodology. Here, features and cluster centers are represented in the form of a string, and the assignment of genes to different clusters is done using a point symmetry-based distance. Six optimization criteria based on several internal and external cluster validity indices are utilized. To generate the supervised information, a popular clustering technique, Fuzzy C-means, is utilized. An appropriate subset of features, the proper number of clusters and the proper partitioning are determined using the search capability of AMOSA. The effectiveness of the proposed semisupervised clustering technique, Semi-FeaClustMOO, is demonstrated on five publicly available benchmark gene-expression datasets. Comparison with existing techniques for gene-expression data clustering again reveals the superiority of the proposed technique. Statistical and biological significance tests have also been carried out. PMID:26208367

  20. Three tightly linked genes encoding human type I keratins: conservation of sequence in the 5'-untranslated leader and 5'-upstream regions of coexpressed keratin genes.

    PubMed Central

    RayChaudhury, A; Marchuk, D; Lindhurst, M; Fuchs, E

    1986-01-01

    We have isolated and subcloned three separate segments of human DNA which share strong sequence homology with a previously sequenced gene encoding a type I keratin, K14 (50 kilodaltons). Restriction endonuclease mapping has demonstrated that these three genes are tightly linked chromosomally, whereas the K14 gene appears to be separate. As judged by positive hybridization-translation and Northern blot analyses, the central linked gene encodes a keratin, K17, which is expressed in abundance with K14 and two other type I keratins in cultured human epidermal cells. None of these other epidermal keratin mRNAs appears to be generated from the K17 gene through differential splicing of its transcript. The sequence of the K17 gene reveals striking homologies not only with the coding portions and intron positions of the K14 gene, but also with its 5'-noncoding and 5'-upstream sequences. These similarities may provide an important clue in elucidating the molecular mechanisms underlying the coexpression of the two genes. PMID:2431270

  1. Identification of microRNA-regulated gene networks by expression analysis of target genes.

    PubMed

    Gennarino, Vincenzo Alessandro; D'Angelo, Giovanni; Dharmalingam, Gopuraja; Fernandez, Serena; Russolillo, Giorgio; Sanges, Remo; Mutarelli, Margherita; Belcastro, Vincenzo; Ballabio, Andrea; Verde, Pasquale; Sardiello, Marco; Banfi, Sandro

    2012-06-01

    MicroRNAs (miRNAs) and transcription factors control eukaryotic cell proliferation, differentiation, and metabolism through their specific gene regulatory networks. However, unlike for transcription factors, our understanding of the processes regulated by miRNAs is currently limited. Here, we introduce gene network analysis as a new means for gaining insight into miRNA biology. A systematic analysis of all human miRNAs based on Co-expression Meta-analysis of miRNA Targets (CoMeTa) assigns high-resolution biological functions to miRNAs and provides a comprehensive, genome-scale analysis of human miRNA regulatory networks. Moreover, gene cotargeting analyses show that miRNAs synergistically regulate cohorts of genes that participate in similar processes. We experimentally validate the CoMeTa procedure by focusing on three poorly characterized miRNAs, miR-519d/190/340, which CoMeTa predicts to be associated with the TGFβ pathway. Using lung adenocarcinoma A549 cells as a model system, we show that miR-519d and miR-190 inhibit, while miR-340 enhances, TGFβ signaling and its effects on cell proliferation, morphology, and scattering. Based on these findings, we formalize and propose co-expression analysis as a general paradigm for second-generation procedures to recognize bona fide targets and infer biological roles and network communities of miRNAs. PMID:22345618

  2. Identification of microRNA-regulated gene networks by expression analysis of target genes

    PubMed Central

    Gennarino, Vincenzo Alessandro; D'Angelo, Giovanni; Dharmalingam, Gopuraja; Fernandez, Serena; Russolillo, Giorgio; Sanges, Remo; Mutarelli, Margherita; Belcastro, Vincenzo; Ballabio, Andrea; Verde, Pasquale; Sardiello, Marco; Banfi, Sandro

    2012-01-01

    MicroRNAs (miRNAs) and transcription factors control eukaryotic cell proliferation, differentiation, and metabolism through their specific gene regulatory networks. However, unlike for transcription factors, our understanding of the processes regulated by miRNAs is currently limited. Here, we introduce gene network analysis as a new means for gaining insight into miRNA biology. A systematic analysis of all human miRNAs based on Co-expression Meta-analysis of miRNA Targets (CoMeTa) assigns high-resolution biological functions to miRNAs and provides a comprehensive, genome-scale analysis of human miRNA regulatory networks. Moreover, gene cotargeting analyses show that miRNAs synergistically regulate cohorts of genes that participate in similar processes. We experimentally validate the CoMeTa procedure by focusing on three poorly characterized miRNAs, miR-519d/190/340, which CoMeTa predicts to be associated with the TGFβ pathway. Using lung adenocarcinoma A549 cells as a model system, we show that miR-519d and miR-190 inhibit, while miR-340 enhances, TGFβ signaling and its effects on cell proliferation, morphology, and scattering. Based on these findings, we formalize and propose co-expression analysis as a general paradigm for second-generation procedures to recognize bona fide targets and infer biological roles and network communities of miRNAs. PMID:22345618

  3. Systems toxicology of chemically induced liver and kidney injuries: histopathology-associated gene co-expression modules.

    PubMed

    Te, Jerez A; AbdulHameed, Mohamed Diwan M; Wallqvist, Anders

    2016-09-01

    Organ injuries caused by environmental chemical exposures or use of pharmaceutical drugs pose a serious health risk that may be difficult to assess because of a lack of non-invasive diagnostic tests. Mapping chemical injuries to organ-specific histopathology outcomes via biomarkers will provide a foundation for designing precise and robust diagnostic tests. We identified co-expressed genes (modules) specific to injury endpoints using the Open Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System (TG-GATEs) - a toxicogenomics database containing organ-specific gene expression data matched to dose- and time-dependent chemical exposures and adverse histopathology assessments in Sprague-Dawley rats. We proposed a protocol for selecting gene modules associated with chemical-induced injuries that classify 11 liver and eight kidney histopathology endpoints based on dose-dependent activation of the identified modules. We showed that the activation of the modules for a particular chemical exposure condition, i.e., chemical-time-dose combination, correlated with the severity of histopathological damage in a dose-dependent manner. Furthermore, the modules could distinguish different types of injuries caused by chemical exposures as well as determine whether the injury module activation was specific to the tissue of origin (liver and kidney). The generated modules provide a link between toxic chemical exposures, different molecular initiating events among underlying molecular pathways and resultant organ damage. Published 2016. This article is a U.S. Government work and is in the public domain in the USA. Journal of Applied Toxicology published by John Wiley & Sons, Ltd. PMID:26725466

  4. Characterization of Tusc5, a Unique Adipocyte Gene Co-Expressed in Peripheral Somatosensory Neurons

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Tumor suppressor candidate 5 (Tusc5, GenBank nomenclature) is a cold-repressed gene encoding a member of the CD225 domain-containing family, identified through analysis of transcripts differentially-expressed in brown adipose tissue (BAT) with changes in ambient temperature. Tusc5 mRNA was found to ...

  5. Ethanol production by Escherichia coli strains co-expressing Zymomonas PDC and ADH genes

    DOEpatents

    Ingram, Lonnie O.; Conway, Tyrrell; Alterthum, Flavio

    1991-01-01

    A novel operon and plasmids comprising genes which code for the alcohol dehydrogenase and pyruvate decarboxylase activities of Zymomonas mobilis are described. Also disclosed are methods for increasing the growth of microorganisms or eukaryotic cells and methods for reducing the accumulation of undesirable metabolic products in the growth medium of microorganisms or cells.

  6. Reconstruction of a Functional Human Gene Network, with an Application for Prioritizing Positional Candidate Genes

    PubMed Central

    Franke, Lude; Bakel, Harm van; Fokkens, Like; de Jong, Edwin D.; Egmont-Petersen, Michael; Wijmenga, Cisca

    2006-01-01

    Most common genetic disorders have a complex inheritance and may result from variants in many genes, each contributing only weak effects to the disease. Pinpointing these disease genes within the myriad of susceptibility loci identified in linkage studies is difficult because these loci may contain hundreds of genes. However, in any disorder, most of the disease genes will be involved in only a few different molecular pathways. If we know something about the relationships between the genes, we can assess whether some genes (which may reside in different loci) functionally interact with each other, indicating a joint basis for the disease etiology. There are various repositories of information on pathway relationships. To consolidate this information, we developed a functional human gene network that integrates information on genes and the functional relationships between genes, based on data from the Kyoto Encyclopedia of Genes and Genomes, the Biomolecular Interaction Network Database, Reactome, the Human Protein Reference Database, the Gene Ontology database, predicted protein-protein interactions, human yeast two-hybrid interactions, and microarray coexpressions. We applied this network to interrelate positional candidate genes from different disease loci and then tested 96 heritable disorders for which the Online Mendelian Inheritance in Man database reported at least three disease genes. Artificial susceptibility loci, each containing 100 genes, were constructed around each disease gene, and we used the network to rank these genes on the basis of their functional interactions. By following up the top five genes per artificial locus, we were able to detect at least one known disease gene in 54% of the loci studied, representing a 2.8-fold increase over random selection. This suggests that our method can significantly reduce the cost and effort of pinpointing true disease genes in analyses of disorders for which numerous loci have been reported but for which

  7. How difficult is inference of mammalian causal gene regulatory networks?

    PubMed

    Djordjevic, Djordje; Yang, Andrian; Zadoorian, Armella; Rungrugeecharoen, Kevin; Ho, Joshua W K

    2014-01-01

    Gene regulatory networks (GRNs) play a central role in systems biology, especially in the study of mammalian organ development. One key question remains largely unanswered: Is it possible to infer mammalian causal GRNs using observable gene co-expression patterns alone? We assembled two mouse GRN datasets (embryonic tooth and heart) and matching microarray gene expression profiles to systematically investigate the difficulties of mammalian causal GRN inference. The GRNs were assembled based on > 2,000 pieces of experimental genetic perturbation evidence from manually reading > 150 primary research articles. Each piece of perturbation evidence records the qualitative change of the expression of one gene following knock-down or over-expression of another gene. Our data have thorough annotation of tissue types and embryonic stages, as well as the type of regulation (activation, inhibition and no effect), which uniquely allows us to estimate both sensitivity and specificity of the inference of tissue specific causal GRN edges. Using these unprecedented datasets, we found that gene co-expression does not reliably distinguish true positive from false positive interactions, making inference of GRN in mammalian development very difficult. Nonetheless, if we have expression profiling data from genetic or molecular perturbation experiments, such as gene knock-out or signalling stimulation, it is possible to use the set of differentially expressed genes to recover causal regulatory relationships with good sensitivity and specificity. Our result supports the importance of using perturbation experimental data in causal network reconstruction. Furthermore, we showed that causal gene regulatory relationship can be highly cell type or developmental stage specific, suggesting the importance of employing expression profiles from homogeneous cell populations. This study provides essential datasets and empirical evidence to guide the development of new GRN inference methods for

  8. Identification of co-expressed gene signatures in mouse B1, marginal zone and B2 B-cell populations

    PubMed Central

    Mabbott, Neil A; Gray, David

    2014-01-01

    In mice, three major B-cell subsets have been identified with distinct functionalities: B1 B cells, marginal zone B cells and follicular B2 B cells. Here, we used the growing body of publicly available transcriptomics data to create an expression atlas of 84 gene expression microarray data sets of distinct mouse B-cell subsets. These data were subjected to network-based cluster analysis using BioLayout Express3D. Using this analysis tool, genes with related functions clustered together in discrete regions of the network graph and enabled the identification of transcriptional networks that underpinned the functional activity of distinct cell populations. Some gene clusters were expressed highly by most of the cell populations included in this analysis (such as those with activity related to house-keeping functions). Others contained genes with expression patterns specific to distinct B-cell subsets. While these clusters contained many genes typically associated with the activity of the cells they were specifically expressed in, many novel B-cell-subset-specific candidate genes were identified. A large number of uncharacterized genes were also represented in these B-cell lineage-specific clusters. Further analysis of the activities of these uncharacterized candidate genes will lead to the identification of novel B-cell lineage-specific transcription factors and regulators of B-cell function. We also analysed 36 microarray data sets from distinct human B-cell populations. These data showed that mouse and human germinal centre B cells shared similar transcriptional features, whereas mouse B1 B cells were distinct from proposed human B1 B cells. PMID:24032749

  9. Intraisolate Mitochondrial Genetic Polymorphism and Gene Variants Coexpression in Arbuscular Mycorrhizal Fungi

    PubMed Central

    Beaudet, Denis; de la Providencia, Ivan Enrique; Labridy, Manuel; Roy-Bolduc, Alice; Daubois, Laurence; Hijri, Mohamed

    2015-01-01

    Arbuscular mycorrhizal fungi (AMF) are multinucleated and coenocytic organisms, in which the extent of the intraisolate nuclear genetic variation has been a source of debate. Conversely, their mitochondrial genomes (mtDNAs) have appeared to be homogeneous within isolates in all next-generation sequencing (NGS)-based studies. Although several lines of evidence have challenged mtDNA homogeneity in AMF, an extensive survey to investigate intraisolate allelic diversity has not previously been undertaken. In this study, we used a conventional polymerase chain reaction-based approach on selected mitochondrial regions with a high-fidelity DNA polymerase, followed by cloning and Sanger sequencing. Two isolates of Rhizophagus irregularis were used, one cultivated in vitro for several generations (DAOM-197198) and the other recently isolated from the field (DAOM-242422). At different loci in both isolates, we found intraisolate allelic variation within the mtDNA and in a single-copy nuclear marker, which highlighted the presence of several nonsynonymous mutations in protein-coding genes. We confirmed that some of this variation persisted in the transcriptome, giving rise to at least four distinct nad4 transcripts in DAOM-197198. We also detected the presence of numerous mitochondrial DNA copies within nuclear genomes (numts), providing insights to understand this important evolutionary process in AMF. Our study reveals that genetic variation in Glomeromycota is higher than what had been previously assumed and also suggests that it could have been grossly underestimated in most NGS-based AMF studies, both in mitochondrial and nuclear genomes, due to the presence of low-level mutations. PMID:25527836

  10. Enhancement of lipase r27RCL production in Pichia pastoris by regulating gene dosage and co-expression with chaperone protein disulfide isomerase.

    PubMed

    Sha, Chong; Yu, Xiao-Wei; Lin, Nai-Xin; Zhang, Meng; Xu, Yan

    2013-12-10

    Pichia pastoris has been successfully used in the production of many secreted and intracellular recombinant proteins, but there is still considerable room for improvement in this expression system. Two factors, gene dosage and protein folding in the endoplasmic reticulum (ER), drastically influence the production of lipase r27RCL from Rhizopus chinensis CCTCC M201021. Regarding the effect of gene dosage, the enzyme activity of the recombinant strain with three copies of the lipase gene was 1.95-fold higher than that of the recombinant strain with only one copy. In addition, lipase production was further improved by co-expression with the chaperone PDI, which is involved in disulfide bond formation in the ER. The maximum enzyme activity reached 355 U/mL in shaking flasks with the recombinant strain carrying one copy of the chaperone gene PDI plus five copies of the lipase gene proRCL, which was 2.74-fold higher than that of the control strain with only one copy of the lipase gene. Overall, co-expression with PDI vastly increased the protein-processing capacity of the ER in P. pastoris. PMID:24315648

  11. Dynamic Visualization of Co-expression in Systems Genetics Data

    SciTech Connect

    New, Joshua Ryan; Huang, Jian; Chesler, Elissa J

    2008-01-01

    Biologists hope to address grand scientific challenges by exploring the abundance of data made available through modern microarray technology and other high-throughput techniques. The impact of this data, however, is limited unless researchers can effectively assimilate such complex information and integrate it into their daily research; interactive visualization tools are called for to support the effort. Specifically, typical studies of gene co-expression require novel visualization tools that enable the dynamic formulation and fine-tuning of hypotheses to aid the process of evaluating sensitivity of key parameters. These tools should allow biologists to develop an intuitive understanding of the structure of biological networks and discover genes which reside in critical positions in networks and pathways. By using a graph as a universal data representation of correlation in gene expression data, our novel visualization tool employs several techniques that when used in an integrated manner provide innovative analytical capabilities. Our tool for interacting with gene co-expression data integrates techniques such as: graph layout, qualitative subgraph extraction through a novel 2D user interface, quantitative subgraph extraction using graph-theoretic algorithms or by querying an optimized b-tree, dynamic level-of-detail graph abstraction, and template-based fuzzy classification using neural networks. We demonstrate our system using a real-world workflow from a large-scale, systems genetics study of mammalian gene co-expression.

  12. Co-expression of interleukin 12 enhances antitumor effects of a novel chimeric promoter-mediated suicide gene therapy in an immunocompetent mouse model

    SciTech Connect

    Xu, Yu; Liu, Zhengchun; Kong, Haiyan; Sun, Wenjie; Liao, Zhengkai; Zhou, Fuxiang; Xie, Conghua; and others

    2011-09-09

    Highlights: • A novel chimeric promoter consisting of CArG element and hTERT promoter was developed. • The promoter was characterized with radiation-inducibility and tumor-specificity. • Suicide gene system driven by the promoter showed remarkable cytotoxicity in vitro. • Co-expression of IL12 enhanced the promoter-mediated suicide gene therapy in vivo. Abstract: The human telomerase reverse transcriptase (hTERT) promoter has been widely used in targeted gene therapy of cancer. However, low transcriptional activity has limited its clinical application. Here, we designed a novel dual radiation-inducible and tumor-specific promoter system consisting of CArG elements and the hTERT promoter, resulting in increased expression of reporter genes after gamma-irradiation. Therapeutic and side effects of the adenovirus-mediated horseradish peroxidase (HRP)/indole-3-acetic acid (IAA) system downstream of the chimeric promoter were evaluated in mice bearing Lewis lung carcinoma, with or without the adenovirus-mediated interleukin 12 (IL12) gene driven by the cytomegalovirus promoter. The combination treatment suppressed tumor growth more effectively than either agent alone, and was associated with pronounced intratumoral T-lymphocyte infiltration and minor side effects. Our results suggest that the combination of the HRP/IAA system driven by the novel chimeric promoter and the co-expression of IL12 might be an effective and safe targeted gene therapy strategy for cancer.

  13. Construction of a gene-gene interaction network with a combined score across multiple approaches.

    PubMed

    Zhang, A M; Song, H; Shen, Y H; Liu, Y

    2015-01-01

    Recent progress in computational methods for investigating physical and functional gene interactions has provided new insights into the complexity of biological processes. An essential part of these methods is presented visually in the form of gene interaction networks that can be valuable in exploring the mechanisms of disease. Here, a combined network based on gene pairs with an extra layer of reliability was constructed after converting and combining the gene pair scores using a novel algorithm across multiple approaches. Four groups of kidney cancer data sets from ArrayExpress were downloaded and analyzed to identify differentially expressed genes using a rank products analysis tool. Gene co-expression network, protein-protein interaction, co-occurrence network and a combined network were constructed using empirical Bayesian meta-analysis approach, Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database, an odds ratio formula of the cBioPortal for Cancer Genomics and a novel rank algorithm with combined score, respectively. The topological features of these networks were then compared to evaluate their performances. The results indicated that the gene pairs and their relationship rankings were not uniform. The values of topological parameters, such as clustering coefficient and the fitting coefficient R² of the interaction network constructed using our rank-based combination score, were much greater than those of the other networks. The combined network had a classic small-world property which transferred information quickly and displayed great resilience to the dysfunction of low-degree hubs with high clustering and short average path length. It also distinctly followed a scale-free network with a higher reliability. PMID:26125911
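
    The abstract above compares networks by topological parameters such as the clustering coefficient and the average path length. As a minimal sketch of how those two quantities are commonly defined (not the authors' code), the Python below computes them for a small undirected graph. The node names `g1` to `g4` and the toy adjacency map are hypothetical and exist only for illustration.

```python
from collections import deque


def clustering_coefficient(adj):
    """Average local clustering coefficient of an undirected graph.

    adj maps each node to the set of its neighbours.
    """
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        # Count the edges that exist among the neighbours of this node.
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")


# Hypothetical toy network: a triangle (g1, g2, g3) with g4 attached to g3.
toy = {
    "g1": {"g2", "g3"},
    "g2": {"g1", "g3"},
    "g3": {"g1", "g2", "g4"},
    "g4": {"g3"},
}
cc = clustering_coefficient(toy)
apl = average_path_length(toy)
```

    A network with high clustering and a short average path length, relative to a random graph of the same size, is what is usually meant by a small-world network.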

  14. Gene co-expression network analysis identifies porcine genes associated with variation in Salmonella shedding

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Salmonella enterica serovar Typhimurium is a gram-negative bacterium that can colonize the gut of humans and several species of food producing farm animals to cause enteric or septicaemic salmonellosis. While many studies have looked into the host genetic response to Salmonella infection, relatively...

  15. A Genome-Wide Association Study for Culm Cellulose Content in Barley Reveals Candidate Genes Co-Expressed with Members of the CELLULOSE SYNTHASE A Gene Family

    PubMed Central

    Houston, Kelly; Burton, Rachel A.; Sznajder, Beata; Rafalski, Antoni J.; Dhugga, Kanwarpal S.; Mather, Diane E.; Taylor, Jillian; Steffenson, Brian J.; Waugh, Robbie; Fincher, Geoffrey B.

    2015-01-01

    Cellulose is a fundamentally important component of cell walls of higher plants. It provides a scaffold that allows the development and growth of the plant to occur in an ordered fashion. Cellulose also provides mechanical strength, which is crucial for both normal development and to enable the plant to withstand both abiotic and biotic stresses. We quantified the cellulose concentration in the culm of 288 two-rowed and 288 six-rowed spring type barley accessions that were part of the USDA-funded barley Coordinated Agricultural Project (CAP) program in the USA. When the population structure of these accessions was analysed we identified six distinct populations, four of which we considered to be comprised of a sufficient number of accessions to be suitable for genome-wide association studies (GWAS). These lines had been genotyped with 3072 SNPs so we combined the trait and genetic data to carry out GWAS. The analysis allowed us to identify regions of the genome containing significant associations between molecular markers and cellulose concentration data, including one region cross-validated in multiple populations. To identify candidate genes we assembled the gene content of these regions and used these to query a comprehensive RNA-seq based gene expression atlas. This provided us with gene annotations and associated expression data across multiple tissues, which allowed us to formulate a supported list of candidate genes that regulate cellulose biosynthesis. Several regions identified by our analysis contain genes that are co-expressed with CELLULOSE SYNTHASE A (HvCesA) across a range of tissues and developmental stages. These genes are involved in both primary and secondary cell wall development. In addition, genes that have been previously linked with cellulose synthesis by biochemical methods, such as HvCOBRA, a gene of unknown function, were also associated with cellulose levels in the association panel. Our analyses provide new insights into the

  16. A Consensus Network of Gene Regulatory Factors in the Human Frontal Lobe.

    PubMed

    Berto, Stefano; Perdomo-Sabogal, Alvaro; Gerighausen, Daniel; Qin, Jing; Nowick, Katja

    2016-01-01

    Cognitive abilities, such as memory, learning, language, problem solving, and planning, involve the frontal lobe and other brain areas. Not much is known yet about the molecular basis of cognitive abilities, but it seems clear that cognitive abilities are determined by the interplay of many genes. One approach for analyzing the genetic networks involved in cognitive functions is to study the coexpression networks of genes with known importance for proper cognitive functions, such as genes that have been associated with cognitive disorders like intellectual disability (ID) or autism spectrum disorders (ASD). Because many of these genes are gene regulatory factors (GRFs) we aimed to provide insights into the gene regulatory networks active in the human frontal lobe. Using genome wide human frontal lobe expression data from 10 independent data sets, we first derived 10 individual coexpression networks for all GRFs including their potential target genes. We observed a high level of variability among these 10 independently derived networks, pointing out that relying on results from a single study can only provide limited biological insights. To instead focus on the most confident information from these 10 networks we developed a method for integrating such independently derived networks into a consensus network. This consensus network revealed robust GRF interactions that are conserved across the frontal lobes of different healthy human individuals. Within this network, we detected a strong central module that is enriched for 166 GRFs known to be involved in brain development and/or cognitive disorders. Interestingly, several hubs of the consensus network encode for GRFs that have not yet been associated with brain functions. Their central role in the network suggests them as excellent new candidates for playing an essential role in the regulatory network of the human frontal lobe, which should be investigated in future studies. PMID:27014338
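
    The abstract above does not say how the 10 independently derived networks were integrated. As a rough illustration only, the Python sketch below uses one simple scheme, edge voting, which keeps an undirected edge if it appears in at least `min_support` of the input networks. This is an assumption for illustration, not the authors' method. The names `consensus_network` and `min_support` and the gene labels `GRF1` to `GRF4` are hypothetical.

```python
from collections import Counter


def consensus_network(networks, min_support):
    """Keep undirected edges found in at least min_support networks.

    networks is a list of edge lists; each edge is a pair of node names.
    Self-loops are assumed to be absent.
    """
    counts = Counter()
    for edges in networks:
        # frozenset treats (a, b) and (b, a) as the same undirected edge;
        # the set comprehension counts each edge once per network.
        counts.update({frozenset(edge) for edge in edges})
    return {tuple(sorted(e)) for e, n in counts.items() if n >= min_support}


# Hypothetical example with three small networks.
networks = [
    [("GRF1", "GRF2"), ("GRF2", "GRF3")],
    [("GRF2", "GRF1"), ("GRF3", "GRF4")],
    [("GRF1", "GRF2"), ("GRF2", "GRF3")],
]
consensus = consensus_network(networks, min_support=2)
```

    Raising `min_support` trades coverage for confidence: an edge supported by every data set is less likely to reflect the variability of a single study.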

  17. A Consensus Network of Gene Regulatory Factors in the Human Frontal Lobe

    PubMed Central

    Berto, Stefano; Perdomo-Sabogal, Alvaro; Gerighausen, Daniel; Qin, Jing; Nowick, Katja

    2016-01-01

    Cognitive abilities, such as memory, learning, language, problem solving, and planning, involve the frontal lobe and other brain areas. Not much is known yet about the molecular basis of cognitive abilities, but it seems clear that cognitive abilities are determined by the interplay of many genes. One approach for analyzing the genetic networks involved in cognitive functions is to study the coexpression networks of genes with known importance for proper cognitive functions, such as genes that have been associated with cognitive disorders like intellectual disability (ID) or autism spectrum disorders (ASD). Because many of these genes are gene regulatory factors (GRFs) we aimed to provide insights into the gene regulatory networks active in the human frontal lobe. Using genome wide human frontal lobe expression data from 10 independent data sets, we first derived 10 individual coexpression networks for all GRFs including their potential target genes. We observed a high level of variability among these 10 independently derived networks, pointing out that relying on results from a single study can only provide limited biological insights. To instead focus on the most confident information from these 10 networks we developed a method for integrating such independently derived networks into a consensus network. This consensus network revealed robust GRF interactions that are conserved across the frontal lobes of different healthy human individuals. Within this network, we detected a strong central module that is enriched for 166 GRFs known to be involved in brain development and/or cognitive disorders. Interestingly, several hubs of the consensus network encode for GRFs that have not yet been associated with brain functions. Their central role in the network suggests them as excellent new candidates for playing an essential role in the regulatory network of the human frontal lobe, which should be investigated in future studies. PMID:27014338

  18. Detection of gene communities in multi-networks reveals cancer drivers

    NASA Astrophysics Data System (ADS)

    Cantini, Laura; Medico, Enzo; Fortunato, Santo; Caselle, Michele

    2015-12-01

    We propose a new multi-network-based strategy to integrate different layers of genomic information and use them in a coordinated way to identify driving cancer genes. The multi-networks that we consider combine transcription factor co-targeting, microRNA co-targeting, protein-protein interaction and gene co-expression networks. The rationale behind this choice is that gene co-expression and protein-protein interactions require a tight coregulation of the partners and that such a fine-tuned regulation can be obtained only by combining both the transcriptional and post-transcriptional layers of regulation. To extract the relevant biological information from the multi-network we studied its partition into communities. To this end we applied a consensus clustering algorithm based on state-of-the-art community detection methods. Although our procedure is valid in principle for any pathology, in this work we concentrate on gastric, lung, pancreas and colorectal cancer, and we identified from the enrichment analysis of the multi-network communities a set of candidate driver cancer genes. Some of them were already known oncogenes while a few are new. The combination of the different layers of information allowed us to extract from the multi-network indications on the regulatory pattern and functional role of both the already known and the new candidate driver genes.

  19. Detection of gene communities in multi-networks reveals cancer drivers

    PubMed Central

    Cantini, Laura; Medico, Enzo; Fortunato, Santo; Caselle, Michele

    2015-01-01

    We propose a new multi-network-based strategy to integrate different layers of genomic information and use them in a coordinated way to identify driving cancer genes. The multi-networks that we consider combine transcription factor co-targeting, microRNA co-targeting, protein-protein interaction and gene co-expression networks. The rationale behind this choice is that gene co-expression and protein-protein interactions require a tight coregulation of the partners and that such a fine-tuned regulation can be obtained only by combining both the transcriptional and post-transcriptional layers of regulation. To extract the relevant biological information from the multi-network we studied its partition into communities. To this end we applied a consensus clustering algorithm based on state-of-the-art community detection methods. Although our procedure is valid in principle for any pathology, in this work we concentrate on gastric, lung, pancreas and colorectal cancer, and we identified from the enrichment analysis of the multi-network communities a set of candidate driver cancer genes. Some of them were already known oncogenes while a few are new. The combination of the different layers of information allowed us to extract from the multi-network indications on the regulatory pattern and functional role of both the already known and the new candidate driver genes. PMID:26639632

  20. Neutralization of Bacterial YoeBSpn Toxicity and Enhanced Plant Growth in Arabidopsis thaliana via Co-Expression of the Toxin-Antitoxin Genes

    PubMed Central

    Abu Bakar, Fauziah; Yeo, Chew Chieng; Harikrishna, Jennifer Ann

    2016-01-01

    Bacterial toxin-antitoxin (TA) systems have various cellular functions, including as part of the general stress response. The genome of the Gram-positive human pathogen Streptococcus pneumoniae harbors several putative TA systems, including yefM-yoeBSpn, which is one of four systems that had been demonstrated to be biologically functional. Overexpression of the yoeBSpn toxin gene resulted in cell stasis and eventually cell death in its native host, as well as in Escherichia coli. Our previous work showed that induced expression of a yoeBSpn toxin-Green Fluorescent Protein (GFP) fusion gene apparently triggered apoptosis and was lethal in the model plant, Arabidopsis thaliana. In this study, we investigated the effects of co-expression of the yefMSpn antitoxin and yoeBSpn toxin-GFP fusion in transgenic A. thaliana. When co-expressed in Arabidopsis, the YefMSpn antitoxin was found to neutralize the toxicity of YoeBSpn-GFP. Interestingly, the inducible expression of both yefMSpn antitoxin and yoeBSpn toxin-GFP fusion in transgenic hybrid Arabidopsis resulted in larger rosette leaves and taller plants with a higher number of inflorescence stems and increased silique production. To our knowledge, this is the first demonstration of a prokaryotic antitoxin neutralizing its cognate toxin in plant cells. PMID:27104531

  1. Neutralization of Bacterial YoeBSpn Toxicity and Enhanced Plant Growth in Arabidopsis thaliana via Co-Expression of the Toxin-Antitoxin Genes.

    PubMed

    Abu Bakar, Fauziah; Yeo, Chew Chieng; Harikrishna, Jennifer Ann

    2016-01-01

    Bacterial toxin-antitoxin (TA) systems have various cellular functions, including as part of the general stress response. The genome of the Gram-positive human pathogen Streptococcus pneumoniae harbors several putative TA systems, including yefM-yoeBSpn, which is one of four systems that had been demonstrated to be biologically functional. Overexpression of the yoeBSpn toxin gene resulted in cell stasis and eventually cell death in its native host, as well as in Escherichia coli. Our previous work showed that induced expression of a yoeBSpn toxin-Green Fluorescent Protein (GFP) fusion gene apparently triggered apoptosis and was lethal in the model plant, Arabidopsis thaliana. In this study, we investigated the effects of co-expression of the yefMSpn antitoxin and yoeBSpn toxin-GFP fusion in transgenic A. thaliana. When co-expressed in Arabidopsis, the YefMSpn antitoxin was found to neutralize the toxicity of YoeBSpn-GFP. Interestingly, the inducible expression of both yefMSpn antitoxin and yoeBSpn toxin-GFP fusion in transgenic hybrid Arabidopsis resulted in larger rosette leaves and taller plants with a higher number of inflorescence stems and increased silique production. To our knowledge, this is the first demonstration of a prokaryotic antitoxin neutralizing its cognate toxin in plant cells. PMID:27104531

  2. Transient Co-Expression of Post-Transcriptional Gene Silencing Suppressors for Increased in Planta Expression of a Recombinant Anthrax Receptor Fusion Protein

    PubMed Central

    Arzola, Lucas; Chen, Junxing; Rattanaporn, Kittipong; Maclean, James M.; McDonald, Karen A.

    2011-01-01

    Potential epidemics of infectious diseases and the constant threat of bioterrorism demand rapid, scalable, and cost-efficient manufacturing of therapeutic proteins. Molecular farming of tobacco plants provides an alternative for the recombinant production of therapeutics. We have developed a transient production platform that uses Agrobacterium infiltration of Nicotiana benthamiana plants to express a novel anthrax receptor decoy protein (immunoadhesin), CMG2-Fc. This chimeric fusion protein, designed to protect against the deadly anthrax toxins, is composed of the von Willebrand factor A (VWA) domain of human capillary morphogenesis 2 (CMG2), an effective anthrax toxin receptor, and the Fc region of human immunoglobulin G (IgG). We evaluated, in N. benthamiana intact plants and detached leaves, the expression of CMG2-Fc under the control of the constitutive CaMV 35S promoter, and the co-expression of CMG2-Fc with nine different viral suppressors of post-transcriptional gene silencing (PTGS): p1, p10, p19, p21, p24, p25, p38, 2b, and HCPro. Overall, transient CMG2-Fc expression was higher on intact plants than detached leaves. Maximum expression was observed with p1 co-expression at 3.5 days post-infiltration (DPI), with a level of 0.56 g CMG2-Fc per kg of leaf fresh weight and 1.5% of the total soluble protein, a ten-fold increase in expression when compared to absence of suppression. Co-expression with the p25 PTGS suppressor also significantly increased the CMG2-Fc expression level after just 3.5 DPI. PMID:21954339

  3. Building Developmental Gene Regulatory Networks

    PubMed Central

    Li, Enhu; Davidson, Eric H.

    2009-01-01

    Animal development is an elaborate process programmed by genomic regulatory instructions. Regulatory genes encode transcription factors and signal molecules, and their expression is under the control of cis-regulatory modules that define the logic of transcriptional responses to the inputs of other regulatory genes. The functional linkages amongst regulatory genes constitute the gene regulatory networks (GRNs) that govern cell specification and patterning in development. Constructing such networks requires identification of the regulatory genes involved and characterization of their temporal and spatial expression patterns. Interactions (activation/repression) among transcription factors or signals can be investigated by large-scale perturbation analysis, in which the function of each gene is specifically blocked. Resultant expression changes are then integrated to identify direct linkages, and to reveal the structure of the GRN. Predicted GRN linkages can be tested and verified by cis-regulatory analysis. The explanatory power of the GRN was shown in the lineage specification of sea urchin endomesoderm. Acquiring such networks is essential for a systematic and mechanistic understanding of the developmental process. PMID:19530131

  4. Lists2Networks: Integrated analysis of gene/protein lists

    PubMed Central

    2010-01-01

    Background: Systems biologists are faced with the difficulty of analyzing results from large-scale studies that profile the activity of many genes, RNAs and proteins, applied in different experiments, under different conditions, and reported in different publications. To address this challenge it is desirable to compare the results from different related studies such as mRNA expression microarrays, genome-wide ChIP-X, RNAi screens, proteomics and phosphoproteomics experiments in a coherent global framework. In addition, linking high-content multilayered experimental results with prior biological knowledge can be useful for identifying functional themes and forming novel hypotheses. Results: We present Lists2Networks, a web-based system that allows users to upload lists of mammalian genes/proteins onto a server-based program for integrated analysis. The system includes web-based tools to manipulate lists with different set operations, to expand lists using existing mammalian networks of protein-protein interactions, co-expression correlation, or background knowledge co-annotation correlation, as well as to apply gene-list enrichment analyses against many gene-list libraries of prior biological knowledge such as pathways, gene ontology terms, kinase-substrate, microRNA-mRNA, and protein-protein interactions, metabolites, and protein domains. Such analyses can be applied to several lists at once against many prior knowledge libraries of gene-lists associated with specific annotations. The system also contains features that allow users to export networks and share lists with other users of the system. Conclusions: Lists2Networks is a user-friendly web-based software system expected to significantly ease the computational analysis process for experimental systems biologists employing high-throughput experiments at multiple layers of regulation. The system is freely available at http://www.lists2networks.org. PMID:20152038
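
    The abstract above mentions set operations on gene lists and gene-list enrichment analysis, but it does not specify the statistics used. The Python sketch below is an illustration under stated assumptions, not the Lists2Networks implementation: it applies plain set operations and a one-sided hypergeometric test, a common choice for enrichment. The function name `hypergeom_enrichment_p`, the gene labels `G1` to `G8`, and the background size of 20 are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(list_genes, library_genes, background_size):
    """One-sided hypergeometric p-value: P(overlap >= observed).

    Both gene sets are assumed to be drawn from the same background
    of background_size genes.
    """
    n = len(list_genes)                   # size of the uploaded list
    K = len(library_genes)                # size of the annotated gene set
    k = len(list_genes & library_genes)   # observed overlap
    total = comb(background_size, n)
    tail = sum(
        comb(K, i) * comb(background_size - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return tail / total


# Hypothetical gene lists.
list_a = {"G1", "G2", "G3", "G4", "G5"}
list_b = {"G4", "G5", "G6"}
pathway = {"G1", "G2", "G3", "G8"}

union = list_a | list_b          # genes in either list
shared = list_a & list_b         # genes in both lists
only_a = list_a - list_b         # genes unique to list_a

p_value = hypergeom_enrichment_p(list_a, pathway, background_size=20)
```

    In practice the p-values from many gene-set libraries would also need a multiple-testing correction, which this sketch leaves out.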

  5. Reconstruction of Gene Networks of Iron Response in Shewanella oneidensis

    SciTech Connect

    Yang, Yunfeng; Harris, Daniel P; Luo, Feng; Joachimiak, Marcin; Wu, Liyou; Dehal, Paramvir; Jacobsen, Janet; Yang, Zamin Koo; Gao, Haichun; Arkin, Adam; Palumbo, Anthony Vito; Zhou, Jizhong

    2009-01-01

    It is of great interest to study the iron response of the γ-proteobacterium Shewanella oneidensis since it possesses a high content of iron and is capable of utilizing iron for anaerobic respiration. We report here that the iron response in S. oneidensis is a rapid process. To gain more insights into the bacterial response to iron, temporal gene expression profiles were examined for iron depletion and repletion, resulting in identification of iron-responsive biological pathways in a gene co-expression network. Iron acquisition systems, including genes unique to S. oneidensis, were rapidly and strongly induced by iron depletion, and repressed by iron repletion. Some were required for iron depletion, as exemplified by the mutational analysis of the putative siderophore biosynthesis protein SO3032. Unexpectedly, a number of genes related to anaerobic energy metabolism were repressed by iron depletion and induced by repletion, which might be due to the iron storage potential of their protein products. Other iron-responsive biological pathways include protein degradation, aerobic energy metabolism and protein synthesis. Furthermore, sequence motifs enriched in gene clusters as well as their corresponding DNA-binding proteins (Fur, CRP and RpoH) were identified, resulting in a regulatory network of iron response in S. oneidensis. Together, this work provides an overview of iron response and reveals novel features in S. oneidensis, including Shewanella-specific iron acquisition systems, and suggests the intimate relationship between anaerobic energy metabolism and iron response.

  6. Gene networks controlling petal organogenesis.

    PubMed

    Huang, Tengbo; Irish, Vivian F

    2016-01-01

    One of the biggest unanswered questions in developmental biology is how growth is controlled. Petals are an excellent organ system for investigating growth control in plants: petals are dispensable, have a simple structure, and are largely refractory to environmental perturbations that can alter their size and shape. In recent studies, a number of genes controlling petal growth have been identified. The overall picture of how such genes function in petal organogenesis is beginning to be elucidated. This review will focus on studies using petals as a model system to explore the underlying gene networks that control organ initiation, growth, and final organ morphology. PMID:26428062

  7. Buffering in cyclic gene networks

    NASA Astrophysics Data System (ADS)

    Glyzin, S. D.; Kolesov, A. Yu.; Rozov, N. Kh.

    2016-06-01

    We consider cyclic chains of unidirectionally coupled delay differential-difference equations that are mathematical models of artificial oscillating gene networks. We establish that the buffering phenomenon is realized in these systems for an appropriate choice of the parameters: any given finite number of stable periodic motions of a special type, the so-called traveling waves, coexist.

  8. The Gene Network Underlying Hypodontia.

    PubMed

    Yin, W; Bian, Z

    2015-07-01

    Mammalian tooth development is a precise and complicated procedure. Several signaling pathways, such as nuclear factor (NF)-κB and WNT, are key regulators of tooth development. Any disturbance of these signaling pathways can potentially affect or block normal tooth development, and presently, there are more than 150 syndromes and 80 genes known to be related to tooth agenesis. Clarifying the interaction and crosstalk among these genes will provide important information regarding the mechanisms underlying missing teeth. In the current review, we summarize recently published findings on genes related to isolated and syndromic tooth agenesis; most of these genes function as positive regulators of cell proliferation or negative regulators of cell differentiation and apoptosis. Furthermore, we explore the corresponding networks involving these genes in addition to their implications for the clinical management of tooth agenesis. We conclude that this requires further study to improve patients' quality of life in the future. PMID:25910507

  9. Comparison of co-expression measures: mutual information, correlation, and model based indices

    PubMed Central

    2012-01-01

    Background: Co-expression measures are often used to define networks among genes. Mutual information (MI) is often used as a generalized correlation measure. It is not clear how much MI adds beyond standard (robust) correlation measures or regression model based association measures. Further, it is important to assess what transformations of these and other co-expression measures lead to biologically meaningful modules (clusters of genes). Results: We provide a comprehensive comparison between mutual information and several correlation measures in 8 empirical data sets and in simulations. We also study different approaches for transforming an adjacency matrix, e.g. using the topological overlap measure. Overall, we confirm close relationships between MI and correlation in all data sets, which reflects the fact that most gene pairs satisfy linear or monotonic relationships. We discuss rare situations when the two measures disagree. We also compare correlation and MI based approaches when it comes to defining co-expression network modules. We show that a robust measure of correlation (the biweight midcorrelation transformed via the topological overlap transformation) leads to modules that are superior to MI based modules and maximal information coefficient (MIC) based modules in terms of gene ontology enrichment. We present a function that relates correlation to mutual information which can be used to approximate the mutual information from the corresponding correlation coefficient. We propose the use of polynomial or spline regression models as an alternative to MI for capturing non-linear relationships between quantitative variables. Conclusion: The biweight midcorrelation outperforms MI in terms of elucidating gene pairwise relationships. Coupled with the topological overlap matrix transformation, it often leads to more significantly enriched co-expression modules. Spline and polynomial networks form attractive alternatives to MI in case of non-linear relationships.
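
    The two transformations named in this abstract, the biweight midcorrelation and the topological overlap measure, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names bicor and topological_overlap are hypothetical, the formulas follow the commonly used definitions (median-centered values downweighted by their distance from the median in units of 9 times the raw median absolute deviation, and the unsigned topological overlap of a weighted adjacency matrix), and NumPy is assumed to be available.

```python
import numpy as np


def bicor(x, y):
    """Biweight midcorrelation of two 1-D samples (illustrative sketch)."""

    def _normalize(v):
        v = np.asarray(v, dtype=float)
        centered = v - np.median(v)
        mad = np.median(np.abs(centered))  # raw (unscaled) median absolute deviation
        if mad == 0:
            raise ValueError("MAD is zero; the biweight weights are undefined")
        u = centered / (9.0 * mad)
        # Observations with |u| >= 1 (far from the median) get zero weight.
        w = np.where(np.abs(u) < 1.0, (1.0 - u**2) ** 2, 0.0)
        weighted = centered * w
        return weighted / np.sqrt(np.sum(weighted**2))

    return float(np.sum(_normalize(x) * _normalize(y)))


def topological_overlap(adjacency):
    """Unsigned topological overlap of a symmetric adjacency matrix in [0, 1]."""
    a = np.array(adjacency, dtype=float)
    np.fill_diagonal(a, 0.0)          # self-connections are ignored
    k = a.sum(axis=1)                 # connectivity of each node
    shared = a @ a                    # sum over u of a[i, u] * a[u, j]
    tom = (shared + a) / (np.minimum.outer(k, k) + 1.0 - a)
    np.fill_diagonal(tom, 1.0)        # by convention a node fully overlaps itself
    return tom


if __name__ == "__main__":
    x = np.arange(1.0, 11.0)
    y = x.copy()
    y[-1] = 1000.0                    # a single gross outlier
    print("bicor:  ", bicor(x, y))
    print("pearson:", np.corrcoef(x, y)[0, 1])
```

    In this toy input the outlier receives zero biweight weight, so the robust correlation stays closer to 1 than the Pearson coefficient does, which is the kind of robustness the abstract refers to.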

  10. Integrating Genetic and Network Analysis to Characterize Genes Related to Mouse Weight

    PubMed Central

    Zhang, Bin; Wang, Susanna; Plaisier, Christopher; Castellanos, Ruth; Brozell, Alec; Schadt, Eric E; Drake, Thomas A

    2006-01-01

    Systems biology approaches that are based on the genetics of gene expression have been fruitful in identifying genetic regulatory loci related to complex traits. We use microarray and genetic marker data from an F2 mouse intercross to examine the large-scale organization of the gene co-expression network in liver, and annotate several gene modules in terms of 22 physiological traits. We identify chromosomal loci (referred to as module quantitative trait loci, mQTL) that perturb the modules and describe a novel approach that integrates network properties with genetic marker information to model gene/trait relationships. Specifically, using the mQTL and the intramodular connectivity of a body weight–related module, we describe which factors determine the relationship between gene expression profiles and weight. Our approach results in the identification of genetic targets that influence gene modules (pathways) that are related to the clinical phenotypes of interest. PMID:16934000
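
    The intramodular connectivity mentioned in this abstract is commonly defined as the sum of a gene's adjacencies to the other genes in its own module. The short Python sketch below assumes that definition; the function name intramodular_connectivity, the toy adjacency matrix, and the module labels are hypothetical and are not taken from the study.

```python
import numpy as np


def intramodular_connectivity(adjacency, module_labels, module):
    """Map gene index -> summed adjacency to the other genes of `module`."""
    a = np.array(adjacency, dtype=float)
    np.fill_diagonal(a, 0.0)                       # ignore self-connections
    labels = np.asarray(module_labels)
    members = np.flatnonzero(labels == module)     # indices of the module's genes
    sub = a[np.ix_(members, members)]              # adjacency restricted to the module
    return dict(zip(members.tolist(), sub.sum(axis=1).tolist()))


if __name__ == "__main__":
    # Toy symmetric adjacency for four genes; genes 0, 1 and 3 form module "a".
    adjacency = [
        [1.0, 0.5, 0.9, 0.2],
        [0.5, 1.0, 0.3, 0.4],
        [0.9, 0.3, 1.0, 0.1],
        [0.2, 0.4, 0.1, 1.0],
    ]
    labels = ["a", "a", "b", "a"]
    print(intramodular_connectivity(adjacency, labels, "a"))
```

    Genes with high values under this measure are the module's hubs, which is why a quantity of this kind can be related to a trait such as body weight.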