Science.gov

Sample records for coexpressed gene networks

  1. Multiscale Embedded Gene Co-expression Network Analysis

    PubMed Central

    Song, Won-Min; Zhang, Bin

    2015-01-01

    Gene co-expression network analysis has been shown to be effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information, such as a predefined number of clusters or numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution or small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computational complexity O(|V|^3), the presence of false positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778

  2. Arabidopsis gene co-expression network and its functional modules

    PubMed Central

    Mao, Linyong; Van Hemert, John L; Dash, Sudhansu; Dickerson, Julie A

    2009-01-01

    Background: Biological networks characterize the interactions of biomolecules at a systems level. One important property of biological networks is the modular structure, in which nodes are densely connected with each other, but between which there are only sparse connections. In this report, we attempted to find the relationship between the network topology and formation of modular structure by comparing gene co-expression networks with random networks. The organization of gene functional modules was also investigated. Results: We constructed a genome-wide Arabidopsis gene co-expression network (AGCN) by using 1094 microarrays. We then analyzed the topological properties of AGCN and partitioned the network into modules by using an efficient graph clustering algorithm. In the AGCN, 382 hub genes formed a clique, and they were densely connected only to a small subset of the network. At the module level, the network clustering results provide a systems-level understanding of the gene modules that coordinate multiple biological processes to carry out specific biological functions. For instance, the photosynthesis module in AGCN involves a very large number (> 1000) of genes which participate in various biological processes including photosynthesis, electron transport, pigment metabolism, chloroplast organization and biogenesis, cofactor metabolism, protein biosynthesis, and vitamin metabolism. The cell cycle module orchestrated the coordinated expression of hundreds of genes involved in cell cycle, DNA metabolism, and cytoskeleton organization and biogenesis. We also compared the AGCN constructed in this study with a graphical Gaussian model (GGM) based Arabidopsis gene network. The photosynthesis, protein biosynthesis, and cell cycle modules identified from the GGM network had much smaller module sizes compared with the corresponding modules found in the AGCN. Conclusion: This study reveals new insight into the topological properties of biological networks. The
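As a rough illustration of how a co-expression network like the one described above can be built, the sketch below links two genes when the absolute Pearson correlation of their expression profiles meets a cutoff. The abstract does not state which similarity measure or cutoff was used for the AGCN, so the use of Pearson correlation, the `cutoff=0.9` value, the function names, and the toy profiles are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(expr, cutoff=0.9):
    """Return gene pairs whose |r| meets the cutoff (cutoff is an assumed value)."""
    return [
        (g1, g2)
        for g1, g2 in combinations(sorted(expr), 2)
        if abs(pearson(expr[g1], expr[g2])) >= cutoff
    ]


# Toy expression profiles (made-up values, four samples per gene).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 1.0, 3.5, 1.2],
}
edges = coexpression_edges(expr)
print(edges)
```

In this toy input only geneA and geneB track each other closely, so they form the single edge. A hard cutoff is the simplest choice; other records in this list describe alternatives such as soft thresholding and random matrix theory.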

  3. Gene Coexpression Network Analysis as a Source of Functional Annotation for Rice Genes

    PubMed Central

    Childs, Kevin L.; Davidson, Rebecca M.; Buell, C. Robin

    2011-01-01

    With the existence of large publicly available plant gene expression data sets, many groups have undertaken data analyses to construct gene coexpression networks and functionally annotate genes. Often, a large compendium of unrelated or condition-independent expression data is used to construct gene networks. Condition-dependent expression experiments consisting of well-defined conditions/treatments have also been used to create coexpression networks to help examine particular biological processes. Gene networks derived from either condition-dependent or condition-independent data can be difficult to interpret if a large number of genes and connections are present. However, algorithms exist to identify modules of highly connected and biologically relevant genes within coexpression networks. In this study, we have used publicly available rice (Oryza sativa) gene expression data to create gene coexpression networks using both condition-dependent and condition-independent data and have identified gene modules within these networks using the Weighted Gene Coexpression Network Analysis method. We compared the number of genes assigned to modules and the biological interpretability of gene coexpression modules to assess the utility of condition-dependent and condition-independent gene coexpression networks. For the purpose of providing functional annotation to rice genes, we found gene modules identified by coexpression analysis of condition-dependent gene expression experiments to be more useful than gene modules identified by analysis of a condition-independent data set. We have incorporated our results into the MSU Rice Genome Annotation Project database as additional expression-based annotation for 13,537 genes, 2,980 of which lack a functional annotation description. These results provide two new types of functional annotation for our database. Genes in modules are now associated with groups of genes that constitute a collective functional annotation of those

  4. Random matrix analysis of localization properties of gene coexpression network

    NASA Astrophysics Data System (ADS)

    Jalan, Sarika; Solymosi, Norbert; Vattay, Gábor; Li, Baowen

    2010-04-01

    We analyze a gene coexpression network under the random matrix theory framework. The nearest-neighbor spacing distribution of the adjacency matrix of this network follows the Gaussian orthogonal statistics of random matrix theory (RMT). The spectral rigidity test follows the random matrix prediction for a certain range and deviates afterwards. Eigenvector analysis of the network using the inverse participation ratio (IPR) suggests that the statistics of the bulk of the eigenvalues of the network are consistent with those of a real symmetric random matrix, whereas a few eigenvalues are localized. Based on these IPR calculations, we can divide the eigenvalues into three sets: (a) the nondegenerate part that follows RMT; (b) the nondegenerate part, at both ends and at intermediate eigenvalues, which deviates from RMT and is expected to contain information about important nodes in the network; (c) the degenerate part with zero eigenvalue, which fluctuates around the RMT-predicted value. We identify nodes corresponding to the dominant modes of the corresponding eigenvectors and analyze their structural properties.
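The inverse participation ratio mentioned above can be illustrated with a minimal sketch. For an eigenvector with components v_i, the IPR is commonly defined as the sum of v_i^4 divided by the square of the sum of v_i^2; it is close to 1/N when the weight is spread over all N nodes and close to 1 when it sits on a single node. The function name and the two test vectors are made up for the example and are not taken from the paper.

```python
def inverse_participation_ratio(vec):
    """IPR = sum(v_i^4) / (sum(v_i^2))^2, which reduces to sum(v_i^4) for a unit vector."""
    norm_sq = sum(v * v for v in vec)
    return sum(v ** 4 for v in vec) / norm_sq ** 2


n = 100
delocalized = [1.0] * n               # weight spread evenly over all nodes
localized = [1.0] + [0.0] * (n - 1)   # all weight on a single node

print(inverse_participation_ratio(delocalized))  # approximately 1/n
print(inverse_participation_ratio(localized))    # 1.0
```

Eigenvectors whose IPR is far above the 1/N scale are the "localized" ones, and the nodes carrying most of their weight are natural candidates for closer inspection.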

  5. Analysis of bHLH coding genes using gene co-expression network approach.

    PubMed

    Srivastava, Swati; Sanchita; Singh, Garima; Singh, Noopur; Srivastava, Gaurava; Sharma, Ashok

    2016-07-01

    Network analysis provides a powerful framework for the interpretation of data. It uses novel reference network-based metrics for module evolution. These could be used to identify modules of highly connected genes showing variation in the co-expression network. In this study, a co-expression network-based approach was used for analyzing the genes from microarray data. Our approach consists of a simple but robust rank-based network construction. The publicly available gene expression data of Solanum tuberosum under cold and heat stresses were considered to create and analyze a gene co-expression network. The analysis provides a highly co-expressed module of bHLH coding genes based on correlation values. Our approach was to analyze the variation of gene expression according to the time period of stress through a co-expression network approach. As a result, seed genes were identified that show multiple connections with other genes in the same cluster. Seed genes were found to vary in different time periods of stress. These analyzed seed genes may be utilized further as marker genes for developing stress-tolerant plant species. PMID:27178572

  6. Investigating the Combinatory Effects of Biological Networks on Gene Co-expression

    PubMed Central

    Zhang, Cheng; Lee, Sunjae; Mardinoglu, Adil; Hua, Qiang

    2016-01-01

    Co-expressed genes often share similar functions, and gene co-expression networks have been widely used in studying the functionality of gene modules. Previous analysis indicated that genes are more likely to be co-expressed if they are regulated by the same transcription factors, form protein complexes, or share similar topological properties in protein-protein interaction networks. Here, we reconstructed transcriptional regulatory and protein-protein networks for Saccharomyces cerevisiae using well-established databases, and we evaluated their co-expression activities using publicly available gene expression data. Based on our network-dependent analysis, we found that genes that were co-regulated in the transcription regulatory networks and shared similar neighbors in the protein-protein networks were more likely to be co-expressed. Moreover, their biological functions were closely related. PMID:27445830
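One simple way to quantify "shared similar neighbors" in a protein-protein network is the Jaccard index of the two neighbor sets. The abstract does not say which measure the authors used, so the Jaccard index, the function name, and the toy adjacency below are assumptions chosen for illustration.

```python
def neighbor_jaccard(adj, a, b):
    """Jaccard index of the neighbor sets of a and b, ignoring a and b themselves."""
    na = adj[a] - {b}
    nb = adj[b] - {a}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


# Toy undirected protein-protein network (made-up protein names).
adj = {
    "P1": {"P3", "P4", "P5"},
    "P2": {"P3", "P4"},
    "P3": {"P1", "P2"},
    "P4": {"P1", "P2"},
    "P5": {"P1"},
}

print(neighbor_jaccard(adj, "P1", "P2"))  # 2 shared out of 3 total neighbors
print(neighbor_jaccard(adj, "P3", "P5"))  # 1 shared out of 2 total neighbors
```

Pairs scoring high on such a measure could then be compared against their expression correlation to test the association the abstract reports.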

  7. Annotation of gene function in citrus using gene expression information and co-expression networks

    PubMed Central

    2014-01-01

    Background: The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. Results: We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. Conclusions: Integration of citrus gene co-expression networks

  8. Identification of hub genes and pathways associated with retinoblastoma based on co-expression network analysis.

    PubMed

    Wang, Q L; Chen, X; Zhang, M H; Shen, Q H; Qin, Z M

    2015-01-01

    The objective of this paper was to identify hub genes and pathways associated with retinoblastoma using centrality analysis of the co-expression network and pathway-enrichment analysis. The co-expression network of retinoblastoma was constructed by weighted gene co-expression network analysis (WGCNA) based on differentially expressed (DE) genes, and clusters were obtained through the molecular complex detection (MCODE) algorithm. Degree centrality analysis of the co-expression network was performed to explore hub genes present in retinoblastoma. Pathway-enrichment analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Validation of hub gene expression in retinoblastoma was performed by reverse transcription-polymerase chain reaction (RT-PCR) analysis. The co-expression network based on 221 DE genes between retinoblastoma and normal controls consisted of 210 nodes and 3965 edges, and 5 clusters of the network were evaluated. By centrality analysis of the co-expression network, 21 hub genes were identified, such as SNORD115-41, RASSF2, and SNORD115-44. According to RT-PCR analysis, 16 of the 21 hub genes were differentially expressed, including RASSF2 and CDCA7, and 5 were not differentially expressed in retinoblastoma compared to normal controls. Pathway analysis showed that genes in 2 clusters were enriched in 3 pathways: purine metabolism, p53 signaling pathway, and melanogenesis. In this study, we successfully identified 16 hub genes and 3 pathways associated with retinoblastoma, which may be potential biomarkers for early detection and therapy for retinoblastoma. PMID:26662407
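Degree centrality, the measure used above to nominate hub genes, can be sketched as follows: each gene's degree is divided by the maximum possible degree (n - 1), and the top-ranked genes are treated as hub candidates. The edge list, the gene labels, and the choice of keeping the top two genes are hypothetical and serve only to show the calculation.

```python
def degree_centrality(edges):
    """Normalized degree centrality: number of neighbors divided by (n - 1)."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    n = len(neighbors)
    return {gene: len(nb) / (n - 1) for gene, nb in neighbors.items()}


# Toy undirected co-expression edges (made-up gene labels).
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g4", "g5")]
centrality = degree_centrality(edges)

# Rank by centrality (ties broken alphabetically) and keep the top two as hub candidates.
hubs = sorted(centrality, key=lambda g: (-centrality[g], g))[:2]
print(centrality)
print(hubs)
```

In practice the number of hubs retained, or the centrality cutoff, is a study-specific decision.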

  9. A novel method to identify pathways associated with renal cell carcinoma based on a gene co-expression network.

    PubMed

    Ruan, Xiyun; Li, Hongyun; Liu, Bo; Chen, Jie; Zhang, Shibao; Sun, Zeqiang; Liu, Shuangqing; Sun, Fahai; Liu, Qingyong

    2015-08-01

    The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson's correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842; 371; 2,883; and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson's correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify the feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient in identifying pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425

  10. Elucidating gene function and function evolution through comparison of co-expression networks of plants

    PubMed Central

    Hansen, Bjoern O.; Vaid, Neha; Musialak-Lange, Magdalena; Janowski, Marcin; Mutwil, Marek

    2014-01-01

    The analysis of gene expression data has shown that transcriptionally coordinated (co-expressed) genes are often functionally related, enabling scientists to use expression data in gene function prediction. This Focused Review discusses our original paper (Large-scale co-expression approach to dissect secondary cell wall formation across plant species, Frontiers in Plant Science 2:23). In this paper we applied cross-species analysis to co-expression networks of genes involved in cellulose biosynthesis. We showed that the co-expression networks from different species are highly similar, indicating that whole biological pathways are conserved across species. This finding has two important implications. First, the analysis can transfer gene function annotation from well-studied plants, such as Arabidopsis, to other, uncharacterized plant species. As the analysis finds genes that have similar sequence and similar expression pattern across different organisms, functionally equivalent genes can be identified. Second, since co-expression analyses are often noisy, a comparative analysis should have higher performance, as parts of co-expression networks that are conserved are more likely to be functionally relevant. In this Focused Review, we outline the comparative analysis done in the original paper and comment on the recent advances and approaches that allow comparative analyses of co-function networks. We hypothesize that in comparison to simple co-expression analysis, comparative analysis would yield more accurate gene function predictions. Finally, by combining comparative analysis with genomic information of green plants, we propose a possible composition of cellulose biosynthesis machinery during earlier stages of plant evolution. PMID:25191328

  11. Construction of citrus gene coexpression networks from microarray data using random matrix theory.

    PubMed

    Du, Dongliang; Rawat, Nidhi; Deng, Zhanao; Gmitter, Fred G

    2015-01-01

    After the sequencing of citrus genomes, gene function annotation is becoming a new challenge. Gene coexpression analysis can be employed for function annotation using publicly available microarray data sets. In this study, 230 sweet orange (Citrus sinensis) microarrays were used to construct seven coexpression networks, including one condition-independent and six condition-dependent (Citrus canker, Huanglongbing, leaves, flavedo, albedo, and flesh) networks. In total, these networks contain 37,633 edges among 6,256 nodes (genes), which account for 52.11% of the measurable genes of the citrus microarray. Then, these networks were partitioned into functional modules using the Markov Cluster Algorithm. Significantly enriched Gene Ontology biological process terms and KEGG pathway terms were detected for 343 and 60 modules, respectively. Finally, independent verification of these networks was performed using another expression data set of 371 genes. This study provides new targets for further functional analyses in citrus. PMID:26504573
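The Markov Cluster Algorithm named above alternates two steps on a column-stochastic matrix: expansion (squaring the matrix, which spreads flow along paths) and inflation (raising entries to a power and renormalizing, which strengthens strong flows and weakens weak ones). The following is a minimal dense-matrix sketch and not the reference MCL implementation; the inflation value of 2.0, the fixed iteration count, the 1e-6 cutoff, and the toy graph of two triangles joined by one edge are all assumptions made for illustration.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adj, inflation=2.0, iterations=50):
    """Return clusters as sorted lists of node indices."""
    n = len(adj)
    # Add self-loops, a common convention that stabilizes the iteration.
    m = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = matmul(m, m)                                   # expansion
        m = [[x ** inflation for x in row] for row in m]   # inflation
        m = normalize_columns(m)
    # Each row that still holds mass is an attractor; its nonzero columns form a cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph: triangle 0-1-2 and triangle 3-4-5 joined by the single edge 2-3.
adj = [[0.0] * 6 for _ in range(6)]
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[u][v] = adj[v][u] = 1.0

clusters = markov_cluster(adj)
print(clusters)
```

Higher inflation values generally yield more, smaller clusters. Real networks with thousands of genes call for a sparse implementation rather than the dense lists used here.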

  12. Construction of citrus gene coexpression networks from microarray data using random matrix theory

    PubMed Central

    Du, Dongliang; Rawat, Nidhi; Deng, Zhanao; Gmitter, Fred G.

    2015-01-01

    After the sequencing of citrus genomes, gene function annotation is becoming a new challenge. Gene coexpression analysis can be employed for function annotation using publicly available microarray data sets. In this study, 230 sweet orange (Citrus sinensis) microarrays were used to construct seven coexpression networks, including one condition-independent and six condition-dependent (Citrus canker, Huanglongbing, leaves, flavedo, albedo, and flesh) networks. In total, these networks contain 37,633 edges among 6,256 nodes (genes), which account for 52.11% of the measurable genes of the citrus microarray. Then, these networks were partitioned into functional modules using the Markov Cluster Algorithm. Significantly enriched Gene Ontology biological process terms and KEGG pathway terms were detected for 343 and 60 modules, respectively. Finally, independent verification of these networks was performed using another expression data set of 371 genes. This study provides new targets for further functional analyses in citrus. PMID:26504573

  13. Evaluation of Gene Association Methods for Coexpression Network Construction and Biological Knowledge Discovery

    PubMed Central

    Kumari, Sapna; Nie, Jeff; Chen, Huann-Sheng; Ma, Hao; Stewart, Ron; Li, Xiang; Lu, Meng-Zhu; Taylor, William M.; Wei, Hairong

    2012-01-01

    Background: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. Methods and Results: In this study, we compared eight gene association methods – Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson – and focused on their true knowledge discovery rates in associating pathway genes and constructing coordination networks of regulatory genes. We also examined how the different methods behave on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. Conclusions: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method that can find gene pairs of multiple relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave distinctly from the other six methods. The efficiencies of different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best-performing method for gene association and coexpression network construction. PMID:23226279
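To make the contrast between rank-based and linear association measures concrete, the sketch below computes Pearson and Spearman correlations on a monotonic but nonlinear toy relationship. Spearman is computed as the Pearson correlation of average ranks, which is its standard definition; the helper names and the cubic toy data are invented for the example and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Monotonic but nonlinear relationship between two made-up expression profiles.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

print(pearson(x, y))   # below 1 because the relationship is not linear
print(spearman(x, y))  # essentially 1 because the ordering is preserved
```

This shows only one reason the two measures can disagree; it does not reproduce the evaluation reported in the abstract.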

  14. Context Specific and Differential Gene Co-expression Networks via Bayesian Biclustering.

    PubMed

    Gao, Chuan; McDowell, Ian C; Zhao, Shiwen; Brown, Christopher D; Engelhardt, Barbara E

    2016-07-01

    Identifying latent structure in high-dimensional genomic data is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-regulated genes that covary in all of the samples or in only a subset of the samples. Our biclustering method, BicMix, allows overcomplete representations of the data, computational tractability, and joint modeling of unknown confounders and biological signals. Compared with state-of-the-art biclustering methods, BicMix recovers latent structure with higher precision across diverse simulation scenarios. Further, we develop a principled method to recover context-specific gene co-expression networks from the estimated sparse biclustering matrices. We apply BicMix to breast cancer gene expression data and to gene expression data from a cardiovascular study cohort, and we recover gene co-expression networks that are differential across ER+ and ER- samples and across male and female samples. We apply BicMix to the Genotype-Tissue Expression (GTEx) pilot data, and we find tissue-specific gene networks. We validate these findings by using our tissue-specific networks to identify trans-eQTLs specific to one of four primary tissues. PMID:27467526

  15. Context Specific and Differential Gene Co-expression Networks via Bayesian Biclustering

    PubMed Central

    McDowell, Ian C.; Zhao, Shiwen; Brown, Christopher D.; Engelhardt, Barbara E.

    2016-01-01

    Identifying latent structure in high-dimensional genomic data is essential for exploring biological processes. Here, we consider recovering gene co-expression networks from gene expression data, where each network encodes relationships between genes that are co-regulated by shared biological mechanisms. To do this, we develop a Bayesian statistical model for biclustering to infer subsets of co-regulated genes that covary in all of the samples or in only a subset of the samples. Our biclustering method, BicMix, allows overcomplete representations of the data, computational tractability, and joint modeling of unknown confounders and biological signals. Compared with state-of-the-art biclustering methods, BicMix recovers latent structure with higher precision across diverse simulation scenarios. Further, we develop a principled method to recover context-specific gene co-expression networks from the estimated sparse biclustering matrices. We apply BicMix to breast cancer gene expression data and to gene expression data from a cardiovascular study cohort, and we recover gene co-expression networks that are differential across ER+ and ER- samples and across male and female samples. We apply BicMix to the Genotype-Tissue Expression (GTEx) pilot data, and we find tissue-specific gene networks. We validate these findings by using our tissue-specific networks to identify trans-eQTLs specific to one of four primary tissues. PMID:27467526

  16. Reconstruction of gene co-expression network from microarray data using local expression patterns

    PubMed Central

    2014-01-01

    Background: Biological networks connect genes and gene products to one another. A network of co-regulated genes may form gene clusters that can encode proteins and take part in common biological processes. A gene co-expression network describes inter-relationships among genes. Existing techniques generally depend on proximity measures based on global similarity to draw the relationship between genes. It has been observed that expression profiles share local similarity rather than global similarity. We propose an expression pattern based method called GeCON to extract a Gene CO-expression Network from microarray data. Pair-wise supports are computed for each pair of genes based on changing tendencies and regulation patterns of the gene expression. Gene pairs showing negative or positive co-regulation under a given number of conditions are used to construct such a gene co-expression network. We construct a co-expression network with signed edges to reflect up- and down-regulation between pairs of genes. Most existing techniques do not emphasize computational efficiency. We exploit a fast correlogram matrix based technique for capturing the support of each gene pair to construct the network. Results: We apply GeCON to both real and synthetic gene expression data. We compare our results using the DREAM (Dialogue for Reverse Engineering Assessments and Methods) Challenge data with three well-known algorithms, viz., ARACNE, CLR and MRNET. Our method outperforms the other algorithms based on in silico regulatory network reconstruction. Experimental results show that GeCON can extract functionally enriched network modules from real expression data. Conclusions: In view of the results over several in silico and real expression datasets, the proposed GeCON shows satisfactory performance in predicting co-expression networks in a computationally inexpensive way. We further establish that a simple expression pattern matching is helpful in finding biologically relevant gene networks. In
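The idea of signed edges derived from "changing tendencies" can be illustrated with a simplified sketch. This is not the GeCON algorithm and does not use its correlogram matrix; it only counts the condition-to-condition transitions in which two genes move in the same direction (positive support) or in opposite directions (negative support) and assigns an edge sign when a support threshold is met. The function names, the `min_support` parameter, and the toy profiles are hypothetical.

```python
def tendency(profile):
    """Direction of change between consecutive conditions: +1 up, -1 down, 0 flat."""
    return [(b > a) - (b < a) for a, b in zip(profile, profile[1:])]


def signed_support(p, q):
    """Count transitions where two genes move together (pos) or oppositely (neg)."""
    tp, tq = tendency(p), tendency(q)
    pos = sum(1 for a, b in zip(tp, tq) if a != 0 and a == b)
    neg = sum(1 for a, b in zip(tp, tq) if a != 0 and a == -b)
    return pos, neg


def signed_edge(p, q, min_support):
    """Return +1, -1, or 0 (no edge) given an assumed minimum support threshold."""
    pos, neg = signed_support(p, q)
    if pos >= min_support:
        return 1
    if neg >= min_support:
        return -1
    return 0


# Made-up expression profiles across five conditions.
g1 = [1, 3, 2, 5, 4]
g2 = [2, 6, 5, 9, 7]   # rises and falls together with g1
g3 = [9, 4, 6, 1, 3]   # moves opposite to g1

print(signed_edge(g1, g2, min_support=3))  # positively co-regulated
print(signed_edge(g1, g3, min_support=3))  # negatively co-regulated
```

Because support is counted per transition, two genes can be linked even when they agree on only a subset of conditions, which is the kind of local similarity the abstract contrasts with global proximity measures.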

  17. Massive-Scale Gene Co-Expression Network Construction and Robustness Testing Using Random Matrix Theory

    PubMed Central

    Isaacson, Sven; Luo, Feng; Feltus, Frank A.; Smith, Melissa C.

    2013-01-01

    The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. Such a test requires hundreds of networks and hence a tool to construct them rapidly. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae) networks. We demonstrate a dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust. PMID:23409071

  18. Discovery of core biotic stress responsive genes in Arabidopsis by weighted gene co-expression network analysis.

    PubMed

    Amrine, Katherine C H; Blanco-Ulate, Barbara; Cantu, Dario

    2015-01-01

    Intricate signal networks and transcriptional regulators translate the recognition of pathogens into defense responses. In this study, we carried out a gene co-expression analysis of all currently publicly available microarray data, which were generated in experiments that studied the interaction of the model plant Arabidopsis thaliana with microbial pathogens. This work was conducted to identify (i) modules of functionally related co-expressed genes that are differentially expressed in response to multiple biotic stresses, and (ii) hub genes that may function as core regulators of disease responses. Using Weighted Gene Co-expression Network Analysis (WGCNA) we constructed an undirected network leveraging a rich curated expression dataset comprising 272 microarrays that involved microbial infections of Arabidopsis plants with a wide array of fungal and bacterial pathogens with biotrophic, hemibiotrophic, and necrotrophic lifestyles. WGCNA produced a network with scale-free and small-world properties composed of 205 distinct clusters of co-expressed genes. Modules of functionally related co-expressed genes that are differentially regulated in response to multiple pathogens were identified by integrating differential gene expression testing with functional enrichment analyses of gene ontology terms, known disease associated genes, transcriptional regulators, and cis-regulatory elements. The significance of functional enrichments was validated by comparisons with randomly generated networks. Network topology was then analyzed to identify intra- and inter-modular gene hubs. Based on high connectivity, and centrality in meta-modules that are clearly enriched in defense responses, we propose a list of 66 target genes for reverse genetic experiments to further dissect the Arabidopsis immune system. Our results show that statistical-based data trimming prior to network analysis allows the integration of expression datasets generated by different groups, under different
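The weighting step at the core of WGCNA can be sketched briefly: in an unsigned network, the adjacency between two genes is the absolute correlation raised to a soft-thresholding power beta, and a gene's connectivity is the sum of its adjacencies to all other genes. The power `beta=6`, the zeroed diagonal, and the toy correlation matrix are assumptions made for the example; the abstract does not report the settings used in this study.

```python
def soft_threshold_adjacency(cor, beta=6):
    """Unsigned weighted adjacency a_ij = |cor_ij| ** beta, with the diagonal set to 0."""
    n = len(cor)
    return [[0.0 if i == j else abs(cor[i][j]) ** beta for j in range(n)]
            for i in range(n)]


def connectivity(adj):
    """Weighted degree of each gene: the sum of its adjacencies."""
    return [sum(row) for row in adj]


# Toy correlation matrix for three genes (made-up values).
cor = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, -0.5],
    [0.2, -0.5, 1.0],
]
adj = soft_threshold_adjacency(cor)
k = connectivity(adj)
print(adj[0][1], adj[1][2])
print(k)
```

Raising correlations to a power keeps every gene pair in the network but pushes weak correlations toward zero, which is why no hard cutoff is needed before module detection.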

  19. Discovery of Core Biotic Stress Responsive Genes in Arabidopsis by Weighted Gene Co-Expression Network Analysis

    PubMed Central

    Amrine, Katherine C. H.; Blanco-Ulate, Barbara; Cantu, Dario

    2015-01-01

    Intricate signal networks and transcriptional regulators translate the recognition of pathogens into defense responses. In this study, we carried out a gene co-expression analysis of all currently publicly available microarray data, which were generated in experiments that studied the interaction of the model plant Arabidopsis thaliana with microbial pathogens. This work was conducted to identify (i) modules of functionally related co-expressed genes that are differentially expressed in response to multiple biotic stresses, and (ii) hub genes that may function as core regulators of disease responses. Using Weighted Gene Co-expression Network Analysis (WGCNA) we constructed an undirected network leveraging a rich curated expression dataset comprising 272 microarrays that involved microbial infections of Arabidopsis plants with a wide array of fungal and bacterial pathogens with biotrophic, hemibiotrophic, and necrotrophic lifestyles. WGCNA produced a network with scale-free and small-world properties composed of 205 distinct clusters of co-expressed genes. Modules of functionally related co-expressed genes that are differentially regulated in response to multiple pathogens were identified by integrating differential gene expression testing with functional enrichment analyses of gene ontology terms, known disease associated genes, transcriptional regulators, and cis-regulatory elements. The significance of functional enrichments was validated by comparisons with randomly generated networks. Network topology was then analyzed to identify intra- and inter-modular gene hubs. Based on high connectivity, and centrality in meta-modules that are clearly enriched in defense responses, we propose a list of 66 target genes for reverse genetic experiments to further dissect the Arabidopsis immune system. Our results show that statistical-based data trimming prior to network analysis allows the integration of expression datasets generated by different groups, under different

  20. Characterization of Genes for Beef Marbling Based on Applying Gene Coexpression Network

    PubMed Central

    Lim, Dajeong; Kim, Nam-Kuk; Lee, Seung-Hwan; Park, Hye-Sun; Cho, Yong-Min; Chai, Han-Ha; Kim, Heebal

    2014-01-01

    Marbling is an important trait in characterizing beef quality and a major factor for determining the price of beef in the Korean beef market. In particular, marbling is a complex trait and needs a system-level approach for identifying candidate genes related to the trait. To find candidate genes associated with marbling, we used a weighted gene coexpression network analysis from the expression value of bovine genes. Hub genes were identified; they were topologically centered with large degree and betweenness centrality (BC) values in the global network. We performed gene expression analysis to detect candidate genes in M. longissimus with divergent marbling phenotype (marbling scores 2 to 7) using qRT-PCR. The results demonstrate that transmembrane protein 60 (TMEM60) and dihydropyrimidine dehydrogenase (DPYD) are associated with increasing marbling fat. We suggest that the network-based approach in livestock may be an important method for analyzing the complex effects of candidate genes associated with complex traits like marbling or tenderness. PMID:24624372

  1. A novel method to identify pathways associated with renal cell carcinoma based on a gene co-expression network

    PubMed Central

    RUAN, XIYUN; LI, HONGYUN; LIU, BO; CHEN, JIE; ZHANG, SHIBAO; SUN, ZEQIANG; LIU, SHUANGQING; SUN, FAHAI; LIU, QINGYONG

    2015-01-01

    The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson’s correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842, 371, 2,883 and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson’s correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify the feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient to identify pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425

  2. Meta-Analysis of Differential Connectivity in Gene Co-Expression Networks in Multiple Sclerosis

    PubMed Central

    Creanza, Teresa Maria; Liguori, Maria; Liuni, Sabino; Nuzziello, Nicoletta; Ancona, Nicola

    2016-01-01

    Differential gene expression analyses to investigate multiple sclerosis (MS) molecular pathogenesis cannot detect genes harboring genetic and/or epigenetic modifications that change the gene functions without affecting their expression. Differential co-expression network approaches may capture changes in functional interactions resulting from these alterations. We re-analyzed 595 mRNA arrays from publicly available datasets by studying changes in gene co-expression networks in MS and in response to interferon (IFN)-β treatment. Interestingly, MS networks show a reduced connectivity relative to the healthy condition, and the treatment activates the transcription of genes and increases their connectivity in MS patients. Importantly, the analysis of changes in gene connectivity in MS patients provides new evidence of association for genes already implicated in MS by single-nucleotide polymorphism studies and that do not show differential expression. This is the case of amiloride-sensitive cation channel 1 neuronal (ACCN1) that shows a reduced number of interacting partners in MS networks, and it is known for its role in synaptic transmission and central nervous system (CNS) development. Furthermore, our study confirms a deregulation of the vitamin D system: among the transcription factors that potentially regulate the deregulated genes, we find TCF3 and SP1 that are both involved in vitamin D3-induced p27Kip1 expression. Unveiling differential network properties allows us to gain systems-level insights into disease mechanisms and may suggest putative targets for the treatment. PMID:27314336

  3. Meta-Analysis of Differential Connectivity in Gene Co-Expression Networks in Multiple Sclerosis.

    PubMed

    Creanza, Teresa Maria; Liguori, Maria; Liuni, Sabino; Nuzziello, Nicoletta; Ancona, Nicola

    2016-01-01

    Differential gene expression analyses to investigate multiple sclerosis (MS) molecular pathogenesis cannot detect genes harboring genetic and/or epigenetic modifications that change the gene functions without affecting their expression. Differential co-expression network approaches may capture changes in functional interactions resulting from these alterations. We re-analyzed 595 mRNA arrays from publicly available datasets by studying changes in gene co-expression networks in MS and in response to interferon (IFN)-β treatment. Interestingly, MS networks show a reduced connectivity relative to the healthy condition, and the treatment activates the transcription of genes and increases their connectivity in MS patients. Importantly, the analysis of changes in gene connectivity in MS patients provides new evidence of association for genes already implicated in MS by single-nucleotide polymorphism studies and that do not show differential expression. This is the case of amiloride-sensitive cation channel 1 neuronal (ACCN1) that shows a reduced number of interacting partners in MS networks, and it is known for its role in synaptic transmission and central nervous system (CNS) development. Furthermore, our study confirms a deregulation of the vitamin D system: among the transcription factors that potentially regulate the deregulated genes, we find TCF3 and SP1 that are both involved in vitamin D3-induced p27Kip1 expression. Unveiling differential network properties allows us to gain systems-level insights into disease mechanisms and may suggest putative targets for the treatment. PMID:27314336
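
    The idea of differential connectivity described in the two records above can be illustrated with a minimal sketch. It assumes a WGCNA-style connectivity (the sum of |r|^beta over all other genes) and compares it between two groups; the function names, the soft-threshold beta = 6 and the toy expression values are illustrative assumptions, not the statistic or the data used by the authors.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def connectivity(expr, beta=6):
    """expr maps gene -> expression values; returns gene -> sum of |r|^beta to all other genes."""
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            k[g] += a
            k[h] += a
    return k

def differential_connectivity(expr_case, expr_control, beta=6):
    """Connectivity in cases minus connectivity in controls, per shared gene."""
    k_case = connectivity(expr_case, beta)
    k_ctrl = connectivity(expr_control, beta)
    return {g: k_case[g] - k_ctrl[g] for g in k_case if g in k_ctrl}

# Hypothetical toy data: g1 tracks g2 and g3 in controls but not in cases.
control = {"g1": [1, 2, 3, 4, 5], "g2": [2, 4, 6, 8, 10], "g3": [1, 3, 5, 7, 9]}
case = {"g1": [3, 1, 4, 1, 5], "g2": [2, 4, 6, 8, 10], "g3": [1, 3, 5, 7, 9]}
diff = differential_connectivity(case, control)
```

    In this toy example the expression level of g1 need not differ between groups for its connectivity to drop, which is the kind of change a differential expression test alone would miss.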

  4. Identification of crucial genes in intracranial aneurysm based on weighted gene coexpression network analysis.

    PubMed

    Zheng, X; Xue, C; Luo, G; Hu, Y; Luo, W; Sun, X

    2015-05-01

    The rupture of intracranial aneurysm (IA) is the leading cause for devastating subarachnoid hemorrhage. This study aimed to investigate genes related to IA and potential diagnosis targets. Two data sets (GSE15629 and GSE54083) were downloaded from Gene Expression Omnibus database. GSE15629 contained eight RI (ruptured IA), six UI (unruptured IA) and five control IA samples. GSE54083 included 8 RI, 5 UI and 10 superficial temporal artery samples. In total, 452 differentially expressed genes (DEGs) between RI and control, and 570 DEGs between UI and control, were identified. Protein-protein interaction networks for two kinds of DEGs related to RI and UI were constructed, respectively. Module networks were searched for DEGs related to RI or UI based on WGCNA (weighted gene coexpression network analysis). In the significant modules, FOS, CCL2, COL4A2 and CXCL5 were screened as crucial nodes with high degrees. Among them, FOS and CCL2 were enriched in immune response and COL4A2 was involved in the ECM (extracellular matrix) pathway, whereas CXCL5 was related to cytokine-cytokine receptor pathway. Taken together, FOS, CCL2, COL4A2 and CXCL5 might participate in the pathogenesis of RI or UI, and could serve as potential diagnosis targets. PMID:25721208

  5. Gene Coexpression Analyses Differentiate Networks Associated with Diverse Cancers Harboring TP53 Missense or Null Mutations

    PubMed Central

    Oros Klein, Kathleen; Oualkacha, Karim; Lafond, Marie-Hélène; Bhatnagar, Sahir; Tonin, Patricia N.; Greenwood, Celia M. T.

    2016-01-01

    In a variety of solid cancers, missense mutations in the well-established TP53 tumor suppressor gene may lead to the presence of a partially-functioning protein molecule, whereas mutations affecting the protein encoding reading frame, often referred to as null mutations, result in the absence of p53 protein. Both types of mutations have been observed in the same cancer type. As the resulting tumor biology may be quite different between these two groups, we used RNA-sequencing data from The Cancer Genome Atlas (TCGA) from four different cancers with poor prognosis, namely ovarian, breast, lung and skin cancers, to compare the patterns of coexpression of genes in tumors grouped according to their TP53 missense or null mutation status. We used Weighted Gene Coexpression Network analysis (WGCNA) and a new test statistic built on differences between groups in the measures of gene connectivity. For each cancer, our analysis identified a set of genes showing differential coexpression patterns between the TP53 missense- and null mutation-carrying groups that was robust to the choice of the tuning parameter in WGCNA. After comparing these sets of genes across the four cancers, one gene (KIR3DL2) consistently showed differential coexpression patterns between the null and missense groups. KIR3DL2 is known to play an important role in regulating the immune response, which is consistent with our observation that this gene's strongly-correlated partners implicated many immune-related pathways. Examining mutation-type-related changes in correlations between sets of genes may provide new insight into tumor biology. PMID:27536319

  6. Chronic Ethanol Exposure Produces Time- and Brain Region-Dependent Changes in Gene Coexpression Networks

    PubMed Central

    Osterndorff-Kahanek, Elizabeth A.; Becker, Howard C.; Lopez, Marcelo F.; Farris, Sean P.; Tiwari, Gayatri R.; Nunez, Yury O.; Harris, R. Adron; Mayfield, R. Dayne

    2015-01-01

    Repeated ethanol exposure and withdrawal in mice increases voluntary drinking and represents an animal model of physical dependence. We examined time- and brain region-dependent changes in gene coexpression networks in amygdala (AMY), nucleus accumbens (NAC), prefrontal cortex (PFC), and liver after four weekly cycles of chronic intermittent ethanol (CIE) vapor exposure in C57BL/6J mice. Microarrays were used to compare gene expression profiles at 0-, 8-, and 120-hours following the last ethanol exposure. Each brain region exhibited a large number of differentially expressed genes (2,000-3,000) at the 0- and 8-hour time points, but fewer changes were detected at the 120-hour time point (400-600). Within each region, there was little gene overlap across time (~20%). All brain regions were significantly enriched with differentially expressed immune-related genes at the 8-hour time point. Weighted gene correlation network analysis identified modules that were highly enriched with differentially expressed genes at the 0- and 8-hour time points with virtually no enrichment at 120 hours. Modules enriched for both ethanol-responsive and cell-specific genes were identified in each brain region. These results indicate that chronic alcohol exposure causes global ‘rewiring’ of coexpression systems involving glial and immune signaling as well as neuronal genes. PMID:25803291

  7. Gene co-expression networks shed light into diseases of brain iron accumulation

    PubMed Central

    Bettencourt, Conceição; Forabosco, Paola; Wiethoff, Sarah; Heidari, Moones; Johnstone, Daniel M.; Botía, Juan A.; Collingwood, Joanna F.; Hardy, John; Milward, Elizabeth A.; Ryten, Mina; Houlden, Henry

    2016-01-01

    Aberrant brain iron deposition is observed in both common and rare neurodegenerative disorders, including those categorized as Neurodegeneration with Brain Iron Accumulation (NBIA), which are characterized by focal iron accumulation in the basal ganglia. Two NBIA genes are directly involved in iron metabolism, but whether other NBIA-related genes also regulate iron homeostasis in the human brain, and whether aberrant iron deposition contributes to neurodegenerative processes remains largely unknown. This study aims to expand our understanding of these iron overload diseases and identify relationships between known NBIA genes and their main interacting partners by using a systems biology approach. We used whole-transcriptome gene expression data from human brain samples originating from 101 neuropathologically normal individuals (10 brain regions) to generate weighted gene co-expression networks and cluster the 10 known NBIA genes in an unsupervised manner. We investigated NBIA-enriched networks for relevant cell types and pathways, and whether they are disrupted by iron loading in NBIA diseased tissue and in an in vivo mouse model. We identified two basal ganglia gene co-expression modules significantly enriched for NBIA genes, which resemble neuronal and oligodendrocytic signatures. These NBIA gene networks are enriched for iron-related genes, and implicate synapse and lipid metabolism related pathways. Our data also indicates that these networks are disrupted by excessive brain iron loading. We identified multiple cell types in the origin of NBIA disorders. We also found unforeseen links between NBIA networks and iron-related processes, and demonstrate convergent pathways connecting NBIAs and phenotypically overlapping diseases. Our results are of further relevance for these diseases by providing candidates for new causative genes and possible points for therapeutic intervention. PMID:26707700

  8. Gene co-expression networks shed light into diseases of brain iron accumulation.

    PubMed

    Bettencourt, Conceição; Forabosco, Paola; Wiethoff, Sarah; Heidari, Moones; Johnstone, Daniel M; Botía, Juan A; Collingwood, Joanna F; Hardy, John; Milward, Elizabeth A; Ryten, Mina; Houlden, Henry

    2016-03-01

    Aberrant brain iron deposition is observed in both common and rare neurodegenerative disorders, including those categorized as Neurodegeneration with Brain Iron Accumulation (NBIA), which are characterized by focal iron accumulation in the basal ganglia. Two NBIA genes are directly involved in iron metabolism, but whether other NBIA-related genes also regulate iron homeostasis in the human brain, and whether aberrant iron deposition contributes to neurodegenerative processes remains largely unknown. This study aims to expand our understanding of these iron overload diseases and identify relationships between known NBIA genes and their main interacting partners by using a systems biology approach. We used whole-transcriptome gene expression data from human brain samples originating from 101 neuropathologically normal individuals (10 brain regions) to generate weighted gene co-expression networks and cluster the 10 known NBIA genes in an unsupervised manner. We investigated NBIA-enriched networks for relevant cell types and pathways, and whether they are disrupted by iron loading in NBIA diseased tissue and in an in vivo mouse model. We identified two basal ganglia gene co-expression modules significantly enriched for NBIA genes, which resemble neuronal and oligodendrocytic signatures. These NBIA gene networks are enriched for iron-related genes, and implicate synapse and lipid metabolism related pathways. Our data also indicates that these networks are disrupted by excessive brain iron loading. We identified multiple cell types in the origin of NBIA disorders. We also found unforeseen links between NBIA networks and iron-related processes, and demonstrate convergent pathways connecting NBIAs and phenotypically overlapping diseases. Our results are of further relevance for these diseases by providing candidates for new causative genes and possible points for therapeutic intervention. PMID:26707700

  9. New Method for Joint Network Analysis Reveals Common and Different Coexpression Patterns among Genes and Proteins in Breast Cancer

    PubMed Central

    2016-01-01

    We focus on characterizing common and different coexpression patterns among RNAs and proteins in breast cancer tumors. To address this problem, we introduce Joint Random Forest (JRF), a novel nonparametric algorithm to simultaneously estimate multiple coexpression networks by effectively borrowing information across protein and gene expression data. The performance of JRF was evaluated through extensive simulation studies using different network topologies and data distribution functions. Advantages of JRF over other algorithms that estimate class-specific networks separately were observed across all simulation settings. JRF also outperformed a competing method based on Gaussian graphic models. We then applied JRF to simultaneously construct gene and protein coexpression networks based on protein and RNAseq data from CPTAC-TCGA breast cancer study. We identified interesting common and differential coexpression patterns among genes and proteins. This information can help to cast light on the potential disease mechanisms of breast cancer. PMID:26733076

  10. The Structure of a Gene Co-Expression Network Reveals Biological Functions Underlying eQTLs

    PubMed Central

    Villa-Vialaneix, Nathalie; Liaubet, Laurence; Laurent, Thibault; Cherel, Pierre; Gamot, Adrien; SanCristobal, Magali

    2013-01-01

    What are the commonalities between genes, whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations, and to infer knowledge for yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology. PMID:23577081

  11. The structure of a gene co-expression network reveals biological functions underlying eQTLs.

    PubMed

    Villa-Vialaneix, Nathalie; Liaubet, Laurence; Laurent, Thibault; Cherel, Pierre; Gamot, Adrien; SanCristobal, Magali

    2013-01-01

    What are the commonalities between genes, whose expression level is partially controlled by eQTL, especially with regard to biological functions? Moreover, how are these genes related to a phenotype of interest? These issues are particularly difficult to address when the genome annotation is incomplete, as is the case for mammalian species. Moreover, the direct link between gene expression and a phenotype of interest may be weak, and thus difficult to handle. In this framework, the use of a co-expression network has proven useful: it is a robust approach for modeling a complex system of genetic regulations, and to infer knowledge for yet unknown genes. In this article, a case study was conducted with a mammalian species. It showed that the use of a co-expression network based on partial correlation, combined with a relevant clustering of nodes, leads to an enrichment of biological functions of around 83%. Moreover, the use of a spatial statistics approach allowed us to superimpose additional information related to a phenotype; this led to highlighting specific genes or gene clusters that are related to the network structure and the phenotype. Three main results are worth noting: first, key genes were highlighted as a potential focus for forthcoming biological experiments; second, a set of biological functions, which support a list of genes under partial eQTL control, was set up by an overview of the global structure of the gene expression network; third, pH was found correlated with gene clusters, and then with related biological functions, as a result of a spatial analysis of the network topology. PMID:23577081

  12. A contribution to the study of plant development evolution based on gene co-expression networks

    PubMed Central

    Romero-Campero, Francisco J.; Lucas-Reina, Eva; Said, Fatima E.; Romero, José M.; Valverde, Federico

    2013-01-01

    Phototrophic eukaryotes are among the most successful organisms on Earth due to their unparalleled efficiency at capturing light energy and fixing carbon dioxide to produce organic molecules. A conserved and efficient network of light-dependent regulatory modules could be at the basis of this success. This regulatory system conferred early advantages to phototrophic eukaryotes that allowed for specialization, complex developmental processes and modern plant characteristics. We have studied light-dependent gene regulatory modules from algae to plants employing integrative-omics approaches based on gene co-expression networks. Our study reveals some remarkably conserved ways in which eukaryotic phototrophs deal with day length and light signaling. Here we describe how a family of Arabidopsis transcription factors involved in photoperiod response has evolved from a single algal gene according to the innovation, amplification and divergence theory of gene evolution by duplication. These modifications of the gene co-expression networks from the ancient unicellular green alga Chlamydomonas reinhardtii to the modern brassica Arabidopsis thaliana may hint at the evolution and specialization of plants and other organisms. PMID:23935602

  13. Identification of Common Regulators of Genes in Co-Expression Networks Affecting Muscle and Meat Properties

    PubMed Central

    Ponsuksili, Siriluck; Siengdee, Puntita; Du, Yang; Trakooljul, Nares; Murani, Eduard; Schwerin, Manfred; Wimmers, Klaus

    2015-01-01

    Understanding the genetic contributions behind skeletal muscle composition and metabolism is of great interest in medicine and agriculture. Attempts to dissect these complex traits combine genome-wide genotyping, expression data analyses and network analyses. Weighted gene co-expression network analysis (WGCNA) groups genes into modules based on patterns of co-expression, which can be linked to phenotypes by correlation analysis of trait values and the module eigengenes, i.e. the first principal component of a given module. Network hub genes and regulators of the genes in the modules are likely to play an important role in the emergence of respective traits. In order to detect common regulators of genes in modules showing association with meat quality traits, we identified eQTL for each of these genes, including the highly connected hub genes. Additionally, the module eigengene values were used for association analyses in order to derive a joint eQTL for the respective module. Thereby major sites of orchestrated regulation of genes within trait-associated modules were detected as hotspots of eQTL of many genes of a module and of its eigengene. These sites harbor likely common regulators of genes in the modules. We exemplarily showed the consistent impact of candidate common regulators on the expression of members of respective modules by RNAi knockdown experiments. In fact, Cxcr7 was identified and validated as a regulator of genes in a module, which is involved in the function of defense response in muscle cells. Zfp36l2 was confirmed as a regulator of genes of a module related to cell death or apoptosis pathways. The integration of eQTL in module networks made it possible to interpret the differentially-regulated genes from a systems perspective. By integrating genome-wide genomic and transcriptomic data, employing co-expression and eQTL analyses, the study revealed likely regulators that are involved in the fine-tuning and synchronization of genes with trait

  14. Identification of common regulators of genes in co-expression networks affecting muscle and meat properties.

    PubMed

    Ponsuksili, Siriluck; Siengdee, Puntita; Du, Yang; Trakooljul, Nares; Murani, Eduard; Schwerin, Manfred; Wimmers, Klaus

    2015-01-01

    Understanding the genetic contributions behind skeletal muscle composition and metabolism is of great interest in medicine and agriculture. Attempts to dissect these complex traits combine genome-wide genotyping, expression data analyses and network analyses. Weighted gene co-expression network analysis (WGCNA) groups genes into modules based on patterns of co-expression, which can be linked to phenotypes by correlation analysis of trait values and the module eigengenes, i.e. the first principal component of a given module. Network hub genes and regulators of the genes in the modules are likely to play an important role in the emergence of respective traits. In order to detect common regulators of genes in modules showing association with meat quality traits, we identified eQTL for each of these genes, including the highly connected hub genes. Additionally, the module eigengene values were used for association analyses in order to derive a joint eQTL for the respective module. Thereby major sites of orchestrated regulation of genes within trait-associated modules were detected as hotspots of eQTL of many genes of a module and of its eigengene. These sites harbor likely common regulators of genes in the modules. We exemplarily showed the consistent impact of candidate common regulators on the expression of members of respective modules by RNAi knockdown experiments. In fact, Cxcr7 was identified and validated as a regulator of genes in a module, which is involved in the function of defense response in muscle cells. Zfp36l2 was confirmed as a regulator of genes of a module related to cell death or apoptosis pathways. The integration of eQTL in module networks made it possible to interpret the differentially-regulated genes from a systems perspective. By integrating genome-wide genomic and transcriptomic data, employing co-expression and eQTL analyses, the study revealed likely regulators that are involved in the fine-tuning and synchronization of genes with trait

  15. Signed weighted gene co-expression network analysis of transcriptional regulation in murine embryonic stem cells

    PubMed Central

    Mason, Mike J; Fan, Guoping; Plath, Kathrin; Zhou, Qing; Horvath, Steve

    2009-01-01

    Background: Recent work has revealed that a core group of transcription factors (TFs) regulates the key characteristics of embryonic stem (ES) cells: pluripotency and self-renewal. Current efforts focus on identifying genes that play important roles in maintaining pluripotency and self-renewal in ES cells and aim to understand the interactions among these genes. To that end, we investigated the use of unsigned and signed network analysis to identify pluripotency and differentiation related genes. Results: We show that signed networks provide a better systems level understanding of the regulatory mechanisms of ES cells than unsigned networks, using two independent murine ES cell expression data sets. Specifically, using signed weighted gene co-expression network analysis (WGCNA), we found a pluripotency module and a differentiation module, which are not identified in unsigned networks. We confirmed the importance of these modules by incorporating genome-wide TF binding data for key ES cell regulators. Interestingly, we find that the pluripotency module is enriched with genes related to DNA damage repair and mitochondrial function in addition to transcriptional regulation. Using a connectivity measure of module membership, we not only identify known regulators of ES cells but also show that Mrpl15, Msh6, Nrf1, Nup133, Ppif, Rbpj, Sh3gl2, and Zfp39, among other genes, have important roles in maintaining ES cell pluripotency and self-renewal. We also report highly significant relationships between module membership and epigenetic modifications (histone modifications and promoter CpG methylation status), which are known to play a role in controlling gene expression during ES cell self-renewal and differentiation. Conclusion: Our systems biologic re-analysis of gene expression, transcription factor binding, epigenetic and gene ontology data provides a novel integrative view of ES cell biology. PMID:19619308
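
    The distinction between unsigned and signed networks in the record above can be made concrete with a small sketch. It uses the standard WGCNA adjacency definitions, |r|^beta for unsigned and ((1 + r) / 2)^beta for signed networks; the beta values (6 and 12) and the toy expression profiles are illustrative assumptions rather than settings taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists of expression values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def unsigned_adjacency(r, beta=6):
    """Unsigned network: strong negative and strong positive correlations both give high adjacency."""
    return abs(r) ** beta

def signed_adjacency(r, beta=12):
    """Signed network: negative correlations map towards 0, positive ones towards 1."""
    return ((1 + r) / 2) ** beta

# Hypothetical profiles: gene_b falls as gene_a rises (perfect anticorrelation).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [5.0, 4.0, 3.0, 2.0, 1.0]
r = pearson(gene_a, gene_b)

a_unsigned = unsigned_adjacency(r)  # close to 1: the pair is treated as strongly connected
a_signed = signed_adjacency(r)      # close to 0: the pair is treated as unconnected
```

    Because an unsigned network connects anticorrelated genes as strongly as positively correlated ones, genes with opposing behavior can end up in the same module; a signed network keeps them apart.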

  16. Key genes for modulating information flow play a temporal role as breast tumor coexpression networks are dynamically rewired by letrozole

    PubMed Central

    2013-01-01

    Background Genes do not act in isolation but instead as part of complex regulatory networks. To understand how breast tumors adapt to the presence of the drug letrozole, at the molecular level, it is necessary to consider how the expression levels of genes in these networks change relative to one another. Methods Using transcriptomic data generated from sequential tumor biopsy samples, taken at diagnosis, following 10-14 days and following 90 days of letrozole treatment, and a pairwise partial correlation statistic, we build temporal gene coexpression networks. We characterize the structure of each network and identify genes that hold prominent positions for maintaining network integrity and controlling information-flow. Results Letrozole treatment leads to extensive rewiring of the breast tumor coexpression network. Approximately 20% of gene-gene relationships are conserved over time in the presence of letrozole while 80% of relationships are condition dependent. The positions of influence within the networks are transiently held with few genes stably maintaining high centrality scores across the three time points. Conclusions Genes integral for maintaining network integrity and controlling information flow are dynamically changing as the breast tumor coexpression network adapts to perturbation by the drug letrozole. PMID:23819860
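    As a rough illustration of why a partial correlation can differ from a plain correlation, the sketch below computes a first-order partial correlation between two genes while controlling for a third. This is a simplified stand-in: the exact pairwise partial correlation statistic used in the study is not specified in this record, and the data and function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data: gene z drives both x and y; e1 and e2 are
# gene-specific deviations constructed to be uncorrelated with z and each other.
z = [1, 2, 3, 4, 5, 6, 7, 8]
e1 = [1, -1, -1, 1, 1, -1, -1, 1]
e2 = [1, 1, -1, -1, -1, -1, 1, 1]
x = [zi + ei for zi, ei in zip(z, e1)]
y = [zi + ei for zi, ei in zip(z, e2)]

raw = pearson(x, y)
partial = partial_corr(x, y, z)
print("raw correlation:", round(raw, 3))
print("partial correlation given z:", round(partial, 3))
```

    Here x and y look strongly coexpressed, but the association disappears once the shared driver z is accounted for, so an edge between them would not be drawn in a partial-correlation network.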

  17. Identification of key genes for laryngeal squamous cell carcinoma using weighted co-expression network analysis

    PubMed Central

    LI, XIAO-TIAN

    2016-01-01

    Laryngeal squamous cell carcinoma (LSCC) is the most common malignant tumor in the head and neck, and can seriously affect the daily life of patients. To study the mechanisms of LSCC, the microarray dataset GSE51958 was analyzed in the present study. GSE51958 was downloaded from Gene Expression Omnibus, and included LSCC tissue samples and matched adjacent non-cancerous tissue samples from 10 patients. Differentially expressed genes (DEGs) were identified using the limma package. Next, a weighted co-expression network was constructed for the DEGs using the WGCNA package in R. Modules of the weighted co-expression network were obtained through constructing a hierarchical clustering tree using the hybrid dynamic shear tree method. Using the clusterProfiler package, the potential functions of DEGs in the modules correlated with LSCC were predicted by pathway enrichment analysis. In total, 959 DEGs were screened from the LSCC samples compared with the adjacent non-cancerous samples, including 553 upregulated and 406 downregulated genes. The black, brown, gray, pink and yellow modules were screened for the DEGs in the weighted co-expression network. For the DEGs in the brown and yellow modules, the enriched pathways were cytokine-cytokine receptor interaction and metabolic pathways, respectively. The DEGs in the pink module were involved in the majority of pathways. With high connectivity degrees in the pink module, TPX2, microtubule-associated (TPX2; degree, 25), minichromosome maintenance complex component 2 (MCM2; degree, 25), ubiquitin-like with PHD and ring finger domains 1 (UHRF1; degree, 22), cyclin-dependent kinase 2 (CDK2; degree, 20) and protein regulator of cytokinesis 1 (PRC1; degree, 20) may be involved in LSCC. In conclusion, from the integrated bioinformatics analysis of genes that may be associated with LSCC, 959 DEGs were obtained from LSCC samples compared with adjacent non-cancerous samples, and TPX2, MCM2, UHRF1, CDK2 and PRC1 were

  18. Integrated gene co-expression network analysis in the growth phase of Mycobacterium tuberculosis reveals new potential drug targets.

    PubMed

    Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Verma, Srikant Prasad; Kumar, Sanjiv; Ramachandran, Srinivasan

    2013-11-01

    We have carried out weighted gene co-expression network analysis of Mycobacterium tuberculosis to gain insights into gene expression architecture during log phase growth. The differentially expressed genes between at least one pair of 11 different M. tuberculosis strains as source of biological variability were used for co-expression network analysis. This data included genes with highest coefficient of variation in expression. Five distinct modules were identified using topological overlap based clustering. All the modules together showed significant enrichment in biological processes: fatty acid biosynthesis, cell membrane, intracellular membrane bound organelle, DNA replication, Quinone biosynthesis, cell shape and peptidoglycan biosynthesis, ribosome and structural constituents of ribosome and transposition. We then extracted the co-expressed connections which were supported either by transcriptional regulatory network or STRING database or high edge weight of topological overlap. The genes trpC, nadC, pitA, Rv3404c, atpA, pknA, Rv0996, purB, Rv2106 and Rv0796 emerged as top hub genes. After overlaying this network on the iNJ661 metabolic network, the reactions catalyzed by 15 highly connected metabolic genes were knocked down in silico and evaluated by Flux Balance Analysis. The results showed that in 12 out of 15 cases, in 11 more than 50% of reactions catalyzed by genes connected through co-expressed connections also had altered fluxes. The modules 'Turquoise', 'Blue' and 'Red' also showed enrichment in essential genes. We could map 152 of the previously known or proposed drug targets in these modules and identified 15 new potential drug targets based on their high degree of co-expressed connections and strong correlation with module eigengenes. PMID:24056838
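    The topological overlap used for clustering in this kind of analysis can be sketched as follows. The sketch assumes the standard topological overlap measure for weighted networks, TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij). The adjacency values are made up for illustration, and the function name is hypothetical.

```python
def topological_overlap(adj):
    """Topological overlap matrix for a symmetric weighted adjacency matrix.

    adj[i][j] is assumed to lie in [0, 1]; diagonal entries are ignored.
    """
    n = len(adj)
    # Connectivity of each gene: sum of its adjacencies to all other genes.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                tom[i][j] = 1.0
                continue
            # Strength of connections to shared neighbours u (u != i, j).
            shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
    return tom

# Hypothetical 4-gene network: genes 0 and 1 are weakly linked directly
# but share a strong neighbour (gene 2); gene 3 is nearly isolated.
adj = [
    [1.0, 0.2, 0.9, 0.0],
    [0.2, 1.0, 0.9, 0.0],
    [0.9, 0.9, 1.0, 0.1],
    [0.0, 0.0, 0.1, 1.0],
]
tom = topological_overlap(adj)
print("adjacency(0,1):", adj[0][1])
print("overlap(0,1):", round(tom[0][1], 3))
```

    Genes 0 and 1 have a low direct adjacency but a noticeably higher overlap because of their shared neighbour, which is why overlap-based clustering tends to group genes with similar neighbourhoods.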

  19. ALCOdb: Gene Coexpression Database for Microalgae.

    PubMed

    Aoki, Yuichi; Okamura, Yasunobu; Ohta, Hiroyuki; Kinoshita, Kengo; Obayashi, Takeshi

    2016-01-01

    In the era of energy and food shortage, microalgae have gained much attention as promising sources of biofuels and food ingredients. However, only a small fraction of microalgal genes have been functionally characterized. Here, we have developed the Algae Gene Coexpression database (ALCOdb; http://alcodb.jp), which provides gene coexpression information to survey gene modules for a function of interest. ALCOdb currently supports two model algae: the green alga Chlamydomonas reinhardtii and the red alga Cyanidioschyzon merolae. Users can retrieve coexpression information for genes of interest through three unique data pages: (i) Coexpressed Gene List; (ii) Gene Information; and (iii) Coexpressed Gene Network. In addition to the basal coexpression information, ALCOdb also provides several advanced functionalities such as an expression profile viewer and a differentially expressed gene search tool. Using these user interfaces, we demonstrated that our gene coexpression data have the potential to detect functionally related genes and are useful in extrapolating the biological roles of uncharacterized genes. ALCOdb will facilitate molecular and biochemical studies of microalgal biological phenomena, such as lipid metabolism and organelle development, and promote the evolutionary understanding of plant cellular systems. PMID:26644461

  20. ALCOdb: Gene Coexpression Database for Microalgae

    PubMed Central

    Aoki, Yuichi; Okamura, Yasunobu; Ohta, Hiroyuki; Kinoshita, Kengo; Obayashi, Takeshi

    2016-01-01

    In the era of energy and food shortage, microalgae have gained much attention as promising sources of biofuels and food ingredients. However, only a small fraction of microalgal genes have been functionally characterized. Here, we have developed the Algae Gene Coexpression database (ALCOdb; http://alcodb.jp), which provides gene coexpression information to survey gene modules for a function of interest. ALCOdb currently supports two model algae: the green alga Chlamydomonas reinhardtii and the red alga Cyanidioschyzon merolae. Users can retrieve coexpression information for genes of interest through three unique data pages: (i) Coexpressed Gene List; (ii) Gene Information; and (iii) Coexpressed Gene Network. In addition to the basal coexpression information, ALCOdb also provides several advanced functionalities such as an expression profile viewer and a differentially expressed gene search tool. Using these user interfaces, we demonstrated that our gene coexpression data have the potential to detect functionally related genes and are useful in extrapolating the biological roles of uncharacterized genes. ALCOdb will facilitate molecular and biochemical studies of microalgal biological phenomena, such as lipid metabolism and organelle development, and promote the evolutionary understanding of plant cellular systems. PMID:26644461

  1. Microarray and Co-expression Network Analysis of Genes Associated with Acute Doxorubicin Cardiomyopathy in Mice.

    PubMed

    Wei, Sheng-Nan; Zhao, Wen-Jie; Zeng, Xiang-Jun; Kang, Yu-Ming; Du, Jie; Li, Hui-Hua

    2015-10-01

    Clinical use of doxorubicin (DOX) in cancer therapy is limited by its dose-dependent cardiotoxicity. However, the molecular mechanisms underlying this phenomenon have not been well defined. This study investigated the effect of DOX on global gene expression in the heart. Acute cardiotoxicity was induced by giving C57BL/6J mice a single intraperitoneal injection of DOX (15 mg/kg). Cardiac function and apoptosis were monitored using echocardiography and TUNEL assay at days 1, 3 and 5. Myocardial glucose and ATP levels were measured. Microarray assays were used to screen gene expression profiles in the hearts at day 5, and the results were confirmed with qPCR analysis. DOX administration caused decreased cardiac function, increased cardiomyocyte apoptosis and decreased glucose and ATP levels. Microarrays showed 747 up-regulated genes and 438 down-regulated genes involved in seven main functional categories. Among them, metabolic pathway was the most affected by DOX. Several key genes, including 2,3-bisphosphoglycerate mutase (Bpgm), hexokinase 2, pyruvate dehydrogenase kinase, isoenzyme 4 and fructose-2,6-bisphosphate 2-phosphatase, are closely related to glucose metabolism. Gene co-expression networks suggested the core role of Bpgm in DOX cardiomyopathy. These results obtained in mice were further confirmed in cultured cardiomyocytes. In conclusion, genes involved in glucose metabolism, especially Bpgm, may play a central role in the pathogenesis of DOX-induced cardiotoxicity. PMID:25575753

  2. Shared Pathways Among Autism Candidate Genes Determined by Co-expression Network Analysis of the Developing Human Brain Transcriptome.

    PubMed

    Mahfouz, Ahmed; Ziats, Mark N; Rennert, Owen M; Lelieveldt, Boudewijn P F; Reinders, Marcel J T

    2015-12-01

    Autism spectrum disorder (ASD) is a neurodevelopmental syndrome known to have a significant but complex genetic etiology. Hundreds of diverse genes have been implicated in ASD; yet how so many genes, each with disparate function, can all be linked to a single clinical phenotype remains unclear. We hypothesized that understanding functional relationships between autism candidate genes during normal human brain development may provide convergent mechanistic insight into the genetic heterogeneity of ASD. We analyzed the co-expression relationships of 455 genes previously implicated in autism using the BrainSpan human transcriptome database, across 16 anatomical brain regions spanning prenatal life through adulthood. We discovered modules of ASD candidate genes with biologically relevant temporal co-expression dynamics, which were enriched for functional ontologies related to synaptogenesis, apoptosis, and GABA-ergic neurons. Furthermore, we constructed co-expression networks from the entire transcriptome and found that ASD candidate genes were enriched in modules related to mitochondrial function, protein translation, and ubiquitination. Hub genes central to these ASD-enriched modules were further identified, and their functions supported these ontological findings. Overall, our multi-dimensional co-expression analysis of ASD candidate genes in the normal developing human brain suggests the heterogeneous set of ASD candidates share transcriptional networks related to synapse formation and elimination, protein turnover, and mitochondrial function. PMID:26399424

  3. Comparison of low and high dose ionising radiation using topological analysis of gene coexpression networks

    PubMed Central

    2012-01-01

    Background The growing use of imaging procedures in medicine has raised concerns about exposure to low-dose ionising radiation (LDIR). While the disastrous effects of high dose ionising radiation (HDIR) are well documented, the detrimental effects of LDIR are not well understood and have been a topic of much debate. Since little is known about the effects of LDIR, various kinds of wet-lab and computational analyses are required to advance knowledge in this domain. In this paper we carry out an “upside-down pyramid” form of systems biology analysis of microarray data. We characterised the global genomic response following 10 cGy (low dose) and 100 cGy (high dose) doses of X-ray ionising radiation at four time points by analysing the topology of gene coexpression networks. This study includes a rich experimental design and state-of-the-art computational systems biology methods of analysis to study the differences in the transcriptional response of skin cells exposed to low and high doses of radiation. Results Using this method we found important genes that have been linked to immune response, cell survival and apoptosis. Furthermore, we were able to identify genes such as BRCA1, ABCA1, TNFRSF1B and MLLT11 that have been associated with various types of cancers. We were also able to detect many genes known to be associated with various medical conditions. Conclusions Our method of applying network topological differences can aid in identifying the differences among similar (e.g. radiation effect) yet very different biological conditions (e.g. different dose and time) to generate testable hypotheses. This is the first study where a network level analysis was performed across two different radiation doses at various time points, thereby illustrating changes in the cellular response over time. PMID:22594378

  4. Identification of candidate genes in Populus cell wall biosynthesis using text-mining, co-expression network and comparative genomics

    SciTech Connect

    Yang, Xiaohan; Ye, Chuyu; Bisaria, Anjali; Tuskan, Gerald A; Kalluri, Udaya C

    2011-01-01

    Populus is an important bioenergy crop for bioethanol production. A greater understanding of cell wall biosynthesis processes is critical in reducing biomass recalcitrance, a major hindrance in efficient generation of ethanol from lignocellulosic biomass. Here, we report the identification of candidate cell wall biosynthesis genes through the development and application of a novel bioinformatics pipeline. As a first step, via text-mining of PubMed publications, we obtained 121 Arabidopsis genes that had experimental evidence supporting their involvement in cell wall biosynthesis or remodeling. The 121 genes were then used as bait genes to query an Arabidopsis co-expression database and additional genes were identified as neighbors of the bait genes in the network, increasing the number of genes to 548. The 548 Arabidopsis genes were then used to re-query the Arabidopsis co-expression database and re-construct a network that captured additional network neighbors, expanding to a total of 694 genes. The 694 Arabidopsis genes were computationally divided into 22 clusters. Queries of the Populus genome using the Arabidopsis genes revealed 817 Populus orthologs. Functional analysis of gene ontology and tissue-specific gene expression indicated that these Arabidopsis and Populus genes are high likelihood candidates for functional genomics in relation to cell wall biosynthesis.
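    The two rounds of bait-gene expansion described above can be sketched in a few lines. This is a minimal illustration only: the gene names and edge list are hypothetical placeholders, and the real pipeline queried a co-expression database rather than an in-memory edge list.

```python
def expand_with_neighbors(bait_genes, coexpression_edges):
    """Return the bait genes plus all of their direct co-expression neighbours."""
    neighbors = {}
    for a, b in coexpression_edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    expanded = set(bait_genes)
    for gene in bait_genes:
        expanded |= neighbors.get(gene, set())
    return expanded

# Hypothetical co-expression edges between placeholder gene names.
edges = [("geneA", "geneB"), ("geneB", "geneC"), ("geneC", "geneD"), ("geneE", "geneF")]

baits = {"geneA"}                                  # e.g. genes found by text-mining
round1 = expand_with_neighbors(baits, edges)       # first query of the network
round2 = expand_with_neighbors(round1, edges)      # re-query with the expanded set
print(sorted(round1))
print(sorted(round2))
```

    Each round adds only direct neighbours, so genes with no path to a bait (geneE and geneF here) are never pulled in.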

  5. A Predictive Coexpression Network Identifies Novel Genes Controlling the Seed-to-Seedling Phase Transition in Arabidopsis thaliana

    PubMed Central

    Silva, Anderson Tadeu; Ribone, Pamela A.

    2016-01-01

    The transition from a quiescent dry seed to an actively growing photoautotrophic seedling is a complex and crucial trait for plant propagation. This study provides a detailed description of global gene expression in seven successive developmental stages of seedling establishment in Arabidopsis (Arabidopsis thaliana). Using the transcriptome signature from these developmental stages, we obtained a coexpression gene network that highlights interactions between known regulators of the seed-to-seedling transition and predicts the functions of uncharacterized genes in seedling establishment. The coexpressed gene data sets together with the transcriptional module indicate biological functions related to seedling establishment. Characterization of the homeodomain leucine zipper I transcription factor AtHB13, which is expressed during the seed-to-seedling transition, demonstrated that this gene regulates some of the network nodes and affects late seedling establishment. Knockout mutants for athb13 showed increased primary root length as compared with wild-type (Columbia-0) seedlings, suggesting that this transcription factor is a negative regulator of early root growth, possibly repressing cell division and/or cell elongation or the length of time that cells elongate. The signal transduction pathways present during the early phases of the seed-to-seedling transition anticipate the control of important events for a vigorous seedling, such as root growth. This study demonstrates that a gene coexpression network together with transcriptional modules can provide insights that are not derived from comparative transcript profiling alone. PMID:26888061

  6. Topological and functional discovery in a gene coexpression meta-network of gastric cancer.

    PubMed

    Aggarwal, Amit; Guo, Dong Li; Hoshida, Yujin; Yuen, Siu Tsan; Chu, Kent-Man; So, Samuel; Boussioutas, Alex; Chen, Xin; Bowtell, David; Aburatani, Hiroyuki; Leung, Suet Yi; Tan, Patrick

    2006-01-01

    Gastric cancer is a leading cause of global cancer mortality, but comparatively little is known about the cellular pathways regulating different aspects of the gastric cancer phenotype. To achieve a better understanding of gastric cancer at the levels of systems topology, functional modules, and constituent genes, we assembled and systematically analyzed a consensus gene coexpression meta-network of gastric cancer incorporating >300 tissue samples from four independent patient populations (the "gastrome"). We find that the gastrome exhibits a hierarchical scale-free architecture, with an internal structure comprising multiple deeply embedded modules associated with diverse cellular functions. Individual modules display distinct subtopologies, with some (cellular proliferation) being integrated within the primary network, and others (ribosomal biosynthesis) being relatively isolated. One module associated with intestinal differentiation exhibited a remarkably high degree of autonomy, raising the possibility that its specific topological features may contribute towards the frequent occurrence of intestinal metaplasia in gastric cancer. At the single-gene level, we discovered a novel conserved interaction between the PLA2G2A prognostic marker and the EphB2 receptor, and used tissue microarrays to validate the PLA2G2A/EphB2 association. Finally, because EphB2 is a known target of the Wnt signaling pathway, we tested and provide evidence that the Wnt pathway may also similarly regulate PLA2G2A. Many of these findings were not discernible by studying the single patient populations in isolation. Thus, besides enhancing our knowledge of gastric cancer, our results show the broad utility of applying meta-analytic approaches to genome-wide data for the purposes of biological discovery. PMID:16397236

  7. Discovering gene re-ranking efficiency and conserved gene-gene relationships derived from gene co-expression network analysis on breast cancer data

    PubMed Central

    Bourdakou, Marilena M.; Athanasiadis, Emmanouil I.; Spyrou, George M.

    2016-01-01

    Systemic approaches are essential in the discovery of disease-specific genes, offering a different perspective and new tools on the analysis of several types of molecular relationships, such as gene co-expression or protein-protein interactions. However, due to lack of experimental information, this analysis is not fully applicable. The aim of this study is to reveal the multi-potent contribution of statistical network inference methods in highlighting significant genes and interactions. We have investigated the ability of statistical co-expression networks to highlight and prioritize genes for breast cancer subtypes and stages in terms of: (i) classification efficiency, (ii) gene network pattern conservation, (iii) indication of involved molecular mechanisms and (iv) systems level momentum to drug repurposing pipelines. We have found that statistical network inference methods are advantageous in gene prioritization, are capable of contributing to meaningful network signature discovery, give insights regarding the disease-related mechanisms and boost drug discovery pipelines from a systems point of view. PMID:26892392

  8. Gene co-expression network analysis in Rhodobacter capsulatus and application to comparative expression analysis of Rhodobacter sphaeroides

    SciTech Connect

    Pena-Castillo, Lourdes; Mercer, Ryan; Gurinovich, Anastasia; Callister, Stephen J.; Wright, Aaron T.; Westbye, Alexander; Beatty, J. T.; Lang, Andrew S.

    2014-08-28

    The genus Rhodobacter contains purple nonsulfur bacteria found mostly in freshwater environments. Representative strains of two Rhodobacter species, R. capsulatus and R. sphaeroides, have had their genomes fully sequenced and both have been the subject of transcriptional profiling studies. Gene co-expression networks can be used to identify modules of genes with similar expression profiles. Functional analysis of gene modules can then associate co-expressed genes with biological pathways, and network statistics can determine the degree of module preservation in related networks. In this paper, we constructed an R. capsulatus gene co-expression network, performed functional analysis of identified gene modules, and investigated preservation of these modules in R. capsulatus proteomics data and in R. sphaeroides transcriptomics data. Results: The analysis identified 40 gene co-expression modules in R. capsulatus. Investigation of the module gene contents and expression profiles revealed patterns that were validated based on previous studies supporting the biological relevance of these modules. We identified two R. capsulatus gene modules preserved in the protein abundance data. We also identified several gene modules preserved between both Rhodobacter species, which indicate that these cellular processes are conserved between the species and are candidates for functional information transfer between species. Many gene modules were non-preserved, providing insight into processes that differentiate the two species. In addition, using Local Network Similarity (LNS), a recently proposed metric for expression divergence, we assessed the expression conservation of between-species pairs of orthologs, and within-species gene-protein expression profiles. Conclusions: Our analyses provide new sources of information for functional annotation in R. capsulatus because uncharacterized genes in modules are now connected with groups of genes that constitute a joint functional

  9. Identification of hub genes of pneumocyte senescence induced by thoracic irradiation using weighted gene co-expression network analysis

    PubMed Central

    XING, YONGHUA; ZHANG, JUNLING; LU, LU; LI, DEGUAN; WANG, YUEYING; HUANG, SONG; LI, CHENGCHENG; ZHANG, ZHUBO; LI, JIANGUO; MENG, AIMIN

    2016-01-01

    Irradiation commonly causes pneumocyte senescence, which may lead to severe fatal lung injury characterized by pulmonary dysfunction and respiratory failure. However, the molecular mechanism underlying the induction of pneumocyte senescence by irradiation remains to be elucidated. In the present study, weighted gene co-expression network analysis (WGCNA) was used to screen for differentially expressed genes, and to identify the hub genes and gene modules, which may be critical for senescence. A total of 2,916 differentially expressed genes were identified between the senescence and non-senescence groups following thoracic irradiation. In total, 10 gene modules associated with cell senescence were detected, and six hub genes were identified, including B-cell scaffold protein with ankyrin repeats 1, translocase of outer mitochondrial membrane 70 homolog A, actin filament-associated protein 1, Cd84, Nuf2 and nuclear factor erythroid 2. These genes were markedly associated with cell proliferation, cell division and cell cycle arrest. The results of the present study demonstrated that WGCNA of microarray data may provide further insight into the molecular mechanism underlying pneumocyte senescence. PMID:26572216

  10. Gene Co-Expression Network Analysis for Identifying Modules and Functionally Enriched Pathways in Type 1 Diabetes.

    PubMed

    Riquelme Medina, Ignacio; Lubovac-Pilav, Zelmina

    2016-01-01

    Type 1 diabetes (T1D) is a complex disease, caused by the autoimmune destruction of the insulin producing pancreatic beta cells, resulting in the body's inability to produce insulin. While great efforts have been put into understanding the genetic and environmental factors that contribute to the etiology of the disease, the exact molecular mechanisms are still largely unknown. T1D is a heterogeneous disease, and previous research in this field is mainly focused on the analysis of single genes, or using traditional gene expression profiling, which generally does not reveal the functional context of a gene associated with a complex disorder. However, network-based analysis does take into account the interactions between the diabetes specific genes or proteins and contributes to new knowledge about disease modules, which in turn can be used for identification of potential new biomarkers for T1D. In this study, we analyzed public microarray data of T1D patients and healthy controls by applying a systems biology approach that combines network-based Weighted Gene Co-Expression Network Analysis (WGCNA) with functional enrichment analysis. Novel co-expression gene network modules associated with T1D were elucidated, which in turn provided a basis for the identification of potential pathways and biomarker genes that may be involved in development of T1D. PMID:27257970

  11. Gene Co-Expression Network Analysis for Identifying Modules and Functionally Enriched Pathways in Type 1 Diabetes

    PubMed Central

    Riquelme Medina, Ignacio; Lubovac-Pilav, Zelmina

    2016-01-01

    Type 1 diabetes (T1D) is a complex disease, caused by the autoimmune destruction of the insulin producing pancreatic beta cells, resulting in the body’s inability to produce insulin. While great efforts have been put into understanding the genetic and environmental factors that contribute to the etiology of the disease, the exact molecular mechanisms are still largely unknown. T1D is a heterogeneous disease, and previous research in this field is mainly focused on the analysis of single genes, or using traditional gene expression profiling, which generally does not reveal the functional context of a gene associated with a complex disorder. However, network-based analysis does take into account the interactions between the diabetes specific genes or proteins and contributes to new knowledge about disease modules, which in turn can be used for identification of potential new biomarkers for T1D. In this study, we analyzed public microarray data of T1D patients and healthy controls by applying a systems biology approach that combines network-based Weighted Gene Co-Expression Network Analysis (WGCNA) with functional enrichment analysis. Novel co-expression gene network modules associated with T1D were elucidated, which in turn provided a basis for the identification of potential pathways and biomarker genes that may be involved in development of T1D. PMID:27257970

  12. Assessing the Biological Significance of Gene Expression Signatures and Co-Expression Modules by Studying Their Network Properties

    PubMed Central

    Minguez, Pablo; Dopazo, Joaquin

    2011-01-01

    Microarray experiments have been extensively used to define signatures, which are sets of genes that can be considered markers of experimental conditions (typically diseases). Paradoxically, in spite of the apparent functional role that might be attributed to such gene sets, signatures do not seem to be reproducible across experiments. Given the close relationship between function and protein interaction, network properties can be used to study to what extent signatures are composed of genes whose resulting proteins show a considerable level of interaction (and consequently a putative common functional role). We have analysed 618 signatures and 507 modules of co-expression in cancer looking for significant values of four main protein-protein interaction (PPI) network parameters: connection degree, clustering coefficient, betweenness and number of components. A total of 3904 gene ontology (GO) modules, 146 KEGG pathways, and 263 Biocarta pathways have been used as functional modules of reference. Co-expression modules found in microarray experiments display a high level of connectivity, similar to the one shown by conventional modules based on functional definitions (GO, KEGG and Biocarta). A general observation for all the classes studied is that the networks formed by the modules improve their topological parameters when an external protein is allowed to be introduced within the paths (up to 70% of GO modules show network parameters beyond the random expectation). This fact suggests that functional definitions are incomplete and some genes might still be missing. Conversely, signatures are clearly not capturing the altered functions in the corresponding studies. This is probably because the way in which the genes have been selected in the signatures is too conservative. These results suggest that gene selection methods which take into account relationships among genes should be superior to methods that assume independence among genes outside their functional

  13. Screening genes crucial for pediatric pilocytic astrocytoma using weighted gene coexpression network analysis combined with methylation data analysis.

    PubMed

    Zhao, H; Cai, W; Su, S; Zhi, D; Lu, J; Liu, S

    2014-10-01

    This study aimed to identify novel genes associated with pediatric pilocytic astrocytoma (PA) to better understand the molecular mechanism underlying pediatric PA pathogenesis. Gene expression profile data of GSE50161 and GSE44971 and the methylation data of GSE44684 were downloaded from Gene Expression Omnibus. The differentially expressed genes (DEGs) between PA and normal control samples were screened using the limma package in R, and then used to construct a weighted gene coexpression network (WGCN) using the WGCN analysis (WGCNA) package in R. Significant modules of DEGs were selected using clustering analysis. Functional enrichment analysis of the DEGs in significant modules was performed using the WGCNA package and the clusterProfiler package in R. Correlation between methylation sites of DEGs and PA was analyzed using the CpGassoc package in R. In total, 3479 DEGs were screened in PA samples. Of these, 3424 DEGs were used to construct the WGCN. Several significant modules of DEGs were selected based on the WGCN, in which the turquoise module was positively related to PA, whereas the blue module was negatively related to PA. DEGs (for example, DOCK2 (dedicator of cytokinesis 2), DOCK8 and FCGR2A (Fc fragment of IgG, low affinity IIa)) in the blue module were mainly involved in the Fc gamma R-mediated phagocytosis pathway and the natural killer cell-mediated cytotoxicity pathway. Methylations of 14 DEGs among the top 30 genes in the blue module were related to PA. Our data suggest that DOCK2, DOCK8 and FCGR2A may represent potential therapeutic targets in PA that merit further investigation. PMID:25257306

  14. A genome-wide cis-regulatory element discovery method based on promoter sequences and gene co-expression networks

    PubMed Central

    2013-01-01

    Background Deciphering cis-regulatory networks has become an attractive yet challenging task. This paper presents a simple method for cis-regulatory network discovery which aims to avoid some of the common problems of previous approaches. Results Using promoter sequences and gene expression profiles as input, rather than clustering the genes by the expression data, our method utilizes co-expression neighborhood information for each individual gene, thereby overcoming the disadvantages of current clustering based models which may miss specific information for individual genes. In addition, rather than using a motif database as an input, it implements a simple motif count table for each enumerated k-mer for each gene promoter sequence. Thus, it can be used for species where previous knowledge of cis-regulatory motifs is unknown and has the potential to discover new transcription factor binding sites. Applications on Saccharomyces cerevisiae and Arabidopsis have shown that our method has a good prediction accuracy and outperforms a phylogenetic footprinting approach. Furthermore, the top ranked gene-motif regulatory clusters are evidently functionally co-regulated, and the regulatory relationships between the motifs and the enriched biological functions can often be confirmed by literature. Conclusions Since this method is simple and gene-specific, it can be readily utilized for insufficiently studied species or flexibly used as an additional step or data source for previous transcription regulatory networks discovery models. PMID:23368633
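
    As a rough illustration of the "motif count table for each enumerated k-mer for each gene promoter sequence" described above, the following Python sketch counts all k-mers over A/C/G/T in each promoter. The function name, the gene IDs and the toy sequences are hypothetical and are not taken from the paper; the published method may differ, for example in how it treats reverse complements or ambiguous bases.

```python
from collections import Counter
from itertools import product


def kmer_count_table(promoters, k):
    """Count every k-mer over A/C/G/T in each promoter sequence.

    `promoters` maps a (hypothetical) gene ID to its promoter sequence.
    Returns {gene_id: {kmer: count}} with the same k-mer columns for every gene.
    """
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    table = {}
    for gene_id, seq in promoters.items():
        seq = seq.upper()
        # Slide a window of width k; overlapping occurrences are counted.
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        # Windows containing other symbols (e.g. N) never match an enumerated
        # k-mer, so they are silently ignored.
        table[gene_id] = {kmer: counts.get(kmer, 0) for kmer in all_kmers}
    return table


# Toy input (assumed data, for illustration only).
promoters = {"geneA": "ACGTACGT", "geneB": "TTTTNACG"}
table = kmer_count_table(promoters, 3)
```

    Each row of `table` has 4^k entries, so the rows can be compared across genes or combined with co-expression neighbourhood information in a later step.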

  15. Gene Co-Expression Network Analysis Provides Novel Insights into Myostatin Regulation at Three Different Mouse Developmental Timepoints

    PubMed Central

    Yang, Xuerong; Koltes, James E.; Park, Carissa A.; Chen, Daiwen; Reecy, James M.

    2015-01-01

    Myostatin (Mstn) knockout mice exhibit large increases in skeletal muscle mass. However, relatively few of the genes that mediate or modify MSTN effects are known. In this study, we performed co-expression network analysis using whole transcriptome microarray data from MSTN-null and wild-type mice to identify genes involved in important biological processes and pathways related to skeletal muscle and adipose development. Genes differentially expressed between wild-type and MSTN-null mice were further analyzed for shared DNA motifs using DREME. Differentially expressed genes were identified at 13.5 d.p.c. during primary myogenesis and at d35 during postnatal muscle development, but not at 17.5 d.p.c. during secondary myogenesis. In total, 283 and 2034 genes were differentially expressed at 13.5 d.p.c. and d35, respectively. Over-represented transcription factor binding sites in differentially expressed genes included SMAD3, SP1, ZFP187, and PLAGL1. The use of regulatory (RIF) and phenotypic (PIF) impact factor and differential hubbing co-expression analyses identified both known and potentially novel regulators of skeletal muscle growth, including Apobec2, Atp2a2, and Mmp13 at d35 and Sox2, Tmsb4x, and Vdac1 at 13.5 d.p.c. Among the genes with the highest PIF scores were many fiber type specifying genes. The use of RIF, PIF, and differential hubbing analyses identified both known and potentially novel regulators of muscle development. These results provide new details of how MSTN may mediate transcriptional regulation as well as insight into novel regulators of MSTN signal transduction that merit further study regarding their physiological roles in muscle and adipose development. PMID:25695797

  16. Coexpression network analysis of the genes regulated by two types of resistance responses to powdery mildew in wheat.

    PubMed

    Zhang, Juncheng; Zheng, Hongyuan; Li, Yiwen; Li, Hongjie; Liu, Xin; Qin, Huanju; Dong, Lingli; Wang, Daowen

    2016-01-01

    Powdery mildew disease caused by Blumeria graminis f. sp. tritici (Bgt) inflicts severe economic losses in wheat crops. A systematic understanding of the molecular mechanisms involved in wheat resistance to Bgt is essential for effectively controlling the disease. Here, using the diploid wheat Triticum urartu as a host, the genes regulated by immune (IM) and hypersensitive reaction (HR) resistance responses to Bgt were investigated through transcriptome sequencing. Four gene coexpression networks (GCNs) were developed using transcriptomic data generated for 20 T. urartu accessions showing IM, HR or susceptible responses. The powdery mildew resistance regulated (PMRR) genes whose expression was significantly correlated with Bgt resistance were identified, and they tended to be hubs and enriched in six major modules. A wide occurrence of negative regulation of PMRR genes was observed. Three new candidate immune receptor genes (TRIUR3_13045, TRIUR3_01037 and TRIUR3_06195) positively associated with Bgt resistance were discovered. Finally, the involvement of TRIUR3_01037 in Bgt resistance was tentatively verified through cosegregation analysis in a F2 population and functional expression assay in Bgt susceptible leaf cells. This research provides insights into the global network properties of PMRR genes. Potential molecular differences between IM and HR resistance responses to Bgt are discussed. PMID:27033636

  17. Coexpression network analysis of the genes regulated by two types of resistance responses to powdery mildew in wheat

    PubMed Central

    Zhang, Juncheng; Zheng, Hongyuan; Li, Yiwen; Li, Hongjie; Liu, Xin; Qin, Huanju; Dong, Lingli; Wang, Daowen

    2016-01-01

    Powdery mildew disease caused by Blumeria graminis f. sp. tritici (Bgt) inflicts severe economic losses in wheat crops. A systematic understanding of the molecular mechanisms involved in wheat resistance to Bgt is essential for effectively controlling the disease. Here, using the diploid wheat Triticum urartu as a host, the genes regulated by immune (IM) and hypersensitive reaction (HR) resistance responses to Bgt were investigated through transcriptome sequencing. Four gene coexpression networks (GCNs) were developed using transcriptomic data generated for 20 T. urartu accessions showing IM, HR or susceptible responses. The powdery mildew resistance regulated (PMRR) genes whose expression was significantly correlated with Bgt resistance were identified, and they tended to be hubs and enriched in six major modules. A wide occurrence of negative regulation of PMRR genes was observed. Three new candidate immune receptor genes (TRIUR3_13045, TRIUR3_01037 and TRIUR3_06195) positively associated with Bgt resistance were discovered. Finally, the involvement of TRIUR3_01037 in Bgt resistance was tentatively verified through cosegregation analysis in a F2 population and functional expression assay in Bgt susceptible leaf cells. This research provides insights into the global network properties of PMRR genes. Potential molecular differences between IM and HR resistance responses to Bgt are discussed. PMID:27033636

  18. Learning from Co-expression Networks: Possibilities and Challenges

    PubMed Central

    Serin, Elise A. R.; Nijveen, Harm; Hilhorst, Henk W. M.; Ligterink, Wilco

    2016-01-01

    Plants are fascinating and complex organisms. A comprehensive understanding of the organization, function and evolution of plant genes is essential to disentangle important biological processes and to advance crop engineering and breeding strategies. The ultimate aim in deciphering complex biological processes is the discovery of causal genes and regulatory mechanisms controlling these processes. The recent surge of omics data has opened the door to a system-wide understanding of the flow of biological information underlying complex traits. However, dealing with the corresponding large data sets represents a challenging endeavor that calls for the development of powerful bioinformatics methods. A popular approach is the construction and analysis of gene networks. Such networks are often used for genome-wide representation of the complex functional organization of biological systems. Networks based on similarity in gene expression are called (gene) co-expression networks. One of the major applications of gene co-expression networks is the functional annotation of unknown genes. Constructing co-expression networks is generally straightforward. In contrast, the resulting network of connected genes can become very complex, which limits its biological interpretation. Several strategies can be employed to enhance the interpretation of the networks. A strategy in coherence with the biological question addressed needs to be established to infer reliable networks. Additional benefits can be gained from network-based strategies using prior knowledge and data integration to further enhance the elucidation of gene regulatory relationships. As a result, biological networks provide many more applications beyond the simple visualization of co-expressed genes. In this study we review the different approaches for co-expression network inference in plants. We analyse integrative genomics strategies used in recent studies that successfully identified candidate genes taking advantage of

  19. Time ordering of gene coexpression.

    PubMed

    Leng, Xiaoyan; Müller, Hans-Georg

    2006-10-01

    Temporal microarray gene expression profiles allow characterization of gene function through time dynamics of gene coexpression within the same genetic pathway. In this paper, we define and estimate a global time shift characteristic for each gene via least squares, inferred from pairwise curve alignments. These time shift characteristics of individual genes reflect a time ordering that is derived from observed temporal gene expression profiles. Once these time shift characteristics are obtained for each gene, they can be entered into further analyses, such as clustering. We illustrate the proposed methodology using Drosophila embryonic development and yeast cell-cycle gene expression profiles, as well as simulations. Feasibility is demonstrated through the successful recovery of time ordering. Estimated time shifts for Drosophila maternal and zygotic genes provide excellent discrimination between these two categories and confirm known genetic pathways through the time order of gene expression. The application to yeast cell-cycle data establishes a natural time order of genes that is in line with cell-cycle phases. The method does not require periodicity of gene expression profiles. Asymptotic justifications are also provided. PMID:16495429
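
    One way to read "a global time shift ... via least squares, inferred from pairwise curve alignments" is sketched below in Python. It assumes, purely for illustration, that each pairwise alignment yields an estimate d_ij of s_i - s_j and that the matrix is complete and antisymmetric; under the constraint that the shifts sum to zero, minimising the squared residuals then gives s_i as the mean of row i. The paper's actual estimator may differ, and the function and variable names are hypothetical.

```python
def global_time_shifts(pairwise_shift):
    """Least-squares global shifts from an antisymmetric pairwise shift matrix.

    pairwise_shift[i][j] is an (assumed) estimate of s_i - s_j.
    Minimising sum_{i,j} (d_ij - (s_i - s_j))^2 subject to sum_i s_i = 0
    gives s_i = (1/n) * sum_j d_ij when d_ij = -d_ji.
    """
    n = len(pairwise_shift)
    for i in range(n):
        for j in range(n):
            if abs(pairwise_shift[i][j] + pairwise_shift[j][i]) > 1e-9:
                raise ValueError("pairwise shifts must be antisymmetric")
    return [sum(row) / n for row in pairwise_shift]


# Toy check: build noise-free pairwise shifts from known (made-up) shifts.
true_shifts = [-1.0, 0.0, 1.0]
d = [[si - sj for sj in true_shifts] for si in true_shifts]
shifts = global_time_shifts(d)

# Genes sorted from earliest to latest estimated shift.
order = sorted(range(len(shifts)), key=lambda g: shifts[g])
```

    The resulting `shifts` can then be used as a per-gene feature, for instance as input to clustering, as the abstract suggests.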

  20. Co-expression network analysis of differentially expressed genes associated with metastasis in prolactin pituitary tumors.

    PubMed

    Zhang, Wei; Zang, Zhenle; Song, Yechun; Yang, Hui; Yin, Qing

    2014-07-01

    The aim of the present study was to construct a co-expression network of differentially expressed genes (DEGs) in prolactin pituitary (PRL) tumor metastasis. The gene expression profile GSE22812 was downloaded from the Gene Expression Omnibus database; it includes five non-invasive, two invasive and six aggressive-invasive PRL tumor samples. Compared with non-invasive samples, DEGs were identified in invasive and aggressive-invasive samples using the limma package in R. The expression values of DEGs were hierarchically clustered. Next, Gene Ontology (GO) function enrichment and Kyoto Encyclopedia of Genes and Genomes pathway analysis of DEGs were performed via The Database for Annotation, Visualization and Integrated Discovery. Finally, gene pairs of DEGs between non-invasive and aggressive-invasive samples were identified using Spearman correlation computed with the cor() function in R. Compared with the non-invasive samples, 61 and 89 DEGs were obtained from invasive and aggressive-invasive samples, respectively. Cluster analysis showed that four genes were shared by the two sample groups, including upregulated solute carrier family 2, facilitated glucose transporter member 11 (SLC2A11) and teneurin transmembrane protein 1 (TENM1) and downregulated importin 7 (IPO7) and chromogranin B (CHGB). In the invasive samples, the most significant GO terms were response to cyclic adenosine monophosphate and response to a glucocorticoid stimulus, whereas in aggressive-invasive samples they were cell cycle and response to hormone stimulus. The co-expression network of DEGs showed different gene pairs and modules, and SLC2A11 and CHGB occurred in two co-expression networks within different co-expressed pairs. In the present study, the co-expression network was constructed using bioinformatics methods. SLC2A11, TENM1, IPO7 and CHGB are hypothesized to be closely associated with metastasis of PRL tumors. Furthermore, CHGB and SLC2A11 may be significant in PRL

  1. Coexpression analysis of human genes across many microarray data sets.

    PubMed

    Lee, Homin K; Hsu, Amy K; Sajdak, Jon; Qin, Jie; Pavlidis, Paul

    2004-06-01

    We present a large-scale analysis of mRNA coexpression based on 60 large human data sets containing a total of 3924 microarrays. We sought pairs of genes that were reliably coexpressed (based on the correlation of their expression profiles) in multiple data sets, establishing a high-confidence network of 8805 genes connected by 220,649 "coexpression links" that are observed in at least three data sets. Confirmed positive correlations between genes were much more common than confirmed negative correlations. We show that confirmation of coexpression in multiple data sets is correlated with functional relatedness, and show how cluster analysis of the network can reveal functionally coherent groups of genes. Our findings demonstrate how the large body of accumulated microarray data can be exploited to increase the reliability of inferences about gene function. PMID:15173114
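
    The idea of keeping only "coexpression links" that are observed in at least three data sets can be sketched as follows in Python. This is a minimal illustration under stated assumptions: the correlation cut-off `min_r`, the toy data and all function names are hypothetical, only positive correlations are handled, and the abstract does not specify how the study decided that a pair was coexpressed within a single data set.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def confirmed_links(datasets, min_r=0.8, min_support=3):
    """Return gene pairs whose correlation is >= min_r in >= min_support data sets.

    Each data set maps gene -> expression profile; genes absent from a data set
    simply contribute no evidence there.
    """
    support = {}
    for data in datasets:
        for g1, g2 in combinations(sorted(data), 2):
            if pearson(data[g1], data[g2]) >= min_r:
                support[(g1, g2)] = support.get((g1, g2), 0) + 1
    return {pair for pair, count in support.items() if count >= min_support}


# Toy data (assumed): A and B track each other in all three data sets,
# whereas A-C and B-C are correlated in only one.
datasets = [
    {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]},
    {"A": [1, 3, 2, 5], "B": [2, 5, 4, 9], "C": [1, 3, 2, 5]},
    {"A": [5, 1, 4, 2], "B": [10, 3, 8, 5], "C": [1, 2, 3, 4]},
]
links = confirmed_links(datasets)
```

    Requiring support from several independent data sets trades sensitivity for reliability: a link seen once may reflect a study-specific artefact, while a link seen repeatedly is less likely to.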

  2. Immuno-Navigator, a batch-corrected coexpression database, reveals cell type-specific gene networks in the immune system

    PubMed Central

    Vandenbon, Alexis; Dinh, Viet H.; Mikami, Norihisa; Kitagawa, Yohko; Teraguchi, Shunsuke; Ohkura, Naganari; Sakaguchi, Shimon

    2016-01-01

    High-throughput gene expression data are one of the primary resources for exploring complex intracellular dynamics in modern biology. The integration of large amounts of public data may allow us to examine general dynamical relationships between regulators and target genes. However, obstacles for such analyses are study-specific biases or batch effects in the original data. Here we present Immuno-Navigator, a batch-corrected gene expression and coexpression database for 24 cell types of the mouse immune system. We systematically removed batch effects from the underlying gene expression data and showed that this removal considerably improved the consistency between inferred correlations and prior knowledge. The data revealed widespread cell type-specific correlation of expression. Integrated analysis tools allow users to use this correlation of expression for the generation of hypotheses about biological networks and candidate regulators in specific cell types. We show several applications of Immuno-Navigator as examples. In one application we successfully predicted known regulators of importance in naturally occurring Treg cells from their expression correlation with a set of Treg-specific genes. For one high-scoring gene, integrin β8 (Itgb8), we confirmed an association between Itgb8 expression in forkhead box P3 (Foxp3)-positive T cells and Treg-specific epigenetic remodeling. Our results also suggest that the regulation of Treg-specific genes within Treg cells is relatively independent of Foxp3 expression, supporting recent results pointing to a Foxp3-independent component in the development of Treg cells. PMID:27078110

  3. Functional Analysis and Characterization of Differential Coexpression Networks

    PubMed Central

    Hsu, Chia-Lang; Juan, Hsueh-Fen; Huang, Hsuan-Cheng

    2015-01-01

    Differential coexpression analysis is emerging as a complement to conventional differential gene expression analysis. The identified differential coexpression links can be assembled into a differential coexpression network (DCEN) in response to environmental stresses or genetic changes. Differential coexpression analyses have been successfully used to identify condition-specific modules; however, the structural properties and biological significance of general DCENs have not been well investigated. Here, we analyzed two independent Saccharomyces cerevisiae DCENs constructed from large-scale time-course gene expression profiles in response to different situations. Topological analyses show that DCENs are tree-like networks possessing scale-free characteristics, but not small-world. Functional analyses indicate that differentially coexpressed gene pairs in DCEN tend to link different biological processes, achieving complementary or synergistic effects. Furthermore, the gene pairs lacking common transcription factors are sensitive to perturbation and hence lead to differential coexpression. Based on these observations, we integrated transcriptional regulatory information into DCEN and identified transcription factors that might cause differential coexpression by gain or loss of activation in response to different situations. Collectively, our results not only uncover the unique structural characteristics of DCEN but also provide new insights into interpretation of DCEN to reveal its biological significance and infer the underlying gene regulatory dynamics. PMID:26282208

  4. Rat Hepatocytes Weighted Gene Co-Expression Network Analysis Identifies Specific Modules and Hub Genes Related to Liver Regeneration after Partial Hepatectomy

    PubMed Central

    Zhou, Yun; Xu, Jiucheng; Liu, Yunqing; Li, Juntao; Chang, Cuifang; Xu, Cunshuan

    2014-01-01

    The recovery of liver mass after 2/3 partial hepatectomy (PH) in rats is mainly mediated by proliferation of hepatocytes. Studying the gene expression profiles of hepatocytes after 2/3 PH will be helpful to investigate the molecular mechanisms of liver regeneration (LR). We report here the first application of weighted gene co-expression network analysis (WGCNA) to analyze the biological implications of gene expression changes associated with LR. WGCNA identifies 12 specific gene modules and some hub genes from hepatocyte genome-scale microarray data in rat LR. The results suggest that upregulated MCM5 may promote hepatocyte proliferation during LR; BCL3 may play an important role by activating or inhibiting the NF-κB pathway; and MAPK9 may play a permissive role in DNA replication by p38 MAPK inactivation in the hepatocyte proliferation stage. Thus, WGCNA can provide novel insight into understanding the molecular mechanisms of LR. PMID:24743545
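
    For readers unfamiliar with the weighting step in WGCNA, the Python sketch below shows the soft-threshold (power) adjacency for an unsigned network, a_ij = |cor(x_i, x_j)|^beta, together with a simple connectivity score that can be used to rank candidate hub genes. It is not the WGCNA R package and omits topological overlap and module detection; the power beta = 6, the gene names and the toy profiles are assumptions chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta for i != j."""
    genes = sorted(expr)
    adj = {g: {} for g in genes}
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            a = abs(pearson(expr[g1], expr[g2])) ** beta
            adj[g1][g2] = a
            adj[g2][g1] = a
    return adj


def connectivity(adj):
    """Sum of adjacency weights per gene; high values suggest hub-like genes."""
    return {g: sum(neigh.values()) for g, neigh in adj.items()}


# Toy expression profiles (assumed): g1, g2 and g3 are perfectly (anti)correlated,
# while g4 is only weakly related to the others.
expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [8, 6, 4, 2],
    "g4": [4, 1, 3, 2],
}
adj = soft_threshold_adjacency(expr)
k = connectivity(adj)
```

    Raising the absolute correlation to a power suppresses weak correlations smoothly instead of cutting them off at a hard threshold, which is why `g4` ends up with near-zero connectivity here.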

  5. A Null Model for Pearson Coexpression Networks

    PubMed Central

    Gobbi, Andrea; Jurman, Giuseppe

    2015-01-01

    Gene coexpression networks inferred by correlation from high-throughput profiling such as microarray data represent simple but effective structures for discovering and interpreting linear gene relationships. In recent years, several approaches have been proposed to tackle the problem of deciding when the resulting correlation values are statistically significant. This is most crucial when the number of samples is small, yielding a non-negligible chance that even high correlation values are due to random effects. Here we introduce a novel hard thresholding solution based on the assumption that a coexpression network inferred by randomly generated data is expected to be empty. The threshold is theoretically derived by means of an analytic approach and, as a deterministic independent null model, it depends only on the dimensions of the starting data matrix, with assumptions on the skewness of the data distribution compatible with the structure of gene expression levels data. We show, on synthetic and array datasets, that the proposed threshold is effective in eliminating all false positive links, with an offsetting cost in terms of false negative detected edges. PMID:26030917

  6. A Gene Co-Expression Network in Whole Blood of Schizophrenia Patients Is Independent of Antipsychotic-Use and Enriched for Brain-Expressed Genes

    PubMed Central

    de Jong, Simone; Boks, Marco P. M.; Fuller, Tova F.; Strengman, Eric; Janson, Esther; de Kovel, Carolien G. F.; Ori, Anil P. S.; Vi, Nancy; Mulder, Flip; Blom, Jan Dirk; Glenthøj, Birte; Schubart, Chris D.; Cahn, Wiepke; Kahn, René S.; Horvath, Steve; Ophoff, Roel A.

    2012-01-01

    Despite large-scale genome-wide association studies (GWAS), the underlying genes for schizophrenia are largely unknown. Additional approaches are therefore required to identify the genetic background of this disorder. Here we report findings from a large gene expression study in peripheral blood of schizophrenia patients and controls. We applied a systems biology approach to genome-wide expression data from whole blood of 92 medicated and 29 antipsychotic-free schizophrenia patients and 118 healthy controls. We show that gene expression profiling in whole blood can identify twelve large gene co-expression modules associated with schizophrenia. Several of these disease related modules are likely to reflect expression changes due to antipsychotic medication. However, two of the disease modules could be replicated in an independent second data set involving antipsychotic-free patients and controls. One of these robustly defined disease modules is significantly enriched with brain-expressed genes and with genetic variants that were implicated in a GWAS study, which could imply a causal role in schizophrenia etiology. The most highly connected intramodular hub gene in this module (ABCF1), is located in, and regulated by the major histocompatibility (MHC) complex, which is intriguing in light of the fact that common allelic variants from the MHC region have been implicated in schizophrenia. This suggests that the MHC increases schizophrenia susceptibility via altered gene expression of regulatory genes in this network. PMID:22761806

  7. From SNP co-association to RNA co-expression: Novel insights into gene networks for intramuscular fatty acid composition in porcine

    PubMed Central

    2014-01-01

    Background Fatty acids (FA) play a critical role in energy homeostasis and metabolic diseases; in the context of livestock species, their profile also impacts on meat quality for healthy human consumption. Molecular pathways controlling lipid metabolism are highly interconnected and are not fully understood. Elucidating these molecular processes will aid technological development towards improvement of pork meat quality and increased knowledge of FA metabolism, underpinning metabolic diseases in humans. Results The results from genome-wide association studies (GWAS) across 15 phenotypes were subjected to an Association Weight Matrix (AWM) approach to predict a network of 1,096 genes related to intramuscular FA composition in pigs. To identify the key regulators of FA metabolism, we focused on the minimal set of transcription factors (TF) that explored the majority of the network topology. Pathway and network analyses pointed towards a trio of TF as key regulators of FA metabolism: NCOA2, FHL2 and EP300. Promoter sequence analyses confirmed that these TF have binding sites for some well-known regulators of lipid and carbohydrate metabolism. For the first time in a non-model species, some of the co-associations observed at the genetic level were validated through co-expression at the transcriptomic level based on real-time PCR of 40 genes in adipose tissue, and a further 55 genes in liver. In particular, liver expression of NCOA2 and EP300 differed between pig breeds (Iberian and Landrace) extreme in terms of fat deposition. Highly clustered co-expression networks in both liver and adipose tissues were observed. EP300 and NCOA2 showed centrality parameters above average in both networks. Over all genes, co-expression analyses confirmed 28.9% of the AWM predicted gene-gene interactions in liver and 33.0% in adipose tissue. The magnitude of this validation varied across genes, with up to 60.8% of the connections of NCOA2 in adipose tissue being validated via co-expression

  8. Transcriptome comparison and gene coexpression network analysis provide a systems view of citrus response to ‘Candidatus Liberibacter asiaticus’ infection

    PubMed Central

    2013-01-01

    Background Huanglongbing (HLB) is arguably the most destructive disease for the citrus industry. HLB is caused by infection of the bacterium, Candidatus Liberibacter spp. Several citrus GeneChip studies have revealed thousands of genes that are up- or down-regulated by infection with Ca. Liberibacter asiaticus. However, whether and how these host genes act to protect against HLB remains poorly understood. Results As a first step towards a mechanistic view of citrus in response to the HLB bacterial infection, we performed a comparative transcriptome analysis and found that a total of 21 Probesets are commonly up-regulated by the HLB bacterial infection. In addition, a number of genes are likely regulated specifically at early, late or very late stages of the infection. Furthermore, using Pearson correlation coefficient-based gene coexpression analysis, we constructed a citrus HLB response network consisting of 3,507 Probesets and 56,287 interactions. Genes involved in carbohydrate and nitrogen metabolic processes, transport, defense, signaling and hormone response were overrepresented in the HLB response network and the subnetworks for these processes were constructed. Analysis of the defense and hormone response subnetworks indicates that hormone response is interconnected with defense response. In addition, mapping the commonly up-regulated HLB responsive genes into the HLB response network resulted in a core subnetwork where transport plays a key role in the citrus response to the HLB bacterial infection. Moreover, analysis of a phloem protein subnetwork indicates a role for this protein and zinc transporters or zinc-binding proteins in the citrus HLB defense response. Conclusion Through integrating transcriptome comparison and gene coexpression network analysis, we have provided for the first time a systems view of citrus in response to the Ca. Liberibacter spp. infection causing HLB. PMID:23324561

  9. Protein Co-Expression Network Analysis (ProCoNA)

    SciTech Connect

    Gibbs, David L.; Baratt, Arie; Baric, Ralph; Kawaoka, Yoshihiro; Smith, Richard D.; Orwoll, Eric S.; Katze, Michael G.; Mcweeney, Shannon K.

    2013-06-01

    Biological networks are important for elucidating disease etiology due to their ability to model complex high dimensional data and biological systems. Proteomics provides a critical data source for such models, but currently lacks robust de novo methods for network construction, which could bring important insights in systems biology. We have evaluated the construction of network models using methods derived from weighted gene co-expression network analysis (WGCNA). We show that approximately scale-free peptide networks, composed of statistically significant modules, are feasible and biologically meaningful using two mouse lung experiments and one human plasma experiment. Within each network, peptides derived from the same protein are shown to have a statistically higher topological overlap and concordance in abundance, which is potentially important for inferring protein abundance. The module representatives, called eigenpeptides, correlate significantly with biological phenotypes. Furthermore, within modules, we find significant enrichment for biological function and known interactions (gene ontology and protein-protein interactions). Biological networks are important tools in the analysis of complex systems. In this paper we evaluate the application of weighted co-expression network analysis to quantitative proteomics data. Protein co-expression networks allow novel approaches for biological interpretation, quality control, inference of protein abundance, a framework for potentially resolving degenerate peptide-protein mappings, and a biomarker signature discovery.

  10. Gene coexpression measures in large heterogeneous samples using count statistics.

    PubMed

    Wang, Y X Rachel; Waterman, Michael S; Huang, Haiyan

    2014-11-18

    With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the "big data" challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance. PMID:25288767

  11. Gene coexpression measures in large heterogeneous samples using count statistics

    PubMed Central

    Wang, Y. X. Rachel; Waterman, Michael S.; Huang, Haiyan

    2014-01-01

    With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the “big data” challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance. PMID:25288767

  12. Network-Based Identification of Biomarkers Coexpressed with Multiple Pathways

    PubMed Central

    Guo, Nancy Lan; Wan, Ying-Wooi

    2014-01-01

    Unraveling complex molecular interactions and networks and incorporating clinical information in modeling will present a paradigm shift in molecular medicine. Embedding biological relevance via modeling molecular networks and pathways has become increasingly important for biomarker identification in cancer susceptibility and metastasis studies. Here, we give a comprehensive overview of computational methods used for biomarker identification, and provide a performance comparison of several network models used in studies of cancer susceptibility, disease progression, and prognostication. Specifically, we evaluated implication networks, Boolean networks, Bayesian networks, and Pearson’s correlation networks in constructing gene coexpression networks for identifying lung cancer diagnostic and prognostic biomarkers. The results show that implication networks, implemented in the Genet package, identified sets of biomarkers that generated an accurate prediction of lung cancer risk and metastases; meanwhile, implication networks revealed more biologically relevant molecular interactions than Boolean networks, Bayesian networks, and Pearson’s correlation networks when evaluated with the MSigDB database. PMID:25392692

  13. Mining the bladder cancer-associated genes by an integrated strategy for the construction and analysis of differential co-expression networks

    PubMed Central

    2015-01-01

    Background: Bladder cancer is the most common malignant tumor of the urinary system and is a heterogeneous disease with both superficial and invasive growth. However, its aetiological agent is still unclear, and it is essential to find the key genes or modules that cause bladder cancer. Constructing differential co-expression networks (DCNs) from gene expression microarray datasets is an important method for investigating diseases, and relevant tools exist, such as the R packages 'WGCNA' and 'DCGL'. Results: Employing an integrated strategy, 36 up-regulated differentially expressed genes (DEGs) and 356 down-regulated DEGs were selected. The main functions of those DEGs are cellular physiological process (24 up-regulated DEGs; 167 down-regulated DEGs) and cellular metabolism (19 up-regulated DEGs; 104 down-regulated DEGs). The up-regulated DEGs are mainly involved in pathways related to "metabolism". By comparing two DCNs between the normal and cancer states, we found substantial changes in hub genes and topological structure, which suggest that the modules of the two DCNs differ considerably. In particular, we screened some hub genes of a differential subnetwork between the normal and the cancer states and then performed bioinformatics analysis on them. Conclusions: Through constructing and analyzing two differential co-expression networks at different states using the screened DEGs, we found some hub genes associated with bladder cancer. The results of the bioinformatics analysis for those hub genes will support biological experiments and the further treatment of bladder cancer. PMID:25707808

  14. Integrating mRNA and miRNA Weighted Gene Co-Expression Networks with eQTLs in the Nucleus Accumbens of Subjects with Alcohol Dependence.

    PubMed

    Mamdani, Mohammed; Williamson, Vernell; McMichael, Gowon O; Blevins, Tana; Aliev, Fazil; Adkins, Amy; Hack, Laura; Bigdeli, Tim; van der Vaart, Andrew D; Web, Bradley Todd; Bacanu, Silviu-Alin; Kalsi, Gursharan; Kendler, Kenneth S; Miles, Michael F; Dick, Danielle; Riley, Brien P; Dumur, Catherine; Vladimirov, Vladimir I

    2015-01-01

    Alcohol consumption is known to lead to gene expression changes in the brain. After performing weighted gene co-expression network analyses (WGCNA) on genome-wide mRNA and microRNA (miRNA) expression in Nucleus Accumbens (NAc) of subjects with alcohol dependence (AD; N = 18) and of matched controls (N = 18), six mRNA and three miRNA modules significantly correlated with AD were identified (Bonferroni-adj. p ≤ 0.05). Cell-type-specific transcriptome analyses revealed two of the mRNA modules to be enriched for neuronal specific marker genes and downregulated in AD, whereas the remaining four mRNA modules were enriched for astrocyte and microglial specific marker genes and upregulated in AD. Gene set enrichment analysis demonstrated that neuronal specific modules were enriched for genes involved in oxidative phosphorylation, mitochondrial dysfunction and MAPK signaling. Glial-specific modules were predominantly enriched for genes involved in processes related to immune functions, i.e. cytokine signaling (all adj. p ≤ 0.05). In mRNA and miRNA modules, 461 and 25 candidate hub genes were identified, respectively. In contrast to the expected biological functions of miRNAs, correlation analyses between mRNA and miRNA hub genes revealed a higher number of positive than negative correlations (χ2 test p ≤ 0.0001). Integration of hub gene expression with genome-wide genotypic data resulted in 591 mRNA cis-eQTLs and 62 miRNA cis-eQTLs. mRNA cis-eQTLs were significantly enriched for AD diagnosis and AD symptom counts (adj. p = 0.014 and p = 0.024, respectively) in AD GWAS signals in a large, independent genetic sample from the Collaborative Study on Genetics of Alcohol (COGA). In conclusion, our study identified putative gene network hubs coordinating mRNA and miRNA co-expression changes in the NAc of AD subjects, and our genetic (cis-eQTL) analysis provides novel insights into the etiological mechanisms of AD. PMID:26381263

  15. Integrating mRNA and miRNA Weighted Gene Co-Expression Networks with eQTLs in the Nucleus Accumbens of Subjects with Alcohol Dependence

    PubMed Central

    Blevins, Tana; Aliev, Fazil; Adkins, Amy; Hack, Laura; Bigdeli, Tim; D. van der Vaart, Andrew; Web, Bradley Todd; Bacanu, Silviu-Alin; Kalsi, Gursharan; Kendler, Kenneth S.; Miles, Michael F.; Dick, Danielle; Riley, Brien P.; Dumur, Catherine; Vladimirov, Vladimir I.

    2015-01-01

    Alcohol consumption is known to lead to gene expression changes in the brain. After performing weighted gene co-expression network analyses (WGCNA) on genome-wide mRNA and microRNA (miRNA) expression in Nucleus Accumbens (NAc) of subjects with alcohol dependence (AD; N = 18) and of matched controls (N = 18), six mRNA and three miRNA modules significantly correlated with AD were identified (Bonferroni-adj. p ≤ 0.05). Cell-type-specific transcriptome analyses revealed two of the mRNA modules to be enriched for neuronal specific marker genes and downregulated in AD, whereas the remaining four mRNA modules were enriched for astrocyte and microglial specific marker genes and upregulated in AD. Gene set enrichment analysis demonstrated that neuronal specific modules were enriched for genes involved in oxidative phosphorylation, mitochondrial dysfunction and MAPK signaling. Glial-specific modules were predominantly enriched for genes involved in processes related to immune functions, i.e. cytokine signaling (all adj. p ≤ 0.05). In mRNA and miRNA modules, 461 and 25 candidate hub genes were identified, respectively. In contrast to the expected biological functions of miRNAs, correlation analyses between mRNA and miRNA hub genes revealed a higher number of positive than negative correlations (χ2 test p ≤ 0.0001). Integration of hub gene expression with genome-wide genotypic data resulted in 591 mRNA cis-eQTLs and 62 miRNA cis-eQTLs. mRNA cis-eQTLs were significantly enriched for AD diagnosis and AD symptom counts (adj. p = 0.014 and p = 0.024, respectively) in AD GWAS signals in a large, independent genetic sample from the Collaborative Study on Genetics of Alcohol (COGA). In conclusion, our study identified putative gene network hubs coordinating mRNA and miRNA co-expression changes in the NAc of AD subjects, and our genetic (cis-eQTL) analysis provides novel insights into the etiological mechanisms of AD. PMID:26381263
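
    The comparison of positive and negative correlation counts can be sketched as a one-degree-of-freedom χ2 goodness-of-fit test. The abstract does not state the null hypothesis, so this sketch assumes an equal split between positive and negative correlations; the counts are hypothetical and are not taken from the study.

```python
import math


def chi_square_equal_split(n_pos, n_neg):
    """Chi-square test (1 df) of the null that both signs are equally likely."""
    expected = (n_pos + n_neg) / 2
    chi2 = ((n_pos - expected) ** 2 + (n_neg - expected) ** 2) / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts of positive and negative mRNA-miRNA hub correlations.
chi2, p_value = chi_square_equal_split(70, 30)
```

    The closed-form tail probability holds only for one degree of freedom; a table with more categories would need a general χ2 distribution function.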

  16. Understanding the progression of atherosclerosis through gene profiling and co-expression network analysis in Apob(tm2Sgy)Ldlr(tm1Her) double knockout mice.

    PubMed

    Deshpande, Vrushali; Sharma, Ankit; Mukhopadhyay, Rupak; Thota, Lakshmi Narasimha Rao; Ghatge, Madankumar; Vangala, Rajani Kanth; Kakkar, Vijay V; Mundkur, Lakshmi

    2016-06-01

    The objective of the study was to gain molecular insights into the progression of atherosclerosis in Apob(tm2Sgy)Ldlr(tm1Her) mice, using transcriptome profiles. Weighted gene co-expression network analysis (WGCNA) and time course analysis using limma were used to study disease progression from 0 to 20 weeks. Five co-expression modules were identified by WGCNA using the expression values of 2153 genes. Genes associated with autophagy, endoplasmic reticulum stress, inflammation and lipid metabolism were differentially expressed at early stages of atherosclerosis. Time course analysis highlighted activation of inflammatory gene signaling at 4 weeks, cell proliferation and calcification at 8 weeks, amyloid-like structures and oxidative stress at 14 weeks and enhanced production of inflammatory cytokines at 20 weeks. Our results suggest that maximum gene perturbations occur during early atherosclerosis, which could be the danger signals associated with subclinical disease. Understanding these genes and associated pathways can help improve diagnostic and therapeutic targets for atherosclerosis. PMID:27133569

  17. Map3k1, Il6st, Gzmk, and Hspb3 gene coexpression network in the mechanism of freezing reaction in mice.

    PubMed

    Kondaurova, Elena M; Naumenko, Vladimir S; Sinyakova, Nadezda A; Kulikov, Alexander V

    2011-02-01

    Freezing reaction (catalepsy) is a natural passive defensive strategy in animals. An exaggerated form of catalepsy is a symptom of grave brain dysfunction. Catalepsy in mice was shown to be linked to the Map3k1, Il6st, Gzmk, and Hspb3 genes as potential candidates for a high predisposition to catalepsy. The study sought to test the hypothesis of an association between catalepsy and expression of these genes in the brain. The genes' mRNA levels were measured in the hypothalamus, hippocampus, frontal cortex, striatum, and midbrain of the catalepsy-resistant AKR/J strain and the catalepsy-prone strains CBA/Lac, ASC (antidepressant-sensitive cataleptic) and the congenic line AKR.CBA-D13M76C. No association between expression of any investigated genes and predisposition to catalepsy was found. At the same time, multivariate analysis revealed interactions among the expressions of Map3k1, Il6st, Gzmk, and Hspb3 genes in the brain structures. A factor analysis of all variables produced two independent factors explaining 76.2% of the total variance. The catalepsy-resistant AKR strain was distinguished from the catalepsy-prone strains CBA, ASC, and AKR.CBA-D13M76C by factor 1. It was suggested that a high predisposition to catalepsy in mice can be defined by the Map3k1, Il6st, Gzmk, and Hspb3 genes' coexpression network. PMID:21162133

  18. Discriminative gene co-expression network analysis uncovers novel modules involved in the formation of phosphate deficiency-induced root hairs in Arabidopsis.

    PubMed

    Salazar-Henao, Jorge E; Lin, Wen-Dar; Schmidt, Wolfgang

    2016-01-01

    Cell fate and differentiation in the Arabidopsis root epidermis are genetically defined but remain plastic to environmental signals such as limited availability of inorganic phosphate (Pi). Root hairs of Pi-deficient plants are more frequent and longer than those of plants grown under Pi-replete conditions. To dissect genes involved in Pi deficiency-induced root hair morphogenesis, we constructed a co-expression network of Pi-responsive genes against a customized database that was assembled from experiments in which differentially expressed genes that encode proteins with validated functions in root hair development were over-represented. To further filter out less relevant genes, we combined this procedure with a search for common cis-regulatory elements in the promoters of the selected genes. In addition to well-described players and processes such as auxin signalling and modifications of primary cell walls, we discovered several novel aspects in the biology of root hairs induced by Pi deficiency, including cell cycle control, putative plastid-to-nucleus signalling, pathogen defence, reprogramming of cell wall-related carbohydrate metabolism, and chromatin remodelling. This approach allows the discovery of novel aspects of a biological process from transcriptional profiles with high sensitivity and accuracy. PMID:27220366

  19. Discriminative gene co-expression network analysis uncovers novel modules involved in the formation of phosphate deficiency-induced root hairs in Arabidopsis

    PubMed Central

    Salazar-Henao, Jorge E.; Lin, Wen-Dar; Schmidt, Wolfgang

    2016-01-01

    Cell fate and differentiation in the Arabidopsis root epidermis are genetically defined but remain plastic to environmental signals such as limited availability of inorganic phosphate (Pi). Root hairs of Pi-deficient plants are more frequent and longer than those of plants grown under Pi-replete conditions. To dissect genes involved in Pi deficiency-induced root hair morphogenesis, we constructed a co-expression network of Pi-responsive genes against a customized database that was assembled from experiments in which differentially expressed genes that encode proteins with validated functions in root hair development were over-represented. To further filter out less relevant genes, we combined this procedure with a search for common cis-regulatory elements in the promoters of the selected genes. In addition to well-described players and processes such as auxin signalling and modifications of primary cell walls, we discovered several novel aspects in the biology of root hairs induced by Pi deficiency, including cell cycle control, putative plastid-to-nucleus signalling, pathogen defence, reprogramming of cell wall-related carbohydrate metabolism, and chromatin remodelling. This approach allows the discovery of novel aspects of a biological process from transcriptional profiles with high sensitivity and accuracy. PMID:27220366

  20. Anesthetic Propofol-Induced Gene Expression Changes in Patients Undergoing Coronary Artery Bypass Graft Surgery Based on Dynamical Differential Coexpression Network Analysis

    PubMed Central

    Huang, Li-Jun; Chen, Na-Mi

    2016-01-01

    We aimed to determine the influence of the anesthetic propofol on gene expression in patients treated by coronary artery bypass graft (CABG) surgery based on a differential coexpression network (DCN) and to further reveal the novel mechanisms of the cardioprotective effects of propofol. Firstly, we constructed the DCN for the disease condition based on the Pearson correlation coefficient (PCC) and weight value. Secondly, the inference of modules was applied to search for modules in the DCN with the same members but varied connectivity. Furthermore, we measured the statistical significance of the modules to select differential modules (DMs). Finally, the attract method was used for DM analysis to select key modules. Based on the δ value, 11928 edges and 2956 nodes were chosen to construct DCNs. A total of 29 seed genes were selected. Moreover, by quantifying connectivity changes in shared gene modules across different conditions, 8 DMs with higher connectivity dynamics were identified. Then, we extracted key modules using the attract method; there were 8 key modules, and the top 3 were modules 1, 2, and 3. Furthermore, GCG, PPY, and PON1 were the initial seed genes of these 3 key modules, respectively. Accordingly, GCG and PON1 might play important roles in the cardioprotective effects of propofol during CABG. PMID:27437027

  1. Anesthetic Propofol-Induced Gene Expression Changes in Patients Undergoing Coronary Artery Bypass Graft Surgery Based on Dynamical Differential Coexpression Network Analysis.

    PubMed

    Yu, Da; Huang, Li-Jun; Chen, Na-Mi

    2016-01-01

    We aimed to determine the influence of the anesthetic propofol on gene expression in patients treated by coronary artery bypass graft (CABG) surgery based on a differential coexpression network (DCN) and to further reveal the novel mechanisms of the cardioprotective effects of propofol. Firstly, we constructed the DCN for the disease condition based on the Pearson correlation coefficient (PCC) and weight value. Secondly, the inference of modules was applied to search for modules in the DCN with the same members but varied connectivity. Furthermore, we measured the statistical significance of the modules to select differential modules (DMs). Finally, the attract method was used for DM analysis to select key modules. Based on the δ value, 11928 edges and 2956 nodes were chosen to construct DCNs. A total of 29 seed genes were selected. Moreover, by quantifying connectivity changes in shared gene modules across different conditions, 8 DMs with higher connectivity dynamics were identified. Then, we extracted key modules using the attract method; there were 8 key modules, and the top 3 were modules 1, 2, and 3. Furthermore, GCG, PPY, and PON1 were the initial seed genes of these 3 key modules, respectively. Accordingly, GCG and PON1 might play important roles in the cardioprotective effects of propofol during CABG. PMID:27437027
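
    A differential coexpression edge list of the kind described above can be sketched as follows. The abstract does not define the δ value or the weight, so this sketch assumes that an edge is kept when the absolute change in PCC between two conditions is at least a threshold called delta. The gene names and expression values are hypothetical, and every profile is assumed to have non-zero variance.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def differential_edges(expr_a, expr_b, delta):
    """Keep gene pairs whose PCC changes by at least `delta` between conditions."""
    genes = sorted(expr_a)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            change = abs(pearson(expr_a[g], expr_a[h]) - pearson(expr_b[g], expr_b[h]))
            if change >= delta:
                edges.append((g, h, change))
    return edges


# Hypothetical data: g2 flips its relationship with g1 and g3 in condition B.
cond_a = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]}
cond_b = {"g1": [1, 2, 3, 4], "g2": [8, 6, 4, 2], "g3": [4, 3, 2, 1]}
edges = differential_edges(cond_a, cond_b, delta=1.0)
```

    The pairwise loop is quadratic in the number of genes, so a network with thousands of nodes would normally be computed from a correlation matrix instead.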

  2. Genes Frequently Coexpressed with Hoxc8 Provide Insight into the Discovery of Target Genes.

    PubMed

    Kalyani, Ruthala; Lee, Ji-Yeon; Min, Hyehyun; Yoon, Heejei; Kim, Myoung Hee

    2016-05-31

    Identifying Hoxc8 target genes is at the crux of understanding the Hoxc8-mediated regulatory networks underlying its roles during development. However, identification of these genes remains difficult due to intrinsic factors of Hoxc8, such as low DNA binding specificity, context-dependent regulation, and unknown cofactors. Therefore, as an alternative, the present study attempted to test whether the roles of Hoxc8 could be inferred by simply analyzing genes frequently coexpressed with Hoxc8, and whether these genes include putative target genes. Using archived gene expression datasets in which Hoxc8 was differentially expressed, we identified a total of 567 genes that were positively coexpressed with Hoxc8 in at least four out of eight datasets. Among these, 23 genes were coexpressed in six datasets. Gene sets associated with extracellular matrix and cell adhesion were most significantly enriched, followed by gene sets for skeletal system development, morphogenesis, cell motility, and transcriptional regulation. In particular, transcriptional regulators, including paralogs of Hoxc8, known Hox co-factors, and transcriptional remodeling factors were enriched. We randomly selected Adam19, Ptpn13, Prkd1, Tgfbi, and Aldh1a3, and validated their coexpression in mouse embryonic tissues and cell lines following TGF-β2 treatment or ectopic Hoxc8 expression. Except for Aldh1a3, all genes showed concordant expression with that of Hoxc8, suggesting that the coexpressed genes might include direct or indirect target genes. Collectively, we suggest that the coexpressed genes provide a resource for constructing Hoxc8-mediated regulatory networks. PMID:27025388

  3. Genes Frequently Coexpressed with Hoxc8 Provide Insight into the Discovery of Target Genes

    PubMed Central

    Kalyani, Ruthala; Lee, Ji-Yeon; Min, Hyehyun; Yoon, Heejei; Kim, Myoung Hee

    2016-01-01

    Identifying Hoxc8 target genes is at the crux of understanding the Hoxc8-mediated regulatory networks underlying its roles during development. However, identification of these genes remains difficult due to intrinsic factors of Hoxc8, such as low DNA binding specificity, context-dependent regulation, and unknown cofactors. Therefore, as an alternative, the present study attempted to test whether the roles of Hoxc8 could be inferred by simply analyzing genes frequently coexpressed with Hoxc8, and whether these genes include putative target genes. Using archived gene expression datasets in which Hoxc8 was differentially expressed, we identified a total of 567 genes that were positively coexpressed with Hoxc8 in at least four out of eight datasets. Among these, 23 genes were coexpressed in six datasets. Gene sets associated with extracellular matrix and cell adhesion were most significantly enriched, followed by gene sets for skeletal system development, morphogenesis, cell motility, and transcriptional regulation. In particular, transcriptional regulators, including paralogs of Hoxc8, known Hox co-factors, and transcriptional remodeling factors were enriched. We randomly selected Adam19, Ptpn13, Prkd1, Tgfbi, and Aldh1a3, and validated their coexpression in mouse embryonic tissues and cell lines following TGF-β2 treatment or ectopic Hoxc8 expression. Except for Aldh1a3, all genes showed concordant expression with that of Hoxc8, suggesting that the coexpressed genes might include direct or indirect target genes. Collectively, we suggest that the coexpressed genes provide a resource for constructing Hoxc8-mediated regulatory networks. PMID:27025388
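
    The recurrence filter described above (genes coexpressed in at least four out of eight datasets) amounts to counting, for each gene, the number of datasets in which it appears. A minimal sketch, with hypothetical gene labels and a hypothetical function name:

```python
from collections import Counter


def recurrent_genes(datasets, min_datasets):
    """Return genes present in at least `min_datasets` of the given gene lists."""
    # set() guards against a gene being listed twice within one dataset.
    counts = Counter(gene for dataset in datasets for gene in set(dataset))
    return {gene for gene, n in counts.items() if n >= min_datasets}


# Hypothetical coexpressed-gene lists from four datasets.
datasets = [
    ["geneA", "geneB"],
    ["geneA", "geneA", "geneC"],
    ["geneA", "geneB"],
    ["geneC"],
]
frequent = recurrent_genes(datasets, min_datasets=3)
```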

  4. Transcriptional modules related to hepatocellular carcinoma survival: coexpression network analysis.

    PubMed

    Xu, Xinsen; Zhou, Yanyan; Miao, Runchen; Chen, Wei; Qu, Kai; Pang, Qing; Liu, Chang

    2016-06-01

    We performed weighted gene coexpression network analysis (WGCNA) to gain insights into the molecular aspects of hepatocellular carcinoma (HCC). Raw microarray datasets (including 488 samples) were downloaded from the Gene Expression Omnibus (GEO) website. Data were normalized using the RMA algorithm. We utilized the WGCNA to identify the coexpressed genes (modules) after non-specific filtering. Correlation and survival analyses were conducted using the modules, and gene ontology (GO) enrichment was applied to explore the possible mechanisms. Eight distinct modules were identified by the WGCNA. Pink and red modules were associated with liver function, whereas turquoise and black modules were inversely correlated with tumor staging. Poor outcomes were found in the low expression group in the turquoise module and in the high expression group in the red module. In addition, GO enrichment analysis suggested that inflammation, immune, virus-related, and interferon-mediated pathways were enriched in the turquoise module. Several potential biomarkers, such as cyclin-dependent kinase 1 (CDK1), topoisomerase 2α (TOP2A), and serpin peptidase inhibitor clade C (antithrombin) member 1 (SERPINC1), were also identified. In conclusion, gene signatures identified from the genome-based assays could contribute to HCC stratification. WGCNA was able to identify significant groups of genes associated with cancer prognosis. PMID:27052251
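
    The core step of WGCNA is a soft-thresholded adjacency, which for an unsigned network is a_ij = |cor(i, j)|^β, with connectivity defined as the sum of a gene's adjacencies to all other genes. The sketch below assumes a precomputed correlation matrix and sets the diagonal to zero so that self-connections do not count; the matrix values and the choice β = 2 are purely illustrative and are not taken from the study.

```python
def soft_threshold_adjacency(cor, beta):
    """Unsigned soft-threshold adjacency: |correlation| raised to the power beta."""
    n = len(cor)
    return [
        [0.0 if i == j else abs(cor[i][j]) ** beta for j in range(n)]
        for i in range(n)
    ]


def connectivity(adjacency):
    """Sum of each gene's adjacencies to all other genes."""
    return [sum(row) for row in adjacency]


# Hypothetical 3-gene correlation matrix.
cor = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, -0.5],
    [0.1, -0.5, 1.0],
]
adj = soft_threshold_adjacency(cor, beta=2)
k = connectivity(adj)
```

    Raising correlations to a power shrinks weak correlations far more than strong ones, which is why no hard cutoff is needed before module detection.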

  5. Weighted gene co-expression network analysis of colorectal cancer liver metastasis genome sequencing data and screening of anti-metastasis drugs.

    PubMed

    Gao, Bo; Shao, Qin; Choudhry, Hani; Marcus, Victoria; Dong, Kung; Ragoussis, Jiannis; Gao, Zu-Hua

    2016-09-01

    Approximately 9% of cancer-related deaths are caused by colorectal cancer (CRC). CRC patients are prone to liver metastasis, which is the most important cause of the high CRC mortality rate. Understanding the molecular mechanism of CRC liver metastasis could help us to find novel targets for the effective treatment of this deadly disease. Using weighted gene co-expression network analysis on the sequencing data of CRC with and without metastasis, we identified 5 colorectal cancer liver metastasis related modules, which were labeled as brown, blue, grey, yellow and turquoise. In the brown module, which represents the metastatic tumor in the liver, gene ontology (GO) analysis revealed functions including the G-protein coupled receptor protein signaling pathway, epithelial cell differentiation and cell surface receptor linked signal transduction. In the blue module, which represents the primary CRC that has metastasized, GO analysis showed that the genes were mainly enriched in GO terms including G-protein coupled receptor protein signaling pathway, cell surface receptor linked signal transduction, and negative regulation of cell differentiation. In the yellow and turquoise modules, which represent the primary non-metastatic CRC, 13 downregulated CRC liver metastasis-related candidate miRNAs were identified (e.g. hsa-miR-204, hsa-miR-455, etc.). Furthermore, analyzing the DrugBank database and mining the literature identified 25 and 12 candidate drugs that could potentially block the metastatic processes of the primary tumor and inhibit the progression of metastatic tumors in the liver, respectively. Data generated from this study not only further our understanding of the genetic alterations that drive the metastatic process, but also guide the development of molecular-targeted therapy of colorectal cancer liver metastasis. PMID:27571956

  6. ImmuCo: a database of gene co-expression in immune cells.

    PubMed

    Wang, Pingzhang; Qi, Huiying; Song, Shibin; Li, Shuang; Huang, Ningyu; Han, Wenling; Ma, Dalong

    2015-01-01

    Current gene co-expression databases and correlation networks do not support cell-specific analysis. Gene co-expression and expression correlation are subtly different phenomena, although both are likely to be functionally significant. Here, we report a new database, ImmuCo (http://immuco.bjmu.edu.cn), which is a cell-specific database that contains information about gene co-expression in immune cells, identifying co-expression and correlation between any two genes. The strength of co-expression of queried genes is indicated by signal values and detection calls, whereas expression correlation and strength are reflected by Pearson correlation coefficients. A scatter plot of the signal values is provided to directly illustrate the extent of co-expression and correlation. In addition, the database allows the analysis of cell-specific gene expression profile across multiple experimental conditions and can generate a list of genes that are highly correlated with the queried genes. Currently, the database covers 18 human cell groups and 10 mouse cell groups, including 20,283 human genes and 20,963 mouse genes. More than 8.6 × 10^8 and 7.4 × 10^8 probe set combinations are provided for querying each human and mouse cell group, respectively. Sample applications support the distinctive advantages of the database. PMID:25326331

  7. Co-expression network-based analysis of hippocampal expression data associated with Alzheimer's disease using a novel algorithm

    PubMed Central

    Yue, Hong; Yang, Bo; Yang, Fang; Hu, Xiao-Li; Kong, Fan-Bin

    2016-01-01

    Recent progress in bioinformatics has facilitated the clarification of biological processes associated with complex diseases. Numerous methods of co-expression analysis have been proposed for use in the study of pairwise relationships among genes. In the present study, a combined network based on gene pairs was constructed following the conversion and combination of gene pair score values using a novel algorithm across multiple approaches. Three hippocampal expression profiles of patients with Alzheimer's disease (AD) and normal controls were extracted from the ArrayExpress database, and a total of 144 differentially expressed (DE) genes across multiple studies were identified by a rank product (RP) method. Five groups of co-expression gene pairs and five networks were identified and constructed using four existing methods [weighted gene co-expression network analysis (WGCNA), empirical Bayesian (EB), differentially co-expressed genes and links (DCGL), search tool for the retrieval of interacting genes/proteins database (STRING)] and a novel rank-based algorithm with combined score, respectively. Topological analysis indicated that the co-expression network constructed by the WGCNA method had the tendency to exhibit small-world characteristics, and the combined co-expression network was confirmed to be a scale-free network. Functional analysis of the co-expression gene pairs was conducted by Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. The co-expression gene pairs were mostly enriched in five pathways, namely proteasome, oxidative phosphorylation, Parkinson's disease, Huntington's disease and AD. This study provides a new perspective on co-expression analysis. Since different methods of analysis often present varying abilities, the novel combination algorithm may provide a more credible and robust outcome, and could be used to complement traditional co-expression analysis. PMID:27168792
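
    The rank product (RP) method mentioned above scores each gene by the geometric mean of its ranks across studies, so genes that rank near the top consistently receive small scores. The sketch below shows only that scoring step, under the assumption that rank 1 means most strongly differentially expressed; the significance assessment used in practice (typically permutation-based) is omitted, and the gene labels and ranks are hypothetical.

```python
import math


def rank_product(ranks_by_gene):
    """Geometric mean of each gene's ranks across studies (smaller is stronger)."""
    return {
        gene: math.prod(ranks) ** (1 / len(ranks))
        for gene, ranks in ranks_by_gene.items()
    }


# Hypothetical ranks of two genes in three studies.
scores = rank_product({"geneA": [1, 1, 1], "geneB": [2, 8, 4]})
ordered = sorted(scores, key=scores.get)
```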

  8. Genomic Complexity Places Less Restrictions on the Evolution of Young Coexpression Networks than Protein–Protein Interactions

    PubMed Central

    Wei, Wen; Jin, Yan-Ting; Du, Meng-Ze; Wang, Ju; Rao, Nini; Guo, Feng-Biao

    2016-01-01

    The differences in evolutionary patterns of young protein–protein interactions (PPIs) among distinct species have long been a puzzle. However, based on our genome-wide analysis of available integrated experimental data, we confirm that young genes preferentially integrate into ancestral PPI networks, and that this pattern is consistent across all six model organisms studied, which have widely different levels of phenotypic complexity. We demonstrate that the level of restrictions placed on the evolution of biological networks declines with a decrease of phenotypic complexity. Compared with young PPI networks, new co-expression links have fewer evolutionary restrictions, so young genes with a high probability of being coexpressed with other young genes emerge relatively frequently in the four simpler genomes among the six studied. However, such young–young coexpression is unfavorable for a young gene evolving into a coexpression hub, so the coexpression pattern could gradually decline. To explain this apparent contradiction, we suggest that young genes that are initially peripheral to networks are temporarily coexpressed with other young genes, driving functional evolution because of low selective pressure. However, as the expression levels of genes increase and they gradually develop a greater effect on fitness, young genes start to be coexpressed more with members of ancestral networks and less with other young genes. Our findings provide new insights into the evolution of biological networks. PMID:27521813

  9. Pathways of Lipid Metabolism in Marine Algae, Co-Expression Network, Bottlenecks and Candidate Genes for Enhanced Production of EPA and DHA in Species of Chromista

    PubMed Central

    Mühlroth, Alice; Li, Keshuai; Røkke, Gunvor; Winge, Per; Olsen, Yngvar; Hohmann-Marriott, Martin F.; Vadstein, Olav; Bones, Atle M.

    2013-01-01

    The importance of n-3 long chain polyunsaturated fatty acids (LC-PUFAs) for human health has received more focus in recent decades, and the global consumption of n-3 LC-PUFA has increased. Seafood, the natural n-3 LC-PUFA source, is harvested beyond a sustainable capacity, and it is therefore imperative to develop alternative n-3 LC-PUFA sources for both eicosapentaenoic acid (EPA, 20:5n-3) and docosahexaenoic acid (DHA, 22:6n-3). Genera of algae such as Nannochloropsis, Schizochytrium, Isochrysis and Phaeodactylum within the kingdom Chromista have received attention due to their ability to produce n-3 LC-PUFAs. Knowledge of LC-PUFA synthesis and its regulation in algae at the molecular level is fragmentary and represents a bottleneck for attempts to enhance the n-3 LC-PUFA levels for industrial production. In the present review, Phaeodactylum tricornutum has been used to exemplify the synthesis and compartmentalization of n-3 LC-PUFAs. Based on recent transcriptome data, a co-expression network of 106 genes involved in lipid metabolism has been created. Together with recent molecular biological and metabolic studies, a model pathway for n-3 LC-PUFA synthesis in P. tricornutum has been proposed, and is compared to industrialized species of Chromista. Limitations of the n-3 LC-PUFA synthesis by enzymes such as thioesterases, elongases, acyl-CoA synthetases and acyltransferases are discussed, and metabolic bottlenecks are hypothesized, such as the supply of acetyl-CoA and NADPH. Future industrialization will depend on optimization of chemical compositions and increased biomass production, which can be achieved by exploitation of the physiological potential, by selective breeding and by genetic engineering. PMID:24284429

  10. Pathways of lipid metabolism in marine algae, co-expression network, bottlenecks and candidate genes for enhanced production of EPA and DHA in species of Chromista.

    PubMed

    Mühlroth, Alice; Li, Keshuai; Røkke, Gunvor; Winge, Per; Olsen, Yngvar; Hohmann-Marriott, Martin F; Vadstein, Olav; Bones, Atle M

    2013-11-01

    The importance of n-3 long chain polyunsaturated fatty acids (LC-PUFAs) for human health has received more focus in recent decades, and the global consumption of n-3 LC-PUFA has increased. Seafood, the natural n-3 LC-PUFA source, is harvested beyond a sustainable capacity, and it is therefore imperative to develop alternative n-3 LC-PUFA sources for both eicosapentaenoic acid (EPA, 20:5n-3) and docosahexaenoic acid (DHA, 22:6n-3). Genera of algae such as Nannochloropsis, Schizochytrium, Isochrysis and Phaeodactylum within the kingdom Chromista have received attention due to their ability to produce n-3 LC-PUFAs. Knowledge of LC-PUFA synthesis and its regulation in algae at the molecular level is fragmentary and represents a bottleneck for attempts to enhance the n-3 LC-PUFA levels for industrial production. In the present review, Phaeodactylum tricornutum has been used to exemplify the synthesis and compartmentalization of n-3 LC-PUFAs. Based on recent transcriptome data, a co-expression network of 106 genes involved in lipid metabolism has been created. Together with recent molecular biological and metabolic studies, a model pathway for n-3 LC-PUFA synthesis in P. tricornutum has been proposed, and is compared to industrialized species of Chromista. Limitations of the n-3 LC-PUFA synthesis by enzymes such as thioesterases, elongases, acyl-CoA synthetases and acyltransferases are discussed, and metabolic bottlenecks are hypothesized, such as the supply of acetyl-CoA and NADPH. Future industrialization will depend on optimization of chemical compositions and increased biomass production, which can be achieved by exploitation of the physiological potential, by selective breeding and by genetic engineering. PMID:24284429

  11. Hi-C Chromatin Interaction Networks Predict Co-expression in the Mouse Cortex

    PubMed Central

    Hulsman, Marc; Lelieveldt, Boudewijn P. F.; de Ridder, Jeroen; Reinders, Marcel

    2015-01-01

    The three dimensional conformation of the genome in the cell nucleus influences important biological processes such as gene expression regulation. Recent studies have shown a strong correlation between chromatin interactions and gene co-expression. However, predicting gene co-expression from frequent long-range chromatin interactions remains challenging. We address this by characterizing the topology of the cortical chromatin interaction network using scale-aware topological measures. We demonstrate that based on these characterizations it is possible to accurately predict spatial co-expression between genes in the mouse cortex. Consistent with previous findings, we find that the chromatin interaction profile of a gene-pair is a good predictor of their spatial co-expression. However, the accuracy of the prediction can be substantially improved when chromatin interactions are described using scale-aware topological measures of the multi-resolution chromatin interaction network. We conclude that, for co-expression prediction, it is necessary to take into account different levels of chromatin interactions ranging from direct interaction between genes (i.e. small-scale) to chromatin compartment interactions (i.e. large-scale). PMID:25965262

  12. Differential Regulatory Analysis Based on Coexpression Network in Cancer Research.

    PubMed

    Li, Junyi; Li, Yi-Xue; Li, Yuan-Yuan

    2016-01-01

    With rapid development of high-throughput techniques and accumulation of big transcriptomic data, plenty of computational methods and algorithms such as differential analysis and network analysis have been proposed to explore genome-wide gene expression characteristics. These efforts aim to transform underlying genomic information into valuable knowledge in biological and medical research fields. Recently, tremendous integrative research methods are dedicated to interpret the development and progress of neoplastic diseases, whereas differential regulatory analysis (DRA) based on gene coexpression network (GCN) increasingly plays a robust complement to regular differential expression analysis in revealing regulatory functions of cancer related genes such as evading growth suppressors and resisting cell death. Differential regulatory analysis based on GCN is prospective and shows its essential role in discovering the system properties of carcinogenesis features. Here we briefly review the paradigm of differential regulatory analysis based on GCN. We also focus on the applications of differential regulatory analysis based on GCN in cancer research and point out that DRA is necessary and extraordinary to reveal underlying molecular mechanisms in large-scale carcinogenesis studies. PMID:27597964

  13. Differential Regulatory Analysis Based on Coexpression Network in Cancer Research

    PubMed Central

    2016-01-01

    With rapid development of high-throughput techniques and accumulation of big transcriptomic data, plenty of computational methods and algorithms such as differential analysis and network analysis have been proposed to explore genome-wide gene expression characteristics. These efforts aim to transform underlying genomic information into valuable knowledge in biological and medical research fields. Recently, tremendous integrative research methods are dedicated to interpret the development and progress of neoplastic diseases, whereas differential regulatory analysis (DRA) based on gene coexpression network (GCN) increasingly plays a robust complement to regular differential expression analysis in revealing regulatory functions of cancer related genes such as evading growth suppressors and resisting cell death. Differential regulatory analysis based on GCN is prospective and shows its essential role in discovering the system properties of carcinogenesis features. Here we briefly review the paradigm of differential regulatory analysis based on GCN. We also focus on the applications of differential regulatory analysis based on GCN in cancer research and point out that DRA is necessary and extraordinary to reveal underlying molecular mechanisms in large-scale carcinogenesis studies. PMID:27597964

  14. ComPlEx: conservation and divergence of co-expression networks in A. thaliana, Populus and O. sativa

    PubMed Central

    2014-01-01

    Background Divergence in gene regulation has emerged as a key mechanism underlying species differentiation. Comparative analysis of co-expression networks across species can reveal conservation and divergence in the regulation of genes. Results We inferred co-expression networks of A. thaliana, Populus spp. and O. sativa using state-of-the-art methods based on mutual information and context likelihood of relatedness, and conducted a comprehensive comparison of these networks across a range of co-expression thresholds. In addition to quantifying gene-gene link and network neighbourhood conservation, we also applied recent advancements in network analysis to do cross-species comparisons of network properties such as scale free characteristics and gene centrality as well as network motifs. We found that in all species the networks emerged as scale free only above a certain co-expression threshold, and that the high-centrality genes upholding this organization tended to be conserved. Network motifs, in particular the feed-forward loop, were found to be significantly enriched in specific functional subnetworks but were much less conserved across species than gene centrality. Although individual gene-gene co-expression had massively diverged, up to ~80% of the genes still had a significantly conserved network neighbourhood. For genes with multiple predicted orthologs, about half had one ortholog with conserved regulation and another ortholog with diverged or non-conserved regulation. Furthermore, the most sequence similar ortholog was not the one with the most conserved gene regulation in over half of the cases. Conclusions We have provided a comprehensive analysis of gene regulation evolution in plants and built a web tool for Comparative analysis of Plant co-Expression networks (ComPlEx, http://complex.plantgenie.org/). The tool can be particularly useful for identifying the ortholog with the most conserved regulation among several sequence-similar alternatives and

  15. MIClique: An algorithm to identify differentially coexpressed disease gene subset from microarray data.

    PubMed

    Zhang, Huanping; Song, Xiaofeng; Wang, Huinan; Zhang, Xiaobai

    2009-01-01

    Computational analysis of microarray data has provided an effective way to identify disease-related genes. Traditional methods for selecting disease genes from microarray data, such as statistical tests, typically focus on genes that are differentially expressed between samples and prioritize each gene individually. These traditional methods might miss differentially coexpressed (DCE) gene subsets because they ignore the interaction between genes. In this paper, the MIClique algorithm is proposed to identify DCE gene subsets based on mutual information and clique analysis. Mutual information is used to measure the coexpression relationship between each pair of genes in two different kinds of samples. Clique analysis is a commonly used method in biological networks, where a clique generally represents a biological module of similar function. By applying the MIClique algorithm to real gene expression data, some DCE gene subsets that are correlated under one experimental condition but uncorrelated under another are detected from the graphs of a colon dataset and a leukemia dataset. PMID:20169000
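
    The abstract above does not give the details of MIClique, so the following is only a minimal sketch of the general idea it describes: score gene pairs by mutual information in two conditions, keep pairs that are coexpressed in one condition but not in the other, and report maximal cliques of the resulting graph. The function names, the thresholds `high` and `low`, and the binarized toy data are all assumptions made for illustration; the clique step uses the basic Bron-Kerbosch enumeration, which may differ from what the authors used.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(x, y):
    """Mutual information (bits) between two equally long discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def dce_edges(cond_a, cond_b, high=0.8, low=0.2):
    """Gene pairs with high MI in condition A and low MI in condition B.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    return {
        (g, h)
        for g, h in combinations(sorted(cond_a), 2)
        if mutual_information(cond_a[g], cond_a[h]) >= high
        and mutual_information(cond_b[g], cond_b[h]) <= low
    }


def maximal_cliques(edges):
    """Enumerate maximal cliques of an undirected graph (basic Bron-Kerbosch)."""
    adj = {}
    for g, h in edges:
        adj.setdefault(g, set()).add(h)
        adj.setdefault(h, set()).add(g)
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(sorted(r))
            return
        for v in sorted(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


# Hypothetical binarized expression (0 = low, 1 = high) over 8 samples.
cond_a = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 1],
    "g2": [0, 0, 0, 0, 1, 1, 1, 1],
    "g3": [0, 0, 0, 0, 1, 1, 1, 1],
    "g4": [0, 1, 0, 1, 0, 1, 0, 1],
}
cond_b = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 1],
    "g2": [0, 1, 0, 1, 0, 1, 0, 1],
    "g3": [0, 0, 1, 1, 0, 0, 1, 1],
    "g4": [0, 1, 1, 0, 0, 1, 1, 0],
}

edges = dce_edges(cond_a, cond_b)
subsets = maximal_cliques(edges)
print(subsets)  # [['g1', 'g2', 'g3']]
```

    In this toy example g1, g2 and g3 move together in condition A and independently in condition B, so they form the only clique; g4 is unrelated to the others in both conditions and is left out.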

  16. VTCdb: a gene co-expression database for the crop species Vitis vinifera (grapevine)

    PubMed Central

    2013-01-01

    Background Gene expression datasets in model plants such as Arabidopsis have contributed to our understanding of gene function and how a single underlying biological process can be governed by a diverse network of genes. The accumulation of publicly available microarray data encompassing a wide range of biological and environmental conditions has enabled the development of additional capabilities including gene co-expression analysis (GCA). GCA is based on the understanding that genes encoding proteins involved in similar and/or related biological processes may exhibit comparable expression patterns over a range of experimental conditions, developmental stages and tissues. We present an open access database for the investigation of gene co-expression networks within the cultivated grapevine, Vitis vinifera. Description The new gene co-expression database, VTCdb (http://vtcdb.adelaide.edu.au/Home.aspx), offers an online platform for transcriptional regulatory inference in the cultivated grapevine. Using condition-independent and condition-dependent approaches, grapevine co-expression networks were constructed using the latest publicly available microarray datasets from diverse experimental series, utilising the Affymetrix Vitis vinifera GeneChip (16 K) and the NimbleGen Grape Whole-genome microarray chip (29 K), thus making it possible to profile approximately 29,000 genes (95% of the predicted grapevine transcriptome). Applications available with the online platform include the use of gene names, probesets, modules or biological processes to query the co-expression networks, with the option to choose between Affymetrix or Nimblegen datasets and between multiple co-expression measures. Alternatively, the user can browse existing network modules using interactive network visualisation and analysis via CytoscapeWeb. To demonstrate the utility of the database, we present examples from three fundamental biological processes (berry development, photosynthesis and

  17. Co-expression analysis reveals a group of genes potentially involved in regulation of plant response to iron-deficiency.

    PubMed

    Li, Hua; Wang, Lei; Yang, Zhi Min

    2015-01-01

    Iron (Fe) is an essential element for plant growth and development. Iron deficiency results in abnormal metabolism from respiration to photosynthesis. Exploration of Fe-deficient responsive genes and their networks is critically important to understand molecular mechanisms leading to plant adaptation to soil Fe-limitation. Co-expression genes are a cluster of genes that have a similar expression pattern to execute related biological functions at a stage of development or under a certain environmental condition. They may share a common regulatory mechanism. In this study, we investigated Fe-starved-related co-expression genes from Arabidopsis. From the biological process GO annotation of TAIR (The Arabidopsis Information Resource), 180 iron-deficient responsive genes were detected. Using the ATTED-II database, we generated six gene co-expression networks. Among these, two modules of PYE and IRT1 were successfully constructed. There are 30 co-expression genes that are incorporated in the two modules (12 in PYE-module and 18 in IRT1-module). Sixteen of the co-expression genes were well characterized. The remaining genes (14) are poorly or not functionally identified with iron stress. Validation of the 14 genes using real-time PCR showed differential expression under iron-deficiency. Most of the co-expression genes (23/30) could be validated in pye and fit mutant plants with iron-deficiency. We further identified iron-responsive cis-elements upstream of the co-expression genes and found that 22 out of 30 genes contain the iron-responsive motif IDE1. Furthermore, some auxin and ethylene-responsive elements were detected in the promoters of the co-expression genes. These results suggest that some of the genes may also be involved in iron stress response through the phytohormone-responsive pathways. PMID:25300251

  18. Co-expression network analysis identifies transcriptional modules in the mouse liver.

    PubMed

    Liu, Wei; Ye, Hua

    2014-10-01

    The mouse liver transcriptome has been extensively studied but little is known about the global hepatic gene network of the mouse under normal physiological conditions. Understanding this will help reveal the transcriptional organization of the liver and elucidate its functional complexity. Here, weighted gene co-expression network analysis (WGCNA) was carried out to explore gene co-expression networks using large-scale microarray data from normal mouse livers. A total of 7,203 genes were parsed into 16 gene modules associated with protein catabolism, RNA processing, muscle contraction, transcriptional regulation, oxidation reduction, sterol biosynthesis, translation, fatty acid metabolism, immune response and others. The modules were organized into higher order co-expression groups. Hub genes in each module were found to be critical for module function. In sum, the analyses revealed the gene modular map of the mouse liver under normal physiological condition. These results provide a systems-level framework to help understand the complexity of the mouse liver at the molecular level, and should be beneficial in annotating uncharacterized genes. PMID:24816893
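
    For readers unfamiliar with WGCNA, its core step for an unsigned network is a soft-thresholded adjacency, a_ij = |cor(x_i, x_j)|^beta, from which a per-gene connectivity (the sum of a gene's adjacencies) can be used to rank hub candidates. The sketch below illustrates only that step on made-up data; the gene names, the expression values and the choice beta = 6 are assumptions, and the abstract does not report the parameters actually used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(i, j)| ** beta for all i != j."""
    genes = sorted(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


def connectivity(adj):
    """Sum of adjacency weights per gene; large values suggest hub genes."""
    k = {}
    for (g, _), w in adj.items():
        k[g] = k.get(g, 0.0) + w
    return k


# Hypothetical centered expression profiles over four samples.
expr = {
    "hub": [1, 1, -1, -1],
    "a": [1, 0, -1, 0],
    "b": [0, 1, 0, -1],
}

adj = soft_threshold_adjacency(expr, beta=6)
k = connectivity(adj)
top = max(k, key=k.get)
print(top)  # hub
```

    Raising the correlation to a power keeps strong correlations and pushes weak ones toward zero without imposing a hard cutoff, which is why the gene correlated with both others ends up with the highest connectivity here.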

  19. Sharing and Specificity of Co-expression Networks across 35 Human Tissues

    PubMed Central

    Pierson, Emma; Koller, Daphne; Battle, Alexis; Mostafavi, Sara

    2015-01-01

    To understand the regulation of tissue-specific gene expression, the GTEx Consortium generated RNA-seq expression data for more than thirty distinct human tissues. This data provides an opportunity for deriving shared and tissue specific gene regulatory networks on the basis of co-expression between genes. However, a small number of samples are available for a majority of the tissues, and therefore statistical inference of networks in this setting is highly underpowered. To address this problem, we infer tissue-specific gene co-expression networks for 35 tissues in the GTEx dataset using a novel algorithm, GNAT, that uses a hierarchy of tissues to share data between related tissues. We show that this transfer learning approach increases the accuracy with which networks are learned. Analysis of these networks reveals that tissue-specific transcription factors are hubs that preferentially connect to genes with tissue specific functions. Additionally, we observe that genes with tissue-specific functions lie at the peripheries of our networks. We identify numerous modules enriched for Gene Ontology functions, and show that modules conserved across tissues are especially likely to have functions common to all tissues, while modules that are upregulated in a particular tissue are often instrumental to tissue-specific function. Finally, we provide a web tool, available at mostafavilab.stat.ubc.ca/GNAT, which allows exploration of gene function and regulation in a tissue-specific manner. PMID:25970446

  20. CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool

    PubMed Central

    Tzfadia, Oren; Diels, Tim; De Meyer, Sam; Vandepoele, Klaas; Aharoni, Asaph; Van de Peer, Yves

    2016-01-01

    Motivation: Comparative transcriptomics is a common approach in functional gene discovery efforts. It allows for finding conserved co-expression patterns between orthologous genes in closely related plant species, suggesting that these genes potentially share similar function and regulation. Several efficient co-expression-based tools have been commonly used in plant research but most of these pipelines are limited to data from model systems, which greatly limits their utility. Moreover, none of the existing pipelines allow plant researchers to make use of their own unpublished gene expression data for performing a comparative co-expression analysis and generating multi-species co-expression networks. Results: We introduce CoExpNetViz, a computational tool that uses a set of query or “bait” genes as an input (chosen by the user) and a minimum of one pre-processed gene expression dataset. The CoExpNetViz algorithm proceeds in three main steps: (i) for every bait gene submitted, co-expression values are calculated using mutual information and Pearson correlation coefficients, (ii) non-bait (or target) genes are grouped based on cross-species orthology, and (iii) output files are generated and results can be visualized as network graphs in Cytoscape. Availability: The CoExpNetViz tool is freely available both as a PHP web server (link: http://bioinformatics.psb.ugent.be/webtools/coexpr/) (implemented in C++) and as a Cytoscape plugin (implemented in Java). Both versions of the CoExpNetViz tool support LINUX and Windows platforms. PMID:26779228

  1. Novel structural co-expression analysis linking the NPM1-associated ribosomal biogenesis network to chronic myelogenous leukemia

    PubMed Central

    Chan, Lawrence WC; Lin, Xihong; Yung, Godwin; Lui, Thomas; Chiu, Ya Ming; Wang, Fengfeng; Tsui, Nancy BY; Cho, William CS; Yip, SP; Siu, Parco M.; Wong, SC Cesar; Yung, Benjamin YM

    2015-01-01

    Co-expression analysis reveals useful dysregulation patterns of gene cooperativeness for understanding cancer biology and identifying new targets for treatment. We developed a structural strategy to identify co-expressed gene networks that are important for chronic myelogenous leukemia (CML). This strategy compared the distributions of expressional correlations between CML and normal states, and it identified a data-driven threshold to classify strongly co-expressed networks that had the best coherence with CML. Using this strategy, we found a transcriptome-wide reduction of co-expression connectivity in CML, reflecting potentially loosened molecular regulation. Conversely, when we focused on nucleophosmin 1 (NPM1) associated networks, NPM1 established more co-expression linkages with BCR-ABL pathways and ribosomal protein networks in CML than normal. This finding implicates a new role of NPM1 in conveying tumorigenic signals from the BCR-ABL oncoprotein to ribosome biogenesis, affecting cellular growth. Transcription factors may be regulators of the differential co-expression patterns between CML and normal. PMID:26205693

  2. Genetic architecture of wood properties based on association analysis and co-expression networks in white spruce.

    PubMed

    Lamara, Mebarek; Raherison, Elie; Lenz, Patrick; Beaulieu, Jean; Bousquet, Jean; MacKay, John

    2016-04-01

    Association studies are widely utilized to analyze complex traits but their ability to disclose genetic architectures is often limited by statistical constraints, and functional insights are usually minimal in nonmodel organisms like forest trees. We developed an approach to integrate association mapping results with co-expression networks. We tested single nucleotide polymorphisms (SNPs) in 2652 candidate genes for statistical associations with wood density, stiffness, microfibril angle and ring width in a population of 1694 white spruce trees (Picea glauca). Association mapping identified 229-292 genes per wood trait using a statistical significance level of P < 0.05 to maximize discovery. Over-representation of genes associated for nearly all traits was found in a xylem preferential co-expression group developed in independent experiments. A xylem co-expression network was reconstructed with 180 wood associated genes and several known MYB and NAC regulators were identified as network hubs. The network revealed a link between the gene PgNAC8, wood stiffness and microfibril angle, as well as considerable within-season variation for both genetic control of wood traits and gene expression. Trait associations were distributed throughout the network suggesting complex interactions and pleiotropic effects. Our findings indicate that integration of association mapping and co-expression networks enhances our understanding of complex wood traits. PMID:26619072

  3. A co-expression modules based gene selection for cancer recognition.

    PubMed

    Lu, Xinguo; Deng, Yong; Huang, Lei; Feng, Bingtao; Liao, Bo

    2014-12-01

    Gene expression profiles are used to recognize patient samples for cancer diagnosis and therapy. Gene selection is crucial to high recognition performance. In conventional gene selection methods, genes are treated as independent and the correlation among genes is not used efficiently. Here, a co-expression-module-based gene selection method for cancer recognition is proposed. First, a weighted correlation network is constructed from the cancer dataset according to the correlation between each pair of genes; modules are identified in this network and the significant modules are selected for further exploration. Second, information gain is applied to these informative modules to select the feature genes for cancer recognition. Then, using leave-one-out cross-validation (LOOCV), experiments with different classification algorithms are conducted, and the results show that the proposed method achieves better classification accuracy than traditional gene selection methods. Finally, gene ontology enrichment analysis verified the biological significance of the co-expressed genes in specific modules. PMID:24440175

  4. Genome-Wide Tissue-Specific Gene Expression, Co-expression and Regulation of Co-expressed Genes in Adult Nematode Ascaris suum

    PubMed Central

    Rosa, Bruce A.; Jasmer, Douglas P.; Mitreva, Makedonka

    2014-01-01

    Background Caenorhabditis elegans has traditionally been used as a model for studying nematode biology, but its small size limits the ability for researchers to perform some experiments such as high-throughput tissue-specific gene expression studies. However, the dissection of individual tissues is possible in the parasitic nematode Ascaris suum due to its relatively large size. Here, we take advantage of the recent genome sequencing of Ascaris suum and the ability to physically dissect its separate tissues to produce a wide-scale tissue-specific nematode RNA-seq datasets, including data on three non-reproductive tissues (head, pharynx, and intestine) in both male and female worms, as well as four reproductive tissues (testis, seminal vesicle, ovary, and uterus). We obtained fundamental information about the biology of diverse cell types and potential interactions among tissues within this multicellular organism. Methodology/Principal Findings Overexpression and functional enrichment analyses identified many putative biological functions enriched in each tissue studied, including functions which have not been previously studied in detail in nematodes. Putative tissue-specific transcriptional factors and corresponding binding motifs that regulate expression in each tissue were identified, including the intestine-enriched ELT-2 motif/transcription factor previously described in nematode intestines. Constitutively expressed and novel genes were also characterized, with the largest number of novel genes found to be overexpressed in the testis. Finally, a putative acetylcholine-mediated transcriptional network connecting biological activity in the head to the male reproductive system is described using co-expression networks, along with a similar ecdysone-mediated system in the female. Conclusions/Significance The expression profiles, co-expression networks and co-expression regulation of the 10 tissues studied and the tissue-specific analysis presented here are a

  5. Uncovering the liver's role in immunity through RNA co-expression networks.

    PubMed

    Harrall, Kylie K; Kechris, Katerina J; Tabakoff, Boris; Hoffman, Paula L; Hines, Lisa M; Tsukamoto, Hidekazu; Pravenec, Michal; Printz, Morton; Saba, Laura M

    2016-10-01

    Gene co-expression analysis has proven to be a powerful tool for ascertaining the organization of gene products into networks that are important for organ function. An organ, such as the liver, engages in a multitude of functions important for the survival of humans, rats, and other animals; these liver functions include energy metabolism, metabolism of xenobiotics, immune system function, and hormonal homeostasis. With the availability of organ-specific transcriptomes, we can now examine the role of RNA transcripts (both protein-coding and non-coding) in these functions. A systems genetic approach for identifying and characterizing liver gene networks within a recombinant inbred panel of rats was used to identify genetically regulated transcriptional networks (modules). For these modules, biological consensus was found between functional enrichment analysis and publicly available phenotypic quantitative trait loci (QTL). In particular, the biological function of two liver modules could be linked to immune response. The eigengene QTLs for these co-expression modules were located at genomic regions coincident with highly significant phenotypic QTLs; these phenotypes were related to rheumatoid arthritis, food preference, and basal corticosterone levels in rats. Our analysis illustrates that genetically and biologically driven RNA-based networks, such as the ones identified as part of this research, provide insight into the genetic influences on organ functions. These networks can pinpoint phenotypes that manifest through the interaction of many organs/tissues and can identify unannotated or under-annotated RNA transcripts that play a role in these phenotypes. PMID:27401171

  6. DTW-MIC Coexpression Networks from Time-Course Data.

    PubMed

    Riccadonna, Samantha; Jurman, Giuseppe; Visintainer, Roberto; Filosi, Michele; Furlanello, Cesare

    2016-01-01

    When modeling coexpression networks from high-throughput time course data, Pearson Correlation Coefficient (PCC) is one of the most effective and popular similarity functions. However, its reliability is limited since it cannot capture non-linear interactions and time shifts. Here we propose to overcome these two issues by employing a novel similarity function, Dynamic Time Warping Maximal Information Coefficient (DTW-MIC), combining a measure taking care of functional interactions of signals (MIC) and a measure identifying time lag (DTW). By using the Hamming-Ipsen-Mikhailov (HIM) metric to quantify network differences, the effectiveness of the DTW-MIC approach is demonstrated on a set of four synthetic and one transcriptomic datasets, also in comparison to TimeDelay ARACNE and Transfer Entropy. PMID:27031641
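
    The abstract does not state how DTW and MIC are combined, so the sketch below illustrates only the time-lag ingredient: classic dynamic time warping with an absolute-difference cost, compared against a lockstep (sample-by-sample) distance on two made-up profiles that differ by a one-step shift. The function names and the data are assumptions for illustration.

```python
def dtw_distance(x, y):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(x), len(y)
    # d[i][j] = cost of the best alignment of x[:i] with y[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def lockstep_distance(x, y):
    """Sum of absolute differences without any time warping."""
    return sum(abs(a - b) for a, b in zip(x, y))


# Hypothetical expression pulse and the same pulse one time step earlier.
gene_a = [0, 0, 1, 2, 1, 0, 0]
gene_b = [0, 1, 2, 1, 0, 0, 0]

print(lockstep_distance(gene_a, gene_b))  # 4
print(dtw_distance(gene_a, gene_b))       # 0.0
```

    The lockstep comparison penalizes the shift, while the warped alignment matches the two pulses exactly; this is the kind of time-lagged similarity that a plain correlation on aligned samples can miss.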

  7. DTW-MIC Coexpression Networks from Time-Course Data

    PubMed Central

    Riccadonna, Samantha; Jurman, Giuseppe; Visintainer, Roberto; Filosi, Michele; Furlanello, Cesare

    2016-01-01

    When modeling coexpression networks from high-throughput time course data, Pearson Correlation Coefficient (PCC) is one of the most effective and popular similarity functions. However, its reliability is limited since it cannot capture non-linear interactions and time shifts. Here we propose to overcome these two issues by employing a novel similarity function, Dynamic Time Warping Maximal Information Coefficient (DTW-MIC), combining a measure taking care of functional interactions of signals (MIC) and a measure identifying time lag (DTW). By using the Hamming-Ipsen-Mikhailov (HIM) metric to quantify network differences, the effectiveness of the DTW-MIC approach is demonstrated on a set of four synthetic and one transcriptomic datasets, also in comparison to TimeDelay ARACNE and Transfer Entropy. PMID:27031641

  8. Co-expression of mitosis-regulating genes contributes to malignant progression and prognosis in oligodendrogliomas.

    PubMed

    Liu, Yanwei; Hu, Huimin; Zhang, Chuanbao; Wang, Haoyuan; Zhang, Wenlong; Wang, Zheng; Li, Mingyang; Zhang, Wei; Zhou, Dabiao; Jiang, Tao

    2015-11-10

    The clinical prognosis of patients with glioma is determined by tumor grades, but tumors of different subtypes with equal malignancy grade usually have different prognoses that are largely determined by genetic abnormalities. Oligodendrogliomas (ODs) are the second most common type of gliomas. In this study, integrative analyses found that distribution of TCGA transcriptomic subtypes was associated with grade progression in ODs. To identify critical gene(s) associated with tumor grades and TCGA subtypes, we analyzed 34 normal brain tissues (NBTs), 146 WHO grade II and 130 grade III ODs by microarray and RNA sequencing, and identified a co-expression network of six genes (AURKA, NDC80, CENPK, KIAA0101, TIMELESS and MELK) that was associated with tumor grades and TCGA subtypes as well as Ki-67 expression. Validation of the six genes was performed by qPCR in an additional 28 ODs. Importantly, these genes also were validated in four high-grade recurrent gliomas and the initial lower-grade gliomas resected from the same patients. Finally, the RNA data on two genes with the highest discrimination potential (AURKA and NDC80) and Ki-67 were validated on an independent cohort (5 NBTs and 86 ODs) by immunohistochemistry. Knockdown of AURKA and NDC80 by siRNAs suppressed Ki-67 expression and proliferation of glioma cells. Survival analysis showed that high expression of the six genes collectively indicated a poor survival outcome. Correlation and protein interaction analysis provided further evidence for this co-expression network. These data suggest that the co-expression of the six mitosis-regulating genes was associated with malignant progression and prognosis in ODs. PMID:26468983

  9. Co-expression of mitosis-regulating genes contributes to malignant progression and prognosis in oligodendrogliomas

    PubMed Central

    Liu, Yanwei; Hu, Huimin; Zhang, Chuanbao; Wang, Haoyuan; Zhang, Wenlong; Wang, Zheng; Li, Mingyang; Zhang, Wei; Zhou, Dabiao; Jiang, Tao

    2015-01-01

    The clinical prognosis of patients with glioma is determined by tumor grades, but tumors of different subtypes with equal malignancy grade usually have different prognoses that are largely determined by genetic abnormalities. Oligodendrogliomas (ODs) are the second most common type of gliomas. In this study, integrative analyses found that distribution of TCGA transcriptomic subtypes was associated with grade progression in ODs. To identify critical gene(s) associated with tumor grades and TCGA subtypes, we analyzed 34 normal brain tissues (NBTs), 146 WHO grade II and 130 grade III ODs by microarray and RNA sequencing, and identified a co-expression network of six genes (AURKA, NDC80, CENPK, KIAA0101, TIMELESS and MELK) that was associated with tumor grades and TCGA subtypes as well as Ki-67 expression. Validation of the six genes was performed by qPCR in an additional 28 ODs. Importantly, these genes also were validated in four high-grade recurrent gliomas and the initial lower-grade gliomas resected from the same patients. Finally, the RNA data on two genes with the highest discrimination potential (AURKA and NDC80) and Ki-67 were validated on an independent cohort (5 NBTs and 86 ODs) by immunohistochemistry. Knockdown of AURKA and NDC80 by siRNAs suppressed Ki-67 expression and proliferation of glioma cells. Survival analysis showed that high expression of the six genes collectively indicated a poor survival outcome. Correlation and protein interaction analysis provided further evidence for this co-expression network. These data suggest that the co-expression of the six mitosis-regulating genes was associated with malignant progression and prognosis in ODs. PMID:26468983

  10. Construction and application of a co-expression network in Mycobacterium tuberculosis.

    PubMed

    Jiang, Jun; Sun, Xian; Wu, Wei; Li, Li; Wu, Hai; Zhang, Lu; Yu, Guohua; Li, Yao

    2016-01-01

    Because of its high pathogenicity and infectivity, tuberculosis is a serious threat to human health. Some information about the functions of the genes in the Mycobacterium tuberculosis genome is currently available, but it is not enough to explore transcriptional regulatory mechanisms. Here, we applied the WGCNA (Weighted Gene Correlation Network Analysis) algorithm to mine pooled microarray datasets for the M. tuberculosis H37Rv strain. We constructed a co-expression network that was subdivided into 78 co-expression gene modules. The different responses to two kinds of in vitro models (a constant 0.2% oxygen hypoxia model and a Wayne model) were explained based on these modules. We identified potential transcription factors based on high Pearson's correlation coefficients between the modules and genes. Three modules that may be associated with hypoxic stimulation were identified, and their potential transcription factors were predicted. In the validation experiment, we determined the expression levels of genes in the modules under hypoxic conditions and under overexpression of potential transcription factors (Rv0081, furA (Rv1909c), Rv0324, Rv3334, and Rv3833). The experimental results showed that the three identified modules were related to hypoxia and that the overexpression of transcription factors could significantly change the expression levels of genes in the corresponding modules. PMID:27328747

  11. Construction and application of a co-expression network in Mycobacterium tuberculosis

    PubMed Central

    Jiang, Jun; Sun, Xian; Wu, Wei; Li, Li; Wu, Hai; Zhang, Lu; Yu, Guohua; Li, Yao

    2016-01-01

    Because of its high pathogenicity and infectivity, tuberculosis is a serious threat to human health. Some information about the functions of the genes in the Mycobacterium tuberculosis genome is currently available, but it is not enough to explore transcriptional regulatory mechanisms. Here, we applied the WGCNA (Weighted Gene Correlation Network Analysis) algorithm to mine pooled microarray datasets for the M. tuberculosis H37Rv strain. We constructed a co-expression network that was subdivided into 78 co-expression gene modules. The different responses to two kinds of in vitro models (a constant 0.2% oxygen hypoxia model and a Wayne model) were explained based on these modules. We identified potential transcription factors based on high Pearson’s correlation coefficients between the modules and genes. Three modules that may be associated with hypoxic stimulation were identified, and their potential transcription factors were predicted. In the validation experiment, we determined the expression levels of genes in the modules under hypoxic condition and under overexpression of potential transcription factors (Rv0081, furA (Rv1909c), Rv0324, Rv3334, and Rv3833). The experimental results showed that the three identified modules were related to hypoxia and that the overexpression of transcription factors could significantly change the expression levels of genes in the corresponding modules. PMID:27328747
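
    The abstract does not report the WGCNA settings used. As a minimal sketch of the general idea only, assuming an unsigned network and a hypothetical soft-threshold power beta = 6, the adjacency between two genes can be taken as the absolute Pearson correlation raised to beta. The gene names and expression values below are invented for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def soft_adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency |r|**beta for every gene pair.

    beta is a hypothetical choice; the source does not state the power used.
    """
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }

# Toy expression profiles (illustrative values, not data from the study).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
adj = soft_adjacency(expr)
```

    Raising the correlation to a power keeps strongly correlated pairs near 1 while pushing weak correlations toward 0, so no hard cutoff is needed before modules are sought.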

  12. Integrated genome-wide association, coexpression network, and expression single nucleotide polymorphism analysis identifies novel pathway in allergic rhinitis

    PubMed Central

    2014-01-01

    Background Allergic rhinitis is a common disease whose genetic basis is incompletely explained. We report an integrated genomic analysis of allergic rhinitis. Methods We performed genome wide association studies (GWAS) of allergic rhinitis in 5633 ethnically diverse North American subjects. Next, we profiled gene expression in disease-relevant tissue (peripheral blood CD4+ lymphocytes) collected from subjects who had been genotyped. We then integrated the GWAS and gene expression data using expression single nucleotide polymorphism (eSNP), coexpression network, and pathway approaches to identify the biologic relevance of our GWAS. Results GWAS revealed ethnicity-specific findings, with 4 genome-wide significant loci among Latinos and 1 genome-wide significant locus in the GWAS meta-analysis across ethnic groups. To identify biologic context for these results, we constructed a coexpression network to define modules of genes with similar patterns of CD4+ gene expression (coexpression modules) that could serve as constructs of broader gene expression. Six of the 22 GWAS loci with P-value ≤ 1 × 10^−6 tagged one particular coexpression module (4.0-fold enrichment, P-value 0.0029), and this module also had the greatest enrichment (3.4-fold enrichment, P-value 2.6 × 10^−24) for allergic rhinitis-associated eSNPs (genetic variants associated with both gene expression and allergic rhinitis). The integrated GWAS, coexpression network, and eSNP results therefore supported this coexpression module as an allergic rhinitis module. Pathway analysis revealed that the module was enriched for mitochondrial pathways (8.6-fold enrichment, P-value 4.5 × 10^−72). Conclusions Our results highlight mitochondrial pathways as a target for further investigation of allergic rhinitis mechanism and treatment. Our integrated approach can be applied to provide biologic context for GWAS of other diseases. PMID:25085501

  13. Protein-protein interaction and gene co-expression maps of ARFs and Aux/IAAs in Arabidopsis

    PubMed Central

    Piya, Sarbottam; Shrestha, Sandesh K.; Binder, Brad; Stewart, C. Neal; Hewezi, Tarek

    2014-01-01

    The phytohormone auxin regulates nearly all aspects of plant growth and development. Based on the current model in Arabidopsis thaliana, Auxin/indole-3-acetic acid (Aux/IAA) proteins repress auxin-inducible genes by inhibiting auxin response transcription factors (ARFs). Experimental evidence suggests that heterodimerization between Aux/IAA and ARF proteins is related to their unique biological functions. The objective of this study was to generate the Aux/IAA-ARF protein-protein interaction map using full length sequences and locate the interacting protein pairs to specific gene co-expression networks in order to define tissue-specific responses of the Aux/IAA-ARF interactome. Pairwise interactions between 19 ARFs and 29 Aux/IAAs resulted in the identification of 213 specific interactions, of which 79 interactions were previously unknown. The incorporation of co-expression profiles with protein-protein interaction data revealed a strong correlation of gene co-expression for 70% of the ARF-Aux/IAA interacting pairs in at least one tissue/organ, indicative of the biological significance of these interactions. Importantly, ARF4-8 and 19, which were found to interact with almost all Aux/IAA proteins, showed broad co-expression relationships with Aux/IAA genes and thus formed the central hubs of the co-expression network. Our analyses provide new insights into the biological significance of ARF-Aux/IAA associations in the morphogenesis and development of various plant tissues and organs. PMID:25566309

  14. Co-expression networks in generation of induced pluripotent stem cells.

    PubMed

    Paul, Sharan; Pflieger, Lance; Dansithong, Warunee; Figueroa, Karla P; Gao, Fuying; Coppola, Giovanni; Pulst, Stefan M

    2016-01-01

    We developed an adenoviral vector, in which Yamanaka's four reprogramming factors (RFs) were controlled by individual CMV promoters in a single cassette (Ad-SOcMK). This permitted coordinated expression of RFs (SOX2, OCT3/4, c-MYC and KLF4) in a cell for a transient period of time, synchronizing the reprogramming process with the majority of transduced cells assuming induced pluripotent stem cell (iPSC)-like characteristics as early as three days post-transduction. These reprogrammed cells resembled human embryonic stem cells (ESCs) with regard to morphology, biomarker expression, and could be differentiated into cells of the germ layers in vitro and in vivo. These iPSC-like cells, however, failed to expand into larger iPSC colonies. The short and synchronized reprogramming process allowed us to study global transcription changes within short time intervals. Weighted gene co-expression network analysis (WGCNA) identified sixteen large gene co-expression modules, each including members of gene ontology categories involved in cell differentiation and development. In particular, the brown module contained a significant number of ESC marker genes, whereas the turquoise module contained cell-cycle-related genes that were downregulated in contrast to upregulation in human ESCs. Strong coordinated expression of all four RFs via adenoviral transduction may constrain stochastic processes and lead to silencing of genes important for cellular proliferation. PMID:26892236

  15. Co-expression networks in generation of induced pluripotent stem cells

    PubMed Central

    Paul, Sharan; Pflieger, Lance; Dansithong, Warunee; Figueroa, Karla P.; Gao, Fuying; Coppola, Giovanni; Pulst, Stefan M.

    2016-01-01

    ABSTRACT We developed an adenoviral vector, in which Yamanaka's four reprogramming factors (RFs) were controlled by individual CMV promoters in a single cassette (Ad-SOcMK). This permitted coordinated expression of RFs (SOX2, OCT3/4, c-MYC and KLF4) in a cell for a transient period of time, synchronizing the reprogramming process with the majority of transduced cells assuming induced pluripotent stem cell (iPSC)-like characteristics as early as three days post-transduction. These reprogrammed cells resembled human embryonic stem cells (ESCs) with regard to morphology, biomarker expression, and could be differentiated into cells of the germ layers in vitro and in vivo. These iPSC-like cells, however, failed to expand into larger iPSC colonies. The short and synchronized reprogramming process allowed us to study global transcription changes within short time intervals. Weighted gene co-expression network analysis (WGCNA) identified sixteen large gene co-expression modules, each including members of gene ontology categories involved in cell differentiation and development. In particular, the brown module contained a significant number of ESC marker genes, whereas the turquoise module contained cell-cycle-related genes that were downregulated in contrast to upregulation in human ESCs. Strong coordinated expression of all four RFs via adenoviral transduction may constrain stochastic processes and lead to silencing of genes important for cellular proliferation. PMID:26892236

  16. Computational, Integrative, and Comparative Methods for the Elucidation of Genetic Coexpression Networks

    PubMed Central

    2005-01-01

    Gene expression microarray data can be used for the assembly of genetic coexpression network graphs. Using mRNA samples obtained from recombinant inbred Mus musculus strains, it is possible to integrate allelic variation with molecular and higher-order phenotypes. The depth of quantitative genetic analysis of microarray data can be vastly enhanced utilizing this mouse resource in combination with powerful computational algorithms, platforms, and data repositories. The resulting network graphs transect many levels of biological scale. This approach is illustrated with the extraction of cliques of putatively coregulated genes and their annotation using gene ontology analysis and cis-regulatory element discovery. The causal basis for coregulation is detected through the use of quantitative trait locus mapping. PMID:16046823

  17. Computational, Integrative, and Comparative Methods for the Elucidation of Genetic Coexpression Networks

    DOE PAGESBeta

    Baldwin, Nicole E.; Chesler, Elissa J.; Kirov, Stefan; Langston, Michael A.; Snoddy, Jay R.; Williams, Robert W.; Zhang, Bing

    2005-01-01

    Gene expression microarray data can be used for the assembly of genetic coexpression network graphs. Using mRNA samples obtained from recombinant inbred Mus musculus strains, it is possible to integrate allelic variation with molecular and higher-order phenotypes. The depth of quantitative genetic analysis of microarray data can be vastly enhanced utilizing this mouse resource in combination with powerful computational algorithms, platforms, and data repositories. The resulting network graphs transect many levels of biological scale. This approach is illustrated with the extraction of cliques of putatively co-regulated genes and their annotation using gene ontology analysis and cis-regulatory element discovery. The causal basis for co-regulation is detected through the use of quantitative trait locus mapping.

  18. Gene Co-Expression Analysis Predicts Genetic Variants Associated with Drug Responsiveness in Lung Cancer

    PubMed Central

    Shroff, Sanaya; Zhang, Jie; Huang, Kun

    2016-01-01

    Responsiveness to drugs is an important concern in designing personalized treatment for cancer patients. Currently genetic markers are often used to guide targeted therapy. However, deeper understanding of the molecular basis for drug responses and discovery of new predictive biomarkers for drug sensitivity are much needed. In this paper, we present a workflow for identifying condition-specific gene co-expression networks associated with responses to the tyrosine kinase inhibitor, Erlotinib, in lung adenocarcinoma cell lines using data from the Cancer Cell Line Encyclopedia by combining network mining and statistical analysis. Particularly, we have identified multiple gene modules specifically co-expressed in the drug-responsive cell lines but not in the unresponsive group. Interestingly, most of these modules are enriched on specific cytobands, suggesting potential copy number variation events on these loci. Our results therefore imply that there are multiple genetic loci with copy number variations associated with the Erlotinib responses. The existence of CNVs in these loci is also confirmed in lung cancer tissue samples using the TCGA data. Since these structural variations are inferred from functional genomics data, these CNVs are functional variations. These results suggest the condition-specific gene co-expression network mining approach is an effective approach in predicting candidate biomarkers for drug responses. PMID:27570645

  19. Gene Co-Expression Analysis Predicts Genetic Variants Associated with Drug Responsiveness in Lung Cancer.

    PubMed

    Shroff, Sanaya; Zhang, Jie; Huang, Kun

    2016-01-01

    Responsiveness to drugs is an important concern in designing personalized treatment for cancer patients. Currently genetic markers are often used to guide targeted therapy. However, deeper understanding of the molecular basis for drug responses and discovery of new predictive biomarkers for drug sensitivity are much needed. In this paper, we present a workflow for identifying condition-specific gene co-expression networks associated with responses to the tyrosine kinase inhibitor, Erlotinib, in lung adenocarcinoma cell lines using data from the Cancer Cell Line Encyclopedia by combining network mining and statistical analysis. Particularly, we have identified multiple gene modules specifically co-expressed in the drug-responsive cell lines but not in the unresponsive group. Interestingly, most of these modules are enriched on specific cytobands, suggesting potential copy number variation events on these loci. Our results therefore imply that there are multiple genetic loci with copy number variations associated with the Erlotinib responses. The existence of CNVs in these loci is also confirmed in lung cancer tissue samples using the TCGA data. Since these structural variations are inferred from functional genomics data, these CNVs are functional variations. These results suggest the condition-specific gene co-expression network mining approach is an effective approach in predicting candidate biomarkers for drug responses. PMID:27570645

  20. Ligand Similarity Complements Sequence, Physical Interaction, and Co-Expression for Gene Function Prediction

    PubMed Central

    Shoichet, Brian K.; Gillis, Jesse

    2016-01-01

    The expansion of protein-ligand annotation databases has enabled large-scale networking of proteins by ligand similarity. These ligand-based protein networks, which implicitly predict the ability of neighboring proteins to bind related ligands, may complement biologically-oriented gene networks, which are used to predict functional or disease relevance. To quantify the degree to which such ligand-based protein associations might complement functional genomic associations, including sequence similarity, physical protein-protein interactions, co-expression, and disease gene annotations, we calculated a network based on the Similarity Ensemble Approach (SEA: sea.docking.org), where protein neighbors reflect the similarity of their ligands. We also measured the similarity with functional genomic networks over a common set of 1,131 genes, and found that the networks had only small overlaps, which were significant only due to the large scale of the data. Consistent with the view that the networks contain different information, combining them substantially improved Molecular Function prediction within GO (from AUROC~0.63–0.75 for the individual data modalities to AUROC~0.8 in the aggregate). We investigated the boost in guilt-by-association gene function prediction when the networks are combined and describe underlying properties that can be further exploited. PMID:27467773
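
    The AUROC values quoted above summarize how well a network ranks genes that truly carry a function above genes that do not. As a minimal sketch, assuming each gene has a single prediction score and a binary annotation (the scores and labels below are invented), AUROC can be computed as the fraction of positive/negative pairs ranked correctly, with ties counted as half.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: prediction score per gene (higher means more likely annotated).
    labels: 1 if the gene carries the annotation, 0 otherwise.
    """
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    # Count pairs where the annotated gene outranks the unannotated one.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical guilt-by-association scores for five genes.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
area = auroc(scores, labels)  # 5 of 6 positive/negative pairs ordered correctly
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.63 to 0.8 improvement should be read.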

  1. KaPPA-View4: a metabolic pathway database for representation and analysis of correlation networks of gene co-expression and metabolite co-accumulation and omics data.

    PubMed

    Sakurai, Nozomu; Ara, Takeshi; Ogata, Yoshiyuki; Sano, Ryosuke; Ohno, Takashi; Sugiyama, Kenjiro; Hiruta, Atsushi; Yamazaki, Kiyoshi; Yano, Kentaro; Aoki, Koh; Aharoni, Asaph; Hamada, Kazuki; Yokoyama, Koji; Kawamura, Shingo; Otsuka, Hirofumi; Tokimatsu, Toshiaki; Kanehisa, Minoru; Suzuki, Hideyuki; Saito, Kazuki; Shibata, Daisuke

    2011-01-01

    Correlations of gene-to-gene co-expression and metabolite-to-metabolite co-accumulation calculated from large amounts of transcriptome and metabolome data are useful for uncovering unknown functions of genes, functional diversities of gene family members and regulatory mechanisms of metabolic pathway flows. Many databases and tools are available to interpret quantitative transcriptome and metabolome data, but there are only limited ones that connect correlation data to biological knowledge and can be utilized to find biological significance of it. We report here a new metabolic pathway database, KaPPA-View4 (http://kpv.kazusa.or.jp/kpv4/), which is able to overlay gene-to-gene and/or metabolite-to-metabolite relationships as curves on a metabolic pathway map, or on a combination of up to four maps. This representation would help to discover, for example, novel functions of a transcription factor that regulates genes on a metabolic pathway. Pathway maps of the Kyoto Encyclopedia of Genes and Genomes (KEGG) and maps generated from their gene classifications are available at KaPPA-View4 KEGG version (http://kpv.kazusa.or.jp/kpv4-kegg/). At present, gene co-expression data from the databases ATTED-II, COXPRESdb, CoP and MiBASE for human, mouse, rat, Arabidopsis, rice, tomato and other plants are available. PMID:21097783

  2. ATTED-II: a database of co-expressed genes and cis elements for identifying co-regulated gene groups in Arabidopsis

    PubMed Central

    Obayashi, Takeshi; Kinoshita, Kengo; Nakai, Kenta; Shibaoka, Masayuki; Hayashi, Shinpei; Saeki, Motoshi; Shibata, Daisuke; Saito, Kazuki; Ohta, Hiroyuki

    2007-01-01

    A publicly available database of co-expressed gene sets would be a valuable tool for a wide variety of experimental designs, including targeting of genes for functional identification or for regulatory investigation. Here, we report the construction of an Arabidopsis thaliana trans-factor and cis-element prediction database (ATTED-II) that provides co-regulated gene relationships based on co-expressed genes deduced from microarray data and the predicted cis elements. ATTED-II includes the following features: (i) lists and networks of co-expressed genes calculated from 58 publicly available experimental series, which are composed of 1388 GeneChip data in A. thaliana; (ii) prediction of cis-regulatory elements in the 200 bp region upstream of the transcription start site to predict co-regulated genes amongst the co-expressed genes; and (iii) visual representation of expression patterns for individual genes. ATTED-II can thus help researchers to clarify the function and regulation of particular genes and gene networks. PMID:17130150

  3. Hybrid coexpression link similarity graph clustering for mining biological modules from multiple gene expression datasets

    PubMed Central

    2014-01-01

    Background Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. Results We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways. PMID:25221624
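
    The abstract states only that the edge weight is a linear combination of a topological similarity and a co-appearance similarity; it does not define either term or the mixing coefficient. The sketch below is therefore an illustration under stated assumptions: both similarities are stood in for by Jaccard indices (over the genes of the two links, and over the datasets in which each link appears), and alpha = 0.5 is a hypothetical mixing weight.

```python
def jaccard(a, b):
    """Jaccard index of two collections, 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def hybrid_weight(link1, link2, datasets1, datasets2, alpha=0.5):
    """Weight between two coexpression links in the hybrid graph.

    link1, link2: gene pairs (the nodes of the hybrid graph).
    datasets1, datasets2: datasets in which each link was observed.
    alpha: hypothetical mixing coefficient, not taken from the source.
    """
    topological = jaccard(link1, link2)             # assumed stand-in
    co_appearance = jaccard(datasets1, datasets2)   # assumed stand-in
    return alpha * topological + (1 - alpha) * co_appearance

# Two links sharing gene g2, seen together in datasets d2 and d3.
w = hybrid_weight(("g1", "g2"), ("g2", "g3"),
                  {"d1", "d2", "d3"}, {"d2", "d3"})
```

    With these toy inputs the topological term is 1/3 and the co-appearance term is 2/3, so links that recur together across datasets are pulled into the same cluster even when they share only one gene.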

  4. Recursive Indirect-Paths Modularity (RIP-M) for Detecting Community Structure in RNA-Seq Co-expression Networks.

    PubMed

    Rahmani, Bahareh; Zimmermann, Michael T; Grill, Diane E; Kennedy, Richard B; Oberg, Ann L; White, Bill C; Poland, Gregory A; McKinney, Brett A

    2016-01-01

    Clusters of genes in co-expression networks are commonly used as functional units for gene set enrichment detection and increasingly as features (attribute construction) for statistical inference and sample classification. One of the practical challenges of clustering for these purposes is to identify an optimal partition of the network where the individual clusters are neither too large, prohibiting interpretation, nor too small, precluding general inference. Newman Modularity is a spectral clustering algorithm that automatically finds the number of clusters, but for many biological networks the cluster sizes are suboptimal. In this work, we generalize Newman Modularity to incorporate information from indirect paths in RNA-Seq co-expression networks. We implement a merge-and-split algorithm that allows the user to constrain the range of cluster sizes: large enough to capture genes in relevant pathways, yet small enough to resolve distinct functions. We investigate the properties of our recursive indirect-pathways modularity (RIP-M) and compare it with other clustering methods using simulated co-expression networks and RNA-seq data from an influenza vaccine response study. RIP-M had higher cluster assignment accuracy than Newman Modularity for finding clusters in simulated co-expression networks for all scenarios, and RIP-M had comparable accuracy to Weighted Gene Correlation Network Analysis (WGCNA). RIP-M was more accurate than WGCNA for modest hard thresholds and comparable for high, while WGCNA was slightly more accurate for soft thresholds. In the vaccine study data, RIP-M and WGCNA enriched for a comparable number of immunologically relevant pathways. PMID:27242890
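
    For orientation, the quantity that Newman modularity methods optimize can be written for an unweighted, undirected network as Q = sum over communities of (L_c / m) - (d_c / 2m)^2, where m is the number of edges, L_c the number of edges inside community c, and d_c the summed degree of its nodes. The sketch below only scores a given partition; it is not RIP-M, does not use indirect paths, and does not perform the spectral optimization. The toy graph is invented.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping node -> community label.
    """
    m = len(edges)
    intra = {}       # edges with both endpoints inside a community
    degree_sum = {}  # summed node degree per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(
        intra.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Two triangles joined by a single bridge edge (2-3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "all" for node in range(6)}
q_split = modularity(edges, split)    # 5/14, the two-cluster partition
q_merged = modularity(edges, merged)  # 0.0, everything in one cluster
```

    The higher score for the two-triangle partition reflects more within-cluster edges than expected by degree alone; the cluster-size constraint described in the abstract is an addition on top of this score.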

  5. Recursive Indirect-Paths Modularity (RIP-M) for Detecting Community Structure in RNA-Seq Co-expression Networks

    PubMed Central

    Rahmani, Bahareh; Zimmermann, Michael T.; Grill, Diane E.; Kennedy, Richard B.; Oberg, Ann L.; White, Bill C.; Poland, Gregory A.; McKinney, Brett A.

    2016-01-01

    Clusters of genes in co-expression networks are commonly used as functional units for gene set enrichment detection and increasingly as features (attribute construction) for statistical inference and sample classification. One of the practical challenges of clustering for these purposes is to identify an optimal partition of the network where the individual clusters are neither too large, prohibiting interpretation, nor too small, precluding general inference. Newman Modularity is a spectral clustering algorithm that automatically finds the number of clusters, but for many biological networks the cluster sizes are suboptimal. In this work, we generalize Newman Modularity to incorporate information from indirect paths in RNA-Seq co-expression networks. We implement a merge-and-split algorithm that allows the user to constrain the range of cluster sizes: large enough to capture genes in relevant pathways, yet small enough to resolve distinct functions. We investigate the properties of our recursive indirect-pathways modularity (RIP-M) and compare it with other clustering methods using simulated co-expression networks and RNA-seq data from an influenza vaccine response study. RIP-M had higher cluster assignment accuracy than Newman Modularity for finding clusters in simulated co-expression networks for all scenarios, and RIP-M had comparable accuracy to Weighted Gene Correlation Network Analysis (WGCNA). RIP-M was more accurate than WGCNA for modest hard thresholds and comparable for high, while WGCNA was slightly more accurate for soft thresholds. In the vaccine study data, RIP-M and WGCNA enriched for a comparable number of immunologically relevant pathways. PMID:27242890

  6. Coexpression Pattern Analysis of NPM1-Associated Genes in Chronic Myelogenous Leukemia

    PubMed Central

    Wong, S. C. Cesar; Siu, Parco M.; Yung, Benjamin Y. M.

    2015-01-01

    Background. Nucleophosmin 1 (NPM1) plays an important role in ribosomal synthesis and malignancies, but NPM1 mutations occur rarely in the blast-crisis and chronic-phase chronic myelogenous leukemia (CML) patients. The NPM1-associated gene set (GCM_NPM1), in total 116 genes including NPM1, was chosen as the candidate gene set for the coexpression analysis. We asked whether NPM1-associated genes can affect the ribosomal synthesis and translation process in CML. Results. We presented a distribution-based approach for gene pair classification by identifying a disease-specific cutoff point that classified the coexpressed gene pairs into strong and weak coexpression structures. The differences in the coexpression patterns between the normal and the CML groups were reflected in the overall structure by performing a two-sample Kolmogorov-Smirnov test. Our developed method effectively identified the coexpression pattern differences from the overall structure: P value = 1.71 × 10^−22 < 0.05 for the maximum deviation D = 0.109. Moreover, we found that genes involved in the ribosomal synthesis and translation process tended to be coexpressed in the CML group. Conclusion. Our developed method can identify the coexpression difference between two different groups. Dysregulation of ribosomal synthesis and translation process may be related to the CML disease. Our significant findings may provide useful information for the novel CML mechanism exploration and cancer treatment. PMID:25961029
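
    The maximum deviation D reported above is the two-sample Kolmogorov-Smirnov statistic: the largest gap between the empirical cumulative distributions of the two groups. A minimal sketch of that statistic follows; the correlation values are invented and the P-value calculation is omitted.

```python
from bisect import bisect_right

def ks_statistic(sample1, sample2):
    """Two-sample Kolmogorov-Smirnov statistic D = max |F1(x) - F2(x)|."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the maximum gap.
    for x in s1 + s2:
        f1 = bisect_right(s1, x) / n1
        f2 = bisect_right(s2, x) / n2
        d = max(d, abs(f1 - f2))
    return d

# Hypothetical gene-pair correlation values for the two groups.
normal = [0.1, 0.2, 0.3, 0.4]
cml = [0.3, 0.5, 0.6, 0.7]
d = ks_statistic(normal, cml)  # largest gap occurs at x = 0.4
```

    Because D is bounded between 0 and 1, a value as small as 0.109 can still be highly significant when the number of gene pairs compared is large, as the reported P value suggests.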

  7. GLITTER: a web-based application for gene link inspection through tissue-specific coexpression.

    PubMed

    Liu, Xiangtao; Yu, Pengfei; Cheng, Chao; Potash, James B; Han, Shizhong

    2016-01-01

    Accumulating evidence supports the polygenic nature of most complex diseases, suggesting the involvement of many susceptibility genes with small effect sizes. Although hundreds of genes may underlie the genetic architecture of complex diseases, those involved in a given disease are probably not randomly distributed, but likely to be functionally related. Protein-protein interaction networks have been used to evaluate the functional relatedness of susceptibility genes. However, these networks do not account for tissue specificity, are limited to protein-coding genes, and are typically biased by incomplete biological knowledge. Here, we present Gene Link Inspector Through Tissue-specific coExpRession (GLITTER), a web-based application for assessing the functional relatedness of susceptibility genes, either coding or noncoding, according to tissue-specific gene expression profiles. GLITTER can also shed light on the specific tissues in which susceptibility genes might exert their functions. We further demonstrate examples of how GLITTER can evaluate the functional relatedness of susceptibility genes underlying schizophrenia and breast cancer, and provide clues about etiology. PMID:27623690

  8. Analysis of the dynamic co-expression network of heart regeneration in the zebrafish.

    PubMed

    Rodius, Sophie; Androsova, Ganna; Götz, Lou; Liechti, Robin; Crespo, Isaac; Merz, Susanne; Nazarov, Petr V; de Klein, Niek; Jeanty, Céline; González-Rosa, Juan M; Muller, Arnaud; Bernardin, Francois; Niclou, Simone P; Vallar, Laurent; Mercader, Nadia; Ibberson, Mark; Xenarios, Ioannis; Azuaje, Francisco

    2016-01-01

    The zebrafish has the capacity to regenerate its heart after severe injury. While the function of a few genes during this process has been studied, we are far from fully understanding how genes interact to coordinate heart regeneration. To enable systematic insights into this phenomenon, we generated and integrated a dynamic co-expression network of heart regeneration in the zebrafish and linked systems-level properties to the underlying molecular events. Across multiple post-injury time points, the network displays topological attributes of biological relevance. We show that regeneration steps are mediated by modules of transcriptionally coordinated genes, and by genes acting as network hubs. We also established direct associations between hubs and validated drivers of heart regeneration with murine and human orthologs. The resulting models and interactive analysis tools are available at http://infused.vital-it.ch. Using a worked example, we demonstrate the usefulness of this unique open resource for hypothesis generation and in silico screening for genes involved in heart regeneration. PMID:27241320

  9. Analysis of the dynamic co-expression network of heart regeneration in the zebrafish

    PubMed Central

    Rodius, Sophie; Androsova, Ganna; Götz, Lou; Liechti, Robin; Crespo, Isaac; Merz, Susanne; Nazarov, Petr V.; de Klein, Niek; Jeanty, Céline; González-Rosa, Juan M.; Muller, Arnaud; Bernardin, Francois; Niclou, Simone P.; Vallar, Laurent; Mercader, Nadia; Ibberson, Mark; Xenarios, Ioannis; Azuaje, Francisco

    2016-01-01

    The zebrafish has the capacity to regenerate its heart after severe injury. While the function of a few genes during this process has been studied, we are far from fully understanding how genes interact to coordinate heart regeneration. To enable systematic insights into this phenomenon, we generated and integrated a dynamic co-expression network of heart regeneration in the zebrafish and linked systems-level properties to the underlying molecular events. Across multiple post-injury time points, the network displays topological attributes of biological relevance. We show that regeneration steps are mediated by modules of transcriptionally coordinated genes, and by genes acting as network hubs. We also established direct associations between hubs and validated drivers of heart regeneration with murine and human orthologs. The resulting models and interactive analysis tools are available at http://infused.vital-it.ch. Using a worked example, we demonstrate the usefulness of this unique open resource for hypothesis generation and in silico screening for genes involved in heart regeneration. PMID:27241320

  10. RiceArrayNet: a database for correlating gene expression from transcriptome profiling, and its application to the analysis of coexpressed genes in rice.

    PubMed

    Lee, Tae-Ho; Kim, Yeon-Ki; Pham, Thu Thi Minh; Song, Sang Ik; Kim, Ju-Kon; Kang, Kyu Young; An, Gynheung; Jung, Ki-Hong; Galbraith, David W; Kim, Minkyun; Yoon, Ung-Han; Nahm, Baek Hie

    2009-09-01

    Microarray data can be used to derive understanding of the relationships between the genes involved in various biological systems of an organism, given the availability of databases of gene expression measurements from the complete spectrum of experimental conditions and materials. However, there have been no reports, to date, of such a database being constructed for rice (Oryza sativa). Here, we describe the construction of such a database, called RiceArrayNet (RAN; http://www.ggbio.com/arraynet/), which provides information on coexpression between genes in terms of correlation coefficients (r values). The average number of coexpressed genes is 214, with an SD of 440, at r ≥ 0.5. Given the correlation between genes in a gene pair, the degrees of closeness between genes can be visualized in a relational tree and a relational network. The distribution of correlated genes according to degree of stringency shows how each gene is related to other genes. As an application of RAN, the 16-member L7Ae ribosomal protein family was explored for coexpressed genes and gene expression values within and between rice and Arabidopsis (Arabidopsis thaliana), and common and unique features in coexpression partners and expression patterns were observed for these family members. We observed a correlation pattern between Os01g0968800, a drought-responsive element-binding transcription factor, Os02g0790500, a trehalose-6-phosphate synthase, and Os06g0219500, a small heat shock factor, reflecting the fact that genes responding to the same biological stresses are regulated together. The RAN database can be used as a tool to gain insight into a particular gene by examining its coexpression partners. PMID:19605550
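
    The "distribution of correlated genes according to degree of stringency" can be pictured as counting a gene's coexpression partners at successively stricter r cutoffs. The sketch below assumes the pairwise r values are already available as a dictionary; the gene names, r values, and the cutoffs above 0.5 are hypothetical and are not taken from RAN.

```python
def partner_counts(r_values, gene, thresholds=(0.5, 0.7, 0.9)):
    """Number of coexpression partners of `gene` at each r cutoff.

    r_values: dict mapping an unordered gene pair (a, b) to its r value.
    thresholds: r cutoffs; only 0.5 appears in the source abstract.
    """
    return {
        t: sum(
            1
            for (a, b), r in r_values.items()
            if gene in (a, b) and r >= t
        )
        for t in thresholds
    }

# Invented pairwise correlations for four placeholder genes.
r_values = {
    ("geneA", "geneB"): 0.92,
    ("geneA", "geneC"): 0.61,
    ("geneA", "geneD"): 0.35,
    ("geneB", "geneC"): 0.75,
}
counts = partner_counts(r_values, "geneA")
# counts == {0.5: 2, 0.7: 1, 0.9: 1}
```

    A gene whose partner count falls off slowly as the cutoff rises is tightly embedded in its neighbourhood, whereas a sharp drop indicates mostly weak associations.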

  11. Co-expression network analysis of Down's syndrome based on microarray data

    PubMed Central

    Zhao, Jianping; Zhang, Zhengguo; Ren, Shumin; Zong, Yanan; Kong, Xiangdong

    2016-01-01

    Down's syndrome (DS) is a type of chromosomal disease. The present study aimed to explore the underlying molecular mechanisms of DS. GSE5390 microarray data downloaded from the Gene Expression Omnibus database were used to identify differentially expressed genes (DEGs) in DS. Pathway enrichment analysis of the DEGs was performed, followed by co-expression network construction. Significant differential modules were mined by mutual information, followed by functional analysis. The accuracy of sample classification for the significant differential modules of DEGs was evaluated by leave-one-out cross-validation. A total of 997 DEGs, including 638 upregulated and 359 downregulated genes, were identified. Upregulated DEGs were enriched in 15 pathways, such as cell adhesion molecules, whereas downregulated DEGs were enriched in maturity onset diabetes of the young. Three significant differential modules with the highest discriminative scores (mutual information>0.35) were selected from a co-expression network. The classification accuracy of GSE16677 expression profile samples was 54.55% and 72.73% when characterized by 12 DEGs and 3 significant differential modules, respectively. Genes in significant differential modules were significantly enriched in 5 functions, including the endoplasmic reticulum (P=0.018) and regulation of apoptosis (P=0.061). The identified DEGs, in particular the 12 DEGs in the significant differential modules, such as B-cell lymphoma 2-associated transcription factor 1, heat shock protein 90 kDa beta member 1, UBX domain-containing protein 2 and transmembrane protein 50B, may serve important roles in the pathogenesis of DS. PMID:27588071
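
    Leave-one-out cross-validation, as named in the abstract above, can be sketched as follows: each sample is held out once, a classifier is fitted on the remaining samples, and the held-out sample is scored. The abstract does not say which classifier was used, so the nearest-centroid rule here is an assumption made purely for illustration, and the sample values and labels are made up rather than taken from GSE5390 or GSE16677.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Return the label whose class centroid is closest to x (squared Euclidean)."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1

    def sq_dist(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=sq_dist)


def loocv_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Made-up one-feature module scores for six samples (not data from the study).
xs = [[0.1], [0.2], [0.3], [0.9], [1.0], [1.1]]
ys = ["control", "control", "control", "DS", "DS", "DS"]
accuracy = loocv_accuracy(xs, ys)
print(accuracy)
```

    With n samples the model is refitted n times, which is affordable for the small sample sizes typical of microarray studies.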

  12. A co-expression network analysis reveals lncRNA abnormalities in peripheral blood in early-onset schizophrenia.

    PubMed

    Ren, Yan; Cui, Yuehua; Li, Xinrong; Wang, Binhong; Na, Long; Shi, Junyan; Wang, Liang; Qiu, Lixia; Zhang, Kerang; Liu, Guifen; Xu, Yong

    2015-12-01

    Long non-coding RNAs (lncRNAs) are emerging as important regulators of gene expression and disease processes especially in neuropsychiatric disorders. To explore the potential regulatory roles of lncRNAs in schizophrenia, we performed an integrated co-expression network analysis on lncRNA and mRNA microarray profiles generated from the peripheral blood samples in 19 drug-naïve first-episode early-onset schizophrenia (EOS) patients and 18 demographically matched typically developing controls (TDCs). Using weighted gene co-expression network analysis (WGCNA), we showed that the lncRNAs were organized into co-expressed modules, and two lncRNA modules were associated with EOS. The mRNA networks were constructed and three disease-associated modules were identified. Gene Ontology (GO) analysis indicated that the mRNAs were highly enriched for mitochondrion and related biological processes. Moreover, our results revealed a significant correlation between lncRNAs and mRNAs using the canonical correlation analysis (CCA). Our results suggest that the convergent lncRNA alteration may be involved in the etiologies of EOS, and mitochondrial dysfunction participates in the pathological process of the disease. Our findings may shed light on the pathogenesis of schizophrenia and facilitate future diagnosis and therapeutic strategies. PMID:25967042
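
    The weighted network underlying WGCNA can be sketched in a few lines. The example assumes the unsigned adjacency a_ij = |cor(x_i, x_j)|^β, which is the standard WGCNA soft-thresholding form; the power β = 6 and the expression values are illustrative assumptions and are not settings or data reported in the abstract. Module detection itself (topological overlap and hierarchical clustering) is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: |r| raised to the power beta, zero on the diagonal."""
    n = len(profiles)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(profiles[i], profiles[j])) ** beta
            adj[i][j] = adj[j][i] = a
    return adj


# Hypothetical expression profiles for three transcripts across four samples.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with the first profile
    [1.0, 3.0, 2.0, 4.0],   # r = 0.8 with the first profile
]
adj = soft_threshold_adjacency(profiles)
connectivity = [sum(row) for row in adj]   # per-node sum of edge weights
print(adj, connectivity)
```

    Raising |r| to a power keeps every edge but shrinks weak correlations toward zero, so no hard cutoff is needed.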

  13. Genetic Network Inference: From Co-Expression Clustering to Reverse Engineering

    NASA Technical Reports Server (NTRS)

    Dhaeseleer, Patrik; Liang, Shoudan; Somogyi, Roland

    2000-01-01

    Advances in molecular biological, analytical, and computational technologies are enabling us to systematically investigate the complex molecular processes underlying biological systems. In particular, using high-throughput gene expression assays, we are able to measure the output of the gene regulatory network. We aim here to review data mining and modeling approaches for conceptualizing and unraveling the functional relationships implicit in these datasets. Clustering of co-expression profiles allows us to infer shared regulatory inputs and functional pathways. We discuss various aspects of clustering, ranging from distance measures to clustering algorithms and multiple-cluster memberships. More advanced analysis aims to infer causal connections between genes directly, i.e., who is regulating whom and how. We discuss several approaches to the problem of reverse engineering of genetic networks, from discrete Boolean networks, to continuous linear and non-linear models. We conclude that the combination of predictive modeling with systematic experimental verification will be required to gain a deeper insight into living organisms, therapeutic targeting, and bioengineering.

  14. Correlated mRNAs and miRNAs from co-expression and regulatory networks affect porcine muscle and finally meat properties

    PubMed Central

    2013-01-01

    Background Physiological processes aiding the conversion of muscle to meat involve many genes associated with muscle structure and metabolic processes. MicroRNAs regulate networks of genes to orchestrate cellular functions, in turn regulating phenotypes. Results We applied weighted gene co-expression network analysis to identify co-expression modules that correlated to meat quality phenotypes and were highly enriched for genes involved in glucose metabolism, response to wounding, mitochondrial ribosome, mitochondrion, and extracellular matrix. Negative correlation of miRNA with mRNA and target prediction were used to select transcripts out of the modules of trait-associated mRNAs to further identify those genes that are correlated with post mortem traits. Conclusions Porcine muscle co-expression transcript networks that correlated to post mortem traits were identified. The integration of miRNA and mRNA expression analyses, as well as network analysis, enabled us to interpret the differentially-regulated genes from a systems perspective. Linking co-expression networks of transcripts and hierarchically organized pairs of miRNAs and mRNAs to meat properties yields new insight into several biological pathways underlying phenotype differences. These pathways may also be diagnostic for many myopathies, which are accompanied by deficient nutrient and oxygen supply of muscle fibers. PMID:23915301

  15. Coexpression networks implicate human midfetal deep cortical projection neurons in the pathogenesis of autism

    PubMed Central

    Willsey, A. Jeremy; Sanders, Stephan J.; Li, Mingfeng; Dong, Shan; Tebbenkamp, Andrew T.; Muhle, Rebecca A.; Reilly, Steven K.; Lin, Leon; Fertuzinhos, Sofia; Miller, Jeremy A.; Murtha, Michael T.; Bichsel, Candace; Niu, Wei; Cotney, Justin; Ercan-Sencicek, A. Gulhan; Gockley, Jake; Gupta, Abha; Han, Wenqi; He, Xin; Hoffman, Ellen; Klei, Lambertus; Lei, Jing; Liu, Wenzhong; Liu, Li; Lu, Cong; Xu, Xuming; Zhu, Ying; Mane, Shrikant M.; Lein, Edward S.; Wei, Liping; Noonan, James P.; Roeder, Kathryn; Devlin, Bernie; Šestan, Nenad; State, Matthew W.

    2013-01-01

    SUMMARY Autism spectrum disorder (ASD) is a complex developmental syndrome of unknown etiology. Recent studies employing exome- and genome-wide sequencing have identified nine high-confidence ASD (hcASD) genes. Working from the hypothesis that ASD-associated mutations in these biologically pleiotropic genes will disrupt intersecting developmental processes to contribute to a common phenotype, we have attempted to identify time periods, brain regions, and cell types in which these genes converge. We have constructed coexpression networks based on the hcASD “seed” genes, leveraging a rich expression data set encompassing multiple human brain regions across human development and into adulthood. By assessing enrichment of an independent set of probable ASD (pASD) genes, derived from the same sequencing studies, we demonstrate a key point of convergence in midfetal layer 5/6 cortical projection neurons. This approach informs when, where, and in what cell types mutations in these specific genes may be productively studied to clarify ASD pathophysiology. PMID:24267886

  16. From Coexpression to Coregulation: An Approach to Inferring Transcriptional Regulation Among Gene Classes from Large-Scale Expression Data

    NASA Technical Reports Server (NTRS)

    Mjolsness, Eric; Castano, Rebecca; Mann, Tobias; Wold, Barbara

    2000-01-01

    We provide preliminary evidence that existing algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (1) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continuous-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determining transcriptional regulatory relationships such as coregulation.

  17. Human transcriptional interactome of chromatin contribute to gene co-expression

    PubMed Central

    2010-01-01

    Background Transcriptional interactome of chromatin is one of the important mechanisms in gene transcription regulation. By chromatin conformation capture and 3D FISH experiments, several cases of chromatin interaction among sequence-distant genes or even inter-chromatin genes were reported. However, at the genomic level, there is still little evidence to support these mechanisms. Recently, based on a Hi-C experiment, a genome-wide picture of chromatin interactions in human cells was presented. It provides useful material for analysing whether the mechanism of transcriptional interactome is common. Results The main work here is to demonstrate whether the effects of transcriptional interactome on gene co-expression exist at the genomic level. While controlling for the effects of transcription factor control similarities (TCS), we tested the correlation between Hi-C interaction and the mutual ranks of gene co-expression rates (provided by COXPRESdb) of intra-chromatin gene pairs. We used 6,084 genes with both TF annotation and co-expression information, and matched them into 273,458 pairs with similar Hi-C interaction ranks in different cell types. The results illustrate that co-expression is strongly associated with chromatin interaction. Further analysis using GO annotation reveals potential correlation between gene function similarity, Hi-C interaction and their co-expression. Conclusions According to the results in this research, the intra-chromatin interactome may be related to gene function and associated with co-expression. This study provides evidence for illustrating the effect of transcriptional interactome on transcription regulation. PMID:21156067
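
    The mutual rank mentioned in the abstract above is commonly defined as the geometric mean of two ranks: the rank of gene B in gene A's list of partners sorted by decreasing correlation, and the rank of A in B's list. The sketch below assumes that definition. The correlation matrix is invented, the function name `mutual_rank` is hypothetical, and nothing here reproduces COXPRESdb's own pipeline.

```python
from math import sqrt


def mutual_rank(corr):
    """Mutual rank matrix from a symmetric correlation matrix (lower = more coexpressed)."""
    n = len(corr)
    # rank[i][j]: position of gene j among gene i's partners, 1 = most correlated.
    rank = [[0] * n for _ in range(n)]
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: -corr[i][j])
        for pos, j in enumerate(order, start=1):
            rank[i][j] = pos
    # Geometric mean of the two directional ranks; the diagonal is left at 0.
    return [
        [sqrt(rank[i][j] * rank[j][i]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Invented correlations among three genes.
corr = [
    [1.0, 0.9, 0.5],
    [0.9, 1.0, 0.7],
    [0.5, 0.7, 1.0],
]
mr = mutual_rank(corr)
print(mr)
```

    Because it is built from ranks rather than raw correlations, the measure is symmetric even though each gene's ranked list differs.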

  18. Integration of Metabolic Modeling with Gene Co-expression Reveals Transcriptionally Programmed Reactions Explaining Robustness in Mycobacterium tuberculosis

    PubMed Central

    Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Mittal, Inna; Mobeen, Ahmed; Ramachandran, Srinivasan

    2016-01-01

    Robustness of metabolic networks is accomplished by gene regulation, modularity, re-routing of metabolites and plasticity. Here, we probed robustness against perturbations of biochemical reactions of M. tuberculosis in the form of predicting compensatory trends. In order to investigate the transcriptional programming of genes associated with correlated fluxes, we integrated with gene co-expression network. Knock down of the reactions NADH2r and ATPS responsible for producing the hub metabolites, and Central carbon metabolism had the highest proportion of their associated genes under transcriptional co-expression with genes of their flux correlated reactions. Reciprocal gene expression correlations were observed among compensatory routes, fresh activation of alternative routes and in the multi-copy genes of Cysteine synthase and of Phosphate transporter. Knock down of 46 reactions caused the activation of Isocitrate lyase or Malate synthase or both reactions, which are central to the persistent state of M. tuberculosis. A total of 30 new freshly activated routes including Cytochrome c oxidase, Lactate dehydrogenase, and Glycine cleavage system were predicted, which could be responsible for switching into dormant or persistent state. Thus, our integrated approach of exploring transcriptional programming of flux correlated reactions has the potential to unravel features of system architecture conferring robustness. PMID:27000948

  19. Integration of Metabolic Modeling with Gene Co-expression Reveals Transcriptionally Programmed Reactions Explaining Robustness in Mycobacterium tuberculosis.

    PubMed

    Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Mittal, Inna; Mobeen, Ahmed; Ramachandran, Srinivasan

    2016-01-01

    Robustness of metabolic networks is accomplished by gene regulation, modularity, re-routing of metabolites and plasticity. Here, we probed robustness against perturbations of biochemical reactions of M. tuberculosis in the form of predicting compensatory trends. In order to investigate the transcriptional programming of genes associated with correlated fluxes, we integrated with gene co-expression network. Knock down of the reactions NADH2r and ATPS responsible for producing the hub metabolites, and Central carbon metabolism had the highest proportion of their associated genes under transcriptional co-expression with genes of their flux correlated reactions. Reciprocal gene expression correlations were observed among compensatory routes, fresh activation of alternative routes and in the multi-copy genes of Cysteine synthase and of Phosphate transporter. Knock down of 46 reactions caused the activation of Isocitrate lyase or Malate synthase or both reactions, which are central to the persistent state of M. tuberculosis. A total of 30 new freshly activated routes including Cytochrome c oxidase, Lactate dehydrogenase, and Glycine cleavage system were predicted, which could be responsible for switching into dormant or persistent state. Thus, our integrated approach of exploring transcriptional programming of flux correlated reactions has the potential to unravel features of system architecture conferring robustness. PMID:27000948

  20. Cell-type–based model explaining coexpression patterns of genes in the brain

    PubMed Central

    Grange, Pascal; Bohland, Jason W.; Okaty, Benjamin W.; Sugino, Ken; Bokil, Hemant; Nelson, Sacha B.; Ng, Lydia; Hawrylycz, Michael; Mitra, Partha P.

    2014-01-01

    Spatial patterns of gene expression in the vertebrate brain are not independent, as pairs of genes can exhibit complex patterns of coexpression. Two genes may be similarly expressed in one region, but differentially expressed in other regions. These correlations have been studied quantitatively, particularly for the Allen Atlas of the adult mouse brain, but their biological meaning remains obscure. We propose a simple model of the coexpression patterns in terms of spatial distributions of underlying cell types and establish its plausibility using independently measured cell-type–specific transcriptomes. The model allows us to predict the spatial distribution of cell types in the mouse brain. PMID:24706869

  1. Co-expression network analysis reveals transcription factors associated to cell wall biosynthesis in sugarcane.

    PubMed

    Ferreira, Savio Siqueira; Hotta, Carlos Takeshi; Poelking, Viviane Guzzo de Carli; Leite, Debora Chaves Coelho; Buckeridge, Marcos Silveira; Loureiro, Marcelo Ehlers; Barbosa, Marcio Henrique Pereira; Carneiro, Monalisa Sampaio; Souza, Glaucia Mendes

    2016-05-01

    Sugarcane is a hybrid of Saccharum officinarum and Saccharum spontaneum, with minor contributions from other species in Saccharum and other genera. Understanding the molecular basis of cell wall metabolism in sugarcane may allow for rational changes in fiber quality and content when designing new energy crops. This work describes a comparative expression profiling of sugarcane ancestral genotypes: S. officinarum, S. spontaneum and S. robustum and a commercial hybrid: RB867515, linking gene expression to phenotypes to identify genes for sugarcane improvement. Oligoarray experiments of leaves, immature and intermediate internodes, detected 12,621 sense and 995 antisense transcripts. Amino acid metabolism was particularly evident among pathways showing natural antisense transcript expression. For all tissues sampled, expression analysis revealed 831, 674 and 648 differentially expressed genes in S. officinarum, S. robustum and S. spontaneum, respectively, using RB867515 as reference. Expression of sugar transporters might explain sucrose differences among genotypes, but an unexpected differential expression of histones was also identified between high and low Brix° genotypes. Lignin biosynthetic genes and bioenergetics-related genes were up-regulated in the high lignin genotype, suggesting that these genes are important for S. spontaneum to allocate carbon to lignin, while S. officinarum allocates it to sucrose storage. Co-expression network analysis identified 18 transcription factors possibly related to cell wall biosynthesis while in silico analysis detected cis-elements involved in cell wall biosynthesis in their promoters. Our results provide information to elucidate regulatory networks underlying traits of interest that will allow the improvement of sugarcane for biofuel and chemicals production. PMID:26820137

  2. G-NEST: A gene neighborhood scoring tool to identify co-conserved, co-expressed genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In previous studies, gene neighborhoods--spatial clusters of co-expressed genes in the genome--have been defined using arbitrary rules such as requiring adjacency, a minimum number of genes, a fixed window size, or a minimum expression level. In the current study, we developed a Gene Neighborhood Sc...

  3. Gene differential coexpression analysis based on biweight correlation and maximum clique.

    PubMed

    Zheng, Chun-Hou; Yuan, Lin; Sha, Wen; Sun, Zhan-Li

    2014-01-01

    Differential coexpression analysis usually requires the definition of 'distance' or 'similarity' between measured datasets. Until now, the most common choice has been the Pearson correlation coefficient. However, the Pearson correlation coefficient is sensitive to outliers. Biweight midcorrelation is considered to be a good alternative to Pearson correlation since it is more robust to outliers. In this paper, we propose using biweight midcorrelation to measure 'similarity' between gene expression profiles, and provide a new approach for gene differential coexpression analysis. Firstly, we calculate the biweight midcorrelation coefficients between all gene pairs. Then, we filter out non-informative correlation pairs using the 'half-thresholding' strategy and calculate the differential coexpression value of each gene. The experimental results on simulated data show that the new approach performed better than three previously published differential coexpression analysis (DCEA) methods. Moreover, we apply maximum clique analysis to the gene subset comprising genes identified by our approach and previously reported T2D-related genes; many additional discoveries can be found through our method. PMID:25474074

  4. Understanding developmental and adaptive cues in pine through metabolite profiling and co-expression network analysis

    PubMed Central

    Cañas, Rafael A.; Canales, Javier; Muñoz-Hernández, Carmen; Granados, Jose M.; Ávila, Concepción; García-Martín, María L.; Cánovas, Francisco M.

    2015-01-01

    Conifers include long-lived evergreen trees of great economic and ecological importance, including pines and spruces. During their long lives conifers must respond to seasonal environmental changes, adapt to unpredictable environmental stresses, and co-ordinate their adaptive adjustments with internal developmental programmes. To gain insights into these responses, we examined metabolite and transcriptomic profiles of needles from naturally growing 25-year-old maritime pine (Pinus pinaster L. Aiton) trees over a year. The effect of environmental parameters such as temperature and rain on needle development were studied. Our results show that seasonal changes in the metabolite profiles were mainly affected by the needles’ age and acclimation for winter, but changes in transcript profiles were mainly dependent on climatic factors. The relative abundance of most transcripts correlated well with temperature, particularly for genes involved in photosynthesis or winter acclimation. Gene network analysis revealed relationships between 14 co-expressed gene modules and development and adaptation to environmental stimuli. Novel Myb transcription factors were identified as candidate regulators during needle development. Our systems-based analysis provides integrated data of the seasonal regulation of maritime pine growth, opening new perspectives for understanding the complex regulatory mechanisms underlying conifers’ adaptive responses. Taken together, our results suggest that the environment regulates the transcriptome for fine tuning of the metabolome during development. PMID:25873654

  5. Gene expression networks.

    PubMed

    Thomas, Reuben; Portier, Christopher J

    2013-01-01

    With the advent of microarrays and next-generation biotechnologies, the use of gene expression data has become ubiquitous in biological research. One potential drawback of these data is that they are very rich in features or genes though cost considerations allow for the use of only relatively small sample sizes. A useful way of getting at biologically meaningful interpretations of the environmental or toxicological condition of interest would be to make inferences at the level of a priori defined biochemical pathways or networks of interacting genes or proteins that are known to perform certain biological functions. This chapter describes approaches taken in the literature to make such inferences at the biochemical pathway level. In addition, this chapter describes approaches to create hypotheses on genes playing important roles in response to a treatment, using organism level gene coexpression or protein-protein interaction networks. Also, approaches to reverse engineer gene networks or methods that seek to identify novel interactions between genes are described. Given the relatively small sample numbers typically available, these reverse engineering approaches are generally useful in inferring interactions only among a relatively small number of genes, on the order of 10. Finally, given the vast amounts of publicly available gene expression data from different sources, this chapter summarizes the important sources of these data and characteristics of these sources or databases. In line with the overall aims of this book of providing practical knowledge to a researcher interested in analyzing gene expression data from a network perspective, the chapter provides convenient publicly accessible tools for performing the analyses described, and in addition describes three motivating examples taken from the published literature that illustrate some of the relevant analyses. PMID:23086841

  6. Normalized lmQCM: An Algorithm for Detecting Weak Quasi-Cliques in Weighted Graph with Applications in Gene Co-Expression Module Discovery in Cancers

    PubMed Central

    Zhang, Jie; Huang, Kun

    2014-01-01

    In this paper, we present a new approach for mining weighted networks to identify densely connected modules such as quasi-cliques. Quasi-cliques are densely connected subnetworks in a network. Detecting quasi-cliques is an important topic in data mining, with applications such as social network study and biomedicine. Our approach has two major improvements upon previous work. The first is the use of local maximum edges to initialize the search in order to avoid excessive overlaps among the modules, thereby greatly reducing the computing time. The second is the inclusion of a weight normalization procedure to enable discovery of “subtle” modules with more balanced sizes. We carried out careful tests on multiple parameters and settings using two large cancer datasets. This approach allowed us to identify a large number of gene modules enriched in both biological functions and chromosomal bands in cancer data, suggesting potential roles of copy number variations (CNVs) involved in cancer development. We then tested the genes in selected modules with enriched chromosomal bands using The Cancer Genome Atlas data, and the results strongly support our hypothesis that the coexpression in these modules is associated with CNVs. While gene coexpression network analyses have been widely adopted in disease studies, most of them focus on the functional relationships of coexpressed genes. The relationship between coexpression gene modules and CNVs is much less investigated despite the potential advantage that we can infer from such a relationship without genotyping data. Our new approach thus provides a means to carry out deep mining of the gene coexpression network to obtain both functional and genetic information from the expression data. PMID:27486298
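
    The idea of seeding the search from local maximum edges can be illustrated with a small sketch. The abstract does not define the term, so the code assumes that an edge (i, j) is a local maximum when it is the heaviest edge incident to node i and also the heaviest edge incident to node j. The weight matrix and the function name `local_maximum_edges` are hypothetical, and the subsequent quasi-clique growth and weight normalization steps are not shown.

```python
def local_maximum_edges(weights):
    """Edges that are the heaviest incident edge for both of their endpoints.

    `weights` is a symmetric matrix (list of lists) of non-negative edge weights.
    """
    n = len(weights)
    # heaviest[i]: the neighbour joined to node i by its heaviest edge.
    heaviest = [
        max((k for k in range(n) if k != i), key=lambda k: weights[i][k])
        for i in range(n)
    ]
    # Keep pairs that pick each other; i < j avoids listing an edge twice.
    return sorted(
        (i, j) for i, j in enumerate(heaviest) if i < j and heaviest[j] == i
    )


# Invented coexpression weights among four genes.
weights = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.3, 0.2],
    [0.2, 0.3, 0.0, 0.8],
    [0.1, 0.2, 0.8, 0.0],
]
seeds = local_maximum_edges(weights)
print(seeds)
```

    Starting one search per seed edge, rather than one per edge in the graph, is what would limit overlapping modules and computing time under this reading.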

  7. WeGET: predicting new genes for molecular systems by weighted co-expression.

    PubMed

    Szklarczyk, Radek; Megchelenbrink, Wout; Cizek, Pavel; Ledent, Marie; Velemans, Gonny; Szklarczyk, Damian; Huynen, Martijn A

    2016-01-01

    We have developed the Weighted Gene Expression Tool and database (WeGET, http://weget.cmbi.umcn.nl) for the prediction of new genes of a molecular system by correlated gene expression. WeGET utilizes a compendium of 465 human and 560 murine gene expression datasets that have been collected from multiple tissues under a wide range of experimental conditions. It exploits this abundance of expression data by assigning a high weight to datasets in which the known genes of a molecular system are harmoniously up- and down-regulated. WeGET ranks new candidate genes by calculating their weighted co-expression with that system. A weighted rank is calculated for human genes and their mouse orthologs. Then, an integrated gene rank and p-value is computed using a rank-order statistic. We applied our method to predict novel genes that have a high degree of co-expression with Gene Ontology terms and pathways from KEGG and Reactome. For each query set we provide a list of predicted novel genes, computed weights for transcription datasets used and cell and tissue types that contributed to the final predictions. The performance for each query set is assessed by 10-fold cross-validation. Finally, users can use the WeGET to predict novel genes that co-express with a custom query set. PMID:26582928

  8. WeGET: predicting new genes for molecular systems by weighted co-expression

    PubMed Central

    Szklarczyk, Radek; Megchelenbrink, Wout; Cizek, Pavel; Ledent, Marie; Velemans, Gonny; Szklarczyk, Damian; Huynen, Martijn A.

    2016-01-01

    We have developed the Weighted Gene Expression Tool and database (WeGET, http://weget.cmbi.umcn.nl) for the prediction of new genes of a molecular system by correlated gene expression. WeGET utilizes a compendium of 465 human and 560 murine gene expression datasets that have been collected from multiple tissues under a wide range of experimental conditions. It exploits this abundance of expression data by assigning a high weight to datasets in which the known genes of a molecular system are harmoniously up- and down-regulated. WeGET ranks new candidate genes by calculating their weighted co-expression with that system. A weighted rank is calculated for human genes and their mouse orthologs. Then, an integrated gene rank and p-value is computed using a rank-order statistic. We applied our method to predict novel genes that have a high degree of co-expression with Gene Ontology terms and pathways from KEGG and Reactome. For each query set we provide a list of predicted novel genes, computed weights for transcription datasets used and cell and tissue types that contributed to the final predictions. The performance for each query set is assessed by 10-fold cross-validation. Finally, users can use the WeGET to predict novel genes that co-express with a custom query set. PMID:26582928

  9. Co-expression analysis of differentially expressed genes in hepatitis C virus-induced hepatocellular carcinoma.

    PubMed

    Song, Qingfeng; Zhao, Chang; Ou, Shengqiu; Meng, Zhibin; Kang, Ping; Fan, Liwei; Qi, Feng; Ma, Yilong

    2015-01-01

    The aim of the current study was to investigate the molecular mechanisms underlying hepatitis C virus (HCV)-induced hepatocellular carcinoma (HCC) using the expression profiles of HCV-infected Huh7 cells at different time points. The differentially expressed genes (DEGs) were identified with the Samr package in R software once the data were normalized. Functional and pathway enrichment analysis of the identified DEGs was also performed. Subsequently, MCODE in Cytoscape software was applied to conduct module analysis of the constructed co-expression networks. A total of 1,100 DEGs were identified between the HCV-infected and control samples at 12, 18, 24 and 48 h post-infection. DEGs at 24 and 48 h were involved in the same signaling pathways and biological processes, including sterol biosynthetic processes and tRNA amino-acylation. There were 22 time series genes which were clustered into 3 expression patterns, and the demarcation point of the 2 expression patterns that 401 overlapping DEGs at 24 and 48 h clustered into was 24 h post-infection. tRNA synthesis-related biological processes emerged at 24 and 48 h. Replication and assembly of HCV in HCV-infected Huh7 cells occurred mainly at 24 h post-infection. In view of this, the screened time series genes have the potential to become candidate target molecules for monitoring, diagnosing and treating HCV-induced HCC. PMID:25339452

  10. Co-expression analysis of differentially expressed genes in hepatitis C virus-induced hepatocellular carcinoma

    PubMed Central

    SONG, QINGFENG; ZHAO, CHANG; OU, SHENGQIU; MENG, ZHIBIN; KANG, PING; FAN, LIWEI; QI, FENG; MA, YILONG

    2015-01-01

    The aim of the current study was to investigate the molecular mechanisms underlying hepatitis C virus (HCV)-induced hepatocellular carcinoma (HCC) using the expression profiles of HCV-infected Huh7 cells at different time points. The differentially expressed genes (DEGs) were identified with the Samr package in R software once the data were normalized. Functional and pathway enrichment analysis of the identified DEGs was also performed. Subsequently, MCODE in Cytoscape software was applied to conduct module analysis of the constructed co-expression networks. A total of 1,100 DEGs were identified between the HCV-infected and control samples at 12, 18, 24 and 48 h post-infection. DEGs at 24 and 48 h were involved in the same signaling pathways and biological processes, including sterol biosynthetic processes and tRNA amino-acylation. There were 22 time series genes which were clustered into 3 expression patterns, and the demarcation point of the 2 expression patterns that 401 overlapping DEGs at 24 and 48 h clustered into was 24 h post-infection. tRNA synthesis-related biological processes emerged at 24 and 48 h. Replication and assembly of HCV in HCV-infected Huh7 cells occurred mainly at 24 h post-infection. In view of this, the screened time series genes have the potential to become candidate target molecules for monitoring, diagnosing and treating HCV-induced HCC. PMID:25339452

  11. Motif discovery in promoters of genes co-localized and co-expressed during myeloid cells differentiation

    PubMed Central

    Coppe, Alessandro; Ferrari, Francesco; Bisognin, Andrea; Danieli, Gian Antonio; Ferrari, Sergio; Bicciato, Silvio; Bortoluzzi, Stefania

    2009-01-01

    Co-expressed genes may be under similar promoter-based and/or position-based regulation. Although data on expression, position and function of human genes are available, their true integration still represents a challenge for computational biology, hampering the identification of regulatory mechanisms. We carried out an integrative analysis of genomic position, functional annotation and promoters of genes expressed in myeloid cells. Promoter analysis was conducted by a novel multi-step method for discovering putative regulatory elements, i.e. over-represented motifs, in a selected set of promoters, as compared with a background model. The combination of transcriptional, structural and functional data allowed the identification of sets of promoters pertaining to groups of genes co-expressed and co-localized in regions of the human genome. The application of motif discovery to 26 groups of genes co-expressed during myeloid cell differentiation and co-localized in the genome showed that there are more over-represented motifs in promoters of co-expressed and co-localized genes than in promoters of simply co-expressed genes (CEG). Motifs that are similar to the binding sequences of known transcription factors, non-uniformly distributed along promoter sequences and/or occurring in highly co-expressed subsets of genes were identified. Co-expressed and co-localized gene sets were grouped in two co-expressed genomic meta-regions, putatively representing functional domains of a high-level expression regulation. PMID:19059999

  12. Co-expression of soybean Dicer-like genes in response to stress and development.

    PubMed

    Curtin, Shaun J; Kantar, Michael B; Yoon, Han W; Whaley, Adam M; Schlueter, Jessica A; Stupar, Robert M

    2012-11-01

    Regulation of gene transcription and post-transcriptional processes is critical for proper development, genome integrity, and stress responses in plants. Many genes involved in the key processes of transcriptional and post-transcriptional regulation have been well studied in model diploid organisms. However, gene and genome duplication may alter the function of the genes involved in these processes. To address this question, we assayed the stress-induced transcription patterns of duplicated gene pairs involved in RNAi and DNA methylation processes in the paleopolyploid soybean. Real-time quantitative PCR and Sequenom MassARRAY expression assays were used to profile the relative expression ratios of eight gene pairs across eight different biotic and abiotic stress conditions. The transcriptional responses to stress for genes involved in DNA methylation, RNAi processing, and miRNA processing were compared. The strongest evidence for pairwise co-expression in response to stresses was exhibited by non-paralogous Dicer-like (DCL) genes GmDCL2a-GmDCL3a and GmDCL1b-GmDCL2b, most profoundly in root tissues. Among homoeologous or paralogous DCL genes, the Dicer-like 2 (DCL2) gene pair exhibited the strongest response to stress and most conserved co-expression pattern. This was surprising because the DCL2 duplication event is more ancient than the other DCL duplications. Possible mechanisms that may be driving the DCL2 co-expression are discussed. PMID:22527487

  13. Age gene expression and coexpression progressive signatures in peripheral blood leukocytes.

    PubMed

    Irizar, Haritz; Goñi, Joaquín; Alzualde, Ainhoa; Castillo-Triviño, Tamara; Olascoaga, Javier; Lopez de Munain, Adolfo; Otaegui, David

    2015-12-01

    Both cellular senescence and organismic aging are known to be dynamic processes that start early in life and progress constantly during the whole life of the individual. In this work, with the objective of identifying signatures of age-related progressive change at the transcriptomic level, we have performed a whole-genome gene expression analysis of peripheral blood leukocytes in a group of healthy individuals with ages ranging from 14 to 93 years. A set of genes with progressively changing gene expression (either increase or decrease with age) has been identified and contextualized in a coexpression network. A modularity analysis has been performed on this network and biological-term and pathway enrichment analyses have been used for biological interpretation of each module. In summary, the results of the present work reveal the existence of a transcriptomic component that shows progressive expression changes associated with age in peripheral blood leukocytes, highlighting both the dynamic nature of the process and the need to complement young vs. elderly studies with longitudinal studies that include middle-aged individuals. From the transcriptional point of view, immunosenescence seems to be occurring from a relatively early age, at least from the late 20s/early 30s, and the 49-56-year-old age range appears to be critical. In general, the genes that, according to our results, show progressive expression changes with aging are involved in pathogenic/cellular processes that have classically been linked to aging in humans: cancer, immune processes and cellular growth vs. maintenance. PMID:26362218

  14. Whole genome co-expression analysis of soybean cytochrome P450 genes identifies nodulation-specific P450 monooxygenases

    PubMed Central

    2010-01-01

    co-expressed with several genes encoding isoflavonoid-related metabolic enzymes. We then focused on nodulation-induced P450s and found that CYP728H1 was co-expressed with the genes involved in phenylpropanoid metabolism. Similarly, CYP736A34 was highly co-expressed with lipoxygenase, lectin and CYP83D1, all of which are involved in root and nodule development. Conclusions: The genome-scale analysis of P450s in soybean reveals many unique features of these important enzymes in this crop, although the functions of most of them are largely unknown. Gene co-expression analysis proves to be a useful tool to infer the function of uncharacterized genes. Our work presented here could provide important leads toward functional genomics studies of soybean P450s and their regulatory network through the integration of reverse genetics, biochemistry, and metabolic profiling tools. The identification of nodule-specific P450s and their further exploitation may help us to better understand the intriguing process of soybean and rhizobium interaction. PMID:21062474

  15. Predicting targeted drug combinations based on Pareto optimal patterns of coexpression network connectivity

    PubMed Central

    2014-01-01

    Background: Molecularly targeted drugs promise a safer and more effective treatment modality than conventional chemotherapy for cancer patients. However, tumors are dynamic systems that readily adapt to these agents, activating alternative survival pathways as they evolve resistant phenotypes. Combination therapies can overcome resistance but finding the optimal combinations efficiently presents a formidable challenge. Here we introduce a new paradigm for the design of combination therapy treatment strategies that exploits the tumor adaptive process to identify context-dependent essential genes as druggable targets. Methods: We have developed a framework to mine high-throughput transcriptomic data, based on differential coexpression and Pareto optimization, to investigate drug-induced tumor adaptation. We use this approach to identify tumor-essential genes as druggable candidates. We apply our method to a set of ER+ breast tumor samples, collected before (n = 58) and after (n = 60) neoadjuvant treatment with the aromatase inhibitor letrozole, to prioritize genes as targets for combination therapy with letrozole treatment. We validate letrozole-induced tumor adaptation through coexpression and pathway analyses in an independent data set (n = 18). Results: We find pervasive differential coexpression between the untreated and letrozole-treated tumor samples as evidence of letrozole-induced tumor adaptation. Based on patterns of coexpression, we identify ten genes as potential candidates for combination therapy with letrozole including EPCAM, a letrozole-induced essential gene and a target to which drugs have already been developed as cancer therapeutics. Through replication, we validate six letrozole-induced coexpression relationships and confirm the epithelial-to-mesenchymal transition as a process that is upregulated in the residual tumor samples following letrozole treatment. Conclusions: To derive the greatest benefit from molecularly targeted drugs it is

  16. The Detection of Metabolite-Mediated Gene Module Co-Expression Using Multivariate Linear Models

    PubMed Central

    Padayachee, Trishanta; Khamiakova, Tatsiana; Shkedy, Ziv; Perola, Markus; Salo, Perttu; Burzykowski, Tomasz

    2016-01-01

    Investigating whether metabolites regulate the co-expression of a predefined gene module is one of the relevant questions posed in the integrative analysis of metabolomic and transcriptomic data. This article concerns the integrative analysis of the two high-dimensional datasets by means of multivariate models and statistical tests for the dependence between metabolites and the co-expression of a gene module. The general linear model (GLM) for correlated data that we propose models the dependence between adjusted gene expression values through a block-diagonal variance-covariance structure formed by metabolic-subset specific general variance-covariance blocks. Performance of statistical tests for the inference of conditional co-expression are evaluated through a simulation study. The proposed methodology is applied to the gene expression data of the previously characterized lipid-leukocyte module. Our results show that the GLM approach improves on a previous approach by being less prone to the detection of spurious conditional co-expression. PMID:26918614

  17. Differential co-expression analysis of venous thromboembolism based on gene expression profile data

    PubMed Central

    MING, ZHIBING; DING, WENBIN; YUAN, RUIFAN; JIN, JIE; LI, XIAOQIANG

    2016-01-01

    The aim of the present study was to screen differentially co-expressed genes and the involved transcription factors (TFs) and microRNAs (miRNAs) in venous thromboembolism (VTE). Microarray data of GSE19151 were downloaded from Gene Expression Omnibus, including 70 patients with VTE and 63 healthy controls. Principal component analysis (PCA) was performed using R software. Differential co-expression analysis was performed using R, followed by screening of modules using Cytoscape. Functional annotation was performed using Database for Annotation, Visualization, and Integrated Discovery. Moreover, Fisher test was used to screen key TFs and miRNAs for the modules. PCA revealed the disease and healthy samples could not be distinguished at the gene expression level. A total of 4,796 upregulated differentially co-expressed genes (e.g. zinc finger protein 264, electron-transfer-flavoprotein, beta polypeptide and Janus kinase 2) and 3,629 downregulated differentially co-expressed genes (e.g. adenylate cyclase 7 and single-stranded DNA binding protein 2) were identified, which were further mined to obtain 17 and eight modules separately. Functional annotation revealed that the largest upregulated module was primarily associated with acetylation and the largest downregulated module was mainly involved in mitochondrion. Moreover, 48 TFs and 62 miRNA families were screened for the 17 upregulated modules, such as E2F transcription factor 4, miR-30 and miR-135 regulating the largest module. Conversely, 35 TFs and 18 miRNA families were identified for the 8 downregulated modules, including mitochondrial ribosomal protein S12 and miR-23 regulating the largest module. Differentially co-expressed genes regulated by TFs and miRNAs may jointly contribute to the abnormal acetylation and mitochondrion presentation in the progression of VTE. PMID:27284300

  18. Characterization of Chemically Induced Liver Injuries Using Gene Co-Expression Modules

    PubMed Central

    Tawa, Gregory J.; AbdulHameed, Mohamed Diwan M.; Yu, Xueping; Kumar, Kamal; Ippolito, Danielle L.; Lewis, John A.; Stallings, Jonathan D.; Wallqvist, Anders

    2014-01-01

    Liver injuries due to ingestion or exposure to chemicals and industrial toxicants pose a serious health risk that may be hard to assess due to a lack of non-invasive diagnostic tests. Mapping chemical injuries to organ-specific damage and clinical outcomes via biomarkers or biomarker panels will provide the foundation for highly specific and robust diagnostic tests. Here, we have used DrugMatrix, a toxicogenomics database containing organ-specific gene expression data matched to dose-dependent chemical exposures and adverse clinical pathology assessments in Sprague Dawley rats, to identify groups of co-expressed genes (modules) specific to injury endpoints in the liver. We identified 78 such gene co-expression modules associated with 25 diverse injury endpoints categorized from clinical pathology, organ weight changes, and histopathology. Using gene expression data associated with an injury condition, we showed that these modules exhibited different patterns of activation characteristic of each injury. We further showed that specific module genes mapped to 1) known biochemical pathways associated with liver injuries and 2) clinically used diagnostic tests for liver fibrosis. As such, the gene modules have characteristics of both generalized and specific toxic response pathways. Using these results, we proposed three gene signature sets characteristic of liver fibrosis, steatosis, and general liver injury based on genes from the co-expression modules. Out of all 92 identified genes, 18 (20%) genes have well-documented relationships with liver disease, whereas the rest are novel and have not previously been associated with liver disease. In conclusion, identifying gene co-expression modules associated with chemically induced liver injuries aids in generating testable hypotheses and has the potential to identify putative biomarkers of adverse health effects. PMID:25226513

  19. Coexpression of two closely linked avian genes for purine nucleotide synthesis from a bidirectional promoter.

    PubMed Central

    Gavalas, A; Dixon, J E; Brayton, K A; Zalkin, H

    1993-01-01

    Two avian genes encoding essential steps in the purine nucleotide biosynthetic pathway are transcribed divergently from a bidirectional promoter element. The bidirectional promoter, embedded in a CpG island, directs coexpression of GPAT and AIRC genes from distinct transcriptional start sites 229 bp apart. The bidirectional promoter can be divided in half, with each half retaining partial activity towards the cognate gene. GPAT and AIRC genes encode the enzymes that catalyze step 1 and steps 6 plus 7, respectively, in the de novo purine biosynthetic pathway. This is the first report of genes coding for structurally unrelated enzymes of the same pathway that are tightly linked and transcribed divergently from a bidirectional promoter. This arrangement has the potential to provide for regulated coexpression comparable to that in a prokaryotic operon. PMID:8336716

  20. Use of transcriptomics and co-expression networks to analyze the interconnections between nitrogen assimilation and photorespiratory metabolism

    PubMed Central

    Pérez-Delgado, Carmen M.; Moyano, Tomás C.; García-Calderón, Margarita; Canales, Javier; Gutiérrez, Rodrigo A.; Márquez, Antonio J.; Betti, Marco

    2016-01-01

    Nitrogen is one of the most important nutrients for plants and, in natural soils, its availability is often a major limiting factor for plant growth. Here we examine the effect of different forms of nitrogen nutrition and of photorespiration on gene expression in the model legume Lotus japonicus with the aim of identifying regulatory candidate genes co-ordinating primary nitrogen assimilation and photorespiration. The transcriptomic changes produced by the use of different nitrogen sources in leaves of L. japonicus plants combined with the transcriptomic changes produced in the same tissue by different photorespiratory conditions were examined. The results obtained provide novel information on the possible role of plastidic glutamine synthetase in the response to different nitrogen sources and in the C/N balance of L. japonicus plants. The use of gene co-expression networks establishes a clear relationship between photorespiration and primary nitrogen assimilation and identifies possible transcription factors connected to the genes of both routes. PMID:27117340

  1. Use of transcriptomics and co-expression networks to analyze the interconnections between nitrogen assimilation and photorespiratory metabolism.

    PubMed

    Pérez-Delgado, Carmen M; Moyano, Tomás C; García-Calderón, Margarita; Canales, Javier; Gutiérrez, Rodrigo A; Márquez, Antonio J; Betti, Marco

    2016-05-01

    Nitrogen is one of the most important nutrients for plants and, in natural soils, its availability is often a major limiting factor for plant growth. Here we examine the effect of different forms of nitrogen nutrition and of photorespiration on gene expression in the model legume Lotus japonicus with the aim of identifying regulatory candidate genes co-ordinating primary nitrogen assimilation and photorespiration. The transcriptomic changes produced by the use of different nitrogen sources in leaves of L. japonicus plants combined with the transcriptomic changes produced in the same tissue by different photorespiratory conditions were examined. The results obtained provide novel information on the possible role of plastidic glutamine synthetase in the response to different nitrogen sources and in the C/N balance of L. japonicus plants. The use of gene co-expression networks establishes a clear relationship between photorespiration and primary nitrogen assimilation and identifies possible transcription factors connected to the genes of both routes. PMID:27117340

  2. ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression

    PubMed Central

    Aoki, Yuichi; Okamura, Yasunobu; Tadaka, Shu; Kinoshita, Kengo; Obayashi, Takeshi

    2016-01-01

    ATTED-II (http://atted.jp) is a coexpression database for plant species with parallel views of multiple coexpression data sets and network analysis tools. The user can efficiently find functional gene relationships and design experiments to identify gene functions by reverse genetics and general molecular biology techniques. Here, we report updates to ATTED-II (version 8.0), including new and updated coexpression data and analysis tools. ATTED-II now includes eight microarray- and six RNA sequencing-based coexpression data sets for seven dicot species (Arabidopsis, field mustard, soybean, barrel medick, poplar, tomato and grape) and two monocot species (rice and maize). Stand-alone coexpression analyses tend to have low reliability. Therefore, examining evolutionarily conserved coexpression is a more effective approach from the viewpoints of reliability and evolutionary importance. In contrast, the reliability of species-specific coexpression data remains poor. Our assessment scores for individual coexpression data sets indicated that the quality of the new coexpression data sets in ATTED-II is higher than for any previous coexpression data set. In addition, five species (Arabidopsis, soybean, tomato, rice and maize) in ATTED-II are now supported by both microarray- and RNA sequencing-based coexpression data, which has increased the reliability. Consequently, ATTED-II can now provide lineage-specific coexpression information. As an example of the use of ATTED-II to explore lineage-specific coexpression, we demonstrate monocot- and dicot-specific coexpression of cell wall genes. With the expanded coexpression data for multilevel evaluation, ATTED-II provides new opportunities to investigate lineage-specific evolution in plants. PMID:26546318

  3. Increased co-expression of genes harboring the damaging de novo mutations in Chinese schizophrenic patients during prenatal development

    PubMed Central

    Wang, Qiang; Li, Miaoxin; Yang, Zhenxing; Hu, Xun; Wu, Hei-Man; Ni, Peiyan; Ren, Hongyan; Deng, Wei; Li, Mingli; Ma, Xiaohong; Guo, Wanjun; Zhao, Liansheng; Wang, Yingcheng; Xiang, Bo; Lei, Wei; Sham, Pak C; Li, Tao

    2015-01-01

    Schizophrenia is a heritable, heterogeneous common psychiatric disorder. In this study, we evaluated the hypothesis that de novo variants (DNVs) contribute to the pathogenesis of schizophrenia. We performed exome sequencing in Chinese patients (N = 45) with schizophrenia and their unaffected parents (N = 90). Forty genes were found to contain DNVs. These genes had an enriched transcriptional co-expression profile in the prenatal frontal cortex (Bonferroni corrected p < 9.1 × 10^−3), and in prenatal temporal and parietal regions (Bonferroni corrected p < 0.03). Also, four prenatal anatomical subregions (VCF, MFC, OFC and ITC) showed significant enrichment of connectedness in co-expression networks. Moreover, four genes (LRP1, MACF1, DICER1 and ABCA2) harboring the damaging de novo mutations are strongly prioritized as susceptibility genes by multiple lines of evidence. Our findings in Chinese schizophrenic patients indicate the pathogenic role of DNVs, supporting the hypothesis that schizophrenia is a neurodevelopmental disease. PMID:26666178

  4. In silico prioritization based on coexpression can aid epileptic encephalopathy gene discovery

    PubMed Central

    Oliver, Karen L.; Lukic, Vesna; Freytag, Saskia; Scheffer, Ingrid E.; Berkovic, Samuel F.

    2016-01-01

    Objective: To evaluate the performance of an in silico prioritization approach that was applied to 179 epileptic encephalopathy candidate genes in 2013 and to expand the application of this approach to the whole genome based on expression data from the Allen Human Brain Atlas. Methods: PubMed searches determined which of the 179 epileptic encephalopathy candidate genes had been validated. For validated genes, it was noted whether they were 1 of the 19 of 179 candidates prioritized in 2013. The in silico prioritization approach was applied genome-wide; all genes were ranked according to their coexpression strength with a reference set (i.e., 51 established epileptic encephalopathy genes) in both adult and developing human brain expression data sets. Candidate genes ranked in the top 10% for both data sets were cross-referenced with genes previously implicated in the epileptic encephalopathies due to a de novo variant. Results: Five of 6 validated epileptic encephalopathy candidate genes were among the 19 prioritized in 2013 (odds ratio = 54, 95% confidence interval [7, ∞], p = 4.5 × 10^−5, Fisher exact test); one gene was false negative. A total of 297 genes ranked in the top 10% for both the adult and developing brain data sets based on coexpression with the reference set. Of these, 9 had been previously implicated in the epileptic encephalopathies (FBXO41, PLXNA1, ACOT4, PAK6, GABBR2, YWHAG, NBEA, KNDC1, and SELRC1). Conclusions: We conclude that brain gene coexpression data can be used to assist epileptic encephalopathy gene discovery and propose 9 genes as strong epileptic encephalopathy candidates worthy of further investigation. PMID:27066588
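
    The enrichment reported in the Results of this record can be re-derived from the counts given in the abstract (179 candidates, 19 prioritized, 6 validated, 5 in both groups). The following is a minimal sketch, not the authors' code: it assumes the reported test is the upper-tail Fisher exact test on the implied 2x2 table, and all variable names are illustrative. The p-value it computes is about 0.000045, in line with the abstract. The simple cross-product odds ratio comes out slightly higher than the reported 54, presumably because the authors used a conditional estimator; that explanation is an assumption.

```python
from math import comb


def hypergeom_pmf(k, total, marked, drawn):
    """P(exactly k marked items among `drawn` items sampled without replacement)."""
    return comb(marked, k) * comb(total - marked, drawn - k) / comb(total, drawn)


# Counts taken from the abstract.
total_candidates = 179  # candidate genes assessed in 2013
prioritized = 19        # candidates prioritized in 2013
validated = 6           # candidates validated since then
overlap = 5             # validated genes that had been prioritized

# 2x2 table implied by those counts.
a = overlap                       # prioritized and validated
b = prioritized - overlap         # prioritized, not validated
c = validated - overlap           # not prioritized, validated
d = total_candidates - a - b - c  # neither

# Upper-tail Fisher exact test: probability of an overlap at least this large
# if the validated genes were a random draw from the 179 candidates.
p_value = sum(
    hypergeom_pmf(k, total_candidates, prioritized, validated)
    for k in range(overlap, min(prioritized, validated) + 1)
)

# Sample (cross-product) odds ratio; not the conditional estimate.
sample_odds_ratio = (a * d) / (b * c)

print(f"2x2 table: [[{a}, {b}], [{c}, {d}]]")
print(f"p = {p_value:.2e}")
print(f"sample odds ratio = {sample_odds_ratio:.1f}")
```

    Only the standard library is used here, so the calculation can be checked without a statistics package.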

  5. Coexpression Network Analysis in Abdominal and Gluteal Adipose Tissue Reveals Regulatory Genetic Loci for Metabolic Syndrome and Related Phenotypes

    PubMed Central

    Min, Josine L.; Nicholson, George; Halgrimsdottir, Ingileif; Almstrup, Kristian; Petri, Andreas; Barrett, Amy; Travers, Mary; Rayner, Nigel W.; Mägi, Reedik; Pettersson, Fredrik H.; Broxholme, John; Neville, Matt J.; Wills, Quin F.; Cheeseman, Jane; Allen, Maxine; Holmes, Chris C.; Spector, Tim D.; Fleckner, Jan; McCarthy, Mark I.; Karpe, Fredrik; Lindgren, Cecilia M.; Zondervan, Krina T.

    2012-01-01

    Metabolic Syndrome (MetS) is highly prevalent and has considerable public health impact, but its underlying genetic factors remain elusive. To identify gene networks involved in MetS, we conducted whole-genome expression and genotype profiling on abdominal (ABD) and gluteal (GLU) adipose tissue, and whole blood (WB), from 29 MetS cases and 44 controls. Co-expression network analysis for each tissue independently identified nine, six, and zero MetS–associated modules of coexpressed genes in ABD, GLU, and WB, respectively. Of 8,992 probesets expressed in ABD or GLU, 685 (7.6%) were expressed in ABD and 51 (0.6%) in GLU only. Differential eigengene network analysis of 8,256 shared probesets detected 22 shared modules with high preservation across adipose depots (D_ABD-GLU = 0.89), seven of which were associated with MetS (FDR P < 0.01). The strongest associated module, significantly enriched for immune response–related processes, contained 94/620 (15%) genes with inter-depot differences. In an independent cohort of 145/141 twins with ABD and WB longitudinal expression data, median variability in ABD due to familiality was greater for MetS–associated versus un-associated modules (ABD: 0.48 versus 0.18, P = 0.08; GLU: 0.54 versus 0.20, P = 7.8 × 10^−4). Cis-eQTL analysis of probesets associated with MetS (FDR P < 0.01) and/or inter-depot differences (FDR P < 0.01) provided evidence for 32 eQTLs. Corresponding eSNPs were tested for association with MetS–related phenotypes in two GWAS of >100,000 individuals; rs10282458, affecting expression of RARRES2 (encoding chemerin), was associated with body mass index (BMI) (P = 6.0 × 10^−4); and rs2395185, affecting inter-depot differences of HLA-DRB1 expression, was associated with high-density lipoprotein (P = 8.7 × 10^−4) and BMI–adjusted waist-to-hip ratio (P = 2.4 × 10^−4). Since many genes and their interactions influence complex traits such as MetS, integrated analysis of genotypes and

  6. Computational gene expression profiling under salt stress reveals patterns of co-expression.

    PubMed

    Sanchita; Sharma, Ashok

    2016-03-01

    Plants respond differently to environmental conditions. Among various abiotic stresses, salt stress is a condition where excess salt in soil causes inhibition of plant growth. To understand the response of plants to the stress conditions, identification of the responsible genes is required. Clustering is a data mining technique used to group the genes with similar expression. The genes of a cluster show similar expression and function. We applied clustering algorithms on gene expression data of Solanum tuberosum showing differential expression in Capsicum annuum under salt stress. The clusters that were common to multiple algorithms were taken forward for analysis. Principal component analysis (PCA) further validated the findings of other cluster algorithms by visualizing their clusters in three-dimensional space. Functional annotation results revealed that most of the genes were involved in stress-related responses. Our findings suggest that these algorithms may be helpful in the prediction of the function of co-expressed genes. PMID:26981411

  7. Datasets of genes coexpressed with FBN1 in mouse adipose tissue and during human adipogenesis.

    PubMed

    Davis, Margaret R; Arner, Erik; Duffy, Cairnan R E; De Sousa, Paul A; Dahlman, Ingrid; Arner, Peter; Summers, Kim M

    2016-09-01

    This article contains data related to the research article entitled "Expression of FBN1 during adipogenesis: relevance to the lipodystrophy phenotype in Marfan syndrome and related conditions" [1]. The article concerns the expression of FBN1, the gene encoding the extracellular matrix protein fibrillin-1, during adipogenesis in vitro and in relation to adipose tissue in vivo. The encoded protein has recently been shown to produce a short glucogenic peptide hormone, (Romere et al., 2016) [2], and this gene is therefore a key gene for regulating blood glucose levels. FBN1 and coexpressed genes were examined in mouse strains and in human cells undergoing adipogenesis. The data show the genes that were coexpressed with FBN1, including genes coding for other connective tissue proteins and the proteases that modify them and for the transcription factors that control their expression. Data analysed were derived from datasets available in the public domain and the analysis highlights the utility of such datasets for ongoing analysis and hence reduction in the use of experimental animals. PMID:27508231

  8. Salmonid genomes have a remarkably expanded akirin family, coexpressed with genes from conserved pathways governing skeletal muscle growth and catabolism

    PubMed Central

    Kristjánsson, Bjarni K.; Johnston, Ian A.

    2010-01-01

    Metazoan akirin genes regulate innate immunity, myogenesis, and carcinogenesis. Invertebrates typically have one family member, while most tetrapod and teleost vertebrates have one to three. We demonstrate an expanded repertoire of eight family members in genomes of four salmonid fishes, owing to paralog preservation after three tetraploidization events. Retention of paralogs secondarily lost in other teleosts may be related to functional diversification and posttranslational regulation. We hypothesized that salmonid akirins would be transcriptionally regulated in fast-twitch skeletal muscle during activation of conserved pathways governing catabolism and growth. The in vivo nutritional state of Arctic charr (Salvelinus alpinus L.) was experimentally manipulated, and transcript levels for akirin family members and 26 other genes were measured by quantitative real-time PCR (qPCR), allowing the establishment of a similarity network of expression profiles. In fasted muscle, a class of akirins was upregulated, with one family member showing high coexpression with catabolic genes coding the NF-κB p65 subunit, E2 ubiquitin-conjugating enzymes, E3 ubiquitin ligases, and IGF-I receptors. Another class of akirin was upregulated with subsequent feeding, coexpressed with 14-3-3 protein genes. There was no similarity between expression profiles of akirins with IGF hormones or binding protein genes. The level of phylogenetic relatedness of akirin family members was not a strong predictor of transcriptional responses to nutritional state, or differences in transcript abundance levels, indicating a complex pattern of regulatory evolution. The salmonid akirins epitomize the complexity linking the genome to physiological phenotypes of vertebrates with a history of tetraploidization. PMID:20388840
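
    The "similarity network of expression profiles" mentioned in this record can be pictured with a small sketch. The abstract does not say which similarity measure or cutoff the authors used, so Pearson correlation, the 0.9 threshold, the gene labels and the expression values below are all assumptions chosen for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def similarity_network(profiles, threshold):
    """Return (gene1, gene2, r) edges for gene pairs with correlation >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if r >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical transcript levels across four nutritional states (not real data).
profiles = {
    "akirin_A": [1.0, 2.0, 3.0, 4.0],
    "catabolic_gene": [2.0, 4.1, 5.9, 8.0],
    "growth_gene": [4.0, 3.0, 2.0, 1.0],
}

edges = similarity_network(profiles, threshold=0.9)
for g1, g2, r in edges:
    print(f"{g1} -- {g2} (r = {r:.3f})")
```

    With these made-up values only the first two genes are linked, because the third profile runs in the opposite direction. A signed or absolute-value cutoff would be an equally reasonable design choice, depending on whether anti-correlated genes should be connected.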

  9. Salmonid genomes have a remarkably expanded akirin family, coexpressed with genes from conserved pathways governing skeletal muscle growth and catabolism.

    PubMed

    Macqueen, Daniel J; Kristjánsson, Bjarni K; Johnston, Ian A

    2010-06-01

    Metazoan akirin genes regulate innate immunity, myogenesis, and carcinogenesis. Invertebrates typically have one family member, while most tetrapod and teleost vertebrates have one to three. We demonstrate an expanded repertoire of eight family members in genomes of four salmonid fishes, owing to paralog preservation after three tetraploidization events. Retention of paralogs secondarily lost in other teleosts may be related to functional diversification and posttranslational regulation. We hypothesized that salmonid akirins would be transcriptionally regulated in fast-twitch skeletal muscle during activation of conserved pathways governing catabolism and growth. The in vivo nutritional state of Arctic charr (Salvelinus alpinus L.) was experimentally manipulated, and transcript levels for akirin family members and 26 other genes were measured by quantitative real-time PCR (qPCR), allowing the establishment of a similarity network of expression profiles. In fasted muscle, a class of akirins was upregulated, with one family member showing high coexpression with catabolic genes coding the NF-kappaB p65 subunit, E2 ubiquitin-conjugating enzymes, E3 ubiquitin ligases, and IGF-I receptors. Another class of akirin was upregulated with subsequent feeding, coexpressed with 14-3-3 protein genes. There was no similarity between expression profiles of akirins with IGF hormones or binding protein genes. The level of phylogenetic relatedness of akirin family members was not a strong predictor of transcriptional responses to nutritional state, or differences in transcript abundance levels, indicating a complex pattern of regulatory evolution. The salmonid akirins epitomize the complexity linking the genome to physiological phenotypes of vertebrates with a history of tetraploidization. PMID:20388840

  10. An expression quantitative trait loci-guided co-expression analysis for constructing regulatory network using a rice recombinant inbred line population.

    PubMed

    Wang, Jia; Yu, Huihui; Weng, Xiaoyu; Xie, Weibo; Xu, Caiguo; Li, Xianghua; Xiao, Jinghua; Zhang, Qifa

    2014-03-01

    The ability to reveal the regulatory architecture of genes at the whole-genome level by constructing a regulatory network is critical for understanding the biological processes and developmental programmes of organisms. Here, we conducted an eQTL-guided function-related co-expression analysis to identify the putative regulators and construct a gene regulatory network. We performed an eQTL analysis of 210 recombinant inbred lines (RILs) derived from a cross between two indica rice lines, Zhenshan 97 and Minghui 63, the parents of an elite hybrid, using data obtained by hybridizing RNA samples of flag leaves at the heading stage with Affymetrix whole-genome arrays. Making use of an ultrahigh-density single-nucleotide polymorphism bin map constructed by population sequencing, 13 647 eQTLs for 10 725 e-traits were detected, comprising 5079 cis-eQTLs (37.2%) and 8568 trans-eQTLs (62.8%). The analysis revealed 138 trans-eQTL hotspots, each of which apparently regulates the expression variations of many genes. Co-expression analysis of functionally related genes within the framework of regulator-target relationships outlined by the eQTLs led to the identification of putative regulators in the system. The usefulness of the strategy was demonstrated with the genes known to be involved in flowering. We also applied this strategy to the analysis of QTLs for yield traits, which also suggested likely candidate genes. eQTL-guided co-expression analysis may provide a promising solution for outlining a framework for the complex regulatory network of an organism. PMID:24420573

  11. MGMT enrichment and second gene co-expression in hematopoietic progenitor cells using separate or dual-gene lentiviral vectors.

    PubMed

    Roth, Justin C; Alberti, Michael O; Ismail, Mourad; Lingas, Karen T; Reese, Jane S; Gerson, Stanton L

    2015-01-22

    The DNA repair gene O(6)-methylguanine-DNA methyltransferase (MGMT) allows efficient in vivo enrichment of transduced hematopoietic stem cells (HSC). Thus, linking this selection strategy to therapeutic gene expression offers the potential to reconstitute diseased hematopoietic tissue with gene-corrected cells. However, different dual-gene expression vector strategies are limited by poor expression of one or both transgenes. To evaluate different co-expression strategies in the context of MGMT-mediated HSC enrichment, we compared selection and expression efficacies in cells cotransduced with separate single-gene MGMT and GFP lentivectors to those obtained with dual-gene vectors employing either encephalomyocarditis virus (EMCV) internal ribosome entry site (IRES) or foot and mouth disease virus (FMDV) 2A elements for co-expression strategies. Each strategy was evaluated in vitro and in vivo using equivalent multiplicities of infection (MOI) to transduce 5-fluorouracil (5-FU) or Lin(-)Sca-1(+)c-kit(+) (LSK)-enriched murine bone marrow cells (BMCs). The highest dual-gene expression (MGMT(+)GFP(+)) percentages were obtained with the FMDV-2A dual-gene vector, but half of the resulting gene products existed as fusion proteins. Following selection, dual-gene expression percentages in single-gene vector cotransduced and dual-gene vector transduced populations were similar. Equivalent MGMT expression levels were obtained with each strategy, but GFP expression levels derived from the IRES dual-gene vector were significantly lower. In mice, vector-insertion averages were similar among cells enriched after dual-gene vectors and those cotransduced with single-gene vectors. These data demonstrate the limitations and advantages of each strategy in the context of MGMT-mediated selection, and may provide insights into vector design with respect to a particular therapeutic gene or hematologic defect. PMID:25479595

  12. MIrExpress: A Database for Gene Coexpression Correlation in Immune Cells Based on Mutual Information and Pearson Correlation

    PubMed Central

    Wang, Luman; Mo, Qiaochu; Wang, Jianxin

    2015-01-01

    Most current gene coexpression databases support analysis of linear correlation between gene pairs but not of nonlinear correlation, which hinders precise evaluation of gene-gene coexpression strength. Here, we report a new database, MIrExpress, which uses information theory as well as the Pearson linear correlation method to measure the linear correlation, nonlinear correlation, and their hybrid for cell-specific gene coexpression in immune cells. For a given gene pair or probe set pair input by web users, both mutual information (MI) and the Pearson correlation coefficient (r) are calculated, and several corresponding values are reported to reflect the nature of the coexpression correlation, including the MI and r values, their respective rank orderings, their rank comparison, and their hybrid correlation value. Furthermore, for a given gene, the top 10 most relevant genes are displayed from the MI, r, or hybrid perspective. Currently, the database includes a total of 16 human cell groups, involving 20,283 human genes. The expression data and the calculated correlation results from the database are interactively accessible on the web page and can be used in other related applications and research. PMID:26881263
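
    The contrast between the two measures in the MIrExpress record can be illustrated with a small sketch. The abstract does not say which MI estimator or hybrid score the database uses, so the equal-width histogram estimator, the bin count, and the function names below are assumptions chosen only for illustration.

```python
import math
from collections import Counter

def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def _bin_indices(values, bins):
    """Assign each value to one of `bins` equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant profile
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=4):
    """Plug-in histogram estimate of mutual information, in bits."""
    bx, by = _bin_indices(x, bins), _bin_indices(y, bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum((c / n) * math.log2(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())

# Hypothetical expression profiles: y depends on x, but not linearly.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]
print("r  =", pearson_r(x, y))           # zero by symmetry
print("MI =", mutual_information(x, y))  # positive: the dependence is detected
```

    In this toy case r vanishes while the MI estimate stays positive, which is the kind of nonlinear relationship a purely linear measure would miss. Histogram MI estimates are biased for small samples, so real use would need more data points and a considered binning choice.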

  14. Co-Expression Analysis of Fetal Weight-Related Genes in Ovine Skeletal Muscle during Mid and Late Fetal Development Stages

    PubMed Central

    Xu, Lingyang; Zhao, Fuping; Ren, Hangxing; Li, Li; Lu, Jian; Liu, Jiasen; Zhang, Shifang; Liu, George E.; Song, Jiuzhou; Zhang, Li; Wei, Caihong; Du, Lixin

    2014-01-01

    Background: Muscle development and lipid metabolism play important roles during fetal development stages. The commercial Texel sheep are more muscular than the indigenous Ujumqin sheep. Results: We performed serial transcriptomics assays and systems biology analyses to investigate the dynamics of gene expression changes associated with fetal longissimus muscles during different fetal stages in two sheep breeds. In total, we identified 1472 differentially expressed genes during various fetal stages using time-series expression analysis. A systems biology approach, weighted gene co-expression network analysis (WGCNA), was used to detect modules of correlated genes among these 1472 genes. Dramatically different gene modules were identified in four merged datasets, corresponding to the mid fetal stage in Texel and Ujumqin sheep and the late fetal stage in Texel and Ujumqin sheep. We further detected gene modules significantly correlated with fetal weight, and constructed networks and pathways using genes with high significance. In these gene modules, we identified genes such as TADA3, LMNB1, TGF-β3, EEF1A2, FGFR1, MYOZ1, and FBP2 that correlated with fetal weight. Conclusion: Our study revealed the complex network characteristics involved in muscle development and lipid metabolism during fetal development stages. Diverse patterns of the network connections observed between breeds and fetal stages could involve some hub genes, which play central roles in fetal development, correlating with fetal weight. Our findings could provide potentially valuable biomarkers for selection of body weight-related traits in sheep and other livestock. PMID:25285036
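
    For readers unfamiliar with WGCNA, its core step (an unsigned soft-threshold adjacency) can be sketched as follows. The abstract does not report the soft-thresholding power or network type used for the sheep data, so `beta=6`, the toy profiles, and the function names are assumptions for illustration only.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(x_i, x_j)| ** beta.

    `beta` is an assumed power; in practice it is tuned per dataset.
    """
    genes = list(profiles)
    return {(g, h): abs(pearson(profiles[g], profiles[h])) ** beta
            for g in genes for h in genes if g != h}

def connectivity(adj, gene):
    """Sum of a gene's edge weights; high values suggest hub-like genes."""
    return sum(w for (g, _), w in adj.items() if g == gene)

# Hypothetical profiles for three genes across four samples.
profiles = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 9],   # tracks geneA closely
    "geneC": [3, 1, 4, 2],   # uncorrelated with geneA
}
adj = soft_threshold_adjacency(profiles)
for gene in profiles:
    print(gene, round(connectivity(adj, gene), 3))
```

    Raising the absolute correlation to a power keeps strong correlations close to 1 while pushing weak ones toward 0, so no hard cutoff is needed. Module detection in WGCNA then proceeds by clustering on a dissimilarity derived from this adjacency, which this sketch does not cover.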

  15. Modified Logistic Regression Models Using Gene Coexpression and Clinical Features to Predict Prostate Cancer Progression

    PubMed Central

    Zhao, Hongya; Logothetis, Christopher J.; Gorlov, Ivan P.; Zeng, Jia; Dai, Jianguo

    2013-01-01

    Predicting disease progression is one of the most challenging problems in prostate cancer research. Adding gene expression data to prediction models that are based on clinical features has been proposed to improve accuracy. In the current study, we applied a logistic regression (LR) model combining clinical features and gene co-expression data to improve the accuracy of the prediction of prostate cancer progression. The top-scoring pair (TSP) method was used to select genes for the model. The proposed models not only preserved the basic properties of the TSP algorithm but also incorporated the clinical features into the prognostic models. Based on the statistical inference with the iterative cross validation, we demonstrated that prediction LR models that included genes selected by the TSP method provided better predictions of prostate cancer progression than those using clinical variables only and/or those that included genes selected by the one-gene-at-a-time approach. Thus, we conclude that TSP selection is a useful tool for feature (and/or gene) selection to use in prognostic models and our model also provides an alternative for predicting prostate cancer progression. PMID:24367394
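
    The top-scoring pair (TSP) selection step mentioned in this record can be sketched in a few lines. This covers only the standard TSP score; how the authors combine the selected pair with clinical features in the logistic regression model is not described in the abstract, and the toy data and function names are hypothetical.

```python
from itertools import combinations

def tsp_score(samples, labels, i, j):
    """|P(x_i < x_j | class 1) - P(x_i < x_j | class 0)| for genes i and j."""
    def freq(cls):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        return sum(r[i] < r[j] for r in rows) / len(rows)
    return abs(freq(1) - freq(0))

def top_scoring_pair(samples, labels):
    """Return the gene index pair whose relative ordering best separates classes."""
    n_genes = len(samples[0])
    return max(combinations(range(n_genes), 2),
               key=lambda pair: tsp_score(samples, labels, *pair))

# Hypothetical data: rows are patients, columns are three genes.
samples = [
    [1, 5, 3], [2, 6, 1], [1, 4, 9],   # class 0 (no progression)
    [7, 2, 3], [8, 3, 9], [6, 1, 2],   # class 1 (progression)
]
labels = [0, 0, 0, 1, 1, 1]

best = top_scoring_pair(samples, labels)
print(best, tsp_score(samples, labels, *best))
```

    Because the score depends only on the within-sample ordering of two genes, it is insensitive to monotone normalization of the expression values, which is one reason rank-based pairs are attractive as model features.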

  16. Module Based Differential Coexpression Analysis Method for Type 2 Diabetes

    PubMed Central

    Yuan, Lin; Zheng, Chun-Hou; Xia, Jun-Feng; Huang, De-Shuang

    2015-01-01

    A growing number of studies have shown that many complex diseases arise from the joint alteration of numerous genes. Genes often act together as a functional biological pathway or network and are highly correlated. Differential coexpression analysis, a more comprehensive technique than differential expression analysis, was introduced to study the gene regulatory networks and biological pathways underlying phenotypic changes by measuring changes in gene correlation between disease and normal conditions. In this paper, we propose a gene differential coexpression analysis algorithm at the level of gene sets and apply the algorithm to a publicly available type 2 diabetes (T2D) expression dataset. First, we calculate coexpression biweight midcorrelation coefficients between all gene pairs. Then, we select informative correlation pairs using the “differential coexpression threshold” strategy. Finally, we identify the differential coexpression gene modules using the maximum clique concept and the k-clique algorithm. We apply the proposed differential coexpression analysis method to simulated data and T2D data. Two differential coexpression gene modules related to T2D were detected, which should be useful for exploring the biological function of the related genes. PMID:26339648
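
    The first step of this pipeline, the biweight midcorrelation, can be sketched as below. The formula follows the usual definition (median-centred values, weights based on 9 times the raw median absolute deviation); the final absolute-difference comparison is only a stand-in, since the abstract does not spell out the "differential coexpression threshold" strategy. Function names and data are hypothetical.

```python
import math
from statistics import median

def _bicor_terms(x):
    """Median-centred, biweight-weighted, unit-normalised terms for one profile."""
    med = median(x)
    mad = median(abs(v - med) for v in x)  # raw median absolute deviation
    if mad == 0:
        raise ValueError("MAD is zero; bicor is undefined for this profile")
    terms = []
    for v in x:
        u = (v - med) / (9 * mad)
        w = (1 - u * u) ** 2 if abs(u) < 1 else 0.0  # outliers get weight 0
        terms.append((v - med) * w)
    norm = math.sqrt(sum(t * t for t in terms))
    return [t / norm for t in terms]

def bicor(x, y):
    """Biweight midcorrelation: a median-based, outlier-resistant correlation."""
    return sum(a * b for a, b in zip(_bicor_terms(x), _bicor_terms(y)))

def correlation_change(disease_x, disease_y, normal_x, normal_y):
    """Assumed stand-in for a differential coexpression measure for one gene pair."""
    return abs(bicor(disease_x, disease_y) - bicor(normal_x, normal_y))

# Hypothetical profiles: the last disease sample of gene 1 is an outlier.
normal_g1, normal_g2 = [1, 2, 3, 4, 5], [2, 4, 6, 8, 10]
disease_g1, disease_g2 = [1, 2, 3, 4, 100], [1, 2, 3, 4, 5]
print(bicor(normal_g1, normal_g2))
print(bicor(disease_g1, disease_g2))
print(correlation_change(disease_g1, disease_g2, normal_g1, normal_g2))
```

    Gene pairs whose change exceeds a chosen cutoff would form the edges of a graph on which clique-based module detection could then run; that later stage is not sketched here.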

  17. Novel role of ZmaNAC36 in co-expression of starch synthetic genes in maize endosperm.

    PubMed

    Zhang, Junjie; Chen, Jiang; Yi, Qiang; Hu, Yufeng; Liu, Hanmei; Liu, Yinghong; Huang, Yubi

    2014-02-01

    Starch is an essential commodity that is widely used as food, feed, fuel and in industry. However, its mechanism of synthesis is not fully understood, especially in terms of the expression and regulation of the starch synthetic genes. It was reported that the starch synthetic genes were co-expressed during maize endosperm development; however, the mechanism of the co-expression was not reported. In this paper, the ZmaNAC36 gene was amplified by homology-based cloning, and its expression vector was constructed for transient expression. The nuclear localization, transcriptional activation and target sites of the ZmaNAC36 protein were identified. The expression profile of ZmaNAC36 showed that it was strongly expressed in the maize endosperm and was co-expressed with most of the starch synthetic genes. Moreover, the expressions of many starch synthesis genes in the endosperm were upregulated when ZmaNAC36 was transiently overexpressed. All our results indicated that NAC36 might be a transcription factor and play a potential role in the co-expression of starch synthetic genes in the maize endosperm. PMID:24235061

  18. Mining Temporal Protein Complex Based on the Dynamic PIN Weighted with Connected Affinity and Gene Co-Expression

    PubMed Central

    Shen, Xianjun; Jiang, Xingpeng; He, Tingting; Hu, Xiaohua; Yang, Jincai

    2016-01-01

    The identification of temporal protein complexes would make great contribution to our knowledge of the dynamic organization characteristics in protein interaction networks (PINs). Recent studies have focused on integrating gene expression data into static PIN to construct dynamic PIN which reveals the dynamic evolutionary procedure of protein interactions, but they fail in practice for recognizing the active time points of proteins with low or high expression levels. We construct a Time-Evolving PIN (TEPIN) with a novel method called Deviation Degree, which is designed to identify the active time points of proteins based on the deviation degree of their own expression values. Owing to the differences between protein interactions, moreover, we weight TEPIN with connected affinity and gene co-expression to quantify the degree of these interactions. To validate the efficiencies of our methods, ClusterONE, CAMSE and MCL algorithms are applied on the TEPIN, DPIN (a dynamic PIN constructed with state-of-the-art three-sigma method) and SPIN (the original static PIN) to detect temporal protein complexes. Each algorithm on our TEPIN outperforms that on other networks in terms of match degree, sensitivity, specificity, F-measure and function enrichment etc. In conclusion, our Deviation Degree method successfully eliminates the disadvantages which exist in the previous state-of-the-art dynamic PIN construction methods. Moreover, the biological nature of protein interactions can be well described in our weighted network. Weighted TEPIN is a useful approach for detecting temporal protein complexes and revealing the dynamic protein assembly process for cellular organization. PMID:27100396

  19. Gene co-expression analysis identifies brain regions and cell types involved in migraine pathophysiology: a GWAS-based study using the Allen Human Brain Atlas.

    PubMed

    Eising, Else; Huisman, Sjoerd M H; Mahfouz, Ahmed; Vijfhuizen, Lisanne S; Anttila, Verneri; Winsvold, Bendik S; Kurth, Tobias; Ikram, M Arfan; Freilinger, Tobias; Kaprio, Jaakko; Boomsma, Dorret I; van Duijn, Cornelia M; Järvelin, Marjo-Riitta R; Zwart, John-Anker; Quaye, Lydia; Strachan, David P; Kubisch, Christian; Dichgans, Martin; Davey Smith, George; Stefansson, Kari; Palotie, Aarno; Chasman, Daniel I; Ferrari, Michel D; Terwindt, Gisela M; de Vries, Boukje; Nyholt, Dale R; Lelieveldt, Boudewijn P F; van den Maagdenberg, Arn M J M; Reinders, Marcel J T

    2016-04-01

    Migraine is a common disabling neurovascular brain disorder typically characterised by attacks of severe headache and associated with autonomic and neurological symptoms. Migraine is caused by an interplay of genetic and environmental factors. Genome-wide association studies (GWAS) have identified over a dozen genetic loci associated with migraine. Here, we integrated migraine GWAS data with high-resolution spatial gene expression data of normal adult brains from the Allen Human Brain Atlas to identify specific brain regions and molecular pathways that are possibly involved in migraine pathophysiology. To this end, we used two complementary methods. In GWAS data from 23,285 migraine cases and 95,425 controls, we first studied modules of co-expressed genes that were calculated based on human brain expression data for enrichment of genes that showed association with migraine. Enrichment of a migraine GWAS signal was found for five modules that suggest involvement in migraine pathophysiology of: (i) neurotransmission, protein catabolism and mitochondria in the cortex; (ii) transcription regulation in the cortex and cerebellum; and (iii) oligodendrocytes and mitochondria in subcortical areas. Second, we used the high-confidence genes from the migraine GWAS as a basis to construct local migraine-related co-expression gene networks. Signatures of all brain regions and pathways that were prominent in the first method also surfaced in the second method, thus providing support that these brain regions and pathways are indeed involved in migraine pathophysiology. PMID:26899160

  20. Multi-tissue analysis of co-expression networks by higher-order generalized singular value decomposition identifies functionally coherent transcriptional modules.

    PubMed

    Xiao, Xiaolin; Moreno-Moral, Aida; Rotival, Maxime; Bottolo, Leonardo; Petretto, Enrico

    2014-01-01

    Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data in multiple conditions (e.g., cell-types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet, genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally-efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network-modules simultaneously across multiple conditions. Our method can accommodate weighted (and unweighted) networks of any size and can similarly use co-expression or raw gene expression input data, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, we demonstrated distinctive advantages of our method over existing methods, which was able to recover accurately both common and condition-specific network-modules without entailing ad-hoc input parameters as required by other approaches. We applied our method to genome-scale and multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance, which were not detected by other methods. In humans we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interactions processes in ventricular zones during neocortex expansion and further, we uncovered pathways related to development of later cognitive functions in the cortical plate of the developing brain which were previously unappreciated. 
Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed

  2. Co-expression networks revealed potential core lncRNAs in the triple-negative breast cancer.

    PubMed

    Yang, Fan; Liu, Ye-Huan; Dong, Si-Yang; Yao, Zhi-Han; Lv, Lin; Ma, Rui-Min; Dai, Xuan-Xuan; Wang, Jiao; Zhang, Xiao-Hua; Wang, Ou-Chen

    2016-10-15

    Triple-negative breast cancer (TNBC) is an aggressive type of breast cancer with unfavorable outcome. It is urgent to explore novel biomarkers and potential therapeutic targets in this malignancy. Increasing knowledge of long noncoding RNAs (lncRNAs) significantly deepens our understanding of cancer biology. Here, we sequenced eight paired TNBC tumor tissues and non-cancerous tissues, and validated significantly differentially expressed lncRNAs. Gene ontology (GO) and pathway analysis were used to investigate the function of differentially expressed mRNAs. Further, potential core lncRNAs in TNBC were identified by co-expression networks. Kaplan-Meier analysis also indicated that breast cancer patients with lower expression level of rhabdomyosarcoma 2 associated transcript (RMST), one of the potential core lncRNAs, had worse overall survival. To the best of our knowledge, it was the first report that RMST was involved in breast cancer. Our research provided a rich resource to the research community for further investigating lncRNAs functions and identifying lncRNAs with diagnostic and therapeutic potentials in TNBC. PMID:27380926

  3. Gene Coexpression and Evolutionary Conservation Analysis of the Human Preimplantation Embryos.

    PubMed

    Liu, Tiancheng; Yu, Lin; Ding, Guohui; Wang, Zhen; Liu, Lei; Li, Hong; Li, Yixue

    2015-01-01

    Evolutionary developmental biology (EVO-DEVO) tries to decode evolutionary constraints on the stages of embryonic development. Two models--the "funnel-like" model and the "hourglass" model--have been proposed by investigators to illustrate the fluctuation of selective pressure on these stages. However, selective indices of stages corresponding to mammalian preimplantation embryonic development (PED) were undetected in previous studies. Based on single cell RNA sequencing of stages during human PED, we used a coexpression method to identify gene modules activated in each of these stages. By measuring the evolutionary indices of gene modules belonging to each stage, we observed the pattern of change in selective constraints on PED for the first time. The selective pressure decreases from the zygote stage to the 4-cell stage, increases at the 8-cell stage, and then decreases again from the 8-cell stage to the late blastocyst stages. Previous EVO-DEVO studies concerning whole embryo development neglected the fluctuation of selective pressure in these earlier stages, and the fluctuation was potentially correlated with events of earlier stages, such as zygote genome activation (ZGA). Such oscillation in an earlier stage would further affect models of the evolutionary constraints on whole embryo development. Therefore, these earlier stages should be measured intensively in future EVO-DEVO studies. PMID:26273607

  4. Influence networks based on coexpression improve drug target discovery for the development of novel cancer therapeutics

    PubMed Central

    2014-01-01

    Background: The demand for novel molecularly targeted drugs will continue to rise as we move forward toward the goal of personalizing cancer treatment to the molecular signature of individual tumors. However, the identification of targets and combinations of targets that can be safely and effectively modulated is one of the greatest challenges facing the drug discovery process. A promising approach is to use biological networks to prioritize targets based on their relative positions to one another, a property that affects their ability to maintain network integrity and propagate information-flow. Here, we introduce influence networks and demonstrate how they can be used to generate influence scores as a network-based metric to rank genes as potential drug targets. Results: We use this approach to prioritize genes as drug target candidates in a set of ER+ breast tumor samples collected during the course of neoadjuvant treatment with the aromatase inhibitor letrozole. We show that influential genes, those with high influence scores, tend to be essential and include a higher proportion of essential genes than those prioritized based on their position (i.e. hubs or bottlenecks) within the same network. Additionally, we show that influential genes represent novel biologically relevant drug targets for the treatment of ER+ breast cancers. Moreover, we demonstrate that gene influence differs between untreated tumors and residual tumors that have adapted to drug treatment. In this way, influence scores capture the context-dependent functions of genes and present the opportunity to design combination treatment strategies that take advantage of the tumor adaptation process. Conclusions: Influence networks efficiently find essential genes as promising drug targets and combinations of targets to inform the development of molecularly targeted drugs and their use. PMID:24495353

  5. [Construction of recombinant adenovirus co-expressing M1 and HA genes of influenza virus type A].

    PubMed

    Guo, Jian-Qiang; Yao, Li-Hong; Chen, Ai-Jun; Xu, Yi; Jia, Run-Qing; Bo, Hong; Dong, Jie; Zhou, Jian-Fang; Shu, Yue-Long; Zhang, Zhi-Qing

    2009-03-01

    Based on the human H5N1 influenza virus strain A/Anhui/1/2005, recombinant adenovirus co-expressing M1 and HA genes of H5N1 influenza virus was constructed using an internal ribosome entry site (IRES) sequence to link the two genes. The M1 and HA genes of H5N1 influenza virus were amplified by PCR and subcloned into pStar vector separately. Then the M1-IRES-HA fragment was amplified and subcloned into pShuttle-CMV vector, the shuttle plasmid was then linearized and transformed into BJ5183 bacteria which contained backbone vector pAd-Easy. The recombinant vector pAd-Easy was packaged in 293 cells to get recombinant adenovirus Ad-M1/HA. CPE was observed after 293 cells were transfected by Ad-M1/HA. The co-expression of M1 and HA genes was confirmed by Western-blot and IFA (immunofluorescence assay). The IRES containing recombinant adenovirus allowed functional co-expression of M1 and HA genes and provided the foundation for developing new influenza vaccines with adenoviral vector. PMID:19678564

  6. Bioinformatics Data Mining Approach Suggests Coexpression of AGTPBP1 with an ALS-linked Gene C9orf72

    PubMed Central

    Kitano, Shouta; Kino, Yoshihiro; Yamamoto, Yoji; Takitani, Mika; Miyoshi, Junko; Ishida, Tsuyoshi; Saito, Yuko; Arima, Kunimasa; Satoh, Jun-ichi

    2015-01-01

    BACKGROUND: Expanded GGGGCC hexanucleotide repeats located in the noncoding region of the chromosome 9 open reading frame 72 (C9orf72) gene represent the most common genetic abnormality for familial and sporadic amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD). Formation of nuclear RNA foci, accumulation of repeat-associated non-ATG-translated dipeptide-repeat proteins, and haploinsufficiency of C9orf72 are proposed as pathological mechanisms of C9ALS/FTD. However, at present, the physiological function of C9orf72 remains largely unknown. METHODS: By searching a bioinformatics database named COXPRESdb, composed of comprehensive gene coexpression data, we studied potential C9orf72 interactors. RESULTS: We identified the ATP/GTP binding protein 1 (AGTPBP1) gene, alternatively named NNA1, encoding a cytosolic carboxypeptidase whose mutation is causative of the degeneration of Purkinje cells and motor neurons, as the most significant gene coexpressed with C9orf72. We verified coexpression and interaction of AGTPBP1 and C9orf72 in transfected cells by immunoprecipitation and in neurons of the human brain by double-labeling immunohistochemistry. Furthermore, we found a positive correlation between AGTPBP1 and C9orf72 mRNA expression levels in the set of 21 human brains examined. CONCLUSIONS: These results suggest that AGTPBP1 serves as a C9orf72 interacting partner that plays a role in the regulation of neuronal function in a coordinated manner within the central nervous system. PMID:26106267

  7. α-skeletal and α-cardiac actin genes are coexpressed in adult human skeletal muscle and heart

    SciTech Connect

    Gunning, P.; Ponte, P.; Blau, H.; Kedes, L.

    1983-11-01

    The authors determined the actin isotypes encoded by 30 actin cDNA clones previously isolated from an adult human muscle cDNA library. Using 3' untranslated region probes, derived from α-skeletal, β- and γ-actin cDNAs and from an α-cardiac actin genomic clone, they showed that 28 of the cDNAs correspond to α-skeletal actin transcripts. Unexpectedly, however, the remaining two cDNA clones proved to derive from α-cardiac actin mRNA. Sequence analysis confirmed that the two skeletal muscle α-cardiac actin cDNAs are derived from transcripts of the cloned α-cardiac actin gene. Comparison of total actin mRNA levels in adult skeletal muscle and adult heart revealed that the steady-state levels in skeletal muscle are about twofold greater, per microgram of total cellular RNA, than those in heart. Thus, in skeletal muscle and in heart, both of the sarcomeric actin mRNA isotypes are quite abundant transcripts. They conclude that α-skeletal and α-cardiac actin genes are coexpressed as an actin pair in human adult striated muscles. Since the smooth-muscle actins (aortic and stomach) and the cytoplasmic actins (β and γ) are known to be coexpressed in smooth muscle and nonmuscle cells, respectively, they postulate that coexpression of actin pairs may be a common feature of mammalian actin gene expression in all tissues.

  8. Coexpression Network Analysis of Benign and Malignant Phenotypes of SIV-Infected Sooty Mangabey and Rhesus Macaque

    PubMed Central

    Silvestri, Guido; Bosinger, Steven E.; Li, Bai-Lian; Jong, Ambrose; Zhou, Yan-Hong; Huang, Sheng-He

    2016-01-01

    To explore the differences between the extreme SIV infection phenotypes, nonprogression (BEN: benign) to AIDS in sooty mangabeys (SMs) and progression to AIDS (MAL: malignant) in rhesus macaques (RMs), we performed an integrated dual positive-negative connectivity (DPNC) analysis of gene coexpression networks (GCN) based on publicly available big data sets in the GEO database of NCBI. The microarray-based gene expression data sets were generated, respectively, from the peripheral blood of SMs and RMs at several time points of SIV infection. Significant differences of GCN changes in DPNC values were observed in SIV-infected SMs and RMs. There are three groups of enriched genes or pathways (EGPs) that are associated with three SIV infection phenotypes (BEN+, MAL+ and mixed BEN+/MAL+). The MAL+ phenotype in SIV-infected RMs is specifically associated with eight EGPs, including the protein ubiquitin proteasome system, p53, granzyme A, granzyme B, polo-like kinase, glucocorticoid receptor, oxidative phosphorylation and mitochondrial signaling. Mitochondrial (endosymbiotic) dysfunction is solely present in RMs. Specific BEN+ pattern changes in four EGPs are identified in SIV-infected SMs, including the pathways contributing to interferon signaling, BRCA1/DNA damage response, PKR/INF induction and LGALS8. There are three enriched pathways (PRR-activated IRF signaling, RIG1-like receptor and PRR pathway) contributing to the mixed (BEN+/MAL+) phenotypes of SIV infections in RMs and SMs, suggesting that these pathways play a dual role in the host defense against viral infections. Further analysis of hub genes in these GCNs revealed that the genes LGALS8 and IL-17RA, which positively regulate the barrier function of the gut mucosa and the immune homeostasis with the gut microbiota (exosymbiosis), were significantly differentially expressed in RMs and SMs. Our data suggest that there exists an exo- (dysbiosis of the gut microbiota) and endo- (mitochondrial dysfunction

  9. Enhanced production of shikimic acid using a multi-gene co-expression system in Escherichia coli.

    PubMed

    Liu, Xiang-Lei; Lin, Jun; Hu, Hai-Feng; Zhou, Bin; Zhu, Bao-Quan

    2016-04-01

    Shikimic acid (SA) is the key synthetic material for the chemical synthesis of Oseltamivir, which is prescribed as the front-line treatment for serious cases of influenza. Multi-gene expression vectors can be used to express several genes from one plasmid, so they are widely applied to increase the yield of metabolites. In the present study, on the basis of a shikimate kinase genetic defect strain Escherichia coli BL21 (ΔaroL/aroK, DE3), the key enzyme genes aroG, aroB, tktA and aroE of the SA pathway were co-expressed and compared systematically by constructing a series of multi-gene expression vectors. The results showed that different gene co-expression combinations (two, three or four genes) or gene orders had different effects on the production of SA. SA production of the recombinant BL21-GBAE reached 886.38 mg·L⁻¹, which was 17-fold (P < 0.05) that of the parent strain BL21 (ΔaroL/aroK, DE3). PMID:27114316

  10. GENE EXPRESSION NETWORKS

    EPA Science Inventory

    "Gene expression network" is the term used to describe the interplay, simple or complex, between two or more gene products in performing a specific cellular function. Although the delineation of such networks is complicated by the existence of multiple and subtle types of intera...

  11. Identifying Gene Interaction Networks

    PubMed Central

    Bebek, Gurkan

    2016-01-01

    In this chapter, we introduce interaction networks by describing how they are generated, where they are stored, and how they are shared. We focus on publicly available interaction networks and describe a simple way of utilizing these resources. As a case study, we used Cytoscape, an open source and easy-to-use network visualization and analysis tool to first gather and visualize a small network. We have analyzed this network’s topological features and have looked at functional enrichment of the network nodes by integrating the gene ontology database. The methods described are applicable to larger networks that can be collected from various resources. PMID:22307715

  12. Coexpression of Nuclear Receptors and Histone Methylation Modifying Genes in the Testis: Implications for Endocrine Disruptor Modes of Action

    PubMed Central

    Anderson, Alison M.; Carter, Kim W.; Anderson, Denise; Wise, Michael J.

    2012-01-01

    Background Endocrine disruptor chemicals elicit adverse health effects by perturbing nuclear receptor signalling systems. It has been speculated that these compounds may also perturb epigenetic mechanisms and thus contribute to the early origin of adult onset disease. We hypothesised that histone methylation may be a component of the epigenome that is susceptible to perturbation. We used coexpression analysis of publicly available data to investigate the combinatorial actions of nuclear receptors and genes involved in histone methylation in normal testis and when faced with endocrine disruptor compounds. Methodology/Principal Findings The expression patterns of a set of genes were profiled across testis tissue in human, rat and mouse, plus control and exposed samples from four toxicity experiments in the rat. Our results indicate that histone methylation events are a more general component of nuclear receptor mediated transcriptional regulation in the testis than previously appreciated. Coexpression patterns support the role of a gatekeeper mechanism involving the histone methylation modifiers Kdm1, Prdm2, and Ehmt1 and indicate that this mechanism is a common determinant of transcriptional integrity for genes critical to diverse physiological endpoints relevant to endocrine disruption. Coexpression patterns following exposure to vinclozolin and dibutyl phthalate suggest that coactivity of the demethylase Kdm1 in particular warrants further investigation in relation to endocrine disruptor mode of action. Conclusions/Significance This study provides proof of concept that a bioinformatics approach that profiles genes related to a specific hypothesis across multiple biological settings can provide powerful insight into coregulatory activity that would be difficult to discern at an individual experiment level or by traditional differential expression analysis methods. PMID:22496781

  13. Genes and gene networks implicated in aggression related behaviour.

    PubMed

    Malki, Karim; Pain, Oliver; Du Rietz, Ebba; Tosto, Maria Grazia; Paya-Cano, Jose; Sandnabba, Kenneth N; de Boer, Sietse; Schalkwyk, Leonard C; Sluyter, Frans

    2014-10-01

    Aggressive behaviour is a major cause of mortality and morbidity. Despite moderate heritability estimates, progress in identifying the genetic factors underlying aggressive behaviour has been limited. There are currently three genetic mouse models of high and low aggression created using selective breeding. This is the first study to offer a global transcriptomic characterization of the prefrontal cortex across all three genetic mouse models of aggression. A systems biology approach has been applied to transcriptomic data across the three pairs of selected inbred mouse strains (Turku Aggressive (TA) and Turku Non-Aggressive (TNA), Short Attack Latency (SAL) and Long Attack Latency (LAL) mice and North Carolina Aggressive (NC900) and North Carolina Non-Aggressive (NC100)), providing novel insight into the neurobiological mechanisms and genetics underlying aggression. First, weighted gene co-expression network analysis (WGCNA) was performed to identify modules of highly correlated genes associated with aggression. Probe sets belonging to gene modules uncovered by WGCNA were carried forward for network analysis using Ingenuity Pathway Analysis (IPA). The RankProd non-parametric algorithm was then used to statistically evaluate expression differences across the genes belonging to modules significantly associated with aggression. IPA uncovered two pathways, involving NF-kB and MAPKs. The secondary RankProd analysis yielded 14 differentially expressed genes, some of which have previously been implicated in pathways associated with aggressive behaviour, such as Adrbk2. The results highlighted plausible candidate genes and gene networks implicated in aggression-related behaviour. PMID:25142712
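    The WGCNA step named in this abstract can be illustrated with a minimal sketch. It assumes the commonly used unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, and sums each gene's adjacencies to obtain its connectivity. The function names, the toy expression values, and beta = 6 are illustrative assumptions and are not taken from the study; module detection itself (topological overlap, hierarchical clustering) is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / (sx * sy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor_ij| ** beta (beta is an assumption)."""
    genes = sorted(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            a = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = a
            adj[h][g] = a
    return adj


def connectivity(adj):
    """Weighted connectivity of each gene: the sum of its adjacencies."""
    return {g: sum(neighbours.values()) for g, neighbours in adj.items()}


# Hypothetical toy profiles (four samples per gene), for illustration only.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [1.0, 3.0, 2.0, 4.0],
}
toy_adj = soft_threshold_adjacency(toy_expr, beta=6)
toy_k = connectivity(toy_adj)
```

    Raising the correlation to a power keeps strong correlations near 1 while pushing weak ones toward 0, which is why no hard cut-off is needed at this stage.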

  14. Identification of Crowding Stress Tolerance Co-Expression Networks Involved in Sweet Corn Yield.

    PubMed

    Choe, Eunsoo; Drnevich, Jenny; Williams, Martin M

    2016-01-01

    Tolerance to crowding stress has played a crucial role in improving agronomic productivity in field corn; however, commercial sweet corn hybrids vary greatly in crowding stress tolerance. The objectives were to 1) explore transcriptional changes among sweet corn hybrids with differential yield under crowding stress, 2) identify relationships between phenotypic responses and gene expression patterns, and 3) identify groups of genes associated with yield and crowding stress tolerance. Under conditions of crowding stress, three high-yielding and three low-yielding sweet corn hybrids were grouped for transcriptional and phenotypic analyses. Transcriptional analyses identified from 372 to 859 common differentially expressed genes (DEGs) for each hybrid. Large gene expression pattern variation among hybrids and only 26 common DEGs across all hybrid comparisons were identified, suggesting each hybrid has a unique response to crowding stress. Over-represented biological functions of DEGs also differed among hybrids. Strong correlation was observed between: 1) modules with up-regulation in high-yielding hybrids and yield traits, and 2) modules with up-regulation in low-yielding hybrids and plant/ear traits. Modules linked with yield traits may be important crowding stress response mechanisms influencing crop yield. Functional analysis of the modules and common DEGs identified candidate crowding stress tolerant processes in photosynthesis, glycolysis, cell wall, carbohydrate/nitrogen metabolic process, chromatin, and transcription regulation. Moreover, these biological functions were greatly inter-connected, indicating the importance of improving the mechanisms as a network. PMID:26796516

  15. Identification of Crowding Stress Tolerance Co-Expression Networks Involved in Sweet Corn Yield

    PubMed Central

    Choe, Eunsoo; Drnevich, Jenny; Williams, Martin M.

    2016-01-01

    Tolerance to crowding stress has played a crucial role in improving agronomic productivity in field corn; however, commercial sweet corn hybrids vary greatly in crowding stress tolerance. The objectives were to 1) explore transcriptional changes among sweet corn hybrids with differential yield under crowding stress, 2) identify relationships between phenotypic responses and gene expression patterns, and 3) identify groups of genes associated with yield and crowding stress tolerance. Under conditions of crowding stress, three high-yielding and three low-yielding sweet corn hybrids were grouped for transcriptional and phenotypic analyses. Transcriptional analyses identified from 372 to 859 common differentially expressed genes (DEGs) for each hybrid. Large gene expression pattern variation among hybrids and only 26 common DEGs across all hybrid comparisons were identified, suggesting each hybrid has a unique response to crowding stress. Over-represented biological functions of DEGs also differed among hybrids. Strong correlation was observed between: 1) modules with up-regulation in high-yielding hybrids and yield traits, and 2) modules with up-regulation in low-yielding hybrids and plant/ear traits. Modules linked with yield traits may be important crowding stress response mechanisms influencing crop yield. Functional analysis of the modules and common DEGs identified candidate crowding stress tolerant processes in photosynthesis, glycolysis, cell wall, carbohydrate/nitrogen metabolic process, chromatin, and transcription regulation. Moreover, these biological functions were greatly inter-connected, indicating the importance of improving the mechanisms as a network. PMID:26796516

  16. Genes2FANs: connecting genes through functional association networks

    PubMed Central

    2012-01-01

    Background Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent. Results Genes2FANs is a web based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user’s PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting disease genes through FANs, separating diseases into two categories. Conclusions Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our finding that disease genes in

  17. The gap gene network

    PubMed Central

    2010-01-01

    Gap genes are involved in segment determination during the early development of the fruit fly Drosophila melanogaster as well as in other insects. This review attempts to synthesize the current knowledge of the gap gene network through a comprehensive survey of the experimental literature. I focus on genetic and molecular evidence, which provides us with an almost-complete picture of the regulatory interactions responsible for trunk gap gene expression. I discuss the regulatory mechanisms involved, and highlight the remaining ambiguities and gaps in the evidence. This is followed by a brief discussion of molecular regulatory mechanisms for transcriptional regulation, as well as precision and size-regulation provided by the system. Finally, I discuss evidence on the evolution of gap gene expression from species other than Drosophila. My survey concludes that studies of the gap gene system continue to reveal interesting and important new insights into the role of gene regulatory networks in development and evolution. PMID:20927566

  18. Coexpression of multiple genes reconstitutes two pathways of very long-chain polyunsaturated fatty acid biosynthesis in Pichia pastoris.

    PubMed

    Kim, Sun Hee; Roh, Kyung Hee; Kim, Kwang-Soo; Kim, Hyun Uk; Lee, Kyeong-Ryeol; Kang, Han-Chul; Kim, Jong-Bum

    2014-09-01

    The introduction of novel traits to cells often requires the stable coexpression of multiple genes within the same cell. Herein, we report that C22 very long-chain polyunsaturated fatty acids (VLC-PUFAs) were synthesized from C18 precursors by reactions catalyzed by delta 6-desaturase, an ELOVL5 involved in VLC-PUFA elongation, and delta 5-desaturase. The coexpression of McD6DES, AsELOVL5, and PtD5DES encoding the corresponding enzymes, produced docosatetraenoic acid (C22:4 n-6) and docosapentaenoic acid (C22:5 n-3), as well as arachidonic acid (C20:4 n-6) and eicosapentaenoic acid (C20:5 n-3) in the methylotrophic yeast Pichia pastoris. The expression of each gene increased within 24 h, with high transcript levels after induction with 0.5 or 1 % methanol. High levels of the newly expressed VLC-PUFAs occurred after 144 h. This expression system exemplifies the recent progress and future possibilities of the metabolic engineering of VLC-PUFAs in oilseed crops. PMID:24863294

  19. Gene Coexpression Analysis Reveals Complex Metabolism of the Monoterpene Alcohol Linalool in Arabidopsis Flowers

    PubMed Central

    Ginglinger, Jean-François; Boachon, Benoit; Höfer, René; Paetz, Christian; Köllner, Tobias G.; Miesch, Laurence; Lugan, Raphael; Baltenweck, Raymonde; Mutterer, Jérôme; Ullmann, Pascaline; Beran, Franziska; Claudel, Patricia; Verstappen, Francel; Fischer, Marc J.C.; Karst, Francis; Bouwmeester, Harro; Miesch, Michel; Schneider, Bernd; Gershenzon, Jonathan; Ehlting, Jürgen; Werck-Reichhart, Danièle

    2013-01-01

    The cytochrome P450 family encompasses the largest family of enzymes in plant metabolism, and the functions of many of its members in Arabidopsis thaliana are still unknown. Gene coexpression analysis pointed to two P450s that were coexpressed with two monoterpene synthases in flowers and were thus predicted to be involved in monoterpenoid metabolism. We show that all four selected genes, the two terpene synthases (TPS10 and TPS14) and the two cytochrome P450s (CYP71B31 and CYP76C3), are simultaneously expressed at anthesis, mainly in upper anther filaments and in petals. Upon transient expression in Nicotiana benthamiana, the TPS enzymes colocalize in vesicular structures associated with the plastid surface, whereas the P450 proteins were detected in the endoplasmic reticulum. Whether they were expressed in Saccharomyces cerevisiae or in N. benthamiana, the TPS enzymes formed two different enantiomers of linalool: (−)-(R)-linalool for TPS10 and (+)-(S)-linalool for TPS14. Both P450 enzymes metabolize the two linalool enantiomers to form different but overlapping sets of hydroxylated or epoxidized products. These oxygenated products are not emitted into the floral headspace, but accumulate in floral tissues as further converted or conjugated metabolites. This work reveals complex linalool metabolism in Arabidopsis flowers, the ecological role of which remains to be determined. PMID:24285789

  20. Comprehensive network analysis of anther-expressed genes in rice by the combination of 33 laser microdissection and 143 spatiotemporal microarrays.

    PubMed

    Aya, Koichiro; Suzuki, Go; Suwabe, Keita; Hobo, Tokunori; Takahashi, Hirokazu; Shiono, Katsuhiro; Yano, Kentaro; Tsutsumi, Nobuhiro; Nakazono, Mikio; Nagamura, Yoshiaki; Matsuoka, Makoto; Watanabe, Masao

    2011-01-01

    Co-expression networks systematically constructed from large-scale transcriptome data reflect the interactions and functions of genes with similar expression patterns and are a powerful tool for the comprehensive understanding of biological events and mining of novel genes. In Arabidopsis (a model dicot plant), high-resolution co-expression networks have been constructed from very large microarray datasets and these are publicly available as online information resources. However, the available transcriptome data of rice (a model monocot plant) have been limited so far, making it difficult for rice researchers to achieve reliable co-expression analysis. In this study, we performed co-expression network analysis by using combined 44K Agilent microarray datasets of rice, which consisted of 33 laser microdissection (LM)-microarray datasets of anthers, and 143 spatiotemporal transcriptome datasets deposited in RicexPro. The entire data of the rice co-expression network, which was generated from the 176 microarray datasets by the Pearson correlation coefficient (PCC) method with the mutual rank (MR)-based cut-off, contained 24,258 genes and 60,441 gene pairs. Using these datasets, we constructed high-resolution co-expression subnetworks of two specific biological events in the anther, "meiosis" and "pollen wall synthesis". The meiosis network contained many known or putative meiotic genes, including genes related to meiosis initiation and recombination. In the pollen wall synthesis network, several candidate genes involved in the sporopollenin biosynthesis pathway were efficiently identified. Hence, these two subnetworks are important demonstrations of the efficiency of co-expression network analysis in rice. Our co-expression analysis included the separated transcriptomes of pollen and tapetum cells in the anther, which are able to provide precise information on transcriptional regulation during male gametophyte development in rice. The co-expression network data
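    The PCC method with an MR-based cut-off mentioned in this abstract can be sketched as follows. The sketch assumes the usual definition of mutual rank, MR(A, B) = sqrt(rank of B in A's PCC-ordered list × rank of A in B's list), and keeps a gene pair when its MR is at or below a chosen cut-off. The function names, the toy profiles, and the cut-off of 1.5 are illustrative assumptions; the abstract does not state the cut-off the authors used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / (sx * sy)


def mutual_rank_edges(expr, mr_cutoff):
    """Return {(gene_a, gene_b): MR} for pairs whose mutual rank <= mr_cutoff."""
    genes = sorted(expr)
    # PCC of every gene against every other gene.
    pcc = {g: {h: pearson(expr[g], expr[h]) for h in genes if h != g} for g in genes}
    # Rank 1 is the most strongly (positively) correlated partner of each gene.
    rank = {}
    for g in genes:
        ordered = sorted(pcc[g], key=lambda h: pcc[g][h], reverse=True)
        rank[g] = {h: position + 1 for position, h in enumerate(ordered)}
    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            mr = math.sqrt(rank[g][h] * rank[h][g])  # geometric mean of the two ranks
            if mr <= mr_cutoff:
                edges[(g, h)] = mr
    return edges


# Hypothetical toy profiles (four arrays per gene), for illustration only.
toy_expr = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.5],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [1.0, 3.0, 2.0, 4.0],
}
toy_edges = mutual_rank_edges(toy_expr, mr_cutoff=1.5)
```

    Because MR depends on ranks in both directions, a pair is kept only when each gene is near the top of the other's list, which makes the cut-off less sensitive to genes that correlate moderately with almost everything.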

  1. Comprehensive Network Analysis of Anther-Expressed Genes in Rice by the Combination of 33 Laser Microdissection and 143 Spatiotemporal Microarrays

    PubMed Central

    Takahashi, Hirokazu; Shiono, Katsuhiro; Yano, Kentaro; Tsutsumi, Nobuhiro; Nakazono, Mikio; Nagamura, Yoshiaki; Matsuoka, Makoto; Watanabe, Masao

    2011-01-01

    Co-expression networks systematically constructed from large-scale transcriptome data reflect the interactions and functions of genes with similar expression patterns and are a powerful tool for the comprehensive understanding of biological events and mining of novel genes. In Arabidopsis (a model dicot plant), high-resolution co-expression networks have been constructed from very large microarray datasets and these are publicly available as online information resources. However, the available transcriptome data of rice (a model monocot plant) have been limited so far, making it difficult for rice researchers to achieve reliable co-expression analysis. In this study, we performed co-expression network analysis by using combined 44K Agilent microarray datasets of rice, which consisted of 33 laser microdissection (LM)-microarray datasets of anthers, and 143 spatiotemporal transcriptome datasets deposited in RicexPro. The entire data of the rice co-expression network, which was generated from the 176 microarray datasets by the Pearson correlation coefficient (PCC) method with the mutual rank (MR)-based cut-off, contained 24,258 genes and 60,441 gene pairs. Using these datasets, we constructed high-resolution co-expression subnetworks of two specific biological events in the anther, “meiosis” and “pollen wall synthesis”. The meiosis network contained many known or putative meiotic genes, including genes related to meiosis initiation and recombination. In the pollen wall synthesis network, several candidate genes involved in the sporopollenin biosynthesis pathway were efficiently identified. Hence, these two subnetworks are important demonstrations of the efficiency of co-expression network analysis in rice. Our co-expression analysis included the separated transcriptomes of pollen and tapetum cells in the anther, which are able to provide precise information on transcriptional regulation during male gametophyte development in rice. The co-expression network

  2. SiBIC: A Web Server for Generating Gene Set Networks Based on Biclusters Obtained by Maximal Frequent Itemset Mining

    PubMed Central

    Takahashi, Kei-ichiro; Takigawa, Ichigaku; Mamitsuka, Hiroshi

    2013-01-01

    Detecting biclusters from expression data is useful, since biclusters are sets of genes coexpressed under only a subset of the given experimental conditions. We present software called SiBIC which, from a given expression dataset, first exhaustively enumerates biclusters, then merges them into largely independent biclusters, and finally uses these to generate gene set networks, in which the gene set assigned to each node consists of coexpressed genes. We evaluated each step of this procedure: 1) significance of the generated biclusters biologically and statistically, 2) biological quality of merged biclusters, and 3) biological significance of gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp. PMID:24386124

  3. SiBIC: a web server for generating gene set networks based on biclusters obtained by maximal frequent itemset mining.

    PubMed

    Takahashi, Kei-ichiro; Takigawa, Ichigaku; Mamitsuka, Hiroshi

    2013-01-01

    Detecting biclusters from expression data is useful, since biclusters are sets of genes coexpressed under only a subset of the given experimental conditions. We present software called SiBIC which, from a given expression dataset, first exhaustively enumerates biclusters, then merges them into largely independent biclusters, and finally uses these to generate gene set networks, in which the gene set assigned to each node consists of coexpressed genes. We evaluated each step of this procedure: 1) significance of the generated biclusters biologically and statistically, 2) biological quality of merged biclusters, and 3) biological significance of gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp. PMID:24386124

  4. Use of the growing environment as a source of variation to identify the quantitative trait transcripts and modules of co-expressed genes that determine chlorogenic acid accumulation

    PubMed Central

    JOËT, THIERRY; SALMONA, JORDI; LAFFARGUE, ANDRÉINA; DESCROIX, FRÉDÉRIC; DUSSERT, STÉPHANE

    2010-01-01

    Developing Coffea arabica seeds accumulate large amounts of chlorogenic acids (CGAs) as a storage form of phenylpropanoid derivatives, making coffee a valuable model to investigate the metabolism of these widespread plant phenolics. However, developmental and environmental regulations of CGA metabolism are poorly understood. In the present work, the expression of selected phenylpropanoid genes, together with CGA isomer profiles, was monitored throughout seed development across a wide set of contrasted natural environments. Although CGA metabolism was controlled by major developmental factors, the mean temperature during seed development had a direct impact on the time-window of CGA biosynthesis, as well as on final CGA isomer composition through subtle transcriptional regulations. We provide evidence that the variability induced by the environment is a useful tool to test whether CGA accumulation is quantitatively modulated at the transcriptional level, hence enabling detection of rate-limiting transcriptional steps [quantitative trait transcripts (QTTs)] for CGA biosynthesis. Variations induced by the environment also enabled a better description of the phenylpropanoid gene transcriptional network throughout seed development, as well as the detection of three temporally distinct modules of quantitatively co-expressed genes. Finally, analysis of metabolite-to-metabolite relationships revealed new biochemical characteristics of the isomerization steps that remain uncharacterized at the gene level. PMID:20199615

  5. Functional Analysis of Prognostic Gene Expression Network Genes in Metastatic Breast Cancer Models

    PubMed Central

    Geiger, Thomas R.; Ha, Ngoc-Han; Faraji, Farhoud; Michael, Helen T.; Rodriguez, Loren; Walker, Renard C.; Green, Jeffery E.; Simpson, R. Mark; Hunter, Kent W.

    2014-01-01

    Identification of conserved co-expression networks is a useful tool for clustering groups of genes enriched for common molecular or cellular functions [1]. The relative importance of genes within networks can frequently be inferred by the degree of connectivity, with those displaying high connectivity being significantly more likely to be associated with specific molecular functions [2]. Previously we utilized cross-species network analysis to identify two network modules that were significantly associated with distant metastasis free survival in breast cancer. Here, we validate one of the highly connected genes as a metastasis associated gene. Tpx2, the most highly connected gene within a proliferation network specifically prognostic for estrogen receptor positive (ER+) breast cancers, enhances metastatic disease, but in a tumor autonomous, proliferation-independent manner. Histologic analysis suggests instead that variation of TPX2 levels within disseminated tumor cells may influence the transition from dormant to actively proliferating cells in the secondary site. These results support the co-expression network approach for identification of new metastasis-associated genes to provide new information regarding the etiology of breast cancer progression and metastatic disease. PMID:25368990
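    The idea of ranking genes by their degree of connectivity, as described in this abstract, can be sketched in a few lines. The sketch assumes an unweighted, undirected co-expression network given as a list of gene pairs, and counts how many edges touch each gene. The function name and the gene labels g1 to g4 are hypothetical placeholders, not data from the study.

```python
from collections import Counter


def rank_by_connectivity(edges):
    """Rank genes by degree (number of co-expression edges), highest first.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    degree = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy network, for illustration only.
toy_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")]
toy_ranking = rank_by_connectivity(toy_edges)
toy_hub = toy_ranking[0][0]  # the most highly connected gene in the toy network
```

    In a weighted network the same ranking would use the sum of edge weights per gene instead of the edge count.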

  6. Significant enhancement of methionol production by co-expression of the aminotransferase gene ARO8 and the decarboxylase gene ARO10 in Saccharomyces cerevisiae.

    PubMed

    Yin, Sheng; Lang, Tiandan; Xiao, Xiao; Liu, Li; Sun, Baoguo; Wang, Chengtao

    2015-03-01

    Methionol is an important volatile sulfur flavor compound, which can be produced via the Ehrlich pathway in Saccharomyces cerevisiae. Aminotransferase and decarboxylase are essential enzymes catalyzing methionol biosynthesis. In this work, two aminotransferase genes ARO8 and ARO9 and one decarboxylase gene ARO10 were introduced into S. cerevisiae S288c, respectively, via an expression vector. Over-expression of ARO8 resulted in higher aminotransferase activity than that of ARO9, and the cellular decarboxylase activity was remarkably increased by over-expression of ARO10. A co-expression vector carrying both ARO8 and ARO10 was further constructed to generate the recombinant strain S810. Shaking flask experiments showed that the methionol yield from S810 reached 1.27 g L(-1), which was 51.8% and 68.8% higher than that from the wild-type strain and the control strain harboring the empty vector, respectively. The fed-batch fermentation by strain S810 produced 3.24 g L(-1) of methionol after 72 h of cultivation in a bioreactor. These results demonstrated that co-expression of ARO8 and ARO10 significantly boosted the methionol production. This is the first report of more than 3.0 g L(-1) of methionol being produced by a genetically engineered yeast strain through co-expression of the aminotransferase and decarboxylase of the Ehrlich pathway. PMID:25743068

  7. Co-expression of G2-EPSPS and glyphosate acetyltransferase GAT genes conferring high tolerance to glyphosate in soybean

    PubMed Central

    Guo, Bingfu; Guo, Yong; Hong, Huilong; Jin, Longguo; Zhang, Lijuan; Chang, Ru-Zhen; Lu, Wei; Lin, Min; Qiu, Li-Juan

    2015-01-01

    Glyphosate is a widely used non-selective herbicide with a broad spectrum of weed control around the world. At present, most of the commercial glyphosate tolerant soybeans utilize the glyphosate tolerance gene CP4-EPSPS or the glyphosate acetyltransferase gene GAT separately. In this study, both the glyphosate tolerance gene G2-EPSPS and the glyphosate-degrading gene GAT were co-transferred into soybean, and the transgenic plants showed high tolerance to glyphosate. Molecular analysis including PCR, Southern blot, qRT-PCR, and Western blot revealed that the target genes have been integrated into the genome and expressed effectively at both mRNA and protein levels. Furthermore, the glyphosate tolerance analysis showed that no typical symptom was observed when compared with a glyphosate tolerant line HJ06-698 derived from GR1 transgenic soybean even at fourfold labeled rate of Roundup. Chlorophyll and shikimic acid content analysis of transgenic plants also revealed that these two indexes were not significantly altered after glyphosate application. These results indicated that co-expression of G2-EPSPS and GAT conferred high tolerance to the herbicide glyphosate in soybean. Therefore, combining tolerance and degradation genes provides a new strategy for developing glyphosate tolerant transgenic crops. PMID:26528311

  8. Evolution of akirin family in gene and genome levels and coexpressed patterns among family members and rel gene in croaker.

    PubMed

    Liu, Tianxing; Gao, Yunhang; Xu, Tianjun

    2015-09-01

    Akirins, which are highly conserved nuclear proteins, are present throughout the metazoans and regulate innate immunity, embryogenesis, myogenesis, and carcinogenesis. This study reports all akirin genes from miiuy croaker and comprehensively analyzes the akirin gene family together with akirin genes from other species. A second nuclear localization signal (NLS) is observed in akirin2 homologues but not in akirin1 homologues in all teleosts and most other vertebrates. Thus, we deduced that the loss of the second NLS in akirin1 homologues in teleosts likely occurred in an ancestor of all Osteichthyes after the split from cartilaginous fish. Significantly, the akirin2(2) gene in the miiuy croaker includes six exons interrupted by five introns, which may be the result of an intron insertion event and provides novel evidence for the variation of akirin gene structure in some species. In addition, comparison of the genomic neighborhood genes of akirin1, akirin2(1), and akirin2(2) demonstrates a strong level of conserved synteny across the teleost classes, which further supports the deduction of Macqueen and Johnston (2009) that the production of akirin paralogues can be attributed to whole-genome duplications and the loss of some akirin paralogues after genome duplications. Furthermore, akirin gene family members and the relish gene are ubiquitously expressed across all tissues, and their expression levels are increased in three immune tissues after infection with Vibrio anguillarum. Combined with the expression patterns of LEAP-1 and LEAP-2 from miiuy croaker, an intricate network of co-regulation among family members is established. This further supports the view that akirins act in concert with the relish protein to induce the expression of a subset of downstream pathway elements in the NF-kB dependent signaling pathway. PMID:25912355

  9. Oppositely imprinted genes H19 and insulin-like growth factor 2 are coexpressed in human androgenetic trophoblast.

    PubMed Central

    Mutter, G L; Stewart, C L; Chaponot, M L; Pomponio, R J

    1993-01-01

    Human uniparental gestations such as gynogenetic ovarian teratomas and androgenetic complete hydatidiform moles provide a model to evaluate the integrity of parent-specific gene expression--i.e., imprinting--in the absence of a complementary parental genetic contribution. We studied expression, in these tissues, of the oppositely imprinted genes H19, which is an embryonic nontranslated RNA, and insulin-like growth factor type 2 (IGF2). Normal gestations only express H19 from the maternal allele and express IGF2 from the paternal allele, whereas neither is expressed from the maternal genome of gynogenetic gestations, and both are expressed from the paternal genome of androgenetic gestations. Coexpression of H19 and IGF2 in the androgenetic tissues was in a single population of cells, mononuclear trophoblast--the same cell type expressing these genes in biparental placentas. These results demonstrate that a biparental genome may be required for expression of the reciprocal IGF2/H19 imprint. Alternatively, biparental expression may be a normal feature of some imprinted genes in specific cell types. Additional experiments with other imprinted genes will clarify whether this reflects global failure of the imprinting process or a change specific to the IGF2/H19 locus. PMID:7692725

  10. Enhancement of heavy metal accumulation by tissue specific co-expression of iaaM and ACC deaminase genes in plants.

    PubMed

    Zhang, Yong; Zhao, Lihong; Wang, Yao; Yang, Baoyu; Chen, Shiyun

    2008-06-01

    1-Aminocyclopropane-1-carboxylate (ACC) deaminase and tryptophan monooxygenase are two enzymes involved in plant senescence-inhibiting and growth-promoting regulation, respectively. In this study, two binary vectors were constructed in which the Agrobacterium iaaM gene was under the transcriptional control of a xylem-specific glycine-rich protein promoter alone, or co-expressed with the bacterial ACC deaminase gene, which was driven by the constitutive CaMV 35S promoter. Transgenic petunia shoots co-expressing both genes were able to root on medium supplemented with 7.5 mg l(-1) CoCl2. When T1 transgenic tobacco plants were grown in sand supplemented with Cu2+ and Co2+, plants with tissue-specific co-expression of both the iaaM and ACC deaminase genes grew faster, with larger biomass and a more extensive root system, and accumulated a greater amount of heavy metals than the empty vector control plants. When T1 transgenic tobacco plants were grown in soil watered with different concentrations of CuSO4, xylem-specific expression of the iaaM gene caused the accumulation of more Cu2+ than in the empty vector control at lower CuSO4 concentrations, but the plants showed severe toxic symptoms at a concentration of 100 mg l(-1) CuSO4. T1 transgenic plants co-expressing both genes accumulated more heavy metals in the plant shoots and could tolerate CuSO4 at 150 mg l(-1). In addition, plants co-expressing these two genes could grow well in a complex contaminated soil containing both inorganic and organic pollutants, while the growth of the control plants was greatly inhibited. PMID:18471863

  11. Differential Co-Expression between α-Synuclein and IFN-γ Signaling Genes across Development and in Parkinson’s Disease

    PubMed Central

    Liscovitch, Noa; French, Leon

    2014-01-01

    Expression patterns of the alpha-synuclein gene (SNCA) were studied across anatomy, development, and disease to better characterize its role in the brain. In this postmortem study, negative spatial co-expression between SNCA and 73 interferon-γ (IFN-γ) signaling genes was observed across many brain regions. Recent animal studies have demonstrated that IFN-γ induces loss of dopamine neurons and nigrostriatal degeneration. This opposing pattern between SNCA and IFN-γ signaling genes increases with age (rho = −0.78). In contrast, a meta-analysis of four microarray experiments representing 126 substantia nigra samples reveals a switch to positive co-expression in Parkinson’s disease (p<0.005). Use of genome-wide testing demonstrates this relationship is specific to SNCA (p<0.002). This change in co-expression suggests an immunomodulatory role of SNCA that may provide insight into neurodegeneration. Genes showing similar co-expression patterns have been previously linked to Alzheimer’s (ANK1) and Parkinson’s disease (UBE2E2, PCMT1, HPRT1 and RIT2). PMID:25493648

  12. Co-expression of perforin and granzyme B genes induces apoptosis and inhibits the tumorigenicity of laryngeal cancer cell line Hep-2

    PubMed Central

    Li, Xiu-Ying; Li, Zhi; An, Gui-Jie; Liu, Sha; Lai, Yan-Dong

    2014-01-01

    Granzyme B and perforin, two of the most important components, have shown anticancer properties in various cancers, but their effects in laryngeal cancer remain unexplored. Here we decided to examine the effects of Granzyme B and perforin in Hep-2 cells and clarify the role of perforin and granzyme B in the tumorigenicity of laryngeal cancer cell line. Hep-2 cells were transfected with pVAX1-PIG co-expression vector (comprising perforin and granzyme B genes), and then the growth and apoptosis of these Hep-2 cells were evaluated. The tumorigenicity of Hep-2 cell line co-expressing perforin and granzyme B genes was tested in BALB/c nu/nu mice. We found that the co-expression of perforin and granzyme B genes could obviously inhibit cell focus formation and induce cell apoptosis in Hep-2 cells. Furthermore, after subcutaneous injection of Hep-2 cells transfected with pVAX1-PIG, an extensive delay in tumor growth was observed in BALB/c-nu/nu mice. Moreover, our studies demonstrated that the anticancer activity of perforin and granzyme B was sustainable in vivo as tumor development by inducing cell apoptosis. Taken together, our data indicate that the co-expression of perforin and granzyme B genes exhibits anticancer potential, and hopefully provide potential therapeutic applications in laryngeal cancer. PMID:24696715

  13. Yeast gene CMR1/YDL156W is consistently co-expressed with genes participating in DNA-metabolic processes in a variety of stringent clustering experiments

    PubMed Central

    Abu-Jamous, Basel; Fa, Rui; Roberts, David J.; Nandi, Asoke K.

    2013-01-01

    The binarization of consensus partition matrices (Bi-CoPaM) method has, among its unique features, the ability to perform ensemble clustering over the same set of genes from multiple microarray datasets by using various clustering methods in order to generate tunable tight clusters. Therefore, we applied the Bi-CoPaM method to the 500 most synchronized cell-cycle-regulated yeast genes from different microarray datasets to produce four tight, specific and exclusive clusters of co-expressed genes. We found that 19 genes formed the tightest of the four clusters and that this included the gene CMR1/YDL156W, which was an uncharacterized gene at the time of our investigations. Two very recent proteomic and biochemical studies have independently revealed many facets of CMR1 protein, although the precise functions of the protein remain to be elucidated. Our computational results complement these biological results and add more evidence to their recent findings of CMR1 as potentially participating in many of the DNA-metabolism processes such as replication, repair and transcription. Interestingly, our results demonstrate the close co-expression of CMR1 with the replication protein A (RPA), the cohesion complex and the DNA polymerases α, δ and ɛ, and suggest functional relationships between CMR1 and the respective proteins. In addition, the analysis provides further substantial evidence that the expression of the CMR1 gene could be regulated by the MBF complex. In summary, the application of a novel analytic technique in large biological datasets has provided supporting evidence for a gene of previously unknown function, further hypotheses to test, and a more general demonstration of the value of sophisticated methods to explore new large datasets now so readily generated in biological experiments. PMID:23349438

  14. RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond.

    PubMed

    Gama-Castro, Socorro; Salgado, Heladia; Santos-Zavaleta, Alberto; Ledezma-Tejeida, Daniela; Muñiz-Rascado, Luis; García-Sotelo, Jair Santiago; Alquicira-Hernández, Kevin; Martínez-Flores, Irma; Pannier, Lucia; Castro-Mondragón, Jaime Abraham; Medina-Rivera, Alejandra; Solano-Lira, Hilda; Bonavides-Martínez, César; Pérez-Rueda, Ernesto; Alquicira-Hernández, Shirley; Porrón-Sotelo, Liliana; López-Fuentes, Alejandra; Hernández-Koutoucheva, Anastasia; Del Moral-Chávez, Víctor; Rinaldi, Fabio; Collado-Vides, Julio

    2016-01-01

    RegulonDB (http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to maintaining curation up-to-date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for 'neighborhood' genes to known operons and regulons, and computational developments. PMID:26527724

  15. RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond

    PubMed Central

    Gama-Castro, Socorro; Salgado, Heladia; Santos-Zavaleta, Alberto; Ledezma-Tejeida, Daniela; Muñiz-Rascado, Luis; García-Sotelo, Jair Santiago; Alquicira-Hernández, Kevin; Martínez-Flores, Irma; Pannier, Lucia; Castro-Mondragón, Jaime Abraham; Medina-Rivera, Alejandra; Solano-Lira, Hilda; Bonavides-Martínez, César; Pérez-Rueda, Ernesto; Alquicira-Hernández, Shirley; Porrón-Sotelo, Liliana; López-Fuentes, Alejandra; Hernández-Koutoucheva, Anastasia; Moral-Chávez, Víctor Del; Rinaldi, Fabio; Collado-Vides, Julio

    2016-01-01

    RegulonDB (http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to maintaining curation up-to-date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for ‘neighborhood’ genes to known operons and regulons, and computational developments. PMID:26527724

  16. CorSig: A General Framework for Estimating Statistical Significance of Correlation and Its Application to Gene Co-Expression Analysis

    PubMed Central

    Wang, Hong-Qiang; Tsai, Chung-Jui

    2013-01-01

    With the rapid increase of omics data, correlation analysis has become an indispensable tool for inferring meaningful associations from a large number of observations. Pearson correlation coefficient (PCC) and its variants are widely used for such purposes. However, it remains challenging to test whether an observed association is reliable both statistically and biologically. We present here a new method, CorSig, for statistical inference of correlation significance. CorSig is based on a biology-informed null hypothesis, i.e., testing whether the true PCC (ρ) between two variables is statistically larger than a user-specified PCC cutoff (τ), as opposed to the simple null hypothesis of ρ = 0 in existing methods, i.e., testing whether an association can be declared without a threshold. CorSig incorporates Fisher's Z transformation of the observed PCC (r), which facilitates use of standard techniques for p-value computation and multiple testing corrections. We compared CorSig against two methods: one uses a minimum PCC cutoff while the other (Zhu's procedure) controls correlation strength and statistical significance in two discrete steps. CorSig consistently outperformed these methods in various simulation data scenarios by balancing between false positives and false negatives. When tested on real-world Populus microarray data, CorSig effectively identified co-expressed genes in the flavonoid pathway, and discriminated between closely related gene family members for their differential association with flavonoid and lignin pathways. The p-values obtained by CorSig can be used as a stand-alone parameter for stratification of co-expressed genes according to their correlation strength in lieu of an arbitrary cutoff. CorSig requires one single tunable parameter, and can be readily extended to other correlation measures. Thus, CorSig should be useful for a wide range of applications, particularly for network analysis of high-dimensional genomic data. Software
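    The test described in the abstract above can be sketched roughly as follows. This is a minimal illustration built only from the abstract's description (a null hypothesis of ρ ≤ τ, Fisher's Z transformation of r, then a multiple-testing correction); it is not the CorSig software. The function names, the normal approximation with variance 1/(n − 3), and the choice of the Benjamini-Hochberg correction are assumptions made for illustration.

```python
import math


def corsig_like_pvalue(r: float, n: int, tau: float) -> float:
    """One-sided p-value for H0: rho <= tau versus H1: rho > tau.

    Assumes the usual Fisher Z approximation: atanh(r) is roughly normal
    with mean atanh(rho) and variance 1 / (n - 3) for n paired samples.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    if not (-1.0 < r < 1.0 and -1.0 < tau < 1.0):
        raise ValueError("r and tau must lie strictly between -1 and 1")
    z_stat = (math.atanh(r) - math.atanh(tau)) * math.sqrt(n - 3)
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z_stat / math.sqrt(2.0))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical observed correlations for three gene pairs, 30 samples each.
    observed = [0.90, 0.62, 0.45]
    pvals = [corsig_like_pvalue(r, n=30, tau=0.5) for r in observed]
    print(benjamini_hochberg(pvals))
```

    With τ = 0 the same statistic reduces to the ordinary one-sided test of ρ = 0, which is the simple null hypothesis the abstract contrasts against.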

  17. Gene Co-Expression Analysis Inferring the Crosstalk of Ethylene and Gibberellin in Modulating the Transcriptional Acclimation of Cassava Root Growth in Different Seasons

    PubMed Central

    Saithong, Treenut; Saerue, Samorn; Kalapanulak, Saowalak; Sojikul, Punchapat; Narangajavana, Jarunya; Bhumiratana, Sakarindr

    2015-01-01

    Cassava is a crop of hope for the 21st century. Its great advantages over other crops are not only its carbohydrate capacity but also that it is easily grown and develops fast. As a plant that is highly tolerant of poor environments, cassava is believed to possess an effective acclimation process, an intelligent mechanism behind its survival and sustainability in a wide range of climates. Herein, we aimed to investigate the transcriptional regulation underlying the adaptive development of the cassava root under different seasonal cultivation climates. Gene co-expression analysis suggests that an AP2-EREBP transcription factor (ERF1) orthologue (D142) played a pivotal role in regulating the cellular response to wet and dry seasons. The ERF shows crosstalk with gibberellin, via ent-Kaurene synthase (D106), in the transcriptional regulatory network that was proposed to modulate the downstream regulatory system through a distinct signaling mechanism. While sulfur assimilation is likely to act as a signaling regulator of the growth response in the dry-season crop, calmodulin-binding protein is responsible for regulation in the wet-season crop. We hope that the findings of this initial study will pave the way towards sustainable cassava production under various kinds of stress, in view of future global climate change. PMID:26366737

  18. Use of Semisupervised Clustering and Feature-Selection Techniques for Identification of Co-expressed Genes.

    PubMed

    Saha, Sriparna; Alok, Abhay Kumar; Ekbal, Asif

    2016-07-01

    Studying the patterns hidden in gene-expression data helps to understand the functionality of genes. In general, clustering techniques are widely used for the identification of natural partitionings from the gene expression data. In order to put constraints on dimensionality, feature selection is the key issue because not all features are important from a clustering point of view. Moreover, a limited amount of supervised information can help to fine-tune the obtained clustering solution. In this paper, the problem of simultaneous feature selection and semisupervised clustering is formulated as a multiobjective optimization (MOO) task. A modern simulated annealing-based MOO technique, namely AMOSA, is utilized as the background optimization methodology. Here, features and cluster centers are represented in the form of a string and the assignment of genes to different clusters is done using a point symmetry-based distance. Six optimization criteria based on several internal and external cluster validity indices are utilized. In order to generate the supervised information, a popular clustering technique, fuzzy C-means, is utilized. An appropriate subset of features, the proper number of clusters, and the proper partitioning are determined using the search capability of AMOSA. The effectiveness of this proposed semisupervised clustering technique, Semi-FeaClustMOO, is demonstrated on five publicly available benchmark gene-expression datasets. Comparison results with the existing techniques for gene-expression data clustering again reveal the superiority of the proposed technique. Statistical and biological significance tests have also been carried out. PMID:26208367

  19. Computation in gene networks

    NASA Astrophysics Data System (ADS)

    Ben-Hur, Asa; Siegelmann, Hava T.

    2004-03-01

    Genetic regulatory networks have the complex task of controlling all aspects of life. Using a model of gene expression by piecewise linear differential equations, we show that this process can be considered as a process of computation. This is demonstrated by showing that this model can simulate memory-bounded Turing machines. The simulation is robust with respect to perturbations of the system, an important property for both analog computers and biological systems. Robustness is achieved using a condition that ensures that the model equations, which are generally chaotic, follow predictable dynamics.
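    As a rough illustration of what a piecewise linear model of gene expression looks like, the sketch below integrates a two-gene mutual-repression circuit whose production terms switch on and off at a threshold. It is not the construction used in the paper: the circuit, the parameter values, and the function name simulate_toggle are assumptions chosen for illustration. The circuit settles into one of two stable states depending on where it starts, so it stores a single bit, which hints at how such networks can hold memory.

```python
def simulate_toggle(x1, x2, theta=0.5, k=1.0, gamma=1.0, dt=0.01, steps=2000):
    """Euler-integrate a two-gene mutual-repression network.

    Each gene is produced at rate k only while its repressor is below the
    threshold theta, and decays at rate gamma. Within each region of state
    space the equations are linear; they change only at the thresholds.
    """
    for _ in range(steps):
        prod1 = k if x2 < theta else 0.0  # gene 2 represses gene 1
        prod2 = k if x1 < theta else 0.0  # gene 1 represses gene 2
        # Both concentrations are updated from the same (old) state.
        x1, x2 = (x1 + dt * (prod1 - gamma * x1),
                  x2 + dt * (prod2 - gamma * x2))
    return x1, x2


if __name__ == "__main__":
    print(simulate_toggle(0.8, 0.2))  # gene 1 ends high, gene 2 low
    print(simulate_toggle(0.2, 0.8))  # mirrored start, mirrored outcome
```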

  20. Three tightly linked genes encoding human type I keratins: conservation of sequence in the 5'-untranslated leader and 5'-upstream regions of coexpressed keratin genes.

    PubMed Central

    RayChaudhury, A; Marchuk, D; Lindhurst, M; Fuchs, E

    1986-01-01

    We have isolated and subcloned three separate segments of human DNA which share strong sequence homology with a previously sequenced gene encoding a type I keratin, K14 (50 kilodaltons). Restriction endonuclease mapping has demonstrated that these three genes are tightly linked chromosomally, whereas the K14 gene appears to be separate. As judged by positive hybridization-translation and Northern blot analyses, the central linked gene encodes a keratin, K17, which is expressed in abundance with K14 and two other type I keratins in cultured human epidermal cells. None of these other epidermal keratin mRNAs appears to be generated from the K17 gene through differential splicing of its transcript. The sequence of the K17 gene reveals striking homologies not only with the coding portions and intron positions of the K14 gene, but also with its 5'-noncoding and 5'-upstream sequences. These similarities may provide an important clue in elucidating the molecular mechanisms underlying the coexpression of the two genes. PMID:2431270

  1. Identification of microRNA-regulated gene networks by expression analysis of target genes.

    PubMed

    Gennarino, Vincenzo Alessandro; D'Angelo, Giovanni; Dharmalingam, Gopuraja; Fernandez, Serena; Russolillo, Giorgio; Sanges, Remo; Mutarelli, Margherita; Belcastro, Vincenzo; Ballabio, Andrea; Verde, Pasquale; Sardiello, Marco; Banfi, Sandro

    2012-06-01

    MicroRNAs (miRNAs) and transcription factors control eukaryotic cell proliferation, differentiation, and metabolism through their specific gene regulatory networks. However, differently from transcription factors, our understanding of the processes regulated by miRNAs is currently limited. Here, we introduce gene network analysis as a new means for gaining insight into miRNA biology. A systematic analysis of all human miRNAs based on Co-expression Meta-analysis of miRNA Targets (CoMeTa) assigns high-resolution biological functions to miRNAs and provides a comprehensive, genome-scale analysis of human miRNA regulatory networks. Moreover, gene cotargeting analyses show that miRNAs synergistically regulate cohorts of genes that participate in similar processes. We experimentally validate the CoMeTa procedure through focusing on three poorly characterized miRNAs, miR-519d/190/340, which CoMeTa predicts to be associated with the TGFβ pathway. Using lung adenocarcinoma A549 cells as a model system, we show that miR-519d and miR-190 inhibit, while miR-340 enhances TGFβ signaling and its effects on cell proliferation, morphology, and scattering. Based on these findings, we formalize and propose co-expression analysis as a general paradigm for second-generation procedures to recognize bona fide targets and infer biological roles and network communities of miRNAs. PMID:22345618

  2. Identification of microRNA-regulated gene networks by expression analysis of target genes

    PubMed Central

    Gennarino, Vincenzo Alessandro; D'Angelo, Giovanni; Dharmalingam, Gopuraja; Fernandez, Serena; Russolillo, Giorgio; Sanges, Remo; Mutarelli, Margherita; Belcastro, Vincenzo; Ballabio, Andrea; Verde, Pasquale; Sardiello, Marco; Banfi, Sandro

    2012-01-01

    MicroRNAs (miRNAs) and transcription factors control eukaryotic cell proliferation, differentiation, and metabolism through their specific gene regulatory networks. However, differently from transcription factors, our understanding of the processes regulated by miRNAs is currently limited. Here, we introduce gene network analysis as a new means for gaining insight into miRNA biology. A systematic analysis of all human miRNAs based on Co-expression Meta-analysis of miRNA Targets (CoMeTa) assigns high-resolution biological functions to miRNAs and provides a comprehensive, genome-scale analysis of human miRNA regulatory networks. Moreover, gene cotargeting analyses show that miRNAs synergistically regulate cohorts of genes that participate in similar processes. We experimentally validate the CoMeTa procedure through focusing on three poorly characterized miRNAs, miR-519d/190/340, which CoMeTa predicts to be associated with the TGFβ pathway. Using lung adenocarcinoma A549 cells as a model system, we show that miR-519d and miR-190 inhibit, while miR-340 enhances TGFβ signaling and its effects on cell proliferation, morphology, and scattering. Based on these findings, we formalize and propose co-expression analysis as a general paradigm for second-generation procedures to recognize bona fide targets and infer biological roles and network communities of miRNAs. PMID:22345618

  3. Systems toxicology of chemically induced liver and kidney injuries: histopathology-associated gene co-expression modules.

    PubMed

    Te, Jerez A; AbdulHameed, Mohamed Diwan M; Wallqvist, Anders

    2016-09-01

    Organ injuries caused by environmental chemical exposures or use of pharmaceutical drugs pose a serious health risk that may be difficult to assess because of a lack of non-invasive diagnostic tests. Mapping chemical injuries to organ-specific histopathology outcomes via biomarkers will provide a foundation for designing precise and robust diagnostic tests. We identified co-expressed genes (modules) specific to injury endpoints using the Open Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System (TG-GATEs) - a toxicogenomics database containing organ-specific gene expression data matched to dose- and time-dependent chemical exposures and adverse histopathology assessments in Sprague-Dawley rats. We proposed a protocol for selecting gene modules associated with chemical-induced injuries that classify 11 liver and eight kidney histopathology endpoints based on dose-dependent activation of the identified modules. We showed that the activation of the modules for a particular chemical exposure condition, i.e., chemical-time-dose combination, correlated with the severity of histopathological damage in a dose-dependent manner. Furthermore, the modules could distinguish different types of injuries caused by chemical exposures as well as determine whether the injury module activation was specific to the tissue of origin (liver and kidney). The generated modules provide a link between toxic chemical exposures, different molecular initiating events among underlying molecular pathways and resultant organ damage. Published 2016. This article is a U.S. Government work and is in the public domain in the USA. Journal of Applied Toxicology published by John Wiley & Sons, Ltd. PMID:26725466

  4. Ethanol production by Escherichia coli strains co-expressing Zymomonas PDC and ADH genes

    DOEpatents

    Ingram, Lonnie O.; Conway, Tyrrell; Alterthum, Flavio

    1991-01-01

    A novel operon and plasmids comprising genes which code for the alcohol dehydrogenase and pyruvate decarboxylase activities of Zymomonas mobilis are described. Also disclosed are methods for increasing the growth of microorganisms or eukaryotic cells and methods for reducing the accumulation of undesirable metabolic products in the growth medium of microorganisms or cells.

  5. Characterization of Tusc5, a Unique Adipocyte Gene Co-Expressed in Peripheral Somatosensory Neurons

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Tumor suppressor candidate 5 (Tusc5, GenBank nomenclature) is a cold-repressed gene encoding a member of the CD225 domain-containing family, identified through analysis of transcripts differentially-expressed in brown adipose tissue (BAT) with changes in ambient temperature. Tusc5 mRNA was found to ...

  6. Reconstruction of a Functional Human Gene Network, with an Application for Prioritizing Positional Candidate Genes

    PubMed Central

    Franke, Lude; Bakel, Harm van; Fokkens, Like; de Jong, Edwin D.; Egmont-Petersen, Michael; Wijmenga, Cisca

    2006-01-01

    Most common genetic disorders have a complex inheritance and may result from variants in many genes, each contributing only weak effects to the disease. Pinpointing these disease genes within the myriad of susceptibility loci identified in linkage studies is difficult because these loci may contain hundreds of genes. However, in any disorder, most of the disease genes will be involved in only a few different molecular pathways. If we know something about the relationships between the genes, we can assess whether some genes (which may reside in different loci) functionally interact with each other, indicating a joint basis for the disease etiology. There are various repositories of information on pathway relationships. To consolidate this information, we developed a functional human gene network that integrates information on genes and the functional relationships between genes, based on data from the Kyoto Encyclopedia of Genes and Genomes, the Biomolecular Interaction Network Database, Reactome, the Human Protein Reference Database, the Gene Ontology database, predicted protein-protein interactions, human yeast two-hybrid interactions, and microarray coexpressions. We applied this network to interrelate positional candidate genes from different disease loci and then tested 96 heritable disorders for which the Online Mendelian Inheritance in Man database reported at least three disease genes. Artificial susceptibility loci, each containing 100 genes, were constructed around each disease gene, and we used the network to rank these genes on the basis of their functional interactions. By following up the top five genes per artificial locus, we were able to detect at least one known disease gene in 54% of the loci studied, representing a 2.8-fold increase over random selection. This suggests that our method can significantly reduce the cost and effort of pinpointing true disease genes in analyses of disorders for which numerous loci have been reported but for which

  7. How difficult is inference of mammalian causal gene regulatory networks?

    PubMed

    Djordjevic, Djordje; Yang, Andrian; Zadoorian, Armella; Rungrugeecharoen, Kevin; Ho, Joshua W K

    2014-01-01

    Gene regulatory networks (GRNs) play a central role in systems biology, especially in the study of mammalian organ development. One key question remains largely unanswered: Is it possible to infer mammalian causal GRNs using observable gene co-expression patterns alone? We assembled two mouse GRN datasets (embryonic tooth and heart) and matching microarray gene expression profiles to systematically investigate the difficulties of mammalian causal GRN inference. The GRNs were assembled based on > 2,000 pieces of experimental genetic perturbation evidence from manually reading > 150 primary research articles. Each piece of perturbation evidence records the qualitative change of the expression of one gene following knock-down or over-expression of another gene. Our data have thorough annotation of tissue types and embryonic stages, as well as the type of regulation (activation, inhibition and no effect), which uniquely allows us to estimate both sensitivity and specificity of the inference of tissue specific causal GRN edges. Using these unprecedented datasets, we found that gene co-expression does not reliably distinguish true positive from false positive interactions, making inference of GRN in mammalian development very difficult. Nonetheless, if we have expression profiling data from genetic or molecular perturbation experiments, such as gene knock-out or signalling stimulation, it is possible to use the set of differentially expressed genes to recover causal regulatory relationships with good sensitivity and specificity. Our result supports the importance of using perturbation experimental data in causal network reconstruction. Furthermore, we showed that causal gene regulatory relationship can be highly cell type or developmental stage specific, suggesting the importance of employing expression profiles from homogeneous cell populations. This study provides essential datasets and empirical evidence to guide the development of new GRN inference methods for
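    The sensitivity and specificity mentioned in the abstract above could be computed along the following lines, treating perturbation evidence of activation or inhibition as true edges and "no effect" evidence as true non-edges. This is only a sketch: the function name, the representation of edges as (regulator, target) pairs, and the gene labels in the example are hypothetical and do not come from the study's datasets.

```python
def edge_sensitivity_specificity(predicted, true_edges, non_edges):
    """Score predicted regulatory edges against perturbation evidence.

    predicted:  iterable of (regulator, target) pairs called by a method
    true_edges: pairs with evidence of activation or inhibition
    non_edges:  pairs with evidence of no effect
    Returns (sensitivity, specificity).
    """
    predicted = set(predicted)
    true_edges = set(true_edges)
    non_edges = set(non_edges)
    if not true_edges or not non_edges:
        raise ValueError("need at least one true edge and one non-edge")
    true_pos = len(predicted & true_edges)
    false_pos = len(predicted & non_edges)
    sensitivity = true_pos / len(true_edges)
    specificity = (len(non_edges) - false_pos) / len(non_edges)
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical gene labels, for illustration only.
    known = {("A", "B"), ("A", "C")}
    no_effect = {("B", "C"), ("C", "A")}
    calls = [("A", "B"), ("B", "C")]
    print(edge_sensitivity_specificity(calls, known, no_effect))
```

    Predicted pairs with no perturbation evidence in either set are ignored, since they can be scored neither as correct nor as incorrect.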

  8. Identification of co-expressed gene signatures in mouse B1, marginal zone and B2 B-cell populations

    PubMed Central

    Mabbott, Neil A; Gray, David

    2014-01-01

    In mice, three major B-cell subsets have been identified with distinct functionalities: B1 B cells, marginal zone B cells and follicular B2 B cells. Here, we used the growing body of publicly available transcriptomics data to create an expression atlas of 84 gene expression microarray data sets of distinct mouse B-cell subsets. These data were subjected to network-based cluster analysis using BioLayout Express3D. Using this analysis tool, genes with related functions clustered together in discrete regions of the network graph and enabled the identification of transcriptional networks that underpinned the functional activity of distinct cell populations. Some gene clusters were expressed highly by most of the cell populations included in this analysis (such as those with activity related to house-keeping functions). Others contained genes with expression patterns specific to distinct B-cell subsets. While these clusters contained many genes typically associated with the activity of the cells they were specifically expressed in, many novel B-cell-subset-specific candidate genes were identified. A large number of uncharacterized genes were also represented in these B-cell lineage-specific clusters. Further analysis of the activities of these uncharacterized candidate genes will lead to the identification of novel B-cell lineage-specific transcription factors and regulators of B-cell function. We also analysed 36 microarray data sets from distinct human B-cell populations. These data showed that mouse and human germinal centre B cells shared similar transcriptional features, whereas mouse B1 B cells were distinct from proposed human B1 B cells. PMID:24032749

  9. Intraisolate Mitochondrial Genetic Polymorphism and Gene Variants Coexpression in Arbuscular Mycorrhizal Fungi

    PubMed Central

    Beaudet, Denis; de la Providencia, Ivan Enrique; Labridy, Manuel; Roy-Bolduc, Alice; Daubois, Laurence; Hijri, Mohamed

    2015-01-01

    Arbuscular mycorrhizal fungi (AMF) are multinucleated and coenocytic organisms, in which the extent of the intraisolate nuclear genetic variation has been a source of debate. Conversely, their mitochondrial genomes (mtDNAs) have appeared to be homogeneous within isolates in all next generation sequencing (NGS)-based studies. Although several lines of evidence have challenged mtDNA homogeneity in AMF, an extensive survey to investigate intraisolate allelic diversity has not previously been undertaken. In this study, we used a conventional polymerase chain reaction-based approach on selected mitochondrial regions with a high-fidelity DNA polymerase, followed by cloning and Sanger sequencing. Two isolates of Rhizophagus irregularis were used, one cultivated in vitro for several generations (DAOM-197198) and the other recently isolated from the field (DAOM-242422). At different loci in both isolates, we found intraisolate allelic variation within the mtDNA and in a single copy nuclear marker, which highlighted the presence of several nonsynonymous mutations in protein coding genes. We confirmed that some of this variation persisted in the transcriptome, giving rise to at least four distinct nad4 transcripts in DAOM-197198. We also detected the presence of numerous mitochondrial DNA copies within nuclear genomes (numts), providing insights to understand this important evolutionary process in AMF. Our study reveals that genetic variation in Glomeromycota is higher than what had been previously assumed and also suggests that it could have been grossly underestimated in most NGS-based AMF studies, both in mitochondrial and nuclear genomes, due to the presence of low-level mutations. PMID:25527836

  10. Enhancement of lipase r27RCL production in Pichia pastoris by regulating gene dosage and co-expression with chaperone protein disulfide isomerase.

    PubMed

    Sha, Chong; Yu, Xiao-Wei; Lin, Nai-Xin; Zhang, Meng; Xu, Yan

    2013-12-10

    Pichia pastoris has been successfully used in the production of many secreted and intracellular recombinant proteins, but there is still considerable room for improvement in this expression system. Two factors drastically influence production of the lipase r27RCL from Rhizopus chinensis CCTCC M201021: gene dosage and protein folding in the endoplasmic reticulum (ER). Regarding the effect of gene dosage, the enzyme activity of the recombinant strain with three copies of the lipase gene was 1.95-fold higher than that of the recombinant strain with only one copy. In addition, lipase production was further improved by co-expression with the chaperone PDI, which is involved in disulfide bond formation in the ER. Overall, the maximum enzyme activity reached 355 U/mL in shaking flasks with the recombinant strain carrying one copy of the chaperone gene PDI plus five copies of the lipase gene proRCL, which was 2.74-fold higher than that of the control strain with only one copy of the lipase gene. Overall, co-expression with PDI vastly increased the protein-processing capacity of the ER in P. pastoris. PMID:24315648

  11. Dynamic Visualization of Co-expression in Systems Genetics Data

    SciTech Connect

    New, Joshua Ryan; Huang, Jian; Chesler, Elissa J

    2008-01-01

    Biologists hope to address grand scientific challenges by exploring the abundance of data made available through modern microarray technology and other high-throughput techniques. The impact of this data, however, is limited unless researchers can effectively assimilate such complex information and integrate it into their daily research; interactive visualization tools are called for to support the effort. Specifically, typical studies of gene co-expression require novel visualization tools that enable the dynamic formulation and fine-tuning of hypotheses to aid the process of evaluating sensitivity of key parameters. These tools should allow biologists to develop an intuitive understanding of the structure of biological networks and discover genes which reside in critical positions in networks and pathways. By using a graph as a universal data representation of correlation in gene expression data, our novel visualization tool employs several techniques that when used in an integrated manner provide innovative analytical capabilities. Our tool for interacting with gene co-expression data integrates techniques such as: graph layout, qualitative subgraph extraction through a novel 2D user interface, quantitative subgraph extraction using graph-theoretic algorithms or by querying an optimized b-tree, dynamic level-of-detail graph abstraction, and template-based fuzzy classification using neural networks. We demonstrate our system using a real-world workflow from a large-scale, systems genetics study of mammalian gene co-expression.

  12. Co-expression of interleukin 12 enhances antitumor effects of a novel chimeric promoter-mediated suicide gene therapy in an immunocompetent mouse model

    SciTech Connect

    Xu, Yu; Liu, Zhengchun; Kong, Haiyan; Sun, Wenjie; Liao, Zhengkai; Zhou, Fuxiang; Xie, Conghua; and others

    2011-09-09

    Highlights: A novel chimeric promoter consisting of a CArG element and the hTERT promoter was developed. The promoter was characterized by radiation-inducibility and tumor-specificity. A suicide gene system driven by the promoter showed remarkable cytotoxicity in vitro. Co-expression of IL12 enhanced the promoter-mediated suicide gene therapy in vivo. Abstract: The human telomerase reverse transcriptase (hTERT) promoter has been widely used in targeted gene therapy of cancer. However, low transcriptional activity has limited its clinical application. Here, we designed a novel dual radiation-inducible and tumor-specific promoter system consisting of CArG elements and the hTERT promoter, resulting in increased expression of reporter genes after gamma-irradiation. Therapeutic and side effects of the adenovirus-mediated horseradish peroxidase (HRP)/indole-3-acetic acid (IAA) system downstream of the chimeric promoter were evaluated in mice bearing Lewis lung carcinoma, with or without the adenovirus-mediated interleukin 12 (IL12) gene driven by the cytomegalovirus promoter. The combination treatment suppressed tumor growth more effectively than either agent alone, and was associated with pronounced intratumoral T-lymphocyte infiltration and minor side effects. Our results suggest that combining the HRP/IAA system driven by the novel chimeric promoter with co-expression of IL12 might be an effective and safe targeted gene therapy strategy for cancer.

  13. Construction of a gene-gene interaction network with a combined score across multiple approaches.

    PubMed

    Zhang, A M; Song, H; Shen, Y H; Liu, Y

    2015-01-01

    Recent progress in computational methods for investigating physical and functional gene interactions has provided new insights into the complexity of biological processes. An essential part of these methods is presented visually in the form of gene interaction networks that can be valuable in exploring the mechanisms of disease. Here, a combined network based on gene pairs with an extra layer of reliability was constructed after converting and combining the gene pair scores using a novel algorithm across multiple approaches. Four groups of kidney cancer data sets from ArrayExpress were downloaded and analyzed to identify differentially expressed genes using a rank products analysis tool. Gene co-expression network, protein-protein interaction, co-occurrence network and a combined network were constructed using empirical Bayesian meta-analysis approach, Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database, an odds ratio formula of the cBioPortal for Cancer Genomics and a novel rank algorithm with combined score, respectively. The topological features of these networks were then compared to evaluate their performances. The results indicated that the gene pairs and their relationship rankings were not uniform. The values of topological parameters, such as clustering coefficient and the fitting coefficient R² of the interaction network constructed using our rank-based combination score, were much greater than those of the other networks. The combined network had a classic small world property which transferred information quickly and displayed great resilience to the dysfunction of low-degree hubs with high-clustering and short average path length. It also distinctly followed a scale-free network with a higher reliability. PMID:26125911

  14. Gene co-expression network analysis identifies porcine genes associated with variation in Salmonella shedding

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Salmonella enterica serovar Typhimurium is a gram-negative bacterium that can colonize the gut of humans and several species of food producing farm animals to cause enteric or septicaemic salmonellosis. While many studies have looked into the host genetic response to Salmonella infection, relatively...

  15. A Genome-Wide Association Study for Culm Cellulose Content in Barley Reveals Candidate Genes Co-Expressed with Members of the CELLULOSE SYNTHASE A Gene Family

    PubMed Central

    Houston, Kelly; Burton, Rachel A.; Sznajder, Beata; Rafalski, Antoni J.; Dhugga, Kanwarpal S.; Mather, Diane E.; Taylor, Jillian; Steffenson, Brian J.; Waugh, Robbie; Fincher, Geoffrey B.

    2015-01-01

    Cellulose is a fundamentally important component of cell walls of higher plants. It provides a scaffold that allows the development and growth of the plant to occur in an ordered fashion. Cellulose also provides mechanical strength, which is crucial for both normal development and to enable the plant to withstand both abiotic and biotic stresses. We quantified the cellulose concentration in the culm of 288 two-rowed and 288 six-rowed spring type barley accessions that were part of the USDA funded barley Coordinated Agricultural Project (CAP) program in the USA. When the population structure of these accessions was analysed we identified six distinct populations, four of which we considered to be comprised of a sufficient number of accessions to be suitable for genome-wide association studies (GWAS). These lines had been genotyped with 3072 SNPs so we combined the trait and genetic data to carry out GWAS. The analysis allowed us to identify regions of the genome containing significant associations between molecular markers and cellulose concentration data, including one region cross-validated in multiple populations. To identify candidate genes we assembled the gene content of these regions and used these to query a comprehensive RNA-seq based gene expression atlas. This provided us with gene annotations and associated expression data across multiple tissues, which allowed us to formulate a supported list of candidate genes that regulate cellulose biosynthesis. Several regions identified by our analysis contain genes that are co-expressed with CELLULOSE SYNTHASE A (HvCesA) across a range of tissues and developmental stages. These genes are involved in both primary and secondary cell wall development. In addition, genes that have been previously linked with cellulose synthesis by biochemical methods, such as HvCOBRA, a gene of unknown function, were also associated with cellulose levels in the association panel. Our analyses provide new insights into the

  16. A Consensus Network of Gene Regulatory Factors in the Human Frontal Lobe.

    PubMed

    Berto, Stefano; Perdomo-Sabogal, Alvaro; Gerighausen, Daniel; Qin, Jing; Nowick, Katja

    2016-01-01

    Cognitive abilities, such as memory, learning, language, problem solving, and planning, involve the frontal lobe and other brain areas. Not much is known yet about the molecular basis of cognitive abilities, but it seems clear that cognitive abilities are determined by the interplay of many genes. One approach for analyzing the genetic networks involved in cognitive functions is to study the coexpression networks of genes with known importance for proper cognitive functions, such as genes that have been associated with cognitive disorders like intellectual disability (ID) or autism spectrum disorders (ASD). Because many of these genes are gene regulatory factors (GRFs) we aimed to provide insights into the gene regulatory networks active in the human frontal lobe. Using genome wide human frontal lobe expression data from 10 independent data sets, we first derived 10 individual coexpression networks for all GRFs including their potential target genes. We observed a high level of variability among these 10 independently derived networks, pointing out that relying on results from a single study can only provide limited biological insights. To instead focus on the most confident information from these 10 networks we developed a method for integrating such independently derived networks into a consensus network. This consensus network revealed robust GRF interactions that are conserved across the frontal lobes of different healthy human individuals. Within this network, we detected a strong central module that is enriched for 166 GRFs known to be involved in brain development and/or cognitive disorders. Interestingly, several hubs of the consensus network encode for GRFs that have not yet been associated with brain functions. Their central role in the network suggests them as excellent new candidates for playing an essential role in the regulatory network of the human frontal lobe, which should be investigated in future studies. PMID:27014338
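    The abstract above does not say how the 10 networks were integrated, so the following is only a minimal sketch of one simple possibility, edge voting: keep an undirected edge if it appears in at least a chosen number of the input networks. The function name, the `min_support` parameter, and the toy gene labels are all hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import chain


def consensus_network(networks, min_support):
    """Return the undirected edges present in at least `min_support` networks.

    Each network is an iterable of (gene_a, gene_b) pairs. Edges are stored
    as frozensets so that (a, b) and (b, a) count as the same edge.
    This is plain edge voting, an assumed stand-in for the study's method.
    """
    counts = Counter(
        chain.from_iterable({frozenset(edge) for edge in net} for net in networks)
    )
    return {edge for edge, n in counts.items() if n >= min_support}


# Toy input: three hypothetical coexpression networks over genes A-D.
toy_networks = [
    {("A", "B"), ("B", "C")},
    {("B", "A"), ("C", "D")},
    {("A", "B"), ("B", "C"), ("C", "D")},
]

strict = consensus_network(toy_networks, min_support=3)   # edges seen in all three
lenient = consensus_network(toy_networks, min_support=2)  # edges seen in at least two
```

    Raising `min_support` trades coverage for confidence, which mirrors the abstract's point that a single study gives only limited insight.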

  17. A Consensus Network of Gene Regulatory Factors in the Human Frontal Lobe

    PubMed Central

    Berto, Stefano; Perdomo-Sabogal, Alvaro; Gerighausen, Daniel; Qin, Jing; Nowick, Katja

    2016-01-01

    Cognitive abilities, such as memory, learning, language, problem solving, and planning, involve the frontal lobe and other brain areas. Not much is known yet about the molecular basis of cognitive abilities, but it seems clear that cognitive abilities are determined by the interplay of many genes. One approach for analyzing the genetic networks involved in cognitive functions is to study the coexpression networks of genes with known importance for proper cognitive functions, such as genes that have been associated with cognitive disorders like intellectual disability (ID) or autism spectrum disorders (ASD). Because many of these genes are gene regulatory factors (GRFs) we aimed to provide insights into the gene regulatory networks active in the human frontal lobe. Using genome wide human frontal lobe expression data from 10 independent data sets, we first derived 10 individual coexpression networks for all GRFs including their potential target genes. We observed a high level of variability among these 10 independently derived networks, pointing out that relying on results from a single study can only provide limited biological insights. To instead focus on the most confident information from these 10 networks we developed a method for integrating such independently derived networks into a consensus network. This consensus network revealed robust GRF interactions that are conserved across the frontal lobes of different healthy human individuals. Within this network, we detected a strong central module that is enriched for 166 GRFs known to be involved in brain development and/or cognitive disorders. Interestingly, several hubs of the consensus network encode for GRFs that have not yet been associated with brain functions. Their central role in the network suggests them as excellent new candidates for playing an essential role in the regulatory network of the human frontal lobe, which should be investigated in future studies. PMID:27014338

  18. Detection of gene communities in multi-networks reveals cancer drivers

    NASA Astrophysics Data System (ADS)

    Cantini, Laura; Medico, Enzo; Fortunato, Santo; Caselle, Michele

    2015-12-01

    We propose a new multi-network-based strategy to integrate different layers of genomic information and use them in a coordinated way to identify driving cancer genes. The multi-networks that we consider combine transcription factor co-targeting, microRNA co-targeting, protein-protein interaction and gene co-expression networks. The rationale behind this choice is that gene co-expression and protein-protein interactions require a tight coregulation of the partners and that such a fine-tuned regulation can be obtained only by combining both the transcriptional and post-transcriptional layers of regulation. To extract the relevant biological information from the multi-network we studied its partition into communities. To this end we applied a consensus clustering algorithm based on state-of-the-art community detection methods. Even if our procedure is valid in principle for any pathology, in this work we concentrate on gastric, lung, pancreas and colorectal cancer and identified from the enrichment analysis of the multi-network communities a set of candidate driver cancer genes. Some of them were already known oncogenes while a few are new. The combination of the different layers of information allowed us to extract from the multi-network indications on the regulatory pattern and functional role of both the already known and the new candidate driver genes.

  19. Detection of gene communities in multi-networks reveals cancer drivers

    PubMed Central

    Cantini, Laura; Medico, Enzo; Fortunato, Santo; Caselle, Michele

    2015-01-01

    We propose a new multi-network-based strategy to integrate different layers of genomic information and use them in a coordinated way to identify driving cancer genes. The multi-networks that we consider combine transcription factor co-targeting, microRNA co-targeting, protein-protein interaction and gene co-expression networks. The rationale behind this choice is that gene co-expression and protein-protein interactions require a tight coregulation of the partners and that such a fine-tuned regulation can be obtained only by combining both the transcriptional and post-transcriptional layers of regulation. To extract the relevant biological information from the multi-network we studied its partition into communities. To this end we applied a consensus clustering algorithm based on state-of-the-art community detection methods. Even if our procedure is valid in principle for any pathology, in this work we concentrate on gastric, lung, pancreas and colorectal cancer and identified from the enrichment analysis of the multi-network communities a set of candidate driver cancer genes. Some of them were already known oncogenes while a few are new. The combination of the different layers of information allowed us to extract from the multi-network indications on the regulatory pattern and functional role of both the already known and the new candidate driver genes. PMID:26639632

  20. Neutralization of Bacterial YoeBSpn Toxicity and Enhanced Plant Growth in Arabidopsis thaliana via Co-Expression of the Toxin-Antitoxin Genes

    PubMed Central

    Abu Bakar, Fauziah; Yeo, Chew Chieng; Harikrishna, Jennifer Ann

    2016-01-01

    Bacterial toxin-antitoxin (TA) systems have various cellular functions, including as part of the general stress response. The genome of the Gram-positive human pathogen Streptococcus pneumoniae harbors several putative TA systems, including yefM-yoeBSpn, which is one of four systems that had been demonstrated to be biologically functional. Overexpression of the yoeBSpn toxin gene resulted in cell stasis and eventually cell death in its native host, as well as in Escherichia coli. Our previous work showed that induced expression of a yoeBSpn toxin-Green Fluorescent Protein (GFP) fusion gene apparently triggered apoptosis and was lethal in the model plant, Arabidopsis thaliana. In this study, we investigated the effects of co-expression of the yefMSpn antitoxin and yoeBSpn toxin-GFP fusion in transgenic A. thaliana. When co-expressed in Arabidopsis, the YefMSpn antitoxin was found to neutralize the toxicity of YoeBSpn-GFP. Interestingly, the inducible expression of both yefMSpn antitoxin and yoeBSpn toxin-GFP fusion in transgenic hybrid Arabidopsis resulted in larger rosette leaves and taller plants with a higher number of inflorescence stems and increased silique production. To our knowledge, this is the first demonstration of a prokaryotic antitoxin neutralizing its cognate toxin in plant cells. PMID:27104531

  1. Neutralization of Bacterial YoeBSpn Toxicity and Enhanced Plant Growth in Arabidopsis thaliana via Co-Expression of the Toxin-Antitoxin Genes.

    PubMed

    Abu Bakar, Fauziah; Yeo, Chew Chieng; Harikrishna, Jennifer Ann

    2016-01-01

    Bacterial toxin-antitoxin (TA) systems have various cellular functions, including as part of the general stress response. The genome of the Gram-positive human pathogen Streptococcus pneumoniae harbors several putative TA systems, including yefM-yoeBSpn, which is one of four systems that had been demonstrated to be biologically functional. Overexpression of the yoeBSpn toxin gene resulted in cell stasis and eventually cell death in its native host, as well as in Escherichia coli. Our previous work showed that induced expression of a yoeBSpn toxin-Green Fluorescent Protein (GFP) fusion gene apparently triggered apoptosis and was lethal in the model plant, Arabidopsis thaliana. In this study, we investigated the effects of co-expression of the yefMSpn antitoxin and yoeBSpn toxin-GFP fusion in transgenic A. thaliana. When co-expressed in Arabidopsis, the YefMSpn antitoxin was found to neutralize the toxicity of YoeBSpn-GFP. Interestingly, the inducible expression of both yefMSpn antitoxin and yoeBSpn toxin-GFP fusion in transgenic hybrid Arabidopsis resulted in larger rosette leaves and taller plants with a higher number of inflorescence stems and increased silique production. To our knowledge, this is the first demonstration of a prokaryotic antitoxin neutralizing its cognate toxin in plant cells. PMID:27104531

  2. Transient Co-Expression of Post-Transcriptional Gene Silencing Suppressors for Increased in Planta Expression of a Recombinant Anthrax Receptor Fusion Protein

    PubMed Central

    Arzola, Lucas; Chen, Junxing; Rattanaporn, Kittipong; Maclean, James M.; McDonald, Karen A.

    2011-01-01

    Potential epidemics of infectious diseases and the constant threat of bioterrorism demand rapid, scalable, and cost-efficient manufacturing of therapeutic proteins. Molecular farming of tobacco plants provides an alternative for the recombinant production of therapeutics. We have developed a transient production platform that uses Agrobacterium infiltration of Nicotiana benthamiana plants to express a novel anthrax receptor decoy protein (immunoadhesin), CMG2-Fc. This chimeric fusion protein, designed to protect against the deadly anthrax toxins, is composed of the von Willebrand factor A (VWA) domain of human capillary morphogenesis 2 (CMG2), an effective anthrax toxin receptor, and the Fc region of human immunoglobulin G (IgG). We evaluated, in N. benthamiana intact plants and detached leaves, the expression of CMG2-Fc under the control of the constitutive CaMV 35S promoter, and the co-expression of CMG2-Fc with nine different viral suppressors of post-transcriptional gene silencing (PTGS): p1, p10, p19, p21, p24, p25, p38, 2b, and HCPro. Overall, transient CMG2-Fc expression was higher on intact plants than detached leaves. Maximum expression was observed with p1 co-expression at 3.5 days post-infiltration (DPI), with a level of 0.56 g CMG2-Fc per kg of leaf fresh weight and 1.5% of the total soluble protein, a ten-fold increase in expression when compared to absence of suppression. Co-expression with the p25 PTGS suppressor also significantly increased the CMG2-Fc expression level after just 3.5 DPI. PMID:21954339

  3. Building Developmental Gene Regulatory Networks

    PubMed Central

    Li, Enhu; Davidson, Eric H.

    2009-01-01

    Animal development is an elaborate process programmed by genomic regulatory instructions. Regulatory genes encode transcription factors and signal molecules, and their expression is under the control of cis-regulatory modules that define the logic of transcriptional responses to the inputs of other regulatory genes. The functional linkages amongst regulatory genes constitute the gene regulatory networks (GRNs) that govern cell specification and patterning in development. Constructing such networks requires identification of the regulatory genes involved and characterization of their temporal and spatial expression patterns. Interactions (activation/repression) among transcription factors or signals can be investigated by large-scale perturbation analysis, in which the function of each gene is specifically blocked. Resultant expression changes are then integrated to identify direct linkages, and to reveal the structure of the GRN. Predicted GRN linkages can be tested and verified by cis-regulatory analysis. The explanatory power of the GRN was shown in the lineage specification of sea urchin endomesoderm. Acquiring such networks is essential for a systematic and mechanistic understanding of the developmental process. PMID:19530131

  4. Lists2Networks: Integrated analysis of gene/protein lists

    PubMed Central

    2010-01-01

    Background: Systems biologists are faced with the difficulty of analyzing results from large-scale studies that profile the activity of many genes, RNAs and proteins, applied in different experiments, under different conditions, and reported in different publications. To address this challenge it is desirable to compare the results from different related studies such as mRNA expression microarrays, genome-wide ChIP-X, RNAi screens, proteomics and phosphoproteomics experiments in a coherent global framework. In addition, linking high-content multilayered experimental results with prior biological knowledge can be useful for identifying functional themes and forming novel hypotheses. Results: We present Lists2Networks, a web-based system that allows users to upload lists of mammalian genes/proteins onto a server-based program for integrated analysis. The system includes web-based tools to manipulate lists with different set operations, to expand lists using existing mammalian networks of protein-protein interactions, co-expression correlation, or background knowledge co-annotation correlation, as well as to apply gene-list enrichment analyses against many gene-list libraries of prior biological knowledge such as pathways, gene ontology terms, kinase-substrate, microRNA-mRNA, and protein-protein interactions, metabolites, and protein domains. Such analyses can be applied to several lists at once against many prior knowledge libraries of gene-lists associated with specific annotations. The system also contains features that allow users to export networks and share lists with other users of the system. Conclusions: Lists2Networks is a user-friendly web-based software system expected to significantly ease the computational analysis process for experimental systems biologists employing high-throughput experiments at multiple layers of regulation. The system is freely available at http://www.lists2networks.org. PMID:20152038
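    The abstract mentions set operations on gene lists and gene-list enrichment analysis but does not name the statistic used. As an illustration only, the sketch below uses a one-sided hypergeometric test, a common choice for this kind of overlap question; it is not claimed to be what Lists2Networks implements. The function name, gene symbols, and background size are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(user_list, library_list, background_size):
    """One-sided hypergeometric p-value for the overlap of two gene lists.

    Returns P(overlap >= observed) when len(user_list) genes are drawn at
    random from a background of `background_size` genes, of which
    len(library_list) are annotated. Assumes both lists lie in the background.
    """
    user, library = set(user_list), set(library_list)
    k = len(user & library)          # observed overlap (a set intersection)
    n, K, N = len(user), len(library), background_size
    tail = sum(
        comb(K, i) * comb(N - K, n - i)   # ways to draw exactly i annotated genes
        for i in range(k, min(n, K) + 1)
    )
    return tail / comb(N, n)


# Hypothetical example: 3 uploaded genes, a 4-gene library term, 10 background genes.
uploaded = {"geneA", "geneB", "geneC"}
library_term = {"geneA", "geneB", "geneD", "geneE"}

shared = uploaded & library_term     # set operation: intersection
either = uploaded | library_term     # set operation: union
p_value = hypergeom_enrichment_p(uploaded, library_term, background_size=10)
```

    With many library terms tested at once, a multiple-testing correction would normally follow; that step is left out of this sketch.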

  5. Reconstruction of Gene Networks of Iron Response in Shewanella oneidensis

    SciTech Connect

    Yang, Yunfeng; Harris, Daniel P; Luo, Feng; Joachimiak, Marcin; Wu, Liyou; Dehal, Paramvir; Jacobsen, Janet; Yang, Zamin Koo; Gao, Haichun; Arkin, Adam; Palumbo, Anthony Vito; Zhou, Jizhong

    2009-01-01

    It is of great interest to study the iron response of the γ-proteobacterium Shewanella oneidensis since it possesses a high content of iron and is capable of utilizing iron for anaerobic respiration. We report here that the iron response in S. oneidensis is a rapid process. To gain more insights into the bacterial response to iron, temporal gene expression profiles were examined for iron depletion and repletion, resulting in identification of iron-responsive biological pathways in a gene co-expression network. Iron acquisition systems, including genes unique to S. oneidensis, were rapidly and strongly induced by iron depletion, and repressed by iron repletion. Some were required for iron depletion, as exemplified by the mutational analysis of the putative siderophore biosynthesis protein SO3032. Unexpectedly, a number of genes related to anaerobic energy metabolism were repressed by iron depletion and induced by repletion, which might be due to the iron storage potential of their protein products. Other iron-responsive biological pathways include protein degradation, aerobic energy metabolism and protein synthesis. Furthermore, sequence motifs enriched in gene clusters as well as their corresponding DNA-binding proteins (Fur, CRP and RpoH) were identified, resulting in a regulatory network of iron response in S. oneidensis. Together, this work provides an overview of iron response and reveals novel features in S. oneidensis, including Shewanella-specific iron acquisition systems, and suggests the intimate relationship between anaerobic energy metabolism and iron response.

  6. Gene networks controlling petal organogenesis.

    PubMed

    Huang, Tengbo; Irish, Vivian F

    2016-01-01

    One of the biggest unanswered questions in developmental biology is how growth is controlled. Petals are an excellent organ system for investigating growth control in plants: petals are dispensable, have a simple structure, and are largely refractory to environmental perturbations that can alter their size and shape. In recent studies, a number of genes controlling petal growth have been identified. The overall picture of how such genes function in petal organogenesis is beginning to be elucidated. This review will focus on studies using petals as a model system to explore the underlying gene networks that control organ initiation, growth, and final organ morphology. PMID:26428062

  7. Buffering in cyclic gene networks

    NASA Astrophysics Data System (ADS)

    Glyzin, S. D.; Kolesov, A. Yu.; Rozov, N. Kh.

    2016-06-01

    We consider cyclic chains of unidirectionally coupled delay differential-difference equations that are mathematical models of artificial oscillating gene networks. We establish that the buffering phenomenon is realized in these systems for an appropriate choice of the parameters: any given finite number of stable periodic motions of a special type, the so-called traveling waves, coexist.

  8. The Gene Network Underlying Hypodontia.

    PubMed

    Yin, W; Bian, Z

    2015-07-01

    Mammalian tooth development is a precise and complicated procedure. Several signaling pathways, such as nuclear factor (NF)-κB and WNT, are key regulators of tooth development. Any disturbance of these signaling pathways can potentially affect or block normal tooth development, and presently, there are more than 150 syndromes and 80 genes known to be related to tooth agenesis. Clarifying the interaction and crosstalk among these genes will provide important information regarding the mechanisms underlying missing teeth. In the current review, we summarize recently published findings on genes related to isolated and syndromic tooth agenesis; most of these genes function as positive regulators of cell proliferation or negative regulators of cell differentiation and apoptosis. Furthermore, we explore the corresponding networks involving these genes in addition to their implications for the clinical management of tooth agenesis. We conclude that this requires further study to improve patients' quality of life in the future. PMID:25910507

  9. Comparison of co-expression measures: mutual information, correlation, and model based indices

    PubMed Central

    2012-01-01

    Background Co-expression measures are often used to define networks among genes. Mutual information (MI) is often used as a generalized correlation measure. It is not clear how much MI adds beyond standard (robust) correlation measures or regression model based association measures. Further, it is important to assess what transformations of these and other co-expression measures lead to biologically meaningful modules (clusters of genes). Results We provide a comprehensive comparison between mutual information and several correlation measures in 8 empirical data sets and in simulations. We also study different approaches for transforming an adjacency matrix, e.g. using the topological overlap measure. Overall, we confirm close relationships between MI and correlation in all data sets which reflects the fact that most gene pairs satisfy linear or monotonic relationships. We discuss rare situations when the two measures disagree. We also compare correlation and MI based approaches when it comes to defining co-expression network modules. We show that a robust measure of correlation (the biweight midcorrelation transformed via the topological overlap transformation) leads to modules that are superior to MI based modules and maximal information coefficient (MIC) based modules in terms of gene ontology enrichment. We present a function that relates correlation to mutual information which can be used to approximate the mutual information from the corresponding correlation coefficient. We propose the use of polynomial or spline regression models as an alternative to MI for capturing non-linear relationships between quantitative variables. Conclusion The biweight midcorrelation outperforms MI in terms of elucidating gene pairwise relationships. Coupled with the topological overlap matrix transformation, it often leads to more significantly enriched co-expression modules. Spline and polynomial networks form attractive alternatives to MI in case of non-linear relationships
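
    The abstract mentions a function that approximates mutual information from a correlation coefficient but does not state it. As a rough illustration only, the sketch below uses the textbook identity for a bivariate normal pair, MI = -0.5 * ln(1 - r^2) (in nats), which is an assumption made here and not necessarily the function proposed in the paper. The helper names `pearson` and `gaussian_mi_from_cor` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two profiles of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi_from_cor(r):
    """Mutual information (nats) implied by correlation r.

    Assumes the two variables are jointly normal; this is a textbook
    identity, not the fitted function from the paper.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return -0.5 * math.log(1.0 - r * r)
```

    Under this assumption the implied MI depends only on |r|, which is consistent with the abstract's observation that MI and correlation agree closely when gene pairs are linearly related.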

  10. Integrating Genetic and Network Analysis to Characterize Genes Related to Mouse Weight

    PubMed Central

    Zhang, Bin; Wang, Susanna; Plaisier, Christopher; Castellanos, Ruth; Brozell, Alec; Schadt, Eric E; Drake, Thomas A

    2006-01-01

    Systems biology approaches that are based on the genetics of gene expression have been fruitful in identifying genetic regulatory loci related to complex traits. We use microarray and genetic marker data from an F2 mouse intercross to examine the large-scale organization of the gene co-expression network in liver, and annotate several gene modules in terms of 22 physiological traits. We identify chromosomal loci (referred to as module quantitative trait loci, mQTL) that perturb the modules and describe a novel approach that integrates network properties with genetic marker information to model gene/trait relationships. Specifically, using the mQTL and the intramodular connectivity of a body weight–related module, we describe which factors determine the relationship between gene expression profiles and weight. Our approach results in the identification of genetic targets that influence gene modules (pathways) that are related to the clinical phenotypes of interest. PMID:16934000

  11. Integrating genetic and network analysis to characterize genes related to mouse weight.

    PubMed

    Ghazalpour, Anatole; Doss, Sudheer; Zhang, Bin; Wang, Susanna; Plaisier, Christopher; Castellanos, Ruth; Brozell, Alec; Schadt, Eric E; Drake, Thomas A; Lusis, Aldons J; Horvath, Steve

    2006-08-18

    Systems biology approaches that are based on the genetics of gene expression have been fruitful in identifying genetic regulatory loci related to complex traits. We use microarray and genetic marker data from an F2 mouse intercross to examine the large-scale organization of the gene co-expression network in liver, and annotate several gene modules in terms of 22 physiological traits. We identify chromosomal loci (referred to as module quantitative trait loci, mQTL) that perturb the modules and describe a novel approach that integrates network properties with genetic marker information to model gene/trait relationships. Specifically, using the mQTL and the intramodular connectivity of a body weight-related module, we describe which factors determine the relationship between gene expression profiles and weight. Our approach results in the identification of genetic targets that influence gene modules (pathways) that are related to the clinical phenotypes of interest. PMID:16934000

  12. Plant Evolution: Evolving Antagonistic Gene Regulatory Networks.

    PubMed

    Cooper, Endymion D

    2016-06-20

    Developing a structurally complex phenotype requires a complex regulatory network. A new study shows how gene duplication provides a potential source of antagonistic interactions, an important component of gene regulatory networks. PMID:27326708

  13. Gene networks and liar paradoxes

    PubMed Central

    Isalan, Mark

    2009-01-01

    Network motifs are small patterns of connections, found over-represented in gene regulatory networks. An example is the negative feedback loop (e.g. factor A represses itself). This opposes its own state so that when ‘on’ it tends towards ‘off’ – and vice versa. Here, we argue that such self-opposition, if considered dimensionlessly, is analogous to the liar paradox: ‘This statement is false’. When ‘true’ it implies ‘false’ – and vice versa. Such logical constructs have provided philosophical consternation for over 2000 years. Extending the analogy, other network topologies give strikingly varying outputs over different dimensions. For example, the motif ‘A activates B and A. B inhibits A’ can give switches or oscillators with time only, or can lead to Turing-type patterns with both space and time (spots, stripes or waves). It is argued here that the dimensionless form reduces to a variant of ‘The following statement is true. The preceding statement is false’. Thus, merely having a static topological description of a gene network can lead to a liar paradox. Network diagrams are only snapshots of dynamic biological processes and apparent paradoxes can reveal important biological mechanisms that are far from paradoxical when considered explicitly in time and space. PMID:19722183
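
    The self-opposition described above can be sketched as a one-node Boolean network with a synchronous update, which is a simplifying assumption made here and not a model taken from the paper. The function names are hypothetical. Without an explicit time axis the rule "A = not A" has no consistent solution (no fixed point); with time it simply alternates.

```python
from itertools import product


def simulate(update, state, steps):
    """Iterate a Boolean update rule and return the visited states."""
    trajectory = [state]
    for _ in range(steps):
        state = update(state)
        trajectory.append(state)
    return trajectory


def fixed_points(update, n_nodes):
    """All states that the update rule maps onto themselves."""
    return [s for s in product([False, True], repeat=n_nodes) if update(s) == s]


def self_repressor(state):
    """Negative feedback loop: factor A represses itself."""
    (a,) = state
    return (not a,)


def self_activator(state):
    """Contrast case: factor A maintains its own state."""
    (a,) = state
    return (a,)
```

    Calling `fixed_points(self_repressor, 1)` returns an empty list, the static "paradox", while `simulate(self_repressor, (True,), 4)` yields an on/off oscillation once time is made explicit.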

  14. Gene networks and liar paradoxes.

    PubMed

    Isalan, Mark

    2009-10-01

    Network motifs are small patterns of connections, found over-represented in gene regulatory networks. An example is the negative feedback loop (e.g. factor A represses itself). This opposes its own state so that when 'on' it tends towards 'off' - and vice versa. Here, we argue that such self-opposition, if considered dimensionlessly, is analogous to the liar paradox: 'This statement is false'. When 'true' it implies 'false' - and vice versa. Such logical constructs have provided philosophical consternation for over 2000 years. Extending the analogy, other network topologies give strikingly varying outputs over different dimensions. For example, the motif 'A activates B and A. B inhibits A' can give switches or oscillators with time only, or can lead to Turing-type patterns with both space and time (spots, stripes or waves). It is argued here that the dimensionless form reduces to a variant of 'The following statement is true. The preceding statement is false'. Thus, merely having a static topological description of a gene network can lead to a liar paradox. Network diagrams are only snapshots of dynamic biological processes and apparent paradoxes can reveal important biological mechanisms that are far from paradoxical when considered explicitly in time and space. PMID:19722183

  15. A moth pheromone brewery: production of (Z)-11-hexadecenol by heterologous co-expression of two biosynthetic genes from a noctuid moth in a yeast cell factory

    PubMed Central

    2013-01-01

    Background Moths (Lepidoptera) are highly dependent on chemical communication to find a mate. Compared to conventional unselective insecticides, synthetic pheromones have successfully served to lure male moths as a specific and environmentally friendly way to control important pest species. However, the chemical synthesis and purification of the sex pheromone components in large amounts is a difficult and costly task. The repertoire of enzymes involved in moth pheromone biosynthesis in insecta can be seen as a library of specific catalysts that can be used to facilitate the synthesis of a particular chemical component. In this study, we present a novel approach to effectively aid in the preparation of semi-synthetic pheromone components using an engineered vector co-expressing two key biosynthetic enzymes in a simple yeast cell factory. Results We first identified and functionally characterized a ∆11 Fatty-Acyl Desaturase and a Fatty-Acyl Reductase from the Turnip moth, Agrotis segetum. The ∆11-desaturase produced predominantly Z11-16:acyl, a common pheromone component precursor, from the abundant yeast palmitic acid and the FAR transformed a series of saturated and unsaturated fatty acids into their corresponding alcohols which may serve as pheromone components in many moth species. Secondly, when we co-expressed the genes in the Brewer’s yeast Saccharomyces cerevisiae, a set of long-chain fatty acids and alcohols that are not naturally occurring in yeast were produced from inherent yeast fatty acids, and the presence of (Z)-11-hexadecenol (Z11-16:OH), demonstrated that both heterologous enzymes were active in concert. A 100 ml batch yeast culture produced on average 19.5 μg Z11-16:OH. Finally, we demonstrated that oxidized extracts from the yeast cells containing (Z)-11-hexadecenal and other aldehyde pheromone compounds elicited specific electrophysiological activity from male antennae of the Tobacco budworm, Heliothis virescens, supporting the idea that

  16. Co-regulation of metabolic genes is better explained by flux coupling than by network distance.

    PubMed

    Notebaart, Richard A; Teusink, Bas; Siezen, Roland J; Papp, Balázs

    2008-01-01

    To what extent can modes of gene regulation be explained by systems-level properties of metabolic networks? Prior studies on co-regulation of metabolic genes have mainly focused on graph-theoretical features of metabolic networks and demonstrated a decreasing level of co-expression with increasing network distance, a naïve, but widely used, topological index. Others have suggested that static graph representations can poorly capture dynamic functional associations, e.g., in the form of dependence of metabolic fluxes across genes in the network. Here, we systematically tested the relative importance of metabolic flux coupling and network position on gene co-regulation, using a genome-scale metabolic model of Escherichia coli. After validating the computational method with empirical data on flux correlations, we confirm that genes coupled by their enzymatic fluxes not only show similar expression patterns, but also share transcriptional regulators and frequently reside in the same operon. In contrast, we demonstrate that network distance per se has relatively minor influence on gene co-regulation. Moreover, the type of flux coupling can explain refined properties of the regulatory network that are ignored by simple graph-theoretical indices. Our results underline the importance of studying functional states of cellular networks to define physiologically relevant associations between genes and should stimulate future developments of novel functional genomic tools. PMID:18225949

  17. Adaptive Models for Gene Networks

    PubMed Central

    Shin, Yong-Jun; Sayed, Ali H.; Shen, Xiling

    2012-01-01

    Biological systems are often treated as time-invariant by computational models that use fixed parameter values. In this study, we demonstrate that the behavior of the p53-MDM2 gene network in individual cells can be tracked using adaptive filtering algorithms and the resulting time-variant models can approximate experimental measurements more accurately than time-invariant models. Adaptive models with time-variant parameters can help reduce modeling complexity and can more realistically represent biological systems. PMID:22359614
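
    The abstract does not say which adaptive filtering algorithms were used. As an illustration of the general idea, the sketch below applies a one-tap least-mean-squares (LMS) filter, chosen here as an assumption, to a toy input/output series whose gain changes over time, and compares it with a single fixed gain fitted by least squares. All names, the step size `mu`, and the toy data are hypothetical.

```python
def lms_track(xs, ys, mu=0.5, w0=0.0):
    """One-tap LMS filter: model y[t] as w[t] * x[t] and adapt w online."""
    w = w0
    estimates, errors = [], []
    for x, y in zip(xs, ys):
        err = y - w * x      # prediction error before the update
        w += mu * err * x    # move w along the negative error gradient
        estimates.append(w)
        errors.append(err)
    return estimates, errors


def fixed_gain(xs, ys):
    """Least-squares estimate of one time-invariant gain."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def mean_square(values):
    """Mean of squared values, used to compare model errors."""
    return sum(v * v for v in values) / len(values)
```

    On a series whose true gain steps from 1 to 2 halfway through, the fixed model settles on a compromise value and keeps a constant residual, whereas the LMS estimate follows the step, which mirrors the time-variant versus time-invariant contrast drawn in the abstract.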

  18. iRegulon: From a Gene List to a Gene Regulatory Network Using Large Motif and Track Collections

    PubMed Central

    Imrichová, Hana; Van de Sande, Bram; Standaert, Laura; Christiaens, Valerie; Hulselmans, Gert; Herten, Koen; Naval Sanchez, Marina; Potier, Delphine; Svetlichnyy, Dmitry; Kalender Atak, Zeynep; Fiers, Mark; Marine, Jean-Christophe; Aerts, Stein

    2014-01-01

    Identifying master regulators of biological processes and mapping their downstream gene networks are key challenges in systems biology. We developed a computational method, called iRegulon, to reverse-engineer the transcriptional regulatory network underlying a co-expressed gene set using cis-regulatory sequence analysis. iRegulon implements a genome-wide ranking-and-recovery approach to detect enriched transcription factor motifs and their optimal sets of direct targets. We increase the accuracy of network inference by using very large motif collections of up to ten thousand position weight matrices collected from various species, and linking these to candidate human TFs via a motif2TF procedure. We validate iRegulon on gene sets derived from ENCODE ChIP-seq data with increasing levels of noise, and we compare iRegulon with existing motif discovery methods. Next, we use iRegulon on more challenging types of gene lists, including microRNA target sets, protein-protein interaction networks, and genetic perturbation data. In particular, we over-activate p53 in breast cancer cells, followed by RNA-seq and ChIP-seq, and could identify an extensive up-regulated network controlled directly by p53. Similarly we map a repressive network with no indication of direct p53 regulation but rather an indirect effect via E2F and NFY. Finally, we generalize our computational framework to include regulatory tracks such as ChIP-seq data and show how motif and track discovery can be combined to map functional regulatory interactions among co-expressed genes. iRegulon is available as a Cytoscape plugin from http://iregulon.aertslab.org. PMID:25058159

  19. Dynamical properties of gene regulatory networks involved in long-term potentiation

    PubMed Central

    Nido, Gonzalo S.; Ryan, Margaret M.; Benuskova, Lubica; Williams, Joanna M.

    2015-01-01

    The long-lasting enhancement of synaptic effectiveness known as long-term potentiation (LTP) is considered to be the cellular basis of long-term memory. LTP elicits changes at the cellular and molecular level, including temporally specific alterations in gene networks. LTP can be seen as a biological process in which a transient signal sets a new homeostatic state that is “remembered” by cellular regulatory systems. Previously, we have shown that early growth response (Egr) transcription factors are of fundamental importance to gene networks recruited early after LTP induction. From a systems perspective, we hypothesized that these networks will show less stable architecture, while networks recruited later will exhibit increased stability, being more directly related to LTP consolidation. Using random Boolean network (RBN) simulations we found that the network derived at 24 h was markedly more stable than those derived at 20 min or 5 h post-LTP. This temporal effect on the vulnerability of the networks is mirrored by what is known about the vulnerability of LTP and memory itself. Differential gene co-expression analysis further highlighted the importance of the Egr family and found a rapid enrichment in connectivity at 20 min, followed by a systematic decrease, providing a potential explanation for the down-regulation of gene expression at 24 h documented in our preceding studies. We also found that the architecture exhibited by a control and the 24 h LTP co-expression networks fit well to a scale-free distribution, known to be robust against perturbations. By contrast the 20 min and 5 h networks showed more truncated distributions. These results suggest that a new homeostatic state is achieved 24 h post-LTP. Together, these data present an integrated view of the genomic response following LTP induction by which the stability of the networks regulated at different times parallel the properties observed at the synapse. PMID:26300724
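
    The abstract does not describe how stability was scored in the random Boolean network (RBN) simulations. One common way to probe it, assumed here purely for illustration, is to flip a single node and measure how far the perturbed and unperturbed copies drift apart. The wiring scheme, the unbiased truth tables, and all function names below are hypothetical choices.

```python
import random


def make_rbn(n, k, rng):
    """Random Boolean network: each node has k random inputs and a random truth table."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.random() < 0.5 for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables


def step(state, net):
    """Synchronously update every node from the current state of its inputs."""
    inputs, tables = net
    nxt = []
    for ins, table in zip(inputs, tables):
        idx = 0
        for i in ins:
            idx = (idx << 1) | int(state[i])  # encode input values as a table index
        nxt.append(table[idx])
    return tuple(nxt)


def perturbation_spread(net, n, steps, trials, rng):
    """Mean normalized Hamming distance after a one-node flip and `steps` updates."""
    total = 0.0
    for _ in range(trials):
        a = tuple(rng.random() < 0.5 for _ in range(n))
        flip = rng.randrange(n)
        b = tuple((not v) if i == flip else v for i, v in enumerate(a))
        for _ in range(steps):
            a, b = step(a, net), step(b, net)
        total += sum(x != y for x, y in zip(a, b)) / n
    return total / trials
```

    A smaller spread means the perturbation dies out, so in this sketch a more stable network would score closer to zero.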

  20. Co-expression of a modified maize ribosome-inactivating protein and a rice basic chitinase gene in transgenic rice plants confers enhanced resistance to sheath blight.

    PubMed

    Kim, Ju-Kon; Jang, In-Cheol; Wu, Ray; Zuo, Wei-Neng; Boston, Rebecca S; Lee, Yong-Hwan; Ahn, Il-Pyung; Nahm, Baek Hie

    2003-08-01

    Chitinases, beta-1,3-glucanases, and ribosome-inactivating proteins are reported to have antifungal activity in plants. With the aim of producing fungus-resistant transgenic plants, we co-expressed a modified maize ribosome-inactivating protein gene, MOD1, and a rice basic chitinase gene, RCH10, in transgenic rice plants. A construct containing MOD1 and RCH10 under the control of the rice rbcS and Act1 promoters, respectively, was co-transformed with a plasmid containing the herbicide-resistance gene bar as a selection marker into rice by particle bombardment. Several transformants analyzed by genomic Southern-blot hybridization demonstrated integration of multiple copies of the foreign gene into rice chromosomes. Immunoblot experiments showed that MOD1 formed approximately 0.5% of the total soluble protein in transgenic leaves. RCH10 expression was examined using the native polyacrylamide-overlay gel method, and high RCH10 activity was observed in leaf tissues where endogenous RCH10 is not expressed. R1 plants were analyzed in a similar way, and the Southern-blot patterns and levels of transgene expression remained the same as in the parental line. Analysis of the response of R2 plants to three fungal pathogens of rice, Rhizoctonia solani, Bipolaris oryzae, and Magnaporthe grisea, indicated statistically significant symptom reduction only in the case of R. solani (sheath blight). The increased resistance co-segregated with herbicide tolerance, reflecting a correlation between the resistance phenotype and transgene expression. PMID:12885168

  1. Network-based integration of GWAS and gene expression identifies a HOX-centric network associated with serous ovarian cancer risk

    PubMed Central

    Kar, Siddhartha P.; Tyrer, Jonathan P.; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K.H.; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V.; Bean, Yukie T.; Beckmann, Matthias W.; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S.; Cramer, Daniel; Cunningham, Julie M.; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A.; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F.; Edwards, Robert P.; Ekici, Arif B.; Fasching, Peter A.; Fridley, Brooke L.; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G.; Glasspool, Rosalind; Goode, Ellen L.; Goodman, Marc T.; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A.T.; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K.; Hosono, Satoyo; Iversen, Edwin S.; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K.; Kelemen, Linda E.; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A.; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D.; Lee, Alice W.; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A.; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R.; McNeish, Iain A.; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B.; Narod, Steven A.; Nedergaard, Lotte; Ness, Roberta B.; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H.; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M.; Permuth-Wey, Jennifer; Phelan, Catherine M.; Pike, Malcolm C.; Poole, Elizabeth M.; Ramus, Susan J.; Risch, Harvey A.; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H.; Rudolph, Anja; Runnebaum, Ingo B.; Rzepecka, Iwona K.; Salvesen, Helga B.; Schildkraut, Joellen M.; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C.; Sucheston-Campbell, Lara E.; Tangen, Ingvild L.; Teo, Soo-Hwang; Terry, Kathryn L.; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S.; van Altena, Anne M.; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A.; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S.; Wicklund, Kristine G.; Wilkens, Lynne R.; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A.; Monteiro, Alvaro N. A.; Freedman, Matthew L.; Gayther, Simon A.; Pharoah, Paul D. P.

    2015-01-01

    Background Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by co-expression may also be enriched for additional EOC risk associations. Methods We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly co-expressed with each selected TF gene in the unified microarray data set of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this data set were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Results Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P<0.05 and FDR<0.05). These results were replicated (P<0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. Conclusion We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Impact Network analysis integrating large, context-specific data sets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. PMID:26209509

  2. Consensus gene regulatory networks: combining multiple microarray gene expression datasets

    NASA Astrophysics Data System (ADS)

    Peeling, Emma; Tucker, Allan

    2007-09-01

    In this paper we present a method for modelling gene regulatory networks by forming a consensus Bayesian network model from multiple microarray gene expression datasets. Our method is based on combining Bayesian network graph topologies and does not require any special pre-processing of the datasets, such as re-normalisation. We evaluate our method on a synthetic regulatory network and part of the yeast heat-shock response regulatory network using publicly available yeast microarray datasets. Results are promising; the consensus networks formed provide a broader view of the potential underlying network, obtaining an increased true positive rate over networks constructed from a single data source.
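
    The abstract says the consensus model is formed by combining Bayesian network graph topologies but does not give the combination rule. The sketch below assumes a simple vote: a directed edge is kept when it appears in at least a chosen fraction of the per-dataset networks. The function name `consensus_edges`, the `min_support` threshold, and the placeholder gene labels are all hypothetical.

```python
from collections import Counter


def consensus_edges(networks, min_support=0.5):
    """Keep directed edges found in at least `min_support` of the input networks.

    Each network is an iterable of (regulator, target) pairs learned from one
    dataset. Returns a dict mapping each retained edge to its support fraction.
    """
    if not networks:
        raise ValueError("need at least one network")
    counts = Counter(edge for net in networks for edge in set(net))
    n = len(networks)
    return {edge: c / n for edge, c in counts.items() if c / n >= min_support}


# Three toy topologies, one per hypothetical microarray dataset.
toy_networks = [
    {("g1", "g2"), ("g1", "g3")},
    {("g1", "g2")},
    {("g1", "g2"), ("g4", "g3")},
]
```

    With the toy input, `consensus_edges(toy_networks)` keeps only the edge from g1 to g2, which all three networks share. Because only edge lists are compared, the datasets never need to be put on a common scale, in line with the abstract's point that no re-normalisation is required.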

  3. ESR1 Is Co-Expressed with Closely Adjacent Uncharacterised Genes Spanning a Breast Cancer Susceptibility Locus at 6q25.1

    PubMed Central

    Dunbier, Anita K.; Anderson, Helen; Ghazoui, Zara; Lopez-Knowles, Elena; Pancholi, Sunil; Ribas, Ricardo; Drury, Suzanne; Sidhu, Kally; Leary, Alexandra; Martin, Lesley-Ann; Dowsett, Mitch

    2011-01-01

    Approximately 80% of human breast carcinomas present as oestrogen receptor α-positive (ER+ve) disease, and ER status is a critical factor in treatment decision-making. Recently, single nucleotide polymorphisms (SNPs) in the region immediately upstream of the ER gene (ESR1) on 6q25.1 have been associated with breast cancer risk. Our investigation of factors associated with the level of expression of ESR1 in ER+ve tumours has revealed unexpected associations between genes in this region and ESR1 expression that are important to consider in studies of the genetic causes of breast cancer risk. RNA from tumour biopsies taken from 104 postmenopausal women before and after 2 weeks treatment with an aromatase (oestrogen synthase) inhibitor was analyzed on Illumina 48K microarrays. Multiple-testing corrected Spearman correlation revealed that three previously uncharacterized open reading frames (ORFs) located immediately upstream of ESR1, C6ORF96, C6ORF97, and C6ORF211 were highly correlated with ESR1 (Rs = 0.67, 0.64, and 0.55 respectively, FDR<1×10^−7). Publicly available datasets confirmed this relationship in other groups of ER+ve tumours. DNA copy number changes did not account for the correlations. The correlations were maintained in cultured cells. An ERα antagonist did not affect the ORFs' expression or their correlation with ESR1, suggesting their transcriptional co-activation is not directly mediated by ERα. siRNA inhibition of C6ORF211 suppressed proliferation in MCF7 cells, and C6ORF211 positively correlated with a proliferation metagene in tumours. In contrast, C6ORF97 expression correlated negatively with the metagene and predicted for improved disease-free survival in a tamoxifen-treated published dataset, independently of ESR1. Our observations suggest that some of the biological effects previously attributed to ER could be mediated and/or modified by these co-expressed genes. The co-expression and function of these genes may be important influences

  4. Uncovering co-expression gene network regulating fruit acidity in diverse apples

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Acidity is a major contributor to fruit quality. Several organic acids are present in apple fruit, but malic acid is predominant and determines fruit acidity. The trait is largely controlled by the Malic acid (Ma) locus, underpinning which Ma1 that encodes an Aluminum-activated Malate Transporter1 (...

  5. GEM-TREND: a web tool for gene expression data mining toward relevant network discovery

    PubMed Central

    Feng, Chunlai; Araki, Michihiro; Kunimoto, Ryo; Tamon, Akiko; Makiguchi, Hiroki; Niijima, Satoshi; Tsujimoto, Gozoh; Okuno, Yasushi

    2009-01-01

    Background DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which is available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). To help researchers use public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating its co-expression networks from a publicly available database. Results GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and retrieve gene expression data by comparing gene-expression patterns between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from GEO and with in-house microarray data. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. For further analysis, a network visualization interface is also provided, whereby genes and gene annotations

  6. Co-expression of the transcription factors CEH-14 and TTX-1 regulates AFD neuron-specific genes gcy-8 and gcy-18 in C. elegans.

    PubMed

    Kagoshima, Hiroshi; Kohara, Yuji

    2015-03-15

    A wide variety of cells are generated by the expression of characteristic sets of genes, primarily those regulated by cell-specific transcription. To elucidate the mechanism regulating cell-specific gene expression in a highly specialized cell, the AFD thermosensory neuron in Caenorhabditis elegans, we analyzed the promoter sequences of guanylyl cyclase genes, gcy-8 and gcy-18, exclusively expressed in AFD. In this study, we showed that AFD-specific expression of gcy-8 and gcy-18 requires the co-expression of homeodomain proteins, CEH-14/LHX3 and TTX-1/OTX1. We observed that mutation of ttx-1 or ceh-14 caused a reduction in the expression of gcy-8 and gcy-18 and that the expression was completely lost in double mutants. This synergistic effect was also observed with other AFD marker genes, such as ntc-1, nlp-21 and cng-3. Electrophoretic mobility shift assays revealed direct interaction of CEH-14 and TTX-1 proteins with gcy-8 and gcy-18 promoters in vitro. The binding sites of CEH-14 and TTX-1 proteins were confirmed to be essential for AFD-specific expression of gcy-8 and gcy-18 in vivo. We also demonstrated that forced expression of CEH-14 and TTX-1 in AWB chemosensory neurons induced ectopic expression of gcy-8 and gcy-18 reporters in this neuron. Finally, we showed that the regulation of gcy-8 and gcy-18 expression by ceh-14 and ttx-1 is evolutionarily conserved in five Caenorhabditis species. Taken together, ceh-14 and ttx-1 expression determines the fate of AFD as terminal selector genes at the final step of cell specification. PMID:25614239

  7. Transcriptional Regulatory Network Analysis of MYB Transcription Factor Family Genes in Rice

    PubMed Central

    Smita, Shuchi; Katiyar, Amit; Chinnusamy, Viswanathan; Pandey, Dev M.; Bansal, Kailash C.

    2015-01-01

    The MYB transcription factor (TF) family is one of the largest TF families and regulates defense responses to various stresses, hormone signaling as well as many metabolic and developmental processes in plants. Understanding these regulatory hierarchies of gene expression networks in response to developmental and environmental cues is a major challenge due to the complex interactions between the genetic elements. Correlation analyses are useful to unravel co-regulated gene pairs governing biological processes as well as to identify new candidate hub genes in response to these complex processes. High throughput expression profiling data are highly useful for construction of co-expression networks. In the present study, we utilized transcriptome data for comprehensive regulatory network studies of MYB TFs by “top-down” and “guide-gene” approaches. More than 50% of OsMYBs were strongly correlated under 50 experimental conditions with 51 hub genes via the “top-down” approach. Further, clusters were identified using Markov Clustering (MCL). To maximize the clustering performance, parameter evaluation of the MCL inflation score (I) was performed in terms of enriched GO categories by measuring F-score. Comparison of co-expressed clusters and clades from phylogenetic analysis signifies their evolutionarily conserved co-regulatory role. We utilized a compendium of known interactions and biological roles with Gene Ontology enrichment analysis to hypothesize the function of coexpressed OsMYBs. In the other part, the transcriptional regulatory network analysis by the “guide-gene” approach revealed 40 putative targets of 26 OsMYB TF hubs with high correlation value utilizing 815 microarray data. The putative targets with MYB-binding cis-elements enrichment in their promoter region, functional co-occurrence as well as nuclear localization supports our finding. In particular, enrichment of MYB binding regions involved in drought-inducibility implying their regulatory role in drought

  8. A regulatory gene network related to the porcine umami taste receptor (TAS1R1/TAS1R3).

    PubMed

    Kim, J M; Ren, D; Reverter, A; Roura, E

    2016-02-01

    Taste perception plays an important role in the mediation of food choices in mammals. The first porcine taste receptor genes identified, sequenced and characterized, TAS1R1 and TAS1R3, were related to the dimeric receptor for umami taste. However, little is known about their regulatory network. The objective of this study was to unfold the genetic network involved in porcine umami taste perception. We performed a meta-analysis of 20 gene expression studies spanning 480 porcine microarray chips and screened 328 taste-related genes by selective mining steps among the available 12,320 genes. A porcine umami taste-specific regulatory network was constructed based on the normalized coexpression data of the 328 genes across 27 tissues. From the network, we revealed the 'taste module' and identified a coexpression cluster for the umami taste according to the first connector with the TAS1R1/TAS1R3 genes. Our findings identify several taste-related regulatory genes and extend previous genetic background of porcine umami taste. PMID:26554867

  9. Co-expression of the mating-type genes involved in internuclear recognition is lethal in Podospora anserina.

    PubMed Central

    Coppin, E; Debuchy, R

    2000-01-01

    In the heterothallic filamentous fungus Podospora anserina, four mating-type genes encoding transcriptional factors have been characterized: FPR1 in the mat+ sequence and FMR1, SMR1, and SMR2 in the alternative mat- sequence. Fertilization is controlled by FPR1 and FMR1. After fertilization, male and female nuclei, which have divided in the same cell, form mat+/mat- pairs during migration into the ascogenous hyphae. Previous data indicate that the formation of mat+/mat- pairs is controlled by FPR1, FMR1, and SMR2. SMR1 was postulated to be necessary for initial development of ascogenous hyphae. In this study, we investigated the transcriptional control of the mat genes by seeking mat transcripts during the vegetative and sexual phase and fusing their promoter to a reporter gene. The data indicate that FMR1 and FPR1 are expressed in both mycelia and perithecia, whereas SMR1 and SMR2 are transcribed in perithecia. Increased or induced vegetative expression of the four mat genes has no effect when the recombined gene is solely in the wild-type strain. However, the combination of resident FPR1 with deregulated SMR2 and overexpressed FMR1 in the same nucleus is lethal. This lethality is suppressed by the expression of SMR1, confirming that SMR1 operates downstream of the other mat genes. PMID:10835389

  10. Evolving Robust Gene Regulatory Networks

    PubMed Central

    Noman, Nasimul; Monjo, Taku; Moscato, Pablo; Iba, Hitoshi

    2015-01-01

    Design and implementation of robust network modules is essential for construction of complex biological systems through hierarchical assembly of ‘parts’ and ‘devices’. The robustness of gene regulatory networks (GRNs) is ascribed chiefly to the underlying topology. The automatic designing capability of GRN topology that can exhibit robust behavior can dramatically change the current practice in synthetic biology. A recent study shows that Darwinian evolution can gradually develop higher topological robustness. Subsequently, this work presents an evolutionary algorithm that simulates natural evolution in silico, for identifying network topologies that are robust to perturbations. We present a Monte Carlo based method for quantifying topological robustness and designed a fitness approximation approach for efficient calculation of topological robustness which is computationally very intensive. The proposed framework was verified using two classic GRN behaviors: oscillation and bistability, although the framework is generalized for evolving other types of responses. The algorithm identified robust GRN architectures which were verified using different analysis and comparison. Analysis of the results also shed light on the relationship among robustness, cooperativity and complexity. This study also shows that nature has already evolved very robust architectures for its crucial systems; hence simulation of this natural process can be very valuable for designing robust biological systems. PMID:25616055
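
    The Monte Carlo quantification of robustness mentioned in this abstract can be illustrated with a minimal sketch. It assumes that robustness is estimated as the fraction of randomly perturbed parameter sets under which a network still shows its target behavior; the abstract does not give the authors' actual procedure. The names `perturb`, `estimate_robustness` and the toy predicate `toy_is_bistable` are hypothetical and chosen only for illustration.

```python
import random


def perturb(params, scale, rng):
    """Return a copy of params with every value scaled by a random factor
    drawn uniformly from [1 - scale, 1 + scale]."""
    return {k: v * rng.uniform(1.0 - scale, 1.0 + scale) for k, v in params.items()}


def estimate_robustness(params, keeps_behavior, n_samples=1000, scale=0.2, seed=0):
    """Monte Carlo estimate: fraction of perturbed parameter sets for which
    keeps_behavior(perturbed_params) is still True."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = sum(bool(keeps_behavior(perturb(params, scale, rng))) for _ in range(n_samples))
    return hits / n_samples


def toy_is_bistable(p):
    """Hypothetical stand-in for a real behavior check (e.g. simulating the
    network); it is NOT a biological criterion from the paper."""
    return p["feedback"] > 2.0 * p["degradation"]


if __name__ == "__main__":
    nominal = {"feedback": 2.0, "degradation": 1.0}
    print(estimate_robustness(nominal, toy_is_bistable))
```

    In a real setting the predicate would run a simulation of the candidate topology, which is why the abstract describes the calculation as computationally intensive and motivates a fitness approximation.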

  11. Gene-sharing networks reveal organizing principles of transcriptomes in Arabidopsis and other multicellular organisms.

    PubMed

    Li, Song; Pandey, Sona; Gookin, Timothy E; Zhao, Zhixin; Wilson, Liza; Assmann, Sarah M

    2012-04-01

    Understanding tissue-related gene expression patterns can provide important insights into gene, tissue, and organ function. Transcriptome analyses often have focused on housekeeping or tissue-specific genes or on gene coexpression. However, by analyzing thousands of single-gene expression distributions in multiple tissues of Arabidopsis thaliana, rice (Oryza sativa), human (Homo sapiens), and mouse (Mus musculus), we found that these organisms primarily operate by gene sharing, a phenomenon where, in each organism, most genes exhibit a high expression level in a few key tissues. We designed an analytical pipeline to characterize this phenomenon and then derived Arabidopsis and human gene-sharing networks, in which tissues are connected solely based on the extent of shared preferentially expressed genes. The results show that tissues or cell types from the same organ system tend to group together to form network modules. Tissues that are in consecutive developmental stages or have common physiological functions are connected in these networks, revealing the importance of shared preferentially expressed genes in conferring specialized functions of each tissue type. The networks provide predictive power for each tissue type regarding gene functions of both known and heretofore unknown genes, as shown by the identification of four new genes with functions in guard cell and abscisic acid response. We provide a Web interface that enables, based on the extent of gene sharing, both prediction of tissue-related functions for any Arabidopsis gene of interest and predictions concerning the relatedness of tissues. Common gene-sharing patterns observed in the four model organisms suggest that gene sharing evolved as a fundamental organizing principle of gene expression in diverse multicellular eukaryotes. PMID:22517316
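
    The idea of connecting tissues solely by their shared preferentially expressed genes can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: here a gene counts as preferentially expressed in a tissue when its level is at least `fold` times its mean across tissues, and an edge weight is the number of shared genes. The criterion, the function names, and the example gene and tissue labels are all hypothetical.

```python
from itertools import combinations


def preferential_genes(expr, fold=1.5):
    """expr maps gene -> {tissue: expression level}.
    Returns tissue -> set of genes whose level in that tissue is at least
    `fold` times the gene's mean level across all tissues."""
    result = {}
    for gene, levels in expr.items():
        mean = sum(levels.values()) / len(levels)
        for tissue, level in levels.items():
            result.setdefault(tissue, set())
            if mean > 0 and level >= fold * mean:
                result[tissue].add(gene)
    return result


def gene_sharing_edges(pref, min_shared=1):
    """Connect two tissues when they share at least `min_shared`
    preferentially expressed genes; the edge weight is the shared count."""
    edges = {}
    for a, b in combinations(sorted(pref), 2):
        shared = pref[a] & pref[b]
        if len(shared) >= min_shared:
            edges[(a, b)] = len(shared)
    return edges


if __name__ == "__main__":
    # Made-up expression values for illustration only.
    expr = {
        "g1": {"leaf": 10, "guard_cell": 10, "root": 1, "seed": 1},
        "g2": {"leaf": 8, "guard_cell": 9, "root": 0, "seed": 1},
        "g3": {"leaf": 1, "guard_cell": 1, "root": 12, "seed": 1},
        "g4": {"leaf": 1, "guard_cell": 1, "root": 9, "seed": 9},
    }
    print(gene_sharing_edges(preferential_genes(expr)))
```

    Modules of tissues can then be read off the resulting weighted graph with any standard community-detection method.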

  12. Cell Cycle Gene Networks Are Associated with Melanoma Prognosis

    PubMed Central

    Watkins, Wendy; Araki, Hiromitsu; Tamada, Yoshinori; Muthukaruppan, Anita; Ranjard, Louis; Derkac, Eliane; Imoto, Seiya; Miyano, Satoru; Crampin, Edmund J.; Print, Cristin G.

    2012-01-01

    Background Our understanding of the molecular pathways that underlie melanoma remains incomplete. Although several published microarray studies of clinical melanomas have provided valuable information, we found only limited concordance between these studies. Therefore, we took an in vitro functional genomics approach to understand melanoma molecular pathways. Methodology/Principal Findings Affymetrix microarray data were generated from A375 melanoma cells treated in vitro with siRNAs against 45 transcription factors and signaling molecules. Analysis of this data using unsupervised hierarchical clustering and Bayesian gene networks identified proliferation-association RNA clusters, which were co-ordinately expressed across the A375 cells and also across melanomas from patients. The abundance in metastatic melanomas of these cellular proliferation clusters and their putative upstream regulators was significantly associated with patient prognosis. An 8-gene classifier derived from gene network hub genes correctly classified the prognosis of 23/26 metastatic melanoma patients in a cross-validation study. Unlike the RNA clusters associated with cellular proliferation described above, co-ordinately expressed RNA clusters associated with immune response were clearly identified across melanoma tumours from patients but not across the siRNA-treated A375 cells, in which immune responses are not active. Three uncharacterised genes, which the gene networks predicted to be upstream of apoptosis- or cellular proliferation-associated RNAs, were found to significantly alter apoptosis and cell number when over-expressed in vitro. Conclusions/Significance This analysis identified co-expression of RNAs that encode functionally-related proteins, in particular, proliferation-associated RNA clusters that are linked to melanoma patient prognosis. Our analysis suggests that A375 cells in vitro may be valid models in which to study the gene expression modules that underlie some melanoma

  13. Protein Co-Expression Analysis as a Strategy to Complement a Standard Quantitative Proteomics Approach: Case of a Glioblastoma Multiforme Study

    PubMed Central

    Deighton, Ruth F.

    2016-01-01

    Although correlation network studies from co-expression analysis are increasingly popular, they are rarely applied to proteomics datasets. Protein co-expression analysis provides a complementary view of underlying trends, which can be overlooked by conventional data analysis. The core of the present study is based on Weighted Gene Co-expression Network Analysis applied to a glioblastoma multiforme proteomic dataset. Using this method, we have identified three main modules which are associated with three different membrane associated groups; mitochondrial, endoplasmic reticulum, and a vesicle fraction. The three networks based on protein co-expression were assessed against a publicly available database (STRING) and show a statistically significant overlap. Each of the three main modules were de-clustered into smaller networks using different strategies based on the identification of highly connected networks, hierarchical clustering and enrichment of Gene Ontology functional terms. Most of the highly connected proteins found in the endoplasmic reticulum module were associated with redox activity while a core of the unfolded protein response was identified in addition to proteins involved in oxidative stress pathways. The proteins composing the electron transfer chain were found differently affected with proteins from mitochondrial Complex I being more down-regulated than proteins from Complex III. Finally, the two pyruvate kinases isoforms show major differences in their co-expressed protein networks suggesting roles in different cellular locations. PMID:27571357

  14. Protein Co-Expression Analysis as a Strategy to Complement a Standard Quantitative Proteomics Approach: Case of a Glioblastoma Multiforme Study.

    PubMed

    Kanonidis, Evangelos I; Roy, Marcia M; Deighton, Ruth F; Le Bihan, Thierry

    2016-01-01

    Although correlation network studies from co-expression analysis are increasingly popular, they are rarely applied to proteomics datasets. Protein co-expression analysis provides a complementary view of underlying trends, which can be overlooked by conventional data analysis. The core of the present study is based on Weighted Gene Co-expression Network Analysis applied to a glioblastoma multiforme proteomic dataset. Using this method, we have identified three main modules which are associated with three different membrane associated groups; mitochondrial, endoplasmic reticulum, and a vesicle fraction. The three networks based on protein co-expression were assessed against a publicly available database (STRING) and show a statistically significant overlap. Each of the three main modules were de-clustered into smaller networks using different strategies based on the identification of highly connected networks, hierarchical clustering and enrichment of Gene Ontology functional terms. Most of the highly connected proteins found in the endoplasmic reticulum module were associated with redox activity while a core of the unfolded protein response was identified in addition to proteins involved in oxidative stress pathways. The proteins composing the electron transfer chain were found differently affected with proteins from mitochondrial Complex I being more down-regulated than proteins from Complex III. Finally, the two pyruvate kinases isoforms show major differences in their co-expressed protein networks suggesting roles in different cellular locations. PMID:27571357

  15. NERI: network-medicine based integrative approach for disease gene prioritization by relative importance

    PubMed Central

    2015-01-01

    Background Complex diseases are characterized as being polygenic and multifactorial, so this poses a challenge regarding the search for genes related to them. With the advent of high-throughput technologies for genome sequencing, gene expression measurements (transcriptome), and protein-protein interactions, complex diseases have been systematically investigated. Particularly, Protein-Protein Interaction (PPI) networks have been used to prioritize genes related to complex diseases according to their topological features. However, PPI networks are affected by ascertainment bias, in which more studied proteins tend to have more connections, degrading the quality of the results. Additionally, methods using only PPI networks can provide only static and non-specific results, since the topologies of these networks are not specific to a given disease. Results The goal of this work is to develop a methodology that integrates PPI networks with disease-specific data sources, such as GWAS and gene expression, to find genes more specific to a given complex disease. After the integration of PPI networks and gene expression data, the resulting network is used to connect genes related to the disease through the shortest paths that have the greatest concordance between their gene expressions. Both case and control expression data are used separately and, at the end, the most altered genes between the two conditions are selected. To evaluate the method, schizophrenia was adopted as a case study. Conclusion Results show that the proposed method successfully retrieves differentially coexpressed genes in two conditions, while avoiding the bias from literature. Moreover, we were able to achieve a greater concordance in the selection of important genes from different microarray studies of the same disease and to produce a more specific gene set related to the studied disease. PMID:26696568
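
    The step of connecting disease-related genes through short, expression-concordant paths can be sketched with a weighted shortest-path search. The sketch below assumes one simple way to combine the two criteria: each PPI edge costs `1 - concordance`, so that highly concordant edges are cheap, and Dijkstra's algorithm finds the cheapest path. This weighting, the function names, and the node labels are assumptions for illustration; the abstract does not specify how NERI scores paths.

```python
import heapq


def concordance_graph(ppi_edges, concordance):
    """Build an undirected weighted graph from PPI edges.
    concordance maps frozenset({a, b}) -> value in [0, 1];
    the edge cost is 1 - concordance (assumed weighting)."""
    graph = {}
    for a, b in ppi_edges:
        cost = 1.0 - concordance[frozenset((a, b))]
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm. Returns (total_cost, path), or
    (inf, []) when target cannot be reached from source."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


if __name__ == "__main__":
    # Hypothetical seed genes and intermediates with made-up concordances.
    edges = [("seed1", "A"), ("A", "seed2"), ("seed1", "B"), ("B", "seed2")]
    conc = {
        frozenset(("seed1", "A")): 0.9,
        frozenset(("A", "seed2")): 0.8,
        frozenset(("seed1", "B")): 0.2,
        frozenset(("B", "seed2")): 0.3,
    }
    print(shortest_path(concordance_graph(edges, conc), "seed1", "seed2"))
```

    Running the search separately on case and control expression data, as the abstract describes, would give two sets of paths whose differences point to the most altered genes.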

  16. Functional characterization of drought-responsive modules and genes in Oryza sativa: a network-based approach.

    PubMed

    Sircar, Sanchari; Parekh, Nita

    2015-01-01

    Drought is one of the major environmental stress conditions affecting the yield of rice across the globe. Unraveling the functional roles of the drought-responsive genes and their underlying molecular mechanisms will provide important leads to improve the yield of rice. Co-expression relationships derived from condition-dependent gene expression data are an effective way to identify the functional associations between genes that are part of the same biological process and may be under similar transcriptional control. For this purpose, the vast amount of freely available transcriptomic data may be used. In this study, we consider gene expression data for different tissues and developmental stages in response to drought stress. We analyze the network of co-expressed genes to identify drought-responsive gene modules in a tissue and stage-specific manner based on differential expression and gene enrichment analysis. Taking cues from the systems-level behavior of these modules, we propose two approaches to identify clusters of tightly co-expressed/co-regulated genes. Using graph-centrality measures and differential gene expression, we identify biologically informative genes that lack any functional annotation. We show that using orthologous information from other plant species, the conserved co-expression patterns of the uncharacterized genes can be identified. Presence of a conserved neighborhood enables us to extrapolate functional annotation. Alternatively, we show that a single 'guide-gene' approach can help in understanding tissue-specific transcriptional regulation of uncharacterized genes. Finally, we confirm the predicted roles of uncharacterized genes by the analysis of conserved cis-elements and explain the possible roles of these genes toward drought tolerance. PMID:26284112

  17. Guilt by rewiring: gene prioritization through network rewiring in Genome Wide Association Studies

    PubMed Central

    Hou, Lin; Chen, Min; Zhang, Clarence K.; Cho, Judy; Zhao, Hongyu

    2014-01-01

    Although Genome Wide Association Studies (GWAS) have identified many susceptibility loci for common diseases, they only explain a small portion of heritability. It is challenging to identify the remaining disease loci because their association signals are likely weak and difficult to identify among millions of candidates. One potentially useful direction to increase statistical power is to incorporate functional genomics information, especially gene expression networks, to prioritize GWAS signals. Most current methods utilizing network information to prioritize disease genes are based on the ‘guilt by association’ principle, in which networks are treated as static, and disease-associated genes are assumed to locate closer with each other than random pairs in the network. In contrast, we propose a novel ‘guilt by rewiring’ principle. Studying the dynamics of gene networks between controls and patients, this principle assumes that disease genes more likely undergo rewiring in patients, whereas most of the network remains unaffected in disease condition. To demonstrate this principle, we consider the changes of co-expression networks in Crohn's disease patients and controls, and how network dynamics reveals information on disease associations. Our results demonstrate that network rewiring is abundant in the immune system, and disease-associated genes are more likely to be rewired in patients. To integrate this network rewiring feature and GWAS signals, we propose to use the Markov random field framework to integrate network information to prioritize genes. Applications in Crohn's disease and Parkinson's disease show that this framework leads to more replicable results, and implicates potentially disease-associated pathways. PMID:24381306
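
    The notion of network rewiring between controls and patients can be made concrete with a minimal sketch. It assumes that rewiring of a gene is summarized as the mean absolute change in its Pearson correlation with every other gene between the two groups. This score and the names `pearson` and `rewiring_scores` are illustrative assumptions; the abstract does not state the exact statistic, and the Markov random field integration with GWAS signals is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences
    (returns 0.0 if either sequence is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rewiring_scores(controls, patients):
    """controls / patients map gene -> list of expression values per sample.
    Returns gene -> mean |r_patients - r_controls| over all partner genes."""
    genes = sorted(set(controls) & set(patients))
    scores = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            delta = abs(pearson(patients[g], patients[h])
                        - pearson(controls[g], controls[h]))
            scores[g] += delta
            scores[h] += delta
    n_partners = max(len(genes) - 1, 1)
    return {g: s / n_partners for g, s in scores.items()}


if __name__ == "__main__":
    # Made-up data: gene B reverses its relationship to A and C in patients.
    controls = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, 3, 2, 4]}
    patients = {"A": [1, 2, 3, 4], "B": [8, 6, 4, 2], "C": [1, 3, 2, 4]}
    print(rewiring_scores(controls, patients))
```

    Under the 'guilt by rewiring' principle, genes with high scores would be the candidates to prioritize, whereas a 'guilt by association' method would look only at proximity in a single static network.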

  18. Co-Expression and Co-Localization of Cartilage Glycoproteins CHI3L1 and Lubricin in Osteoarthritic Cartilage: Morphological, Immunohistochemical and Gene Expression Profiles.

    PubMed

    Szychlinska, Marta Anna; Trovato, Francesca Maria; Di Rosa, Michelino; Malaguarnera, Lucia; Puzzo, Lidia; Leonardi, Rosy; Castrogiovanni, Paola; Musumeci, Giuseppe

    2016-01-01

    Osteoarthritis is the most common human arthritis characterized by degeneration of articular cartilage. Several studies reported that levels of human cartilage glycoprotein chitinase 3-like-1 (CHI3L1) are known as a potential marker for the activation of chondrocytes and the progression of Osteoarthritis (OA), whereas lubricin appears to be chondroprotective. The aim of this study was to investigate the co-expression and co-localization of CHI3L1 and lubricin in normal and osteoarthritic rat articular cartilage to correlate their modified expression to a specific grade of OA. Samples of normal and osteoarthritic rat articular cartilage were analyzed by the Kellgren-Lawrence OA severity scores, the Kraus' modified Mankin score and the Histopathology Osteoarthritis Research Society International (OARSI) system for histomorphometric evaluations, and through CHI3L1 and lubricin gene expression, immunohistochemistry and double immuno-staining analysis. The immunoexpression and the mRNA levels of lubricin increased in normal cartilage and decreased in OA cartilage (normal vs. OA, p < 0.01). By contrast, the immunoexpression and the mRNA levels of CHI3L1 increased in OA cartilage and decreased in normal cartilage (normal vs. OA, p < 0.01). Our findings are consistent with reports suggesting that these two glycoproteins are functionally associated with the development of OA and in particular with grade 2/3 of OA, suggesting that in the future they could be helpful to stage the severity and progression of the disease. PMID:26978347

  19. Efficient Production of Hydroxylated Human-Like Collagen Via the Co-Expression of Three Key Genes in Escherichia coli Origami (DE3).

    PubMed

    Tang, Yunping; Yang, Xiuliang; Hang, Baojian; Li, Jiangtao; Huang, Lei; Huang, Feng; Xu, Zhinan

    2016-04-01

    Mature collagen is abundant in human bodies and very valuable for a range of industrial and medical applications. The biosynthesis of mature collagen requires post-translational modifications to increase the stability of the collagen triple helix structure. By co-expressing the human-like collagen (HLC) gene with human prolyl 4-hydroxylase (P4H) and D-arabinono-1,4-lactone oxidase (ALO) in Escherichia coli, we have constructed a prokaryotic expression system to produce the hydroxylated HLC. Then, five different media, as well as the induction conditions, were investigated with regard to the soluble expression of this protein. The results indicated that the highest soluble expression level of target HLC obtained in shaking flasks was 49.55 ± 0.36 mg/L, when recombinant cells were grown in MBL medium and induced by 0.1 mM IPTG at the middle stage of the exponential growth phase. By adopting the glucose feeding strategy, the expression level of target HLC can be improved up to 260 mg/L in a 10 L bench-top fermentor. Further, HPLC analyses revealed that more than 10% of proline residues in purified HLC were successfully hydroxylated. The present work has provided a solid base for the large-scale production of hydroxylated HLC in E. coli. PMID:26712247

  20. Co-Expression and Co-Localization of Cartilage Glycoproteins CHI3L1 and Lubricin in Osteoarthritic Cartilage: Morphological, Immunohistochemical and Gene Expression Profiles

    PubMed Central

    Szychlinska, Marta Anna; Trovato, Francesca Maria; Di Rosa, Michelino; Malaguarnera, Lucia; Puzzo, Lidia; Leonardi, Rosy; Castrogiovanni, Paola; Musumeci, Giuseppe

    2016-01-01

    Osteoarthritis is the most common human arthritis characterized by degeneration of articular cartilage. Several studies reported that levels of human cartilage glycoprotein chitinase 3-like-1 (CHI3L1) are known as a potential marker for the activation of chondrocytes and the progression of Osteoarthritis (OA), whereas lubricin appears to be chondroprotective. The aim of this study was to investigate the co-expression and co-localization of CHI3L1 and lubricin in normal and osteoarthritic rat articular cartilage to correlate their modified expression to a specific grade of OA. Samples of normal and osteoarthritic rat articular cartilage were analyzed by the Kellgren–Lawrence OA severity scores, the Kraus’ modified Mankin score and the Histopathology Osteoarthritis Research Society International (OARSI) system for histomorphometric evaluations, and through CHI3L1 and lubricin gene expression, immunohistochemistry and double immuno-staining analysis. The immunoexpression and the mRNA levels of lubricin increased in normal cartilage and decreased in OA cartilage (normal vs. OA, p < 0.01). By contrast, the immunoexpression and the mRNA levels of CHI3L1 increased in OA cartilage and decreased in normal cartilage (normal vs. OA, p < 0.01). Our findings are consistent with reports suggesting that these two glycoproteins are functionally associated with the development of OA and in particular with grade 2/3 of OA, suggesting that in the future they could be helpful to stage the severity and progression of the disease. PMID:26978347

  1. Genomic and Coexpression Analyses Predict Multiple Genes Involved in Triterpene Saponin Biosynthesis in Medicago truncatula

    PubMed Central

    Naoumkina, Marina A.; Modolo, Luzia V.; Huhman, David V.; Urbanczyk-Wochniak, Ewa; Tang, Yuhong; Sumner, Lloyd W.; Dixon, Richard A.

    2010-01-01

    Saponins, an important group of bioactive plant natural products, are glycosides of triterpenoid or steroidal aglycones (sapogenins). Saponins possess many biological activities, including conferring potential health benefits for humans. However, most of the steps specific for the biosynthesis of triterpene saponins remain uncharacterized at the molecular level. Here, we use comprehensive gene expression clustering analysis to identify candidate genes involved in the elaboration, hydroxylation, and glycosylation of the triterpene skeleton in the model legume Medicago truncatula. Four candidate uridine diphosphate glycosyltransferases were expressed in Escherichia coli, one of which (UGT73F3) showed specificity for multiple sapogenins and was confirmed to glucosylate hederagenin at the C28 position. Genetic loss-of-function studies in M. truncatula confirmed the in vivo function of UGT73F3 in saponin biosynthesis. This report provides a basis for future studies to define genetically the roles of multiple cytochromes P450 and glycosyltransferases in triterpene saponin biosynthesis in Medicago. PMID:20348429

  2. Functional characterization of drought-responsive modules and genes in Oryza sativa: a network-based approach

    PubMed Central

    Sircar, Sanchari; Parekh, Nita

    2015-01-01

    Drought is one of the major environmental stress conditions affecting the yield of rice across the globe. Unraveling the functional roles of the drought-responsive genes and their underlying molecular mechanisms will provide important leads to improve the yield of rice. Co-expression relationships derived from condition-dependent gene expression data are an effective way to identify the functional associations between genes that are part of the same biological process and may be under similar transcriptional control. For this purpose, the vast amount of freely available transcriptomic data may be used. In this study, we consider gene expression data for different tissues and developmental stages in response to drought stress. We analyze the network of co-expressed genes to identify drought-responsive gene modules in a tissue and stage-specific manner based on differential expression and gene enrichment analysis. Taking cues from the systems-level behavior of these modules, we propose two approaches to identify clusters of tightly co-expressed/co-regulated genes. Using graph-centrality measures and differential gene expression, we identify biologically informative genes that lack any functional annotation. We show that using orthologous information from other plant species, the conserved co-expression patterns of the uncharacterized genes can be identified. Presence of a conserved neighborhood enables us to extrapolate functional annotation. Alternatively, we show that a single ‘guide-gene’ approach can help in understanding tissue-specific transcriptional regulation of uncharacterized genes. Finally, we confirm the predicted roles of uncharacterized genes by the analysis of conserved cis-elements and explain the possible roles of these genes toward drought tolerance. PMID:26284112

  3. Tissue-specific Co-expression of Long Non-coding and Coding RNAs Associated with Breast Cancer

    PubMed Central

    Wu, Wenting; Wagner, Erin K.; Hao, Yangyang; Rao, Xi; Dai, Hongji; Han, Jiali; Chen, Jinhui; Storniolo, Anna Maria V.; Liu, Yunlong; He, Chunyan

    2016-01-01

    Inference of the biological roles of lncRNAs in breast cancer development remains a challenge. Here, we analyzed RNA-seq data in tumor and normal breast tissue samples from 18 breast cancer patients and 18 healthy controls and constructed a functional lncRNA-mRNA co-expression network. We revealed two distinctive co-expression patterns associated with breast cancer, reflecting different underlying regulatory mechanisms: (1) 516 pairs of lncRNA-mRNAs have differential co-expression pattern, in which the correlation between lncRNA and mRNA expression differs in tumor and normal breast tissue; (2) 291 pairs have dose-response co-expression pattern, in which the correlation is similar, but the expression level of lncRNA or mRNA differs in the two tissue types. We further validated our findings in TCGA dataset and annotated lncRNAs using TANRIC. One novel lncRNA, AC145110.1 on 8p12, was found differentially co-expressed with 127 mRNAs (including TOX4 and MAEL) in tumor and normal breast tissue and also highly correlated with breast cancer clinical outcomes. Functional enrichment and pathway analyses identified distinct biological functions for different patterns of co-expression regulations. Our data suggested that lncRNAs might be involved in breast tumorigenesis through the modulation of gene expression in multiple pathologic pathways. PMID:27597120

  4. Tissue-specific Co-expression of Long Non-coding and Coding RNAs Associated with Breast Cancer.

    PubMed

    Wu, Wenting; Wagner, Erin K; Hao, Yangyang; Rao, Xi; Dai, Hongji; Han, Jiali; Chen, Jinhui; Storniolo, Anna Maria V; Liu, Yunlong; He, Chunyan

    2016-01-01

    Inference of the biological roles of lncRNAs in breast cancer development remains a challenge. Here, we analyzed RNA-seq data in tumor and normal breast tissue samples from 18 breast cancer patients and 18 healthy controls and constructed a functional lncRNA-mRNA co-expression network. We revealed two distinctive co-expression patterns associated with breast cancer, reflecting different underlying regulatory mechanisms: (1) 516 pairs of lncRNA-mRNAs have differential co-expression pattern, in which the correlation between lncRNA and mRNA expression differs in tumor and normal breast tissue; (2) 291 pairs have dose-response co-expression pattern, in which the correlation is similar, but the expression level of lncRNA or mRNA differs in the two tissue types. We further validated our findings in TCGA dataset and annotated lncRNAs using TANRIC. One novel lncRNA, AC145110.1 on 8p12, was found differentially co-expressed with 127 mRNAs (including TOX4 and MAEL) in tumor and normal breast tissue and also highly correlated with breast cancer clinical outcomes. Functional enrichment and pathway analyses identified distinct biological functions for different patterns of co-expression regulations. Our data suggested that lncRNAs might be involved in breast tumorigenesis through the modulation of gene expression in multiple pathologic pathways. PMID:27597120

  5. Gene Function Prediction from Functional Association Networks Using Kernel Partial Least Squares Regression

    PubMed Central

    Lehtinen, Sonja; Lees, Jon; Bähler, Jürg; Shawe-Taylor, John; Orengo, Christine

    2015-01-01

    With the growing availability of large-scale biological datasets, automated methods of extracting functionally meaningful information from this data are becoming increasingly important. Data relating to functional association between genes or proteins, such as co-expression or functional association, is often represented in terms of gene or protein networks. Several methods of predicting gene function from these networks have been proposed. However, evaluating the relative performance of these algorithms may not be trivial: concerns have been raised over biases in different benchmarking methods and datasets, particularly relating to non-independence of functional association data and test data. In this paper we propose a new network-based gene function prediction algorithm using a commute-time kernel and partial least squares regression (Compass). We compare Compass to GeneMANIA, a leading network-based prediction algorithm, using a number of different benchmarks, and find that Compass outperforms GeneMANIA on these benchmarks. We also explicitly explore problems associated with the non-independence of functional association data and test data. We find that a benchmark based on the Gene Ontology database, which, directly or indirectly, incorporates information from other databases, may considerably overestimate the performance of algorithms exploiting functional association data for prediction. PMID:26288239

  6. Stabilizing gene regulatory networks through feedforward loops

    NASA Astrophysics Data System (ADS)

    Kadelka, C.; Murrugarra, D.; Laubenbacher, R.

    2013-06-01

    The global dynamics of gene regulatory networks are known to show robustness to perturbations in the form of intrinsic and extrinsic noise, as well as mutations of individual genes. One molecular mechanism underlying this robustness has been identified as the action of so-called microRNAs that operate via feedforward loops. We present results of a computational study, using the modeling framework of stochastic Boolean networks, which explores the role that such network motifs play in stabilizing global dynamics. The paper introduces a new measure for the stability of stochastic networks. The results show that certain types of feedforward loops do indeed buffer the network against stochastic effects.

  7. Modeling of hysteresis in gene regulatory networks.

    PubMed

    Hu, J; Qin, K R; Xiang, C; Lee, T H

    2012-08-01

    Hysteresis, observed in many gene regulatory networks, has a pivotal impact on biological systems, which enhances the robustness of cell functions. In this paper, a general model is proposed to describe the hysteretic gene regulatory network by combining the hysteresis component and the transient dynamics. The Bouc-Wen hysteresis model is modified to describe the hysteresis component in the mammalian gene regulatory networks. Rigorous mathematical analysis of the dynamical properties of the model is presented to ensure the bounded-input-bounded-output (BIBO) stability and demonstrates that the original Bouc-Wen model can only generate a clockwise hysteresis loop while the modified model can describe both clockwise and counterclockwise hysteresis loops. Simulation studies have shown that the hysteresis loops from our model are consistent with the experimental observations in three mammalian gene regulatory networks and two E. coli gene regulatory networks, which demonstrates the ability and accuracy of the mathematical model to emulate natural gene expression behavior with hysteresis. A comparison study has also been conducted to show that this model fits the experimental data significantly better than previous ones in the literature. The successful modeling of the hysteresis in all five hysteretic gene regulatory networks suggests that the new model has the potential to be a unified framework for modeling hysteresis in gene regulatory networks and to provide a better understanding of the general mechanism that drives the hysteretic function. PMID:22588784

  8. Floral Transcriptomes in Woodland Strawberry Uncover Developing Receptacle and Anther Gene Networks.

    PubMed

    Hollender, Courtney A; Kang, Chunying; Darwish, Omar; Geretz, Aviva; Matthews, Benjamin F; Slovin, Janet; Alkharouf, Nadim; Liu, Zhongchi

    2014-05-14

    Flowers are reproductive organs and precursors to fruits and seeds. While the basic tenets of the ABCE model of flower development are conserved in angiosperms, different flowering plants exhibit different and sometimes unique characteristics. A distinct feature of strawberry (Fragaria spp.) flowers is the development of several hundreds of individual apocarpous (unfused) carpels. These individual carpels are arranged in a spiral pattern on the subtending stem tip, the receptacle. Therefore, the receptacle is an integral part of the strawberry flower and is of significant agronomic importance, being the precursor to strawberry fruit. Taking advantage of next-generation sequencing and laser capture microdissection, we generated different tissue- and stage-transcriptomic profiling of woodland strawberry (Fragaria vesca) flower development. Using pairwise comparisons and weighted gene coexpression network analysis, we identified modules of coexpressed genes and hub genes of tissue-specific networks. Of particular importance is the discovery of a developing receptacle-specific module exhibiting similar molecular features to those of young floral meristems. The strawberry homologs of a number of meristem regulators, including LOST MERISTEM and WUSCHEL, are identified as hub genes operating in the developing receptacle network. Furthermore, almost 25% of the F-box genes in the genome are transiently induced in developing anthers at the meiosis stage, indicating active protein degradation. Together, this work provides important insights into the molecular networks underlying strawberry's unique reproductive developmental processes. This extensive floral transcriptome data set is publicly available and can be readily queried at the project Web site, serving as an important genomic resource for the plant biology research community. PMID:24828307

  9. Gene set enrichment and topological analyses based on interaction networks in pediatric acute lymphoblastic leukemia

    PubMed Central

    SUI, SHUXIANG; WANG, XIN; ZHENG, HUA; GUO, HUA; CHEN, TONG; JI, DONG-MEI

    2015-01-01

    Pediatric acute lymphoblastic leukemia (ALL) accounts for over one-quarter of all pediatric cancers. Interacting genes and proteins within the larger human gene interaction network of the human genome are rarely investigated by studies investigating pediatric ALL. In the present study, interaction networks were constructed using the empirical Bayesian approach and the Search Tool for the Retrieval of Interacting Genes/proteins database, based on the differentially-expressed (DE) genes in pediatric ALL, which were identified using the RankProd package. Enrichment analysis of the interaction network was performed using the network-based methods EnrichNet and PathExpand, which were compared with the traditional Expression Analysis Systematic Explorer (EASE) method. In total, 398 DE genes were identified in pediatric ALL, and LIF was the most significantly DE gene. The co-expression network consisted of 272 nodes, which indicated genes and proteins, and 602 edges, which indicated the number of interactions adjacent to the node. Comparison between EASE and PathExpand revealed that PathExpand detected more pathways or processes that were closely associated with pediatric ALL compared with the EASE method. There were 294 nodes and 1,588 edges in the protein-protein interaction network, with the processes of hematopoietic cell lineage and porphyrin metabolism demonstrating a close association with pediatric ALL. Network enrichment analysis based on the PathExpand algorithm was revealed to be more powerful for the analysis of interaction networks in pediatric ALL compared with the EASE method. LIF and MLLT11 were identified as the most significantly DE genes in pediatric ALL. The process of hematopoietic cell lineage was the pathway most significantly associated with pediatric ALL. PMID:26788135

  10. A Systems Approach Identifies Networks and Genes Linking Sleep and Stress: Implications for Neuropsychiatric Disorders

    PubMed Central

    Jiang, Peng; Scarpa, Joseph R.; Fitzpatrick, Karrie; Losic, Bojan; Gao, Vance D.; Hao, Ke; Summa, Keith C.; Yang, He S.; Zhang, Bin; Allada, Ravi; Vitaterna, Martha H.; Turek, Fred W.; Kasarskis, Andrew

    2016-01-01

    Sleep dysfunction and stress susceptibility are co-morbid complex traits, which often precede and predispose patients to a variety of neuropsychiatric diseases. Here, we demonstrate multi-level organizations of genetic landscape, candidate genes, and molecular networks associated with 328 stress and sleep traits in a chronically stressed population of 338 (C57BL/6J×A/J) F2 mice. We constructed striatal gene co-expression networks, revealing functionally and cell-type specific gene co-regulations important for stress and sleep. Using a composite ranking system, we identified network modules most relevant for 15 independent phenotypic categories, highlighting a mitochondria/synaptic module that links sleep and stress. The key network regulators of this module are overrepresented with genes implicated in neuropsychiatric diseases. Our work suggests the interplay between sleep, stress, and neuropathology emerges from genetic influences on gene expression and their collective organization through complex molecular networks, providing a framework to interrogate the mechanisms underlying sleep, stress susceptibility, and related neuropsychiatric disorders. PMID:25921536

  11. Analysis of Cascading Failure in Gene Networks

    PubMed Central

    Sun, Longxiao; Wang, Shudong; Li, Kaikai; Meng, Dazhi

    2012-01-01

    Understanding the functional mechanisms by which cancer-related genes contribute to the formation and development of cancers is an important research subject. Modern data-analysis methodology plays a very important role in deducing the relationship between cancers and cancer-related genes and in analyzing the functional mechanisms of the genome. In this research, we construct mutual information networks using gene expression profiles of glioblast and renal tissues in normal and cancer conditions. We investigate the relationship between structure and robustness in gene networks of the two tissues using a cascading failure model based on betweenness centrality. We define some important parameters, such as the percentage of failure nodes of the network, the average size-ratio of cascading failure, and the cumulative probability of size-ratio of cascading failure, to measure the robustness of the networks. By comparing the control group and the experiment groups, we find that the networks of the experiment groups are more robust than that of the control group. A gene that can cause large-scale failure is called a structural key gene. Some of them have been confirmed to be closely related to the formation and development of glioma and renal cancer, respectively. Most of them are predicted to play important roles during the formation of glioma and renal cancer, possibly as oncogenes, suppressor genes, or other cancer candidate genes in the glioma and renal cancer cells. However, these studies provide little information about the detailed roles of identified cancer genes. PMID:23248647
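    A mutual information network of the kind mentioned above can be sketched as follows. The abstract gives no details on discretization or edge selection, so the pre-discretized expression levels, the threshold value, and the gene labels g1 to g3 are all hypothetical; the cascading failure step is not modeled here.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi

def mi_network(profiles, threshold):
    """Return edges (gene_a, gene_b, mi) for gene pairs with MI >= threshold."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        mi = mutual_information(profiles[a], profiles[b])
        if mi >= threshold:
            edges.append((a, b, mi))
    return edges

# Hypothetical expression profiles, already discretized to low (0) / high (1).
profiles = {
    "g1": [0, 0, 1, 1, 0, 1, 0, 1],
    "g2": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to g1: maximal dependence
    "g3": [0, 1, 0, 1, 0, 0, 1, 1],  # statistically independent of g1
}

edges = mi_network(profiles, threshold=0.5)
print(edges)
```

    Only the g1-g2 pair passes the threshold in this toy input. Real expression data would first need a discretization or a continuous estimator, which is a separate design choice.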

  12. Integrated in silico Analyses of Regulatory and Metabolic Networks of Synechococcus sp. PCC 7002 Reveal Relationships between Gene Centrality and Essentiality

    PubMed Central

    Song, Hyun-Seob; McClure, Ryan S.; Bernstein, Hans C.; Overall, Christopher C.; Hill, Eric A.; Beliaev, Alexander S.

    2015-01-01

    Cyanobacteria dynamically relay environmental inputs to intracellular adaptations through a coordinated adjustment of photosynthetic efficiency and carbon processing rates. The output of such adaptations is reflected through changes in transcriptional patterns and metabolic flux distributions that ultimately define growth strategy. To address interrelationships between metabolism and regulation, we performed integrative analyses of metabolic and gene co-expression networks in a model cyanobacterium, Synechococcus sp. PCC 7002. Centrality analyses using the gene co-expression network identified a set of key genes, which were defined here as “topologically important.” Parallel in silico gene knock-out simulations, using the genome-scale metabolic network, classified what we termed as “functionally important” genes, deletion of which affected growth or metabolism. A strong positive correlation was observed between topologically and functionally important genes. Functionally important genes exhibited variable levels of topological centrality; however, the majority of topologically central genes were found to be functionally essential for growth. Subsequent functional enrichment analysis revealed that both functionally and topologically important genes in Synechococcus sp. PCC 7002 are predominantly associated with translation and energy metabolism, two cellular processes critical for growth. This research demonstrates how synergistic network-level analyses can be used for reconciliation of metabolic and gene expression data to uncover fundamental biological principles. PMID:25826650

  13. Integrated in silico analyses of regulatory and metabolic networks of Synechococcus sp. PCC 7002 reveal relationships between gene centrality and essentiality

    SciTech Connect

    Song, Hyun-Seob; McClure, Ryan S.; Bernstein, Hans C.; Overall, Christopher C.; Hill, Eric A.; Beliaev, Alex S.

    2015-03-27

    Cyanobacteria dynamically relay environmental inputs to intracellular adaptations through a coordinated adjustment of photosynthetic efficiency and carbon processing rates. The output of such adaptations is reflected through changes in transcriptional patterns and metabolic flux distributions that ultimately define growth strategy. To address interrelationships between metabolism and regulation, we performed integrative analyses of metabolic and gene co-expression networks in a model cyanobacterium, Synechococcus sp. PCC 7002. Centrality analyses using the gene co-expression network identified a set of key genes, which were defined here as ‘topologically important.’ Parallel in silico gene knock-out simulations, using the genome-scale metabolic network, classified what we termed as ‘functionally important’ genes, deletion of which affected growth or metabolism. A strong positive correlation was observed between topologically and functionally important genes. Functionally important genes exhibited variable levels of topological centrality; however, the majority of topologically central genes were found to be functionally essential for growth. Subsequent functional enrichment analysis revealed that both functionally and topologically important genes in Synechococcus sp. PCC 7002 are predominantly associated with translation and energy metabolism, two cellular processes critical for growth. This research demonstrates how synergistic network-level analyses can be used for reconciliation of metabolic and gene expression data to uncover fundamental biological principles.

  14. Integrated in silico analyses of regulatory and metabolic networks of Synechococcus sp. PCC 7002 reveal relationships between gene centrality and essentiality

    DOE PAGESBeta

    Song, Hyun-Seob; McClure, Ryan S.; Bernstein, Hans C.; Overall, Christopher C.; Hill, Eric A.; Beliaev, Alex S.

    2015-03-27

    Cyanobacteria dynamically relay environmental inputs to intracellular adaptations through a coordinated adjustment of photosynthetic efficiency and carbon processing rates. The output of such adaptations is reflected through changes in transcriptional patterns and metabolic flux distributions that ultimately define growth strategy. To address interrelationships between metabolism and regulation, we performed integrative analyses of metabolic and gene co-expression networks in a model cyanobacterium, Synechococcus sp. PCC 7002. Centrality analyses using the gene co-expression network identified a set of key genes, which were defined here as ‘topologically important.’ Parallel in silico gene knock-out simulations, using the genome-scale metabolic network, classified what we termed as ‘functionally important’ genes, deletion of which affected growth or metabolism. A strong positive correlation was observed between topologically and functionally important genes. Functionally important genes exhibited variable levels of topological centrality; however, the majority of topologically central genes were found to be functionally essential for growth. Subsequent functional enrichment analysis revealed that both functionally and topologically important genes in Synechococcus sp. PCC 7002 are predominantly associated with translation and energy metabolism, two cellular processes critical for growth. This research demonstrates how synergistic network-level analyses can be used for reconciliation of metabolic and gene expression data to uncover fundamental biological principles.

  15. Characterizing gene sets using discriminative random walks with restart on heterogeneous biological networks

    PubMed Central

    Blatti, Charles; Sinha, Saurabh

    2016-01-01

    Motivation: Analysis of co-expressed gene sets typically involves testing for enrichment of different annotations or ‘properties’ such as biological processes, pathways, transcription factor binding sites, etc., one property at a time. This common approach ignores any known relationships among the properties or the genes themselves. It is believed that known biological relationships among genes and their many properties may be exploited to more accurately reveal commonalities of a gene set. Previous work has sought to achieve this by building biological networks that combine multiple types of gene–gene or gene–property relationships, and performing network analysis to identify other genes and properties most relevant to a given gene set. Most existing network-based approaches for recognizing genes or annotations relevant to a given gene set collapse information about different properties to simplify (homogenize) the networks. Results: We present a network-based method for ranking genes or properties related to a given gene set. Such related genes or properties are identified from among the nodes of a large, heterogeneous network of biological information. Our method involves a random walk with restarts, performed on an initial network with multiple node and edge types that preserve more of the original, specific property information than current methods that operate on homogeneous networks. In this first stage of our algorithm, we find the properties that are the most relevant to the given gene set and extract a subnetwork of the original network, comprising only these relevant properties. We then re-rank genes by their similarity to the given gene set, based on a second random walk with restarts, performed on the above subnetwork. We demonstrate the effectiveness of this algorithm for ranking genes related to Drosophila embryonic development and aggressive responses in the brains of social animals. Availability and Implementation: DRaWR was implemented as
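    The core operation named above, a random walk with restarts, can be sketched on a small graph. This is not the DRaWR implementation: it is a single-stage walk on a homogeneous, undirected toy graph, and the node names, edge weights, and restart probability are assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walk that restarts at the seeds.

    adj maps each node to a dict {neighbor: weight}. Every node is assumed
    to have at least one neighbor; otherwise probability mass would be lost.
    """
    nodes = sorted(adj)
    # Restart distribution: uniform over the seed nodes.
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            total = sum(adj[u].values())
            for v, w in adj[u].items():
                # Move from u to v in proportion to the edge weight.
                nxt[v] += (1.0 - restart) * p[u] * w / total
        delta = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical path graph n1 - n2 - n3 - n4 with unit weights.
adj = {
    "n1": {"n2": 1.0},
    "n2": {"n1": 1.0, "n3": 1.0},
    "n3": {"n2": 1.0, "n4": 1.0},
    "n4": {"n3": 1.0},
}

scores = random_walk_with_restart(adj, seeds={"n1"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

    Nodes closer to the seed set receive higher scores, which is the basis for ranking genes or properties by relevance to a query gene set.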

  16. Structure and function of gene regulatory networks associated with worker sterility in honeybees.

    PubMed

    Sobotka, Julia A; Daley, Mark; Chandrasekaran, Sriram; Rubin, Benjamin D; Thompson, Graham J

    2016-03-01

    A characteristic of eusocial bees is a reproductive division of labor in which one or a few queens monopolize reproduction, while her worker daughters take on reproductively altruistic roles within the colony. The evolution of worker reproductive altruism involves indirect selection for the coordinated expression of genes that regulate personal reproduction, but evidence for this type of selection remains elusive. In this study, we tested whether genes coexpressed under queen-induced worker sterility show evidence of adaptive organization within a model brain transcriptional regulatory network (TRN). If so, this structured pattern would imply that indirect selection on nonreproductive workers has influenced the functional organization of genes within the network, specifically to regulate the expression of sterility. We found that literature-curated sets of candidate genes for sterility, ranging in size from 18 to 267, show strong evidence of clustering within the three-dimensional space of the TRN. This finding suggests that our candidate sets of genes for sterility form functional modules within the living bee brain's TRN. Moreover, these same gene sets colocate to a single, albeit large, region of the TRN's topology. This spatially organized and convergent pattern contrasts with a null expectation for functionally unrelated genes to be haphazardly distributed throughout the network. Our meta-genomic analysis therefore provides first evidence for a truly "social transcriptome" that may regulate the conditional expression of honeybee worker sterility. PMID:26925214

  17. Network analysis of neurotransmitter related human kinase genes: possible role of SRC, RAF1, PTK2B?

    PubMed

    Brys, Zoltan; Pluhar, Andras; Kis, Janos Tibor; Buda, Bela; Szabo, Attila

    2013-09-01

    Previous co-expression analysis of human kinase genes highlighted 119 genes in neurotransmitter-related activity (based on GO terms). Using a merged interactome dataset, we analyzed the network of these Neurotransmitter Related Human Kinase Genes. Using the full interactome dataset, we extended the network and, by calculating degree and closeness centralities, identified the SRC, MAPK1, RAF1, PTK2B and AKT1 kinase genes as potentially relevant nodes which did not show relevant activity in the original experimental study. As AKT1 and MAPK1 have already been implicated in various neuronal functions, we hypothesize a potential direct or indirect role for the SRC, RAF1 and PTK2B genes in neurotransmission and in central nervous system signaling processes. PMID:24108181
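    Degree and closeness centrality, the two node measures used above, can be computed with a short breadth-first search. The sketch below assumes an unweighted, undirected graph; the node labels are hypothetical placeholders and do not represent the interactome analyzed in the study.

```python
from collections import deque

def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def closeness_centrality(adj):
    """Reachable-node count divided by the sum of shortest-path distances."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

# Hypothetical toy graph: "hub" links to a, b and c; c also links to d.
adj = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"c"},
}

degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
print(max(degree, key=degree.get), max(closeness, key=closeness.get))
```

    Here the same node ranks first on both measures. In larger networks the two rankings can diverge, which is one reason to report both.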

  18. Gene Regulation Networks for Modeling Drosophila Development

    NASA Technical Reports Server (NTRS)

    Mjolsness, E.

    1999-01-01

    This chapter will very briefly introduce and review some computational experiments in using trainable gene regulation network models to simulate and understand selected episodes in the development of the fruit fly, Drosophila melanogaster.

  19. Supervised classification for gene network reconstruction.

    PubMed

    Soinov, L A

    2003-12-01

    One of the central problems of functional genomics is revealing gene expression networks - the relationships between genes that reflect observations of how the expression level of each gene affects those of others. Microarray data are currently a major source of information about the interplay of biochemical network participants in living cells. Various mathematical techniques, such as differential equations, Bayesian and Boolean models and several statistical methods, have been applied to expression data in attempts to extract the underlying knowledge. Unsupervised clustering methods are often considered as the necessary first step in visualization and analysis of the expression data. As for supervised classification, the problem mainly addressed so far has been how to find discriminative genes separating various samples or experimental conditions. Numerous methods have been applied to identify genes that help to predict treatment outcome or to confirm a diagnosis, as well as to identify primary elements of gene regulatory circuits. However, less attention has been devoted to using supervised learning to uncover relationships between genes and/or their products. To start filling this gap a machine-learning approach for gene networks reconstruction is described here. This approach is based on building classifiers--functions, which determine the state of a gene's transcription machinery through expression levels of other genes. The method can be applied to various cases where relationships between gene expression levels could be expected. PMID:14641098

  20. Co-expression of Ubiquitin gene and capsid protein gene enhances the potency of DNA immunization of PCV2 in mice

    PubMed Central

    2011-01-01

    A recombinant plasmid that co-expressed ubiquitin and porcine circovirus type 2 (PCV2) virus capsid protein (Cap), denoted as pc-Ub-Cap, and a plasmid encoding PCV2 virus Cap alone, denoted as pc-Cap, were transfected into 293T cells. Indirect immunofluorescence (IIF) and confocal microscopy were performed to measure the cellular expression of Cap. Three groups of mice were then vaccinated once every three weeks for a total of three doses with pc-Ub-Cap, pc-Cap or the empty vector pCAGGS, followed by challenging all mice intraperitoneally with 0.5 mL of 10^6.5 TCID50/mL PCV2. To characterize the protective immune response against PCV2 infection in mice, assays of antibody titer (including different IgG isotypes), flow cytometric analysis (FCM), lymphocyte proliferation, cytokine production and viremia were evaluated. The results showed that pc-Ub-Cap and pc-Cap were efficiently expressed in 293T cells. However, pc-Ub-Cap-vaccinated animals had a significantly higher level of Cap-specific antibody and induced a stronger Th1 type cellular immune response than did pc-Cap-vaccinated animals, suggesting that ubiquitin conjugation improved both the cellular and humoral immune responses. Additionally, viral replication in blood was lower in the pc-Ub-Cap-vaccinated group than in the pc-Cap and empty vector groups, suggesting that the protective immunity induced by pc-Ub-Cap is superior to that induced by pc-Cap. PMID:21624113

  1. Autonomous Boolean modeling of gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Socolar, Joshua; Sun, Mengyang; Cheng, Xianrui

    2014-03-01

    In cases where the dynamical properties of gene regulatory networks are important, a faithful model must include three key features: a network topology; a functional response of each element to its inputs; and timing information about the transmission of signals across network links. Autonomous Boolean network (ABN) models are efficient representations of these elements and are amenable to analysis. We present an ABN model of the gene regulatory network governing cell fate specification in the early sea urchin embryo, which must generate three bands of distinct tissue types after several cell divisions, beginning from an initial condition with only two distinct cell types. Analysis of the spatial patterning problem and the dynamics of a network constructed from available experimental results reveals that a simple mechanism is at work in this case. Supported by NSF Grant DMS-10-68602

  2. Time-course gene profiling and networks in demethylated retinoblastoma cell line

    PubMed Central

    Malusa, Federico; Taranta, Monia; Zaki, Nazar; Cinti, Caterina; Capobianco, Enrico

    2015-01-01

    Retinoblastoma, a very aggressive cancer of the developing retina, is initiated by the biallelic loss of the RB1 gene, and progresses very quickly following RB1 inactivation. While its genome is stable, multiple pathways are deregulated, also epigenetically. After reviewing the main findings in relation with recently validated markers, we propose an integrative bioinformatics approach to include in the previous group new markers obtained from the analysis of a single cell line subject to epigenetic treatment. In particular, differentially expressed genes are identified from time course microarray experiments on the WERI-RB1 cell line treated with 5-Aza-2′-deoxycytidine (decitabine; DAC). By inducing demethylation of CpG island in promoter genes that are involved in biological processes, for instance apoptosis, we performed the following main integrative analysis steps: i) Gene expression profiling at 48h, 72h and 96h after DAC treatment; ii) Time differential gene co-expression networks and iii) Context-driven marker association (transcriptional factor regulated protein networks, master regulatory paths). The observed DAC-driven temporal profiles and regulatory connectivity patterns are obtained by the application of computational tools, with support from curated literature. It is worth emphasizing the capacity of networks to reconcile multi-type evidences, thus generating testable hypotheses made available by systems scale predictive inference power. Despite our small experimental setting, we propose through such integrations valuable impacts of epigenetic treatment in terms of gene expression measurements, and then validate evidenced apoptotic effects. PMID:26143641

  3. Gene Networks Specific for Innate Immunity Define Post-Traumatic Stress Disorder

    PubMed Central

    Breen, Michael S.; Maihofer, Adam X.; Glatt, Stephen J.; Tylee, Daniel S.; Chandler, Sharon D.; Tsuang, Ming T.; Risbrough, Victoria B.; Baker, Dewleen G.; O’Connor, Daniel T.; Nievergelt, Caroline M.; Woelk, Christopher H.

    2015-01-01

    The molecular factors involved in the development of Post-Traumatic Stress Disorder (PTSD) remain poorly understood. Previous transcriptomic studies investigating the mechanisms of PTSD, which apply targeted approaches to identify individual genes under a cross-sectional framework, lack a holistic view of the behaviours and properties of these genes at the system-level. Here we sought to apply an unsupervised gene-network based approach to a prospective experimental design using whole-transcriptome RNA-Seq gene expression from peripheral blood leukocytes of U.S. Marines (N=188), obtained both pre- and post-deployment to conflict zones. We identified discrete groups of co-regulated genes (i.e., co-expression modules) and tested them for association to PTSD. We identified one module at both pre- and post-deployment containing putative causal signatures for PTSD development displaying an over-expression of genes enriched for functions of innate-immune response and interferon signalling (Type-I and Type-II). Importantly, these results were replicated in a second non-overlapping independent dataset of U.S. Marines (N=96), further outlining the role of innate immune and interferon signalling genes within co-expression modules to explain at least part of the causal pathophysiology for PTSD development. A second module, consequential of trauma exposure, contained PTSD resiliency signatures and an over-expression of genes involved in hemostasis and wound responsiveness, suggesting that chronic levels of stress impair proper wound healing during/after exposure to the battlefield while highlighting the role of the hemostatic system as a clinical indicator of chronic-based stress. These findings provide novel insights for early preventative measures and advanced PTSD detection, which may lead to interventions that delay or perhaps abrogate the development of PTSD. PMID:25754082

  4. Identification of the key regulating genes of diminished ovarian reserve (DOR) by network and gene ontology analysis.

    PubMed

    Pashaiasl, Maryam; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2016-09-01

    Diminished ovarian reserve (DOR) is one of the reasons for infertility that affects both older and young women. Ovarian reserve assessment can be used as a new prognostic tool for infertility treatment decision making. Here, up- and down-regulated gene expression profiles of granulosa cells were analysed to generate a putative interaction map of the involved genes. In addition, gene ontology (GO) analysis was used to get insight into the biological processes and molecular functions of involved proteins in DOR. Eleven up-regulated genes and nine down-regulated genes were identified and assessed by constructing interaction networks based on their biological processes. PTGS2, CTGF, LHCGR, CITED, SOCS2, STAR and FSTL3 were the key nodes in the up-regulated networks, while the IGF2, AMH, GREM, and FOXC1 proteins were key in the down-regulated networks. MIRN101-1, MIRN153-1 and MIRN194-1 inhibited the expression of SOCS2, while CSH1 and BMP2 positively regulated IGF1 and IGF2. Ossification, ovarian follicle development, vasculogenesis, sequence-specific DNA binding transcription factor activity, and Golgi apparatus are the major differential groups between up-regulated and down-regulated genes in DOR. Meta-analysis of publicly available transcriptomic data highlighted the high coexpression of CTGF, connective tissue growth factor, with the other key regulators of DOR. CTGF is involved in organ senescence and focal adhesion pathway according to GO analysis. These findings provide a comprehensive system biology based insight into the aetiology of DOR through network and gene ontology analyses. PMID:27324248

  5. Construction of a bivalent DNA vaccine co-expressing S genes of transmissible gastroenteritis virus and porcine epidemic diarrhea virus delivered by attenuated Salmonella typhimurium.

    PubMed

    Zhang, Yudi; Zhang, Xiaohui; Liao, Xiaodan; Huang, Xiaobo; Cao, Sanjie; Wen, Xintian; Wen, Yiping; Wu, Rui; Liu, Wumei

    2016-06-01

    Porcine transmissible gastroenteritis virus (TGEV) and porcine epidemic diarrhea virus (PEDV) can cause severe diarrhea in newborn piglets and lead to significant economic losses. The S proteins are the main structural proteins of PEDV and TGEV capable of inducing neutralizing antibodies in vivo. In this study, a DNA vaccine SL7207 (pVAXD-PS1-TS) co-expressing S proteins of TGEV and PEDV delivered by attenuated Salmonella typhimurium was constructed and its immunogenicity in piglets was investigated. Twenty-day-old piglets were orally immunized with SL7207 (pVAXD-PS1-TS) at a dosage of 1.6 × 10^11 CFU per piglet and then booster immunized with 2.0 × 10^11 CFU after 2 weeks. Humoral immune responses, as reflected by virus neutralizing antibodies and specific IgG and sIgA, and cellular immune responses, as reflected by IFN-γ, IL-4, and lymphocyte proliferation, were evaluated. SL7207 (pVAXD-PS1-TS) simultaneously elicited immune responses against TGEV and PEDV after oral immunization. The immune levels started to increase at 2 weeks after immunization, reached levels statistically significantly different from controls at 4 weeks post-immunization, peaked at 6 weeks and declined at 8 weeks. The humoral, mucosal, and cellular immune responses induced by SL7207 (pVAXD-PS1-TS) were significantly higher than those of the PBS and SL7207 (pVAXD) (p < 0.01). In particular, the levels of IFN-γ and IL-4 were higher than those induced by the single-gene vaccine SL7207 (pVAXD-PS1) (p < 0.05). These results demonstrated that SL7207 (pVAXD-PS1-TS) possesses the immunological functions of the two S proteins of TGEV and PEDV, indicating that SL7207 (pVAXD-PS1-TS) is a candidate oral vaccine for TGE and PED. PMID:26980672

  6. Mutated Genes in Schizophrenia Map to Brain Networks

    MedlinePlus

    ... 2013 Mutated Genes in Schizophrenia Map to Brain Networks Schizophrenia networks in the prefrontal cortex area of the brain. ... of spontaneous mutations in genes that form a network in the front region of the brain. The ...

  7. Inference of Gene Regulatory Network Based on Local Bayesian Networks

    PubMed Central

    Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Chen, Luonan

    2016-01-01

    The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E.coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. 
In particular, the decomposition strategy with local Bayesian networks not only effectively reduce

  8. Inference of Gene Regulatory Network Based on Local Bayesian Networks.

    PubMed

    Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan

    2016-08-01

    The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E.coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. 
In particular, the decomposition strategy with local Bayesian networks not only effectively reduce

  9. Analysis of cellulose synthase genes from domesticated apple identifies collinear genes WDR53 and CesA8A: partial co-expression, bicistronic mRNA, and alternative splicing of CESA8A

    PubMed Central

    Guerriero, Gea; Spadiut, Oliver; Kerschbamer, Christine; Giorno, Filomena; Baric, Sanja; Ezcurra, Inés

    2016-01-01

    Cellulose synthase (CesA) genes constitute a complex multigene family with six major phylogenetic clades in angiosperms. The recently sequenced genome of domestic apple, Malus×domestica, was mined for CesA genes, by blasting full-length cellulose synthase protein (CESA) sequences annotated in the apple genome against protein databases from the plant models Arabidopsis thaliana and Populus trichocarpa. Thirteen genes belonging to the six angiosperm CesA clades and coding for proteins with conserved residues typical of processive glycosyltransferases from family 2 were detected. Based on their phylogenetic relationship to Arabidopsis CESAs, as well as expression patterns, a nomenclature is proposed to facilitate further studies. Examination of their genomic organization revealed that MdCesA8-A is closely linked and co-oriented with WDR53, a gene coding for a WD40 repeat protein. The WDR53 and CesA8 genes display conserved collinearity in dicots and are partially co-expressed in the apple xylem. Interestingly, the presence of a bicistronic WDR53–CesA8A transcript was detected in phytoplasma-infected phloem tissues of apple. The bicistronic transcript contains a spliced intergenic sequence that is predicted to fold into hairpin structures typical of internal ribosome entry sites, suggesting its potential cap-independent translation. Surprisingly, the CesA8A cistron is alternatively spliced and lacks the zinc-binding domain. The possible roles of WDR53 and the alternatively spliced CESA8 variant during cellulose biosynthesis in M.×domestica are discussed. PMID:23048131

  10. Genome-wide patterns of promoter sharing and co-expression in bovine skeletal muscle

    PubMed Central

    2011-01-01

    Background Gene regulation by transcription factors (TF) is species, tissue and time specific. To better understand how the genetic code controls gene expression in bovine muscle we associated gene expression data from developing Longissimus thoracis et lumborum skeletal muscle with bovine promoter sequence information. Results We created a highly conserved genome-wide promoter landscape comprising 87,408 interactions relating 333 TFs with their 9,242 predicted target genes (TGs). We discovered that the complete set of predicted TGs share an average of 2.75 predicted TF binding sites (TFBSs) and that the average co-expression between a TF and its predicted TGs is higher than the average co-expression between the same TF and all genes. Conversely, pairs of TFs sharing predicted TGs showed a co-expression correlation higher than pairs of TFs not sharing TGs. Finally, we exploited the co-occurrence of predicted TFBS in the context of muscle-derived functionally-coherent modules including cell cycle, mitochondria, immune system, fat metabolism, muscle/glycolysis, and ribosome. Our findings enabled us to reverse engineer a regulatory network of core processes, and correctly identified the involvement of E2F1, GATA2 and NFKB1 in the regulation of cell cycle, fat, and muscle/glycolysis, respectively. Conclusion The pivotal implication of our research is two-fold: (1) there exists a robust genome-wide expression signal between TFs and their predicted TGs in cattle muscle consistent with the extent of promoter sharing; and (2) this signal can be exploited to recover the cellular mechanisms underpinning transcription regulation of muscle structure and development in bovine. Our study represents the first genome-wide report linking tissue specific co-expression to co-regulation in a non-model vertebrate. PMID:21226902
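
The comparison reported above (mean co-expression of a TF with its predicted targets versus with all genes) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the gene names, the toy expression values, the use of Pearson correlation as the co-expression measure, and the helper names `pearson` and `mean_coexpression`. None of it is taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_coexpression(tf, genes, expr):
    """Mean correlation between one TF and a collection of other genes."""
    others = [g for g in genes if g != tf]
    return sum(pearson(expr[tf], expr[g]) for g in others) / len(others)


# Hypothetical toy data: expression of four genes across five samples.
expr = {
    "TF1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneA": [2.0, 4.1, 6.2, 7.9, 10.0],  # predicted target, tracks TF1
    "geneB": [1.5, 2.4, 3.6, 4.4, 5.5],   # predicted target, tracks TF1
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],   # not a predicted target
}
predicted_targets = {"TF1": ["geneA", "geneB"]}  # hypothetical TF-to-target map

with_targets = mean_coexpression("TF1", predicted_targets["TF1"], expr)
with_all = mean_coexpression("TF1", list(expr), expr)
print(f"TF1 vs predicted targets: {with_targets:.3f}")
print(f"TF1 vs all genes:         {with_all:.3f}")
```

In a real analysis the same two averages would be computed per TF over the whole expression matrix, and the difference would need a significance test rather than a direct comparison of two numbers.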

  11. Integrating heterogeneous gene expression data for gene regulatory network modelling.

    PubMed

    Sîrbu, Alina; Ruskin, Heather J; Crane, Martin

    2012-06-01

    Gene regulatory networks (GRNs) are complex biological systems that have a large impact on protein levels, so that discovering network interactions is a major objective of systems biology. Quantitative GRN models have been inferred, to date, from time series measurements of gene expression, but at small scale, and with limited application to real data. Time series experiments are typically short (number of time points of the order of ten), whereas regulatory networks can be very large (containing hundreds of genes). This creates an under-determination problem, which negatively influences the results of any inferential algorithm. Presented here is an integrative approach to model inference, which has not been previously discussed to the authors' knowledge. Multiple heterogeneous expression time series are used to infer the same model, and results are shown to be more robust to noise and parameter perturbation. Additionally, a wavelet analysis shows that these models display limited noise over-fitting within the individual datasets. PMID:21948152

  12. Gene-Sharing Networks Reveal Organizing Principles of Transcriptomes in Arabidopsis and Other Multicellular Organisms

    PubMed Central

    Li, Song; Pandey, Sona; Gookin, Timothy E.; Zhao, Zhixin; Wilson, Liza; Assmann, Sarah M.

    2012-01-01

    Understanding tissue-related gene expression patterns can provide important insights into gene, tissue, and organ function. Transcriptome analyses often have focused on housekeeping or tissue-specific genes or on gene coexpression. However, by analyzing thousands of single-gene expression distributions in multiple tissues of Arabidopsis thaliana, rice (Oryza sativa), human (Homo sapiens), and mouse (Mus musculus), we found that these organisms primarily operate by gene sharing, a phenomenon where, in each organism, most genes exhibit a high expression level in a few key tissues. We designed an analytical pipeline to characterize this phenomenon and then derived Arabidopsis and human gene-sharing networks, in which tissues are connected solely based on the extent of shared preferentially expressed genes. The results show that tissues or cell types from the same organ system tend to group together to form network modules. Tissues that are in consecutive developmental stages or have common physiological functions are connected in these networks, revealing the importance of shared preferentially expressed genes in conferring specialized functions of each tissue type. The networks provide predictive power for each tissue type regarding gene functions of both known and heretofore unknown genes, as shown by the identification of four new genes with functions in guard cell and abscisic acid response. We provide a Web interface that enables, based on the extent of gene sharing, both prediction of tissue-related functions for any Arabidopsis gene of interest and predictions concerning the relatedness of tissues. Common gene-sharing patterns observed in the four model organisms suggest that gene sharing evolved as a fundamental organizing principle of gene expression in diverse multicellular eukaryotes. PMID:22517316

  13. Protection of chickens from Newcastle disease and infectious laryngotracheitis with a recombinant fowlpox virus co-expressing the F, HN genes of Newcastle disease virus and gB gene of infectious laryngotracheitis virus.

    PubMed

    Sun, Hui-Ling; Wang, Yun-Feng; Tong, Guang-Zhi; Zhang, Pei-Jun; Miao, De-Yuan; Zhi, Hai-Dong; Wang, Ming; Wang, Mei

    2008-03-01

    A recombinant fowlpox virus (rFPV) coexpressing the Newcastle disease virus (NDV) fusion and hemagglutinin-neuraminidase genes and infectious laryngotracheitis virus (ILTV) glycoprotein B gene was constructed. This virus was then evaluated for its ability to protect specific-pathogen-free (SPF) chickens against clinical symptoms and death after challenge by virulent NDV and ILTV. SPF chickens were grouped and vaccinated with the rFPV and commercial NDV (La Sota) and ILTV attenuated live vaccine (Nobilis ILT), respectively. After challenge with NDV 10 days postvaccination, 70% of chickens vaccinated with rFPV were protected from death, whereas 100% of the commercial NDV-vaccinated chickens were protected from death. In contrast, 100% of the unvaccinated chickens died after challenge. After challenge with ILTV, both the rFPV and commercial ILTV-vaccinated chickens were completely protected from death and 70% of chickens were protected from respiratory signs. In comparison, 100% of the unvaccinated chickens developed severe respiratory disease and 10% of chickens died. The protective efficacy was also measured by the antibody responses and isolation of challenge viruses. Results showed that this rFPV could be a potential vaccine for preventing NDV and ILTV by a single immunization. PMID:18459306

  14. Hysteresis in a synthetic mammalian gene network.

    PubMed

    Kramer, Beat P; Fussenegger, Martin

    2005-07-01

    Bistable and hysteretic switches, enabling cells to adopt multiple internal expression states in response to a single external input signal, have a pivotal impact on biological systems, ranging from cell-fate decisions to cell-cycle control. We have designed a synthetic hysteretic mammalian transcription network. A positive feedback loop, consisting of a transgene and transactivator (TA) cotranscribed by TA's cognate promoter, is repressed by constitutive expression of a macrolide-dependent transcriptional silencer, whose activity is modulated by the macrolide antibiotic erythromycin. The antibiotic concentration, at which a quasi-discontinuous switch of transgene expression occurs, depends on the history of the synthetic transcription circuitry. If the network components are imbalanced, a graded rather than a quasi-discontinuous signal integration takes place. These findings are consistent with a mathematical model. Synthetic gene networks, which are able to emulate natural gene expression behavior, may foster progress in future gene therapy and tissue engineering initiatives. PMID:15972812

  15. Redeployment of a conserved gene regulatory network during Aedes aegypti development.

    PubMed

    Suryamohan, Kushal; Hanson, Casey; Andrews, Emily; Sinha, Saurabh; Scheel, Molly Duman; Halfon, Marc S

    2016-08-15

    Changes in gene regulatory networks (GRNs) underlie the evolution of morphological novelty and developmental system drift. The fruitfly Drosophila melanogaster and the dengue and Zika vector mosquito Aedes aegypti have substantially similar nervous system morphology. Nevertheless, they show significant divergence in a set of genes co-expressed in the midline of the Drosophila central nervous system, including the master regulator single minded and downstream genes including short gastrulation, Star, and NetrinA. In contrast to Drosophila, we find that midline expression of these genes is either absent or severely diminished in A. aegypti. Instead, they are co-expressed in the lateral nervous system. This suggests that in A. aegypti this "midline GRN" has been redeployed to a new location while lost from its previous site of activity. In order to characterize the relevant GRNs, we employed the SCRMshaw method we previously developed to identify transcriptional cis-regulatory modules in both species. Analysis of these regulatory sequences in transgenic Drosophila suggests that the altered gene expression observed in A. aegypti is the result of trans-dependent redeployment of the GRN, potentially stemming from cis-mediated changes in the expression of sim and other as-yet unidentified regulators. Our results illustrate a novel "repeal, replace, and redeploy" mode of evolution in which a conserved GRN acquires a different function at a new site while its original function is co-opted by a different GRN. This represents a striking example of developmental system drift in which the dramatic shift in gene expression does not result in gross morphological changes, but in more subtle differences in development and function of the late embryonic nervous system. PMID:27341759

  16. Mutational Robustness of Gene Regulatory Networks

    PubMed Central

    van Dijk, Aalt D. J.; van Mourik, Simon; van Ham, Roeland C. H. J.

    2012-01-01

    Mutational robustness of gene regulatory networks refers to their ability to generate constant biological output upon mutations that change network structure. Such networks contain regulatory interactions (transcription factor – target gene interactions) but often also protein-protein interactions between transcription factors. Using computational modeling, we study factors that influence robustness and we infer several network properties governing it. These include the type of mutation, i.e. whether a regulatory interaction or a protein-protein interaction is mutated, and in the case of mutation of a regulatory interaction, the sign of the interaction (activating vs. repressive). In addition, we analyze the effect of combinations of mutations and we compare networks containing monomeric with those containing dimeric transcription factors. Our results are consistent with available data on biological networks, for example based on evolutionary conservation of network features. As a novel and remarkable property, we predict that networks are more robust against mutations in monomer than in dimer transcription factors, a prediction for which analysis of conservation of DNA binding residues in monomeric vs. dimeric transcription factors provides indirect evidence. PMID:22295094

  17. Snapshot of iron response in Shewanella oneidensis by gene network reconstruction

    SciTech Connect

    Yang, Yunfeng; Harris, Daniel P.; Luo, Feng; Xiong, Wenlu; Joachimiak, Marcin; Wu, Liyou; Dehal, Paramvir; Jacobsen, Janet; Yang, Zamin; Palumbo, Anthony V.; Arkin, Adam P.; Zhou, Jizhong

    2008-10-09

    Background: Iron homeostasis of Shewanella oneidensis, a gamma-proteobacterium possessing high iron content, is regulated by a global transcription factor Fur. However, knowledge is incomplete about other biological pathways that respond to changes in iron concentration, as well as details of the responses. In this work, we integrate physiological, transcriptomics and genetic approaches to delineate the iron response of S. oneidensis. Results: We show that the iron response in S. oneidensis is a rapid process. Temporal gene expression profiles were examined for iron depletion and repletion, and a gene co-expression network was reconstructed. Modules of iron acquisition systems, anaerobic energy metabolism and protein degradation were the most noteworthy in the gene network. Bioinformatics analyses suggested that genes in each of the modules might be regulated by DNA-binding proteins Fur, CRP and RpoH, respectively. Closer inspection of these modules revealed a transcriptional regulator (SO2426) involved in iron acquisition and ten transcriptional factors involved in anaerobic energy metabolism. Selected genes in the network were analyzed by genetic studies. Disruption of genes encoding a putative alcaligin biosynthesis protein (SO3032) and a gene previously implicated in protein degradation (SO2017) led to severe growth deficiency under iron depletion conditions. Disruption of a novel transcriptional factor (SO1415) caused deficiency in both anaerobic iron reduction and growth with thiosulfate or TMAO as an electronic acceptor, suggesting that SO1415 is required for specific branches of anaerobic energy metabolism pathways. Conclusions: Using a reconstructed gene network, we identified major biological pathways that were differentially expressed during iron depletion and repletion. 
Genetic studies not only demonstrated the importance of iron acquisition and protein degradation for iron depletion, but also characterized a novel transcriptional factor (SO1415) with a

  18. Diversity in Compartmental Dynamics of Gene Regulatory Networks: The Immune Response in Primary Influenza A Infection in Mice

    PubMed Central

    Hilchey, Shannon P.; Thakar, Juilee; Liu, Zhi-Ping; Welle, Stephen L.; Henn, Alicia D.; Wu, Hulin; Zand, Martin S.

    2015-01-01

    Current approaches to study transcriptional profiles post influenza infection typically rely on tissue sampling from one or two sites at a few time points, such as spleen and lung in murine models. In this study, we infected female C57/BL6 mice intranasally with mouse-adapted H3N2/Hong Kong/X31 avian influenza A virus, and then analyzed the gene expression profiles in four different compartments (blood, lung, mediastinal lymph nodes, and spleen) over 11 consecutive days post infection. These data were analyzed by an advanced statistical procedure based on ordinary differential equation (ODE) modeling. Vastly different lists of significant genes were identified by the same statistical procedure in each compartment. Only 11 of them are significant in all four compartments. We classified significant genes in each compartment into co-expressed modules based on temporal expression patterns. We then performed functional enrichment analysis on these co-expression modules and identified significant pathway and functional motifs. Finally, we used an ODE based model to reconstruct gene regulatory network (GRN) for each compartment and studied their network properties. PMID:26413862

  19. From gene expressions to genetic networks

    NASA Astrophysics Data System (ADS)

    Cieplak, Marek

    2009-03-01

    A method based on the principle of entropy maximization is used to identify the gene interaction network with the highest probability of giving rise to experimentally observed transcript profiles [1]. In its simplest form, the method yields the pairwise gene interaction network, but it can also be extended to deduce higher order correlations. Analysis of microarray data from genes in Saccharomyces cerevisiae chemostat cultures exhibiting energy metabolic oscillations identifies a gene interaction network that reflects the intracellular communication pathways. These pathways adjust cellular metabolic activity and cell division to the limiting nutrient conditions that trigger metabolic oscillations. The success of the present approach in extracting meaningful genetic connections suggests that the maximum entropy principle is a useful concept for understanding living systems, as it is for other complex, nonequilibrium systems. The time-dependent behavior of the genetic network is found to involve only a few fundamental modes [2,3]. REFERENCES: [1] T. R. Lezon, J. R. Banavar, M. Cieplak, A. Maritan, and N. Fedoroff, Using the principle of entropy maximization to infer genetic interaction networks from gene expression patterns, Proc. Natl. Acad. Sci. USA 103, 19033-19038 (2006). [2] N. S. Holter, M. Mitra, A. Maritan, M. Cieplak, J. R. Banavar, and N. V. Fedoroff, Fundamental patterns underlying gene expression profiles: simplicity from complexity, Proc. Natl. Acad. Sci. USA 97, 8409-8414 (2000). [3] N. S. Holter, A. Maritan, M. Cieplak, N. V. Fedoroff, and J. R. Banavar, Dynamic modeling of gene expression data, Proc. Natl. Acad. Sci. USA 98, 1693-1698 (2001).
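
A hedged sketch of why the simplest form of such a method is pairwise: if the only constraints imposed are the measured means and covariances of the expression levels, the entropy-maximizing distribution is a multivariate Gaussian, and the off-diagonal elements of the inverse covariance matrix can be read as pairwise coupling strengths. The symbols below are notation introduced here for illustration and are not taken from the abstract. When there are fewer samples than genes the covariance matrix is singular, so some pseudo-inverse or regularization would be needed in practice.

```latex
% Assumption: only the means \mu_i and covariances C_{ij} of the
% expression levels x_i are constrained.
\rho(\mathbf{x}) \;\propto\;
  \exp\!\Big(-\tfrac{1}{2}\sum_{i,j}(x_i-\mu_i)\,M_{ij}\,(x_j-\mu_j)\Big),
\qquad M = C^{-1},
\qquad C_{ij} = \big\langle (x_i-\mu_i)(x_j-\mu_j) \big\rangle .
```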

  20. Gene Networks in the Wild: Identifying Transcriptional Modules that Mediate Coral Resistance to Experimental Heat Stress

    PubMed Central

    Rose, Noah H.; Seneca, Francois O.; Palumbi, Stephen R.

    2016-01-01

    Organisms respond to environmental variation partly through changes in gene expression, which underlie both homeostatic and acclimatory responses to environmental stress. In some cases, so many genes change in expression in response to different influences that understanding expression patterns for all these individual genes becomes difficult. To reduce this problem, we use a systems genetics approach to show that variation in the expression of thousands of genes of reef-building corals can be explained as variation in the expression of a small number of coexpressed “modules.” Modules were often enriched for specific cellular functions and varied predictably among individuals, experimental treatments, and physiological state. We describe two transcriptional modules for which expression levels immediately after heat stress predict bleaching a day later. One of these early “bleaching modules” is enriched for sequence-specific DNA-binding proteins, particularly E26 transformation-specific (ETS)-family transcription factors. The other module is enriched for extracellular matrix proteins. These classes of bleaching response genes are clear in the modular gene expression analysis we conduct but are much more difficult to discern in single gene analyses. Furthermore, the ETS-family module shows repeated differences in expression among coral colonies grown in the same common garden environment, suggesting a heritable genetic or epigenetic basis for these expression polymorphisms. This finding suggests that these corals harbor high levels of gene-network variation, which could facilitate rapid evolution in the face of environmental change. PMID:26710855
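
The idea of summarising many genes by a few coexpressed modules can be sketched in a few lines. This is an illustrative simplification, not the pipeline used in the study: modules are taken as connected components of a thresholded correlation graph, each module is summarised by the mean of its z-scored members, and the gene names, expression values and 0.8 threshold are all hypothetical.

```python
from math import sqrt


def zscore(values):
    """Standardise a sequence to mean 0 and sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def correlation(x, y):
    """Pearson correlation computed from z-scored values."""
    return sum(a * b for a, b in zip(zscore(x), zscore(y))) / (len(x) - 1)


def coexpression_modules(expr, threshold=0.8):
    """Connected components of the graph linking genes with r >= threshold."""
    genes = sorted(expr)
    parent = {g: g for g in genes}

    def find(g):
        # Union-find root lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if correlation(expr[g], expr[h]) >= threshold:
                parent[find(g)] = find(h)

    modules = {}
    for g in genes:
        modules.setdefault(find(g), []).append(g)
    return sorted(modules.values())


def module_profile(expr, module):
    """Per-sample mean of the z-scored expression of a module's genes."""
    z = [zscore(expr[g]) for g in module]
    return [sum(col) / len(col) for col in zip(*z)]


# Hypothetical expression of four genes across six samples.
expr = {
    "hsp_a": [1, 2, 3, 4, 5, 6],
    "hsp_b": [2, 3, 5, 6, 8, 9],
    "ecm_a": [3, 1, 4, 1, 5, 2],
    "ecm_b": [6, 2, 8, 3, 10, 4],
}
modules = coexpression_modules(expr)
profiles = [module_profile(expr, m) for m in modules]
print(modules)
```

Each module profile can then be tested against a phenotype such as later bleaching, which is one way a handful of module-level variables can stand in for thousands of single-gene tests.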

  1. Gene Networks in the Wild: Identifying Transcriptional Modules that Mediate Coral Resistance to Experimental Heat Stress.

    PubMed

    Rose, Noah H; Seneca, Francois O; Palumbi, Stephen R

    2016-01-01

    Organisms respond to environmental variation partly through changes in gene expression, which underlie both homeostatic and acclimatory responses to environmental stress. In some cases, so many genes change in expression in response to different influences that understanding expression patterns for all these individual genes becomes difficult. To reduce this problem, we use a systems genetics approach to show that variation in the expression of thousands of genes of reef-building corals can be explained as variation in the expression of a small number of coexpressed "modules." Modules were often enriched for specific cellular functions and varied predictably among individuals, experimental treatments, and physiological state. We describe two transcriptional modules for which expression levels immediately after heat stress predict bleaching a day later. One of these early "bleaching modules" is enriched for sequence-specific DNA-binding proteins, particularly E26 transformation-specific (ETS)-family transcription factors. The other module is enriched for extracellular matrix proteins. These classes of bleaching response genes are clear in the modular gene expression analysis we conduct but are much more difficult to discern in single gene analyses. Furthermore, the ETS-family module shows repeated differences in expression among coral colonies grown in the same common garden environment, suggesting a heritable genetic or epigenetic basis for these expression polymorphisms. This finding suggests that these corals harbor high levels of gene-network variation, which could facilitate rapid evolution in the face of environmental change. PMID:26710855

  2. An extensive (co-)expression analysis tool for the cytochrome P450 superfamily in Arabidopsis thaliana

    PubMed Central

    Ehlting, Jürgen; Sauveplane, Vincent; Olry, Alexandre; Ginglinger, Jean-François; Provart, Nicholas J; Werck-Reichhart, Danièle

    2008-01-01

    Background Sequencing of the first plant genomes has revealed that cytochromes P450 have evolved to become the largest family of enzymes in secondary metabolism. The proportion of P450 enzymes with characterized biochemical function(s) is however very small. If P450 diversification mirrors evolution of chemical diversity, this points to an unexpectedly poor understanding of plant metabolism. We assumed that extensive analysis of gene expression might guide towards the function of P450 enzymes, and highlight overlooked aspects of plant metabolism. Results We have created a comprehensive database, 'CYPedia', describing P450 gene expression in four data sets: organs and tissues, stress response, hormone response, and mutants of Arabidopsis thaliana, based on public Affymetrix ATH1 microarray expression data. P450 expression was then combined with the expression of 4,130 re-annotated genes, predicted to act in plant metabolism, for co-expression analyses. Based on the annotation of co-expressed genes from diverse pathway annotation databases, co-expressed pathways were identified. Predictions were validated for most P450s with known functions. As examples, co-expression results for P450s related to plastidial functions/photosynthesis, and to phenylpropanoid, triterpenoid and jasmonate metabolism are highlighted here. Conclusion The large scale hypothesis generation tools presented here provide leads to new pathways, unexpected functions, and regulatory networks for many P450s in plant metabolism. These can now be exploited by the community to validate the proposed functions experimentally using reverse genetics, biochemistry, and metabolic profiling. PMID:18433503

  3. Network enrichment analysis: extension of gene-set enrichment analysis to gene networks

    PubMed Central

    2012-01-01

    Background Gene-set enrichment analyses (GEA or GSEA) are commonly used for biological characterization of an experimental gene-set. This is done by finding known functional categories, such as pathways or Gene Ontology terms, that are over-represented in the experimental set; the assessment is based on an overlap statistic. Rich biological information in terms of gene interaction network is now widely available, but this topological information is not used by GEA, so there is a need for methods that exploit this type of information in high-throughput data analysis. Results We developed a method of network enrichment analysis (NEA) that extends the overlap statistic in GEA to network links between genes in the experimental set and those in the functional categories. For the crucial step in statistical inference, we developed a fast network randomization algorithm in order to obtain the distribution of any network statistic under the null hypothesis of no association between an experimental gene-set and a functional category. We illustrate the NEA method using gene and protein expression data from a lung cancer study. Conclusions The results indicate that the NEA method is more powerful than the traditional GEA, primarily because the relationships between gene sets were more strongly captured by network connectivity than by simple overlaps. PMID:22966941
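
The two steps described above, counting network links between an experimental gene set and a functional category and comparing that count with a randomized-network null, can be sketched as follows. This is a minimal illustration under stated assumptions: the randomization is a plain degree-preserving double edge swap rather than the authors' fast algorithm, the z-score is one possible summary of the null distribution, and the toy network, gene names and function names are hypothetical.

```python
import random


def count_links(edges, set_a, set_b):
    """Number of edges joining a gene in set_a to a gene in set_b."""
    return sum(
        1 for u, v in edges
        if (u in set_a and v in set_b) or (u in set_b and v in set_a)
    )


def degrees(edges):
    """Degree of every node appearing in the edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def rewire(edges, n_swaps, rng):
    """Randomize a network by double edge swaps, keeping node degrees fixed."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue  # swap could create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # swap would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def nea_z_score(edges, gene_set, category, n_random=200, seed=0):
    """Observed link count and its z-score against rewired networks."""
    rng = random.Random(seed)
    observed = count_links(edges, gene_set, category)
    null = [
        count_links(rewire(edges, 10 * len(edges), rng), gene_set, category)
        for _ in range(n_random)
    ]
    mean = sum(null) / len(null)
    var = sum((x - mean) ** 2 for x in null) / (len(null) - 1)
    z = (observed - mean) / var ** 0.5 if var > 0 else float("nan")
    return observed, z


# Hypothetical toy network: g* = experimental set, p* = functional category.
edges = [
    ("g1", "p1"), ("g1", "p2"), ("g2", "p1"), ("g2", "p2"),
    ("g3", "x1"), ("x1", "x2"), ("x2", "x3"), ("x3", "p1"),
    ("x4", "x5"), ("x5", "g3"), ("x4", "x1"), ("x3", "x5"),
]
gene_set = {"g1", "g2", "g3"}
category = {"p1", "p2"}

observed, z = nea_z_score(edges, gene_set, category)
print(f"observed links: {observed}, z-score: {z:.2f}")
```

Note that the gene set and the category share no genes in this toy example, so an overlap statistic would report nothing, while the link count still registers the connection between them.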

  4. Inferring transcriptional gene regulation network of starch metabolism in Arabidopsis thaliana leaves using graphical Gaussian model

    PubMed Central

    2012-01-01

    Background Starch serves as a temporal storage of carbohydrates in plant leaves during day/night cycles. To study transcriptional regulatory modules of this dynamic metabolic process, we conducted gene regulation network analysis based on small-sample inference of graphical Gaussian model (GGM). Results Time-series significant analysis was applied for Arabidopsis leaf transcriptome data to obtain a set of genes that are highly regulated under a diurnal cycle. A total of 1,480 diurnally regulated genes included 21 starch metabolic enzymes, 6 clock-associated genes, and 106 transcription factors (TF). A starch-clock-TF gene regulation network comprising 117 nodes and 266 edges was constructed by GGM from these 133 significant genes that are potentially related to the diurnal control of starch metabolism. From this network, we found that β-amylase 3 (b-amy3: At4g17090), which participates in starch degradation in chloroplast, is the most frequently connected gene (a hub gene). The robustness of gene-to-gene regulatory network was further analyzed by TF binding site prediction and by evaluating global co-expression of TFs and target starch metabolic enzymes. As a result, two TFs, indeterminate domain 5 (AtIDD5: At2g02070) and constans-like (COL: At2g21320), were identified as positive regulators of starch synthase 4 (SS4: At4g18240). The inference model of AtIDD5-dependent positive regulation of SS4 gene expression was experimentally supported by decreased SS4 mRNA accumulation in Atidd5 mutant plants during the light period of both short and long day conditions. COL was also shown to positively control SS4 mRNA accumulation. Furthermore, the knockout of AtIDD5 and COL led to deformation of chloroplast and its contained starch granules. This deformity also affected the number of starch granules per chloroplast, which increased significantly in both knockout mutant lines. Conclusions In this study, we utilized a systematic approach of microarray analysis to discover
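
    In a graphical Gaussian model, an edge between two genes reflects a non-zero partial correlation, that is, the correlation that remains after conditioning on all other genes. A minimal sketch of that calculation is given below. The function names and the toy correlation matrix are hypothetical, and the optional shrinkage toward the identity matrix is only a simple stand-in for the small-sample inference used in the study.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(corr, shrinkage=0.0):
    """Partial correlations from a correlation matrix via the precision matrix.

    Assumption: off-diagonal entries are scaled by (1 - shrinkage), i.e. the
    matrix is shrunk toward the identity, to keep the inverse stable when
    samples are few.
    """
    n = len(corr)
    shrunk = [[1.0 if i == j else (1.0 - shrinkage) * corr[i][j] for j in range(n)]
              for i in range(n)]
    prec = invert(shrunk)
    return [[1.0 if i == j else -prec[i][j] / (prec[i][i] * prec[j][j]) ** 0.5
             for j in range(n)]
            for i in range(n)]


# Toy correlation matrix for three hypothetical genes in a chain g0 - g1 - g2:
# g0 and g2 are correlated (0.25) only through g1.
corr = [[1.00, 0.50, 0.25],
        [0.50, 1.00, 0.50],
        [0.25, 0.50, 1.00]]
pcor = partial_correlations(corr)
print([[round(v, 3) for v in row] for row in pcor])
```

    In this toy case the partial correlation between g0 and g2 vanishes, so a GGM would draw no direct edge between them even though their ordinary correlation is 0.25.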

  5. Differential co-expression analysis of rheumatoid arthritis with microarray data.

    PubMed

    Wang, Kunpeng; Zhao, Liqiang; Liu, Xuefeng; Hao, Zhenyong; Zhou, Yong; Yang, Chuandong; Li, Hongqiang

    2014-11-01

    The aim of the present study was to investigate the underlying molecular mechanisms of rheumatoid arthritis (RA) using microarray expression profiles from osteoarthritis and RA patients, to improve diagnosis and treatment strategies for the condition. The gene expression profile of GSE27390 was downloaded from Gene Expression Omnibus, including 19 samples from patients with RA (n=9) or osteoarthritis (n=10). First, the differentially expressed genes (DEGs) were obtained with the thresholds of |logFC|>1.0 and P<0.05, using the t-test method in the LIMMA package. Then, differentially co-expressed genes (DCGs) and differentially co-expressed links (DCLs) were screened with q<0.25 using the Differential Coexpression Analysis and Differential Regulation Analysis of Gene Expression Microarray Data package. Next, pathway enrichment analysis for DCGs was performed with the Database for Annotation, Visualization and Integrated Discovery, and the DCLs associated with RA were selected by comparing the obtained DCLs with known transcription factor (TF)-targets in the TRANSFAC database. Finally, the obtained TFs were mapped to the known TF-targets to construct the network using Cytoscape software. A total of 1755 DEGs, 457 DCGs and 101988 DCLs were obtained and there were 20 TFs in the obtained six TF-target relations (STAT3-TNF, PBX1-PLAU, SOCS3-STAT3, GATA1-ETS2, ETS1-ICAM4 and CEBPE-GATA1) and 457 DCGs. A number of TF-target relations in the constructed network were not within DCLs when the TF and target gene were DCGs. The identified TFs may have an important role in the pathogenesis of RA and have the potential to be used as biomarkers for the development of novel diagnostic and therapeutic strategies for RA. PMID:25118911
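
    The DEG thresholds quoted in the abstract (|logFC|>1.0 and P<0.05) amount to a simple filter over a table of per-gene test results. The sketch below assumes such a table already exists. The field names, the function name, and the toy rows are hypothetical, and the statistical test itself (performed with LIMMA in the study) is not reproduced.

```python
def select_degs(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return genes whose absolute log fold change and P value pass both cutoffs."""
    return [row["gene"] for row in results
            if abs(row["logFC"]) > lfc_cutoff and row["p"] < p_cutoff]


# Toy per-gene results (hypothetical genes and values).
results = [
    {"gene": "g1", "logFC": 1.8, "p": 0.001},   # up, significant: kept
    {"gene": "g2", "logFC": -1.2, "p": 0.030},  # down, significant: kept
    {"gene": "g3", "logFC": 0.4, "p": 0.001},   # effect too small: dropped
    {"gene": "g4", "logFC": 2.5, "p": 0.200},   # not significant: dropped
]
degs = select_degs(results)
print(degs)
```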

  6. MouseNet v2: a database of gene networks for studying the laboratory mouse and eight other model vertebrates

    PubMed Central

    Kim, Eiru; Hwang, Sohyun; Kim, Hyojin; Shim, Hongseok; Kang, Byunghee; Yang, Sunmo; Shim, Jae Ho; Shin, Seung Yeon; Marcotte, Edward M.; Lee, Insuk

    2016-01-01

    The laboratory mouse, Mus musculus, is one of the most important animal tools in biomedical research. Functional characterization of mouse genes has therefore been a long-standing goal in mammalian and human genetics. Although large-scale knockout phenotyping is in progress through international collaborative efforts, a large portion of the mouse genome is still poorly characterized for cellular functions and associations with disease phenotypes. A genome-scale functional network of mouse genes, MouseNet, was previously developed in the context of the MouseFunc competition, which allowed only limited input data for network inference. Here, we present an improved mouse co-functional network, MouseNet v2 (available at http://www.inetbio.org/mousenet), which covers 17 714 genes (>88% of coding genome) with 788 080 links, along with a companion web server for network-assisted functional hypothesis generation. The network database has been substantially improved by a large expansion of genomics data. For example, the MouseNet v2 database contains 183 co-expression networks inferred from 8154 public microarray samples. We demonstrated that MouseNet v2 is predictive for mammalian phenotypes as well as human diseases, which suggests its usefulness in discovery of novel disease genes and dissection of disease pathways. Furthermore, the MouseNet v2 database provides functional networks for eight other vertebrate models used in various research fields. PMID:26527726

  7. Network-based survival-associated module biomarker and its crosstalk with cell death genes in ovarian cancer

    PubMed Central

    Jin, Nana; Wu, Hao; Miao, Zhengqiang; Huang, Yan; Hu, Yongfei; Bi, Xiaoman; Wu, Deng; Qian, Kun; Wang, Liqiang; Wang, Changliang; Wang, Hongwei; Li, Kongning; Li, Xia; Wang, Dong

    2015-01-01

    Ovarian cancer remains a dismal disease that is usually diagnosed in the late, metastatic stages; there is therefore a growing realization of the critical need to develop effective biomarkers for understanding the underlying mechanisms. Although existing evidence demonstrates the important role of single genetic abnormalities in pathogenesis, the perturbations of interactors in the complex network are often ignored. Moreover, a large gap in ovarian cancer diagnosis and treatment still needs to be bridged. In this work, we adopted a network-based survival-associated approach to capture a 12-gene network module based on a differential co-expression PPI network in advanced-stage, high-grade ovarian serous cystadenocarcinoma. Then, regulatory genes (protein-coding genes and non-coding genes) directly interacting with the module were found to overlap significantly with cell death genes. More importantly, these overlapping genes clustered tightly together, pointing to the module and revealing the crosstalk between the network-based survival-associated module and cell death in ovarian cancer. PMID:26099452

  8. The murine VpreB1 and VpreB2 genes both encode a protein of the surrogate light chain and are co-expressed during B cell development.

    PubMed

    Dul, J L; Argon, Y; Winkler, T; ten Boekel, E; Melchers, F; Mårtensson, I L

    1996-04-01

    The surrogate light chain is composed of two polypeptides, VpreB and lambda 5. In the mouse there are two VpreB genes which are 99% identical within the coding regions. Extensive restriction enzyme mapping and sequencing of these two genes showed that only the coding region and immediate 5' and 3' flanking sequences exhibited such high homology. More distal sequences have diverged considerably. The region 5' of the respective gene directed transcription of a reporter gene in a pre-B cell line, indicating that it contained promoter, and perhaps enhancer function. The VpreB2 gene is functional, as it directed the production in COS cells of a 16-kDa protein that assembled with lambda 5 and was recognized by a VpreB-specific monoclonal antibody. Using transfected COS cells expressing either VpreB1 or VpreB2, a PCR assay was developed to examine the steady state level of transcripts from each gene. When this assay was applied to a number of cell lines representing early stages of B cell differentiation, co-expression of the two genes was observed in every case. VpreB1 and VpreB2 were co-expressed in the fetal liver of CB17 mice, where peak expression of each gene occurred at days 16-17 of gestation. Similarly, adult bone marrow from several strains of mice expressed both genes. In sorted bone marrow cells expression of both VpreB genes was detected in pro-B/pre-BI and large pre-BII cells, while the RNA steady state levels were at least 100-fold lower in small pre-BII and immature/mature B cells. Finally, single-cell reverse transcriptase-polymerase chain reaction on such sorted bone marrow cells detected VpreB1 and VpreB2 expression in at least 30% of all pro-B/pre-BI cells and large Ig heavy chain, surrogate light chain (pre-B receptor) expressing pre-BII cells. These results demonstrate that the control of expression of the two VpreB genes overlaps during development. They suggest that both VpreB1 and VpreB2 polypeptides can assemble with lambda 5 and mu to form pre

  9. Generation of oscillating gene regulatory network motifs

    NASA Astrophysics Data System (ADS)

    van Dorp, M.; Lannoo, B.; Carlon, E.

    2013-07-01

    Using an improved version of an evolutionary algorithm originally proposed by François and Hakim [Proc. Natl. Acad. Sci. USA 101, 580 (2004), doi:10.1073/pnas.0304532101], we generated small gene regulatory networks in which the concentration of a target protein oscillates in time. These networks may serve as candidates for oscillatory modules to be found in larger regulatory networks and protein interaction networks. The algorithm was run 10^5 times to produce a large set of oscillating modules, which were systematically classified and analyzed. The robustness of the oscillations against variations of the kinetic rates was also determined, to filter out the least robust cases. Furthermore, we show that the set of evolved networks can serve as a database of models whose behavior can be compared to experimentally observed oscillations. The algorithm found three smallest (core) oscillators in which nonlinearities and number of components are minimal. Two of those are two-gene modules: the mixed feedback loop, already discussed in the literature, and an autorepressed gene coupled with a heterodimer. The third one is a single gene module which is competitively regulated by a monomer and a dimer. The evolutionary algorithm also generated larger oscillating networks, which are in part extensions of the three core modules and in part genuinely new modules. The latter includes oscillators which do not rely on feedback induced by transcription factors, but are purely of post-transcriptional type. Analysis of post-transcriptional mechanisms of oscillation may provide useful information for circadian clock research, as recent experiments showed that circadian rhythms are maintained even in the absence of transcription.

  10. Positioning the expanded akirin gene family of Atlantic salmon within the transcriptional networks of myogenesis

    SciTech Connect

    Macqueen, Daniel J.; Bower, Neil I.; Johnston, Ian A.

    2010-10-01

    Research highlights: → The expanded akirin gene family of Atlantic salmon was characterised. → akirin paralogues are regulated between mono- and multi-nucleated muscle cells. → akirin paralogues positioned within known genetic networks controlling myogenesis. → Co-expression of akirin paralogues is evident across cell types/during myogenesis. → Selection has likely maintained common regulatory elements among akirin paralogues. -- Abstract: Vertebrate akirin genes usually form a family with one-to-three members that regulate gene expression during the innate immune response, carcinogenesis and myogenesis. We recently established that an expanded family of eight akirin genes is conserved across salmonid fish. Here, we measured mRNA levels of the akirin family of Atlantic salmon (Salmo salar L.) during the differentiation of primary myoblasts cultured from fast-skeletal muscle. Using hierarchical clustering and correlation, the data was positioned into a network of expression profiles including twenty further genes that regulate myogenesis. akirin1(2b) was not significantly regulated during the maturation of the cell culture. akirin2(1a) and 2(1b), along with IGF-II and several igfbps, were most highly expressed in mononuclear cells, then significantly and constitutively downregulated as differentiation proceeded and myotubes formed/matured. Conversely, akirin1(1a), 1(1b), 1(2a), 2(2a) and 2(2b) were expressed at lowest levels when mononuclear cells dominated the culture and highest levels when confluent layers of myotubes were evident. However, akirin1(2a) and 2(2a) were first upregulated earlier than akirin1(1a), 1(1b) and 2(2b), when rates of myoblast proliferation were highest. Interestingly, akirin1(1b), 1(2a), 2(2a) and 2(2b) formed part of a module of co-expressed genes involved in muscle differentiation, including myod1a, myog, mef2a, 14-3-3β and 14-3-3γ. All akirin paralogues were expressed ubiquitously across ten

  11. COMODO: an adaptive coclustering strategy to identify conserved coexpression modules between organisms.

    PubMed

    Zarrineh, Peyman; Fierro, Ana C; Sánchez-Rodríguez, Aminael; De Moor, Bart; Engelen, Kristof; Marchal, Kathleen

    2011-04-01

    Increasingly large-scale expression compendia for different species are becoming available. By exploiting the modularity of the coexpression network, these compendia can be used to identify biological processes for which the expression behavior is conserved over different species. However, comparing module networks across species is not trivial. The definition of a biologically meaningful module is not a fixed one and changing the distance threshold that defines the degree of coexpression gives rise to different modules. As a result when comparing modules across species, many different partially overlapping conserved module pairs across species exist and deciding which pair is most relevant is hard. Therefore, we developed a method referred to as conserved modules across organisms (COMODO) that uses an objective selection criterium to identify conserved expression modules between two species. The method uses as input microarray data and a gene homology map and provides as output pairs of conserved modules and searches for the pair of modules for which the number of sharing homologs is statistically most significant relative to the size of the linked modules. To demonstrate its principle, we applied COMODO to study coexpression conservation between the two well-studied bacteria Escherichia coli and Bacillus subtilis. COMODO is available at: http://homes.esat.kuleuven.be/∼kmarchal/Supplementary_Information_Zarrineh_2010/comodo/index.html. PMID:21149270
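
    The abstract describes scoring a module pair by how significant the number of shared homologs is relative to the module sizes, without giving the statistic. One common way to express that kind of criterion is a hypergeometric tail probability, sketched below. This is an assumption made for illustration rather than the documented COMODO criterion, and the function name and parameters are hypothetical. The sketch also assumes a one-to-one homology map.

```python
from math import comb


def homolog_overlap_pvalue(n_pairs, size_a, size_b, shared):
    """P(X >= shared) for X ~ Hypergeometric(n_pairs, size_a, size_b).

    n_pairs: homologous gene pairs in the homology map (the population)
    size_a:  pairs whose species-1 member lies in module A
    size_b:  pairs whose species-2 member lies in module B
    shared:  pairs with one member in module A and the other in module B
    """
    upper = min(size_a, size_b)
    total = comb(n_pairs, size_b)
    tail = sum(comb(size_a, k) * comb(n_pairs - size_a, size_b - k)
               for k in range(shared, upper + 1))
    return tail / total


# Toy example: 10 homolog pairs, module A covers 4, module B covers 3,
# and 2 pairs are shared between the modules.
p = homolog_overlap_pvalue(10, 4, 3, 2)
print(p)
```

    Under this assumption, candidate module pairs obtained at different coexpression thresholds could be ranked by the resulting P value, with smaller values indicating a more notable overlap for the given module sizes.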

  12. Gene regulatory networks and the underlying biology of developmental toxicity

    EPA Science Inventory

    Embryonic cells are specified by large-scale networks of functionally linked regulatory genes. Knowledge of the relevant gene regulatory networks is essential for understanding phenotypic heterogeneity that emerges from disruption of molecular functions, cellular processes or sig...

  13. Hybrid stochastic simplifications for multiscale gene networks

    PubMed Central

    Crudu, Alina; Debussche, Arnaud; Radulescu, Ovidiu

    2009-01-01

    Background Stochastic simulation of gene networks by Markov processes has important applications in molecular biology. The complexity of exact simulation algorithms scales with the number of discrete jumps to be performed. Approximate schemes reduce the computational time by reducing the number of simulated discrete events. Also, answering important questions about the relation between network topology and intrinsic noise generation and propagation should be based on general mathematical results. These general results are difficult to obtain for exact models. Results We propose a unified framework for hybrid simplifications of Markov models of multiscale stochastic gene networks dynamics. We discuss several possible hybrid simplifications, and provide algorithms to obtain them from pure jump processes. In hybrid simplifications, some components are discrete and evolve by jumps, while other components are continuous. Hybrid simplifications are obtained by partial Kramers-Moyal expansion [1-3] which is equivalent to the application of the central limit theorem to a sub-model. By averaging and variable aggregation we drastically reduce simulation time and eliminate non-critical reactions. Hybrid and averaged simplifications can be used for more effective simulation algorithms and for obtaining general design principles relating noise to topology and time scales. The simplified models reproduce with good accuracy the stochastic properties of the gene networks, including waiting times in intermittence phenomena, fluctuation amplitudes and stationary distributions. The methods are illustrated on several gene network examples. Conclusion Hybrid simplifications can be used for onion-like (multi-layered) approaches to multi-scale biochemical systems, in which various descriptions are used at various scales. Sets of discrete and continuous variables are treated with different methods and are coupled together in a physically justified approach. PMID:19735554
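
    The statement that exact simulation cost scales with the number of discrete jumps can be seen in a minimal Gillespie-type simulation of a pure jump process. The sketch below uses a hypothetical birth-death model of a single gene product, with rate names and values chosen only for illustration. It shows the exact algorithm that hybrid schemes aim to speed up, not the hybrid simplifications themselves.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Exact stochastic simulation of production (rate k_prod) and
    first-order degradation (rate k_deg * x) of one molecular species.

    Each loop iteration simulates exactly one discrete jump, so the run
    time grows with the number of jumps that occur before t_end.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod
        a_deg = k_deg * x
        a_total = a_prod + a_deg
        if a_total == 0:
            break  # no reaction can fire any more
        t += rng.expovariate(a_total)  # waiting time to the next jump
        if t > t_end:
            break
        if rng.random() * a_total < a_prod:
            x += 1  # production event
        else:
            x -= 1  # degradation event (only possible when x > 0)
        times.append(t)
        counts.append(x)
    return times, counts


times, counts = gillespie_birth_death(k_prod=10.0, k_deg=1.0, x0=0, t_end=5.0)
print(len(times) - 1, "jumps simulated; final count:", counts[-1])
```

    Raising k_prod increases the number of jumps, and therefore the cost, roughly in proportion, which is the regime where treating abundant species as continuous variables becomes attractive.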

  14. Apolipoprotein E*4 (APOE*4) Genotype Is Associated with Altered Levels of Glutamate Signaling Proteins and Synaptic Coexpression Networks in the Prefrontal Cortex in Mild to Moderate Alzheimer Disease.

    PubMed

    Sweet, Robert A; MacDonald, Matthew L; Kirkwood, Caitlin M; Ding, Ying; Schempf, Tadhg; Jones-Laughner, Jackie; Kofler, Julia; Ikonomovic, Milos D; Lopez, Oscar L; Garver, Megan E; Fitz, Nicholas F; Koldamova, Radosveta; Yates, Nathan A

    2016-07-01

    It has been hypothesized that Alzheimer disease (AD) is primarily a disorder of the synapse. However, assessment of the synaptic proteome in AD subjects has been limited to a small number of proteins and often included subjects with end-stage pathology. Protein from prefrontal cortex gray matter of 59 AD subjects with mild to moderate dementia and 12 normal elderly subjects was assayed using targeted mass spectrometry to quantify 191 synaptically expressed proteins. The profile of synaptic protein expression clustered AD subjects into two groups. One of these was characterized by reduced expression of glutamate receptor proteins and significantly increased synaptic protein network coexpression, and was associated with Apolipoprotein E*4 (APOE*4) carrier status. The second group, by contrast, showed few differences from control subjects. A subset of AD subjects had altered prefrontal cortex synaptic proteostasis for glutamate receptors and their signaling partners. Efforts to therapeutically target glutamate receptors in AD may have outcomes dependent on APOE*4 genotype. PMID:27103636

  15. Construction of gene regulatory networks using biclustering and bayesian networks

    PubMed Central

    2011-01-01

    Background Understanding gene interactions in complex living systems can be seen as the ultimate goal of the systems biology revolution. Hence, to elucidate disease ontology fully and to reduce the cost of drug development, gene regulatory networks (GRNs) have to be constructed. During the last decade, many GRN inference algorithms based on genome-wide data have been developed to unravel the complexity of gene regulation. Time series transcriptomic data measured by genome-wide DNA microarrays are traditionally used for GRN modelling. One of the major problems with microarrays is that a dataset consists of relatively few time points with respect to the large number of genes. Dimensionality is one of the interesting problems in GRN modelling. Results In this paper, we develop a biclustering function enrichment analysis toolbox (BicAT-plus) to study the effect of biclustering in reducing data dimensions. The network generated from our system was validated via available interaction databases and was compared with previous methods. The results revealed the performance of our proposed method. Conclusions Because of the sparse nature of GRNs, the results of biclustering techniques differ significantly from those of previous methods. PMID:22018164

  16. Engineering stability in gene networks by autoregulation

    NASA Astrophysics Data System (ADS)

    Becskei, Attila; Serrano, Luis

    2000-06-01

    The genetic and biochemical networks which underlie such things as homeostasis in metabolism and the developmental programs of living cells, must withstand considerable variations and random perturbations of biochemical parameters. These occur as transient changes in, for example, transcription, translation, and RNA and protein degradation. The intensity and duration of these perturbations differ between cells in a population. The unique state of cells, and thus the diversity in a population, is owing to the different environmental stimuli the individual cells experience and the inherent stochastic nature of biochemical processes (for example, refs 5 and 6). It has been proposed, but not demonstrated, that autoregulatory, negative feedback loops in gene circuits provide stability, thereby limiting the range over which the concentrations of network components fluctuate. Here we have designed and constructed simple gene circuits consisting of a regulator and transcriptional repressor modules in Escherichia coli and we show the gain of stability produced by negative feedback.

  17. Engineering stability in gene networks by autoregulation.

    PubMed

    Becskei, A; Serrano, L

    2000-06-01

    The genetic and biochemical networks which underlie such things as homeostasis in metabolism and the developmental programs of living cells, must withstand considerable variations and random perturbations of biochemical parameters. These occur as transient changes in, for example, transcription, translation, and RNA and protein degradation. The intensity and duration of these perturbations differ between cells in a population. The unique state of cells, and thus the diversity in a population, is owing to the different environmental stimuli the individual cells experience and the inherent stochastic nature of biochemical processes (for example, refs 5 and 6). It has been proposed, but not demonstrated, that autoregulatory, negative feedback loops in gene circuits provide stability, thereby limiting the range over which the concentrations of network components fluctuate. Here we have designed and constructed simple gene circuits consisting of a regulator and transcriptional repressor modules in Escherichia coli and we show the gain of stability produced by negative feedback. PMID:10850721

  18. Pathway network inference from gene expression data

    PubMed Central

    2014-01-01

    Background The development of high-throughput omics technologies enabled genome-wide measurements of the activity of cellular elements and provides the analytical resources for the progress of the Systems Biology discipline. Analysis and interpretation of gene expression data has evolved from the gene to the pathway and interaction level, i.e. from the detection of differentially expressed genes, to the establishment of gene interaction networks and the identification of enriched functional categories. Still, the understanding of biological systems requires a further level of analysis that addresses the characterization of the interaction between functional modules. Results We present a novel computational methodology to study the functional interconnections among the molecular elements of a biological system. The PANA approach uses high-throughput genomics measurements and a functional annotation scheme to extract an activity profile from each functional block (or pathway), followed by machine-learning methods to infer the relationships between these functional profiles. The result is a global, interconnected network of pathways that represents the functional cross-talk within the molecular system. We have applied this approach to describe the functional transcriptional connections during the yeast cell cycle and to identify pathways that change their connectivity in a disease condition using an Alzheimer example. Conclusions PANA is a useful tool to deepen our understanding of the functional interdependences that operate within complex biological systems. We show the approach is algorithmically consistent and the inferred network is well supported by the available functional data. The method allows the dissection of the molecular basis of the functional connections and we describe the different regulatory mechanisms that explain the network's topology obtained for the yeast cell cycle data. PMID:25032889

  19. Gene Regulatory Networks Elucidating Huanglongbing Disease Mechanisms

    PubMed Central

    Martinelli, Federico; Reagan, Russell L.; Uratsu, Sandra L.; Phu, My L.; Albrecht, Ute; Zhao, Weixiang; Davis, Cristina E.; Bowman, Kim D.; Dandekar, Abhaya M.

    2013-01-01

    Next-generation sequencing was exploited to gain deeper insight into the response to infection by Candidatus Liberibacter asiaticus (CaLas), especially the immune dysregulation and metabolic dysfunction caused by source-sink disruption. Previous fruit transcriptome data were compared with additional RNA-Seq data in three tissues: immature fruit, and young and mature leaves. Four categories of orchard trees were studied: symptomatic, asymptomatic, apparently healthy, and healthy. Principal component analysis found distinct expression patterns between immature and mature fruits and leaf samples for all four categories of trees. A predicted protein-protein interaction network identified HLB-regulated genes for sugar transporters playing key roles in the overall plant responses. Gene set and pathway enrichment analyses highlight the role of sucrose and starch metabolism in disease symptom development in all tissues. HLB-regulated genes (glucose-phosphate-transporter, invertase, starch-related genes) would likely determine the source-sink relationship disruption. In infected leaves, transcriptomic changes were observed for light reactions genes (downregulation), sucrose metabolism (upregulation), and starch biosynthesis (upregulation). In parallel, symptomatic fruits over-expressed genes involved in photosynthesis, sucrose and raffinose metabolism, and downregulated starch biosynthesis. We visualized gene networks between tissues inducing a source-sink shift. CaLas alters the hormone crosstalk, resulting in weak and ineffective tissue-specific plant immune responses necessary for bacterial clearance. Accordingly, expression of WRKYs (including WRKY70) was higher in fruits than in leaves. Systemic acquired responses were inadequately activated in young leaves, generally considered the sites where most new infections occur. PMID:24086326

  20. Developmental Progression in the Coral Acropora digitifera Is Controlled by Differential Expression of Distinct Regulatory Gene Networks

    PubMed Central

    Reyes-Bermudez, Alejandro; Villar-Briones, Alejandro; Ramirez-Portilla, Catalina; Hidaka, Michio; Mikheyev, Alexander S.

    2016-01-01

    Corals belong to the most basal class of the Phylum Cnidaria, which is considered the sister group of bilaterian animals, and thus have become an emerging model to study the evolution of developmental mechanisms. Although cell renewal, differentiation, and maintenance of pluripotency are cellular events shared by multicellular animals, the cellular basis of these fundamental biological processes are still poorly understood. To understand how changes in gene expression regulate morphogenetic transitions at the base of the eumetazoa, we performed quantitative RNA-seq analysis during Acropora digitifera’s development. We collected embryonic, larval, and adult samples to characterize stage-specific transcription profiles, as well as broad expression patterns. Transcription profiles reconstructed development revealing two main expression clusters. The first cluster grouped blastula and gastrula and the second grouped subsequent developmental time points. Consistently, we observed clear differences in gene expression between early and late developmental transitions, with higher numbers of differentially expressed genes and fold changes around gastrulation. Furthermore, we identified three coexpression clusters that represented discrete gene expression patterns. During early transitions, transcriptional networks seemed to regulate cellular fate and morphogenesis of the larval body. In late transitions, these networks seemed to play important roles preparing planulae for switch in lifestyle and regulation of adult processes. Although developmental progression in A. digitifera is regulated to some extent by differential coexpression of well-defined gene networks, stage-specific transcription profiles appear to be independent entities. While negative regulation of transcription is predominant in early development, cell differentiation was upregulated in larval and adult stages. PMID:26941230

  1. Developmental Progression in the Coral Acropora digitifera Is Controlled by Differential Expression of Distinct Regulatory Gene Networks.

    PubMed

    Reyes-Bermudez, Alejandro; Villar-Briones, Alejandro; Ramirez-Portilla, Catalina; Hidaka, Michio; Mikheyev, Alexander S

    2016-03-01

    Corals belong to the most basal class of the Phylum Cnidaria, which is considered the sister group of bilaterian animals, and thus have become an emerging model to study the evolution of developmental mechanisms. Although cell renewal, differentiation, and maintenance of pluripotency are cellular events shared by multicellular animals, the cellular basis of these fundamental biological processes are still poorly understood. To understand how changes in gene expression regulate morphogenetic transitions at the base of the eumetazoa, we performed quantitative RNA-seq analysis during Acropora digitifera's development. We collected embryonic, larval, and adult samples to characterize stage-specific transcription profiles, as well as broad expression patterns. Transcription profiles reconstructed development revealing two main expression clusters. The first cluster grouped blastula and gastrula and the second grouped subsequent developmental time points. Consistently, we observed clear differences in gene expression between early and late developmental transitions, with higher numbers of differentially expressed genes and fold changes around gastrulation. Furthermore, we identified three coexpression clusters that represented discrete gene expression patterns. During early transitions, transcriptional networks seemed to regulate cellular fate and morphogenesis of the larval body. In late transitions, these networks seemed to play important roles preparing planulae for switch in lifestyle and regulation of adult processes. Although developmental progression in A. digitifera is regulated to some extent by differential coexpression of well-defined gene networks, stage-specific transcription profiles appear to be independent entities. While negative regulation of transcription is predominant in early development, cell differentiation was upregulated in larval and adult stages. PMID:26941230

  2. Mapping gene regulatory networks in Drosophila eye development by large-scale transcriptome perturbations and motif inference.

    PubMed

    Potier, Delphine; Davie, Kristofer; Hulselmans, Gert; Naval Sanchez, Marina; Haagen, Lotte; Huynh-Thu, Vân Anh; Koldere, Duygu; Celik, Arzu; Geurts, Pierre; Christiaens, Valerie; Aerts, Stein

    2014-12-24

    Genome control is operated by transcription factors (TFs) controlling their target genes by binding to promoters and enhancers. Conceptually, the interactions between TFs, their binding sites, and their functional targets are represented by gene regulatory networks (GRNs). Deciphering in vivo GRNs underlying organ development in an unbiased genome-wide setting involves identifying both functional TF-gene interactions and physical TF-DNA interactions. To reverse engineer the GRNs of eye development in Drosophila, we performed RNA-seq across 72 genetic perturbations and sorted cell types and inferred a coexpression network. Next, we derived direct TF-DNA interactions using computational motif inference, ultimately connecting 241 TFs to 5,632 direct target genes through 24,926 enhancers. Using this network, we found network motifs, cis-regulatory codes, and regulators of eye development. We validate the predicted target regions of Grainyhead by ChIP-seq and identify this factor as a general cofactor in the eye network, being bound to thousands of nucleosome-free regions. PMID:25533349

  3. A Novel Network Model for Molecular Prognosis

    PubMed Central

    Wan, Ying-Wooi; Bose, Swetha; Denvir, James; Guo, Nancy Lan

    2015-01-01

    Network-based genome-wide association studies (NWAS) utilize the molecular interactions between genes and functional pathways in biomarker identification. This study presents a novel network-based methodology for identifying prognostic gene signatures to predict cancer recurrence. The methodology contains the following steps: 1) Constructing genome-wide coexpression networks for different disease states (metastatic vs. non-metastatic). Prediction logic is used to induct valid implication relations between each pair of gene expression profiles in terms of formal logic rules. 2) Identifying differential components associated with specific disease states from the genome-wide coexpression networks. 3) Dissecting network modules that are tightly connected with major disease signal hallmarks from the disease specific differential components. 4) Identifying most significant genes/probes associated with clinical outcome from the pathway connected network modules. Using this methodology, a 14-gene prognostic signature was identified for accurate patient stratification in early stage lung cancer. PMID:26005718
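
    Steps 1 and 2 above (build one coexpression network per disease state, then keep what differs between them) can be illustrated with a minimal sketch. It is not the authors' method: the abstract says prediction logic is used to derive implication relations, whereas this sketch swaps in a plain Pearson-correlation threshold. The function names, the 0.8 threshold, and the toy expression values are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def coexpression_edges(expr, threshold=0.8):
    """Return gene pairs whose absolute correlation reaches the threshold."""
    edges = set()
    for g1, g2 in combinations(sorted(expr), 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


def differential_edges(expr_a, expr_b, threshold=0.8):
    """Edges present in only one of the two state-specific networks."""
    a = coexpression_edges(expr_a, threshold)
    b = coexpression_edges(expr_b, threshold)
    return a - b, b - a


# Toy expression profiles (invented values, one list entry per sample).
metastatic = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [4, 1, 3, 2]}
non_metastatic = {"G1": [1, 2, 3, 4], "G2": [3, 1, 4, 2], "G3": [2, 4, 6, 8]}

only_met, only_non = differential_edges(metastatic, non_metastatic)
print(only_met, only_non)
```

    In this toy input the G1/G2 edge appears only in the metastatic network and the G1/G3 edge only in the non-metastatic one; such state-specific edges are the kind of differential component that later steps would dissect into modules.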

  4. Intersecting transcription networks constrain gene regulatory evolution.

    PubMed

    Sorrells, Trevor R; Booth, Lauren N; Tuch, Brian B; Johnson, Alexander D

    2015-07-16

    Epistasis, the non-additive interactions between different genetic loci, constrains evolutionary pathways, blocking some and permitting others. For biological networks such as transcription circuits, the nature of these constraints and their consequences are largely unknown. Here we describe the evolutionary pathways of a transcription network that controls the response to mating pheromone in yeast. A component of this network, the transcription regulator Ste12, has evolved two different modes of binding to a set of its target genes. In one group of species, Ste12 binds to specific DNA binding sites, while in another lineage it occupies DNA indirectly, relying on a second transcription regulator to recognize DNA. We show, through the construction of various possible evolutionary intermediates, that evolution of the direct mode of DNA binding was not directly accessible to the ancestor. Instead, it was contingent on a lineage-specific change to an overlapping transcription network with a different function, the specification of cell type. These results show that analysing and predicting the evolution of cis-regulatory regions requires an understanding of their positions in overlapping networks, as this placement constrains the available evolutionary pathways. PMID:26153861

  5. Intersecting transcription networks constrain gene regulatory evolution

    PubMed Central

    Sorrells, Trevor R; Booth, Lauren N; Tuch, Brian B; Johnson, Alexander D

    2015-01-01

    Epistasis, the non-additive interactions between different genetic loci, constrains evolutionary pathways, blocking some and permitting others [1–8]. For biological networks such as transcription circuits, the nature of these constraints and their consequences are largely unknown. Here we describe the evolutionary pathways of a transcription network that controls the response to mating pheromone in yeasts [9]. A component of this network, the transcription regulator Ste12, has evolved two different modes of binding to a set of its target genes. In one group of species, Ste12 binds to specific DNA binding sites, while in another lineage it occupies DNA indirectly, relying on a second transcription regulator to recognize DNA. We show, through the construction of various possible evolutionary intermediates, that evolution of the direct mode of DNA binding was not directly accessible to the ancestor. Instead, it was contingent on a lineage-specific change to an overlapping transcription network with a different function, the specification of cell type. These results show that analyzing and predicting the evolution of cis-regulatory regions requires an understanding of their positions in overlapping networks, as this placement constrains the available evolutionary pathways. PMID:26153861

  6. Paper-based synthetic gene networks.

    PubMed

    Pardee, Keith; Green, Alexander A; Ferrante, Tom; Cameron, D Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J

    2014-11-01

    Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides an alternate, versatile venue for synthetic biologists to operate and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze dried onto paper, enabling the inexpensive, sterile, and abiotic distribution of synthetic-biology-based technologies for the clinic, global health, industry, research, and education. For field use, we create circuits with colorimetric outputs for detection by eye and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small-molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors. PMID:25417167

  7. Paper-based Synthetic Gene Networks

    PubMed Central

    Pardee, Keith; Green, Alexander A.; Ferrante, Tom; Cameron, D. Ewen; DaleyKeyser, Ajay; Yin, Peng; Collins, James J.

    2014-01-01

    Synthetic gene networks have wide-ranging uses in reprogramming and rewiring organisms. To date, there has not been a way to harness the vast potential of these networks beyond the constraints of a laboratory or in vivo environment. Here, we present an in vitro paper-based platform that provides a new venue for synthetic biologists to operate, and a much-needed medium for the safe deployment of engineered gene circuits beyond the lab. Commercially available cell-free systems are freeze-dried onto paper, enabling the inexpensive, sterile and abiotic distribution of synthetic biology-based technologies for the clinic, global health, industry, research and education. For field use, we create circuits with colorimetric outputs for detection by eye, and fabricate a low-cost, electronic optical interface. We demonstrate this technology with small molecule and RNA actuation of genetic switches, rapid prototyping of complex gene circuits, and programmable in vitro diagnostics, including glucose sensors and strain-specific Ebola virus sensors. PMID:25417167

  8. Next-Generation Synthetic Gene Networks

    PubMed Central

    Lu, Timothy K.; Khalil, Ahmad S.; Collins, James J.

    2009-01-01

    Synthetic biology is focused on the rational construction of biological systems based on engineering principles. During the field’s first decade of development, significant progress has been made in designing biological parts and assembling them into genetic circuits to achieve basic functionalities. These circuits have been used to construct proof-of-principle systems with promising results in industrial and medical applications. However, advances in synthetic biology have been limited by a lack of interoperable parts, techniques for dynamically probing biological systems, and frameworks for the reliable construction and operation of complex, higher-order networks. Here, we highlight challenges and goals for next-generation synthetic gene networks, in the context of potential applications in medicine, biotechnology, bioremediation, and bioenergy. PMID:20010597

  9. Pathway-Dependent Effectiveness of Network Algorithms for Gene Prioritization

    PubMed Central

    Shim, Jung Eun; Hwang, Sohyun; Lee, Insuk

    2015-01-01

    A network-based approach has proven useful for the identification of novel genes associated with complex phenotypes, including human diseases. Because network-based gene prioritization algorithms are based on propagating information of known phenotype-associated genes through networks, the pathway structure of each phenotype might significantly affect the effectiveness of algorithms. We systematically compared two popular network algorithms with distinct mechanisms – direct neighborhood which propagates information to only direct network neighbors, and network diffusion which diffuses information throughout the entire network – in prioritization of genes for worm and human phenotypes. Previous studies reported that network diffusion generally outperforms direct neighborhood for human diseases. Although prioritization power is generally measured for all ranked genes, only the top candidates are significant for subsequent functional analysis. We found that high prioritizing power of a network algorithm for all genes cannot guarantee successful prioritization of top ranked candidates for a given phenotype. Indeed, the majority of the phenotypes that were more efficiently prioritized by network diffusion showed higher prioritizing power for top candidates by direct neighborhood. We also found that connectivity among pathway genes for each phenotype largely determines which network algorithm is more effective, suggesting that the network algorithm used for each phenotype should be chosen with consideration of pathway gene connectivity. PMID:26091506
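
    The two algorithm families compared above can be contrasted in a short sketch. Direct neighborhood is written here as a count of seed (known phenotype-associated) genes among a gene's immediate neighbors, and network diffusion as a random walk with restart. Both are common formulations but are assumptions here, since the abstract does not give the exact scoring rules; the restart probability, the iteration count, and the toy path graph are likewise illustrative.

```python
def direct_neighborhood(adj, seeds):
    """Score each non-seed gene by how many of its direct neighbors are seeds."""
    return {g: sum(1 for n in adj[g] if n in seeds) for g in adj if g not in seeds}


def network_diffusion(adj, seeds, restart=0.5, iters=100):
    """Random walk with restart: seed signal spreads through the whole network."""
    genes = list(adj)
    p0 = {g: (1.0 / len(seeds) if g in seeds else 0.0) for g in genes}
    p = dict(p0)
    for _ in range(iters):
        # Each step: restart mass returns to the seeds, the rest moves to neighbors.
        nxt = {g: restart * p0[g] for g in genes}
        for g in genes:
            if adj[g]:
                share = (1.0 - restart) * p[g] / len(adj[g])
                for n in adj[g]:
                    nxt[n] += share
        p = nxt
    return {g: s for g, s in p.items() if g not in seeds}


# Toy undirected path graph A - B - C - D with a single seed gene A.
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
seeds = {"A"}

direct = direct_neighborhood(adj, seeds)
diffusion = network_diffusion(adj, seeds)
print(direct)
print(sorted(diffusion, key=diffusion.get, reverse=True))
```

    On this graph direct neighborhood gives a nonzero score only to B, while diffusion also ranks C and D by their distance from the seed. That difference in reach is why the two methods can favor different phenotypes depending on how tightly the pathway genes are connected.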

  10. Discovering Implicit Entity Relation with the Gene-Citation-Gene Network

    PubMed Central

    Song, Min; Han, Nam-Gi; Kim, Yong-Hwan; Ding, Ying; Chambers, Tamy

    2013-01-01

    In this paper, we apply the entitymetrics model to our constructed Gene-Citation-Gene (GCG) network. Based on the premise that there is a hidden, but plausible, relationship between an entity in one article and an entity in its citing article, we constructed a GCG network of gene pairs implicitly connected through citation. We compare the performance of this GCG network to a gene-gene (GG) network constructed over the same corpus but which uses gene pairs explicitly connected through traditional co-occurrence. Using 331,411 MEDLINE abstracts collected from 18,323 seed articles and their references, we identify 25 gene pairs. A comparison of these pairs with interactions found in BioGRID reveals that 96% of the gene pairs in the GCG network have known interactions. We measure network performance using degree, weighted degree, closeness, betweenness centrality and PageRank. Combining all measures, we find the GCG network has more gene pairs, but a lower matching rate than the GG network. However, combining top ranked genes in both networks produces a matching rate of 35.53%. By visualizing both the GG and GCG networks, we find that cancer is the most dominant disease associated with the genes in both networks. Overall, the study indicates that the GCG network can be useful for detecting gene interaction in an implicit manner. PMID:24358368
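
    The matching rate mentioned above is not defined in the abstract. One plausible reading, assumed here, is the fraction of predicted gene pairs that also appear in a reference set of known interactions, with pairs treated as unordered. The sketch below uses placeholder gene names and an invented reference set; it does not query BioGRID.

```python
def matching_rate(predicted_pairs, known_pairs):
    """Fraction of distinct predicted pairs found among known interactions.

    Pairs are unordered, so (a, b) and (b, a) count as the same pair.
    """
    predicted = {frozenset(p) for p in predicted_pairs}
    known = {frozenset(p) for p in known_pairs}
    if not predicted:
        return 0.0
    return len(predicted & known) / len(predicted)


# Placeholder gene names; the last predicted pair repeats the first in reverse order.
predicted = [
    ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_D"),
    ("GENE_E", "GENE_F"),
    ("GENE_B", "GENE_A"),
]
known = [("GENE_B", "GENE_A"), ("GENE_D", "GENE_C")]

rate = matching_rate(predicted, known)
print(f"{rate:.2%}")
```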

  11. Discovering implicit entity relation with the gene-citation-gene network.

    PubMed

    Song, Min; Han, Nam-Gi; Kim, Yong-Hwan; Ding, Ying; Chambers, Tamy

    2013-01-01

    In this paper, we apply the entitymetrics model to our constructed Gene-Citation-Gene (GCG) network. Based on the premise that there is a hidden, but plausible, relationship between an entity in one article and an entity in its citing article, we constructed a GCG network of gene pairs implicitly connected through citation. We compare the performance of this GCG network to a gene-gene (GG) network constructed over the same corpus but which uses gene pairs explicitly connected through traditional co-occurrence. Using 331,411 MEDLINE abstracts collected from 18,323 seed articles and their references, we identify 25 gene pairs. A comparison of these pairs with interactions found in BioGRID reveals that 96% of the gene pairs in the GCG network have known interactions. We measure network performance using degree, weighted degree, closeness, betweenness centrality and PageRank. Combining all measures, we find the GCG network has more gene pairs, but a lower matching rate than the GG network. However, combining top ranked genes in both networks produces a matching rate of 35.53%. By visualizing both the GG and GCG networks, we find that cancer is the most dominant disease associated with the genes in both networks. Overall, the study indicates that the GCG network can be useful for detecting gene interaction in an implicit manner. PMID:24358368

  12. Modular composition of gene transcription networks.

    PubMed

    Gyorgy, Andras; Del Vecchio, Domitilla

    2014-03-01

    Predicting the dynamic behavior of a large network from that of the composing modules is a central problem in systems and synthetic biology. Yet, this predictive ability is still largely missing because modules display context-dependent behavior. One cause of context-dependence is retroactivity, a phenomenon similar to loading that influences in non-trivial ways the dynamic performance of a module upon connection to other modules. Here, we establish an analysis framework for gene transcription networks that explicitly accounts for retroactivity. Specifically, a module's key properties are encoded by three retroactivity matrices: internal, scaling, and mixing retroactivity. All of them have a physical interpretation and can be computed from macroscopic parameters (dissociation constants and promoter concentrations) and from the modules' topology. The internal retroactivity quantifies the effect of intramodular connections on an isolated module's dynamics. The scaling and mixing retroactivity establish how intermodular connections change the dynamics of connected modules. Based on these matrices and on the dynamics of modules in isolation, we can accurately predict how loading will affect the behavior of an arbitrary interconnection of modules. We illustrate implications of internal, scaling, and mixing retroactivity on the performance of recurrent network motifs, including negative autoregulation, combinatorial regulation, two-gene clocks, the toggle switch, and the single-input motif. We further provide a quantitative metric that determines how robust the dynamic behavior of a module is to interconnection with other modules. This metric can be employed both to evaluate the extent of modularity of natural networks and to establish concrete design guidelines to minimize retroactivity between modules in synthetic systems. PMID:24626132

  13. Modular Composition of Gene Transcription Networks

    PubMed Central

    Gyorgy, Andras; Del Vecchio, Domitilla

    2014-01-01

    Predicting the dynamic behavior of a large network from that of the composing modules is a central problem in systems and synthetic biology. Yet, this predictive ability is still largely missing because modules display context-dependent behavior. One cause of context-dependence is retroactivity, a phenomenon similar to loading that influences in non-trivial ways the dynamic performance of a module upon connection to other modules. Here, we establish an analysis framework for gene transcription networks that explicitly accounts for retroactivity. Specifically, a module's key properties are encoded by three retroactivity matrices: internal, scaling, and mixing retroactivity. All of them have a physical interpretation and can be computed from macroscopic parameters (dissociation constants and promoter concentrations) and from the modules' topology. The internal retroactivity quantifies the effect of intramodular connections on an isolated module's dynamics. The scaling and mixing retroactivity establish how intermodular connections change the dynamics of connected modules. Based on these matrices and on the dynamics of modules in isolation, we can accurately predict how loading will affect the behavior of an arbitrary interconnection of modules. We illustrate implications of internal, scaling, and mixing retroactivity on the performance of recurrent network motifs, including negative autoregulation, combinatorial regulation, two-gene clocks, the toggle switch, and the single-input motif. We further provide a quantitative metric that determines how robust the dynamic behavior of a module is to interconnection with other modules. This metric can be employed both to evaluate the extent of modularity of natural networks and to establish concrete design guidelines to minimize retroactivity between modules in synthetic systems. PMID:24626132

  14. Biomarker Gene Signature Discovery Integrating Network Knowledge

    PubMed Central

    Cun, Yupeng; Fröhlich, Holger

    2012-01-01

    Discovery of prognostic and diagnostic biomarker gene signatures for diseases, such as cancer, is seen as a major step towards a better personalized medicine. During the last decade various methods, mainly coming from the machine learning or statistical domain, have been proposed for that purpose. However, one important obstacle for making gene signatures a standard tool in clinical diagnosis is the typical low reproducibility of these signatures combined with the difficulty to achieve a clear biological interpretation. For that purpose in the last years there has been a growing interest in approaches that try to integrate information from molecular interaction networks. Here we review the current state of research in this field by giving an overview about so-far proposed approaches. PMID:24832044

  15. Reverse engineering of gene regulatory networks.

    PubMed

    Cho, K H; Choo, S M; Jung, S H; Kim, J R; Choi, H S; Kim, J

    2007-05-01

    Systems biology is a multi-disciplinary approach to the study of the interactions of various cellular mechanisms and cellular components. Owing to the development of new technologies that simultaneously measure the expression of genetic information, systems biological studies involving gene interactions are increasingly prominent. In this regard, reconstructing gene regulatory networks (GRNs) forms the basis for the dynamical analysis of gene interactions and related effects on cellular control pathways. Various approaches of inferring GRNs from gene expression profiles and biological information, including machine learning approaches, have been reviewed, with a brief introduction of DNA microarray experiments as typical tools for measuring levels of messenger ribonucleic acid (mRNA) expression. In particular, the inference methods are classified according to the required input information, and the main idea of each method is elucidated by comparing its advantages and disadvantages with respect to the other methods. In addition, recent developments in this field are introduced and discussions on the challenges and opportunities for future research are provided. PMID:17591174

  16. An in vivo bioassay for detecting antiandrogens using humanized transgenic mice coexpressing the tetracycline-controlled transactivator and human CYP1B1 gene.

    PubMed

    Hwang, Dae Y; Cho, Jung S; Oh, Jae H; Shim, Sun B; Jee, Seung W; Lee, Su H; Seo, Su J; Kang, Hyun G; Sheen, Yhun Y; Lee, Seok H; Kim, Yong K

    2005-01-01

    The typical strategy used in the analysis of antiandrogens relies on morphological changes of marker tissues in the castrated-rat Hershberger assay: the prostate, seminal vesicle, levator ani plus bulbocavernosus muscles (LABC), Cowper's gland, and glans penis. However, this approach has disadvantages, such as the time required, and its results may not correspond to those in actual human exposure. To evaluate the ability of a transgenic-mouse bioassay to detect antiandrogens in vivo, the dose effect of di-(2-ethylhexyl) phthalate (DEHP) and the time effect of five antiandrogens, DEHP, di-n-butyl phthalate (DBP), diethyl phthalate (DEP), linuron (3-(3,4-dichlorophenyl)-1-methoxy-1-methylurea), and 2,4'-DDE (1,1-dichloro-2-(p-chlorophenyl)-2-(o-chlorophenyl)ethylene), were investigated using humanized transgenic mice coexpressing tetracycline-controlled transactivator (tTA) and the human cytochrome P450 (CYP) enzyme CYP1B1 (hCYP1B1). Adult transgenic males were treated with each of the five antiandrogens, and their tTA-driven hCYP1B1 expression was analyzed by real-time polymerase chain reaction (PCR) and/or Western blot and for O-debenzylation activity. Herein, the treatments of adult males with the five antiandrogens were shown to affect the increased levels of tTA-driven hCYP1B1 expression in both dose-dependent and repeated experiments. Thus, this novel in vivo bioassay, using humanized transgenic mice, is useful for measuring antiandrogens, and is a means to a more relevant bioassay relating to actual human exposure. PMID:16040568

  17. A Unique Gene Regulatory Network Resets the Human Germline Epigenome for Development.

    PubMed

    Tang, Walfred W C; Dietmann, Sabine; Irie, Naoko; Leitch, Harry G; Floros, Vasileios I; Bradshaw, Charles R; Hackett, Jamie A; Chinnery, Patrick F; Surani, M Azim

    2015-06-01

    Resetting of the epigenome in human primordial germ cells (hPGCs) is critical for development. We show that the transcriptional program of hPGCs is distinct from that in mice, with co-expression of somatic specifiers and naive pluripotency genes TFCP2L1 and KLF4. This unique gene regulatory network, established by SOX17 and BLIMP1, drives comprehensive germline DNA demethylation by repressing DNA methylation pathways and activating TET-mediated hydroxymethylation. Base-resolution methylome analysis reveals progressive DNA demethylation to basal levels in week 5-7 in vivo hPGCs. Concurrently, hPGCs undergo chromatin reorganization, X reactivation, and imprint erasure. Despite global hypomethylation, evolutionarily young and potentially hazardous retroelements, like SVA, remain methylated. Remarkably, some loci associated with metabolic and neurological disorders are also resistant to DNA demethylation, revealing potential for transgenerational epigenetic inheritance that may have phenotypic consequences. We provide comprehensive insight on early human germline transcriptional network and epigenetic reprogramming that subsequently impacts human development and disease. PMID:26046444

  18. A predicted functional gene network for the plant pathogen Phytophthora infestans as a framework for genomic biology

    PubMed Central

    2013-01-01

    Background: Associations between proteins are essential to understand cell biology. While this complex interplay between proteins has been studied in model organisms, it has not yet been described for the oomycete late blight pathogen Phytophthora infestans. Results: We present an integrative probabilistic functional gene network that provides associations for 37 percent of the predicted P. infestans proteome. Our method unifies available genomic, transcriptomic and comparative genomic data into a single comprehensive network using a Bayesian approach. Enrichment of proteins residing in the same or related subcellular localization validates the biological coherence of our predictions. The network serves as a framework to query existing genomic data using network-based methods, which thus far was not possible in Phytophthora. We used the network to study the set of interacting proteins that are encoded by genes co-expressed during sporulation. This identified potential novel roles for proteins in spore formation through their links to proteins known to be involved in this process such as the phosphatase Cdc14. Conclusions: The functional association network represents a novel genome-wide data source for P. infestans that also acts as a framework to interrogate other system-wide data. In both capacities it will improve our understanding of the complex biology of P. infestans and related oomycete pathogens. PMID:23865555

  19. Single-Cell Co-expression Analysis Reveals Distinct Functional Modules, Co-regulation Mechanisms and Clinical Outcomes

    PubMed Central

    Wang, Jie; Xia, Shuli; Arand, Brian; Zhu, Heng; Machiraju, Raghu; Huang, Kun; Ji, Hongkai; Qian, Jiang

    2016-01-01

    Co-expression analysis has been employed to predict gene function, identify functional modules, and determine tumor subtypes. Previous co-expression analysis was mainly conducted at bulk tissue level. It is unclear whether co-expression analysis at the single-cell level will provide novel insights into transcriptional regulation. Here we developed a computational approach to compare glioblastoma expression profiles at the single-cell level with those obtained from bulk tumors. We found that the co-expressed genes observed in single cells and bulk tumors have little overlap and show distinct characteristics. The co-expressed genes identified in bulk tumors tend to have similar biological functions, and are enriched for intrachromosomal interactions with synchronized promoter activity. In contrast, single-cell co-expressed genes are enriched for known protein-protein interactions, and are regulated through interchromosomal interactions. Moreover, gene members of some protein complexes are co-expressed only at the bulk level, while those of other complexes are co-expressed at both single-cell and bulk levels. Finally, we identified a set of co-expressed genes that can predict the survival of glioblastoma patients. Our study highlights that comparative analyses of single-cell and bulk gene expression profiles enable us to identify functional modules that are regulated at different levels and hold great translational potential. PMID:27100869

  20. Vitamin C deficiency improves somatic embryo development through distinct gene regulatory networks in Arabidopsis.

    PubMed

    Becker, Michael G; Chan, Ainsley; Mao, Xingyu; Girard, Ian J; Lee, Samantha; Elhiti, Mohamed; Stasolla, Claudio; Belmonte, Mark F

    2014-11-01

    Changes in the endogenous ascorbate redox status through genetic manipulation of cellular ascorbate levels were shown to accelerate cell proliferation during the induction phase and improve maturation of somatic embryos in Arabidopsis. Mutants defective in ascorbate biosynthesis such as vtc2-5 contained ~70 % less cellular ascorbate compared with their wild-type (WT; Columbia-0) counterparts. Depletion of cellular ascorbate accelerated cell division processes and cellular reorganization and improved the number and quality of mature somatic embryos grown in culture by 6-fold compared with WT tissues. To gain insight into the molecular mechanisms underlying somatic embryogenesis (SE), we profiled dynamic changes in the transcriptome and analysed dominant patterns of gene activity in the WT and vtc2-5 lines across the somatic embryo culturing process. Our results provide insight into the gene regulatory networks controlling SE in Arabidopsis based on the association of transcription factors with DNA sequence motifs enriched in biological processes of large co-expressed gene sets. These data provide the first detailed account of temporal changes in the somatic embryo transcriptome starting with the zygotic embryo, through tissue dedifferentiation, and ending with the mature somatic embryo, and impart insight into possible mechanisms for the improved culture of somatic embryos in the vtc2-5 mutant line. PMID:25151615

  1. Vitamin C deficiency improves somatic embryo development through distinct gene regulatory networks in Arabidopsis

    PubMed Central

    Becker, Michael G.; Chan, Ainsley; Mao, Xingyu; Girard, Ian J.; Lee, Samantha; Elhiti, Mohamed; Stasolla, Claudio; Belmonte, Mark F.

    2014-01-01

    Changes in the endogenous ascorbate redox status through genetic manipulation of cellular ascorbate levels were shown to accelerate cell proliferation during the induction phase and improve maturation of somatic embryos in Arabidopsis. Mutants defective in ascorbate biosynthesis such as vtc2-5 contained ~70 % less cellular ascorbate compared with their wild-type (WT; Columbia-0) counterparts. Depletion of cellular ascorbate accelerated cell division processes and cellular reorganization and improved the number and quality of mature somatic embryos grown in culture by 6-fold compared with WT tissues. To gain insight into the molecular mechanisms underlying somatic embryogenesis (SE), we profiled dynamic changes in the transcriptome and analysed dominant patterns of gene activity in the WT and vtc2-5 lines across the somatic embryo culturing process. Our results provide insight into the gene regulatory networks controlling SE in Arabidopsis based on the association of transcription factors with DNA sequence motifs enriched in biological processes of large co-expressed gene sets. These data provide the first detailed account of temporal changes in the somatic embryo transcriptome starting with the zygotic embryo, through tissue dedifferentiation, and ending with the mature somatic embryo, and impart insight into possible mechanisms for the improved culture of somatic embryos in the vtc2-5 mutant line. PMID:25151615

  2. Phosphorylation network rewiring by gene duplication

    PubMed Central

    Freschi, Luca; Courcelles, Mathieu; Thibault, Pierre; Michnick, Stephen W; Landry, Christian R

    2011-01-01

    Elucidating how complex regulatory networks have assembled during evolution requires a detailed understanding of the evolutionary dynamics that follow gene duplication events, including changes in post-translational modifications. We compared the phosphorylation profiles of paralogous proteins in the budding yeast Saccharomyces cerevisiae to that of a species that diverged from the budding yeast before the duplication of those genes. We found that 100 million years of post-duplication divergence are sufficient for the majority of phosphorylation sites to be lost or gained in one paralog or the other, with a strong bias toward losses. However, some losses may be partly compensated for by the evolution of other phosphosites, as paralogous proteins tend to preserve similar numbers of phosphosites over time. We also found that up to 50% of kinase–substrate relationships may have been rewired during this period. Our results suggest that after gene duplication, proteins tend to subfunctionalize at the level of post-translational regulation and that even when phosphosites are preserved, there is a turnover of the kinases that phosphorylate them. PMID:21734643

  3. How to identify essential genes from molecular networks?

    PubMed Central

    del Rio, Gabriel; Koschützki, Dirk; Coello, Gerardo

    2009-01-01

    Background: The prediction of essential genes from molecular networks is a way to test the understanding of essentiality in the context of what is known about the network. However, the current knowledge of molecular network structures is still incomplete, and consequently the strategies aimed at predicting essential genes are prone to uncertain predictions. We propose that simultaneously evaluating different network structures and different algorithms representing gene essentiality (centrality measures) may identify essential genes in networks in a reliable fashion. Results: By simultaneously analyzing 16 different centrality measures on 18 different reconstructed metabolic networks for Saccharomyces cerevisiae, we show that no single centrality measure identifies essential genes from these networks in a statistically significant way; however, the combination of at least 2 centrality measures achieves a reliable prediction of most but not all of the essential genes. No improvement is achieved in the prediction of essential genes when 3 or 4 centrality measures are combined. Conclusion: The method reported here describes a reliable procedure to predict essential genes from molecular networks. Our results show that essential genes may be predicted only by combining centrality measures, revealing the complex nature of the function of essential genes. PMID:19822021
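
    The idea of combining centrality measures can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy graph, the choice of degree and closeness as the two measures, and the rule of intersecting the top-k nodes of each ranking. The abstract does not state which of the 16 measures were combined or how.

```python
from collections import deque


def degree_centrality(graph):
    """Degree of each node: the number of direct neighbours."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path distances (BFS)."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def top_k(scores, k):
    """The k highest-scoring nodes; ties are broken alphabetically."""
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    return set(ranked[:k])


def combined_candidates(graph, k):
    """Hypothetical combination rule: keep nodes in the top k of BOTH measures."""
    return top_k(degree_centrality(graph), k) & top_k(closeness_centrality(graph), k)


# Toy undirected network (adjacency sets); node names are placeholders.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "E"},
    "D": {"A"},
    "E": {"C", "F"},
    "F": {"E"},
}

print(combined_candidates(toy, 2))
```

    Intersecting rankings is only one possible rule; a rank sum or a vote across measures would be an equally plausible reading of "combination".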

  4. Centrality Analysis Methods for Biological Networks and Their Application to Gene Regulatory Networks

    PubMed Central

    Koschützki, Dirk; Schreiber, Falk

    2008-01-01

    The structural analysis of biological networks includes the ranking of the vertices based on the connection structure of a network. To support this analysis we discuss centrality measures which indicate the importance of vertices, and demonstrate their applicability on a gene regulatory network. We show that common centrality measures result in different valuations of the vertices and that novel measures tailored to specific biological investigations are useful for the analysis of biological networks, in particular gene regulatory networks. PMID:19787083

  5. A Systems Genetics Approach Identifies Gene Regulatory Networks Associated with Fatty Acid Composition in Brassica rapa Seed.

    PubMed

    Basnet, Ram Kumar; Del Carpio, Dunia Pino; Xiao, Dong; Bucher, Johan; Jin, Mina; Boyle, Kerry; Fobert, Pierre; Visser, Richard G F; Maliepaard, Chris; Bonnema, Guusje

    2016-01-01

    Fatty acids in seeds affect seed germination and seedling vigor, and fatty acid composition determines the quality of seed oil. In this study, quantitative trait locus (QTL) mapping of fatty acid and transcript abundance was integrated with gene network analysis to unravel the genetic regulation of seed fatty acid composition in a Brassica rapa doubled haploid population from a cross between a yellow sarson oil type and a black-seeded pak choi. The distribution of major QTLs for fatty acids showed a relationship with the fatty acid types: linkage group A03 for monounsaturated fatty acids, A04 for saturated fatty acids, and A05 for polyunsaturated fatty acids. Using a genetical genomics approach, expression quantitative trait locus (eQTL) hotspots were found at major fatty acid QTLs on linkage groups A03, A04, A05, and A09. An eQTL-guided gene coexpression network of lipid metabolism-related genes showed major hubs at the genes BrPLA2-ALPHA, BrWD-40, a number of seed storage protein genes, and the transcription factor BrMD-2, suggesting essential roles for these genes in lipid metabolism. Three subnetworks were extracted for the economically important and most abundant fatty acids erucic, oleic, linoleic, and linolenic acids. Network analysis, combined with comparison of the genome positions of cis- or trans-eQTLs with fatty acid QTLs, allowed the identification of candidate genes for genetic regulation of these fatty acids. The generated insights in the genetic architecture of fatty acid composition and the underlying complex gene regulatory networks in B. rapa seeds are discussed. PMID:26518343

  6. The Inferred Cardiogenic Gene Regulatory Network in the Mammalian Heart

    PubMed Central

    Li, Xing; Thiagarajan, Raghuram; Nelson, Timothy J.; Tomita-Mitchell, Aoy; Beard, Daniel A.

    2014-01-01

    Cardiac development is a complex, multiscale process encompassing cell fate adoption, differentiation and morphogenesis. To elucidate pathways underlying this process, a recently developed algorithm to reverse engineer gene regulatory networks was applied to time-course microarray data obtained from the developing mouse heart. Approximately 200 genes of interest were input into the algorithm to generate putative network topologies that are capable of explaining the experimental data via model simulation. To cull specious network interactions, thousands of putative networks are merged and filtered to generate scale-free, hierarchical networks that are statistically significant and biologically relevant. The networks are validated with known gene interactions and used to predict regulatory pathways important for the developing mammalian heart. Area under the precision-recall curve and receiver operator characteristic curve are 9% and 58%, respectively. Of the top 10 ranked predicted interactions, 4 have already been validated. The algorithm is further tested using a network enriched with known interactions and another depleted of them. The inferred networks contained more interactions for the enriched network versus the depleted network. In all test cases, maximum performance of the algorithm was achieved when the purely data-driven method of network inference was combined with a data-independent, functional-based association method. Lastly, the network generated from the list of approximately 200 genes of interest was expanded using gene-profile uniqueness metrics to include approximately 900 additional known mouse genes and to form the most likely cardiogenic gene regulatory network. The resultant network supports known regulatory interactions and contains several novel cardiogenic regulatory interactions. The method outlined herein provides an informative approach to network inference and leads to clear testable hypotheses related to gene regulation. PMID:24971943

  7. Identification of Gene Networks for Residual Feed Intake in Angus Cattle Using Genomic Prediction and RNA-seq

    PubMed Central

    Weber, Kristina L.; Welly, Bryan T.; Van Eenennaam, Alison L.; Young, Amy E.; Porto-Neto, Laercio R.; Reverter, Antonio; Rincon, Gonzalo

    2016-01-01

    Improvement in feed conversion efficiency can improve the sustainability of beef cattle production, but genomic selection for feed efficiency affects many underlying molecular networks and physiological traits. This study describes the differences between steer progeny of two influential Angus bulls with divergent genomic predictions for residual feed intake (RFI). Eight steer progeny of each sire were phenotyped for growth and feed intake from 8 mo. of age (average BW 254 kg, with a mean difference between sire groups of 4.8 kg) until slaughter at 14–16 mo. of age (average BW 534 kg, sire group difference of 28.8 kg). Terminal samples from pituitary gland, skeletal muscle, liver, adipose, and duodenum were collected from each steer for transcriptome sequencing. Gene expression networks were derived using partial correlation and information theory (PCIT), including differentially expressed (DE) genes, tissue specific (TS) genes, transcription factors (TF), and genes associated with RFI from a genome-wide association study (GWAS). Relative to progeny of the high RFI sire, progeny of the low RFI sire had -0.56 kg/d finishing period RFI (P = 0.05), -1.08 finishing period feed conversion ratio (P = 0.01), +3.3 kg^0.75 finishing period metabolic mid-weight (MMW; P = 0.04), +28.8 kg final body weight (P = 0.01), -12.9 feed bunk visits per day (P = 0.02) with +0.60 min/visit duration (P = 0.01), and +0.0045 carcass specific gravity (weight in air/(weight in air - weight in water), a predictor of carcass fat content; P = 0.03). RNA-seq identified 633 DE genes between sire groups among 17,016 expressed genes. PCIT analysis identified >115,000 significant co-expression correlations between genes and 25 TF hubs, i.e. controllers of clusters of DE, TS, and GWAS SNP genes. Pathway analysis suggests low RFI bull progeny possess heightened gut inflammation and reduced fat deposition. This multi-omics analysis shows how differences in RFI genomic breeding values can impact other

  8. Identification of Gene Networks for Residual Feed Intake in Angus Cattle Using Genomic Prediction and RNA-seq.

    PubMed

    Weber, Kristina L; Welly, Bryan T; Van Eenennaam, Alison L; Young, Amy E; Porto-Neto, Laercio R; Reverter, Antonio; Rincon, Gonzalo

    2016-01-01

    Improvement in feed conversion efficiency can improve the sustainability of beef cattle production, but genomic selection for feed efficiency affects many underlying molecular networks and physiological traits. This study describes the differences between steer progeny of two influential Angus bulls with divergent genomic predictions for residual feed intake (RFI). Eight steer progeny of each sire were phenotyped for growth and feed intake from 8 mo. of age (average BW 254 kg, with a mean difference between sire groups of 4.8 kg) until slaughter at 14-16 mo. of age (average BW 534 kg, sire group difference of 28.8 kg). Terminal samples from pituitary gland, skeletal muscle, liver, adipose, and duodenum were collected from each steer for transcriptome sequencing. Gene expression networks were derived using partial correlation and information theory (PCIT), including differentially expressed (DE) genes, tissue specific (TS) genes, transcription factors (TF), and genes associated with RFI from a genome-wide association study (GWAS). Relative to progeny of the high RFI sire, progeny of the low RFI sire had -0.56 kg/d finishing period RFI (P = 0.05), -1.08 finishing period feed conversion ratio (P = 0.01), +3.3 kg^0.75 finishing period metabolic mid-weight (MMW; P = 0.04), +28.8 kg final body weight (P = 0.01), -12.9 feed bunk visits per day (P = 0.02) with +0.60 min/visit duration (P = 0.01), and +0.0045 carcass specific gravity (weight in air/(weight in air - weight in water), a predictor of carcass fat content; P = 0.03). RNA-seq identified 633 DE genes between sire groups among 17,016 expressed genes. PCIT analysis identified >115,000 significant co-expression correlations between genes and 25 TF hubs, i.e. controllers of clusters of DE, TS, and GWAS SNP genes. Pathway analysis suggests low RFI bull progeny possess heightened gut inflammation and reduced fat deposition. This multi-omics analysis shows how differences in RFI genomic breeding values can impact other

  9. Integrating large-scale functional genomics data to dissect metabolic networks for hydrogen production

    SciTech Connect

    Harwood, Caroline S

    2012-12-17

    The goal of this project is to identify gene networks that are critical for efficient biohydrogen production by leveraging variation in gene content and gene expression in independently isolated Rhodopseudomonas palustris strains. Coexpression methods were applied to large data sets that we have collected to define probabilistic causal gene networks. To our knowledge this is a first systems-level approach that takes advantage of strain-to-strain variability to computationally define networks critical for a particular bacterial phenotypic trait.

  10. Integration of omic networks in a developmental atlas of maize.

    PubMed

    Walley, Justin W; Sartor, Ryan C; Shen, Zhouxin; Schmitz, Robert J; Wu, Kevin J; Urich, Mark A; Nery, Joseph R; Smith, Laurie G; Schnable, James C; Ecker, Joseph R; Briggs, Steven P

    2016-08-19

    Coexpression networks and gene regulatory networks (GRNs) are emerging as important tools for predicting functional roles of individual genes at a system-wide scale. To enable network reconstructions, we built a large-scale gene expression atlas composed of 62,547 messenger RNAs (mRNAs), 17,862 nonmodified proteins, and 6227 phosphoproteins harboring 31,595 phosphorylation sites quantified across maize development. Networks in which nodes are genes connected on the basis of highly correlated expression patterns of mRNAs were very different from networks that were based on coexpression of proteins. Roughly 85% of highly interconnected hubs were not conserved in expression between RNA and protein networks. However, networks from either data type were enriched in similar ontological categories and were effective in predicting known regulatory relationships. Integration of mRNA, protein, and phosphoprotein data sets greatly improved the predictive power of GRNs. PMID:27540173
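
    A network in which genes are connected on the basis of highly correlated expression can be sketched in a few lines. This is a minimal illustration, not the atlas's pipeline: the Pearson coefficient, the absolute-value cut-off of 0.9, and the three toy profiles are all assumptions introduced here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpression_edges(profiles, threshold=0.9):
    """Connect two genes when |r| meets the (assumed) threshold."""
    edges = set()
    for g1, g2 in combinations(sorted(profiles), 2):
        if abs(pearson(profiles[g1], profiles[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


# Toy profiles across four samples; gene names and values are made up.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 1.0, 3.5, 2.0],
}

print(coexpression_edges(profiles))
```

    Running the same function on mRNA profiles and on protein profiles for the same genes, then comparing the resulting edge sets, would be one simple way to see how far the two network types diverge.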

  11. Long-term safety and stability of angiogenesis induced by balanced single-vector co-expression of PDGF-BB and VEGF164 in skeletal muscle

    PubMed Central

    Gianni-Barrera, Roberto; Burger, Maximilian; Wolff, Thomas; Heberer, Michael; Schaefer, Dirk J.; Gürke, Lorenz; Mujagic, Edin; Banfi, Andrea

    2016-01-01

    Therapeutic angiogenesis by growth factor delivery is an attractive treatment strategy for ischemic diseases, yet clinical efficacy has been elusive. The angiogenic master regulator VEGF-A can induce aberrant angiogenesis if expressed above a threshold level. Since VEGF remains localized in the matrix around expressing cells, homogeneous dose distribution in target tissues is required, which is challenging. We found that co-expression of the pericyte-recruiting factor PDGF-BB at a fixed ratio with VEGF from a single bicistronic vector ensured normal angiogenesis despite heterogeneous high VEGF levels. Taking advantage of a highly controlled gene delivery platform, based on monoclonal populations of transduced myoblasts, in which every cell stably produces the same amount of each factor, here we rigorously investigated a) the dose-dependent effects, and b) the long-term safety and stability of VEGF and PDGF-BB co-expression in skeletal muscle. PDGF-BB co-expression did not affect the normal angiogenesis by low and medium VEGF doses, but specifically prevented vascular tumors by high VEGF, yielding instead normal and mature capillary networks, accompanied by robust arteriole formation. Induced angiogenesis persisted unchanged up to 4 months, while no tumors appeared. Therefore, PDGF-BB co-expression is an attractive strategy to improve safety and efficacy of therapeutic angiogenesis by VEGF gene delivery. PMID:26882992

  12. Analysis of Global Gene Expression in Brachypodium distachyon Reveals Extensive Network Plasticity in Response to Abiotic Stress

    PubMed Central

    Priest, Henry D.; Fox, Samuel E.; Rowley, Erik R.; Murray, Jessica R.; Michael, Todd P.; Mockler, Todd C.

    2014-01-01

    Brachypodium distachyon is a close relative of many important cereal crops. Abiotic stress tolerance has a significant impact on productivity of agriculturally important food and feedstock crops. Analysis of the transcriptome of Brachypodium after chilling, high-salinity, drought, and heat stresses revealed diverse differential expression of many transcripts. Weighted Gene Co-Expression Network Analysis revealed 22 distinct gene modules with specific profiles of expression under each stress. Promoter analysis implicated short DNA sequences directly upstream of module members in the regulation of 21 of 22 modules. Functional analysis of module members revealed enrichment in functional terms for 10 of 22 network modules. Analysis of condition-specific correlations between differentially expressed gene pairs revealed extensive plasticity in the expression relationships of gene pairs. Photosynthesis, cell cycle, and cell wall expression modules were down-regulated by all abiotic stresses. Modules which were up-regulated by each abiotic stress fell into diverse and unique gene ontology (GO) categories. This study provides genomics resources and improves our understanding of abiotic stress responses of Brachypodium. PMID:24489928

  13. Trainable Gene Regulation Networks with Applications to Drosophila Pattern Formation

    NASA Technical Reports Server (NTRS)

    Mjolsness, Eric

    2000-01-01

    This chapter will very briefly introduce and review some computational experiments in using trainable gene regulation network models to simulate and understand selected episodes in the development of the fruit fly, Drosophila melanogaster. For details the reader is referred to the papers introduced below. It will then introduce a new gene regulation network model which can describe promoter-level substructure in gene regulation. As described in chapter 2, gene regulation may be thought of as a combination of cis-acting regulation by the extended promoter of a gene (including all regulatory sequences) by way of the transcription complex, and of trans-acting regulation by the transcription factor products of other genes. If we simplify the cis-action by using a phenomenological model which can be tuned to data, such as a unit or other small portion of an artificial neural network, then the full trans-acting interaction between multiple genes during development can be modelled as a larger network which can again be tuned or trained to data. The larger network will in general need to have recurrent (feedback) connections since at least some real gene regulation networks do. This is the basic modeling approach taken, which describes how a set of recurrent neural networks can be used as a modeling language for multiple developmental processes including gene regulation within a single cell, cell-cell communication, and cell division. Such network models have been called "gene circuits", "gene regulation networks", or "genetic regulatory networks", sometimes without distinguishing the models from the actual modeled systems.
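
    A recurrent network of this kind can be sketched as a small dynamical system. The form assumed here, dv_i/dt = R_i * g(sum_j T_ij * v_j + h_i) - lambda_i * v_i with a sigmoid g, is a common way of writing gene circuit models; whether the chapter uses exactly this form is an assumption, and the parameter names and forward-Euler integration are illustrative choices.

```python
from math import exp


def sigmoid(u):
    """Saturating response of a gene to its total regulatory input."""
    return 1.0 / (1.0 + exp(-u))


def simulate_circuit(T, h, R, lam, v0, dt=0.01, steps=2000):
    """Forward-Euler integration of the assumed gene circuit equations.

    T[i][j] : effect of gene j's product on gene i (negative = repression)
    h[i]    : bias (basal input) of gene i
    R[i]    : maximum synthesis rate of gene i
    lam[i]  : decay rate of gene i's product
    v0      : initial product concentrations
    """
    v = list(v0)
    n = len(v)
    for _ in range(steps):
        dv = []
        for i in range(n):
            u = sum(T[i][j] * v[j] for j in range(n)) + h[i]
            dv.append(R[i] * sigmoid(u) - lam[i] * v[i])
        v = [v[i] + dt * dv[i] for i in range(n)]
    return v


# Two mutually repressing genes with an asymmetric start (toy parameters).
final = simulate_circuit(
    T=[[0.0, -5.0], [-5.0, 0.0]],
    h=[2.0, 2.0],
    R=[1.0, 1.0],
    lam=[1.0, 1.0],
    v0=[0.9, 0.1],
)
print(final)
```

    Training such a model would mean adjusting T, h, R, and lam until simulated concentrations match measured expression patterns; that fitting step is not shown here.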

  14. A Comprehensive Evaluation of Disease Phenotype Networks for Gene Prioritization

    PubMed Central

    Li, Jianhua; Lin, Xiaoyan; Teng, Yueyang; Qi, Shouliang; Xiao, Dayu; Zhang, Jianying; Kang, Yan

    2016-01-01

    Identification of disease-causing genes is a fundamental challenge for human health studies. The phenotypic similarity among diseases may reflect the interactions at the molecular level, and phenotype comparison can be used to predict disease candidate genes. Online Mendelian Inheritance in Man (OMIM) is a database of human genetic diseases and related genes that has become an authoritative source of disease phenotypes. However, disease phenotypes have been described by free text; thus, standardization of phenotypic descriptions is needed before diseases can be compared. Several disease phenotype networks have been established in OMIM using different standardization methods. Two of these networks are important for phenotypic similarity analysis: the first and most commonly used network (mimMiner) is standardized by medical subject heading, and the other network (resnikHPO) is the first to be standardized by human phenotype ontology. This paper comprehensively evaluates for the first time the accuracy of these two networks in gene prioritization based on protein–protein interactions using large-scale, leave-one-out cross-validation experiments. The results show that both networks can effectively prioritize disease-causing genes, and the approach that relates two diseases using a logistic function improves prioritization performance. Tanimoto, one of four methods for normalizing resnikHPO, generates a symmetric network and it performs similarly to mimMiner. Furthermore, an integration of these two networks outperforms either network alone in gene prioritization, indicating that these two disease networks are complementary. PMID:27415759
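
    Why a Tanimoto-style normalization produces a symmetric network can be seen in a short sketch. The expression used below, s(a,b) / (s(a,a) + s(b,b) - s(a,b)), is the generalized Tanimoto form; that the paper applies exactly this expression to the resnikHPO scores is an assumption, and the raw similarity values are invented.

```python
def tanimoto_normalize(raw):
    """Normalize raw pairwise similarities with the generalized Tanimoto form.

    raw maps (a, b) -> raw similarity and must contain the self-pairs (a, a).
    The result is symmetric whenever raw is symmetric, and every self-pair
    with a non-zero raw score normalizes to 1.
    """
    normalized = {}
    for (a, b), s in raw.items():
        denom = raw[(a, a)] + raw[(b, b)] - s
        normalized[(a, b)] = s / denom if denom else 0.0
    return normalized


# Invented raw similarity scores between two diseases d1 and d2.
raw = {
    ("d1", "d1"): 4.0,
    ("d2", "d2"): 2.0,
    ("d1", "d2"): 1.0,
    ("d2", "d1"): 1.0,
}

norm = tanimoto_normalize(raw)
print(norm[("d1", "d2")], norm[("d2", "d1")])
```

    By contrast, dividing s(a,b) by s(a,a) alone would give different values for (d1, d2) and (d2, d1), which is one way an asymmetric network can arise.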

  15. Gene network inference and biochemical assessment delineates GPCR pathways and CREB targets in small intestinal neuroendocrine neoplasia.

    PubMed

    Drozdov, Ignat; Svejda, Bernhard; Gustafsson, Bjorn I; Mane, Shrikant; Pfragner, Roswitha; Kidd, Mark; Modlin, Irvin M

    2011-01-01

    Small intestinal (SI) neuroendocrine tumors (NET) are increasing in incidence; however, little is known about their biology. High throughput techniques such as inference of gene regulatory networks from microarray experiments can objectively define signaling machinery in this disease. Genome-wide co-expression analysis was used to infer gene relevance network in SI-NETs. The network was confirmed to be non-random, scale-free, and highly modular. Functional analysis of gene co-expression modules revealed processes including 'Nervous system development', 'Immune response', and 'Cell-cycle'. Importantly, gene network topology and differential expression analysis identified over-expression of the GPCR signaling regulators, the cAMP synthetase, ADCY2, and the protein kinase A, PRKAR1A. Seven CREB response element (CRE) transcripts associated with proliferation and secretion: BEX1, BICD1, CHGB, CPE, GABRB3, SCG2 and SCG3 as well as ADCY2 and PRKAR1A were measured in an independent SI dataset (n = 10 NETs; n = 8 normal preparations). All were up-regulated (p<0.035) with the exception of SCG3 which was not differently expressed. Forskolin (a direct cAMP activator, 10^-5 M) significantly stimulated transcription of pCREB and 3/7 CREB targets, isoproterenol (a selective β-adrenergic receptor agonist and cAMP activator, 10^-5 M) stimulated pCREB and 4/7 targets while BIM-53061 (a dopamine D2 and serotonin [5-HT2] receptor agonist, 10^-6 M) stimulated 100% of targets as well as pCREB; CRE transcription correlated with the levels of cAMP accumulation and PKA activity; BIM-53061 stimulated the highest levels of cAMP and PKA (2.8-fold and 2.5-fold vs. 1.8-2-fold for isoproterenol and forskolin). Gene network inference and graph topology analysis in SI NETs suggests that SI NETs express neural GPCRs that activate different CRE targets associated with proliferation and secretion. In vitro studies, in a model NET cell system, confirmed that transcriptional effects are

  16. Graphical Features of Functional Genes in Human Protein Interaction Network.

    PubMed

    Wang, Pei; Chen, Yao; Lü, Jinhu; Wang, Qingyun; Yu, Xinghuo

    2016-06-01

    With the completion of the human genome project, it is feasible to investigate large-scale human protein interaction network (HPIN) with complex networks theory. Proteins are encoded by genes. Essential, viable, disease, conserved, housekeeping (HK) and tissue-enriched (TE) genes are functional genes, which are organized and function via interaction networks. Based on up-to-date data from various databases or literature, two large-scale HPINs and six subnetworks are constructed. We illustrate that the HPINs and most of the subnetworks are sparse, small-world, scale-free, disassortative and with hierarchical modularity. Among the six subnetworks, essential, disease and HK subnetworks are more densely connected than the others. Statistical analysis on the topological structures of the HPIN reveals that the lethal, the conserved, the HK and the TE genes are with hallmark graphical features. Receiver operating characteristic (ROC) curves indicate that the essential genes can be distinguished from the viable ones with accuracy as high as almost 70%. Closeness, semi-local and eigenvector centralities can distinguish the HK genes from the TE ones with accuracy around 82%. Furthermore, the Venn diagram, cluster dendrograms and classifications of disease genes reveal that some classes of disease genes are with hallmark graphical features, especially for cancer genes, HK disease genes and TE disease genes. The findings facilitate the identification of some functional genes via topological structures. The investigations shed some light on the characteristics of the complete interactome, which have potential implications in networked medicine and biological network control. PMID:26841412

  17. An Integrative Network Approach to Map the Transcriptome to the Phenome

    PubMed Central

    Mehan, Michael R.; Nunez-Iglesias, Juan; Kalakrishnan, Mrinal; Waterman, Michael S.

    2009-01-01

    Although many studies have been successful in the discovery of cooperating groups of genes, mapping these groups to phenotypes has proved a much more challenging task. In this article, we present the first genome-wide mapping of gene coexpression modules onto the phenome. We annotated coexpression networks from 136 microarray datasets with phenotypes from the Unified Medical Language System (UMLS). We then designed an efficient graph-based simulated annealing approach to identify coexpression modules frequently and specifically occurring in datasets related to individual phenotypes. By requiring phenotype-specific recurrence, we ensure the robustness of our findings. We discovered 118,772 modules specific to 42 phenotypes, and developed validation tests combining Gene Ontology, GeneRIF and UMLS. Our method is generally applicable to any kind of abundant network data with defined phenotype association, and thus paves the way for genome-wide, gene network-phenotype maps. PMID:19630539

  18. Co-expression analysis of fetal weight-related genes in ovine skeletal muscle during mid and late fetal development stages

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Muscle development and lipid metabolism play important roles during fetal development stages. The commercial Texel sheep are more muscular than the indigenous Ujumqin sheep which are fatter. We performed serial transcriptomics assays and systems biology analyses to investigate the dynamics of gene e...

  19. Psg22 expression in mouse trophoblast giant cells is associated with gene inversion and co-expression of antisense long non-coding RNAs.

    PubMed

    Williams, John M; Ball, Melanie; Ward, Andrew; Moore, Tom

    2015-01-01

    Pregnancy-specific glycoproteins (PSGs) are secreted carcinoembryonic antigen (CEA)-related cell adhesion molecules-related members of the immunoglobulin superfamily and are encoded by multigene families in species with haemochorial placentation. PSGs may be the most abundant trophoblast-derived proteins in human maternal blood in late pregnancy and there is evidence that dysregulation of PSG expression is associated with gestational pathology. PSGs are produced by syncytiotrophoblast in the human placenta and by trophoblast giant cells (TGCs) and spongiotrophoblast in rodents, and are implicated in immune regulation, angiogenesis and regulation of platelet function. PSGs are encoded by 17 genes in the mouse and ten genes in the human. While functions appear to be conserved, the typical protein domain organisation differs between species. We analysed the evolution of the mouse Psg genomic locus structure and report inversion of the Psg22 gene within the locus. Psg22 is the most abundant Psg transcript detected in the first half of mouse pregnancy and we identified antisense long non-coding RNA (lncRNA) transcripts adjacent to Psg22 associated with an active local chromatin conformation. This suggests that an epigenetic regulatory mechanism may underpin high Psg22 expression relative to the other Psg gene family members in TGCs. PMID:25359516

  20. The Effects of Gene Recruitment on the Evolvability and Robustness of Pattern-Forming Gene Networks

    NASA Astrophysics Data System (ADS)

    Spirov, Alexander V.; Holloway, David M.

    Gene recruitment or co-option is defined as the placement of a new gene under a foreign regulatory system. Such re-arrangement of pre-existing regulatory networks can lead to an increase in genomic complexity. This reorganization is recognized as a major driving force in evolution. We simulated the evolution of gene networks by means of the Genetic Algorithms (GA) technique. We used standard GA methods of point mutation and multi-point crossover, as well as our own operators for introducing or withdrawing new genes on the network. The starting point for our computer evolutionary experiments was a 4-gene dynamic model representing the real genetic network controlling segmentation in the fruit fly Drosophila. Model output was fit to experimentally observed gene expression patterns in the early fly embryo. We compared this to output for networks with more and fewer genes, and with variation in maternal regulatory input. We found that the mutation operator, together with the gene introduction procedure, was sufficient for recruiting new genes into pre-existing networks. Reinforcement of the evolutionary search by crossover operators facilitates this recruitment, but is not necessary. Gene recruitment causes outgrowth of an evolving network, resulting in redundancy, in the sense that the number of genes goes up, as well as the regulatory interactions on the original genes. The recruited genes can have uniform or patterned expressions, many of which recapitulate gene patterns seen in flies, including genes which are not explicitly put in our model. Recruitment of new genes can affect the evolvability of networks (in general, their ability to produce the variation to facilitate adaptive evolution). We see this in particular with a 2-gene subnetwork. To study robustness, we have subjected the networks to experimental levels of variability in maternal regulatory patterns. The majority of networks are not robust to these perturbations. However, a significant subset of the

  1. Exhaustive Search for Fuzzy Gene Networks from Microarray Data

    SciTech Connect

    Sokhansanj, B A; Fitch, J P; Quong, J N; Quong, A A

    2003-07-07

    Recent technological advances in high-throughput data collection allow for the study of increasingly complex systems on the scale of the whole cellular genome and proteome. Gene network models are required to interpret large and complex data sets. Rationally designed system perturbations (e.g. gene knock-outs, metabolite removal, etc) can be used to iteratively refine hypothetical models, leading to a modeling-experiment cycle for high-throughput biological system analysis. We use fuzzy logic gene network models because they have greater resolution than Boolean logic models and do not require the precise parameter measurement needed for chemical kinetics-based modeling. The fuzzy gene network approach is tested by exhaustive search for network models describing cyclin gene interactions in yeast cell cycle microarray data, with preliminary success in recovering interactions predicted by previous biological knowledge and other analysis techniques. Our goal is to further develop this method in combination with experiments we are performing on bacterial regulatory networks.

  2. C. elegans Metabolic Gene Regulatory Networks Govern the Cellular Economy

    PubMed Central

    Watson, Emma; Walhout, Albertha J.M.

    2014-01-01

    Diet greatly impacts metabolism in health and disease. In response to the presence or absence of specific nutrients, metabolic gene regulatory networks sense the metabolic state of the cell and regulate metabolic flux accordingly, for instance by the transcriptional control of metabolic enzymes. Here we discuss recent insights regarding metazoan metabolic regulatory networks using the nematode Caenorhabditis elegans as a model, including the modular organization of metabolic gene regulatory networks, the prominent impact of diet on the transcriptome and metabolome, specialized roles of nuclear hormone receptors in responding to dietary conditions, regulation of metabolic genes and metabolic regulators by microRNAs, and feedback between metabolic genes and their regulators. PMID:24731597

  3. Functional-Network-Based Gene Set Analysis Using Gene-Ontology

    PubMed Central

    Chang, Billy; Kustra, Rafal; Tian, Weidong

    2013-01-01

    To account for the functional non-equivalence among a set of genes within a biological pathway when performing gene set analysis, we introduce GOGANPA, a network-based gene set analysis method, which up-weights genes with functions relevant to the gene set of interest. Each gene is weighted according to its degree within a genome-scale functional network constructed using the functional annotations available from the gene ontology database. By benchmarking GOGANPA using a well-studied P53 data set and three breast cancer data sets, we demonstrate the power and reproducibility of our proposed method over traditional unweighted approaches and a competing network-based approach that involves a complex integrated network. GOGANPA’s sole reliance on gene ontology further allows GOGANPA to be widely applicable to the analysis of any gene-ontology-annotated genome. PMID:23418449

  4. Identifying disease candidate genes via large-scale gene network analysis.

    PubMed

    Kim, Haseong; Park, Taesung; Gelenbe, Erol

    2014-01-01

    Gene Regulatory Networks (GRN) provide systematic views of complex living systems, offering reliable and large-scale GRNs to identify disease candidate genes. We introduce a reverse engineering technique, Bayesian Model Averaging-based Networks (BMAnet), which ensembles all appropriate linear models to tackle uncertainty in model selection and integrates heterogeneous biological data sets. Using network evaluation metrics, we compare the networks that are thus identified. The metric 'Random walk with restart (Rwr)' is utilised to search for disease genes. In a simulation our method shows better performance than elastic-net and Gaussian graphical models, but topological quantities vary among the three methods. Using real data, brain tumour gene expression samples consisting of non-tumour, grade III and grade IV are analysed to estimate networks with a total of 4422 genes. Based on these networks, 169 brain tumour-related candidate genes were identified and some were found to relate to 'wound', 'apoptosis', and 'cell death' processes. PMID:25796737
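
    The random walk with restart mentioned in the abstract above can be illustrated with a minimal sketch. This is not the BMAnet implementation: the toy adjacency matrix, the restart probability of 0.3, and the function name `random_walk_with_restart` are all assumptions chosen for illustration. Nodes that end up with high steady-state probability are close to the seed (known disease) genes in the network.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seed nodes."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # isolated nodes: avoid division by zero
    W = adj / col_sums                     # column-normalised transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # restart distribution, uniform over seeds
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:  # L1 change small enough: converged
            return p_next
        p = p_next
    return p

# Hypothetical 4-gene path network 0 - 1 - 2 - 3, with gene 0 as the known disease gene.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
scores = random_walk_with_restart(adj, seeds=[0])
ranking = np.argsort(-scores)              # candidate genes ordered by proximity to the seed
```

    In this toy example the scores decrease with distance from the seed, so genes are ranked in path order.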

  5. Identification and functional characterization of cDNAs coding for hydroxybenzoate/hydroxycinnamate glucosyltransferases co-expressed with genes related to proanthocyanidin biosynthesis

    PubMed Central

    Khater, F.; Fournand, D.; Vialet, S.; Meudec, E.; Cheynier, V.; Terrier, N.

    2012-01-01

    Grape proanthocyanidins (PAs) play a major role in the organoleptic properties of wine. They are accumulated mainly in grape skin and seeds during the early stages of berry development. Despite the recent progress in the identification of genes involved in PA biosynthesis, the mechanisms involved in subunit condensation, galloylation, or fine regulation of the spatio-temporal composition of grape berries in PAs are still not elucidated. Two Myb transcription factors, VvMybPA1 and VvMybPA2, controlling the PA pathway have recently been identified and ectopically over-expressed in an homologous system. In addition to already known PA genes, three genes coding for glucosyltransferases were significantly differentially expressed between hairy roots over-expressing VvMybPA1 or VvMybPA2 and control lines. The involvement of these genes in PA biosynthesis metabolism is unclear. The three glucosyltransferases display high sequence similarities with other plant glucosyltransferases able to catalyse the formation of glucose esters, which are important intermediate actors for the synthesis of different phenolic compounds. Studies of the in vitro properties of these three enzymes (Km, Vmax, substrate specificity, pH sensitivity) were performed through production of recombinant proteins in E. coli and demonstrated that they are able to catalyse the formation of 1-O-acyl-Glc esters of phenolic acids but are not active on flavonoids and stilbenes. The transcripts are expressed in the early stages of grape berry development, mainly in the berry skins and seeds. The results presented here suggest that these enzymes could be involved in vivo in PA galloylation or in the synthesis of hydroxycinnamic esters. PMID:22090445

  6. Pathogenic Network Analysis Predicts Candidate Genes for Cervical Cancer

    PubMed Central

    Zhang, Yun-Xia

    2016-01-01

    Purpose. The objective of our study was to predict candidate genes in cervical cancer (CC) using a network-based strategy and to understand the pathogenic process of CC. Methods. A pathogenic network of CC was extracted based on known pathogenic genes (seed genes) and differentially expressed genes (DEGs) between CC and normal controls. Subsequently, cluster analysis was performed to identify the subnetworks in the pathogenic network using ClusterONE. Each gene in the pathogenic network was assigned a weight value, and then candidate genes were obtained based on the weight distribution. Eventually, pathway enrichment analysis for candidate genes was performed. Results. In this work, a total of 330 DEGs were identified between CC and normal controls. From the pathogenic network, 2 intensely connected clusters were extracted, and a total of 52 candidate genes with weight values greater than 0.10 were detected. Among these candidate genes, VIM had the highest weight value. Moreover, candidate genes MMP1, CDC45, and CAT were enriched in the pathways in cancer, cell cycle, and methane metabolism pathways, respectively. Conclusion. Candidate pathogenic genes including MMP1, CDC45, CAT, and VIM might be involved in the pathogenesis of CC. We believe that our results can provide theoretical guidelines for future clinical application. PMID:27034707

  7. Implicit methods for qualitative modeling of gene regulatory networks.

    PubMed

    Garg, Abhishek; Mohanram, Kartik; De Micheli, Giovanni; Xenarios, Ioannis

    2012-01-01

    Advancements in high-throughput technologies to measure increasingly complex biological phenomena at the genomic level are rapidly changing the face of biological research from the single-gene single-protein experimental approach to studying the behavior of a gene in the context of the entire genome (and proteome). This shift in research methodologies has resulted in a new field of network biology that deals with modeling cellular behavior in terms of network structures such as signaling pathways and gene regulatory networks. In these networks, different biological entities such as genes, proteins, and metabolites interact with each other, giving rise to a dynamical system. Even though there exists a mature field of dynamical systems theory to model such network structures, some technical challenges are unique to biology such as the inability to measure precise kinetic information on gene-gene or gene-protein interactions and the need to model increasingly large networks comprising thousands of nodes. These challenges have renewed interest in developing new computational techniques for modeling complex biological systems. This chapter presents a modeling framework based on Boolean algebra and finite-state machines that are reminiscent of the approach used for digital circuit synthesis and simulation in the field of very-large-scale integration (VLSI). The proposed formalism enables a common mathematical framework to develop computational techniques for modeling different aspects of the regulatory networks such as steady-state behavior, stochasticity, and gene perturbation experiments. PMID:21938638
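
    As a small illustration of Boolean modeling of steady states and gene perturbations, the sketch below enumerates the fixed points of a hypothetical three-gene network. The network rules (A and B repress each other, C follows A) are invented for this example, and the explicit enumeration used here is only feasible for tiny networks; it is not the implicit, circuit-style technique the chapter describes.

```python
from itertools import product

def step(state, knockout=None):
    """Synchronous update of a hypothetical 3-gene Boolean network (A, B, C)."""
    a, b, c = state
    nxt = [not b,   # A is repressed by B
           not a,   # B is repressed by A
           a]       # C is activated by A
    if knockout is not None:
        nxt[knockout] = False   # perturbation: the knocked-out gene stays off
    return tuple(nxt)

def steady_states(knockout=None):
    """All states that map to themselves under one update step."""
    return [s for s in product([False, True], repeat=3) if step(s, knockout) == s]

wild_type = steady_states()             # two stable states of the toggle switch
a_knockout = steady_states(knockout=0)  # knocking out A leaves a single stable state
```

    The wild-type network has two steady states (A and C on, or B on), while the simulated knock-out of A removes one of them.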

  8. Origin of Co-Expression Patterns in E.coli and S.cerevisiae Emerging from Reverse Engineering Algorithms

    PubMed Central

    Zampieri, Mattia; Soranzo, Nicola; Bianchini, Daniele; Altafini, Claudio

    2008-01-01

    Background: The concept of reverse engineering a gene network, i.e., of inferring a genome-wide graph of putative gene-gene interactions from compendia of high throughput microarray data has been extensively used in the last few years to deduce/integrate/validate various types of “physical” networks of interactions among genes or gene products. Results: This paper gives a comprehensive overview of which of these networks emerge significantly when reverse engineering large collections of gene expression data for two model organisms, E.coli and S.cerevisiae, without any prior information. For the first organism the pattern of co-expression is shown to reflect in fine detail both the operonal structure of the DNA and the regulatory effects exerted by the gene products when co-participating in a protein complex. For the second organism we find that direct transcriptional control (e.g., transcription factor–binding site interactions) has little statistical significance in comparison to the other regulatory mechanisms (such as co-sharing a protein complex, co-localization on a metabolic pathway or compartment), which are however resolved at a lower level of detail than in E.coli. Conclusion: The gene co-expression patterns deduced from compendia of profiling experiments tend to unveil functional categories that are mainly associated to stable bindings rather than transient interactions. The inference power of this systematic analysis is substantially reduced when passing from E.coli to S.cerevisiae. This extensive analysis provides a way to describe the different complexity between the two organisms and discusses the critical limitations affecting this type of methodologies. PMID:18714358

  9. An Arabidopsis gene network based on the graphical Gaussian model

    PubMed Central

    Ma, Shisong; Gong, Qingqiu; Bohnert, Hans J.

    2007-01-01

    We describe a gene network for the Arabidopsis thaliana transcriptome based on a modified graphical Gaussian model (GGM). Through partial correlation (pcor), GGM infers coregulation patterns between gene pairs conditional on the behavior of other genes. Regularized GGM calculated pcor between gene pairs among ∼2000 input genes at a time. Regularized GGM coupled with iterative random samplings of genes was expanded into a network that covered the Arabidopsis genome (22,266 genes). This resulted in a network of 18,625 interactions (edges) among 6760 genes (nodes) with high confidence and connections representing ∼0.01% of all possible edges. When queried for selected genes, locally coherent subnetworks mainly related to metabolic functions, and stress responses emerged. Examples of networks for biochemical pathways, cell wall metabolism, and cold responses are presented. GGM displayed known coregulation pathways as subnetworks and added novel components to known edges. Finally, the network reconciled individual subnetworks in a topology joined at the whole-genome level and provided a general framework that can instruct future studies on plant metabolism and stress responses. The network model is included. PMID:17921353
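
    The role of partial correlation in a graphical Gaussian model can be shown with a minimal sketch. It assumes far more samples than genes, so the covariance matrix can be inverted directly; the regularized, iteratively sampled procedure of the abstract above is not reproduced here. The simulated three-gene chain and the function name `partial_correlations` are assumptions for illustration.

```python
import numpy as np

def partial_correlations(X):
    """X has shape (samples, genes). Returns the gene-by-gene partial correlation matrix."""
    cov = np.cov(X, rowvar=False)
    prec = np.linalg.inv(cov)              # precision (inverse covariance) matrix
    d = np.sqrt(np.diag(prec))
    pcor = -prec / np.outer(d, d)          # pcor_ij = -prec_ij / sqrt(prec_ii * prec_jj)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Simulated regulatory chain a -> b -> c: a and c are related only through b.
rng = np.random.default_rng(0)
n = 5000
a = rng.normal(size=n)
b = a + 0.5 * rng.normal(size=n)
c = b + 0.5 * rng.normal(size=n)
X = np.column_stack([a, b, c])

corr = np.corrcoef(X, rowvar=False)        # marginal correlations
pcor = partial_correlations(X)             # correlations conditional on the remaining gene
```

    Here `corr[0, 2]` is large because a and c share the intermediate b, while `pcor[0, 2]` is close to zero, so a GGM would not draw a direct a-c edge.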

  10. Hub-Centered Gene Network Reconstruction Using Automatic Relevance Determination

    PubMed Central

    Böck, Matthias; Ogishima, Soichi; Tanaka, Hiroshi; Kramer, Stefan; Kaderali, Lars

    2012-01-01

    Network inference deals with the reconstruction of biological networks from experimental data. A variety of different reverse engineering techniques are available; they differ in the underlying assumptions and mathematical models used. One common problem for all approaches stems from the complexity of the task, due to the combinatorial explosion of different network topologies for increasing network size. To handle this problem, constraints are frequently used, for example on the node degree, number of edges, or constraints on regulation functions between network components. We propose to exploit topological considerations in the inference of gene regulatory networks. Such systems are often controlled by a small number of hub genes, while most other genes have only limited influence on the network's dynamic. We model gene regulation using a Bayesian network with discrete, Boolean nodes. A hierarchical prior is employed to identify hub genes. The first layer of the prior is used to regularize weights on edges emanating from one specific node. A second prior on hyperparameters controls the magnitude of the former regularization for different nodes. The net effect is that central nodes tend to form in reconstructed networks. Network reconstruction is then performed by maximization of or sampling from the posterior distribution. We evaluate our approach on simulated and real experimental data, indicating that we can reconstruct main regulatory interactions from the data. We furthermore compare our approach to other state-of-the art methods, showing superior performance in identifying hubs. Using a large publicly available dataset of over 800 cell cycle regulated genes, we are able to identify several main hub genes. Our method may thus provide a valuable tool to identify interesting candidate genes for further study. Furthermore, the approach presented may stimulate further developments in regularization methods for network reconstruction from data. PMID:22570688

  11. Using Effective Subnetworks to Predict Selected Properties of Gene Networks

    PubMed Central

    Gunaratne, Gemunu H.; Gunaratne, Preethi H.; Seemann, Lars; Török, Andrei

    2010-01-01

    Background: Difficulties associated with implementing gene therapy are caused by the complexity of the underlying regulatory networks. The forms of interactions between the hundreds of genes, proteins, and metabolites in these networks are not known very accurately. An alternative approach is to limit consideration to genes on the network. Steady state measurements of these influence networks can be obtained from DNA microarray experiments. However, since they contain a large number of nodes, the computation of influence networks requires a prohibitively large set of microarray experiments. Furthermore, error estimates of the network make verifiable predictions impossible. Methodology/Principal Findings: Here, we propose an alternative approach. Rather than attempting to derive an accurate model of the network, we ask what questions can be addressed using lower dimensional, highly simplified models. More importantly, is it possible to use such robust features in applications? We first identify a small group of genes that can be used to affect changes in other nodes of the network. The reduced effective empirical subnetwork (EES) can be computed using steady state measurements on a small number of genetically perturbed systems. We show that the EES can be used to make predictions on expression profiles of other mutants, and to compute how to implement pre-specified changes in the steady state of the underlying biological process. These assertions are verified in a synthetic influence network. We also use previously published experimental data to compute the EES associated with an oxygen deprivation network of E.coli, and use it to predict gene expression levels on a double mutant. The predictions are significantly different from the experimental results for less than of genes. Conclusions/Significance: The constraints imposed by gene expression levels of mutants can be used to address a selected set of questions about a gene network. PMID:20949025

  12. Reveal genes functionally associated with ACADS by a network study.

    PubMed

    Chen, Yulong; Su, Zhiguang

    2015-09-15

    Establishing a systematic network aims at finding essential human gene-gene/gene-disease pathways by means of network inter-connecting patterns and functional annotation analysis. In the present study, we have analyzed functional gene interactions of short-chain acyl-coenzyme A dehydrogenase gene (ACADS). ACADS plays a vital role in free fatty acid β-oxidation and regulates energy homeostasis. Modules of highly inter-connected genes in disease-specific ACADS network are derived by integrating gene function and protein interaction data. Among the 8 genes in ACADS web retrieved from both STRING and GeneMANIA, ACADS is effectively conjoined with 4 genes including HADHA, HADHB, ECHS1 and ACAT1. The functional analysis is done via ontological briefing and candidate disease identification. We observed that the highly efficient-interlinked genes connected with ACADS are HADHA, HADHB, ECHS1 and ACAT1. Interestingly, the ontological aspect of genes in the ACADS network reveals that ACADS, HADHA and HADHB play equally vital roles in fatty acid metabolism. The gene ACAT1 together with ACADS participates in ketone metabolism. Our computational gene web analysis also predicts potential candidate disease recognition, thus indicating the involvement of ACADS, HADHA, HADHB, ECHS1 and ACAT1 not only with lipid metabolism but also with infant death syndrome, skeletal myopathy, acute hepatic encephalopathy, Reye-like syndrome, episodic ketosis, and metabolic acidosis. The current study presents a comprehensible layout of ACADS network, its functional strategies and candidate disease approach associated with ACADS network. PMID:26045367

  13. Network analysis of EtOH-related candidate genes.

    PubMed

    Guo, An-Yuan; Sun, Jingchun; Jia, Peilin; Zhao, Zhongming

    2010-05-01

    Recently, we collected many large-scale datasets for alcohol dependence and EtOH response in five organisms and deposited them in our EtOH-related gene resource database (ERGR, http://bioinfo.mc.vanderbilt.edu/ERGR/). Based on multidimensional evidence among these datasets, we prioritized 57 EtOH-related candidate genes. To explore their biological roles, and the molecular mechanisms of EtOH response and alcohol dependence, we examined the features of these genes by the Gene Ontology (GO) term-enrichment test and network/pathway analysis. Our analysis revealed that these candidate genes were highly enriched in alcohol dependence/alcoholism and highly expressed in brain or liver tissues. All the significantly enriched GO terms were related to neurotransmitter systems or EtOH metabolic processes. Using the Ingenuity Pathway Analysis system, we found that these genes were involved in networks of neurological disease, cardiovascular disease, inflammatory response, and small molecular metabolism. Many key genes in signaling pathways were in the central position of these networks. Furthermore, our protein-protein interaction (PPI) network analysis suggested some novel candidate genes which also had evidence in the ERGR database. This study demonstrated that our candidate gene selection is effective and our network/pathway analysis is useful for uncovering the molecular mechanisms of EtOH response and alcohol dependence. This approach can be applied to study the features of candidate genes of other complex traits/phenotypes. PMID:20491071

  14. Identification of Gene Modules Associated with Low Temperatures Response in Bambara Groundnut by Network-Based Analysis.

    PubMed

    Bonthala, Venkata Suresh; Mayes, Katie; Moreton, Joanna; Blythe, Martin; Wright, Victoria; May, Sean Tobias; Massawe, Festo; Mayes, Sean; Twycross, Jamie

    2016-01-01

    Bambara groundnut (Vigna subterranea (L.) Verdc.) is an African legume and is a promising underutilized crop with good seed nutritional values. Low night-time temperatures in a number of African countries, such as Botswana, can affect the growth and development of bambara groundnut, leading to losses in potential crop yield. Therefore, in this study we developed a computational pipeline to identify and analyze the genes and gene modules associated with low temperature stress responses in bambara groundnut using the cross-species microarray technique (as bambara groundnut has no microarray chip) coupled with network-based analysis. Analyses of the bambara groundnut transcriptome using cross-species gene expression data resulted in the identification of 375 and 659 differentially expressed genes (p<0.01) under the sub-optimal (23°C) and very sub-optimal (18°C) temperatures, respectively, of which 110 genes are commonly shared between the two stress conditions. The construction of a Highest Reciprocal Rank-based gene co-expression network, followed by its partition using a Heuristic Cluster Chiseling Algorithm resulted in 6 and 7 gene modules in sub-optimal and very sub-optimal temperature stresses being identified, respectively. Modules of sub-optimal temperature stress are principally enriched with carbohydrate and lipid metabolic processes, while most of the modules of very sub-optimal temperature stress are significantly enriched with responses to stimuli and various metabolic processes. Several transcription factors (from MYB, NAC, WRKY, WHIRLY & GATA classes) that may regulate the downstream genes involved in response to stimulus in order for the plant to withstand very sub-optimal temperature stress were highlighted. The identified gene modules could be useful in breeding for low-temperature stress tolerant bambara groundnut varieties. PMID:26859686

  16. The transfer and transformation of collective network information in gene-matched networks

    PubMed Central

    Kitsukawa, Takashi; Yagi, Takeshi

    2015-01-01

    Networks, such as the human society network, social and professional networks, and biological system networks, contain vast amounts of information. Information signals in networks are distributed over nodes and transmitted through intricately wired links, making the transfer and transformation of such information difficult to follow. Here we introduce a novel method for describing network information and its transfer using a model network, the Gene-matched network (GMN), in which nodes (neurons) possess attributes (genes). In the GMN, nodes are connected according to their expression of common genes. Because neurons have multiple genes, the GMN is cluster-rich. We show that, in the GMN, information transfer and transformation were controlled systematically, according to the activity level of the network. Furthermore, information transfer and transformation could be traced numerically with a vector using genes expressed in the activated neurons, the active-gene array, which was used to assess the relative activity among overlapping neuronal groups. Interestingly, this coding style closely resembles the cell-assembly neural coding theory. The method introduced here could be applied to many real-world networks, since many systems, including human society and various biological systems, can be represented as a network of this type. PMID:26450411
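
    The construction rule described above, in which nodes are linked when they express common genes, can be sketched as follows. The four-neuron example, the `min_shared` threshold, and the count-based definition of the active-gene array are assumptions made for illustration and may differ from the definitions used in the paper.

```python
from itertools import combinations

def build_gmn(node_genes, min_shared=1):
    """Connect every pair of nodes that share at least `min_shared` expressed genes."""
    edges = {}
    for u, v in combinations(sorted(node_genes), 2):
        shared = node_genes[u] & node_genes[v]
        if len(shared) >= min_shared:
            edges[(u, v)] = shared
    return edges

def active_gene_array(active_nodes, node_genes, all_genes):
    """For each gene, count how many currently active nodes express it (assumed definition)."""
    return [sum(g in node_genes[n] for n in active_nodes) for g in all_genes]

# Hypothetical neurons, each expressing two of four genes.
node_genes = {
    "n1": {"g1", "g2"},
    "n2": {"g2", "g3"},
    "n3": {"g3", "g4"},
    "n4": {"g1", "g4"},
}
edges = build_gmn(node_genes)
array = active_gene_array(["n1", "n2"], node_genes, ["g1", "g2", "g3", "g4"])
```

    With neurons n1 and n2 active, gene g2 (shared by both) receives the highest count, which is the kind of vector summary of network activity the abstract refers to.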

  18. Evolutionary and Topological Properties of Genes and Community Structures in Human Gene Regulatory Networks.

    PubMed

    Szedlak, Anthony; Smith, Nicholas; Liu, Li; Paternostro, Giovanni; Piermarocchi, Carlo

    2016-06-01

    The diverse, specialized genes present in today's lifeforms evolved from a common core of ancient, elementary genes. However, these genes did not evolve individually: gene expression is controlled by a complex network of interactions, and alterations in one gene may drive reciprocal changes in its proteins' binding partners. Like many complex networks, these gene regulatory networks (GRNs) are composed of communities, or clusters of genes with relatively high connectivity. A deep understanding of the relationship between the evolutionary history of single genes and the topological properties of the underlying GRN is integral to evolutionary genetics. Here, we show that the topological properties of an acute myeloid leukemia GRN and a general human GRN are strongly coupled with its genes' evolutionary properties. Slowly evolving ("cold"), old genes tend to interact with each other, as do rapidly evolving ("hot"), young genes. This naturally causes genes to segregate into community structures with relatively homogeneous evolutionary histories. We argue that gene duplication placed old, cold genes and communities at the center of the networks, and young, hot genes and communities at the periphery. We demonstrate this with single-node centrality measures and two new measures of efficiency, the set efficiency and the interset efficiency. We conclude that these methods for studying the relationships between a GRN's community structures and its genes' evolutionary properties provide new perspectives for understanding evolutionary genetics. PMID:27359334

  19. Functional Gene Networks: R/Bioc package to generate and analyse gene networks derived from functional enrichment and clustering

    PubMed Central

    Aibar, Sara; Fontanillo, Celia; Droste, Conrad; De Las Rivas, Javier

    2015-01-01

    Summary: Functional Gene Networks (FGNet) is an R/Bioconductor package that generates gene networks derived from the results of functional enrichment analysis (FEA) and annotation clustering. The sets of genes enriched with specific biological terms (obtained from a FEA platform) are transformed into a network by establishing links between genes based on common functional annotations and common clusters. The network provides a new view of FEA results revealing gene modules with similar functions and genes that are related to multiple functions. In addition to building the functional network, FGNet analyses the similarity between the groups of genes and provides a distance heatmap and a bipartite network of functionally overlapping genes. The application includes an interface to directly perform FEA queries using different external tools: DAVID, GeneTerm Linker, TopGO or GAGE; and a graphical interface to facilitate the use. Availability and implementation: FGNet is available in Bioconductor, including a tutorial. URL: http://bioconductor.org/packages/release/bioc/html/FGNet.html Contact: jrivas@usal.es Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25600944

  20. Inferring slowly-changing dynamic gene-regulatory networks

    PubMed Central

    2015-01-01

    Dynamic gene-regulatory networks are complex since the interaction patterns between their components mean that it is impossible to study parts of the network in isolation. This holistic character of gene-regulatory networks poses a real challenge to any type of modelling. Graphical models are a class of models that connect the network with conditional independence relationships between random variables. By interpreting these random variables as gene activities and the conditional independence relationships as functional non-relatedness, graphical models have been used to describe gene-regulatory networks. Whereas the literature has been focused on static networks, most time-course experiments are designed in order to tease out temporal changes in the underlying network. It is typically reasonable to assume that changes in genomic networks are few, because biological systems tend to be stable. We introduce a new model for estimating slow changes in dynamic gene-regulatory networks, which is suitable for high-dimensional data, e.g. time-course microarray data. Our aim is to estimate a dynamically changing genomic network based on temporal activity measurements of the genes in the network. Our method is based on the penalized likelihood with ℓ1-norm, which penalizes conditional dependencies between genes as well as differences between conditional independence elements across time points. We also present a heuristic search strategy to find optimal tuning parameters. We re-write the penalized maximum likelihood problem into a standard convex optimization problem subject to linear equality constraints. We show that our method performs well in simulation studies. Finally, we apply the proposed model to a time-course T-cell dataset. PMID:25917062
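
    One common way to write a penalized likelihood of the kind described above is given below. It is a generic sketch, not necessarily the authors' exact formulation: the symbols are assumptions, with \Theta_t the precision matrix at time point t, S_t the sample covariance at that time point, \lambda_1 controlling sparsity of conditional dependencies, and \lambda_2 penalizing changes between consecutive time points.

```latex
\[
\{\hat{\Theta}_t\}_{t=1}^{T}
  = \arg\max_{\Theta_1,\dots,\Theta_T \succ 0}
    \sum_{t=1}^{T} \bigl[\log\det\Theta_t - \operatorname{tr}(S_t\Theta_t)\bigr]
    - \lambda_1 \sum_{t=1}^{T} \sum_{i \neq j} \lvert\theta_{t,ij}\rvert
    - \lambda_2 \sum_{t=2}^{T} \sum_{i \neq j} \lvert\theta_{t,ij} - \theta_{t-1,ij}\rvert
\]
```

    A large \lambda_2 forces most entries to stay constant over time, which matches the assumption that changes in the network are few.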

  1. GINI: From ISH Images to Gene Interaction Networks

    PubMed Central

    Puniyani, Kriti; Xing, Eric P.

    2013-01-01

    Accurate inference of molecular and functional interactions among genes, especially in multicellular organisms such as Drosophila, often requires statistical analysis of correlations not only between the magnitudes of gene expressions, but also between their temporal-spatial patterns. The ISH (in-situ-hybridization)-based gene expression micro-imaging technology offers an effective approach to perform large-scale spatial-temporal profiling of whole-body mRNA abundance. However, analytical tools for discovering gene interactions from such data remain an open challenge due to various reasons, including difficulties in extracting canonical representations of gene activities from images, and in inference of statistically meaningful networks from such representations. In this paper, we present GINI, a machine learning system for inferring gene interaction networks from Drosophila embryonic ISH images. GINI builds on a computer-vision-inspired vector-space representation of the spatial pattern of gene expression in ISH images, enabled by our recently developed system; and a new multi-instance-kernel algorithm that learns a sparse Markov network model, in which, every gene (i.e., node) in the network is represented by a vector-valued spatial pattern rather than a scalar-valued gene intensity as in conventional approaches such as a Gaussian graphical model. By capturing the notion of spatial similarity of gene expression, and at the same time properly taking into account the presence of multiple images per gene via multi-instance kernels, GINI is well-positioned to infer statistically sound, and biologically meaningful gene interaction networks from image data. Using both synthetic data and a small manually curated data set, we demonstrate the effectiveness of our approach in network building. Furthermore, we report results on a large publicly available collection of Drosophila embryonic ISH images from the Berkeley Drosophila Genome Project, where GINI makes novel and

  2. Time-Delayed Models of Gene Regulatory Networks

    PubMed Central

    Parmar, K.; Blyuss, K. B.; Kyrychko, Y. N.; Hogan, S. J.

    2015-01-01

    We discuss different mathematical models of gene regulatory networks as relevant to the onset and development of cancer. After discussion of alternative modelling approaches, we use a paradigmatic two-gene network to focus on the role played by time delays in the dynamics of gene regulatory networks. We contrast the dynamics of the reduced model arising in the limit of fast mRNA dynamics with that of the full model. The review concludes with the discussion of some open problems. PMID:26576197
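
    As a rough illustration of how a time delay enters a two-gene model, the sketch below integrates a pair of mutually repressing genes with a fixed-step Euler scheme. The Hill-type repression terms, the unit degradation rate, and every name and parameter (`simulate_two_gene`, `alpha`, `hill`, `delay`) are assumptions made for this example; they are not the equations analysed in the review.

```python
def simulate_two_gene(alpha, hill, delay, dt, t_end, x0, y0):
    """Euler integration of an illustrative delayed two-gene repressor pair.

    Assumed (hypothetical) model, not taken from the review:
        dx/dt = alpha / (1 + y(t - delay)**hill) - x(t)
        dy/dt = alpha / (1 + x(t - delay)**hill) - y(t)
    The history on [-delay, 0] is held constant at (x0, y0).
    """
    lag = int(round(delay / dt))          # delay expressed in time steps
    xs = [x0] * (lag + 1)                 # constant pre-history
    ys = [y0] * (lag + 1)
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        x_now, y_now = xs[-1], ys[-1]
        x_lag, y_lag = xs[-1 - lag], ys[-1 - lag]   # delayed values
        dx = alpha / (1.0 + y_lag ** hill) - x_now
        dy = alpha / (1.0 + x_lag ** hill) - y_now
        xs.append(x_now + dt * dx)
        ys.append(y_now + dt * dy)
    # Drop the pre-history so index 0 corresponds to t = 0.
    return xs[lag:], ys[lag:]


if __name__ == "__main__":
    xs, ys = simulate_two_gene(alpha=3.0, hill=2, delay=1.0,
                               dt=0.01, t_end=20.0, x0=0.2, y0=1.5)
    print(xs[-1], ys[-1])
```

    Setting `delay=0.0` recovers the corresponding model without delay, so the two trajectories can be compared directly.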

  3. Co-expression of the carbamoyl-phosphate synthase 1 gene and its long non-coding RNA correlates with poor prognosis of patients with intrahepatic cholangiocarcinoma

    PubMed Central

    MA, SEN-LIN; LI, AI-JUN; HU, ZHAO-YANG; SHANG, FU-SHENG; WU, MENG-CHAO

    2015-01-01

    The mechanisms leading to high rates of malignancy and recurrence of human intrahepatic cholangiocarcinoma (ICC) remain unclear. It is difficult to diagnose and assess the prognosis of patients with ICC in the clinic due to the lack of specific biomarkers. In addition, long non-coding RNAs (lncRNAs) have been reported to serve important roles in certain types of tumorigenesis; however, a role in ICC remains to be reported. The aim of the current study was to screen for genes and lncRNAs that are abnormally expressed in ICC and to investigate their biological and clinicopathological significance in ICC. The global gene and lncRNA expression profiles in ICC were measured using bioinformatics analysis. Carbamoyl-phosphate synthase 1 (CPS1) and its lncRNA CPS1 intronic transcript 1 (CPS1-IT1) were observed to be upregulated in ICC. The expression of CPS1 and CPS1-IT1 was measured in 31 tissue samples from patients with ICC and a number of cell lines. The effects of CPS1 and CPS1-IT1 on the proliferation and apoptosis of the ICC-9810 cell line were measured. In addition, the clinicopathological features and survival rates of patients with ICC with respect to the gene and lncRNA expression status were analyzed. CPS1 and CPS1-IT1 were co-upregulated in ICC tissues compared with non-cancerous tissues. Knockdown of CPS1 and/or CPS1-IT1 reduced the proliferation and increased the apoptosis of ICC-9810 cells. Additionally, clinical analysis indicated that CPS1 and CPS1-IT1 were associated with poor liver function and reduced survival rates when the relative expression values were greater than 4 in cancer tissues. The comparisons between the high CPS1 expression group and the low expression group indicated significant differences in international normalized ratio (P=0.048), total protein (P=0.049), indirect bilirubin (P=0.025), alkaline phosphatase (P=0.003) and disease-free survival (P=0.034). In addition, there were differential trends in CA19-9 (P=0.068), globulin (P=0

  4. Phenotype accessibility and noise in random threshold gene regulatory networks.

    PubMed

    Pinho, Ricardo; Garcia, Victor; Feldman, Marcus W

    2014-01-01

    Evolution requires phenotypic variation in a population of organisms for selection to function. Gene regulatory processes involved in organismal development affect the phenotypic diversity of organisms. Since only a fraction of all possible phenotypes are predicted to be accessed by the end of development, organisms may evolve strategies to use environmental cues and noise-like fluctuations to produce additional phenotypic diversity, and hence to enhance the speed of adaptation. We used a generic model of organismal development --gene regulatory networks-- to investigate how different levels of noise on gene expression states (i.e. phenotypes) may affect access to new, unique phenotypes, thereby affecting phenotypic diversity. We studied additional strategies that organisms might adopt to attain larger phenotypic diversity: either by augmenting their genome or the number of gene expression states. This was done for different types of gene regulatory networks that allow for distinct levels of regulatory influence on gene expression or are more likely to give rise to stable phenotypes. We found that if gene expression is binary, increasing noise levels generally decreases phenotype accessibility for all network types studied. If more gene expression states are considered, noise can moderately enhance the speed of discovery if three or four gene expression states are allowed, and if there are enough distinct regulatory networks in the population. These results were independent of the network types analyzed, and were robust to different implementations of noise. Hence, for noise to increase the number of accessible phenotypes in gene regulatory networks, very specific conditions need to be satisfied. If the number of distinct regulatory networks involved in organismal development is large enough, and the acquisition of more genes or fine tuning of their expression states proves costly to the organism, noise can be useful in allowing access to more unique phenotypes
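
    The kind of experiment described above can be sketched with a small random threshold network. This is a minimal illustration under stated assumptions: the weights in {-1, 0, +1}, the zero threshold with "keep the current state on a tie", and the flip-with-probability noise are all choices made for this example, and the function names are hypothetical. The abstract does not specify the authors' exact update rule or noise implementation.

```python
import random


def random_network(n_genes, density, rng):
    """Random regulatory matrix; each entry is -1 or +1 with probability
    `density`, otherwise 0 (assumed construction, for illustration only)."""
    return [[rng.choice((-1, 1)) if rng.random() < density else 0
             for _ in range(n_genes)]
            for _ in range(n_genes)]


def step(weights, state, noise, rng):
    """One synchronous update of all binary gene states, then noise."""
    new_state = []
    for i, row in enumerate(weights):
        total = sum(w * s for w, s in zip(row, state))
        if total > 0:
            value = 1
        elif total < 0:
            value = 0
        else:
            value = state[i]              # tie: keep the current state
        if rng.random() < noise:          # noise flips the expression state
            value = 1 - value
        new_state.append(value)
    return tuple(new_state)


def develop(weights, initial, n_steps, noise, rng):
    """Run 'development' for n_steps; the final state is the phenotype."""
    state = tuple(initial)
    for _ in range(n_steps):
        state = step(weights, state, noise, rng)
    return state


def accessible_phenotypes(weights, initial, n_steps, noise, n_runs, rng):
    """Set of distinct phenotypes reached over repeated noisy runs."""
    return {develop(weights, initial, n_steps, noise, rng)
            for _ in range(n_runs)}


if __name__ == "__main__":
    rng = random.Random(1)
    w = random_network(8, 0.4, rng)
    start = (1, 0, 1, 0, 1, 0, 1, 0)
    for noise in (0.0, 0.05, 0.2):
        n = len(accessible_phenotypes(w, start, 20, noise, 200, rng))
        print(noise, n)
```

    With `noise=0.0` the dynamics are deterministic, so a single network and starting state always yield exactly one phenotype; counting distinct end states at higher noise levels gives a crude accessibility measure for that one network.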

  5. Phenotype Accessibility and Noise in Random Threshold Gene Regulatory Networks

    PubMed Central

    Feldman, Marcus W.

    2015-01-01

    Evolution requires phenotypic variation in a population of organisms for selection to function. Gene regulatory processes involved in organismal development affect the phenotypic diversity of organisms. Since only a fraction of all possible phenotypes are predicted to be accessed by the end of development, organisms may evolve strategies to use environmental cues and noise-like fluctuations to produce additional phenotypic diversity, and hence to enhance the speed of adaptation. We used a generic model of organismal development --gene regulatory networks-- to investigate how different levels of noise on gene expression states (i.e. phenotypes) may affect access to new, unique phenotypes, thereby affecting phenotypic diversity. We studied additional strategies that organisms might adopt to attain larger phenotypic diversity: either by augmenting their genome or the number of gene expression states. This was done for different types of gene regulatory networks that allow for distinct levels of regulatory influence on gene expression or are more likely to give rise to stable phenotypes. We found that if gene expression is binary, increasing noise levels generally decreases phenotype accessibility for all network types studied. If more gene expression states are considered, noise can moderately enhance the speed of discovery if three or four gene expression states are allowed, and if there are enough distinct regulatory networks in the population. These results were independent of the network types analyzed, and were robust to different implementations of noise. Hence, for noise to increase the number of accessible phenotypes in gene regulatory networks, very specific conditions need to be satisfied. If the number of distinct regulatory networks involved in organismal development is large enough, and the acquisition of more genes or fine tuning of their expression states proves costly to the organism, noise can be useful in allowing access to more unique phenotypes

  6. The propagation of perturbations in rewired bacterial gene networks

    PubMed Central

    Baumstark, Rebecca; Hänzelmann, Sonja; Tsuru, Saburo; Schaerli, Yolanda; Francesconi, Mirko; Mancuso, Francesco M.; Castelo, Robert; Isalan, Mark

    2015-01-01

    What happens to gene expression when you add new links to a gene regulatory network? To answer this question, we profile 85 network rewirings in E. coli. Here we report that concerted patterns of differential expression propagate from reconnected hub genes. The rewirings link promoter regions to different transcription factor and σ-factor genes, resulting in perturbations that span four orders of magnitude, changing up to ∼70% of the transcriptome. Importantly, factor connectivity and promoter activity both associate with perturbation size. Perturbations from related rewirings have more similar transcription profiles and a statistical analysis reveals ∼20 underlying states of the system, associating particular gene groups with rewiring constructs. We examine two large clusters (ribosomal and flagellar genes) in detail. These represent alternative global outcomes from different rewirings because of antagonism between these major cell states. This data set of systematically related perturbations enables reverse engineering and discovery of underlying network interactions. PMID:26670742

  7. Regulatory gene networks and the properties of the developmental process

    NASA Technical Reports Server (NTRS)

    Davidson, Eric H.; McClay, David R.; Hood, Leroy

    2003-01-01

    Genomic instructions for development are encoded in arrays of regulatory DNA. These specify large networks of interactions among genes producing transcription factors and signaling components. The architecture of such networks both explains and predicts developmental phenomenology. Although network analysis is yet in its early stages, some fundamental commonalities are already emerging. Two such are the use of multigenic feedback loops to ensure the progressivity of developmental regulatory states and the prevalence of repressive regulatory interactions in spatial control processes. Gene regulatory networks make it possible to explain the process of development in causal terms and eventually will enable the redesign of developmental regulatory circuitry to achieve different outcomes.

  8. A genetically engineered live-attenuated simian-human immunodeficiency virus that co-expresses the RANTES gene improves the magnitude of cellular immunity in rhesus macaques

    SciTech Connect

    Shimizu, Yuya; Inaba, Katsuhisa; Kaneyasu, Kentaro; Ibuki, Kentaro; Himeno, Ai; Okoba, Masashi; Goto, Yoshitaka; Hayami, Masanori; Miura, Tomoyuki; Haga, Takeshi (E-mail: a0d518u@cc.miyazaki-u.ac.jp)

    2007-04-25

    Regulated-on-activation-normal-T-cell-expressed-and-secreted (RANTES), a CC-chemokine, enhances antigen-specific T helper (Th) type-1 responses against HIV-1. To evaluate the adjuvant effects of RANTES on an HIV vaccine candidate in SHIV-macaque models, we genetically engineered a live-attenuated SHIV to express the RANTES gene (SHIV-RANTES) and characterized the virus's properties in vivo. After vaccination, the plasma viral loads were the same in the SHIV-RANTES-inoculated monkeys and the parental nef-deleted SHIV (SHIV-NI)-inoculated monkeys. SHIV-RANTES provided some immunity in monkeys by remarkably increasing the antigen-specific CD4+ Th cell-proliferative response and by inducing an antigen-specific IFN-γ ELISpot response. The magnitude of the immunity in SHIV-RANTES-immunized animals, however, failed to afford greater protection against a heterologous pathogenic SHIV (SHIV-C2/1) challenge compared to control SHIV-NI-immunized animals. SHIV-RANTES-immunized monkeys elicited robust cellular CD4+ Th responses and IFN-γ ELISpot responses after SHIV-C2/1 challenge. These findings suggest that the chemokine RANTES can augment vaccine-elicited, HIV-specific CD4+ T cell responses.

  9. PoplarGene: poplar gene network and resource for mining functional information for genes from woody plants

    PubMed Central

    Liu, Qi; Ding, Changjun; Chu, Yanguang; Chen, Jiafei; Zhang, Weixi; Zhang, Bingyu; Huang, Qinjun; Su, Xiaohua

    2016-01-01

    Poplar is not only an important resource for the production of paper, timber and other wood-based products, but it has also emerged as an ideal model system for studying woody plants. To better understand the biological processes underlying various traits in poplar, e.g., wood development, a comprehensive functional gene interaction network is much needed. Here, we constructed a genome-wide functional gene network for poplar (covering ~70% of the 41,335 poplar genes) and created the network web service PoplarGene, offering comprehensive functional interactions and extensive poplar gene functional annotations. PoplarGene incorporates two network-based gene prioritization algorithms, neighborhood-based prioritization and context-based prioritization, which can be used to perform gene prioritization in a complementary manner. Furthermore, the co-functional information in PoplarGene can be applied to other woody plant proteomes with high efficiency via orthology transfer. In addition to poplar gene sequences, the webserver also accepts Arabidopsis reference genes as input to guide the search for novel candidate functional genes in PoplarGene. We believe that PoplarGene (http://bioinformatics.caf.ac.cn/PoplarGene and http://124.127.201.25/PoplarGene) will greatly benefit the research community, facilitating studies of poplar and other woody plants. PMID:27515999
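
    One common reading of neighborhood-based prioritization is guilt by association: a candidate gene is scored by the total weight of its network edges to a set of known "seed" genes. The sketch below illustrates that general idea only. It is not PoplarGene's algorithm, which the abstract does not describe; the function name, gene identifiers, and edge weights are hypothetical.

```python
def neighborhood_scores(edges, seeds):
    """Rank non-seed genes by summed edge weight to seed genes.

    edges: dict mapping an undirected gene pair (a, b) to a weight.
    seeds: iterable of genes already known to be involved in the trait.
    Returns a list of (gene, score) sorted from highest to lowest score.
    """
    seeds = set(seeds)
    scores = {}
    for (a, b), weight in edges.items():
        # Only edges linking a seed to a non-seed contribute to a score.
        if a in seeds and b not in seeds:
            scores[b] = scores.get(b, 0.0) + weight
        elif b in seeds and a not in seeds:
            scores[a] = scores.get(a, 0.0) + weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy network; identifiers and weights are made up.
    toy_edges = {
        ("g1", "g2"): 1.0,
        ("g1", "g3"): 0.25,
        ("g2", "g3"): 0.5,
        ("g2", "g5"): 0.5,
        ("g3", "g4"): 0.75,
    }
    print(neighborhood_scores(toy_edges, seeds=["g1", "g2"]))
```

    Genes with no direct link to any seed (here `g4`) receive no score at all, which is one reason a neighborhood-based ranking is usefully complemented by a second, context-based one.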

  10. PoplarGene: poplar gene network and resource for mining functional information for genes from woody plants.

    PubMed

    Liu, Qi; Ding, Changjun; Chu, Yanguang; Chen, Jiafei; Zhang, Weixi; Zhang, Bingyu; Huang, Qinjun; Su, Xiaohua

    2016-01-01

    Poplar is not only an important resource for the production of paper, timber and other wood-based products, but it has also emerged as an ideal model system for studying woody plants. To better understand the biological processes underlying various traits in poplar, e.g., wood development, a comprehensive functional gene interaction network is much needed. Here, we constructed a genome-wide functional gene network for poplar (covering ~70% of the 41,335 poplar genes) and created the network web service PoplarGene, offering comprehensive functional interactions and extensive poplar gene functional annotations. PoplarGene incorporates two network-based gene prioritization algorithms, neighborhood-based prioritization and context-based prioritization, which can be used to perform gene prioritization in a complementary manner. Furthermore, the co-functional information in PoplarGene can be applied to other woody plant proteomes with high efficiency via orthology transfer. In addition to poplar gene sequences, the webserver also accepts Arabidopsis reference genes as input to guide the search for novel candidate functional genes in PoplarGene. We believe that PoplarGene (http://bioinformatics.caf.ac.cn/PoplarGene and http://124.127.201.25/PoplarGene) will greatly benefit the research community, facilitating studies of poplar and other woody plants. PMID:27515999

  11. In silico network topology-based prediction of gene essentiality

    NASA Astrophysics Data System (ADS)

    da Silva, João Paulo Müller; Acencio, Marcio Luis; Mombach, José Carlos Merino; Vieira, Renata; da Silva, José Camargo; Lemke, Ney; Sinigaglia, Marialva

    2008-02-01

    The identification of genes essential for survival is important for the understanding of the minimal requirements for cellular life and for drug design. As experimental studies with the purpose of building a catalog of essential genes for a given organism are time-consuming and laborious, a computational approach which could predict gene essentiality with high accuracy would be of great value. We present here a novel computational approach, called NTPGE (Network Topology-based Prediction of Gene Essentiality), that relies on the network topology features of a gene to estimate its essentiality. The first step of NTPGE is to construct the integrated molecular network for a given organism comprising protein physical, metabolic and transcriptional regulation interactions. The second step consists in training a decision-tree-based machine-learning algorithm on known essential and non-essential genes of the organism of interest, considering as learning attributes the network topology information for each of these