Mallik, Saurav; Zhao, Zhongming
2017-12-28
For transcriptomic analysis, there are numerous microarray-based genomic data sets, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through a rule-based methodology, which makes it well suited to finding cause-effect relationships between transcripts. In this work, we introduce two new rule-based similarity measures (weighted rank-based Jaccard and Cosine measures) and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers, which consists of both singular and complex markers, depends on the corresponding condensed gene sets in either the antecedent or the consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule pair, and the resultant scores were used for clustering to identify the co-expressed rule modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.
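As a rough illustration of scoring similarity between a pair of rules, the sketch below computes the plain (unweighted) Jaccard and cosine similarities between the item sets of two rules. It is not the weighted rank-based variant proposed in the abstract, whose weighting scheme is not given here; the item labels such as `GENE_A+` are hypothetical.

```python
import math


def jaccard(items_a, items_b):
    """Plain Jaccard similarity: |A & B| / |A | B|."""
    a, b = set(items_a), set(items_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(items_a, items_b):
    """Set-based cosine similarity: |A & B| / sqrt(|A| * |B|)."""
    a, b = set(items_a), set(items_b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


# Hypothetical rules, each flattened to the set of items it mentions.
rule_1 = {"GENE_A+", "GENE_B-", "GENE_C+"}
rule_2 = {"GENE_B-", "GENE_C+", "GENE_D+"}

print(jaccard(rule_1, rule_2))  # 2 shared items out of 4 distinct ones
print(cosine(rule_1, rule_2))   # 2 shared items over sqrt(3 * 3)
```

A pairwise matrix of such scores could then be passed to any clustering routine to group rules into modules.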
Ping, Yanyan; Deng, Yulan; Wang, Li; Zhang, Hongyi; Zhang, Yong; Xu, Chaohan; Zhao, Hongying; Fan, Huihui; Yu, Fulong; Xiao, Yun; Li, Xia
2015-01-01
The driver genetic aberrations collectively regulate core cellular processes underlying cancer development. However, identifying the modules of driver genetic alterations and characterizing their functional mechanisms are still major challenges for cancer studies. Here, we developed an integrative multi-omics method CMDD to identify the driver modules and their affecting dysregulated genes through characterizing genetic alteration-induced dysregulated networks. Applied to glioblastoma (GBM), the CMDD identified a core gene module of 17 genes, including seven known GBM drivers, and their dysregulated genes. The module showed significant association with shorter survival of GBM. When classifying driver genes in the module into two gene sets according to their genetic alteration patterns, we found that one gene set directly participated in the glioma pathway, while the other indirectly regulated the glioma pathway, mostly, via their dysregulated genes. Both of the two gene sets were significant contributors to survival and helpful for classifying GBM subtypes, suggesting their critical roles in GBM pathogenesis. Also, by applying the CMDD to other six cancers, we identified some novel core modules associated with overall survival of patients. Together, these results demonstrate integrative multi-omics data can identify driver modules and uncover their dysregulated genes, which is useful for interpreting cancer genome. PMID:25653168
2013-01-01
Background Differential gene expression (DGE) analysis is commonly used to reveal the deregulated molecular mechanisms of complex diseases. However, traditional DGE analysis (e.g., the t test or the rank sum test) tests each gene independently without considering interactions between them. Top-ranked differentially regulated genes prioritized by the analysis may not directly relate to the coherent molecular changes underlying complex diseases. Joint analyses of co-expression and DGE have been applied to reveal the deregulated molecular modules underlying complex diseases. Most of these methods consist of separate steps: first to identify gene-gene relationships under the studied phenotype, then to integrate them with gene expression changes for prioritizing signature genes, or vice versa. A method is warranted that can simultaneously consider gene-gene co-expression strength and the corresponding expression level changes so that both types of information can be leveraged optimally. Results In this paper, we develop a gene module based method for differential gene expression analysis, named network-based differential gene expression (nDGE) analysis, a one-step integrative process for prioritizing deregulated genes and grouping them into gene modules. We demonstrate that nDGE outperforms existing methods in prioritizing deregulated genes and discovering deregulated gene modules using simulated data sets. When tested on a series of smoker and non-smoker lung adenocarcinoma data sets, we show that top differentially regulated genes identified by the rank sum test in different sets are not consistent, while top-ranked genes defined by nDGE in different data sets significantly overlap. nDGE results suggest that a differentially regulated gene module, which is enriched for cell cycle related genes and E2F1 targeted genes, plays a role in the molecular differences between smoker and non-smoker lung adenocarcinoma.
Conclusions In this paper, we develop nDGE to prioritize deregulated genes and group them into gene modules by simultaneously considering gene expression level changes and gene-gene co-regulations. When applied to both simulated and empirical data, nDGE outperforms the traditional DGE method. More specifically, when applied to smoker and non-smoker lung cancer sets, nDGE results illustrate the molecular differences between smoker and non-smoker lung cancer. PMID:24341432
Hsiao, Tzu-Hung; Chiu, Yu-Chiao; Hsu, Pei-Yin; Lu, Tzu-Pin; Lai, Liang-Chuan; Tsai, Mong-Hsun; Huang, Tim H.-M.; Chuang, Eric Y.; Chen, Yidong
2016-01-01
Several mutual information (MI)-based algorithms have been developed to identify dynamic gene-gene and function-function interactions governed by key modulators (genes, proteins, etc.). Due to intensive computation, however, these methods rely heavily on prior knowledge and are limited in genome-wide analysis. We present the modulated gene/gene set interaction (MAGIC) analysis to systematically identify genome-wide modulation of interaction networks. Based on a novel statistical test employing conjugate Fisher transformations of correlation coefficients, MAGIC features fast computation and adaption to variations of clinical cohorts. In simulated datasets MAGIC achieved greatly improved computation efficiency and overall superior performance than the MI-based method. We applied MAGIC to construct the estrogen receptor (ER) modulated gene and gene set (representing biological function) interaction networks in breast cancer. Several novel interaction hubs and functional interactions were discovered. ER+ dependent interaction between TGFβ and NFκB was further shown to be associated with patient survival. The findings were verified in independent datasets. Using MAGIC, we also assessed the essential roles of ER modulation in another hormonal cancer, ovarian cancer. Overall, MAGIC is a systematic framework for comprehensively identifying and constructing the modulated interaction networks in a whole-genome landscape. MATLAB implementation of MAGIC is available for academic uses at https://github.com/chiuyc/MAGIC. PMID:26972162
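To make the idea of testing whether a gene-gene correlation differs between two modulator states more concrete, here is a minimal sketch of the classical Fisher z-test for two independent correlation coefficients. This is the textbook test, not necessarily the "conjugate Fisher transformation" statistic used by MAGIC; the function names and the example numbers are assumptions.

```python
import math


def fisher_z(r):
    """Fisher transformation of a correlation coefficient (|r| < 1)."""
    return math.atanh(r)


def compare_correlations(r1, n1, r2, n2):
    """Two-sided z-test for a difference between two independent correlations.

    r1, r2: sample correlations in the two groups (e.g. ER+ and ER- samples).
    n1, n2: sample sizes of the two groups (each must exceed 3).
    """
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical gene pair: strongly correlated in one group, weakly in the other.
z, p = compare_correlations(0.8, 103, 0.2, 103)
print(z, p)
```

A pair with a small p-value would be a candidate for a modulated interaction, subject to multiple-testing correction across all pairs.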
Bian, Zhong-Rui; Yin, Juan; Sun, Wen; Lin, Dian-Jie
2017-04-01
Diagnosis of active tuberculosis (TB) is challenging, and treatment response is also difficult to monitor efficiently. The aim of this study was to apply an integrated microarray and network-based analysis to samples from publicly available datasets to obtain a diagnostic module set and pathways in active TB. Towards this goal, a background protein-protein interaction (PPI) network was generated based on global PPI information and gene expression data, followed by identification of a differential expression network (DEN) from the background PPI network. Then, ego genes were extracted according to the degree features in the DEN. Next, module collection was conducted by ego gene expansion based on the EgoNet algorithm. After that, differential expression of modules between active TB and controls was evaluated using a random permutation test. Finally, the biological significance of differential modules was assessed by pathway enrichment analysis based on the Reactome database, and Fisher's exact test was implemented to extract differential pathways for active TB. In total, 47 ego genes and 47 candidate modules were identified from the DEN. By setting the cutoff criteria of gene size >5 and classification accuracy ≥0.9, 7 ego modules (Module 4, Module 7, Module 9, Module 19, Module 25, Module 38 and Module 43) were extracted, and all of them differed significantly between active TB and controls. Then, Fisher's exact test was conducted to capture differential pathways for active TB. Interestingly, genes in Module 4, Module 25, Module 38, and Module 43 were enriched in the same pathway, formation of a pool of free 40S subunits. The significant pathway for Module 7 and Module 9 was eukaryotic translation termination, and for Module 19 it was nonsense mediated decay enhanced by the exon junction complex (EJC).
Accordingly, differential modules and pathways might be potential biomarkers for treating active TB, and provide valuable clues for better understanding of the molecular mechanism of active TB. Copyright © 2017 Elsevier Ltd. All rights reserved.
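The pathway enrichment step mentioned above can be sketched as a one-sided Fisher's exact test, which is equivalent to a hypergeometric upper-tail probability. The abstract does not give the exact test configuration, so the one-sided form, the function name and the counts below are assumptions for illustration only.

```python
from math import comb


def enrichment_p_value(module_size, pathway_size, overlap, background_size):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= overlap), where X is the number of pathway genes found
    when drawing module_size genes at random from background_size genes,
    of which pathway_size belong to the pathway.
    """
    max_overlap = min(module_size, pathway_size)
    total = comb(background_size, module_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(background_size - pathway_size, module_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: a 10-gene module sharing 6 genes with a 50-gene pathway,
# against a background of 5000 genes.
print(enrichment_p_value(10, 50, 6, 5000))
```

In practice the resulting p-values would be corrected for the number of pathways tested.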
Prior knowledge based mining functional modules from Yeast PPI networks with gene ontology
2010-01-01
Background In the literature, there are many algorithmic approaches for identifying functional modules in protein-protein interaction (PPI) networks. Because of the accumulation of large-scale interaction data on multiple organisms and the interaction data not yet recorded in existing PPI databases, there is still an urgent need to design novel computational techniques that can correctly and scalably analyze interaction data sets. Indeed, there are a number of large-scale biological data sets providing indirect evidence for protein-protein interaction relationships. Results The main aim of this paper is to present a prior knowledge based mining strategy to identify functional modules from PPI networks with the aid of Gene Ontology. A higher similarity value in Gene Ontology means that two gene products are more functionally related to each other, so it is better to group such gene products into one functional module. We study (i) how to encode the functional pairs into the existing PPI networks; and (ii) how to use these functional pairs as pairwise constraints to supervise the existing functional module identification algorithms. A topology-based modularity metric and complex annotation in MIPS are used to evaluate the functional modules identified by these two approaches. Conclusions The experimental results on Yeast PPI networks and GO have shown that the prior knowledge based learning methods perform better than the existing algorithms. PMID:21172053
Yu, Liang; Wang, Bingbo; Ma, Xiaoke; Gao, Lin
2016-12-23
Extracting drug-disease correlations is crucial in unveiling disease mechanisms, as well as discovering new indications of available drugs, or drug repositioning. Both the interactome and the knowledge of disease-associated and drug-associated genes remain incomplete. We present a new method to predict the associations between drugs and diseases. Our method is based on a module distance, which is originally proposed to calculate distances between modules in incomplete human interactome. We first map all the disease genes and drug genes to a combined protein interaction network. Then based on the module distance, we calculate the distances between drug gene sets and disease gene sets, and take the distances as the relationships of drug-disease pairs. We also filter possible false positive drug-disease correlations by p-value. Finally, we validate the top-100 drug-disease associations related to six drugs in the predicted results. The overlapping between our predicted correlations with those reported in Comparative Toxicogenomics Database (CTD) and literatures, and their enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways demonstrate our approach can not only effectively identify new drug indications, but also provide new insight into drug-disease discovery.
Zhang, Wensheng; Edwards, Andrea; Fan, Wei; Zhu, Dongxiao; Zhang, Kun
2010-06-22
Comparative analysis of gene expression profiling of multiple biological categories, such as different species of organisms or different kinds of tissue, promises to enhance the fundamental understanding of the universality as well as the specialization of mechanisms and related biological themes. Grouping genes with a similar expression pattern or exhibiting co-expression together is a starting point in understanding and analyzing gene expression data. In recent literature, gene module level analysis is advocated in order to understand biological network design and system behaviors in disease and life processes; however, practical difficulties often lie in the implementation of existing methods. Using the singular value decomposition (SVD) technique, we developed a new computational tool, named svdPPCS (SVD-based Pattern Pairing and Chart Splitting), to identify conserved and divergent co-expression modules of two sets of microarray experiments. In the proposed methods, gene modules are identified by splitting the two-way chart coordinated with a pair of left singular vectors factorized from the gene expression matrices of the two biological categories. Importantly, the cutoffs are determined by a data-driven algorithm using the well-defined statistic, SVD-p. The implementation was illustrated on two time series microarray data sets generated from the samples of accessory gland (ACG) and malpighian tubule (MT) tissues of the line W118 of M. drosophila. Two conserved modules and six divergent modules, each of which has a unique characteristic profile across tissue kinds and aging processes, were identified. The number of genes contained in these models ranged from five to a few hundred. Three to over a hundred GO terms were over-represented in individual modules with FDR < 0.1. One divergent module suggested the tissue-specific relationship between the expressions of mitochondrion-related genes and the aging process. 
This finding, together with others, may be of biological significance. The validity of the proposed SVD-based method was further verified by a simulation study, as well as the comparisons with regression analysis and cubic spline regression analysis plus PAM based clustering. svdPPCS is a novel computational tool for the comparative analysis of transcriptional profiling. It especially fits the comparison of time series data of related organisms or different tissues of the same organism under equivalent or similar experimental conditions. The general scheme can be directly extended to the comparisons of multiple data sets. It also can be applied to the integration of data sets from different platforms and of different sources.
Ficklin, Stephen P.; Luo, Feng; Feltus, F. Alex
2010-01-01
Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes. PMID:20668062
Ficklin, Stephen P; Luo, Feng; Feltus, F Alex
2010-09-01
Discovering gene sets underlying the expression of a given phenotype is of great importance, as many phenotypes are the result of complex gene-gene interactions. Gene coexpression networks, built using a set of microarray samples as input, can help elucidate tightly coexpressed gene sets (modules) that are mixed with genes of known and unknown function. Functional enrichment analysis of modules further subdivides the coexpressed gene set into cofunctional gene clusters that may coexist in the module with other functionally related gene clusters. In this study, 45 coexpressed gene modules and 76 cofunctional gene clusters were discovered for rice (Oryza sativa) using a global, knowledge-independent paradigm and the combination of two network construction methodologies. Some clusters were enriched for previously characterized mutant phenotypes, providing evidence for specific gene sets (and their annotated molecular functions) that underlie specific phenotypes.
Shannon, Casey P; Chen, Virginia; Takhar, Mandeep; Hollander, Zsuzsanna; Balshaw, Robert; McManus, Bruce M; Tebbutt, Scott J; Sin, Don D; Ng, Raymond T
2016-11-14
Gene network inference (GNI) algorithms can be used to identify sets of coordinately expressed genes, termed network modules from whole transcriptome gene expression data. The identification of such modules has become a popular approach to systems biology, with important applications in translational research. Although diverse computational and statistical approaches have been devised to identify such modules, their performance behavior is still not fully understood, particularly in complex human tissues. Given human heterogeneity, one important question is how the outputs of these computational methods are sensitive to the input sample set, or stability. A related question is how this sensitivity depends on the size of the sample set. We describe here the SABRE (Similarity Across Bootstrap RE-sampling) procedure for assessing the stability of gene network modules using a re-sampling strategy, introduce a novel criterion for identifying stable modules, and demonstrate the utility of this approach in a clinically-relevant cohort, using two different gene network module discovery algorithms. The stability of modules increased as sample size increased and stable modules were more likely to be replicated in larger sets of samples. Random modules derived from permutated gene expression data were consistently unstable, as assessed by SABRE, and provide a useful baseline value for our proposed stability criterion. Gene module sets identified by different algorithms varied with respect to their stability, as assessed by SABRE. Finally, stable modules were more readily annotated in various curated gene set databases. The SABRE procedure and proposed stability criterion may provide guidance when designing systems biology studies in complex human disease and tissues.
Kringel, Dario; Lippmann, Catharina; Parnham, Michael J; Kalso, Eija; Ultsch, Alfred; Lötsch, Jörn
2018-06-19
Human genetic research has implicated functional variants of more than one hundred genes in the modulation of persisting pain. Artificial intelligence and machine learning techniques may combine this knowledge with results of genetic research gathered in any context, which permits the identification of the key biological processes involved in chronic sensitization to pain. Based on published evidence, a set of 110 genes carrying variants reported to be associated with modulation of the clinical phenotype of persisting pain in eight different clinical settings was submitted to unsupervised machine-learning aimed at functional clustering. Subsequently, a mathematically supported subset of genes, comprising those most consistently involved in persisting pain, was analyzed by means of computational functional genomics in the Gene Ontology knowledgebase. Clustering of genes with evidence for a modulation of persisting pain elucidated a functionally heterogeneous set. The situation cleared when the focus was narrowed to a genetic modulation consistently observed throughout several clinical settings. On this basis, two groups of biological processes, the immune system and nitric oxide signaling, emerged as major players in sensitization to persisting pain, which is biologically highly plausible and in agreement with other lines of pain research. The present computational functional genomics-based approach provided a computational systems-biology perspective on chronic sensitization to pain. Human genetic control of persisting pain points to the immune system as a source of potential future targets for drugs directed against persisting pain. Contemporary machine-learned methods provide innovative approaches to knowledge discovery from previous evidence. This article is protected by copyright. All rights reserved.
Clique-based data mining for related genes in a biomedical database.
Matsunaga, Tsutomu; Yonemori, Chikara; Tomita, Etsuji; Muramatsu, Masaaki
2009-07-01
Progress in the life sciences cannot be made without integrating biomedical knowledge on numerous genes in order to help formulate hypotheses on the genetic mechanisms behind various biological phenomena, including diseases. There is thus a strong need for a way to automatically and comprehensively search from biomedical databases for related genes, such as genes in the same families and genes encoding components of the same pathways. Here we address the extraction of related genes by searching for densely-connected subgraphs, which are modeled as cliques, in a biomedical relational graph. We constructed a graph whose nodes were gene or disease pages, and edges were the hyperlink connections between those pages in the Online Mendelian Inheritance in Man (OMIM) database. We obtained over 20,000 sets of related genes (called 'gene modules') by enumerating cliques computationally. The modules included genes in the same family, genes for proteins that form a complex, and genes for components of the same signaling pathway. The results of experiments using 'metabolic syndrome'-related gene modules show that the gene modules can be used to get a coherent holistic picture helpful for interpreting relations among genes. We presented a data mining approach extracting related genes by enumerating cliques. The extracted gene sets provide a holistic picture useful for comprehending complex disease mechanisms.
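Enumerating cliques in a relational graph, as described above, can be sketched with the Bron-Kerbosch algorithm with pivoting, a standard method for listing maximal cliques. This is an illustrative stand-in and not necessarily the enumeration algorithm the authors used; the toy graph and gene names are hypothetical.

```python
def maximal_cliques(adjacency):
    """List all maximal cliques of an undirected graph.

    adjacency: dict mapping each node to the set of its neighbours
    (no self-loops, and every edge listed in both directions).
    """
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)  # nothing can extend it: maximal
            return
        # Pivot on the node with the most neighbours among the candidates
        # to reduce the number of recursive branches.
        pivot = max(candidates | excluded,
                    key=lambda v: len(adjacency[v] & candidates))
        for node in list(candidates - adjacency[pivot]):
            expand(current | {node},
                   candidates & adjacency[node],
                   excluded & adjacency[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return cliques


# Hypothetical link graph: three mutually linked genes plus one extra link.
graph = {
    "GENE_A": {"GENE_B", "GENE_C"},
    "GENE_B": {"GENE_A", "GENE_C"},
    "GENE_C": {"GENE_A", "GENE_B", "GENE_D"},
    "GENE_D": {"GENE_C"},
}
modules = sorted(sorted(clique) for clique in maximal_cliques(graph))
print(modules)
```

Each returned clique plays the role of one candidate gene module; on a graph the size of a full database, the number of cliques can be very large.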
Van Loo, Peter; Aerts, Stein; Thienpont, Bernard; De Moor, Bart; Moreau, Yves; Marynen, Peter
2008-01-01
We present ModuleMiner, a novel algorithm for computationally detecting cis-regulatory modules (CRMs) in a set of co-expressed genes. ModuleMiner outperforms other methods for CRM detection on benchmark data, and successfully detects CRMs in tissue-specific microarray clusters and in embryonic development gene sets. Interestingly, CRM predictions for differentiated tissues exhibit strong enrichment close to the transcription start site, whereas CRM predictions for embryonic development gene sets are depleted in this region. PMID:18394174
Liu, Rong; Guo, Cheng-Xian; Zhou, Hong-Hao
2015-01-01
This study aims to identify effective gene networks and prognostic biomarkers associated with estrogen receptor positive (ER+) breast cancer using human mRNA studies. Weighted gene coexpression network analysis was performed with a complex ER+ breast cancer transcriptome to investigate the function of networks and key genes in the prognosis of breast cancer. We found a significant correlation of an expression module with distant metastasis-free survival (HR = 2.25; 95% CI = 1.03-4.88 in discovery set; HR = 1.78; 95% CI = 1.07-2.93 in validation set). This module contained genes enriched in the biological process of the M phase. From this module, we further identified and validated 5 hub genes (CDK1, DLGAP5, MELK, NUSAP1, and RRM2), the expression levels of which were strongly associated with poor survival. Highly expressed MELK indicated poor survival in luminal A and luminal B breast cancer molecular subtypes. This gene was also found to be associated with tamoxifen resistance. Results indicated that a network-based approach may facilitate the discovery of biomarkers for the prognosis of ER+ breast cancer and may also be used as a basis for establishing personalized therapies. Nevertheless, before the application of this approach in clinical settings, in vivo and in vitro experiments and multi-center randomized controlled clinical trials are still needed.
Comparison of co-expression measures: mutual information, correlation, and model based indices.
Song, Lin; Langfelder, Peter; Horvath, Steve
2012-12-09
Co-expression measures are often used to define networks among genes. Mutual information (MI) is often used as a generalized correlation measure. It is not clear how much MI adds beyond standard (robust) correlation measures or regression model based association measures. Further, it is important to assess what transformations of these and other co-expression measures lead to biologically meaningful modules (clusters of genes). We provide a comprehensive comparison between mutual information and several correlation measures in 8 empirical data sets and in simulations. We also study different approaches for transforming an adjacency matrix, e.g. using the topological overlap measure. Overall, we confirm close relationships between MI and correlation in all data sets which reflects the fact that most gene pairs satisfy linear or monotonic relationships. We discuss rare situations when the two measures disagree. We also compare correlation and MI based approaches when it comes to defining co-expression network modules. We show that a robust measure of correlation (the biweight midcorrelation transformed via the topological overlap transformation) leads to modules that are superior to MI based modules and maximal information coefficient (MIC) based modules in terms of gene ontology enrichment. We present a function that relates correlation to mutual information which can be used to approximate the mutual information from the corresponding correlation coefficient. We propose the use of polynomial or spline regression models as an alternative to MI for capturing non-linear relationships between quantitative variables. The biweight midcorrelation outperforms MI in terms of elucidating gene pairwise relationships. Coupled with the topological overlap matrix transformation, it often leads to more significantly enriched co-expression modules. Spline and polynomial networks form attractive alternatives to MI in case of non-linear relationships. 
Our results indicate that MI networks can safely be replaced by correlation networks when it comes to measuring co-expression relationships in stationary data.
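For readers unfamiliar with the biweight midcorrelation mentioned above, the following is a minimal pure-Python sketch of its usual definition: deviations from the median are down-weighted by a biweight function of their distance in units of 9 times the median absolute deviation (MAD). It omits the fallbacks that production implementations apply when the MAD is zero, and the example vectors are made up.

```python
import math
from statistics import median


def _weighted_deviations(values):
    """Median-centred deviations multiplied by their biweight weights."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; this sketch has no fallback for that case")
    terms = []
    for v in values:
        u = (v - centre) / (9.0 * mad)
        weight = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0  # outliers get 0
        terms.append((v - centre) * weight)
    return terms


def bicor(x, y):
    """Biweight midcorrelation between two equal-length numeric sequences."""
    a = _weighted_deviations(x)
    b = _weighted_deviations(y)
    norm = math.sqrt(sum(t * t for t in a)) * math.sqrt(sum(t * t for t in b))
    return sum(s * t for s, t in zip(a, b)) / norm


# A single gross outlier in y receives zero weight, so the estimate stays
# positive even though the raw trend is broken by the last point.
x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 3, 4, 5, 6, -100]
print(bicor(x, y))
```

The robustness comes entirely from the zero weight given to points far from the median, which is why the measure is often preferred over Pearson correlation for noisy expression data.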
Recognition of digital characteristics based new improved genetic algorithm
NASA Astrophysics Data System (ADS)
Wang, Meng; Xu, Guoqiang; Lin, Zihao
2017-08-01
In the field of digital signal processing, estimating the characteristics of signal modulation parameters is a significant research direction. Based on an in-depth study of a new improved genetic algorithm, the paper determines the set of eigenvalues that can show the differences between digital signal modulations. First, these eigenvalues are taken as the best gene pool; second, the best gene pool is changed during genetic evolution by selecting, overlapping and eliminating each other; finally, a strategy of further enhanced competition and punishment is adopted to further optimize the gene pool and ensure that each generation consists of high-quality genes. The simulation results show that this method has global convergence, stability and a faster convergence speed.
Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo
2011-01-01
Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
Akram, Pakeeza; Liao, Li
2017-12-06
Identification of common genes associated with comorbid diseases can be critical in understanding their pathobiological mechanism. This work presents a novel method to predict missing common genes associated with a disease pair. Searching for missing common genes is formulated as an optimization problem that minimizes the network-based module separation of the two subgraphs produced by mapping the genes associated with each disease onto the interactome. Using cross validation on more than 600 disease pairs, our method achieves a significantly higher average receiver operating characteristic (ROC) score of 0.95, compared to a baseline ROC score of 0.60 using randomized data. Missing common gene prediction aims to complete the gene set associated with comorbid diseases for better understanding of biological intervention. It will also be useful for gene-targeted therapeutics related to comorbid diseases. This method can be further considered for prediction of missing edges to complete the subgraph associated with a disease pair.
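The abstract does not define its module separation measure. One widely used definition, assumed here purely for illustration, is s_AB = <d_AB> - (<d_AA> + <d_BB>) / 2, where each term is a mean of shortest-path distances from a gene to the nearest gene of the relevant set. The sketch below implements that assumed definition with breadth-first search on a toy network; all names and the graph are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def nearest_distances(graph, sources, targets, skip_self):
    """For each source, the distance to its closest reachable target."""
    result = []
    for source in sources:
        distances = bfs_distances(graph, source)
        reachable = [distances[t] for t in targets
                     if t in distances and not (skip_self and t == source)]
        if reachable:
            result.append(min(reachable))
    return result


def separation(graph, genes_a, genes_b):
    """Assumed separation: <d_AB> - (<d_AA> + <d_BB>) / 2.

    Each set needs at least two mutually reachable genes; shared genes
    contribute a cross-distance of zero.
    """
    within_a = nearest_distances(graph, genes_a, genes_a, skip_self=True)
    within_b = nearest_distances(graph, genes_b, genes_b, skip_self=True)
    between = (nearest_distances(graph, genes_a, genes_b, skip_self=False)
               + nearest_distances(graph, genes_b, genes_a, skip_self=False))
    mean = lambda values: sum(values) / len(values)
    return mean(between) - (mean(within_a) + mean(within_b)) / 2.0


# Toy interactome: a simple path 1-2-3-4-5-6.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
print(separation(path, {1, 2}, {5, 6}))  # distant gene sets: positive
print(separation(path, {2, 3}, {3, 4}))  # overlapping gene sets: negative
```

Under this definition, adding a candidate gene to both disease sets lowers the separation when it sits between the two subgraphs, which is the intuition behind treating the search as a minimization problem.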
Costas, Javier; Paramo, Mario; Arrojo, Manuel
2018-01-01
Background Genomic research has revealed that schizophrenia is a highly polygenic disease. Recent estimates indicate that at least 71% of genomic segments of 1 Mb include one or more risk loci for schizophrenia (Loh et al., Nature Genet 2015). This extremely high polygenicity represents a challenge to deciphering the biological basis of schizophrenia, as it is expected that any sufficiently large set of SNPs will be associated with the disorder. Among the different gene sets available for study (such as those from Gene Ontology, KEGG pathways, Reactome pathways or protein-protein interaction datasets), those based on brain co-expression networks represent putative functional relationships in the relevant tissue. The aim of this work was to identify brain co-expression networks that contribute disproportionately to the common polygenic risk for schizophrenia, to gain more insight into schizophrenia etiopathology. Methods We analyzed a case-control dataset consisting of 582 schizophrenia patients from Galicia, NW Spain, and 591 ancestrally matched controls, genotyped with the Illumina PsychArray. Using as discovery sample the summary results from the largest GWAS of schizophrenia to date (Psychiatric Genomics Consortium, SCZ2), we generated polygenic risk scores (PRS) in our sample based on SNPs located at genes belonging to brain co-expression modules determined by the CommonMind Consortium (Fromer et al., Nature Neurosci 2016). PRS were generated using the clumping procedure of PLINK, considering several different thresholds to select SNPs from the discovery sample. In order to test whether any specific module increased risk of schizophrenia more than expected from its size, we generated up to 10,000 random permutations of the same number of SNPs, matched by frequency, distance to nearest gene, number of SNPs in LD and gene density, using SNPsnap.
Results As expected, most modules with enough number of independent SNPs belonging to them showed a significant increase in Nagelkerke’s R2 in our case-control sample after the addition of the module-specific PRS in a logistic regression model. Our permutation strategy revealed that most modules did not show an excess of risk, measured by increase in Nagelkerke’s R2, in comparison to equal number of SNPs with similar characteristics. But one module, M2c from Fromer et al., remained highly significant after multiple tests’ correction. Reactome pathways analysis revealed an over-representation of genes involved in “Neuronal System” and “Axon guidance” among genes from this module. Using the same protocol, we detected that the 84 genes from the neuronal system pathway at this module, representing less than 6% of the genes from the module, explained a higher level of risk than expected. “Voltage-gated Potassium channels” and “Neurexins and neuroligins” are overrepresented among the Neuronal System genes from module M2c. Discussion Here, we show that, in spite of the high polygenicity of schizophrenia, it is possible to identify gene sets contributing disproportionately to total risk, as it was the case for the M2c module from Fromer et al. These authors have previously reported that the M2c module was enriched in GWAS signals, as well as CNVs and rare variants associated with schizophrenia. Therefore, this module shows a disproportionately contribution to schizophrenia risk. Study supported by Grant PI14/01020 from Instituto de Salud Carlos III, Ministry of Health, Spanish Government.
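The permutation comparison described above can be sketched as an empirical p-value: the observed increase in R2 for a module is ranked against the increases obtained from matched random SNP sets. This is a minimal sketch, not the authors' pipeline; the function name and all numbers are hypothetical, and the "+1" correction is a common convention assumed here.

```python
import random

def empirical_p_value(observed, null_values):
    """One-sided empirical p-value: share of null statistics >= the observed one.

    The +1 in numerator and denominator keeps the estimate above zero.
    """
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)

# Hypothetical numbers for illustration only (not data from the study).
random.seed(0)
null_r2_gain = [random.gauss(0.010, 0.003) for _ in range(10000)]  # matched random SNP sets
observed_r2_gain = 0.025                                           # module-specific PRS
print(empirical_p_value(observed_r2_gain, null_r2_gain))
```

With 10,000 permutations the smallest attainable value is 1/10001, which bounds how strongly a single module can be called significant before multiple-testing correction.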
When is hub gene selection better than standard meta-analysis?
Langfelder, Peter; Mischel, Paul S; Horvath, Steve
2013-01-01
Since hub nodes have been found to play important roles in many networks, highly connected hub genes are expected to play an important role in biology as well. However, the empirical evidence remains ambiguous. An open question is whether (or when) hub gene selection leads to more meaningful gene lists than a standard statistical analysis based on significance testing when analyzing genomic data sets (e.g., gene expression or DNA methylation data). Here we address this question for the special case when multiple genomic data sets are available. This is of great practical importance since for many research questions multiple data sets are publicly available. In this case, the data analyst can decide between a standard statistical approach (e.g., based on meta-analysis) and a co-expression network analysis approach that selects intramodular hubs in consensus modules. We assess the performance of these two types of approaches according to two criteria. The first criterion evaluates the biological insights gained and is relevant in basic research. The second criterion evaluates the validation success (reproducibility) in independent data sets and often applies in clinical diagnostic or prognostic applications. We compare meta-analysis with consensus network analysis based on weighted correlation network analysis (WGCNA) in three comprehensive and unbiased empirical studies: (1) finding genes predictive of lung cancer survival, (2) finding methylation markers related to age, and (3) finding mouse genes related to total cholesterol. The results demonstrate that intramodular hub gene status with respect to consensus modules is more useful than a meta-analysis p-value when identifying biologically meaningful gene lists (reflecting criterion 1). However, standard meta-analysis methods perform as well as (if not better than) a consensus network approach in terms of validation success (criterion 2).
The article also reports a comparison of meta-analysis techniques applied to gene expression data and presents novel R functions for carrying out consensus network analysis, network based screening, and meta analysis.
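For readers unfamiliar with how a meta-analysis p-value is formed from several data sets, the sketch below uses Stouffer's unweighted Z method. This is one standard technique chosen here as an assumption; the abstract does not say which meta-analysis statistics were compared, and the input p-values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def stouffer_p(p_values):
    """Combine one-sided p-values from independent data sets (Stouffer's Z).

    Each p-value must lie strictly between 0 and 1, otherwise inv_cdf raises.
    """
    nd = NormalDist()
    z_scores = [nd.inv_cdf(1 - p) for p in p_values]
    z_meta = sum(z_scores) / sqrt(len(z_scores))
    return 1 - nd.cdf(z_meta)

# Hypothetical one-sided p-values for one gene in three independent data sets.
print(stouffer_p([0.04, 0.10, 0.03]))
```

A gene ranked by this value is judged purely on per-data-set significance, whereas hub gene selection ranks genes by their connectivity inside a consensus module.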
MAVTgsa: An R Package for Gene Set (Enrichment) Analysis
Chien, Chih-Yi; Chang, Ching-Wei; Tsai, Chen-An; ...
2014-01-01
Gene set analysis methods aim to determine whether an a priori defined set of genes shows statistically significant difference in expression on either categorical or continuous outcomes. Although many methods for gene set analysis have been proposed, a systematic analysis tool for identification of different types of gene set significance modules has not been developed previously. This work presents an R package, called MAVTgsa, which includes three different methods for integrated gene set enrichment analysis. (1) The one-sided OLS (ordinary least squares) test detects coordinated changes of genes in a gene set in one direction, either up- or downregulation. (2) The two-sided MANOVA (multivariate analysis of variance) detects changes in both up- and downregulation for studying two or more experimental conditions. (3) A random forests-based procedure identifies gene sets that can accurately predict samples from different experimental conditions or are associated with continuous phenotypes. MAVTgsa computes the P values and FDR (false discovery rate) q-values for all gene sets in the study. Furthermore, MAVTgsa provides several visualization outputs to support and interpret the enrichment results. This package is available online.
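To make the P value to q-value step concrete, the sketch below applies the Benjamini-Hochberg adjustment to a list of gene set p-values. It is written in Python for illustration and is not the MAVTgsa API; that the package uses this particular FDR procedure is an assumption, and the p-values are hypothetical.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

# Hypothetical p-values for four gene sets.
print(bh_q_values([0.01, 0.04, 0.03, 0.5]))
```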
A human functional protein interaction network and its application to cancer data analysis
2010-01-01
Background One challenge facing biologists is to tease out useful information from massive data sets for further analysis. A pathway-based analysis may shed light by projecting candidate genes onto protein functional relationship networks. We are building such a pathway-based analysis system. Results We have constructed a protein functional interaction network by extending curated pathways with non-curated sources of information, including protein-protein interactions, gene coexpression, protein domain interaction, Gene Ontology (GO) annotations and text-mined protein interactions, which cover close to 50% of the human proteome. By applying this network to two glioblastoma multiforme (GBM) data sets and projecting cancer candidate genes onto the network, we found that the majority of GBM candidate genes form a cluster and are closer than expected by chance, and the majority of GBM samples have sequence-altered genes in two network modules, one mainly comprising genes whose products are localized in the cytoplasm and plasma membrane, and another comprising gene products in the nucleus. Both modules are highly enriched in known oncogenes, tumor suppressors and genes involved in signal transduction. Similar network patterns were also found in breast, colorectal and pancreatic cancers. Conclusions We have built a highly reliable functional interaction network upon expert-curated pathways and applied this network to the analysis of two genome-wide GBM and several other cancer data sets. The network patterns revealed from our results suggest common mechanisms in the cancer biology. Our system should provide a foundation for a network or pathway-based analysis platform for cancer and other diseases. PMID:20482850
Uddin, Raihan; Singh, Shiva M.
2017-01-01
As humans age many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and identify novel ASLI related genes and their networks based on co-expression relationship of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in “learning and memory” related functions and pathways. Subsequent differential network analysis of this “learning and memory” module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known function of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation. 
Taken together, they provide a new insight and generate new hypotheses into the molecular mechanisms responsible for age associated learning impairment, including spatial learning. PMID:29066959
Feng, Juerong; Zhou, Rui; Chang, Ying; Liu, Jing; Zhao, Qiu
2017-01-01
Hepatocellular carcinoma (HCC) has a high incidence and mortality worldwide, and its carcinogenesis and progression are influenced by a complex network of gene interactions. A weighted gene co-expression network was constructed to identify gene modules associated with the clinical traits in HCC (n = 214). Among the 13 modules, high correlation was only found between the red module and metastasis risk (classified by the HCC metastasis gene signature) (R2 = −0.74). Moreover, in the red module, 34 network hub genes for metastasis risk were identified, six of which (ABAT, AGXT, ALDH6A1, CYP4A11, DAO and EHHADH) were also hub nodes in the protein-protein interaction network of the module genes. Thus, a total of six hub genes were identified. In validation, all hub genes showed a negative correlation with the four-stage HCC progression (P for trend < 0.05) in the test set. Furthermore, in the training set, HCC samples with any hub gene lowly expressed demonstrated a higher recurrence rate and poorer survival rate (hazard ratios with 95% confidence intervals > 1). RNA-sequencing data of 142 HCC samples showed consistent results in the prognosis. Gene set enrichment analysis (GSEA) demonstrated that in the samples with any hub gene highly expressed, a total of 24 functional gene sets were enriched, most of which focused on amino acid metabolism and oxidation. In conclusion, co-expression network analysis identified six hub genes in association with HCC metastasis risk and prognosis, which might improve the prognosis by influencing amino acid metabolism and oxidation. PMID:28430663
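The idea behind a weighted co-expression network and its hub genes can be sketched in a few lines: correlations are raised to a soft-threshold power to give edge weights, and a gene's connectivity is the sum of its weights. This is a simplified stand-in for the WGCNA R package, not its API; the power of 6, the unsigned adjacency, the gene labels and the expression values are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta."""
    genes = list(profiles)
    return {g: {h: abs(pearson(profiles[g], profiles[h])) ** beta
                for h in genes if h != g}
            for g in genes}

def connectivity(adjacency):
    """Sum of edge weights per gene; the largest values mark candidate hubs."""
    return {g: sum(neighbours.values()) for g, neighbours in adjacency.items()}

# Hypothetical expression profiles over five samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 3.0, 4.0, 1.0, 2.0],
    "geneD": [1.0, 3.0, 2.0, 5.0, 4.0],
}
k = connectivity(soft_threshold_adjacency(profiles))
print(max(k, key=k.get), k)
```

In a full analysis, connectivity would be computed within each detected module and then combined with other evidence, such as the protein-protein interaction hubs mentioned in the abstract.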
Hwang, Sun-Goo; Kim, Dong Sub; Hwang, Jung Eun; Han, A-Reum; Jang, Cheol Seong
2014-05-15
In order to better understand the biological systems that are affected in response to cosmic ray (CR), we conducted weighted gene co-expression network analysis using the module detection method. By using the Pearson's correlation coefficient (PCC) value, we evaluated complex gene-gene functional interactions between 680 CR-responsive probes from integrated microarray data sets, which included large-scale transcriptional profiling of 1000 microarray samples. These probes were divided into 6 distinct modules that contained 20 enriched gene ontology (GO) functions, such as oxidoreductase activity, hydrolase activity, and response to stimulus and stress. In particular, modules 1 and 2 commonly showed enriched annotation categories such as oxidoreductase activity, including enriched cis-regulatory elements known as ROS-specific regulators. These results suggest that the ROS-mediated irradiation response pathway is affected by CR in modules 1 and 2. We found 243 ionizing radiation (IR)-responsive probes that exhibited similarities in expression patterns in various irradiation microarray data sets. The expression patterns of 6 randomly selected IR-responsive genes were evaluated by quantitative reverse transcription polymerase chain reaction following treatment with CR, gamma rays (GR), and ion beam (IB); similar patterns were observed among these genes under these 3 treatments. Moreover, we constructed subnetworks of IR-responsive genes and evaluated the expression levels of their neighboring genes following GR treatment; similar patterns were observed among them. These results of network-based analyses might provide a clue to understanding the complex biological system related to the CR response in plants. Copyright © 2014 Elsevier B.V. All rights reserved.
Discovering functional modules by topic modeling RNA-Seq based toxicogenomic data.
Yu, Ke; Gong, Binsheng; Lee, Mikyung; Liu, Zhichao; Xu, Joshua; Perkins, Roger; Tong, Weida
2014-09-15
Toxicogenomics (TGx) endeavors to elucidate the underlying molecular mechanisms through exploring gene expression profiles in response to toxic substances. RNA-Seq has recently been increasingly regarded as a more powerful alternative to microarrays in TGx studies. However, realizing RNA-Seq's full potential requires novel approaches to extracting information from the complex TGx data. Considering read counts as the number of times a word occurs in a document, gene expression profiles from RNA-Seq are analogous to a word-by-document matrix used in text mining. Topic modeling, which aims to discover the latent structures in text corpora, would be helpful for exploring RNA-Seq based TGx data. In this study, topic modeling was applied to a typical RNA-Seq based TGx data set to discover hidden functional modules. The RNA-Seq based gene expression profiles were transformed into "documents", on which latent Dirichlet allocation (LDA) was used to build a topic model. We found that samples treated by compounds with the same modes of action (MoAs) could be clustered based on topic similarities. The topic most relevant to each cluster was identified as a "marker" topic, which was interpreted by gene enrichment analysis with MoAs and then confirmed by compound and pathway associations mined from literature. To further validate the "marker" topics, we tested topic transferability from RNA-Seq to microarrays. The RNA-Seq based gene expression profile of a topic specifically associated with the peroxisome proliferator-activated receptor (PPAR) signaling pathway was used to query samples with similar expression profiles in two different microarray data sets, yielding an accuracy of about 85%. This proof-of-concept study demonstrates the applicability of topic modeling to discover functional modules in RNA-Seq data and suggests a valuable computational tool for leveraging information within TGx data in the RNA-Seq era.
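The read-count-as-word-count analogy can be sketched as follows: each sample becomes a "document" in which a gene symbol appears once per read, and the samples together form a word-by-document matrix that a topic model such as LDA could consume. This is an illustrative sketch only; the sample names, gene symbols and counts are hypothetical, and the authors' actual transformation may differ (for example, in how counts are scaled).

```python
def counts_to_document(counts):
    """Repeat each gene symbol once per read, like words in a document."""
    document = []
    for gene, n_reads in counts.items():
        document.extend([gene] * n_reads)
    return document

def word_by_document_matrix(samples):
    """Rows are genes ("words"), columns are samples ("documents")."""
    genes = sorted({g for counts in samples.values() for g in counts})
    names = list(samples)
    matrix = [[samples[s].get(g, 0) for s in names] for g in genes]
    return genes, names, matrix

# Hypothetical read counts for two treated samples.
samples = {
    "sample_1": {"Acox1": 3, "Cyp4a1": 2},
    "sample_2": {"Acox1": 1, "Gstp1": 4},
}
print(counts_to_document(samples["sample_1"]))
print(word_by_document_matrix(samples))
```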
Han, Junwei; Shang, Desi; Zhang, Yunpeng; Zhang, Wei; Yao, Qianlan; Han, Lei; Xu, Yanjun; Yan, Wei; Bao, Zhaoshi; You, Gan; Jiang, Tao; Kang, Chunsheng; Li, Xia
2014-01-01
The prognosis of glioma patients is usually poor, especially in patients with glioblastoma (World Health Organization (WHO) grade IV). The regulatory functions of microRNA (miRNA) on genes have important implications in glioma cell survival. However, few studies have investigated glioma survival by integrating miRNAs and genes while also considering pathway structure. In this study, we performed sample-matched miRNA and mRNA expression profiling to systematically analyze glioma patient survival. During this analytical process, we developed a pathway-based random walk to identify a glioma core miRNA-gene module, simultaneously considering pathway structure information and multi-level involvement of miRNAs and genes. The core miRNA-gene module we identified comprised four apparent sub-modules; all four sub-modules displayed a significant correlation with patient survival in the testing set (P-values ≤ 0.001). Notably, one sub-module that consisted of 6 miRNAs and 26 genes also correlated with survival time in the high-grade subgroup (WHO grade III and IV), P-value = 0.0062. Furthermore, the 26-gene expression signature from this sub-module had robust predictive power in four independent, publicly available glioma datasets. Our findings suggested that the expression signatures, which were identified by integrating the miRNA and gene levels, were closely associated with overall survival among glioma patients with various grades. PMID:24809850
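A generic random walk with restart gives a feel for how a walk over a miRNA-gene network can score nodes by their proximity to a set of seeds. The abstract does not describe the details of the pathway-based random walk, so the sketch below is an assumption-laden stand-in: the restart probability, the uniform transition weights, the toy graph and the node labels are all hypothetical.

```python
def random_walk_with_restart(graph, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities on an undirected graph.

    graph maps every node to a non-empty list of neighbours.
    """
    nodes = list(graph)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        # Restart mass returns to the seeds; the rest spreads to neighbours.
        nxt = {n: restart * start[n] for n in nodes}
        for n in nodes:
            share = (1 - restart) * prob[n] / len(graph[n])
            for nb in graph[n]:
                nxt[nb] += share
        change = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if change < tol:
            break
    return prob

# Hypothetical chain: miR-X regulates GENE_A, which interacts with GENE_B, then GENE_C.
toy = {
    "miR-X": ["GENE_A"],
    "GENE_A": ["miR-X", "GENE_B"],
    "GENE_B": ["GENE_A", "GENE_C"],
    "GENE_C": ["GENE_B"],
}
scores = random_walk_with_restart(toy, seeds={"miR-X"})
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

Nodes closer to the seed receive higher scores, which is the property that lets such a walk pull out a core module around survival-associated miRNAs or genes.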
Seok, Junhee; Davis, Ronald W.; Xiao, Wenzhong
2015-01-01
Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analyses of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets hasn’t been well studied. In this work, we propose a hybrid method that uses both single gene and gene set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes help to supplement incomplete information of gene sets due to our imperfect biomedical knowledge. Through the tests over multiple data sets of cancer and trauma injury, the proposed method showed robust and improved performance compared with the conventional approaches with only single genes or gene sets solely. Additionally, we examined the prediction result in the trauma injury data, and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable in biology. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge. PMID:25933378
GOMA: functional enrichment analysis tool based on GO modules
Huang, Qiang; Wu, Ling-Yun; Wang, Yong; Zhang, Xiang-Sun
2013-01-01
Analyzing the function of gene sets is a critical step in interpreting the results of high-throughput experiments in systems biology. A variety of enrichment analysis tools have been developed in recent years, but most output a long list of significantly enriched terms that are often redundant, making it difficult to extract the most meaningful functions. In this paper, we present GOMA, a novel enrichment analysis method based on the new concept of enriched functional Gene Ontology (GO) modules. With this method, we systematically revealed functional GO modules, i.e., groups of functionally similar GO terms, via an optimization model and then ranked them by enrichment scores. Our new method simplifies enrichment analysis results by reducing redundancy, thereby preventing inconsistent enrichment results among functionally similar terms and providing more biologically meaningful results. PMID:23237213
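Enrichment scores for individual GO terms are commonly based on a hypergeometric test, and a minimal version is sketched below. Whether GOMA scores its modules this way is not stated in the abstract, so the test itself, the function name and the numbers are assumptions for illustration.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, drawn, overlap):
    """P(X >= overlap) when `drawn` genes are sampled without replacement
    from `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, drawn)
    favourable = sum(comb(annotated, i) * comb(population - annotated, drawn - i)
                     for i in range(overlap, upper + 1))
    return favourable / comb(population, drawn)

# Hypothetical example: 20,000 genes, 200 annotated to a term,
# a list of 100 genes of interest, 8 of which carry the term.
print(hypergeom_enrichment_p(20000, 200, 100, 8))
```

Because functionally similar GO terms share many genes, they tend to pass such a test together, which is the redundancy that grouping terms into modules is meant to reduce.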
Kakati, Tulika; Kashyap, Hirak; Bhattacharyya, Dhruba K.
2016-11-30
There exist many tools and methods for constructing co-expression networks from gene expression data and for extracting densely connected gene modules. In this paper, a method is introduced to construct a co-expression network and to extract co-expressed modules having high biological significance. The proposed method has been validated on several well-known microarray datasets extracted from a diverse set of species, using statistical measures such as p and q values. The modules obtained in these studies are found to be biologically significant based on Gene Ontology enrichment analysis, pathway analysis, and KEGG enrichment analysis. Further, the method was applied to an Alzheimer’s disease dataset, and some interesting genes were found that have high semantic similarity among them but are not significantly correlated in terms of expression similarity. Some of these interesting genes, such as MAPT, CASP2, and PSEN2, are linked with important aspects of Alzheimer’s disease, such as dementia, increased cell death, and deposition of amyloid-beta proteins in Alzheimer’s disease brains. The biological pathways associated with Alzheimer’s disease, such as Wnt signaling, Apoptosis, p53 signaling, and Notch signaling, incorporate these interesting genes. The proposed method is evaluated in regard to existing literature. PMID:27901073
Pre-Clinical Drug Prioritization via Prognosis-Guided Genetic Interaction Networks
Xiong, Jianghui; Liu, Juan; Rayner, Simon; Tian, Ze; Li, Yinghui; Chen, Shanguang
2010-01-01
The high rates of failure in oncology drug clinical trials highlight the problems of using pre-clinical data to predict the clinical effects of drugs. Patient population heterogeneity and unpredictable physiology complicate pre-clinical cancer modeling efforts. We hypothesize that gene networks associated with cancer outcome in heterogeneous patient populations could serve as a reference for identifying drug effects. Here we propose a novel in vivo genetic interaction which we call ‘synergistic outcome determination’ (SOD), a concept similar to ‘Synthetic Lethality’. SOD is defined as the synergy of a gene pair with respect to cancer patients' outcome, whose correlation with outcome is due to cooperative, rather than independent, contributions of genes. The method combines microarray gene expression data with cancer prognostic information to identify synergistic gene-gene interactions that are then used to construct interaction networks based on gene modules (a group of genes which share similar function). In this way, we identified a cluster of important epigenetically regulated gene modules. By projecting drug sensitivity-associated genes on to the cancer-specific inter-module network, we defined a perturbation index for each drug based upon its characteristic perturbation pattern on the inter-module network. Finally, by calculating this index for compounds in the NCI Standard Agent Database, we significantly discriminated successful drugs from a broad set of test compounds, and further revealed the mechanisms of drug combinations. Thus, prognosis-guided synergistic gene-gene interaction networks could serve as an efficient in silico tool for pre-clinical drug prioritization and rational design of combinatorial therapies. PMID:21085674
In silico pathway analysis in cervical carcinoma reveals potential new targets for treatment
van Dam, Peter A.; van Dam, Pieter-Jan H. H.; Rolfo, Christian; Giallombardo, Marco; van Berckelaer, Christophe; Trinh, Xuan Bich; Altintas, Sevilay; Huizing, Manon; Papadimitriou, Kostas; Tjalma, Wiebren A. A.; van Laere, Steven
2016-01-01
An in silico pathway analysis was performed in order to improve current knowledge on the molecular drivers of cervical cancer and detect potential targets for treatment. Three publicly available Affymetrix gene expression data-sets (GSE5787, GSE7803, GSE9750) were retrieved, accounting for a total of 9 cervical cancer cell lines (CCCLs), 39 normal cervical samples, 7 CIN3 samples and 111 cervical cancer samples (CCSs). Prediction analysis of microarrays was performed in the Affymetrix sets to identify cervical cancer biomarkers. To select cancer cell-specific genes the CCSs were compared to the CCCLs. Validated genes were submitted to a gene set enrichment analysis (GSEA) and Expression2Kinases (E2K). In the CCSs a total of 1,547 probe sets were identified that were overexpressed (FDR < 0.1). Compared with the CCCLs, 560 probe sets (481 unique genes) had a cancer cell-specific expression profile, and 315 of these genes (65%) were validated. GSEA identified 5 cancer hallmarks enriched in CCSs (P < 0.01 and FDR < 0.25), showing that deregulation of the cell cycle is a major component of cervical cancer biology. E2K identified a protein-protein interaction (PPI) network of 162 nodes (including 20 druggable kinases) and 1626 edges. This PPI-network consists of 5 signaling modules associated with MYC signaling (Module 1), cell cycle deregulation (Module 2), TGFβ-signaling (Module 3), MAPK signaling (Module 4) and chromatin modeling (Module 5). Potential targets for treatment that could be identified were CDK1, CDK2, ABL1, ATM, AKT1, MAPK1, MAPK3 among others. The present study identified important driver pathways in cervical carcinogenesis which should be assessed for their potential therapeutic druggability. PMID:26701206
Co, Aila L.; Hay, Ariel M.; MacDonald, James W.; Bammler, Theo K.; Farin, Federico M.; Costa, Lucio G.; Furlong, Clement E.
2014-01-01
Chlorpyrifos oxon (CPO), the toxic metabolite of the organophosphorus (OP) insecticide chlorpyrifos, causes developmental neurotoxicity in humans and rodents. CPO is hydrolyzed by paraoxonase-1 (PON1), with protection determined by PON1 levels and the human Q192R polymorphism. To examine how the Q192R polymorphism influences fetal toxicity associated with gestational CPO exposure, we measured enzyme inhibition and fetal-brain gene expression in wild-type (PON1+/+), PON1-knockout (PON1−/−), and tgHuPON1R192 and tgHuPON1Q192 transgenic mice. Pregnant mice exposed dermally to 0, 0.50, 0.75, or 0.85 mg/kg/d CPO from gestational day (GD) 6 through 17 were sacrificed on GD18. Biomarkers of CPO exposure inhibited in maternal tissues included brain acetylcholinesterase (AChE), red blood cell acylpeptide hydrolase (APH), and plasma butyrylcholinesterase (BChE) and carboxylesterase (CES). Fetal plasma BChE was inhibited in PON1−/− and tgHuPON1Q192, but not PON1+/+ or tgHuPON1R192 mice. Fetal brain AChE and plasma CES were inhibited in PON1−/− mice, but not in other genotypes. Weighted gene co-expression network analysis identified five gene modules based on clustering of the correlations among their fetal-brain expression values, allowing for correlation of module membership with the phenotypic data on enzyme inhibition. One module that correlated highly with maternal brain AChE activity had a large representation of homeobox genes. Gene set enrichment analysis revealed multiple gene sets affected by gestational CPO exposure in tgHuPON1Q192 but not tgHuPON1R192 mice, including gene sets involved in protein export, lipid metabolism, and neurotransmission. These data indicate that maternal PON1 status modulates the effects of repeated gestational CPO exposure on fetal-brain gene expression and on inhibition of both maternal and fetal biomarker enzymes. PMID:25070982
The promises and pitfalls of RNA-interference-based therapeutics
Castanotto, Daniela; Rossi, John J.
2009-01-01
The discovery that gene expression can be controlled by the Watson–Crick base-pairing of small RNAs with messenger RNAs containing complementary sequence — a process known as RNA interference — has markedly advanced our understanding of eukaryotic gene regulation and function. The ability of short RNA sequences to modulate gene expression has provided a powerful tool with which to study gene function and is set to revolutionize the treatment of disease. Remarkably, despite being just one decade from its discovery, the phenomenon is already being used therapeutically in human clinical trials, and biotechnology companies that focus on RNA-interference-based therapeutics are already publicly traded. PMID:19158789
Fu, X; Sun, Y; Wang, J; Xing, Q; Zou, J; Li, R; Wang, Z; Wang, S; Hu, X; Zhang, L; Bao, Z
2014-01-01
Marine organisms are commonly exposed to variable environmental conditions, and many of them are under threat from increased sea temperatures caused by global climate change. Generating transcriptomic resources under different stress conditions is crucial for understanding the molecular mechanisms underlying thermal adaptation. In this study, we conducted transcriptome-wide gene expression profiling of the scallop Chlamys farreri challenged by acute and chronic heat stress. Of the 13 953 unique tags, more than 850 were significantly differentially expressed at each time point after acute heat stress, which was more than the number of tags differentially expressed (320-350) under chronic heat stress. To obtain a systemic view of gene expression alterations during thermal stress, a weighted gene coexpression network was constructed. Six modules were identified as acute heat stress-responsive modules. Among them, four modules involved in apoptosis regulation, mRNA binding, mitochondrial envelope formation and oxidation reduction were downregulated. The remaining two modules were upregulated. One was enriched with chaperones and the other with microsatellite sequences, whose coexpression may originate from a transcription factor binding site. These results indicated that C. farreri triggered several cellular processes to acclimate to elevated temperature. No modules responded to chronic heat stress, suggesting that the scallops might have acclimated to elevated temperature within 3 days. This study represents the first sequencing-based gene network analysis in a nonmodel aquatic species and provides valuable gene resources for the study of thermal adaptation, which should assist in the development of heat-tolerant scallop lines for aquaculture. © 2013 John Wiley & Sons Ltd.
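The abstract above does not say how the weighted coexpression network was built. As a minimal sketch, assuming the commonly used unsigned soft-threshold definition (adjacency = |Pearson correlation| raised to a power beta), the idea can be illustrated as follows. The function names, the toy expression values and the choice beta = 6 are illustrative assumptions, not details taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: |cor|^beta for every ordered gene pair.

    expr maps a gene name to its list of expression values (hypothetical data).
    beta is the soft-thresholding power; 6 is only an illustrative choice.
    """
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes for h in genes if g != h
    }

# Toy profiles (made up): "a" and "b" are perfectly correlated, "c" is not.
toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, -1, 1, -1]}
adjacency = soft_threshold_adjacency(toy)
```

Raising the correlation to a power keeps strong correlations close to 1 while pushing weak ones towards 0, which is why modules can then be found by clustering the weighted network rather than a hard-thresholded one.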
Conley, P B; Lemaux, P G; Lomax, T L; Grossman, A R
1986-01-01
The polypeptide composition of the phycobilisome, the major light-harvesting complex of prokaryotic cyanobacteria and certain eukaryotic algae, can be modulated by different light qualities in cyanobacteria exhibiting chromatic adaptation. We have identified genomic fragments encoding a cluster of phycobilisome polypeptides (phycobiliproteins) from the chromatically adapting cyanobacterium Fremyella diplosiphon using previously characterized DNA fragments of phycobiliprotein genes from the eukaryotic alga Cyanophora paradoxa and from F. diplosiphon. Characterization of two lambda-EMBL3 clones containing overlapping genomic fragments indicates that three sets of phycobiliprotein genes--the alpha- and beta-allophycocyanin genes plus two sets of alpha- and beta-phycocyanin genes--are clustered within 13 kilobases on the cyanobacterial genome and transcribed off the same strand. The gene order (alpha-allophycocyanin followed by beta-allophycocyanin and beta-phycocyanin followed by alpha-phycocyanin) appears to be a conserved arrangement found previously in a eukaryotic alga and another cyanobacterium. We have reported that one set of phycocyanin genes is transcribed as two abundant red light-induced mRNAs (1600 and 3800 bases). We now present data showing that the allophycocyanin genes and a second set of phycocyanin genes are transcribed into major mRNAs of 1400 and 1600 bases, respectively. These transcripts are present in RNA isolated from cultures grown in red and green light, although lower levels of the 1600-base phycocyanin transcript are present in cells grown in green light. Furthermore, a larger transcript of 1750 bases hybridizes to the allophycocyanin genes and may be a precursor to the 1400-base species. PMID:3086870
Tian, Honglai; Guan, Donghui; Li, Jianmin
2018-06-01
Osteosarcoma (OS), the most common malignant bone tumor, poses a serious health threat to children and adolescents. OS is often accompanied by early metastasis and a high death rate. This study aimed to better understand the mechanism of OS metastasis. From the Gene Expression Omnibus (GEO) database, we downloaded 4 expression profile data sets associated with OS metastasis and selected differentially expressed genes. Weighted gene co-expression network analysis (WGCNA) was used to identify the module most strongly correlated with OS metastasis. Gene Ontology functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were used to annotate the selected OS metastasis-associated genes. We selected 897 differentially expressed genes between the OS metastasis and OS non-metastasis groups. From these genes, WGCNA identified 142 genes in the module most strongly correlated with OS metastasis. Gene Ontology functional and KEGG pathway enrichment analyses showed that the significantly OS metastasis-associated genes were involved in pathways related to insulin-like growth factor binding. Our research identified several molecules that potentially participate in the metastasis process and factors that may act as biomarkers. This study helps clarify the mechanism of OS metastasis and may aid the discovery of further therapeutic targets.
Borowsky, Alexander T.
2017-01-01
Plants produce diverse specialized metabolites (SMs), but the genes responsible for their production and regulation remain largely unknown, hindering efforts to tap plant pharmacopeia. Given that genes comprising SM pathways exhibit environmentally dependent coregulation, we hypothesized that genes within a SM pathway would form tight associations (modules) with each other in coexpression networks, facilitating their identification. To evaluate this hypothesis, we used 10 global coexpression data sets, each a meta-analysis of hundreds to thousands of experiments, across eight plant species to identify hundreds of coexpressed gene modules per data set. In support of our hypothesis, 15.3 to 52.6% of modules contained two or more known SM biosynthetic genes, and module genes were enriched in SM functions. Moreover, modules recovered many experimentally validated SM pathways, including all six known to form biosynthetic gene clusters (BGCs). In contrast, bioinformatically predicted BGCs (i.e., those lacking an associated metabolite) were no more coexpressed than the null distribution for neighboring genes. These results suggest that most predicted plant BGCs are not genuine SM pathways and argue that BGCs are not a hallmark of plant specialized metabolism. We submit that global gene coexpression is a rich, largely untapped resource for discovering the genetic basis and architecture of plant natural products. PMID:28408660
ISAAC - InterSpecies Analysing Application using Containers.
Baier, Herbert; Schultz, Jörg
2014-01-15
Information about genes, transcripts and proteins is spread over a wide variety of databases. Different tools have been developed using these databases to identify biological signals in gene lists from large-scale analyses. Mostly, they search for enrichments of specific features. However, these tools do not allow an explorative walk through different views, nor do they allow the gene lists to be changed as new stories emerge. To fill this niche, we have developed ISAAC, the InterSpecies Analysing Application using Containers. The central idea of this web-based tool is to enable the analysis of sets of genes, transcripts and proteins under different biological viewpoints and to interactively modify these sets at any point of the analysis. Detailed history and snapshot information allows tracing each action. Furthermore, one can easily switch back to previous states and perform new analyses. Currently, sets can be viewed in the context of genomes, protein functions, protein interactions, pathways, regulation, diseases and drugs. Additionally, users can switch between species with an automatic, orthology-based translation of existing gene sets. As today's research is usually performed in larger teams and consortia, ISAAC provides group-based functionalities. Here, sets as well as results of analyses can be exchanged between members of groups. ISAAC fills the gap between primary databases and tools for the analysis of large gene lists. With its highly modular, JavaEE-based design, the implementation of new modules is straightforward. Furthermore, ISAAC comes with an extensive web-based administration interface including tools for the integration of third-party data. Thus, a local installation is easily feasible. In summary, ISAAC is tailor-made for highly explorative interactive analyses of gene, transcript and protein sets in a collaborative environment.
Computational dissection of human episodic memory reveals mental process-specific genetic profiles
Luksys, Gediminas; Fastenrath, Matthias; Coynel, David; Freytag, Virginie; Gschwind, Leo; Heck, Angela; Jessen, Frank; Maier, Wolfgang; Milnik, Annette; Riedel-Heller, Steffi G.; Scherer, Martin; Spalek, Klara; Vogler, Christian; Wagner, Michael; Wolfsgruber, Steffen; Papassotiropoulos, Andreas; de Quervain, Dominique J.-F.
2015-01-01
Episodic memory performance is the result of distinct mental processes, such as learning, memory maintenance, and emotional modulation of memory strength. Such processes can be effectively dissociated using computational models. Here we performed gene set enrichment analyses of model parameters estimated from the episodic memory performance of 1,765 healthy young adults. We report robust and replicated associations of the amine compound SLC (solute-carrier) transporters gene set with the learning rate, of the collagen formation and transmembrane receptor protein tyrosine kinase activity gene sets with the modulation of memory strength by negative emotional arousal, and of the L1 cell adhesion molecule (L1CAM) interactions gene set with the repetition-based memory improvement. Furthermore, in a large functional MRI sample of 795 subjects we found that the association between L1CAM interactions and memory maintenance revealed large clusters of differences in brain activity in frontal cortical areas. Our findings provide converging evidence that distinct genetic profiles underlie specific mental processes of human episodic memory. They also provide empirical support to previous theoretical and neurobiological studies linking specific neuromodulators to the learning rate and linking neural cell adhesion molecules to memory maintenance. Furthermore, our study suggests additional memory-related genetic pathways, which may contribute to a better understanding of the neurobiology of human memory. PMID:26261317
Computational dissection of human episodic memory reveals mental process-specific genetic profiles.
Luksys, Gediminas; Fastenrath, Matthias; Coynel, David; Freytag, Virginie; Gschwind, Leo; Heck, Angela; Jessen, Frank; Maier, Wolfgang; Milnik, Annette; Riedel-Heller, Steffi G; Scherer, Martin; Spalek, Klara; Vogler, Christian; Wagner, Michael; Wolfsgruber, Steffen; Papassotiropoulos, Andreas; de Quervain, Dominique J-F
2015-09-01
Episodic memory performance is the result of distinct mental processes, such as learning, memory maintenance, and emotional modulation of memory strength. Such processes can be effectively dissociated using computational models. Here we performed gene set enrichment analyses of model parameters estimated from the episodic memory performance of 1,765 healthy young adults. We report robust and replicated associations of the amine compound SLC (solute-carrier) transporters gene set with the learning rate, of the collagen formation and transmembrane receptor protein tyrosine kinase activity gene sets with the modulation of memory strength by negative emotional arousal, and of the L1 cell adhesion molecule (L1CAM) interactions gene set with the repetition-based memory improvement. Furthermore, in a large functional MRI sample of 795 subjects we found that the association between L1CAM interactions and memory maintenance revealed large clusters of differences in brain activity in frontal cortical areas. Our findings provide converging evidence that distinct genetic profiles underlie specific mental processes of human episodic memory. They also provide empirical support to previous theoretical and neurobiological studies linking specific neuromodulators to the learning rate and linking neural cell adhesion molecules to memory maintenance. Furthermore, our study suggests additional memory-related genetic pathways, which may contribute to a better understanding of the neurobiology of human memory.
Gu, Yunyan; Wang, Hongwei; Qin, Yao; Zhang, Yujing; Zhao, Wenyuan; Qi, Lishuang; Zhang, Yuannv; Wang, Chenguang; Guo, Zheng
2013-03-01
The heterogeneity of genetic alterations in human cancer genomes presents a major challenge to advancing our understanding of cancer mechanisms and identifying cancer driver genes. To tackle this heterogeneity problem, many approaches have been proposed to investigate genetic alterations and predict driver genes at the individual pathway level. However, most of these approaches ignore the correlation of alteration events between pathways and miss many genes with rare alterations collectively contributing to carcinogenesis. Here, we devise a network-based approach to capture the cooperative functional modules hidden in genome-wide somatic mutation and copy number alteration profiles of glioblastoma (GBM) from The Cancer Genome Atlas (TCGA), where a module is a set of altered genes with dense interactions in the protein interaction network. We identify 7 pairs of significantly co-altered modules that involve the main pathways known to be altered in GBM (TP53, RB and RTK signaling pathways) and highlight the striking co-occurring alterations among these GBM pathways. By taking into account the non-random correlation of gene alterations, the property of co-alteration could distinguish oncogenic modules that contain driver genes involved in the progression of GBM. The collaboration among cancer pathways suggests that the redundant models and aggravating models could shed new light on the potential mechanisms during carcinogenesis and provide new indications for the design of cancer therapeutic strategies.
DOSim: an R package for similarity between diseases based on Disease Ontology.
Li, Jiang; Gong, Binsheng; Chen, Xi; Liu, Tao; Wu, Chao; Zhang, Fan; Li, Chunquan; Li, Xiang; Rao, Shaoqi; Li, Xia
2011-06-29
The construction of the Disease Ontology (DO) has helped promote the investigation of diseases and disease risk factors. DO enables researchers to analyse disease similarity by adopting semantic similarity measures, and has expanded our understanding of the relationships between different diseases and our ability to classify them. Simultaneously, similarities between genes can also be analysed by their associations with similar diseases. As a result, disease heterogeneity is better understood and insights into the molecular pathogenesis of similar diseases have been gained. However, bioinformatics tools that provide easy and straightforward ways to use DO to study disease and gene similarity simultaneously are required. We have developed an R-based software package (DOSim) to compute the similarity between diseases and to measure the similarity between human genes in terms of diseases. DOSim incorporates a DO-based enrichment analysis function that can be used to explore the disease features of an independent gene set. A multilayered enrichment analysis (GO and KEGG annotation) function that helps users explore the biological meaning implied in a newly detected gene module is also part of the DOSim package. We used the disease similarity application to demonstrate the relationship between 128 different DO cancer terms. The hierarchical clustering of these 128 different cancers showed modular characteristics. In another case study, we used the gene similarity application on 361 obesity-related genes. The results revealed the complex pathogenesis of obesity. In addition, the gene module detection and gene module multilayered annotation functions in DOSim, when applied to these 361 obesity-related genes, helped extend our understanding of the complex pathogenesis of obesity risk phenotypes and the heterogeneity of obesity-related diseases. DOSim can be used to detect disease-driven gene modules, and to annotate the modules for functions and pathways. 
The DOSim package can also be used to visualise DO structure. DOSim can reflect the modular characteristic of disease related genes and promote our understanding of the complex pathogenesis of diseases. DOSim is available on the Comprehensive R Archive Network (CRAN) or http://bioinfo.hrbmu.edu.cn/dosim.
Aging effects on DNA methylation modules in human brain and blood tissue
2012-01-01
Background: Several recent studies reported aging effects on DNA methylation levels of individual CpG dinucleotides. But it is not yet known whether aging-related consensus modules, in the form of clusters of correlated CpG markers, can be found that are present in multiple human tissues. Such a module could facilitate the understanding of aging effects on multiple tissues. Results: We therefore employed weighted correlation network analysis of 2,442 Illumina DNA methylation arrays from brain and blood tissues, which enabled the identification of an age-related co-methylation module. Module preservation analysis confirmed that this module can also be found in diverse independent data sets. Biological evaluation showed that module membership is associated with Polycomb group target occupancy counts, CpG island status and autosomal chromosome location. Functional enrichment analysis revealed that the aging-related consensus module comprises genes that are involved in nervous system development, neuron differentiation and neurogenesis, and that it contains promoter CpGs of genes known to be down-regulated in early Alzheimer's disease. A comparison with a standard, non-module based meta-analysis revealed that selecting CpGs based on module membership leads to significantly increased gene ontology enrichment, thus demonstrating that studying aging effects via consensus network analysis enhances the biological insights gained. Conclusions: Overall, our analysis revealed a robustly defined age-related co-methylation module that is present in multiple human tissues, including blood and brain. We conclude that blood is a promising surrogate for brain tissue when studying the effects of age on DNA methylation profiles. PMID:23034122
Wisecaver, Jennifer H; Borowsky, Alexander T; Tzin, Vered; Jander, Georg; Kliebenstein, Daniel J; Rokas, Antonis
2017-05-01
Plants produce diverse specialized metabolites (SMs), but the genes responsible for their production and regulation remain largely unknown, hindering efforts to tap plant pharmacopeia. Given that genes comprising SM pathways exhibit environmentally dependent coregulation, we hypothesized that genes within a SM pathway would form tight associations (modules) with each other in coexpression networks, facilitating their identification. To evaluate this hypothesis, we used 10 global coexpression data sets, each a meta-analysis of hundreds to thousands of experiments, across eight plant species to identify hundreds of coexpressed gene modules per data set. In support of our hypothesis, 15.3 to 52.6% of modules contained two or more known SM biosynthetic genes, and module genes were enriched in SM functions. Moreover, modules recovered many experimentally validated SM pathways, including all six known to form biosynthetic gene clusters (BGCs). In contrast, bioinformatically predicted BGCs (i.e., those lacking an associated metabolite) were no more coexpressed than the null distribution for neighboring genes. These results suggest that most predicted plant BGCs are not genuine SM pathways and argue that BGCs are not a hallmark of plant specialized metabolism. We submit that global gene coexpression is a rich, largely untapped resource for discovering the genetic basis and architecture of plant natural products. © 2017 American Society of Plant Biologists. All rights reserved.
Discovery of error-tolerant biclusters from noisy gene expression data.
Gupta, Rohit; Rao, Navneet; Kumar, Vipin
2011-11-24
An important analysis performed on microarray gene-expression data is to discover biclusters, which denote groups of genes that are coherently expressed for a subset of conditions. Various biclustering algorithms have been proposed to find different types of biclusters from these real-valued gene-expression data sets. However, these algorithms suffer from several limitations, such as an inability to explicitly handle errors/noise in the data; difficulty in discovering small biclusters due to their top-down approach; and the inability of some of the approaches to find overlapping biclusters, which is crucial as many genes participate in multiple biological processes. Association pattern mining also produces biclusters as its result and can naturally address some of these limitations. However, traditional association mining only finds exact biclusters, which limits its applicability to real-life data sets where the biclusters may be fragmented due to random noise/errors. Moreover, as it only works with binary or boolean attributes, its application to gene-expression data requires transforming real-valued attributes to binary attributes, which often results in loss of information. Many past approaches have tried to address the issues of noise and of handling real-valued attributes independently, but there is no systematic approach that addresses both of these issues together. In this paper, we first propose a novel error-tolerant biclustering model, 'ET-bicluster', and then propose a bottom-up heuristic-based mining algorithm to sequentially discover error-tolerant biclusters directly from real-valued gene-expression data. The efficacy of our proposed approach is illustrated by comparing it with a recent approach, RAP, in the context of two biological problems: discovery of functional modules and discovery of biomarkers. 
For the first problem, two real-valued S. cerevisiae microarray gene-expression data sets are used to demonstrate that the biclusters obtained from the ET-bicluster approach not only recover a larger set of genes than those obtained from the RAP approach but also have higher functional coherence, as evaluated using GO-based functional enrichment analysis. The statistical significance of the discovered error-tolerant biclusters, as estimated using two randomization tests, reveals that they are indeed biologically meaningful and statistically significant. For the second problem of biomarker discovery, we used four real-valued breast cancer microarray gene-expression data sets and evaluated the biomarkers obtained using MSigDB gene sets. The results obtained for both problems, functional module discovery and biomarker discovery, clearly signify the usefulness of the proposed ET-bicluster approach and illustrate the importance of explicitly incorporating noise/errors in discovering coherent groups of genes from gene-expression data.
Vadigepalli, Rajanikanth; Chakravarthula, Praveen; Zak, Daniel E; Schwaber, James S; Gonye, Gregory E
2003-01-01
We have developed a bioinformatics tool named PAINT that automates the promoter analysis of a given set of genes for the presence of transcription factor binding sites. Based on coincidence of regulatory sites, this tool produces an interaction matrix that represents a candidate transcriptional regulatory network. This tool currently consists of (1) a database of promoter sequences of known or predicted genes in the Ensembl annotated mouse genome database, (2) various modules that can retrieve and process the promoter sequences for binding sites of known transcription factors, and (3) modules for visualization and analysis of the resulting set of candidate network connections. This information provides a substantially pruned list of genes and transcription factors that can be examined in detail in further experimental studies on gene regulation. Also, the candidate network can be incorporated into network identification methods in the form of constraints on feasible structures in order to render the algorithms tractable for large-scale systems. The tool can also produce output in various formats suitable for use in external visualization and analysis software. In this manuscript, PAINT is demonstrated in two case studies involving analysis of differentially regulated genes chosen from two microarray data sets. The first set is from a neuroblastoma N1E-115 cell differentiation experiment, and the second set is from neuroblastoma N1E-115 cells at different time intervals following exposure to neuropeptide angiotensin II. PAINT is available for use as an agent in BioSPICE simulation and analysis framework (www.biospice.org), and can also be accessed via a WWW interface at www.dbi.tju.edu/dbi/tools/paint/.
Chang, Xiao; Liu, Shuai; Yu, Yong-Tao; Li, Yi-Xue; Li, Yuan-Yuan
2010-08-12
The Saccharopolyspora erythraea genome sequence was released in 2007. In order to examine gene regulation at the whole-transcriptome level, an expression microarray was specifically designed based on the S. erythraea strain NRRL 2338 genome sequence. Based on these data, we set out to investigate the potential transcriptional regulatory networks and their organization. In view of the hierarchical structure of bacterial transcriptional regulation, we constructed a hierarchical coexpression network at the whole-transcriptome level. A total of 27 modules were identified from 1255 differentially expressed transcript units (TUs) across the time course, which were further classified into four groups. Functional enrichment analysis indicated the biological significance of our hierarchical network. It was indicated that primary metabolism is activated in the first rapid growth phase (phase A), and secondary metabolism is induced when growth slows down (phase B). Among the 27 modules, two are highly correlated with erythromycin production. One contains all genes in the erythromycin-biosynthetic (ery) gene cluster and the other seems to be associated with erythromycin production by sharing common intermediate metabolites. Non-concomitant correlation between production and expression regulation was observed. In particular, by calculating the partial correlation coefficients and building the network based on a Gaussian graphical model, intrinsic associations between modules were found, and the association between the two erythromycin production-correlated modules was included as expected. This work created a hierarchical model clustering transcriptome data into coordinated modules, and modules into groups across the time course, giving insight into the concerted transcriptional regulation, especially the regulation corresponding to erythromycin production, of S. erythraea. This strategy may be extendable to studies on other prokaryotic microorganisms.
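The abstract above mentions partial correlation coefficients in a Gaussian graphical model without giving the computation. A minimal sketch, assuming the textbook relation between partial correlations and the inverse correlation (precision) matrix P, namely rho_ij = -P_ij / sqrt(P_ii * P_jj), is shown below. The helper names and the toy correlation values are hypothetical, and this is not necessarily the estimator the authors used.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_correlations(corr):
    """Full-order partial correlations from a correlation matrix."""
    prec = invert(corr)
    n = len(corr)
    return [
        [1.0 if i == j else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
         for j in range(n)]
        for i in range(n)
    ]

# Made-up chain X - Y - Z: X and Z correlate (0.64) only through Y.
toy_corr = [[1.0, 0.8, 0.64],
            [0.8, 1.0, 0.8],
            [0.64, 0.8, 1.0]]
pcor = partial_correlations(toy_corr)
```

In this toy chain the partial correlation between the first and third variables vanishes once the middle variable is accounted for, which is the sense in which a Gaussian graphical model keeps direct associations and drops indirect ones.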
No3CoGP: non-conserved and conserved coexpressed gene pairs.
Mal, Chittabrata; Aftabuddin, Md; Kundu, Sudip
2014-12-08
By analyzing microarray data from different conditions, one can identify conserved and condition-specific genes and gene modules, and thus infer the underlying cellular activities. The available tools based on Bioconductor and R packages differ in how they extract differential coexpression and at what level they study it. There is a need for a user-friendly, flexible tool which can start analysis using raw or preprocessed microarray data and can report different levels of useful information. We present GUI software, No3CoGP: Non-Conserved and Conserved Coexpressed Gene Pairs, which takes Affymetrix microarray data (.CEL files or log2-normalized .txt files) along with an annotation file (.csv file), a Chip Definition File (CDF file) and a probe file as inputs, and uses the concept of a network density cut-off and Fisher's z-test to extract biologically relevant information. It can identify four possible types of gene pairs based on their coexpression relationships. These are (i) gene pairs showing coexpression in one condition but not in the other, (ii) gene pairs that are positively coexpressed in one condition but negatively coexpressed in the other condition, and gene pairs that are (iii) positively or (iv) negatively coexpressed in both conditions. Further, it can generate modules of coexpressed genes. The easy-to-use GUI enables researchers without knowledge of the R language to use No3CoGP. The use of one or more CPU cores, depending on availability, speeds up the program. The output files, stored in the respective directories under the user-defined project, allow researchers to unravel condition-specific functionalities of genes, gene sets or modules.
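The abstract above names Fisher's z-test but does not spell it out. As a rough sketch, the standard two-sided form of the test for a difference between two independent Pearson correlations (for example, the same gene pair measured in two conditions) looks like this. The function name and the example numbers are illustrative assumptions and do not reproduce No3CoGP's implementation.

```python
import math

def fisher_z_test(r1, n1, r2, n2):
    """Compare two independent Pearson correlations.

    r1, r2: correlations of the same gene pair in condition 1 and condition 2
    n1, n2: number of samples in each condition (each must exceed 3)
    Returns the z statistic and the two-sided p-value.
    """
    z1 = math.atanh(r1)                     # Fisher transformation of r1
    z2 = math.atanh(r2)                     # Fisher transformation of r2
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p

# Hypothetical gene pair: strongly coexpressed in one condition, weakly
# anti-correlated in the other, with 50 arrays per condition.
z_stat, p_value = fisher_z_test(0.9, 50, -0.2, 50)
```

A small p-value here would flag the pair as differentially coexpressed between the two conditions, which corresponds to categories (i) and (ii) in the abstract.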
Ficklin, Stephen P.; Feltus, Frank Alex
2013-01-01
Many traits of biological and agronomic significance in plants are controlled in a complex manner where multiple genes and environmental signals affect the expression of the phenotype. In Oryza sativa (rice), thousands of quantitative genetic signals have been mapped to the rice genome. In parallel, thousands of gene expression profiles have been generated across many experimental conditions. Through the discovery of networks with real gene co-expression relationships, it is possible to identify co-localized genetic and gene expression signals that implicate complex genotype-phenotype relationships. In this work, we used a knowledge-independent, systems genetics approach to discover a high-quality set of co-expression networks, termed Gene Interaction Layers (GILs). Twenty-two GILs were constructed from 1,306 Affymetrix microarray rice expression profiles that were pre-clustered to allow for improved capture of gene co-expression relationships. Functional genomic and genetic data, including over 8,000 QTLs and 766 phenotype-tagged SNPs (p-value ≤ 0.001) from genome-wide association studies, both covering over 230 different rice traits, were integrated with the GILs. An online systems genetics data-mining resource, the GeneNet Engine, was constructed to enable dynamic discovery of gene sets (i.e. network modules) that overlap with genetic traits. GeneNet Engine does not provide the exact set of genes underlying a given complex trait, but through the evidence of gene-marker correspondence, co-expression, and functional enrichment, site visitors can identify genes with potential shared causality for a trait, which could then be used for experimental validation. A set of 2 million SNPs was incorporated into the database and serves as a potential set of testable biomarkers for genes in modules that overlap with genetic traits. 
Herein, we describe two modules found using GeneNet Engine, one with significant overlap with the trait amylose content and another with significant overlap with blast disease resistance. PMID:23874666
Li, Jun; Tai, Cui; Deng, Zixin; Zhong, Weihong; He, Yongqun; Ou, Hong-Yu
2017-01-10
VRprofile is a Web server that facilitates rapid investigation of virulence and antibiotic resistance genes in newly sequenced pathogenic bacterial genomes, and extends the analysis to the transfer-related genetic contexts of these traits. The backend database, MobilomeDB, was first built on sets of known gene cluster loci of bacterial type III/IV/VI/VII secretion systems and mobile genetic elements, including integrative and conjugative elements, prophages, class I integrons, IS elements and pathogenicity/antibiotic resistance islands. VRprofile is thus able to co-localize the homologs of these conserved gene clusters using HMMer or BLASTp searches. With the integration of the homologous gene cluster search module with a sequence composition module, VRprofile has exhibited better performance for island-like region predictions than the other widely used methods. In addition, VRprofile also provides an integrated Web interface for aligning and visualizing identified gene clusters with MobilomeDB-archived gene clusters, or with a varied set of bacterial genomes. VRprofile may help meet the increasing demand for re-annotation of bacterial variable regions, and aid in the real-time definition of disease-relevant gene clusters in pathogenic bacteria of interest. VRprofile is freely available at http://bioinfo-mml.sjtu.edu.cn/VRprofile. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Chang, Lun-Ching; Jamain, Stephane; Lin, Chien-Wei; Rujescu, Dan; Tseng, George C; Sibille, Etienne
2014-01-01
Large scale gene expression (transcriptome) analysis and genome-wide association studies (GWAS) for single nucleotide polymorphisms have generated a considerable amount of gene- and disease-related information, but heterogeneity and various sources of noise have limited the discovery of disease mechanisms. As systematic dataset integration is becoming essential, we developed methods and performed meta-clustering of gene coexpression links in 11 transcriptome studies from postmortem brains of human subjects with major depressive disorder (MDD) and non-psychiatric control subjects. We next sought enrichment in the top 50 meta-analyzed coexpression modules for genes otherwise identified by GWAS for various sets of disorders. One coexpression module of 88 genes was consistently and significantly associated with GWAS for MDD, other neuropsychiatric disorders and brain functions, and for medical illnesses with elevated clinical risk of depression, but not for other diseases. In support of the superior discriminative power of this novel approach, we observed no significant enrichment for GWAS-related genes in coexpression modules extracted from single studies or in meta-modules using gene expression data from non-psychiatric control subjects. Genes in the identified module encode proteins implicated in neuronal signaling and structure, including glutamate metabotropic receptors (GRM1, GRM7), GABA receptors (GABRA2, GABRA4), and neurotrophic and development-related proteins [BDNF, reelin (RELN), Ephrin receptors (EPHA3, EPHA5)]. These results are consistent with the current understanding of molecular mechanisms of MDD and provide a set of putative interacting molecular partners, potentially reflecting components of a functional module across cells and biological pathways that are synchronously recruited in MDD, other brain disorders and MDD-related illnesses. 
Collectively, this study demonstrates the importance of integrating transcriptome data, gene coexpression modules and GWAS results for providing novel and complementary approaches to investigate the molecular pathology of MDD and other complex brain disorders.
Network Approach to Disease Diagnosis
NASA Astrophysics Data System (ADS)
Sharma, Amitabh; Bashan, Amir; Barabasi, Albert-Laszlo
2014-03-01
Human diseases could be viewed as perturbations of the underlying biological system. A thorough understanding of the topological and dynamical properties of the biological system is crucial to explain the mechanisms of many complex diseases. Recently, network-based approaches have provided a framework for integrating multi-dimensional biological data that results in a better understanding of the pathophysiological state of complex diseases. Here we provide a network-based framework to improve the diagnosis of complex diseases. This framework is based on the integration of transcriptomics and the interactome. We analyze the overlap between the differentially expressed (DE) genes and disease genes (DGs) based on their locations in the molecular interaction network ("interactome"). Disease genes and their protein products tend to be much more highly connected than random, hence defining a disease sub-graph (called disease module) in the interactome. DE genes, even though different from the known set of DGs, may be significantly associated with the disease when considering their closeness to the disease module in the interactome. This new network approach holds the promise to improve the diagnosis of patients who cannot be diagnosed using conventional tools. Support was provided by HL066289 and HL105339 grants from the U.S. National Institutes of Health.
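The abstract above does not say how "closeness to the disease module" is measured. As a minimal sketch, assuming closeness is the shortest-path distance (in edges) from a DE gene to the nearest disease-module gene, a breadth-first search over the interactome is enough. The function name, the toy network and the gene labels are all hypothetical.

```python
from collections import deque


def distance_to_module(graph, source, module):
    """Shortest-path length (in edges) from `source` to the nearest gene in `module`.

    `graph` maps each gene to the set of its interaction partners.
    Returns None when no module gene is reachable from `source`.
    """
    if source in module:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in seen:
                continue
            if neighbor in module:
                return dist + 1
            seen.add(neighbor)
            queue.append((neighbor, dist + 1))
    return None


# Toy undirected interactome (hypothetical gene labels, not real data).
interactome = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
    "E": set(),  # isolated gene with no known interactions
}
disease_module = {"C", "D"}
de_genes = ["A", "B", "E"]

closeness = {g: distance_to_module(interactome, g, disease_module) for g in de_genes}
print(closeness)
```

In a real analysis the distances would presumably be compared against a null distribution from randomized gene sets; that step is left out here.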
A gene expression biomarker accurately predicts estrogen ...
The EPA’s vision for the Endocrine Disruptor Screening Program (EDSP) in the 21st Century (EDSP21) includes utilization of high-throughput screening (HTS) assays coupled with computational modeling to prioritize chemicals with the goal of eventually replacing current Tier 1 screening tests. The ToxCast program currently includes 18 HTS in vitro assays that evaluate the ability of chemicals to modulate estrogen receptor α (ERα), an important endocrine target. We propose microarray-based gene expression profiling as a complementary approach to predict ERα modulation and have developed computational methods to identify ERα modulators in an existing database of whole-genome microarray data. The ERα biomarker consisted of 46 ERα-regulated genes with consistent expression patterns across 7 known ER agonists and 3 known ER antagonists. The biomarker was evaluated as a predictive tool using the fold-change rank-based Running Fisher algorithm by comparison to annotated gene expression data sets from experiments in MCF-7 cells. Using 141 comparisons from chemical- and hormone-treated cells, the biomarker gave a balanced accuracy for prediction of ERα activation or suppression of 94% or 93%, respectively. The biomarker was able to correctly classify 18 out of 21 (86%) OECD ER reference chemicals including “very weak” agonists and replicated predictions based on 18 in vitro ER-associated HTS assays. For 114 chemicals present in both the HTS data and the MCF-7 c
WGCNA: an R package for weighted correlation network analysis.
Langfelder, Peter; Horvath, Steve
2008-12-29
Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. 
The R package along with its source code and additional material are freely available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA.
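To make the idea of a weighted correlation network concrete, here is a minimal sketch in plain Python. It is not the WGCNA R API. It assumes the commonly described unsigned construction, in which the adjacency between two genes is the absolute Pearson correlation raised to a soft-thresholding power, and the connectivity of a gene is the sum of its adjacencies. The power `beta=6`, the function names and the toy expression profiles are illustrative assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: |cor(g, h)| ** beta for every gene pair."""
    genes = list(profiles)
    return {
        g: {h: abs(pearson(profiles[g], profiles[h])) ** beta for h in genes if h != g}
        for g in genes
    }


def connectivity(adjacency):
    """Sum of adjacencies per gene (its weighted degree in the network)."""
    return {g: sum(neighbors.values()) for g, neighbors in adjacency.items()}


# Toy expression profiles across four samples (hypothetical values).
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with g1
    "g3": [4.0, 3.0, 2.0, 1.0],    # perfectly anti-correlated with g1
    "g4": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with g1
}

adjacency = soft_threshold_adjacency(profiles)
k = connectivity(adjacency)
print(k)
```

Because the network is unsigned, the anti-correlated gene g3 is as strongly connected to g1 as g2 is, while g4 contributes nothing. Module detection and eigengene summaries, which the package also provides, are beyond this sketch.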
WGCNA: an R package for weighted correlation network analysis
Langfelder, Peter; Horvath, Steve
2008-01-01
Background Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. Results The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. Conclusion The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. The R package along with its source code and additional material are freely available at . PMID:19114008
He, Hao; Zhang, Lei; Li, Jian; Wang, Yu-Ping; Zhang, Ji-Gang; Shen, Jie; Guo, Yan-Fang
2014-01-01
Context: To date, few systems genetics studies in the bone field have been performed. We designed our study from a systems-level perspective by integrating genome-wide association studies (GWASs), human protein-protein interaction (PPI) network, and gene expression to identify gene modules contributing to osteoporosis risk. Methods: First we searched for modules significantly enriched with bone mineral density (BMD)-associated genes in human PPI network by using 2 large meta-analysis GWAS datasets through a dense module search algorithm. One included 7 individual GWAS samples (Meta7). The other was from the Genetic Factors for Osteoporosis Consortium (GEFOS2). One was assigned as a discovery dataset and the other as an evaluation dataset, and vice versa. Results: In total, 42 modules and 129 modules were identified significantly in both Meta7 and GEFOS2 datasets for femoral neck and spine BMD, respectively. There were 3340 modules identified for hip BMD only in Meta7. As candidate modules, they were assessed for the biological relevance to BMD by gene set enrichment analysis in 2 expression profiles generated from circulating monocytes in subjects with low versus high BMD values. Interestingly, there were 2 modules significantly enriched in monocytes from the low BMD group in both gene expression datasets (nominal P value <.05). Two modules had 16 nonredundant genes. Functional enrichment analysis revealed that both modules were enriched for genes involved in Wnt receptor signaling and osteoblast differentiation. Conclusion: We highlighted 2 modules and novel genes playing important roles in the regulation of bone mass, providing important clues for therapeutic approaches for osteoporosis. PMID:25119315
Grassi, Angela; Di Camillo, Barbara; Ciccarese, Francesco; Agnusdei, Valentina; Zanovello, Paola; Amadori, Alberto; Finesso, Lorenzo; Indraccolo, Stefano; Toffolo, Gianna Maria
2016-03-12
Inference of gene regulation from expression data may help to unravel regulatory mechanisms involved in complex diseases or in the action of specific drugs. A challenging task for many researchers working in the field of systems biology is to build up an experiment with a limited budget and produce a dataset suitable to reconstruct putative regulatory modules worthy of biological validation. Here, we focus on small-scale gene expression screens and we introduce a novel experimental set-up and a customized method of analysis to make inference on regulatory modules starting from genetic perturbation data, e.g. knockdown and overexpression data. To illustrate the utility of our strategy, it was applied to produce and analyze a dataset of quantitative real-time RT-PCR data, in which interferon-α (IFN-α) transcriptional response in endothelial cells is investigated by RNA silencing of two candidate IFN-α modulators, STAT1 and IFIH1. A putative regulatory module was reconstructed by our method, revealing an intriguing feed-forward loop, in which STAT1 regulates IFIH1 and they both negatively regulate IFNAR1. STAT1 regulation of IFNAR1 was the subject of experimental validation at the protein level. A detailed description of the experimental set-up and of the analysis procedure is reported, with the intent to guide other scientists who want to carry out similar experiments to reconstruct gene regulatory modules starting from perturbations of possible regulators. Application of our approach to the study of IFN-α transcriptional response modulators in endothelial cells has led to many interesting novel findings and new biological hypotheses worthy of validation.
Identifying prognostic signature in ovarian cancer using DirGenerank
Wang, Jian-Yong; Chen, Ling-Ling; Zhou, Xiong-Hui
2017-01-01
Identifying the prognostic genes in cancer is essential not only for the treatment of cancer patients, but also for drug discovery. However, it's still a big challenge to select the prognostic genes that can distinguish the risk of cancer patients across various data sets because of tumor heterogeneity. In this situation, the selected genes whose expression levels are statistically related to prognostic risks may be passengers. In this paper, based on gene expression data and prognostic data of ovarian cancer patients, we used conditional mutual information to construct a gene dependency network in which the nodes (genes) with more out-degrees have more chances to be the modulators of cancer prognosis. After that, we proposed the DirGenerank (Generank in direct network) algorithm, which concerns both the gene dependency network and genes’ correlations to prognostic risks, to identify the gene signature that can predict the prognostic risks of ovarian cancer patients. Using the ovarian cancer data set from TCGA (The Cancer Genome Atlas) as the training data set, 40 genes with the highest importance were selected as the prognostic signature. Survival analysis of these patients divided by the prognostic signature in the testing data set and four independent data sets showed the signature can distinguish the prognostic risks of cancer patients significantly. Enrichment analysis of the signature with curated cancer genes and the drugs selected by CMAP showed the genes in the signature may be drug targets for therapy. In summary, we have proposed a useful pipeline to identify prognostic genes of cancer patients. PMID:28615526
Nandi, Sutanu; Subramanian, Abhishek; Sarkar, Ram Rup
2017-07-25
Prediction of essential genes helps to identify a minimal set of genes that are absolutely required for the appropriate functioning and survival of a cell. The available machine learning techniques for essential gene prediction have inherent problems, like imbalanced provision of training datasets, biased choice of the best model for a given balanced dataset, choice of a complex machine learning algorithm, and data-based automated selection of biologically relevant features for classification. Here, we propose a simple support vector machine-based learning strategy for the prediction of essential genes in Escherichia coli K-12 MG1655 metabolism that integrates a non-conventional combination of an appropriate sample balanced training set, a unique organism-specific genotype, phenotype attributes that characterize essential genes, and optimal parameters of the learning algorithm to generate the best machine learning model (the model with the highest accuracy among all the models trained for different sample training sets). For the first time, we also introduce flux-coupled metabolic subnetwork-based features for enhancing the classification performance. Our strategy proves to be superior as compared to previous SVM-based strategies in obtaining a biologically relevant classification of genes with high sensitivity and specificity. This methodology was also trained with datasets of other recent supervised classification techniques for essential gene classification and tested using reported test datasets. The testing accuracy was always high as compared to the known techniques, proving that our method outperforms known methods. Observations from our study indicate that essential genes are conserved among homologous bacterial species, demonstrate high codon usage bias, GC content and gene expression, and predominantly possess a tendency to form physiological flux modules in metabolism.
A cis-regulatory logic simulator.
Zeigler, Robert D; Gertz, Jason; Cohen, Barak A
2007-07-27
A major goal of computational studies of gene regulation is to accurately predict the expression of genes based on the cis-regulatory content of their promoters. The development of computational methods to decode the interactions among cis-regulatory elements has been slow, in part, because it is difficult to know, without extensive experimental validation, whether a particular method identifies the correct cis-regulatory interactions that underlie a given set of expression data. There is an urgent need for test expression data in which the interactions among cis-regulatory sites that produce the data are known. The ability to rapidly generate such data sets would facilitate the development and comparison of computational methods that predict gene expression patterns from promoter sequence. We developed a gene expression simulator which generates expression data using user-defined interactions between cis-regulatory sites. The simulator can incorporate additive, cooperative, competitive, and synergistic interactions between regulatory elements. Constraints on the spacing, distance, and orientation of regulatory elements and their interactions may also be defined and Gaussian noise can be added to the expression values. The simulator allows for a data transformation that simulates the sigmoid shape of expression levels from real promoters. We found good agreement between sets of simulated promoters and predicted regulatory modules from real expression data. We present several data sets that may be useful for testing new methodologies for predicting gene expression from promoter sequence. We developed a flexible gene expression simulator that rapidly generates large numbers of simulated promoters and their corresponding transcriptional output based on specified interactions between cis-regulatory sites. When appropriate rule sets are used, the data generated by our simulator faithfully reproduces experimentally derived data sets. 
We anticipate that using simulated gene expression data sets will facilitate the direct comparison of computational strategies to predict gene expression from promoter sequence. The source code is available online and as additional material. The test sets are available as additional material.
Lee, Mikyung; Kim, Yangseok
2009-12-16
Genomic alterations frequently occur in many cancer patients and play important mechanistic roles in the pathogenesis of cancer. Furthermore, they can modify the expression level of genes due to altered copy number in the corresponding region of the chromosome. An accumulating body of evidence supports the possibility that strong genome-wide correlation exists between DNA content and gene expression. Therefore, more comprehensive analysis is needed to quantify the relationship between genomic alteration and gene expression. A well-designed bioinformatics tool is essential to perform this kind of integrative analysis. A few programs have already been introduced for integrative analysis. However, published software is limited in its ability to perform comprehensive integrated analysis because of constraints in the implemented algorithms and visualization modules. To address this issue, we have implemented the Java-based program CHESS to allow integrative analysis of two experimental data sets: genomic alteration and genome-wide expression profile. CHESS is composed of a genomic alteration analysis module and an integrative analysis module. The genomic alteration analysis module detects genomic alteration by applying a threshold-based method or the SW-ARRAY algorithm and investigates whether the detected alteration is phenotype specific or not. On the other hand, the integrative analysis module measures the genomic alteration's influence on gene expression. It is divided into two separate parts. The first part calculates overall correlation between comparative genomic hybridization ratio and gene expression level by applying the following three statistical methods: simple linear regression, Spearman rank correlation and Pearson's correlation. 
In the second part, CHESS detects the genes that are differentially expressed according to the genomic alteration pattern with three alternative statistical approaches: Student's t-test, Fisher's exact test and Chi square test. By successive operation of the two modules, users can clarify how gene expression levels are affected by the phenotype-specific genomic alterations. As CHESS was developed in both Java application and web environments, it can be run on a web browser or a local machine. It also supports all experimental platforms if a properly formatted text file is provided to include the chromosomal position of probes and their gene identifiers. CHESS is a user-friendly tool for investigating disease-specific genomic alterations and quantitative relationships between those genomic alterations and genome-wide gene expression profiling.
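Two of the statistics named in the CHESS abstract, Pearson's correlation and Spearman rank correlation, can be illustrated with a short plain-Python sketch. This is not CHESS code (CHESS is a Java program whose internals are not described here). The function names and the paired copy-number and expression values are hypothetical, and Spearman is computed in the standard way, as Pearson's correlation on average ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired values for one gene across five samples.
cgh_log_ratio = [-0.5, 0.0, 0.4, 0.9, 1.5]
expression = [1.0, 1.2, 2.0, 4.5, 20.0]

r_pearson = pearson(cgh_log_ratio, expression)
r_spearman = spearman(cgh_log_ratio, expression)
print(r_pearson, r_spearman)
```

The toy data are monotonic but not linear, so the rank-based coefficient reaches 1 while Pearson's stays below it, which is one reason a tool might report both.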
Genomic Locus Modulating IOP in the BXD RI Mouse Strains
King, Rebecca; Li, Ying; Wang, Jiaxing; Struebing, Felix L.; Geisert, Eldon E.
2018-01-01
Intraocular pressure (IOP) is the primary risk factor for developing glaucoma, yet little is known about the contribution of genomic background to IOP regulation. The present study leverages an array of systems genetics tools to study genomic factors modulating normal IOP in the mouse. The BXD recombinant inbred (RI) strain set was used to identify genomic loci modulating IOP. We measured the IOP in a total of 506 eyes from 38 different strains. Strain averages were subjected to conventional quantitative trait analysis by means of composite interval mapping. Candidate genes were defined, and immunohistochemistry and quantitative PCR (qPCR) were used for validation. Of the 38 BXD strains examined, the mean IOP ranged from a low of 13.2 mmHg to a high of 17.1 mmHg. The means for each strain were used to calculate a genome-wide interval map. One significant quantitative trait locus (QTL) was found on Chr. 8 (96 to 103 Mb). Within this 7 Mb region only 4 annotated genes were found: Gm15679, Cdh8, Cdh11 and Gm8730. Only two genes (Cdh8 and Cdh11) were candidates for modulating IOP based on the presence of non-synonymous SNPs. Further examination using SIFT (Sorting Intolerant From Tolerant) analysis revealed that the SNPs in Cdh8 (Cadherin 8) were predicted to not change protein function, while the SNPs in Cdh11 (Cadherin 11) would not be tolerated, affecting protein function. Furthermore, immunohistochemistry demonstrated that CDH11 is expressed in the trabecular meshwork of the mouse. We have examined the genomic regulation of IOP in the BXD RI strain set and found one significant QTL on Chr. 8. Within this QTL, there is one good candidate gene, Cdh11. PMID:29496776
Li, Yiping; Li, Yanhong; Bai, Zhenjiang; Pan, Jian; Wang, Jian; Fang, Fang
2017-12-13
Sepsis is a complex disease with a dysregulated inflammatory response and a high mortality rate. The goal of this study was to identify potential transcriptomic markers in developing pediatric sepsis by a co-expression module analysis of the transcriptomic dataset. Using the R software and Bioconductor packages, we performed a weighted gene co-expression network analysis to identify co-expression modules significantly associated with pediatric sepsis. Functional interpretation (gene ontology and pathway analysis) and enrichment analysis with known transcription factors and microRNAs of the identified candidate modules were then performed. In modules significantly associated with sepsis, the intramodular analysis was further performed and "hub genes" were identified and validated by quantitative real-time PCR (qPCR) in this study. In total, 15 co-expression modules were detected, and four modules ("midnight blue", "cyan", "brown", and "tan") were most significantly associated with pediatric sepsis and suggested as potential sepsis-associated modules. Gene ontology analysis and pathway analysis revealed that these four modules were strongly associated with immune response. Three of the four sepsis-associated modules were also enriched with known transcription factors (false discovery rate-adjusted P < 0.05). Hub genes were identified in each of the four modules. Four of the identified hub genes (MYB proto-oncogene like 1, killer cell lectin like receptor G1, stomatin, and membrane spanning 4-domains A4A) were further validated to be differentially expressed between septic children and controls by qPCR. Four pediatric sepsis-associated co-expression modules were identified in this study. qPCR results suggest that hub genes in these modules are potential transcriptomic markers for pediatric sepsis diagnosis. These results provide novel insights into the pathogenesis of pediatric sepsis and promote the generation of diagnostic gene sets.
Molecular mechanisms of floral organ specification by MADS domain proteins.
Yan, Wenhao; Chen, Dijun; Kaufmann, Kerstin
2016-02-01
Flower development is a model system to understand organ specification in plants. The identities of different types of floral organs are specified by homeotic MADS transcription factors that interact in a combinatorial fashion. Systematic identification of DNA-binding sites and target genes of these key regulators shows that they have shared and unique sets of target genes. DNA binding by MADS proteins is not based on 'simple' recognition of a specific DNA sequence, but depends on DNA structure and combinatorial interactions. Homeotic MADS proteins regulate gene expression via alternative mechanisms, one of which may be to modulate chromatin structure and accessibility in their target gene promoters. Copyright © 2015 Elsevier Ltd. All rights reserved.
Genomic approaches for the elucidation of genes and gene networks underlying cardiovascular traits.
Adriaens, M E; Bezzina, C R
2018-06-22
Genome-wide association studies have shed light on the association between natural genetic variation and cardiovascular traits. However, linking a cardiovascular trait-associated locus to a candidate gene or set of candidate genes for prioritization for follow-up mechanistic studies is anything but straightforward. Genomic technologies based on next-generation sequencing now offer multiple opportunities to dissect gene regulatory networks underlying genetic cardiovascular trait associations, thereby aiding in the identification of candidate genes at unprecedented scale. RNA sequencing in particular becomes a powerful tool when combined with genotyping to identify loci that modulate transcript abundance, known as expression quantitative trait loci (eQTL), or loci modulating transcript splicing, known as splicing quantitative trait loci (sQTL). Additionally, the allele-specific resolution of RNA-sequencing technology enables estimation of allelic imbalance, a state where the two alleles of a gene are expressed at a ratio differing from the expected 1:1 ratio. When multiple high-throughput approaches are combined with deep phenotyping in a single study, a comprehensive elucidation of the relationship between genotype and phenotype comes into view, an approach known as systems genetics. In this review, we cover key applications of systems genetics in the broad cardiovascular field.
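The definition of allelic imbalance given above (allele expression deviating from the expected 1:1 ratio) lends itself to a small worked example. The review does not name a test, so the sketch below assumes one common choice: an exact two-sided binomial test of the reference and alternative allele read counts at a heterozygous site against a success probability of 0.5. The function name and the read counts are hypothetical.

```python
from math import comb


def allelic_imbalance_pvalue(ref_reads, alt_reads):
    """Exact two-sided binomial p-value for deviation from a 1:1 allelic ratio.

    With p = 0.5 the binomial distribution is symmetric, so the two-sided
    p-value is twice the smaller tail, capped at 1.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("at least one read is required")
    smaller = min(ref_reads, alt_reads)
    tail = sum(comb(total, i) for i in range(smaller + 1)) / 2 ** total
    return min(1.0, 2 * tail)


# Hypothetical read counts at heterozygous sites.
balanced = allelic_imbalance_pvalue(5, 5)     # perfectly balanced expression
imbalanced = allelic_imbalance_pvalue(10, 0)  # only one allele observed
print(balanced, imbalanced)
```

Real RNA-seq counts are usually overdispersed and subject to reference mapping bias, so practical tools often use more elaborate models; the binomial test only illustrates the 1:1 null hypothesis.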
HIT'nDRIVE: patient-specific multidriver gene prioritization for precision oncology
Hodzic, Ermin; Sauerwald, Thomas; Dao, Phuong; Wang, Kendric; Yeung, Jake; Anderson, Shawn; Vandin, Fabio; Haffari, Gholamreza; Collins, Colin C.; Sahinalp, S. Cenk
2017-01-01
Prioritizing molecular alterations that act as drivers of cancer remains a crucial bottleneck in therapeutic development. Here we introduce HIT'nDRIVE, a computational method that integrates genomic and transcriptomic data to identify a set of patient-specific, sequence-altered genes, with sufficient collective influence over dysregulated transcripts. HIT'nDRIVE aims to solve the “random walk facility location” (RWFL) problem in a gene (or protein) interaction network, which differs from the standard facility location problem by its use of an alternative distance measure: “multihitting time,” the expected length of the shortest random walk from any one of the set of sequence-altered genes to an expression-altered target gene. When applied to 2200 tumors from four major cancer types, HIT'nDRIVE revealed many potentially clinically actionable driver genes. We also demonstrated that it is possible to perform accurate phenotype prediction for tumor samples by only using HIT'nDRIVE-seeded driver gene modules from gene interaction networks. In addition, we identified a number of breast cancer subtype-specific driver modules that are associated with patients’ survival outcome. Furthermore, HIT'nDRIVE, when applied to a large panel of pan-cancer cell lines, accurately predicted drug efficacy using the driver genes and their seeded gene modules. Overall, HIT'nDRIVE may help clinicians contextualize massive multiomics data in therapeutic decision making, enabling widespread implementation of precision oncology. PMID:28768687
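The distance measure described above is built on random-walk hitting times. The sketch below computes only the ordinary single-source quantity, the expected number of steps a simple random walk needs to first reach a target gene, and not the multihitting time or the RWFL optimization itself. It assumes an undirected, connected, unweighted network and solves the standard equations h(target) = 0 and h(u) = 1 + mean of h over the neighbors of u. The function name and the toy network are hypothetical.

```python
from fractions import Fraction


def hitting_times(graph, target):
    """Expected steps for a simple random walk to first reach `target`.

    `graph` maps each node to the set of its neighbors (undirected, connected).
    Solves h(u) - (1/deg(u)) * sum(h(v) for non-target neighbors v) = 1
    exactly by Gauss-Jordan elimination over fractions.
    """
    nodes = [n for n in graph if n != target]
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    # Augmented matrix: `size` coefficient columns plus one right-hand-side column.
    rows = [[Fraction(0)] * (size + 1) for _ in range(size)]
    for node in nodes:
        i = index[node]
        rows[i][i] += 1
        rows[i][size] = Fraction(1)
        step = Fraction(1, len(graph[node]))
        for neighbor in graph[node]:
            if neighbor != target:
                rows[i][index[neighbor]] -= step
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return {node: rows[index[node]][size] for node in nodes}


# Toy path network: mutated gene "m1" -- intermediate "x" -- target "t".
network = {"m1": {"x"}, "x": {"m1", "t"}, "t": {"x"}}
times = hitting_times(network, "t")
print(times)
```

On this three-node path the walk from "m1" needs 4 steps on average and the walk from "x" needs 3, matching the known n-squared result for a path of n edges.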
Kaushik, Abhinav; Bhatia, Yashuma; Ali, Shakir; Gupta, Dinesh
2015-01-01
Metastatic melanoma patients have a poor prognosis, mainly attributable to the underlying heterogeneity in melanoma driver genes and altered gene expression profiles. These characteristics of melanoma also make the development of drugs and identification of novel drug targets for metastatic melanoma a daunting task. Systems biology offers an alternative approach to re-explore the genes or gene sets that display dysregulated behaviour without being differentially expressed. In this study, we have performed systems biology studies to enhance our knowledge about the conserved property of disease genes or gene sets among mutually exclusive datasets representing melanoma progression. We meta-analysed 642 microarray samples to generate melanoma reconstructed networks representing four different stages of melanoma progression to extract genes with altered molecular circuitry wiring as compared to a normal cellular state. Intriguingly, a majority of the melanoma network-rewired genes are not differentially expressed and the disease genes involved in melanoma progression consistently modulate its activity by rewiring network connections. We found that the shortlisted disease genes in the study show strong and abnormal network connectivity, which enhances with the disease progression. Moreover, the deviated network properties of the disease gene sets allow ranking/prioritization of different enriched, dysregulated and conserved pathway terms in metastatic melanoma, in agreement with previous findings. Our analysis also reveals presence of distinct network hubs in different stages of metastasizing tumor for the same set of pathways in the statistically conserved gene sets. The study results are also presented as a freely available database at http://bioinfo.icgeb.res.in/m3db/. The web-based database resource consists of results from the analysis presented here, integrated with cytoscape web and user-friendly tools for visualization, retrieval and further analysis. 
PMID:26558755
Finding pathway-modulating genes from a novel Ontology Fingerprint-derived gene network.
Qin, Tingting; Matmati, Nabil; Tsoi, Lam C; Mohanty, Bidyut K; Gao, Nan; Tang, Jijun; Lawson, Andrew B; Hannun, Yusuf A; Zheng, W Jim
2014-10-01
To enhance our knowledge regarding biological pathway regulation, we took an integrated approach, using the biomedical literature, ontologies, network analyses and experimental investigation to infer novel genes that could modulate biological pathways. We first constructed a novel gene network via a pairwise comparison of all yeast genes' Ontology Fingerprints--a set of Gene Ontology terms overrepresented in the PubMed abstracts linked to a gene along with those terms' corresponding enrichment P-values. The network was further refined using a Bayesian hierarchical model to identify novel genes that could potentially influence the pathway activities. We applied this method to the sphingolipid pathway in yeast and found that many top-ranked genes indeed displayed altered sphingolipid pathway functions, initially measured by their sensitivity to myriocin, an inhibitor of de novo sphingolipid biosynthesis. Further experiments confirmed the modulation of the sphingolipid pathway by one of these genes, PFA4, encoding a palmitoyl transferase. Comparative analysis showed that few of these novel genes could be discovered by other existing methods. Our novel gene network provides a unique and comprehensive resource to study pathway modulations and systems biology in general.
Nehme, A; Zibara, K; Cerutti, C; Bricca, G
2015-06-01
The implication of the renin-angiotensin-aldosterone system (RAAS) in atheroma development is well described. However, a complete view of the local RAAS in atheroma is still missing. In this study we aimed to reveal the organization of RAAS in atheroma at the transcriptomic level and identify the transcriptional regulators behind it. Extended RAAS (extRAAS) was defined as the set of 37 genes coding for classical and novel RAAS participants (Figure 1). Five microarray datasets containing overall 590 samples representing carotid and peripheral atheroma were downloaded from the GEO database. Correlation-based hierarchical clustering (R software) of extRAAS genes within each dataset allowed the identification of modules of co-expressed genes. Reproducible co-expression modules across datasets were then extracted. Transcription factors (TFs) having common binding sites (TFBSs) in the promoters of coordinated genes were identified using the Genomatix database tools and analyzed for their correlation with extRAAS genes in the microarray datasets. Expression data revealed the expressed extRAAS components and their relative abundance displaying the favored pathways in atheroma. Three co-expression modules with more than 80% reproducibility across datasets were extracted. Two of them (M1 and M2) contained genes coding for angiotensin metabolizing enzymes involved in different pathways: M1 included ACE, MME, RNPEP, and DPP3, in addition to 7 other genes; and M2 included CMA1, CTSG, and CPA3. The third module (M3) contained genes coding for receptors known to be implicated in atheroma (AGTR1, MR, GR, LNPEP, EGFR and GPER). M1 and M3 were negatively correlated in 3 of 5 datasets. We identified 19 TFs that have enriched TFBSs in the promoters of genes of M1, and two for M3, but none was found for M2. 
Among the extracted TFs, ELF1, MAX, and IRF5 showed significant positive correlations with peptidase-coding genes from M1 and negative correlations with receptor-coding genes from M3 (p < 0.05). The identified co-expression modules display the transcriptional organization of local extRAAS in human carotid atheroma. The identification of several TFs potentially associated with extRAAS genes may provide a framework for the discovery of atheroma-specific modulators of extRAAS activity. (Figure is included in full-text article.)
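The correlation-based clustering step in this abstract can be illustrated with a minimal sketch. It assumes a correlation distance of 1 - r and average-linkage agglomeration stopped at a fixed cut-off; the study itself used R, and the function names, the cut-off, and the expression values below are invented for illustration only (the gene symbols are borrowed from modules M1 and M2).

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_modules(expr, cutoff):
    """Average-linkage agglomerative clustering on distance 1 - r.

    Merging stops once the closest pair of clusters is farther apart
    than `cutoff` (a hypothetical parameter, not taken from the study).
    """
    clusters = [[gene] for gene in expr]

    def avg_dist(a, b):
        total = sum(1.0 - pearson(expr[g], expr[h]) for g in a for h in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]))
        if avg_dist(clusters[i], clusters[j]) > cutoff:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Toy expression profiles: hypothetical values, four samples per gene.
toy_expr = {
    "ACE": [1.0, 2.0, 3.0, 4.0],
    "MME": [1.1, 2.1, 2.9, 4.2],
    "CMA1": [4.0, 3.0, 2.0, 1.0],
    "CTSG": [4.1, 2.9, 2.2, 0.9],
}
modules = cluster_modules(toy_expr, cutoff=0.5)
```

With these toy values the two positively correlated pairs end up in separate modules, because the pairs are anti-correlated with each other and therefore lie far apart under the 1 - r distance.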
Disentangling the multigenic and pleiotropic nature of molecular function
2015-01-01
Background Biological processes at the molecular level are usually represented by molecular interaction networks. Function is organised and modularity identified based on network topology; however, this approach often fails to account for the dynamic and multifunctional nature of molecular components. For example, a molecule engaging in spatially or temporally independent functions may be inappropriately clustered into a single functional module. To capture biologically meaningful sets of interacting molecules, we use experimentally defined pathways as spatial/temporal units of molecular activity. Results We defined functional profiles of Saccharomyces cerevisiae based on a minimal set of Gene Ontology terms sufficient to represent each pathway's genes. The Gene Ontology terms were used to annotate 271 pathways, accounting for pathway multi-functionality and gene pleiotropy. Pathways were then arranged into a network, linked by shared functionality. Of the genes in our data set, 44% appeared in multiple pathways performing a diverse set of functions. Linking pathways by overlapping functionality revealed a modular network with energy metabolism forming a sparse centre, surrounded by several denser clusters comprised of regulatory and metabolic pathways. Signalling pathways formed a relatively discrete cluster connected to the centre of the network. Genetic interactions were enriched within the clusters of pathways by a factor of 5.5, confirming that the organisation of our pathway network is biologically significant. Conclusions Our representation of molecular function according to pathway relationships enables analysis of gene/protein activity in the context of specific functional roles, as an alternative to typical molecule-centric graph-based methods. The pathway network demonstrates the cooperation of multiple pathways to perform biological processes and organises pathways into functionally related clusters with interdependent outcomes. PMID:26678917
Generation of oscillating gene regulatory network motifs
NASA Astrophysics Data System (ADS)
van Dorp, M.; Lannoo, B.; Carlon, E.
2013-07-01
Using an improved version of an evolutionary algorithm originally proposed by François and Hakim [Proc. Natl. Acad. Sci. USA 101, 580 (2004), doi:10.1073/pnas.0304532101], we generated small gene regulatory networks in which the concentration of a target protein oscillates in time. These networks may serve as candidates for oscillatory modules to be found in larger regulatory networks and protein interaction networks. The algorithm was run 10^5 times to produce a large set of oscillating modules, which were systematically classified and analyzed. The robustness of the oscillations against variations of the kinetic rates was also determined, to filter out the least robust cases. Furthermore, we show that the set of evolved networks can serve as a database of models whose behavior can be compared to experimentally observed oscillations. The algorithm found three smallest (core) oscillators in which nonlinearities and number of components are minimal. Two of those are two-gene modules: the mixed feedback loop, already discussed in the literature, and an autorepressed gene coupled with a heterodimer. The third one is a single gene module which is competitively regulated by a monomer and a dimer. The evolutionary algorithm also generated larger oscillating networks, which are in part extensions of the three core modules and in part genuinely new modules. The latter includes oscillators which do not rely on feedback induced by transcription factors, but are purely of post-transcriptional type. Analysis of post-transcriptional mechanisms of oscillation may provide useful information for circadian clock research, as recent experiments showed that circadian rhythms are maintained even in the absence of transcription.
Finding pathway-modulating genes from a novel Ontology Fingerprint-derived gene network
Qin, Tingting; Matmati, Nabil; Tsoi, Lam C.; Mohanty, Bidyut K.; Gao, Nan; Tang, Jijun; Lawson, Andrew B.; Hannun, Yusuf A.; Zheng, W. Jim
2014-01-01
To enhance our knowledge regarding biological pathway regulation, we took an integrated approach, using the biomedical literature, ontologies, network analyses and experimental investigation to infer novel genes that could modulate biological pathways. We first constructed a novel gene network via a pairwise comparison of all yeast genes’ Ontology Fingerprints—a set of Gene Ontology terms overrepresented in the PubMed abstracts linked to a gene along with those terms’ corresponding enrichment P-values. The network was further refined using a Bayesian hierarchical model to identify novel genes that could potentially influence the pathway activities. We applied this method to the sphingolipid pathway in yeast and found that many top-ranked genes indeed displayed altered sphingolipid pathway functions, initially measured by their sensitivity to myriocin, an inhibitor of de novo sphingolipid biosynthesis. Further experiments confirmed the modulation of the sphingolipid pathway by one of these genes, PFA4, encoding a palmitoyl transferase. Comparative analysis showed that few of these novel genes could be discovered by other existing methods. Our novel gene network provides a unique and comprehensive resource to study pathway modulations and systems biology in general. PMID:25063300
GSCALite: A Web Server for Gene Set Cancer Analysis.
Liu, Chun-Jie; Hu, Fei-Fei; Xia, Mengxuan; Han, Leng; Zhang, Qiong; Guo, An-Yuan
2018-05-22
The availability of cancer genomic data makes it possible to analyze genes related to cancer. Cancer is usually the result of a set of genes and the signal of a single gene could be covered by background noise. Here, we present a web server named Gene Set Cancer Analysis (GSCALite) to analyze a set of genes in cancers with the following functional modules. (i) Differential expression in tumor vs normal, and the survival analysis; (ii) Genomic variations and their survival analysis; (iii) Gene expression associated cancer pathway activity; (iv) miRNA regulatory network for genes; (v) Drug sensitivity for genes; (vi) Normal tissue expression and eQTL for genes. GSCALite is a user-friendly web server for dynamic analysis and visualization of gene sets in cancer and drug sensitivity correlation, which will be of broad utility to cancer researchers. GSCALite is available at http://bioinfo.life.hust.edu.cn/web/GSCALite/. Contact: guoay@hust.edu.cn or zhangqiong@hust.edu.cn. Supplementary data are available at Bioinformatics online.
Interactions in the microbiome: communities of organisms and communities of genes
Boon, Eva; Meehan, Conor J; Whidden, Chris; Wong, Dennis H-J; Langille, Morgan GI; Beiko, Robert G
2014-01-01
A central challenge in microbial community ecology is the delineation of appropriate units of biodiversity, which can be taxonomic, phylogenetic, or functional in nature. The term ‘community’ is applied ambiguously; in some cases, the term refers simply to a set of observed entities, while in other cases, it requires that these entities interact with one another. Microorganisms can rapidly gain and lose genes, potentially decoupling community roles from taxonomic and phylogenetic groupings. Trait-based approaches offer a useful alternative, but many traits can be defined based on gene functions, metabolic modules, and genomic properties, and the optimal set of traits to choose is often not obvious. An analysis that considers taxon assignment and traits in concert may be ideal, with the strengths of each approach offsetting the weaknesses of the other. Individual genes also merit consideration as entities in an ecological analysis, with characteristics such as diversity, turnover, and interactions modeled using genes rather than organisms as entities. We identify some promising avenues of research that are likely to yield a deeper understanding of microbial communities that shift from observation-based questions of ‘Who is there?’ and ‘What are they doing?’ to the mechanistically driven question of ‘How will they respond?’ PMID:23909933
Cho, Hyun-Soo; Kang, Jeong Gu; Lee, Jae-Hye; Lee, Jeong-Ju; Jeon, Seong Kook; Ko, Jeong-Heon; Kim, Dae-Soo; Park, Kun-Hyang; Kim, Yong-Sam; Kim, Nam-Soon
2015-09-15
TALE-nuclease chimeras (TALENs) can bind to and cleave specific genomic loci and are used to engineer gene knockouts and additions. Recently, instead of using the FokI domain, epigenetically active domains, such as TET1 and LSD1, have been combined with TAL effector domains to regulate targeted gene expression via DNA and histone demethylation. However, studies of histone methylation in the TALE system have not been performed. Therefore, in this study, we established a novel targeted regulation system with a TAL effector domain and a histone methylation domain. To construct a TALE-methylation fusion protein, we combined a TAL effector domain containing an E-Box region to act as a Snail binding site and the SET domain of EHMT2 to allow for histone methylation. The constructed TALE-SET module (TSET) repressed the expression of E-cadherin by increasing H3K9 dimethylation. Moreover, the cells that overexpressed TSET showed increased cell migration and invasion. This is the first phenotype-based study of targeted histone methylation by the TALE module, and this new system can be applied in new cancer therapies to reduce side effects.
A system view and analysis of essential hypertension.
Botzer, Alon; Grossman, Ehud; Moult, John; Unger, Ron
2018-05-01
The goal of this study was to investigate genes associated with essential hypertension from a system perspective, making use of bioinformatic tools to gain insights that are not evident when focusing at a detail-based resolution. Using various databases (pathways, Genome Wide Association Studies, knockouts etc.), we compiled a set of about 200 genes that play a major role in hypertension and identified the interactions between them. This enabled us to create a protein-protein interaction network graph, from which we identified key elements, based on graph centrality analysis. Enriched gene regulatory elements (transcription factors and microRNAs) were extracted by motif finding techniques and knowledge-based tools. We found that the network is composed of modules associated with functions such as water retention, endothelial vasoconstriction, sympathetic activity and others. We identified the transcription factor SP1 and the two microRNAs miR27 (a and b) and miR548c-3p that seem to play a major role in regulating the network as they exert their control over several modules and are not restricted to specific functions. We also noticed that genes involved in metabolic diseases (e.g. insulin) are central to the network. We view the blood-pressure regulation mechanism as a system-of-systems, composed of several contributing subsystems and pathways rather than a single module. The system is regulated by distributed elements. Understanding this mode of action can lead to a more precise treatment and drug target discovery. Our analysis suggests that insulin plays a primary role in hypertension, highlighting the tight link between essential hypertension and diseases associated with the metabolic syndrome.
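The graph centrality analysis mentioned in this abstract can be sketched with the simplest centrality measure, degree centrality. This is only an illustration: the abstract does not say which centrality measures were used, and the node labels and edges below are placeholders rather than real protein-protein interactions.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Fraction of the other nodes that each node is directly connected to."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    n = len(neighbours)
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbours.items()}


# Placeholder undirected interaction graph (hypothetical nodes G1..G4).
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
centrality = degree_centrality(toy_edges)

# The node with the highest centrality is a candidate key element (hub).
hub = max(centrality, key=centrality.get)
```

Degree centrality only counts direct neighbours; measures such as betweenness or closeness would rank nodes by their position on paths through the network instead.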
Searching for statistically significant regulatory modules.
Bailey, Timothy L; Noble, William Stafford
2003-10-01
The regulatory machinery controlling gene expression is complex, frequently requiring multiple, simultaneous DNA-protein interactions. The rate at which a gene is transcribed may depend upon the presence or absence of a collection of transcription factors bound to the DNA near the gene. Locating transcription factor binding sites in genomic DNA is difficult because the individual sites are small and tend to occur frequently by chance. True binding sites may be identified by their tendency to occur in clusters, sometimes known as regulatory modules. We describe an algorithm for detecting occurrences of regulatory modules in genomic DNA. The algorithm, called mcast, takes as input a DNA database and a collection of binding site motifs that are known to operate in concert. mcast uses a motif-based hidden Markov model with several novel features. The model incorporates motif-specific p-values, thereby allowing scores from motifs of different widths and specificities to be compared directly. The p-value scoring also allows mcast to only accept motif occurrences with significance below a user-specified threshold, while still assigning better scores to motif occurrences with lower p-values. mcast can search long DNA sequences, modeling length distributions between motifs within a regulatory module, but ignoring length distributions between modules. The algorithm produces a list of predicted regulatory modules, ranked by E-value. We validate the algorithm using simulated data as well as real data sets from fruitfly and human. http://meme.sdsc.edu/MCAST/paper
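The idea of accepting only motif occurrences below a p-value threshold, while rewarding lower p-values, can be illustrated with a much simpler stand-in than the hidden Markov model the abstract describes. The sketch below is not the mcast algorithm: the gap-based grouping, the scoring formula -log10(p / threshold), the parameter names, and the hit list are all assumptions made for illustration.

```python
import math


def find_modules(hits, p_threshold=1e-3, max_gap=50):
    """Group significant motif hits lying within `max_gap` bp of each other.

    `hits` is a list of (position, p_value) tuples. Hits with a p-value at
    or above `p_threshold` are discarded before grouping.
    """
    kept = sorted((pos, p) for pos, p in hits if p < p_threshold)
    modules, current = [], []
    for pos, p in kept:
        # Start a new module when the gap to the previous hit is too large.
        if current and pos - current[-1][0] > max_gap:
            modules.append(current)
            current = []
        current.append((pos, p))
    if current:
        modules.append(current)

    # Each hit contributes -log10(p / threshold), so lower p-values score higher.
    scored = [
        {
            "start": m[0][0],
            "end": m[-1][0],
            "n_hits": len(m),
            "score": sum(-math.log10(p / p_threshold) for _, p in m),
        }
        for m in modules
    ]
    return sorted(scored, key=lambda m: m["score"], reverse=True)


# Hypothetical motif hits: (position in bp, p-value).
toy_hits = [(100, 1e-5), (130, 1e-4), (160, 1e-6), (900, 1e-4), (2000, 0.2)]
ranked = find_modules(toy_hits)
```

In this toy input the three hits near position 100 form one candidate module, the isolated hit at 900 forms another, and the hit at 2000 is dropped because its p-value exceeds the threshold.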
htsint: a Python library for sequencing pipelines that combines data through gene set generation.
Richards, Adam J; Herrel, Anthony; Bonneaud, Camille
2015-09-24
Sequencing technologies provide a wealth of details in terms of genes, expression, splice variants, polymorphisms, and other features. A standard for sequencing analysis pipelines is to put genomic or transcriptomic features into a context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level enables a convenient way to both integrate annotation data and detect small coordinated changes between experimental conditions, a known caveat of gene level analyses. We introduce the high throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, like a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for 'enrichment' or conditional differences using one of a number of commonly available packages. The database and bundled tools to generate functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint allows it to also be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint.
2012-01-01
Background Transcript profiling of differentiating secondary xylem has allowed us to draw a general picture of the genes involved in wood formation. However, our knowledge is still limited about the regulatory mechanisms that coordinate and modulate the different pathways providing substrates during xylogenesis. The development of compression wood in conifers constitutes an exceptional model for these studies. Although differential expression of a few genes in differentiating compression wood compared to normal or opposite wood has been reported, the broad range of features that distinguish this reaction wood suggest that the expression of a larger set of genes would be modified. Results By combining the construction of different cDNA libraries with microarray analyses we have identified a total of 496 genes in maritime pine (Pinus pinaster, Ait.) that change in expression during differentiation of compression wood (331 up-regulated and 165 down-regulated compared to opposite wood). Samples from different provenances collected in different years and geographic locations were integrated into the analyses to mitigate the effects of multiple sources of variability. This strategy allowed us to define a group of genes that are consistently associated with compression wood formation. Correlating with the deposition of a thicker secondary cell wall that characterizes compression wood development, the expression of a number of genes involved in synthesis of cellulose, hemicellulose, lignin and lignans was up-regulated. Further analysis of a set of these genes involved in S-adenosylmethionine metabolism, ammonium recycling, and lignin and lignans biosynthesis showed changes in expression levels in parallel to the levels of lignin accumulation in cells undergoing xylogenesis in vivo and in vitro. 
Conclusions The comparative transcriptomic analysis reported here has revealed a broad spectrum of coordinated transcriptional modulation of genes involved in biosynthesis of different cell wall polymers associated with within-tree variations in pine wood structure and composition. In particular, we demonstrate the coordinated modulation at the transcriptional level of a gene set involved in S-adenosylmethionine synthesis and ammonium assimilation with increased demand for coniferyl alcohol for lignin and lignan synthesis, enabling a better understanding of the metabolic requirements in cells undergoing lignification. PMID:22747794
Analysis of bHLH coding genes using gene co-expression network approach.
Srivastava, Swati; Sanchita; Singh, Garima; Singh, Noopur; Srivastava, Gaurava; Sharma, Ashok
2016-07-01
Network analysis provides a powerful framework for the interpretation of data. It uses novel reference network-based metrics for module evolution. These can be used to identify modules of highly connected genes showing variation in the co-expression network. In this study, a co-expression network-based approach was used for analyzing genes from microarray data. Our approach consists of a simple but robust rank-based network construction. The publicly available gene expression data of Solanum tuberosum under cold and heat stresses were used to create and analyze a gene co-expression network. The analysis provides a highly co-expressed module of bHLH coding genes based on correlation values. Our approach was to analyze the variation in gene expression according to the duration of stress through the co-expression network approach. As a result, seed genes showing multiple connections with other genes in the same cluster were identified. The seed genes were found to vary across different time periods of stress. These seed genes may be utilized further as marker genes for developing stress-tolerant plant species.
An Iterative Time Windowed Signature Algorithm for Time Dependent Transcription Module Discovery
Meng, Jia; Gao, Shou-Jiang; Huang, Yufei
2010-01-01
An algorithm for the discovery of time varying modules using genome-wide expression data is presented here. When applied to large-scale time series data, our method is designed to discover not only the transcription modules but also their timing information, which is rarely annotated by the existing approaches. Rather than assuming commonly defined time constant transcription modules, a module is depicted as a set of genes that are co-regulated during a specific period of time, i.e., a time dependent transcription module (TDTM). A rigorous mathematical definition of TDTM is provided, which serves as an objective function for retrieving modules. Based on the definition, an effective signature algorithm is proposed that iteratively searches the transcription modules from the time series data. The proposed method was tested on simulated systems and applied to human time series microarray data during Kaposi's sarcoma-associated herpesvirus (KSHV) infection. The result has been verified by Expression Analysis Systematic Explorer. PMID:21552463
Bao, Weier; Greenwold, Matthew J; Sawyer, Roger H
2017-11-01
Gene co-expression network analysis has been a research method widely used in systematically exploring gene function and interaction. Using the Weighted Gene Co-expression Network Analysis (WGCNA) approach to construct a gene co-expression network using data from a customized 44K microarray transcriptome of chicken epidermal embryogenesis, we have identified two distinct modules that are highly correlated with scale or feather development traits. Signaling pathways related to feather development were enriched in the traditional KEGG pathway analysis and functional terms relating specifically to embryonic epidermal development were also enriched in the Gene Ontology analysis. Significant enrichment annotations were discovered from customized enrichment tools such as Modular Single-Set Enrichment Test (MSET) and Medical Subject Headings (MeSH). Hub genes in both trait-correlated modules showed strong specific functional enrichment toward epidermal development. Also, regulatory elements, such as transcription factors and miRNAs, were targeted in the significant enrichment result. This work highlights the advantage of this methodology for functional prediction of genes not previously associated with scale- and feather trait-related modules.
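The core of a weighted co-expression network is the soft-thresholded adjacency, which in the unsigned case is the absolute Pearson correlation raised to a power beta. The sketch below illustrates only that step in plain Python; it is not the WGCNA R package, and the gene labels, expression values, and the choice beta = 6 are assumptions for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta for all gene pairs."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes for h in genes if g != h
    }


def connectivity(adjacency):
    """Sum of adjacencies per gene; high values mark candidate hub genes."""
    k = {}
    for (g, _), a in adjacency.items():
        k[g] = k.get(g, 0.0) + a
    return k


# Hypothetical expression profiles (four samples per gene).
toy_expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],
    "gene_c": [1.0, 3.0, 2.0, 4.0],
}
adj = soft_threshold_adjacency(toy_expr, beta=6)
k = connectivity(adj)
```

Raising the correlation to a power keeps strong correlations close to 1 while pushing moderate ones toward 0: here a correlation of 0.8 between gene_a and gene_c shrinks to an adjacency of about 0.26.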
PyPathway: Python Package for Biological Network Analysis and Visualization.
Xu, Yang; Luo, Xiao-Chun
2018-05-01
Life science studies represent one of the biggest generators of large data sets, mainly because of rapid sequencing technological advances. Biological networks including interactive networks and human curated pathways are essential to understand these high-throughput data sets. Biological network analysis offers a method to explore systematically not only the molecular complexity of a particular disease but also the molecular relationships among apparently distinct phenotypes. Currently, several packages for the Python community have been developed, such as BioPython and Goatools. However, tools to perform comprehensive network analysis and visualization are still needed. Here, we have developed PyPathway, an extensible free and open source Python package for functional enrichment analysis, network modeling, and network visualization. The network process module supports various interaction network and pathway databases such as Reactome, WikiPathway, STRING, and BioGRID. The network analysis module implements overrepresentation analysis, gene set enrichment analysis, network-based enrichment, and de novo network modeling. Finally, the visualization and data publishing modules enable users to share their analysis by using an easy web application. For package availability, see the first Reference.
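Overrepresentation analysis, one of the methods listed in this abstract, is commonly computed as a one-sided hypergeometric test. The sketch below shows that calculation with the standard library only; it does not use or reproduce the PyPathway API, and the function name and the example counts are assumptions for illustration.

```python
from math import comb


def overrepresentation_p(background, pathway, query, overlap):
    """One-sided hypergeometric p-value for pathway overrepresentation.

    background: number of genes in the background set
    pathway:    number of background genes annotated to the pathway
    query:      number of genes in the query list
    overlap:    number of query genes annotated to the pathway
    Returns P(X >= overlap) when `query` genes are drawn without replacement.
    """
    upper = min(pathway, query)
    total = comb(background, query)
    tail = sum(
        comb(pathway, i) * comb(background - pathway, query - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 3 in the pathway,
# a query list of 3 genes, all of which fall in the pathway.
p_value = overrepresentation_p(background=10, pathway=3, query=3, overlap=3)
```

In practice the test is repeated for every pathway in a database, so the resulting p-values normally need a multiple-testing correction before interpretation.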
Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation
Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; Taylor, Ronald C.; Weisenhorn, Pamela; Olson, Robert D.; Stevens, Rick L.; Rocha, Miguel; Rocha, Isabel; Best, Aaron A.; DeJongh, Matthew; Tintle, Nathan L.; Parrello, Bruce; Overbeek, Ross; Henry, Christopher S.
2016-01-01
Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. An important step toward meeting the challenge of understanding gene function and regulation is the identification of sets of genes that are always co-expressed. These gene sets, Atomic Regulons (ARs), represent fundamental units of function within a cell and could be used to associate genes of unknown function with cellular processes and to enable rational genetic engineering of cellular systems. Here, we describe an approach for inferring ARs that leverages large-scale expression data sets, gene context, and functional relationships among genes. We computed ARs for Escherichia coli based on 907 gene expression experiments and compared our results with gene clusters produced by two prevalent data-driven methods: Hierarchical clustering and k-means clustering. We compared ARs and purely data-driven gene clusters to the curated set of regulatory interactions for E. coli found in RegulonDB, showing that ARs are more consistent with gold standard regulons than are data-driven gene clusters. We further examined the consistency of ARs and data-driven gene clusters in the context of gene interactions predicted by Context Likelihood of Relatedness (CLR) analysis, finding that the ARs show better agreement with CLR predicted interactions. We determined the impact of increasing amounts of expression data on AR construction and find that while more data improve ARs, it is not necessary to use the full set of gene expression experiments available for E. coli to produce high quality ARs. In order to explore the conservation of co-regulated gene sets across different organisms, we computed ARs for Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus, each of which represents increasing degrees of phylogenetic distance from E. coli. 
Comparison of the organism-specific ARs showed that the consistency of AR gene membership correlates with phylogenetic distance, but there is clear variability in the regulatory networks of closely related organisms. As large scale expression data sets become increasingly common for model and non-model organisms, comparative analyses of atomic regulons will provide valuable insights into fundamental regulatory modules used across the bacterial domain. PMID:27933038
Software-assisted stacking of gene modules using GoldenBraid 2.0 DNA-assembly framework.
Vazquez-Vilar, Marta; Sarrion-Perdigones, Alejandro; Ziarsolo, Peio; Blanca, Jose; Granell, Antonio; Orzaez, Diego
2015-01-01
GoldenBraid (GB) is a modular DNA assembly technology for plant multigene engineering based on type IIS restriction enzymes. GB speeds up the assembly of transcriptional units from standard genetic parts and facilitates the stacking of several genes within the same T-DNA in a few days. GBcloning is software-assisted with a set of online tools. The GBDomesticator tool assists in the adaptation of DNA parts to the GBstandard. The combination of GB-adapted parts to build new transcriptional units is assisted by the GB TU Assembler tool. Finally, the assembly of multigene modules is simulated by the GB Binary Assembler. All the software tools are available at www.gbcloning.org. Here, we describe in detail the assembly methodology to create a multigene construct with three transcriptional units for polyphenol metabolic engineering in plants.
GARNET--gene set analysis with exploration of annotation relations.
Rho, Kyoohyoung; Kim, Bumjin; Jang, Youngjun; Lee, Sanghyun; Bae, Taejeong; Seo, Jihae; Seo, Chaehwa; Lee, Jihyun; Kang, Hyunjung; Yu, Ungsik; Kim, Sunghoon; Lee, Sanghyuk; Kim, Wan Kyu
2011-02-15
Gene set analysis is a powerful method of deducing biological meaning for an a priori defined set of genes. Numerous tools have been developed to test statistical enrichment or depletion in specific pathways or gene ontology (GO) terms. Major difficulties towards biological interpretation are integrating diverse types of annotation categories and exploring the relationships between annotation terms of similar information. GARNET (Gene Annotation Relationship NEtwork Tools) is an integrative platform for gene set analysis with many novel features. It includes tools for retrieval of genes from annotation database, statistical analysis & visualization of annotation relationships, and managing gene sets. In an effort to allow access to a full spectrum of amassed biological knowledge, we have integrated a variety of annotation data that include the GO, domain, disease, drug, chromosomal location, and custom-defined annotations. Diverse types of molecular networks (pathways, transcription and microRNA regulations, protein-protein interaction) are also included. The pair-wise relationship between annotation gene sets was calculated using kappa statistics. GARNET consists of three modules--gene set manager, gene set analysis and gene set retrieval, which are tightly integrated to provide virtually automatic analysis for gene sets. A dedicated viewer for annotation network has been developed to facilitate exploration of the related annotations. GARNET (gene annotation relationship network tools) is an integrative platform for diverse types of gene set analysis, where complex relationships among gene annotations can be easily explored with an intuitive network visualization tool (http://garnet.isysbio.org/ or http://ercsb.ewha.ac.kr/garnet/).
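The pair-wise kappa statistic between two annotation gene sets can be sketched as Cohen's kappa, treating each gene in a shared universe as either inside or outside each set. This is an illustrative reading of the abstract rather than GARNET's own code; the function name, the gene identifiers, and the universe are hypothetical, and both sets are assumed to be subsets of the universe.

```python
def cohen_kappa(set_a, set_b, universe):
    """Cohen's kappa for the agreement of two gene sets over `universe`."""
    n = len(universe)
    both = len(set_a & set_b)
    only_a = len(set_a - set_b)
    only_b = len(set_b - set_a)
    neither = n - both - only_a - only_b

    # Observed agreement: genes placed in both sets or in neither.
    observed = (both + neither) / n
    # Agreement expected by chance from the marginal set sizes.
    expected = (
        (both + only_a) * (both + only_b)
        + (neither + only_b) * (neither + only_a)
    ) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical universe of ten genes and two overlapping annotation sets.
universe = {f"g{i}" for i in range(10)}
annotation_1 = {"g0", "g1", "g2", "g3", "g4"}
annotation_2 = {"g0", "g1", "g2", "g3", "g5"}
kappa = cohen_kappa(annotation_1, annotation_2, universe)
```

A kappa near 1 indicates that two annotation terms label almost the same genes, while a value near 0 means their overlap is no better than expected from their sizes alone.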
Verdier, Jerome; Lalanne, David; Pelletier, Sandra; Torres-Jerez, Ivone; Righetti, Karima; Bandyopadhyay, Kaustav; Leprince, Olivier; Chatelain, Emilie; Vu, Benoit Ly; Gouzy, Jerome; Gamas, Pascal; Udvardi, Michael K.; Buitink, Julia
2013-01-01
In seeds, desiccation tolerance (DT) and the ability to survive the dry state for prolonged periods of time (longevity) are two essential traits for seed quality that are consecutively acquired during maturation. Using transcriptomic and metabolomic profiling together with a conditional-dependent network of global transcription interactions, we dissected the maturation events from the end of seed filling to final maturation drying during the last 3 weeks of seed development in Medicago truncatula. The network revealed distinct coexpression modules related to the acquisition of DT, longevity, and pod abscission. The acquisition of DT and dormancy module was associated with abiotic stress response genes, including late embryogenesis abundant (LEA) genes. The longevity module was enriched in genes involved in RNA processing and translation. Concomitantly, LEA polypeptides accumulated, displaying an 18-d delayed accumulation compared with transcripts. During maturation, gulose and stachyose levels increased and correlated with longevity. A seed-specific network identified known and putative transcriptional regulators of DT, including ABSCISIC ACID-INSENSITIVE3 (MtABI3), MtABI4, MtABI5, and APETALA2/ ETHYLENE RESPONSE ELEMENT BINDING PROTEIN (AtAP2/EREBP) transcription factor as major hubs. These transcriptional activators were highly connected to LEA genes. Longevity genes were highly connected to two MtAP2/EREBP and two basic leucine zipper transcription factors. A heat shock factor was found at the transition of DT and longevity modules, connecting to both gene sets. Gain- and loss-of-function approaches of MtABI3 confirmed 80% of its predicted targets, thereby experimentally validating the network. This study captures the coordinated regulation of seed maturation and identifies distinct regulatory networks underlying the preparation for the dry and quiescent states. PMID:23929721
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.
Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin
2017-08-31
Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster. PMID:28858211
Differential network as an indicator of osteoporosis with network entropy.
Ma, Lili; Du, Hongmei; Chen, Guangdong
2018-07-01
Osteoporosis is a common skeletal disorder characterized by a decrease in bone mass and density. The peak bone mass (PBM) is a significant determinant of osteoporosis. To gain insight into PBM as an indicator of osteoporosis, this study focused on characterizing the PBM networks and identifying key genes. One biological data set with 12 monocyte low PBM samples and 11 high PBM samples was used to construct protein-protein interaction networks (PPINs). A clique-merging-based module-identification algorithm was used to identify modules from the PPINs. Systematic calculation and comparison were performed to test whether network entropy can discriminate the low PBM network from the high PBM network. We constructed 32 destination networks with 66 modules divided from monocyte low and high PBM networks. Among them, network 11 was the only significantly differential one (P<0.05), with 8 nodes and 28 edges. All genes belonged to precursors of osteoclasts, which were related to calcium transport as well as blood monocytes. In conclusion, based on the entropy in PBM PPINs, the differential network appears to be a novel therapeutic indicator for osteoporosis during the bone monocyte progression; these findings are helpful in disclosing the pathogenetic mechanisms of osteoporosis.
Pan-phylum Comparison of Nematode Metabolic Potential
Tyagi, Rahul; Rosa, Bruce A.; Lewis, Warren G.; Mitreva, Makedonka
2015-01-01
Nematodes are among the most important causative pathogens of neglected tropical diseases. The increased availability of genomic and transcriptomic data for many understudied nematode species provides a great opportunity to investigate different aspects of their biology. Increasingly, metabolic potential of pathogens is recognized as a critical determinant governing their development, growth and pathogenicity. Comparing metabolic potential among species with distinct trophic ecologies can provide insights on overall biology or molecular adaptations. Furthermore, ascertaining gene expression at pathway level can help in understanding metabolic dynamics over development. Comparison of biochemical pathways (or subpathways, i.e. pathway modules) among related species can also retrospectively indicate potential mistakes in gene-calling and functional annotation. We show with numerous illustrative case studies that comparisons at the level of pathway modules have the potential to uncover biological insights while remaining computationally tractable. Here, we reconstruct and compare metabolic modules found in the deduced proteomes of 13 nematodes and 10 non-nematode species (including hosts of the parasitic nematode species). We observed that the metabolic potential is, in general, concomitant with phylogenetic and/or ecological similarity. Varied metabolic strategies are required among the nematodes, with only 8 out of 51 pathway modules being completely conserved. Enzyme comparison based on topology of metabolic modules uncovered diversification between parasite and host that can potentially guide therapeutic intervention. Gene expression data from 4 nematode species were used to study metabolic dynamics over their life cycles. We report unexpected differential metabolism between immature and mature microfilariae of the human filarial parasite Brugia malayi. 
A set of genes potentially important for parasitism is also reported, based on an analysis of gene expression in C. elegans and the human hookworm Necator americanus. We illustrate how analyzing and comparing metabolism at the level of pathway modules can improve existing knowledge of nematode metabolic potential and can provide parasitism related insights. Our reconstruction and comparison of nematode metabolic pathways at a pan-phylum and inter-phylum level enabled determination of phylogenetic restrictions and differential expression of pathways. A visualization of our results is available at http://nematode.net and the program for identification of module completeness (modDFS) is freely available at SourceForge. The methods reported will help biologists to predict biochemical potential of any organism with available deduced proteome, to direct experiments and test hypotheses. PMID:26000881
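The idea of module completeness in the nematode abstract above can be sketched in a simplified form. The sketch assumes a pathway module is an ordered list of steps, each satisfiable by any one of several alternative enzymes, and that a module is complete when every step has at least one enzyme in the deduced proteome. It is not the modDFS algorithm, whose details the abstract does not give; the enzyme identifiers, species names and function names are hypothetical.

```python
def module_completeness(steps, present):
    """Fraction of module steps that have at least one enzyme present.

    steps   -- list of sets; each set holds alternative enzyme IDs for one step
    present -- enzyme IDs found in a species' deduced proteome
    """
    if not steps:
        raise ValueError("a module needs at least one step")
    present = set(present)
    satisfied = sum(1 for alternatives in steps if present & set(alternatives))
    return satisfied / len(steps)


def is_complete(steps, present):
    """A module counts as complete only if every step is satisfied."""
    return module_completeness(steps, present) == 1.0


# Hypothetical three-step module and two hypothetical proteomes.
module = [{"E1", "E2"}, {"E3"}, {"E4", "E5"}]
proteomes = {
    "species_A": {"E1", "E3", "E5"},   # covers every step
    "species_B": {"E2", "E4"},         # lacks the enzyme for step 2
}
report = {name: module_completeness(module, enzymes)
          for name, enzymes in proteomes.items()}
```

Comparing such per-species scores across many modules gives a presence/absence matrix that can be set against phylogeny or trophic ecology, which is the kind of comparison the abstract describes.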
Conserved Non-Coding Regulatory Signatures in Arabidopsis Co-Expressed Gene Modules
Spangler, Jacob B.; Ficklin, Stephen P.; Luo, Feng; Freeling, Michael; Feltus, F. Alex
2012-01-01
Complex traits and other polygenic processes require coordinated gene expression. Co-expression networks model mRNA co-expression: the product of gene regulatory networks. To identify regulatory mechanisms underlying coordinated gene expression in a tissue-enriched context, ten Arabidopsis thaliana co-expression networks were constructed after manually sorting 4,566 RNA profiling datasets into aerial, flower, leaf, root, rosette, seedling, seed, shoot, whole plant, and global (all samples combined) groups. Collectively, the ten networks contained 30% of the measurable genes of Arabidopsis and were circumscribed into 5,491 modules. Modules were scrutinized for cis regulatory mechanisms putatively encoded in conserved non-coding sequences (CNSs) previously identified as remnants of a whole genome duplication event. We determined the non-random association of 1,361 unique CNSs to 1,904 co-expression network gene modules. Furthermore, the CNS elements were placed in the context of known gene regulatory networks (GRNs) by connecting 250 CNS motifs with known GRN cis elements. Our results provide support for a regulatory role of some CNS elements and suggest the functional consequences of CNS activation of co-expression in specific gene sets dispersed throughout the genome. PMID:23024789
Benitez, Cecil M.; Qu, Kun; Sugiyama, Takuya; Pauerstein, Philip T.; Liu, Yinghua; Tsai, Jennifer; Gu, Xueying; Ghodasara, Amar; Arda, H. Efsun; Zhang, Jiajing; Dekker, Joseph D.; Tucker, Haley O.; Chang, Howard Y.; Kim, Seung K.
2014-01-01
The regulatory logic underlying global transcriptional programs controlling development of visceral organs like the pancreas remains undiscovered. Here, we profiled gene expression in 12 purified populations of fetal and adult pancreatic epithelial cells representing crucial progenitor cell subsets, and their endocrine or exocrine progeny. Using probabilistic models to decode the general programs organizing gene expression, we identified co-expressed gene sets in cell subsets that revealed patterns and processes governing progenitor cell development, lineage specification, and endocrine cell maturation. Purification of Neurog3 mutant cells and module network analysis linked established regulators such as Neurog3 to unrecognized gene targets and roles in pancreas development. Iterative module network analysis nominated and prioritized transcriptional regulators, including diabetes risk genes. Functional validation of a subset of candidate regulators with corresponding mutant mice revealed that the transcription factors Etv1, Prdm16, Runx1t1 and Bcl11a are essential for pancreas development. Our integrated approach provides a unique framework for identifying regulatory genes and functional gene sets underlying pancreas development and associated diseases such as diabetes mellitus. PMID:25330008
Busch, Robert; Qiu, Weiliang; Lasky-Su, Jessica; Morrow, Jarrett; Criner, Gerard; DeMeo, Dawn
2016-11-05
Chronic obstructive pulmonary disease (COPD) is the third-leading cause of death worldwide. Identifying COPD-associated DNA methylation marks in African-Americans may contribute to our understanding of racial disparities in COPD susceptibility. We determined differentially methylated genes and co-methylation network modules associated with COPD in African-Americans recruited during exacerbations of COPD and smoking controls from the Pennsylvania Study of Chronic Obstructive Pulmonary Exacerbations (PA-SCOPE) cohort. We assessed DNA methylation from whole blood samples in 362 African-American smokers in the PA-SCOPE cohort using the Illumina Infinium HumanMethylation27 BeadChip Array. Final analysis included 19302 CpG probes annotated to the nearest gene transcript after quality control. We tested methylation associations with COPD case-control status using mixed linear models. Weighted gene comethylation networks were constructed using weighted gene coexpression network analysis (WGCNA) and network modules were analyzed for association with COPD. There were five differentially methylated CpG probes significantly associated with COPD among African-Americans at an FDR less than 5%, and seven additional probes that approached significance at an FDR less than 10%. The top ranked gene association was MAML1, which has been shown to affect NOTCH-dependent angiogenesis in murine lung. Network modeling yielded the "yellow" and "blue" comethylation modules which were significantly associated with COPD (p-value 4 × 10⁻¹⁰ and 4 × 10⁻⁹, respectively). The yellow module was enriched for gene sets related to inflammatory pathways known to be relevant to COPD. 
The blue module contained the top ranked genes in the concurrent differential methylation analysis (FXYD1/LGI4, gene significance p-value 1.2 × 10⁻²⁶; MAML1, p-value 2.0 × 10⁻²⁶; CD72, p-value 2.1 × 10⁻²⁵; and LPO, p-value 7.2 × 10⁻²⁵), and was significantly associated with lung development processes in Gene Ontology gene-set enrichment analysis. We identified 12 differentially methylated CpG sites associated with COPD that mapped to biologically plausible genes. Network module comethylation patterns have identified candidate genes that may be contributing to racial differences in COPD susceptibility and severity. COPD-associated comethylation modules contained genes previously associated with lung disease and inflammation and recapitulated known COPD-associated genes. The genes implicated by differential methylation and WGCNA analysis may provide mechanistic targets contributing to COPD susceptibility, exacerbations, and outcomes among African-Americans. Trial Registration: NCT00774176, Registry: ClinicalTrials.gov, URL: www.clinicaltrials.gov, Date of Enrollment of First Participant: June 2004, Date Registered: 04 January 2008 (retrospectively registered).
A Gene Module-Based eQTL Analysis Prioritizing Disease Genes and Pathways in Kidney Cancer.
Yang, Mary Qu; Li, Dan; Yang, William; Zhang, Yifan; Liu, Jun; Tong, Weida
2017-01-01
Clear cell renal cell carcinoma (ccRCC) is the most common and most aggressive form of renal cell cancer (RCC). The incidence of RCC has increased steadily in recent years. The pathogenesis of renal cell cancer remains poorly understood. Many of the tumor suppressor genes, oncogenes, and dysregulated pathways in ccRCC need to be revealed for improvement of the overall clinical outlook of the disease. Here, we developed a systems biology approach to prioritize the somatic mutated genes that lead to dysregulation of pathways in ccRCC. The method integrated multi-layer information to infer causative mutations and disease genes. First, we identified differential gene modules in ccRCC by coupling transcriptome and protein-protein interactions. Each of these modules consisted of interacting genes that were involved in similar biological processes and their combined expression alterations were significantly associated with disease type. Then, subsequent gene module-based eQTL analysis revealed somatic mutated genes that had driven the expression alterations of differential gene modules. Our study yielded a list of candidate disease genes, including several known ccRCC causative genes such as BAP1 and PBRM1, as well as novel genes such as NOD2, RRM1, CSRNP1, SLC4A2, TTLL1 and CNTN1. The differential gene modules and their driver genes revealed by our study provided a new perspective for understanding the molecular mechanisms underlying the disease. Moreover, we validated the results in independent ccRCC patient datasets. Our study provided a new method for prioritizing disease genes and pathways.
Evolutionary trends and functional anatomy of the human expanded autophagy network
Till, Andreas; Saito, Rintaro; Merkurjev, Daria; Liu, Jing-Jing; Syed, Gulam Hussain; Kolnik, Martin; Siddiqui, Aleem; Glas, Martin; Scheffler, Björn; Ideker, Trey; Subramani, Suresh
2015-01-01
All eukaryotic cells utilize autophagy for protein and organelle turnover, thus assuring subcellular quality control, homeostasis, and survival. In order to address recent advances in identification of human autophagy associated genes, and to describe autophagy on a system-wide level, we established an autophagy-centered gene interaction network by merging various primary data sets and by retrieving respective interaction data. The resulting network (‘AXAN’) was analyzed with respect to subnetworks, e.g. the prime gene subnetwork (including the core machinery, signaling pathways and autophagy receptors) and the transcription subnetwork. To describe aspects of evolution within this network, we assessed the presence of protein orthologs across 99 eukaryotic model organisms. We visualized evolutionary trends for prime gene categories and evolutionary tracks for selected AXAN genes. This analysis confirms the eukaryotic origin of autophagy core genes while it points to a diverse evolutionary history of autophagy receptors. Next, we used module identification to describe the functional anatomy of the network at the level of pathway modules. In addition to obvious pathways (e.g., lysosomal degradation, insulin signaling) our data unveil the existence of context-related modules such as Rho GTPase signaling. Last, we used a tripartite, image-based RNAi screen to test candidate genes predicted to play a role in regulation of autophagy. We verified the Rho GTPase, CDC42, as a novel regulator of autophagy-related signaling. This study emphasizes the applicability of system-wide approaches to gain novel insights into a complex biological process and to describe the human autophagy pathway at a hitherto unprecedented level of detail. PMID:26103419
bc-GenExMiner 3.0: new mining module computes breast cancer gene expression correlation analyses.
Jézéquel, Pascal; Frénel, Jean-Sébastien; Campion, Loïc; Guérin-Charbonnel, Catherine; Gouraud, Wilfried; Ricolleau, Gabriel; Campone, Mario
2013-01-01
We recently developed a user-friendly web-based application called bc-GenExMiner (http://bcgenex.centregauducheau.fr), which offered the possibility to evaluate prognostic informativity of genes in breast cancer by means of a 'prognostic module'. In this study, we develop a new module called 'correlation module', which includes three kinds of gene expression correlation analyses. The first one computes correlation coefficient between 2 or more (up to 10) chosen genes. The second one produces two lists of genes that are most correlated (positively and negatively) to a 'tested' gene. A gene ontology (GO) mining function is also proposed to explore GO 'biological process', 'molecular function' and 'cellular component' terms enrichment for the output lists of most correlated genes. The third one explores gene expression correlation between the 15 telomeric and 15 centromeric genes surrounding a 'tested' gene. These correlation analyses can be performed in different groups of patients: all patients (without any subtyping), in molecular subtypes (basal-like, HER2+, luminal A and luminal B) and according to oestrogen receptor status. Validation tests based on published data showed that these automatized analyses lead to results consistent with studies' conclusions. In brief, this new module has been developed to help basic researchers explore molecular mechanisms of breast cancer. DATABASE URL: http://bcgenex.centregauducheau.fr
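The first two analyses of the correlation module described above (pairwise correlation between chosen genes, and lists of the genes most positively and negatively correlated with a tested gene) can be sketched as follows. The sketch assumes Pearson's coefficient, which the abstract does not name explicitly, and uses hypothetical gene names and expression values; it is not bc-GenExMiner's code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def most_correlated(tested_gene, expression, top=2):
    """Return (positive, negative) lists of (gene, r) ranked by strength."""
    reference = expression[tested_gene]
    scores = [(gene, pearson(reference, profile))
              for gene, profile in expression.items() if gene != tested_gene]
    positive = sorted((s for s in scores if s[1] > 0), key=lambda s: -s[1])[:top]
    negative = sorted((s for s in scores if s[1] < 0), key=lambda s: s[1])[:top]
    return positive, negative


# Hypothetical expression profiles across four patients.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
    "GENE_D": [1.0, 3.0, 2.0, 4.0],
}
positive, negative = most_correlated("GENE_A", expression)
```

Restricting `expression` to the patients of one subgroup (for example a molecular subtype) before calling `most_correlated` would mirror the per-group analyses the abstract mentions.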
An integrative approach to inferring biologically meaningful gene modules.
Cho, Ji-Hoon; Wang, Kai; Galas, David J
2011-07-26
The ability to construct biologically meaningful gene networks and modules is critical for contemporary systems biology. Though recent studies have demonstrated the power of using gene modules to shed light on the functioning of complex biological systems, most modules in these networks have shown little association with meaningful biological function. We have devised a method which directly incorporates gene ontology (GO) annotation in construction of gene modules in order to gain better functional association. We have devised a method, Semantic Similarity-Integrated approach for Modularization (SSIM) that integrates various gene-gene pairwise similarity values, including information obtained from gene expression, protein-protein interactions and GO annotations, in the construction of modules using affinity propagation clustering. We demonstrated the performance of the proposed method using data from two complex biological responses: 1. the osmotic shock response in Saccharomyces cerevisiae, and 2. the prion-induced pathogenic mouse model. In comparison with two previously reported algorithms, modules identified by SSIM showed significantly stronger association with biological functions. The incorporation of semantic similarity based on GO annotation with gene expression and protein-protein interaction data can greatly enhance the functional relevance of inferred gene modules. In addition, the SSIM approach can also reveal the hierarchical structure of gene modules to gain a broader functional view of the biological system. Hence, the proposed method can facilitate comprehensive and in-depth analysis of high throughput experimental data at the gene network level.
Levin-Karp, Ayelet; Barenholz, Uri; Bareia, Tasneem; Dayagi, Michal; Zelcbuch, Lior; Antonovsky, Niv; Noor, Elad; Milo, Ron
2013-06-21
Translational coupling is the interdependence of translation efficiency of neighboring genes encoded within an operon. The degree of coupling may be quantified by measuring how the translation rate of a gene is modulated by the translation rate of its upstream gene. Translational coupling was observed in prokaryotic operons several decades ago, but the quantitative range of modulation that it produces and the factors governing this modulation were only partially characterized. In this study, we systematically quantify and characterize translational coupling in E. coli synthetic operons using a library of plasmids carrying fluorescent reporter genes that are controlled by a set of different ribosome binding site (RBS) sequences. The downstream gene expression level is found to be enhanced by the upstream gene expression via translational coupling, with the enhancement level varying from almost no coupling to over 10-fold depending on the upstream gene's sequence. Additionally, we find that the level of translational coupling in our system is similar between the second and third locations in the operon. The coupling depends on the distance between the stop codon of the upstream gene and the start codon of the downstream gene. This study is the first to systematically and quantitatively characterize translational coupling in a synthetic E. coli operon. Our analysis will be useful in accurate manipulation of gene expression in synthetic biology and serves as a step toward understanding the mechanisms involved in translational expression modulation.
de Jong, Simone; Boks, Marco P. M.; Fuller, Tova F.; Strengman, Eric; Janson, Esther; de Kovel, Carolien G. F.; Ori, Anil P. S.; Vi, Nancy; Mulder, Flip; Blom, Jan Dirk; Glenthøj, Birte; Schubart, Chris D.; Cahn, Wiepke; Kahn, René S.; Horvath, Steve; Ophoff, Roel A.
2012-01-01
Despite large-scale genome-wide association studies (GWAS), the underlying genes for schizophrenia are largely unknown. Additional approaches are therefore required to identify the genetic background of this disorder. Here we report findings from a large gene expression study in peripheral blood of schizophrenia patients and controls. We applied a systems biology approach to genome-wide expression data from whole blood of 92 medicated and 29 antipsychotic-free schizophrenia patients and 118 healthy controls. We show that gene expression profiling in whole blood can identify twelve large gene co-expression modules associated with schizophrenia. Several of these disease related modules are likely to reflect expression changes due to antipsychotic medication. However, two of the disease modules could be replicated in an independent second data set involving antipsychotic-free patients and controls. One of these robustly defined disease modules is significantly enriched with brain-expressed genes and with genetic variants that were implicated in a GWAS study, which could imply a causal role in schizophrenia etiology. The most highly connected intramodular hub gene in this module (ABCF1), is located in, and regulated by the major histocompatibility (MHC) complex, which is intriguing in light of the fact that common allelic variants from the MHC region have been implicated in schizophrenia. This suggests that the MHC increases schizophrenia susceptibility via altered gene expression of regulatory genes in this network. PMID:22761806
The Double-Stranded DNA Virosphere as a Modular Hierarchical Network of Gene Sharing
Iranzo, Jaime
2016-01-01
Virus genomes are prone to extensive gene loss, gain, and exchange and share no universal genes. Therefore, in a broad-scale study of virus evolution, gene and genome network analyses can complement traditional phylogenetics. We performed an exhaustive comparative analysis of the genomes of double-stranded DNA (dsDNA) viruses by using the bipartite network approach and found a robust hierarchical modularity in the dsDNA virosphere. Bipartite networks consist of two classes of nodes, with nodes in one class, in this case genomes, being connected via nodes of the second class, in this case genes. Such a network can be partitioned into modules that combine nodes from both classes. The bipartite network of dsDNA viruses includes 19 modules that form 5 major and 3 minor supermodules. Of these modules, 11 include tailed bacteriophages, reflecting the diversity of this largest group of viruses. The module analysis quantitatively validates and refines previously proposed nontrivial evolutionary relationships. An expansive supermodule combines the large and giant viruses of the putative order “Megavirales” with diverse moderate-sized viruses and related mobile elements. All viruses in this supermodule share a distinct morphogenetic tool kit with a double jelly roll major capsid protein. Herpesviruses and tailed bacteriophages comprise another supermodule, held together by a distinct set of morphogenetic proteins centered on the HK97-like major capsid protein. Together, these two supermodules cover the great majority of currently known dsDNA viruses. We formally identify a set of 14 viral hallmark genes that comprise the hubs of the network and account for most of the intermodule connections. PMID:27486193
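The bipartite structure described above, in which genomes are linked only through the genes they carry, can be represented with a small sketch. It covers only the data structure and a simple count of genes shared between genomes; it does not perform the module detection reported in the abstract, and the genome and gene names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def gene_to_genomes(genome_to_genes):
    """Invert a genome -> genes mapping into gene -> genomes."""
    index = defaultdict(set)
    for genome, genes in genome_to_genes.items():
        for gene in genes:
            index[gene].add(genome)
    return dict(index)


def shared_gene_counts(genome_to_genes):
    """Count genes shared by each pair of genomes (pairs with none are omitted)."""
    counts = {}
    for g1, g2 in combinations(sorted(genome_to_genes), 2):
        shared = set(genome_to_genes[g1]) & set(genome_to_genes[g2])
        if shared:
            counts[(g1, g2)] = len(shared)
    return counts


# Hypothetical genomes (first node class) and gene families (second node class).
network = {
    "phage_1": {"capsid_HK97", "terminase"},
    "phage_2": {"capsid_HK97", "portal"},
    "giant_virus_1": {"capsid_DJR", "polymerase"},
}
by_gene = gene_to_genomes(network)
pairs = shared_gene_counts(network)
```

In this toy network the two phage genomes are connected through one shared capsid gene while the third genome stays unconnected; genes carried by many genomes, such as the hallmark genes in the abstract, would appear as high-degree entries in `by_gene`.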
NASA Astrophysics Data System (ADS)
Guo, Jingyu; Tian, Dehua; McKinney, Brett A.; Hartman, John L.
2010-06-01
Interactions between genetic and/or environmental factors are ubiquitous, affecting the phenotypes of organisms in complex ways. Knowledge about such interactions is becoming rate-limiting for our understanding of human disease and other biological phenomena. Phenomics refers to the integrative analysis of how all genes contribute to phenotype variation, entailing genome and organism level information. A systems biology view of gene interactions is critical for phenomics. Unfortunately the problem is intractable in humans; however, it can be addressed in simpler genetic model systems. Our research group has focused on the concept of genetic buffering of phenotypic variation, in studies employing the single-cell eukaryotic organism, S. cerevisiae. We have developed a methodology, quantitative high throughput cellular phenotyping (Q-HTCP), for high-resolution measurements of gene-gene and gene-environment interactions on a genome-wide scale. Q-HTCP is being applied to the complete set of S. cerevisiae gene deletion strains, a unique resource for systematically mapping gene interactions. Genetic buffering is the idea that comprehensive and quantitative knowledge about how genes interact with respect to phenotypes will lead to an appreciation of how genes and pathways are functionally connected at a systems level to maintain homeostasis. However, extracting biologically useful information from Q-HTCP data is challenging, due to the multidimensional and nonlinear nature of gene interactions, together with a relative lack of prior biological information. Here we describe a new approach for mining quantitative genetic interaction data called recursive expectation-maximization clustering (REMc). We developed REMc to help discover phenomic modules, defined as sets of genes with similar patterns of interaction across a series of genetic or environmental perturbations. 
Such modules are reflective of buffering mechanisms, i.e., genes that play a related role in the maintenance of physiological homeostasis. To develop the method, 297 gene deletion strains were selected based on gene-drug interactions with hydroxyurea, an inhibitor of ribonucleotide reductase enzyme activity, which is critical for DNA synthesis. To partition the gene functions, these 297 deletion strains were challenged with growth inhibitory drugs known to target different genes and cellular pathways. Q-HTCP-derived growth curves were used to quantify all gene interactions, and the data were used to test the performance of REMc. Fundamental advantages of REMc include objective assessment of total number of clusters and assignment to each cluster a log-likelihood value, which can be considered an indicator of statistical quality of clusters. To assess the biological quality of clusters, we developed a method called gene ontology information divergence z-score (GOid_z). GOid_z summarizes total enrichment of GO attributes within individual clusters. Using these and other criteria, we compared the performance of REMc to hierarchical and K-means clustering. The main conclusion is that REMc provides distinct efficiencies for mining Q-HTCP data. It facilitates identification of phenomic modules, which contribute to buffering mechanisms that underlie cellular homeostasis and the regulation of phenotypic expression.
Bråte, Jon; Adamski, Marcin; Neumann, Ralf S; Shalchian-Tabrizi, Kamran; Adamska, Maja
2015-12-22
Long non-coding RNAs (lncRNAs) play important regulatory roles during animal development, and it has been hypothesized that an RNA-based gene regulation was important for the evolution of developmental complexity in animals. However, most studies of lncRNA gene regulation have been performed using model animal species, and very little is known about this type of gene regulation in non-bilaterians. We have therefore analysed RNA-Seq data derived from a comprehensive set of embryogenesis stages in the calcareous sponge Sycon ciliatum and identified hundreds of developmentally expressed intergenic lncRNAs (lincRNAs) in this species. In situ hybridization of selected lincRNAs revealed dynamic spatial and temporal expression during embryonic development. More than 600 lincRNAs constitute integral parts of differentially expressed gene modules, which also contain known developmental regulatory genes, e.g. transcription factors and signalling molecules. This study provides insights into the non-coding gene repertoire of one of the earliest evolved animal lineages, and suggests that RNA-based gene regulation was probably present in the last common ancestor of animals. © 2015 The Authors.
Srivastava, Mousami; Khurana, Pankaj; Sugadev, Ragumani
2012-11-02
The tissue-specific UniGene sets derived from more than one million expressed sequence tags (ESTs) in the NCBI GenBank database offer a platform for identifying significantly and differentially expressed tissue-specific genes by in silico methods. Digital differential display (DDD) rapidly creates transcription profiles based on EST comparisons and numerically calculates, as a fraction of the pool of ESTs, the relative sequence abundance of known and novel genes. However, the process of identifying the most likely tissue for a specific disease in which to search for candidate genes from the pool of differentially expressed genes remains difficult. Therefore, we have used the 'Gene Ontology semantic similarity score' to measure the GO similarity between gene products of lung tissue-specific candidate genes from control (normal) and disease (cancer) sets. Hierarchical clustering of this semantic similarity score matrix is represented in the form of a dendrogram. The dendrogram cluster stability was assessed by multiple bootstrapping. Multiple bootstrapping also computes a p-value for each cluster and corrects the bias of the bootstrap probability. Subsequent hierarchical clustering by the multiple bootstrapping method (α = 0.95) identified seven clusters. The comparative, as well as subtractive, approach revealed a set of 38 biomarkers comprising four distinct lung cancer signature biomarker clusters (panel 1-4). Further gene enrichment analysis of the four panels revealed that each panel represents a set of lung cancer linked metastasis diagnostic biomarkers (panel 1), chemotherapy/drug resistance biomarkers (panel 2), hypoxia regulated biomarkers (panel 3) and lung extracellular matrix biomarkers (panel 4). Expression analysis reveals that hypoxia induced lung cancer related biomarkers (panel 3), HIF and its modulating proteins (TGM2, CSNK1A1, CTNNA1, NAMPT/Visfatin, TNFRSF1A, ETS1, SRC-1, FN1, APLP2, DMBT1/SAG, AIB1 and AZIN1) are significantly down regulated. 
All down regulated genes in this panel were highly up regulated in most other types of cancers. These panels of proteins may represent signature biomarkers for lung cancer and will aid in lung cancer diagnosis and disease monitoring as well as in the prediction of responses to therapeutics.
Yu, Fu-Dong; Yang, Shao-You; Li, Yuan-Yuan; Hu, Wei
2013-04-10
Malaria continues to be one of the most severe global infectious diseases and a major threat to human health and economic development. Network-based biological analysis is a promising approach to uncover key genes and biological processes from a network viewpoint, which could not be recognized from individual gene-based signatures. We integrated gene co-expression profiles with protein-protein interaction and transcriptional regulation information to construct a comprehensive gene co-expression network of Plasmodium falciparum. Based on this network, we identified 10 core modules using the ICE (Iterative Clique Enumeration) algorithm, which were essential for malaria parasite development in the intraerythrocytic developmental cycle (IDC) stages. In each module, all genes were highly correlated, probably due to co-regulation or formation of a protein complex. Some of these genes were recognized to be differentially co-expressed among three close-by IDC stages. The gene prpf8 (PFD0265w), encoding pre-mRNA processing splicing factor 8, was identified as a DCG (differentially co-expressed gene) among IDC stages, although its function has seldom been reported in previous research. Integrating species-specific gene prediction and differential co-expression gene detection, we found that some modules, such as module 10, could perform species-specific functions because some of their genes were species-specific. Furthermore, in order to reveal the underlying mechanisms of erythrocyte invasion by P. falciparum, a Steiner tree algorithm was employed to identify the invasion subnetwork from our gene co-expression network. The subnetwork-based analysis indicated that some important Plasmodium parasite-specific genes could cooperate with each other and be co-regulated during the parasite invasion process, including a head-to-head gene pair, PfRH2a (PF13_0198) and PfRH2b (MAL13P1.176). 
This study, based on a gene co-expression network, could provide new insights into the mechanisms of pathogenesis, and even virulence and P. falciparum development. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.
Detection of type 2 diabetes related modules and genes based on epigenetic networks
2014-01-01
Background Type 2 diabetes (T2D) is one of the most common chronic metabolic diseases, characterized by insulin resistance and decreased insulin secretion. Genetic variation can explain only part of the heritability of T2D, so new methods are needed to detect the susceptibility genes of the disease. Epigenetics could establish the interface between environmental factors and the pathological mechanism of T2D. Results Based on network theory and by combining epigenetic characteristics with the human interactome, the weighted human DNA methylation network (WMPN) was constructed, and a T2D-related subnetwork (TMSN) was obtained through T2D-related differentially methylated genes. TMSN was found to have a T2D-specific network structure in which non-fatal metabolic disease-causing genes were often located at the topological and functional periphery of the network. Combined with chromatin modifications, the weighted chromatin modification network (WCPN) was built, and a T2D-related chromatin modification pattern subnetwork (TCSN) was obtained from the TMSN gene set. TCSN had a densely connected network community, indicating that TMSN and TCSN could represent a collection of T2D-related epigenetic dysregulated sub-pathways. Using the cumulative hypergeometric test, 24 interplay modules of DNA methylation and chromatin modifications were identified. Analysis of gene expression in human T2D islet tissue showed that there were genes whose expression levels varied because of aberrant DNA methylation and/or chromatin modifications, which might affect and promote the development of T2D. Conclusions Here we have detected the potential interplay modules of DNA methylation and chromatin modifications for T2D. The study of T2D epigenetic networks provides a new way of understanding the pathogenic mechanism of T2D caused by epigenetic disorders. PMID:24565181
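For readers unfamiliar with the cumulative hypergeometric test mentioned in this abstract, a minimal sketch follows. It assumes the usual over-representation setting (N background genes, K of them annotated, n genes drawn, k of the drawn genes annotated); the abstract does not specify which gene sets were compared, so the function name and arguments are illustrative only.

```python
from math import comb

def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which carry the property of interest."""
    total = comb(N, n)
    upper = min(K, n)  # X cannot exceed either the draw size or K
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

A small p-value from `hypergeom_upper_tail` indicates that the overlap k is larger than expected by chance; in practice the values would also need correction for multiple testing.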
Farber, Charles R
2010-11-01
Bone mineral density (BMD) is influenced by a complex network of gene interactions; therefore, elucidating the relationships between genes and how those genes, in turn, influence BMD is critical for developing a comprehensive understanding of osteoporosis. To investigate the role of transcriptional networks in the regulation of BMD, we performed a weighted gene coexpression network analysis (WGCNA) using microarray expression data on monocytes from young individuals with low or high BMD. WGCNA groups genes into modules based on patterns of gene coexpression, and our analysis identified 11 gene modules. We observed that the overall expression of one module (referred to as module 9) was significantly higher in the low-BMD group (p = .03). Module 9 was highly enriched for genes belonging to the immune system-related gene ontology (GO) category "response to virus" (p = 7.6 × 10^-11). Using publicly available genome-wide association study data, we independently validated the importance of module 9 by demonstrating that highly connected module 9 hubs were more likely, relative to less highly connected genes, to be genetically associated with BMD. This study highlights the advantages of systems-level analyses to uncover coexpression modules associated with bone mass and suggests that particular monocyte expression patterns may mediate differences in BMD. © 2010 American Society for Bone and Mineral Research.
Mason, Mike J; Fan, Guoping; Plath, Kathrin; Zhou, Qing; Horvath, Steve
2009-01-01
Background Recent work has revealed that a core group of transcription factors (TFs) regulates the key characteristics of embryonic stem (ES) cells: pluripotency and self-renewal. Current efforts focus on identifying genes that play important roles in maintaining pluripotency and self-renewal in ES cells and aim to understand the interactions among these genes. To that end, we investigated the use of unsigned and signed network analysis to identify pluripotency and differentiation related genes. Results We show that signed networks provide a better systems level understanding of the regulatory mechanisms of ES cells than unsigned networks, using two independent murine ES cell expression data sets. Specifically, using signed weighted gene co-expression network analysis (WGCNA), we found a pluripotency module and a differentiation module, which are not identified in unsigned networks. We confirmed the importance of these modules by incorporating genome-wide TF binding data for key ES cell regulators. Interestingly, we find that the pluripotency module is enriched with genes related to DNA damage repair and mitochondrial function in addition to transcriptional regulation. Using a connectivity measure of module membership, we not only identify known regulators of ES cells but also show that Mrpl15, Msh6, Nrf1, Nup133, Ppif, Rbpj, Sh3gl2, and Zfp39, among other genes, have important roles in maintaining ES cell pluripotency and self-renewal. We also report highly significant relationships between module membership and epigenetic modifications (histone modifications and promoter CpG methylation status), which are known to play a role in controlling gene expression during ES cell self-renewal and differentiation. Conclusion Our systems biologic re-analysis of gene expression, transcription factor binding, epigenetic and gene ontology data provides a novel integrative view of ES cell biology. PMID:19619308
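The difference between unsigned and signed networks can be made concrete with the adjacency functions commonly used in WGCNA. The sketch below assumes those standard soft-thresholding forms; the power values (`beta`) are illustrative defaults, not the settings reported by the authors.

```python
def unsigned_adjacency(r, beta=6):
    """Unsigned network: strong negative and positive correlations
    are treated alike, a = |r|^beta."""
    return abs(r) ** beta

def signed_adjacency(r, beta=12):
    """Signed network: a = ((1 + r) / 2)^beta, so negatively
    correlated gene pairs receive an adjacency close to zero."""
    return ((1 + r) / 2) ** beta
```

With r = -0.9, the unsigned form yields a strong connection while the signed form yields almost none, which is why anticorrelated pluripotency and differentiation genes can separate into distinct modules only in the signed network.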
Devlin, Joseph C; Battaglia, Thomas; Blaser, Martin J; Ruggles, Kelly V
2018-06-25
Exploration of large data sets, such as shotgun metagenomic sequence or expression data, by biomedical experts and medical professionals remains a major bottleneck in the scientific discovery process. Although tools for this purpose exist for 16S ribosomal RNA sequencing analysis, there is a growing but still insufficient number of user-friendly interactive visualization workflows for easy data exploration and figure generation. The development of such platforms is necessary to accelerate and streamline microbiome laboratory research. We developed the Workflow Hub for Automated Metagenomic Exploration (WHAM!) as a web-based interactive tool capable of user-directed data visualization and statistical analysis of annotated shotgun metagenomic and metatranscriptomic data sets. WHAM! includes exploratory and hypothesis-based gene and taxa search modules for visualizing differences in microbial taxa and gene family expression across experimental groups, and for creating publication quality figures without the need for command line interface or in-house bioinformatics. WHAM! is an interactive and customizable tool for downstream metagenomic and metatranscriptomic analysis providing a user-friendly interface allowing for easy data exploration by microbiome and ecological experts to facilitate discovery in multi-dimensional and large-scale data sets.
An integrative approach to inferring biologically meaningful gene modules
2011-01-01
Background The ability to construct biologically meaningful gene networks and modules is critical for contemporary systems biology. Though recent studies have demonstrated the power of using gene modules to shed light on the functioning of complex biological systems, most modules in these networks have shown little association with meaningful biological function. We have devised a method which directly incorporates gene ontology (GO) annotation in construction of gene modules in order to gain better functional association. Results We have devised a method, Semantic Similarity-Integrated approach for Modularization (SSIM) that integrates various gene-gene pairwise similarity values, including information obtained from gene expression, protein-protein interactions and GO annotations, in the construction of modules using affinity propagation clustering. We demonstrated the performance of the proposed method using data from two complex biological responses: 1. the osmotic shock response in Saccharomyces cerevisiae, and 2. the prion-induced pathogenic mouse model. In comparison with two previously reported algorithms, modules identified by SSIM showed significantly stronger association with biological functions. Conclusions The incorporation of semantic similarity based on GO annotation with gene expression and protein-protein interaction data can greatly enhance the functional relevance of inferred gene modules. In addition, the SSIM approach can also reveal the hierarchical structure of gene modules to gain a broader functional view of the biological system. Hence, the proposed method can facilitate comprehensive and in-depth analysis of high throughput experimental data at the gene network level. PMID:21791051
From Saccharomyces cerevisiae to human: The important gene co-expression modules.
Liu, Wei; Li, Li; Ye, Hua; Chen, Haiwei; Shen, Weibiao; Zhong, Yuexian; Tian, Tian; He, Huaqin
2017-08-01
Network-based systems biology has become an important method for analyzing high-throughput gene expression data and gene function mining. Yeast has long been a popular model organism for biomedical research. In the current study, a weighted gene co-expression network analysis algorithm was applied to construct a gene co-expression network in Saccharomyces cerevisiae. Seventeen stable gene co-expression modules were detected from 2,814 S. cerevisiae microarray data. Further characterization of these modules with the Database for Annotation, Visualization and Integrated Discovery tool indicated that these modules were associated with certain biological processes, such as heat response, cell cycle, translational regulation, mitochondrion oxidative phosphorylation, amino acid metabolism and autophagy. Hub genes were also screened by intra-modular connectivity. Finally, the module conservation was evaluated in a human disease microarray dataset. Functional modules were identified in budding yeast, some of which are associated with patient survival. The current study provided a paradigm for single cell microorganisms and potentially other organisms.
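Screening hub genes by intra-modular connectivity can be sketched as follows. The adjacency values and gene names (`g1` to `g4`) are hypothetical, and the definition used here (sum of a gene's adjacencies to the other genes of the same module) is the common one, assumed rather than taken from the paper.

```python
# Hypothetical symmetric adjacency values between four genes.
adjacency = {
    "g1": {"g2": 0.8, "g3": 0.6, "g4": 0.1},
    "g2": {"g1": 0.8, "g3": 0.2, "g4": 0.1},
    "g3": {"g1": 0.6, "g2": 0.2, "g4": 0.05},
    "g4": {"g1": 0.1, "g2": 0.1, "g3": 0.05},
}

def intramodular_connectivity(adjacency, module):
    """Sum each gene's adjacencies to the other genes in the same module."""
    return {
        g: sum(adjacency[g][h] for h in module if h != g)
        for g in module
    }

def top_hub(adjacency, module):
    """Return the module gene with the highest intramodular connectivity."""
    k = intramodular_connectivity(adjacency, module)
    return max(k, key=k.get)
```

For the toy module `["g1", "g2", "g3"]`, `g1` has the largest connectivity and would be reported as the hub; connections to `g4`, which lies outside the module, are ignored.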
A global interaction network maps a wiring diagram of cellular function
Costanzo, Michael; VanderSluis, Benjamin; Koch, Elizabeth N.; Baryshnikova, Anastasia; Pons, Carles; Tan, Guihong; Wang, Wen; Usaj, Matej; Hanchard, Julia; Lee, Susan D.; Pelechano, Vicent; Styles, Erin B.; Billmann, Maximilian; van Leeuwen, Jolanda; van Dyk, Nydia; Lin, Zhen-Yuan; Kuzmin, Elena; Nelson, Justin; Piotrowski, Jeff S.; Srikumar, Tharan; Bahr, Sondra; Chen, Yiqun; Deshpande, Raamesh; Kurat, Christoph F.; Li, Sheena C.; Li, Zhijian; Usaj, Mojca Mattiazzi; Okada, Hiroki; Pascoe, Natasha; Luis, Bryan-Joseph San; Sharifpoor, Sara; Shuteriqi, Emira; Simpkins, Scott W.; Snider, Jamie; Suresh, Harsha Garadi; Tan, Yizhao; Zhu, Hongwei; Malod-Dognin, Noel; Janjic, Vuk; Przulj, Natasa; Troyanskaya, Olga G.; Stagljar, Igor; Xia, Tian; Ohya, Yoshikazu; Gingras, Anne-Claude; Raught, Brian; Boutros, Michael; Steinmetz, Lars M.; Moore, Claire L.; Rosebrock, Adam P.; Caudy, Amy A.; Myers, Chad L.; Andrews, Brenda; Boone, Charles
2017-01-01
We generated a global genetic interaction network for Saccharomyces cerevisiae, constructing over 23 million double mutants, identifying ~550,000 negative and ~350,000 positive genetic interactions. This comprehensive network maps genetic interactions for essential gene pairs, highlighting essential genes as densely connected hubs. Genetic interaction profiles enabled assembly of a hierarchical model of cell function, including modules corresponding to protein complexes and pathways, biological processes, and cellular compartments. Negative interactions connected functionally related genes, mapped core bioprocesses, and identified pleiotropic genes, whereas positive interactions often mapped general regulatory connections among gene pairs, rather than shared functionality. The global network illustrates how coherent sets of genetic interactions connect protein complex and pathway modules to map a functional wiring diagram of the cell. PMID:27708008
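The notion of negative and positive genetic interactions can be illustrated with a short sketch. It assumes the widely used multiplicative model, in which the expected double-mutant fitness is the product of the single-mutant fitnesses; the abstract itself does not state the scoring formula, and the function names and the threshold below are hypothetical.

```python
def interaction_score(f_a, f_b, f_ab):
    """Deviation of observed double-mutant fitness from the
    multiplicative expectation f_a * f_b (assumed model)."""
    return f_ab - f_a * f_b

def classify_interaction(f_a, f_b, f_ab, threshold=0.08):
    """Label an interaction; the threshold is an illustrative assumption."""
    eps = interaction_score(f_a, f_b, f_ab)
    if eps <= -threshold:
        return "negative"  # double mutant is sicker than expected
    if eps >= threshold:
        return "positive"  # double mutant is fitter than expected
    return "none"
```

For example, single-mutant fitnesses of 0.8 and 0.5 give an expected double-mutant fitness of 0.4, so an observed value of 0.2 would count as a negative interaction under these assumptions.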
2014-05-16
native uncharacterized genes for characterized genes from Bacillus subtilis, that is presented in a constitutive expression module. If the B... subtilis gene containing M. mycoides mutant is viable then the function of the conserved hypothetical gene is the same as the input B. subtilis gene...Characterized genes from B. subtilis were swapped with similar, but not so similar as to be clearly the same, essential genes from M. mycoides. The B. subtilis
Thimgan, Matthew S.; Seugnet, Laurent; Turk, John; Shaw, Paul J.
2015-01-01
Background and Study Objectives: Flies mutant for the canonical clock protein cycle (cyc01) exhibit a sleep rebound that is ∼10 times larger than wild-type flies and die after only 10 h of sleep deprivation. Surprisingly, when starved, cyc01 mutants can remain awake for 28 h without demonstrating negative outcomes. Thus, we hypothesized that identifying transcripts that are differentially regulated between waking induced by sleep deprivation and waking induced by starvation would identify genes that underlie the deleterious effects of sleep deprivation and/or protect flies from the negative consequences of waking. Design: We used partial complementary DNA microarrays to identify transcripts that are differentially expressed between cyc01 mutants that had been sleep deprived or starved for 7 h. We then used genetics to determine whether disrupting genes involved in lipid metabolism would exhibit alterations in their response to sleep deprivation. Setting: Laboratory. Patients or Participants: Drosophila melanogaster. Interventions: Sleep deprivation and starvation. Measurements and Results: We identified 84 genes with transcript levels that were differentially modulated by 7 h of sleep deprivation and starvation in cyc01 mutants and were confirmed in independent samples using quantitative polymerase chain reaction. Several of these genes were predicted to be lipid metabolism genes, including bubblegum, cueball, and CG4500, which based on our data we have renamed heimdall (hll). Using lipidomics we confirmed that knockdown of hll using RNA interference significantly decreased lipid stores. Importantly, genetically modifying bubblegum, cueball, or hll resulted in sleep rebound alterations following sleep deprivation compared to genetic background controls. Conclusions: We have identified a set of genes that may confer resilience/vulnerability to sleep deprivation and demonstrate that genes involved in lipid metabolism modulate sleep homeostasis. 
Citation: Thimgan MS, Seugnet L, Turk J, Shaw PJ. Identification of genes associated with resilience/vulnerability to sleep deprivation and starvation in Drosophila. SLEEP 2015;38(5):801–814. PMID:25409104
The GMOD Drupal bioinformatic server framework.
Papanicolaou, Alexie; Heckel, David G
2010-12-15
Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com.
GoldenBraid: An Iterative Cloning System for Standardized Assembly of Reusable Genetic Modules
Sarrion-Perdigones, Alejandro; Falconi, Erica Elvira; Zandalinas, Sara I.; Juárez, Paloma; Fernández-del-Carmen, Asun; Granell, Antonio; Orzaez, Diego
2011-01-01
Synthetic Biology requires efficient and versatile DNA assembly systems to facilitate the building of new genetic modules/pathways from basic DNA parts in a standardized way. Here we present GoldenBraid (GB), a standardized assembly system based on type IIS restriction enzymes that allows the indefinite growth of reusable gene modules made of standardized DNA pieces. The GB system consists of a set of four destination plasmids (pDGBs) designed to incorporate multipartite assemblies made of standard DNA parts and to combine them binarily to build increasingly complex multigene constructs. The relative position of type IIS restriction sites inside pDGB vectors introduces a double loop (“braid”) topology in the cloning strategy that allows the indefinite growth of composite parts through the succession of iterative assembling steps, while the overall simplicity of the system is maintained. We propose the use of GoldenBraid as an assembly standard for Plant Synthetic Biology. For this purpose we have GB-adapted a set of binary plasmids for A. tumefaciens-mediated plant transformation. Fast GB-engineering of several multigene T-DNAs, including two alternative modules made of five reusable devices each and comprising a total of 19 basic parts, is also described. PMID:21750718
Klauser, Benedikt; Atanasov, Janina; Siewert, Lena K; Hartig, Jörg S
2015-05-15
Systems for conditional gene expression are powerful tools in basic research as well as in biotechnology. For future applications, it is of great importance to engineer orthogonal genetic switches that function reliably in diverse contexts. RNA-based switches have the advantage that effector molecules interact directly with regulatory modules inserted into the target RNAs, eliminating the need for the transcription factors that usually mediate genetic control. Artificial riboswitches are characterized by their simplicity and small size accompanied by a high degree of modularity. We have recently reported a series of hammerhead ribozyme-based artificial riboswitches that allow for post-transcriptional regulation of gene expression via switching mRNA, tRNA, or rRNA functions. A more widespread application has so far been hampered by moderate switching performance and the limited set of effector molecules available. Here, we report the re-engineering of hammerhead ribozymes in order to respond efficiently to aminoglycoside antibiotics. We first established an in vivo selection protocol in Saccharomyces cerevisiae that enabled us to search large sequence spaces for optimized switches. We then envisioned and characterized a novel strategy of attaching the aptamer to the ribozyme catalytic core, increasing the design options for rendering the ribozyme ligand-dependent. These innovations enabled the development of neomycin-dependent RNA modules that switch gene expression up to 25-fold. The presented aminoglycoside-responsive riboswitches are among the best-performing RNA-based genetic regulators reported so far. The developed in vivo selection protocol should allow for sampling of large sequence spaces for engineering of further optimized riboswitches.
Bai, Gaobo; Zheng, Wenling; Ma, Wenli
2018-05-01
Hepatitis C virus (HCV)-induced human hepatocellular carcinoma (HCC) progression may be due to a complex multi-step process. The developmental mechanism of this process is worth investigating for the prevention, diagnosis and therapy of HCC. The aim of the present study was to investigate the molecular mechanism underlying the progression of HCV-induced hepatocarcinogenesis. First, the dynamic gene module, consisting of key genes associated with progression between the normal stage and HCC, was identified using the Weighted Gene Co-expression Network Analysis tool in the R language. By defining those genes in the module as seeds, the change of co-expression in differentially expressed gene sets in two consecutive stages of pathological progression was examined. Finally, interaction pairs of HCV viral proteins and their directly targeted proteins in the identified module were extracted from the literature and a comprehensive interaction dataset from yeast two-hybrid experiments. By combining the interactions between HCV and their targets, and protein-protein interactions in the Search Tool for the Retrieval of Interacting Genes database (STRING), the HCV-key genes interaction network was constructed and visualized using Cytoscape software 3.2. As a result, a module containing 44 key genes was identified to be associated with HCC progression, due to the dynamic features and functions of those genes in the module. Several important differentially co-expressed gene pairs were identified between non-HCC and HCC stages. In the key genes, cyclin dependent kinase 1 (CDK1), NDC80, cyclin A2 (CCNA2) and rac GTPase activating protein 1 (RACGAP1) were shown to be targeted by the HCV nonstructural proteins NS5A, NS3 and NS5B, respectively. The four genes perform an intermediary role between the HCV viral proteins and the dysfunctional module in the HCV key genes interaction network. 
These findings provided valuable information for understanding the mechanism of HCV-induced HCC progression and for seeking drug targets for the therapy and prevention of HCC.
Comparative modular analysis of gene expression in vertebrate organs.
Piasecka, Barbara; Kutalik, Zoltán; Roux, Julien; Bergmann, Sven; Robinson-Rechavi, Marc
2012-03-29
The degree of conservation of gene expression between homologous organs largely remains an open question. Several recent studies reported some evidence in favor of such conservation. Most studies compute organs' similarity across all orthologous genes, whereas the expression levels of many genes are not informative about organ specificity. Here, we use a modularization algorithm to overcome this limitation through the identification of inter-species co-modules of organs and genes. We identify such co-modules using mouse and human microarray expression data. They are functionally coherent both in terms of genes and of organs from both organisms. We show that a large proportion of genes belonging to the same co-module are orthologous between mouse and human. Moreover, their zebrafish orthologs also tend to be expressed in the corresponding homologous organs. Notable exceptions to the general pattern of conservation are the testis and the olfactory bulb. Interestingly, some co-modules consist of single organs, while others combine several functionally related organs. For instance, amygdala, cerebral cortex, hypothalamus and spinal cord form a clearly discernible unit of expression, both in mouse and human. Our study provides a new framework for comparative analysis which will be applicable also to other sets of large-scale phenotypic data collected across different species.
2014-08-15
characterized genes from Bacillus subtilis, that is presented in a constitutive expression module. If the B. subtilis gene containing M. mycoides mutant is...essential gene MMYC_0361 with the rlmH gene from Bacillus subtilis. Mycoplasma mycoides containing the B. subtilis rlmH was viable. This tells us the...viable than the function of the conserved hypothetical gene is the same as the input B. subtilis gene. Table of Contents: Section
Binder, Andreas; Lambert, Jayne; Morbitzer, Robert; Popp, Claudia; Ott, Thomas; Lahaye, Thomas; Parniske, Martin
2014-01-01
The Golden Gate (GG) modular assembly approach offers a standardized, inexpensive and reliable way to ligate multiple DNA fragments in a pre-defined order in a single-tube reaction. We developed a GG based toolkit for the flexible construction of binary plasmids for transgene expression in plants. Starting from a common set of modules, such as promoters, protein tags and transcribed regions of interest, synthetic genes are assembled, which can be further combined to multigene constructs. As an example, we created T-DNA constructs encoding multiple fluorescent proteins targeted to distinct cellular compartments (nucleus, cytosol, plastids) and demonstrated simultaneous expression of all genes in Nicotiana benthamiana, Lotus japonicus and Arabidopsis thaliana. We assembled an RNA interference (RNAi) module for the construction of intron-spliced hairpin RNA constructs and demonstrated silencing of GFP in N. benthamiana. By combination of the silencing construct together with a codon adapted rescue construct into one vector, our system facilitates genetic complementation and thus confirmation of the causative gene responsible for a given RNAi phenotype. As proof of principle, we silenced a destabilized GFP gene (dGFP) and restored GFP fluorescence by expression of a recoded version of dGFP, which was not targeted by the silencing construct. PMID:24551083
MONGKIE: an integrated tool for network analysis and visualization for multi-omics data.
Jang, Yeongjun; Yu, Namhee; Seo, Jihae; Kim, Sun; Lee, Sanghyuk
2016-03-18
Network-based integrative analysis is a powerful technique for extracting biological insights from multilayered omics data such as somatic mutations, copy number variations, and gene expression data. However, integrated analysis of multi-omics data is quite complicated and can hardly be done in an automated way. Thus, a powerful interactive visual mining tool supporting diverse analysis algorithms for identification of driver genes and regulatory modules is much needed. Here, we present a software platform that integrates network visualization with omics data analysis tools seamlessly. The visualization unit supports various options for displaying multi-omics data as well as unique network models for describing sophisticated biological networks such as complex biomolecular reactions. In addition, we implemented diverse in-house algorithms for network analysis including network clustering and over-representation analysis. Novel functions include facile definition and optimized visualization of subgroups, comparison of a series of data sets in an identical network by data-to-visual mapping and subsequent overlaying function, and management of custom interaction networks. Utility of MONGKIE for network-based visual data mining of multi-omics data was demonstrated by analysis of the TCGA glioblastoma data. MONGKIE was developed in Java based on the NetBeans plugin architecture, thus being OS-independent with intrinsic support of module extension by third-party developers. We believe that MONGKIE would be a valuable addition to network analysis software by supporting many unique features and visualization options, especially for analysing multi-omics data sets in cancer and other diseases.
Identifying module biomarkers from gastric cancer by differential correlation network
Liu, Xiaoping; Chang, Xiao
2016-01-01
Gastric cancer (stomach cancer) is a severe disease caused by dysregulation of many functionally correlated genes or pathways instead of the mutation of individual genes. Systematic identification of gastric cancer biomarkers can provide insights into the mechanisms underlying this deadly disease and help in the development of new drugs. In this paper, we present a novel network-based approach to predict module biomarkers of gastric cancer that can effectively distinguish the disease from normal samples. Specifically, by assuming that gastric cancer has mainly resulted from dysfunction of biomolecular networks rather than individual genes in an organism, the genes in the module biomarkers are potentially related to gastric cancer. Finally, we identified a module biomarker with 27 genes, and by comparing the module biomarker with known gastric cancer biomarkers, we found that our module biomarker exhibited a greater ability to diagnose the samples with gastric cancer. PMID:27703371
Colorectal Cancer and the Human Gut Microbiome: Reproducibility with Whole-Genome Shotgun Sequencing
Hua, Xing; Zeller, Georg; Sunagawa, Shinichi; Voigt, Anita Y.; Hercog, Rajna; Goedert, James J.; Shi, Jianxin; Bork, Peer; Sinha, Rashmi
2016-01-01
Accumulating evidence indicates that the gut microbiota affects colorectal cancer development, but previous studies have varied in population, technical methods, and associations with cancer. Understanding these variations is needed for comparisons and for potential pooling across studies. Therefore, we performed whole-genome shotgun sequencing on fecal samples from 52 pre-treatment colorectal cancer cases and 52 matched controls from Washington, DC. We compared findings from a previously published 16S rRNA study to the metagenomics-derived taxonomy within the same population. In addition, metagenome-predicted genes, modules, and pathways in the Washington, DC cases and controls were compared to cases and controls recruited in France whose specimens were processed using the same platform. Associations between the presence of fecal Fusobacteria, Fusobacterium, and Porphyromonas with colorectal cancer detected by 16S rRNA were reproduced by metagenomics, whereas higher relative abundance of Clostridia in cancer cases based on 16S rRNA was merely borderline based on metagenomics. This demonstrated that within the same sample set, most, but not all taxonomic associations were seen with both methods. Considering significant cancer associations with the relative abundance of genes, modules, and pathways in a recently published French metagenomics dataset, statistically significant associations in the Washington, DC population were detected for four out of 10 genes, three out of nine modules, and seven out of 17 pathways. In total, colorectal cancer status in the Washington, DC study was associated with 39% of the metagenome-predicted genes, modules, and pathways identified in the French study. More within and between population comparisons are needed to identify sources of variation and disease associations that can be reproduced despite these variations. 
Future studies should have larger sample sizes or pool data across studies to have sufficient power to detect associations that are reproducible and significant after correction for multiple testing. PMID:27171425
Chen, Rui; Davis, Lea K; Guter, Stephen; Wei, Qiang; Jacob, Suma; Potter, Melissa H; Cox, Nancy J; Cook, Edwin H; Sutcliffe, James S; Li, Bingshan
2017-01-01
Autism spectrum disorder (ASD) is one of the most highly heritable neuropsychiatric disorders, but underlying molecular mechanisms are still unresolved due to extreme locus heterogeneity. Leveraging meaningful endophenotypes or biomarkers may be an effective strategy to reduce heterogeneity to identify novel ASD genes. Numerous lines of evidence suggest a link between hyperserotonemia, i.e., elevated serotonin (5-hydroxytryptamine or 5-HT) in whole blood, and ASD. However, the genetic determinants of blood 5-HT level and their relationship to ASD are largely unknown. In this study, pursuing the hypothesis that de novo variants (DNVs) and rare risk alleles acting in a recessive mode may play an important role in predisposition to hyperserotonemia in people with ASD, we carried out whole exome sequencing (WES) in 116 ASD parent-proband trios with most (107) probands having 5-HT measurements. Combined with published ASD DNVs, we identified USP15 as having recurrent de novo loss of function mutations and discovered evidence supporting two other known genes with recurrent DNVs (FOXP1 and KDM5B). Genes harboring functional DNVs significantly overlap with functional/disease gene sets known to be involved in ASD etiology, including FMRP targets and synaptic formation and transcriptional regulation genes. We grouped the probands into High-5HT and Normal-5HT groups based on normalized serotonin levels, and used network-based gene set enrichment analysis (NGSEA) to identify novel hyperserotonemia-related ASD genes based on LoF and missense DNVs. We found enrichment in the High-5HT group for a gene network module (DAWN-1) previously implicated in ASD, and this points to the TGF-β pathway and cell junction processes. Through analysis of rare recessively acting variants (RAVs), we also found that rare compound heterozygotes (CHs) in the High-5HT group were enriched for loci in an ASD-associated gene set.
Finally, we carried out rare variant group-wise transmission disequilibrium tests (gTDT) and observed significant association of rare variants in genes encoding a subset of the serotonin pathway with ASD. Our study identified USP15 as a novel gene implicated in ASD based on recurrent DNVs. It also demonstrates the potential value of 5-HT as an effective endophenotype for gene discovery in ASD, and the effectiveness of this strategy needs to be further explored in studies of larger sample sizes.
Fu, Guifang; Dai, Xiaotian; Symanzik, Jürgen; Bushman, Shaun
2017-01-01
Leaf shape traits have long been a focus of many disciplines, but the complex genetic and environmental interactive mechanisms regulating leaf shape variation have not yet been investigated in detail. The question of the respective roles of genes and environment and how they interact to modulate leaf shape is a thorny evolutionary problem, and sophisticated methodology is needed to address it. In this study, we investigated a framework-level approach that inputs shape image photographs and genetic and environmental data, and then outputs the relative importance ranks of all variables after integrating shape feature extraction, dimension reduction, and tree-based statistical models. The power of the proposed framework was confirmed by simulation and a Populus szechuanica var. tibetica data set. This new methodology resulted in the detection of novel shape characteristics, and also confirmed some previous findings. The quantitative modeling of a combination of polygenetic, plastic, epistatic, and gene-environment interactive effects, as investigated in this study, will improve the discernment of quantitative leaf shape characteristics, and the methods are ready to be applied to other leaf morphology data sets. Unlike the majority of approaches in the quantitative leaf shape literature, this framework-level approach is data-driven, without assuming any pre-known shape attributes, landmarks, or model structures. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Aspler, Anne L; Bolshin, Carly; Vernon, Suzanne D; Broderick, Gordon
2008-09-26
Genomic profiling of peripheral blood reveals altered immunity in chronic fatigue syndrome (CFS); however, interpretation remains challenging without immune demographic context. The objective of this work is to identify modulation of specific immune functional components and restructuring of co-expression networks characteristic of CFS using the quantitative genomics of peripheral blood. Gene sets were constructed a priori for CD4+ T cells, CD8+ T cells, CD19+ B cells, CD14+ monocytes and CD16+ neutrophils from published data. A group of 111 women were classified using empiric case definition (U.S. Centers for Disease Control and Prevention) and unsupervised latent cluster analysis (LCA). Microarray profiles of peripheral blood were analyzed for expression of leukocyte-specific gene sets and characteristic changes in co-expression identified from topological evaluation of linear correlation networks. Median expression for a set of 6 genes preferentially up-regulated in CD19+ B cells was significantly lower in CFS (p = 0.01) due mainly to PTPRK and TSPAN3 expression. Although no other gene set was differentially expressed at p < 0.05, patterns of co-expression in each group differed markedly. Significant co-expression of CD14+ monocyte with CD16+ neutrophil (p = 0.01) and CD19+ B cell sets (p = 0.00) characterized CFS and fatigue phenotype groups. Also in CFS was a significant negative correlation between CD8+ and both CD19+ up-regulated (p = 0.02) and NK gene sets (p = 0.08). These patterns were absent in controls. Dissection of blood microarray profiles points to B cell dysfunction with coordinated immune activation supporting persistent inflammation and antibody-mediated NK cell modulation of T cell activity. This has clinical implications as the CD19+ genes identified could provide a robust and biologically meaningful basis for the early detection and unambiguous phenotyping of CFS.
Almeida, Luciana O; Neto, Marinaldo P C; Sousa, Lucas O; Tannous, Maryna A; Curti, Carlos; Leopoldino, Andreia M
2017-04-18
Epigenetic modifications are essential in the control of normal cellular processes and cancer development. DNA methylation and histone acetylation are major epigenetic modifications involved in gene transcription and abnormal events driving the oncogenic process. SET protein accumulates in many cancer types, including head and neck squamous cell carcinoma (HNSCC); SET is a member of the INHAT complex that inhibits gene transcription associating with histones and preventing their acetylation. We explored how SET protein accumulation impacts on the regulation of gene expression, focusing on DNA methylation and histone acetylation. DNA methylation profile of 24 tumour suppressors evidenced that SET accumulation decreased DNA methylation in association with loss of 5-methylcytidine, formation of 5-hydroxymethylcytosine and increased TET1 levels, indicating an active DNA demethylation mechanism. However, the expression of some suppressor genes was lowered in cells with high SET levels, suggesting that loss of methylation is not the main mechanism modulating gene expression. SET accumulation also downregulated the expression of 32 genes of a panel of 84 transcription factors, and SET directly interacted with chromatin at the promoter of the downregulated genes, decreasing histone acetylation. Gene expression analysis after cell treatment with 5-aza-2'-deoxycytidine (5-AZA) and Trichostatin A (TSA) revealed that histone acetylation reversed transcription repression promoted by SET. These results suggest a new function for SET in the regulation of chromatin dynamics. In addition, TSA diminished both SET protein levels and SET capability to bind to gene promoter, suggesting that administration of epigenetic modifier agents could be efficient to reverse SET phenotype in cancer.
Roy, Sujoy; Yun, Daqing; Madahian, Behrouz; Berry, Michael W.; Deng, Lih-Yuan; Goldowitz, Daniel; Homayouni, Ramin
2017-01-01
In this study, we developed and evaluated a novel text-mining approach, using non-negative tensor factorization (NTF), to simultaneously extract and functionally annotate transcriptional modules consisting of sets of genes, transcription factors (TFs), and terms from MEDLINE abstracts. A sparse 3-mode term × gene × TF tensor was constructed that contained weighted frequencies of 106,895 terms in 26,781 abstracts shared among 7,695 genes and 994 TFs. The tensor was decomposed into sub-tensors using non-negative tensor factorization (NTF) across 16 different approximation ranks. Dominant entries of each of 2,861 sub-tensors were extracted to form term–gene–TF annotated transcriptional modules (ATMs). More than 94% of the ATMs were found to be enriched in at least one KEGG pathway or GO category, suggesting that the ATMs are functionally relevant. One advantage of this method is that it can discover potentially new gene–TF associations from the literature. Using a set of microarray and ChIP-Seq datasets as gold standard, we show that the precision of our method for predicting gene–TF associations is significantly higher than chance. In addition, we demonstrate that the terms in each ATM can be used to suggest new GO classifications to genes and TFs. Taken together, our results indicate that NTF is useful for simultaneous extraction and functional annotation of transcriptional regulatory networks from unstructured text, as well as for literature based discovery. A web tool called Transcriptional Regulatory Modules Extracted from Literature (TREMEL), available at http://binf1.memphis.edu/tremel, was built to enable browsing and searching of ATMs. PMID:28894735
Ferro, Myriam; Tardif, Marianne; Reguer, Erwan; Cahuzac, Romain; Bruley, Christophe; Vermat, Thierry; Nugues, Estelle; Vigouroux, Marielle; Vandenbrouck, Yves; Garin, Jérôme; Viari, Alain
2008-05-01
PepLine is a fully automated software tool which maps MS/MS fragmentation spectra of tryptic peptides to genomic DNA sequences. The approach is based on Peptide Sequence Tags (PSTs) obtained from partial interpretation of QTOF MS/MS spectra (first module). PSTs are then mapped on the six-frame translations of genomic sequences (second module) giving hits. Hits are then clustered to detect potential coding regions (third module). Our work aimed at optimizing the algorithms of each component to allow the whole pipeline to proceed in a fully automated manner using raw nucleic acid sequences (i.e., genomes that have not been "reduced" to a database of ORFs or putative exon sequences). The whole pipeline was tested on controlled MS/MS spectra sets from standard proteins and from Arabidopsis thaliana envelope chloroplast samples. Our results demonstrate that PepLine competed with protein database searching software and was fast enough to potentially tackle large data sets and/or large genomes. We also illustrate the potential of this approach for the detection of the intron/exon structure of genes.
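The second module described above (mapping sequence tags onto six-frame translations) can be illustrated with a minimal sketch. This is not PepLine's code: the function names are hypothetical, the standard genetic code is assumed, and a tag is treated as a plain amino-acid string matched exactly, whereas real PSTs also carry flanking mass information.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate codon by codon; unknown codons (e.g. containing N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map (strand, offset) to the protein string for each of the six frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


def map_tag(tag, dna):
    """Return (strand, offset, protein_position) for every exact tag match."""
    hits = []
    for (strand, offset), protein in six_frame_translations(dna).items():
        start = protein.find(tag)
        while start != -1:
            hits.append((strand, offset, start))
            start = protein.find(tag, start + 1)
    return hits
```

Hits returned by `map_tag` would then be the input to a clustering step along the genome, which is the third module in the abstract and is not sketched here.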
Text processing through Web services: calling Whatizit.
Rebholz-Schuhmann, Dietrich; Arregui, Miguel; Gaudan, Sylvain; Kirsch, Harald; Jimeno, Antonio
2008-01-15
Text-mining (TM) solutions are developing into efficient services to researchers in the biomedical research community. Such solutions have to scale with the growing number and size of resources (e.g. available controlled vocabularies), with the amount of literature to be processed (e.g. about 17 million documents in PubMed) and with the demands of the user community (e.g. different methods for fact extraction). These demands motivated the development of a server-based solution for literature analysis. Whatizit is a suite of modules that analyse text for contained information, e.g. any scientific publication or Medline abstracts. Special modules identify terms and then link them to the corresponding entries in bioinformatics databases such as UniProtKb/Swiss-Prot data entries and gene ontology concepts. Other modules identify a set of selected annotation types like the set produced by the EBIMed analysis pipeline for proteins. In the case of Medline abstracts, Whatizit offers access to EBI's in-house installation via PMID or term query. For large quantities of the user's own text, the server can be operated in a streaming mode (http://www.ebi.ac.uk/webservices/whatizit).
Cinti, Alessandro; De Giorgi, Marco; Chisci, Elisa; Arena, Claudia; Galimberti, Gloria; Farina, Laura; Bugarin, Cristina; Rivolta, Ilaria; Gaipa, Giuseppe; Smolenski, Ryszard Tom; Cerrito, Maria Grazia; Lavitrano, Marialuisa; Giovannoni, Roberto
2015-01-01
Several biomedical applications, such as xenotransplantation, require multiple genes simultaneously expressed in eukaryotic cells. Advances in genetic engineering technologies have led to the development of efficient polycistronic vectors based on the use of the 2A self-processing oligopeptide. The aim of this work was to evaluate the protective effects of the simultaneous expression of a novel combination of anti-inflammatory human genes, ENTPD1, E5NT and HO-1, in eukaryotic cells. We produced an F2A system-based multicistronic construct to express three human proteins in NIH3T3 cells exposed to an inflammatory stimulus represented by tumor necrosis factor alpha (TNF-α), a pro-inflammatory cytokine which plays an important role during inflammation, cell proliferation, differentiation and apoptosis and in the inflammatory response during ischemia/reperfusion injury in several organ transplantation settings. The protective effects against TNF-α-induced cytotoxicity and cell death, mediated by HO-1, ENTPD1 and E5NT genes, were stronger in cells expressing the combination of genes than in cells expressing each single gene, and the effect was further improved by administering enzymatic substrates of the human genes to the cells. Moreover, gene expression analysis demonstrated that the expression of the three genes has a role in modulating key regulators of the TNF-α signalling pathway, namely Nemo and Tnfaip3, that promoted a pro-survival phenotype in TNF-α injured cells. These results could provide new insights in the research of protective mechanisms in transplantation settings. PMID:26513260
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pena-Castillo, Lourdes; Mercer, Ryan; Gurinovich, Anastasia
2014-08-28
The genus Rhodobacter contains purple nonsulfur bacteria found mostly in freshwater environments. Representative strains of two Rhodobacter species, R. capsulatus and R. sphaeroides, have had their genomes fully sequenced and both have been the subject of transcriptional profiling studies. Gene co-expression networks can be used to identify modules of genes with similar expression profiles. Functional analysis of gene modules can then associate co-expressed genes with biological pathways, and network statistics can determine the degree of module preservation in related networks. In this paper, we constructed an R. capsulatus gene co-expression network, performed functional analysis of identified gene modules, and investigated preservation of these modules in R. capsulatus proteomics data and in R. sphaeroides transcriptomics data. Results: The analysis identified 40 gene co-expression modules in R. capsulatus. Investigation of the module gene contents and expression profiles revealed patterns that were validated based on previous studies supporting the biological relevance of these modules. We identified two R. capsulatus gene modules preserved in the protein abundance data. We also identified several gene modules preserved between both Rhodobacter species, which indicate that these cellular processes are conserved between the species and are candidates for functional information transfer between species. Many gene modules were non-preserved, providing insight into processes that differentiate the two species. In addition, using Local Network Similarity (LNS), a recently proposed metric for expression divergence, we assessed the expression conservation of between-species pairs of orthologs, and within-species gene-protein expression profiles. Conclusions: Our analyses provide new sources of information for functional annotation in R.
capsulatus because uncharacterized genes in modules are now connected with groups of genes that constitute a joint functional annotation. We identified R. capsulatus modules enriched with genes for ribosomal proteins, porphyrin and bacteriochlorophyll anabolism, and biosynthesis of secondary metabolites to be preserved in R. sphaeroides whereas modules related to RcGTA production and signalling showed lack of preservation in R. sphaeroides. In addition, we demonstrated that network statistics may also be applied within-species to identify congruence between mRNA expression and protein abundance data for which simple correlation measurements have previously had mixed results.
Increased entropy of signal transduction in the cancer metastasis phenotype.
Teschendorff, Andrew E; Severini, Simone
2010-07-30
The statistical study of biological networks has led to important novel biological insights, such as the presence of hubs and hierarchical modularity. There is also a growing interest in studying the statistical properties of networks in the context of cancer genomics. However, relatively little is known as to what network features differ between the cancer and normal cell physiologies, or between different cancer cell phenotypes. Based on the observation that frequent genomic alterations underlie a more aggressive cancer phenotype, we asked if such an effect could be detectable as an increase in the randomness of local gene expression patterns. Using a breast cancer gene expression data set and a model network of protein interactions we derive constrained weighted networks defined by a stochastic information flux matrix reflecting expression correlations between interacting proteins. Based on this stochastic matrix we propose and compute an entropy measure that quantifies the degree of randomness in the local pattern of information flux around single genes. By comparing the local entropies in the non-metastatic versus metastatic breast cancer networks, we here show that breast cancers that metastasize are characterised by a small yet significant increase in the degree of randomness of local expression patterns. We validate this result in three additional breast cancer expression data sets and demonstrate that local entropy better characterises the metastatic phenotype than other non-entropy based measures. We show that increases in entropy can be used to identify genes and signalling pathways implicated in breast cancer metastasis and provide examples of de-novo discoveries of gene modules with known roles in apoptosis, immune-mediated tumour suppression, cell-cycle and tumour invasion. Importantly, we also identify a novel gene module within the insulin growth factor signalling pathway, alteration of which may predispose the tumour to metastasize. 
These results demonstrate that a metastatic cancer phenotype is characterised by an increase in the randomness of the local information flux patterns. Measures of local randomness in integrated protein interaction mRNA expression networks may therefore be useful for identifying genes and signalling pathways disrupted in one phenotype relative to another. Further exploration of the statistical properties of such integrated cancer expression and protein interaction networks will be a fruitful endeavour.
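The idea of a local entropy computed from a stochastic flux matrix can be made concrete with a small sketch. The details here are assumptions, not taken from the abstract: edge weights are supplied directly as non-negative numbers (how they are derived from expression correlations is left open), each gene's outgoing weights are normalised to sum to one, and the Shannon entropy is divided by the logarithm of the gene's degree so that values lie between 0 and 1. The function name is hypothetical.

```python
import math


def local_entropies(weights):
    """Normalised Shannon entropy of the outgoing flux around each gene.

    weights: dict mapping gene -> dict of neighbour -> non-negative edge weight.
    Returns a dict mapping gene -> entropy in [0, 1]. Genes with fewer than two
    neighbours, or with no positive weight, get 0.0 by convention (an assumption).
    """
    entropies = {}
    for gene, nbrs in weights.items():
        total = sum(nbrs.values())
        if len(nbrs) < 2 or total <= 0:
            entropies[gene] = 0.0
            continue
        # Row of the stochastic matrix: p_ij = w_ij / sum_k w_ik.
        probs = [w / total for w in nbrs.values() if w > 0]
        shannon = sum(-p * math.log(p) for p in probs)
        # Divide by log(degree) so a uniform flux pattern scores exactly 1.
        entropies[gene] = shannon / math.log(len(nbrs))
    return entropies
```

Under these assumptions, a gene whose flux is spread evenly over its neighbours scores 1, and a gene whose flux is concentrated on a single neighbour scores 0. Comparing such per-gene values between two phenotype-specific networks is the kind of analysis the abstract describes.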
In vivo delivery of miRNAs for cancer therapy: Challenges and strategies
Chen, Yunching; Gao, Dong-Yu; Huang, Leaf
2016-01-01
MicroRNAs (miRNAs), small non-coding RNAs, can regulate gene expression post-transcriptionally and silence a broad set of target genes. miRNAs, aberrantly expressed in cancer cells, play an important role in modulating gene expression, thereby regulating downstream signaling pathways and affecting cancer formation and progression. Oncogenes or tumor suppressor genes regulated by miRNAs mediate cell cycle progression, metabolism, cell death, angiogenesis, metastasis and immunosuppression in cancer. Recently, miRNAs have emerged as therapeutic targets or tools and biomarkers for diagnosis and therapy monitoring in cancer. Since miRNAs can regulate multiple cancer-related genes simultaneously, using miRNAs as a therapeutic approach plays an important role in cancer therapy. However, one of the major challenges of miRNA-based cancer therapy is to achieve specific, efficient and safe systemic delivery of therapeutic miRNAs in vivo. This review discusses the key challenges to the development of the carriers for miRNA-based therapy and explores current strategies to systemically deliver miRNAs to cancer without induction of toxicity. PMID:24859533
Investigation of candidate genes for osteoarthritis based on gene expression profiles.
Dong, Shuanghai; Xia, Tian; Wang, Lei; Zhao, Qinghua; Tian, Jiwei
2016-12-01
To explore the mechanism of osteoarthritis (OA) and provide valid biological information for further investigation. Gene expression profile of GSE46750 was downloaded from Gene Expression Omnibus database. The Linear Models for Microarray Data (limma) package (Bioconductor project, http://www.bioconductor.org/packages/release/bioc/html/limma.html) was used to identify differentially expressed genes (DEGs) in inflamed OA samples. Gene Ontology function enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways enrichment analysis of DEGs were performed based on Database for Annotation, Visualization and Integrated Discovery data, and protein-protein interaction (PPI) network was constructed based on the Search Tool for the Retrieval of Interacting Genes/Proteins database. Regulatory network was screened based on Encyclopedia of DNA Elements. Molecular Complex Detection was used for sub-network screening. Two sub-networks with highest node degree were integrated with transcriptional regulatory network and KEGG functional enrichment analysis was processed for 2 modules. In total, 401 up- and 196 down-regulated DEGs were obtained. Up-regulated DEGs were involved in inflammatory response, while down-regulated DEGs were involved in cell cycle. PPI network with 2392 protein interactions was constructed. Moreover, 10 genes including Interleukin 6 (IL6) and Aurora B kinase (AURKB) were found to be outstanding in PPI network. There are 214 up- and 8 down-regulated transcription factor (TF)-target pairs in the TF regulatory network. Module 1 had TFs including SPI1, PRDM1, and FOS, while module 2 contained FOSL1. The nodes in module 1 were enriched in chemokine signaling pathway, while the nodes in module 2 were mainly enriched in cell cycle. 
The screened DEGs including IL6, AGT, and AURKB might be potential biomarkers for gene therapy for OA by being regulated by TFs such as FOS and SPI1, and participating in the cell cycle and cytokine-cytokine receptor interaction pathway. Copyright © 2016 Turkish Association of Orthopaedics and Traumatology. Production and hosting by Elsevier B.V. All rights reserved.
Stajdohar, Miha; Rosengarten, Rafael D; Kokosar, Janez; Jeran, Luka; Blenkus, Domen; Shaulsky, Gad; Zupan, Blaz
2017-06-02
Dictyostelium discoideum, a soil-dwelling social amoeba, is a model for the study of numerous biological processes. Research in the field has benefited mightily from the adoption of next-generation sequencing for genomics and transcriptomics. Dictyostelium biologists now face the widespread challenges of analyzing and exploring high dimensional data sets to generate hypotheses and discovering novel insights. We present dictyExpress (2.0), a web application designed for exploratory analysis of gene expression data, as well as data from related experiments such as Chromatin Immunoprecipitation sequencing (ChIP-Seq). The application features visualization modules that include time course expression profiles, clustering, gene ontology enrichment analysis, differential expression analysis and comparison of experiments. All visualizations are interactive and interconnected, such that the selection of genes in one module propagates instantly to visualizations in other modules. dictyExpress currently stores the data from over 800 Dictyostelium experiments and is embedded within a general-purpose software framework for management of next-generation sequencing data. dictyExpress allows users to explore their data in a broader context by reciprocal linking with dictyBase-a repository of Dictyostelium genomic data. In addition, we introduce a companion application called GenBoard, an intuitive graphic user interface for data management and bioinformatics analysis. dictyExpress and GenBoard enable broad adoption of next generation sequencing based inquiries by the Dictyostelium research community. Labs without the means to undertake deep sequencing projects can mine the data available to the public. The entire information flow, from raw sequence data to hypothesis testing, can be accomplished in an efficient workspace. The software framework is generalizable and represents a useful approach for any research community. 
To encourage wider use, the backend is open source and available for extension and further development by bioinformaticians and data scientists.
Microarray analysis reveals key genes and pathways in Tetralogy of Fallot
He, Yue-E; Qiu, Hui-Xian; Jiang, Jian-Bing; Wu, Rong-Zhou; Xiang, Ru-Lian; Zhang, Yuan-Hai
2017-01-01
The aim of the present study was to identify key genes that may be involved in the pathogenesis of Tetralogy of Fallot (TOF) using bioinformatics methods. The GSE26125 microarray dataset, which includes cardiovascular tissue samples derived from 16 children with TOF and five healthy age-matched control infants, was downloaded from the Gene Expression Omnibus database. Differential expression analysis was performed between TOF and control samples to identify differentially expressed genes (DEGs) using Student's t-test and the R/limma package, with a log2 fold-change of >2 and a false discovery rate of <0.01 set as thresholds. The biological functions of DEGs were analyzed using the ToppGene database. The ReactomeFIViz application was used to construct functional interaction (FI) networks, and the genes in each module were subjected to pathway enrichment analysis. The iRegulon plugin was used to identify transcription factors predicted to regulate the DEGs in the FI network, and the gene-transcription factor pairs were then visualized using Cytoscape software. A total of 878 DEGs were identified, including 848 upregulated genes and 30 downregulated genes. The gene FI network contained seven function modules, all of which were composed of upregulated genes. Genes in Module 1 were enriched in the following three neurological disorder-associated signaling pathways: Parkinson's disease, Alzheimer's disease and Huntington's disease. Genes in Modules 0, 3 and 5 were predominantly enriched in pathways associated with ribosomes and protein translation. The X-box binding protein 1 transcription factor was demonstrated to be involved in the regulation of genes encoding the subunits of cytoplasmic and mitochondrial ribosomes, as well as genes involved in neurodegenerative disorders. Therefore, dysfunction of genes involved in signaling pathways associated with neurodegenerative disorders, ribosome function and protein translation may contribute to the pathogenesis of TOF.
PMID:28713939
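The thresholding step described in this abstract (a log2 fold-change cutoff combined with a false discovery rate cutoff) can be illustrated with a minimal Python sketch. It is not the authors' R/limma workflow: the Benjamini-Hochberg adjustment, the use of the absolute fold-change, the function names and the toy gene values are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the adjusted
    # values stay monotone in the rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(
    genes: Dict[str, Tuple[float, float]],
    lfc_cut: float = 2.0,
    fdr_cut: float = 0.01,
) -> List[str]:
    """Keep genes passing both cutoffs.

    `genes` maps a gene name to (log2 fold-change, raw p-value).
    Assumption: the fold-change cutoff is applied to the absolute value.
    """
    names = list(genes)
    fdr = benjamini_hochberg([genes[n][1] for n in names])
    return [
        n for n, q in zip(names, fdr)
        if abs(genes[n][0]) > lfc_cut and q < fdr_cut
    ]


# Hypothetical toy input, not data from GSE26125.
toy = {
    "geneA": (3.1, 0.0001),
    "geneB": (-2.5, 0.0004),
    "geneC": (0.4, 0.0002),
    "geneD": (2.2, 0.2),
}
selected = select_degs(toy)
```

In this toy input, geneC fails the fold-change cutoff and geneD fails the FDR cutoff, so only geneA and geneB are kept.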
Robustness, evolvability, and the logic of genetic regulation.
Payne, Joshua L; Moore, Jason H; Wagner, Andreas
2014-01-01
In gene regulatory circuits, the expression of individual genes is commonly modulated by a set of regulating gene products, which bind to a gene's cis-regulatory region. This region encodes an input-output function, referred to as signal-integration logic, that maps a specific combination of regulatory signals (inputs) to a particular expression state (output) of a gene. The space of all possible signal-integration functions is vast and the mapping from input to output is many-to-one: For the same set of inputs, many functions (genotypes) yield the same expression output (phenotype). Here, we exhaustively enumerate the set of signal-integration functions that yield identical gene expression patterns within a computational model of gene regulatory circuits. Our goal is to characterize the relationship between robustness and evolvability in the signal-integration space of regulatory circuits, and to understand how these properties vary between the genotypic and phenotypic scales. Among other results, we find that the distributions of genotypic robustness are skewed, so that the majority of signal-integration functions are robust to perturbation. We show that the connected set of genotypes that make up a given phenotype are constrained to specific regions of the space of all possible signal-integration functions, but that as the distance between genotypes increases, so does their capacity for unique innovations. In addition, we find that robust phenotypes are (i) evolvable, (ii) easily identified by random mutation, and (iii) mutationally biased toward other robust phenotypes. We explore the implications of these latter observations for mutation-based evolution by conducting random walks between randomly chosen source and target phenotypes. We demonstrate that the time required to identify the target phenotype is independent of the properties of the source phenotype.
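The many-to-one mapping from signal-integration functions to expression outputs can be illustrated with a deliberately simplified Python sketch. It is not the circuit model used in the study: here a "genotype" is a single Boolean truth table, a "phenotype" is the output on a chosen subset of input combinations, and robustness is the fraction of single-entry flips that leave that phenotype unchanged. All names and these definitions are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, List, Sequence, Tuple

Table = Tuple[int, ...]


def enumerate_functions(k: int):
    """Yield every Boolean function of k inputs as a truth-table tuple.

    Row r of the table is the output for the input whose bits spell r.
    """
    return product((0, 1), repeat=2 ** k)


def phenotype(table: Table, observed_rows: Sequence[int]) -> Table:
    """Toy phenotype: the outputs on the input combinations that occur."""
    return tuple(table[r] for r in observed_rows)


def group_by_phenotype(k: int, observed_rows: Sequence[int]) -> Dict[Table, List[Table]]:
    """Group all truth tables (genotypes) by the phenotype they produce."""
    groups: Dict[Table, List[Table]] = defaultdict(list)
    for table in enumerate_functions(k):
        groups[phenotype(table, observed_rows)].append(table)
    return groups


def robustness(table: Table, observed_rows: Sequence[int]) -> float:
    """Fraction of single-entry flips that leave the phenotype unchanged."""
    base = phenotype(table, observed_rows)
    unchanged = 0
    for i in range(len(table)):
        mutant = table[:i] + (1 - table[i],) + table[i + 1:]
        if phenotype(mutant, observed_rows) == base:
            unchanged += 1
    return unchanged / len(table)


# Two regulators; only the "both off" and "both on" input states are observed.
groups = group_by_phenotype(2, observed_rows=(0, 3))
```

With two inputs there are 16 truth tables, and observing only two input states collapses them into four phenotypes of four genotypes each. In this toy setting every genotype has the same robustness, so it does not reproduce the skewed distributions reported in the abstract.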
Guo, Sheng-Min; Wang, Jian-Xiong; Li, Jin; Xu, Fang-Yuan; Wei, Quan; Wang, Hai-Ming; Huang, Hou-Qiang; Zheng, Si-Lin; Xie, Yu-Jie; Zhang, Chi
2018-06-15
Osteoarthritis (OA) significantly influences the quality of life of people around the world. It is urgent to find an effective way to understand the genetic etiology of OA. We used weighted gene coexpression network analysis (WGCNA) to explore the key genes involved in the subchondral bone pathological process of OA. Fifty gene expression profiles of GSE51588 were downloaded from the Gene Expression Omnibus database. The OA-associated genes and gene ontologies were acquired from JuniorDoc. Weighted gene coexpression network analysis was used to find disease-related networks based on 21756 gene expression correlation coefficients, hub-genes with the highest connectivity in each module were selected, and the correlation between module eigengene and clinical traits was calculated. The genes in the traits-related gene coexpression modules were subject to functional annotation and pathway enrichment analysis using ClusterProfiler. A total of 73 gene modules were identified, of which 12 modules showed high connectivity with clinical traits. Five modules were enriched in OA-associated genes. Moreover, 310 OA-associated genes were found, and 34 of them were among hub-genes in each module. Consequently, enrichment results indicated some key metabolic pathways, such as extracellular matrix (ECM)-receptor interaction (hsa04512), focal adhesion (hsa04510), the phosphatidylinositol 3'-kinase (PI3K)-Akt signaling pathway (PI3K-AKT) (hsa04151), transforming growth factor beta pathway, and Wnt pathway. We identified several core genes (collagen (COL)6A3, COL6A1, ITGA11, BAMBI, and HCK) that could influence downstream signaling pathways once they were activated. In this study, we identified important genes within key coexpression modules, which are associated with a pathological process of subchondral bone in OA. Functional analysis results could provide important information to understand the mechanism of OA. © 2018 Wiley Periodicals, Inc.
Kumar, Gulshan; Gupta, Khushboo; Pathania, Shivalika; Swarnkar, Mohit Kumar; Rattan, Usha Kumari; Singh, Gagandeep; Sharma, Ram Kumar; Singh, Anil Kumar
2017-01-01
The availability of sufficient chilling during bud dormancy plays an important role in the subsequent yield and quality of apple fruit, whereas insufficient chilling negatively affects apple production. Transcriptome profiling during bud dormancy release and initial fruit set under low and high chill conditions was performed using RNA-seq. The comparatively high number of differentially expressed genes during bud break and fruit set under the high chill condition indicates that chilling availability was associated with transcriptional reorganization. The comparative analysis reveals the differential expression of genes involved in phytohormone metabolism, particularly for abscisic acid, gibberellic acid, ethylene, auxin and cytokinin. The expression of Dormancy Associated MADS-box, Flowering Locus C-like, Flowering Locus T-like and Terminal Flower 1-like genes was found to be modulated under differential chilling. The co-expression network analysis identified two high chill specific modules that were found to be enriched for “post-embryonic development” GO terms. The network analysis also identified hub genes including Early flowering 7, RAF10, ZEP4 and F-box, which may be involved in regulating chilling-mediated dormancy release and fruit set. The results of transcriptome and co-expression network analysis indicate that chilling availability mainly regulates phytohormone-related pathways and post-embryonic development during bud break. PMID:28198417
A statistical framework for biomedical literature mining.
Chung, Dongjun; Lawson, Andrew; Zheng, W Jim
2017-09-30
In systems biology, it is of great interest to identify new genes that were not previously reported to be associated with biological pathways related to various functions and diseases. Identification of these new pathway-modulating genes not only promotes understanding of pathway regulation mechanisms but also allows identification of novel targets for therapeutics. Recently, biomedical literature has been considered a valuable resource for investigating pathway-modulating genes. While the majority of currently available approaches are based on the co-occurrence of genes within an abstract, it has been reported that these approaches show only sub-optimal performance because 70% of abstracts contain information only for a single gene. To overcome this limitation, we propose a novel statistical framework based on the concept of ontology fingerprint that uses gene ontology to extract information from large biomedical literature data. The proposed framework simultaneously identifies pathway-modulating genes and facilitates interpreting functions of these new genes. We also propose a computationally efficient posterior inference procedure based on Metropolis-Hastings within Gibbs sampler for parameter updates and the poor man's reversible jump Markov chain Monte Carlo approach for model selection. We evaluate the proposed statistical framework with simulation studies, experimental validation, and an application to studies of pathway-modulating genes in yeast. The R implementation of the proposed model is currently available at https://dongjunchung.github.io/bayesGO/. Copyright © 2017 John Wiley & Sons, Ltd.
Wang, Weijing; Jiang, Wenjie; Hou, Lin; Duan, Haiping; Wu, Yili; Xu, Chunsheng; Tan, Qihua; Li, Shuxia; Zhang, Dongfeng
2017-11-13
The therapeutic management of obesity is challenging, hence further elucidating the underlying mechanisms of obesity development and identifying new diagnostic biomarkers and therapeutic targets are urgent and necessary. Here, we performed differential gene expression analysis and weighted gene co-expression network analysis (WGCNA) to identify significant genes and specific modules related to BMI based on gene expression profile data of 7 discordant monozygotic twins. In the differential gene expression analysis, it appeared that 32 differentially expressed genes (DEGs) were with a trend of up-regulation in twins with higher BMI when compared to their siblings. Categories of positive regulation of nitric-oxide synthase biosynthetic process, positive regulation of NF-kappa B import into nucleus, and peroxidase activity were significantly enriched within GO database and NF-kappa B signaling pathway within KEGG database. DEGs of NAMPT, TLR9, PTGS2, HBD, and PCSK1N might be associated with obesity. In the WGCNA, among the total 20 distinct co-expression modules identified, coral1 module (68 genes) had the strongest positive correlation with BMI (r = 0.56, P = 0.04) and disease status (r = 0.56, P = 0.04). Categories of positive regulation of phospholipase activity, high-density lipoprotein particle clearance, chylomicron remnant clearance, reverse cholesterol transport, intermediate-density lipoprotein particle, chylomicron, low-density lipoprotein particle, very-low-density lipoprotein particle, voltage-gated potassium channel complex, cholesterol transporter activity, and neuropeptide hormone activity were significantly enriched within GO database for this module. And alcoholism and cell adhesion molecules pathways were significantly enriched within KEGG database. Several hub genes, such as GAL, ASB9, NPPB, TBX2, IL17C, APOE, ABCG4, and APOC2 were also identified. 
The module eigengene of the saddlebrown module (212 genes) was also significantly correlated with BMI (r = 0.56, P = 0.04), and the hub genes KCNN1 and AQP10 were differentially expressed. We identified significant genes and specific modules potentially related to BMI based on the gene expression profile data of monozygotic twins. The findings may help further elucidate the underlying mechanisms of obesity development and provide novel insights into potential gene biomarkers and signaling pathways for obesity treatment. Further analysis and validation of the findings reported here will be important and necessary when a larger sample size is available.
Ferrari, Raffaele; Forabosco, Paola; Vandrovcova, Jana; Botía, Juan A; Guelfi, Sebastian; Warren, Jason D; Momeni, Parastoo; Weale, Michael E; Ryten, Mina; Hardy, John
2016-02-24
In frontotemporal dementia (FTD) there is a critical lack in the understanding of biological and molecular mechanisms involved in disease pathogenesis. The heterogeneous genetic features associated with FTD suggest that multiple disease-mechanisms are likely to contribute to the development of this neurodegenerative condition. We here present a systems biology approach with the scope of i) shedding light on the biological processes potentially implicated in the pathogenesis of FTD and ii) identifying novel potential risk factors for FTD. We performed a gene co-expression network analysis of microarray expression data from 101 individuals without neurodegenerative diseases to explore regional-specific co-expression patterns in the frontal and temporal cortices for 12 genes (MAPT, GRN, CHMP2B, CTSC, HLA-DRA, TMEM106B, C9orf72, VCP, UBQLN2, OPTN, TARDBP and FUS) associated with FTD and we then carried out gene set enrichment and pathway analyses, and investigated known protein-protein interactors (PPIs) of FTD-genes products. Gene co-expression networks revealed that several FTD-genes (such as MAPT and GRN, CTSC and HLA-DRA, TMEM106B, and C9orf72, VCP, UBQLN2 and OPTN) were clustering in modules of relevance in the frontal and temporal cortices. Functional annotation and pathway analyses of such modules indicated enrichment for: i) DNA metabolism, i.e. transcription regulation, DNA protection and chromatin remodelling (MAPT and GRN modules); ii) immune and lysosomal processes (CTSC and HLA-DRA modules), and; iii) protein meta/catabolism (C9orf72, VCP, UBQLN2 and OPTN, and TMEM106B modules). PPI analysis supported the results of the functional annotation and pathway analyses. 
This work further characterizes known FTD-genes and elaborates on their biological relevance to disease: not only do we indicate likely impacted regional-specific biological processes driven by FTD-genes containing modules, but also do we suggest novel potential risk factors among the FTD-genes interactors as targets for further mechanistic characterization in hypothesis driven cell biology work.
Bag, Susmita; Ramaiah, Sudha; Anbarasu, Anand
2015-01-07
Network studies of genes and proteins offer a functional basis for understanding the complexity of genes, proteins and their interacting partners. The gene fatty acid-binding protein 4 (fabp4) is found to be highly expressed in adipose tissue, and is one of the most abundant proteins in mature adipocytes. Our investigations on functional modules of fabp4 provide useful information on the functional genes interacting with fabp4, their biochemical properties and their regulatory functions. The present study shows that eight candidate genes (acp1, ext2, insr, lipe, ostf1, sncg, usp15, and vim) are strongly and functionally linked with fabp4. Gene ontological analysis of network modules of fabp4 provides an explicit idea of the functional aspect of fabp4 and its interacting nodes. The hierarchical mapping on gene ontology indicates gene specific processes and functions as well as their compartmentalization in tissues. fabp4 and its interacting genes are involved in lipid metabolic activity and are integrated in multi-cellular processes of tissues and organs. They also have important protein/enzyme binding activity. Our study elucidated disease-associated nsSNP prediction for fabp4, and it is interesting to note that there are four rsIDs (rs1051231, rs3204631, rs140925685 and rs141169989) with disease allelic variation (T104P, T126P, G27D and G90V, respectively). On the whole, our gene network analysis presents a clear insight into the interactions and functions associated with the fabp4 gene network. Copyright © 2014 Elsevier Ltd. All rights reserved.
OPATs: Omnibus P-value association tests.
Chen, Chia-Wei; Yang, Hsin-Chou
2017-07-10
Combining statistical significances (P-values) from a set of single-locus association tests in genome-wide association studies is a proof-of-principle method for identifying disease-associated genomic segments, functional genes and biological pathways. We review P-value combinations for genome-wide association studies and introduce an integrated analysis tool, Omnibus P-value Association Tests (OPATs), which provides popular analysis methods of P-value combinations. The software OPATs programmed in R and R graphical user interface features a user-friendly interface. In addition to analysis modules for data quality control and single-locus association tests, OPATs provides three types of set-based association test: window-, gene- and biopathway-based association tests. P-value combinations with or without threshold and rank truncation are provided. The significance of a set-based association test is evaluated by using resampling procedures. Performance of the set-based association tests in OPATs has been evaluated by simulation studies and real data analyses. These set-based association tests help boost the statistical power, alleviate the multiple-testing problem, reduce the impact of genetic heterogeneity, increase the replication efficiency of association tests and facilitate the interpretation of association signals by streamlining the testing procedures and integrating the genetic effects of multiple variants in genomic regions of biological relevance. In summary, P-value combinations facilitate the identification of marker sets associated with disease susceptibility and uncover missing heritability in association studies, thereby establishing a foundation for the genetic dissection of complex diseases and traits. OPATs provides an easy-to-use and statistically powerful analysis tool for P-value combinations. OPATs, examples, and user guide can be downloaded from http://www.stat.sinica.edu.tw/hsinchou/genetics/association/OPATs.htm. © The Author 2017. 
Published by Oxford University Press.
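As an illustration of what a P-value combination looks like, the following Python sketch implements Fisher's classical method without truncation. It is not code from OPATs, which is written in R and, according to the abstract, evaluates significance by resampling. The closed-form chi-square tail used here assumes the combined tests are independent, which often does not hold for neighbouring loci; the function name is hypothetical.

```python
import math
from typing import Sequence


def fisher_combined_pvalue(pvalues: Sequence[float]) -> float:
    """Combine independent p-values with Fisher's method.

    The statistic -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis. For even degrees of
    freedom the upper tail has the closed form
        exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    so no statistics library is needed.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    half = -sum(math.log(p) for p in pvalues)  # equals statistic / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical single-locus p-values within one gene.
combined = fisher_combined_pvalue([0.05, 0.05])
```

For two p-values the result reduces to p1 * p2 * (1 - ln(p1 * p2)), which gives a quick sanity check. Threshold or rank truncation would need a different null distribution, typically obtained by resampling as the abstract describes.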
On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions
NASA Astrophysics Data System (ADS)
Tarpine, Ryan; Istrail, Sorin
The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is done through regulatory states, defined as sets of transcription factors (sequence-specific DNA binding proteins which determine gene expression) that are expressed and active at the same time. The core information processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. This endowment of each gene with the information-receiving capacity through their cis-regulatory modules is essential for the response to every possible regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project aimed at studying the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the software development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is anchored on three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.
Yang, Jialiang; Qiu, Jing; Wang, Kejing; Zhu, Lijuan; Fan, Jingjing; Zheng, Deyin; Meng, Xiaodi; Yang, Jiasheng; Peng, Lihong; Fu, Yu; Zhang, Dahan; Peng, Shouneng; Huang, Haiyun; Zhang, Yi
2017-01-01
Obesity is a primary risk factor for many diseases such as certain cancers. In this study, we have developed three algorithms, a random-walk based method OBNet, a shortest-path based method OBsp and a direct-overlap method OBoverlap, to reveal obesity-disease connections at protein-interaction subnetworks corresponding to thousands of biological functions and pathways. Through literature mining, we also curated an obesity-associated disease list, against which we compared the methods. OBNet outperformed the other two methods. OBNet can predict whether a disease is obesity-related based on its associated genes. Meanwhile, OBNet identifies extensive connections between obesity genes and genes associated with a few diseases at various functional modules and pathways. Using breast cancer and Type 2 diabetes as two examples, OBNet identifies meaningful genes that may play key roles in connecting obesity and the two diseases. For example, TGFB1 and VEGFA are inferred to be the top two key genes mediating the obesity-breast cancer connection in modules associated with brain development. Finally, the top modules identified by OBNet in breast cancer significantly overlap with modules identified from the TCGA breast cancer gene expression study, revealing the power of OBNet in identifying biological processes involved in the disease. PMID:29156709
The GMOD Drupal Bioinformatic Server Framework
Papanicolaou, Alexie; Heckel, David G.
2010-01-01
Motivation: Next-generation sequencing technologies have led to the widespread use of -omic applications. As a result, there is now a pronounced bioinformatic bottleneck. The general model organism database (GMOD) tool kit (http://gmod.org) has produced a number of resources aimed at addressing this issue. It lacks, however, a robust online solution that can deploy heterogeneous data and software within a Web content management system (CMS). Results: We present a bioinformatic framework for the Drupal CMS. It consists of three modules. First, GMOD-DBSF is an application programming interface module for the Drupal CMS that simplifies the programming of bioinformatic Drupal modules. Second, the Drupal Bioinformatic Software Bench (biosoftware_bench) allows for a rapid and secure deployment of bioinformatic software. An innovative graphical user interface (GUI) guides both use and administration of the software, including the secure provision of pre-publication datasets. Third, we present genes4all_experiment, which exemplifies how our work supports the wider research community. Conclusion: Given the infrastructure presented here, the Drupal CMS may become a powerful new tool set for bioinformaticians. The GMOD-DBSF base module is an expandable community resource that decreases development time of Drupal modules for bioinformatics. The biosoftware_bench module can already enhance biologists' ability to mine their own data. The genes4all_experiment module has already been responsible for archiving of more than 150 studies of RNAi from Lepidoptera, which were previously unpublished. Availability and implementation: Implemented in PHP and Perl. Freely available under the GNU Public License 2 or later from http://gmod-dbsf.googlecode.com Contact: alexie@butterflybase.org PMID:20971988
Man, Orna; Pilpel, Yitzhak
2007-03-01
A major challenge in comparative genomics is to understand how phenotypic differences between species are encoded in their genomes. Phenotypic divergence may result from differential transcription of orthologous genes, yet less is known about the involvement of differential translation regulation in species phenotypic divergence. In order to assess translation effects on divergence, we analyzed approximately 2,800 orthologous genes in nine yeast genomes. For each gene in each species, we predicted translation efficiency, using a measure of the adaptation of its codons to the organism's tRNA pool. Mining this data set, we found hundreds of genes and gene modules with correlated patterns of translational efficiency across the species. One signal encompassed entire modules that are either needed for oxidative respiration or fermentation and are efficiently translated in aerobic or anaerobic species, respectively. In addition, the efficiency of translation of the mRNA splicing machinery strongly correlates with the number of introns in the various genomes. Altogether, we found extensive selection on synonymous codon usage that modulates translation according to gene function and organism phenotype. We conclude that, like factors such as transcription regulation, translation efficiency affects and is affected by the process of species divergence.
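A per-gene score of codon adaptation to the tRNA pool is commonly computed as the geometric mean of per-codon weights. The Python sketch below shows that calculation under stated assumptions: the abstract does not give the formula, and the weight values, function name and handling of unknown codons are hypothetical choices made for illustration.

```python
import math
from typing import Dict


def translation_efficiency(cds: str, codon_weights: Dict[str, float]) -> float:
    """Geometric mean of codon weights over a coding sequence.

    `codon_weights` maps a codon to a positive relative-adaptiveness value
    (assumed to be derived elsewhere from tRNA gene copy numbers).
    Codons without a weight, such as stop codons, are skipped.
    """
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(codon_weights[c]) for c in codons if c in codon_weights]
    if not logs:
        raise ValueError("no codon in the sequence has a weight")
    return math.exp(sum(logs) / len(logs))


# Hypothetical weights for two alanine codons.
weights = {"GCT": 1.0, "GCC": 0.25}
score = translation_efficiency("GCTGCC", weights)
```

Averaging in log space keeps the product from underflowing on long genes. Comparing such scores for orthologous genes across species is the kind of analysis the abstract describes, although the exact weighting scheme used there is not stated.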
Integrated Module and Gene-Specific Regulatory Inference Implicates Upstream Signaling Networks
Roy, Sushmita; Lagree, Stephen; Hou, Zhonggang; Thomson, James A.; Stewart, Ron; Gasch, Audrey P.
2013-01-01
Regulatory networks that control gene expression are important in diverse biological contexts including stress response and development. Each gene's regulatory program is determined by module-level regulation (e.g. co-regulation via the same signaling system), as well as gene-specific determinants that can fine-tune expression. We present a novel approach, Modular regulatory network learning with per gene information (MERLIN), that infers regulatory programs for individual genes while probabilistically constraining these programs to reveal module-level organization of regulatory networks. Using edge-, regulator- and module-based comparisons of simulated networks of known ground truth, we find MERLIN reconstructs regulatory programs of individual genes as well or better than existing approaches of network reconstruction, while additionally identifying modular organization of the regulatory networks. We use MERLIN to dissect global transcriptional behavior in two biological contexts: yeast stress response and human embryonic stem cell differentiation. Regulatory modules inferred by MERLIN capture co-regulatory relationships between signaling proteins and downstream transcription factors thereby revealing the upstream signaling systems controlling transcriptional responses. The inferred networks are enriched for regulators with genetic or physical interactions, supporting the inference, and identify modules of functionally related genes bound by the same transcriptional regulators. Our method combines the strengths of per-gene and per-module methods to reveal new insights into transcriptional regulation in stress and development. PMID:24146602
Differential co-expression analysis reveals a novel prognostic gene module in ovarian cancer.
Gov, Esra; Arga, Kazim Yalcin
2017-07-10
Ovarian cancer is one of the most significant diseases among the gynecological disorders that women have suffered from over the centuries. However, disease-specific and effective biomarkers are still not available, since studies have focused on individual genes associated with ovarian cancer, ignoring the interactions and associations among the gene products. Here, ovarian cancer differential co-expression networks were reconstructed via meta-analysis of gene expression data, and co-expressed gene modules were identified in epithelial cells from ovarian tumor and healthy ovarian surface epithelial samples to propose ovarian cancer associated genes and their interactions. We propose a novel, highly interconnected, differentially co-expressed, and co-regulated gene module in ovarian cancer consisting of 84 prognostic genes. Furthermore, the specificity of the module to ovarian cancer was shown through analyses of datasets in nine other cancers. These observations underscore the importance of transcriptome based systems biomarkers research in deciphering the elusive pathophysiology of ovarian cancer, and here, we present reciprocal interplay between candidate ovarian cancer genes and their transcriptional regulatory dynamics. The corresponding gene module might provide new insights into prognosis and treatment strategies for ovarian cancer, which continues to place a significant burden on global health.
Wang, Baojun; Barahona, Mauricio; Buck, Martin
2013-01-01
Cells perceive a wide variety of cellular and environmental signals, which are often processed combinatorially to generate particular phenotypic responses. Here, we employ both single and mixed cell type populations, pre-programmed with engineered modular cell signalling and sensing circuits, as processing units to detect and integrate multiple environmental signals. Based on an engineered modular genetic AND logic gate, we report the construction of a set of scalable synthetic microbe-based biosensors comprising exchangeable sensory, signal processing and actuation modules. These cellular biosensors were engineered using distinct signalling sensory modules to precisely identify various chemical signals, and combinations thereof, with a quantitative fluorescent output. The genetic logic gate used can function as a biological filter and an amplifier to enhance the sensing selectivity and sensitivity of cell-based biosensors. In particular, an Escherichia coli consortium-based biosensor has been constructed that can detect and integrate three environmental signals (arsenic, mercury and copper ion levels) via either its native two-component signal transduction pathways or synthetic signalling sensors derived from other bacteria in combination with a cell-cell communication module. We demonstrate how a modular cell-based biosensor can be engineered predictably using exchangeable synthetic gene circuit modules to sense and integrate multiple-input signals. This study illustrates some of the key practical design principles required for the future application of these biosensors in broad environmental and healthcare areas. PMID:22981411
The NF-YC–RGL2 module integrates GA and ABA signalling to regulate seed germination in Arabidopsis
Liu, Xu; Hu, Pengwei; Huang, Mingkun; Tang, Yang; Li, Yuge; Li, Ling; Hou, Xingliang
2016-01-01
The antagonistic crosstalk between gibberellic acid (GA) and abscisic acid (ABA) plays a pivotal role in the modulation of seed germination. However, the molecular mechanism of such phytohormone interaction remains largely elusive. Here we show that three Arabidopsis NUCLEAR FACTOR-Y C (NF-YC) homologues NF-YC3, NF-YC4 and NF-YC9 redundantly modulate GA- and ABA-mediated seed germination. These NF-YCs interact with the DELLA protein RGL2, a key repressor of GA signalling. The NF-YC–RGL2 module targets ABI5, a gene encoding a core component of ABA signalling, via specific CCAAT elements and collectively regulates a set of GA- and ABA-responsive genes, thus controlling germination. These results suggest that the NF-YC–RGL2–ABI5 module integrates GA and ABA signalling pathways during seed germination. PMID:27624486
Colak, Recep; Moser, Flavia; Chu, Jeffrey Shih-Chieh; Schönhuth, Alexander; Chen, Nansheng; Ester, Martin
2010-10-25
Computational prediction of functionally related groups of genes (functional modules) from large-scale data is an important issue in computational biology. Gene expression experiments and interaction networks are well studied large-scale data sources, available for many not yet exhaustively annotated organisms. It has been well established, when analyzing these two data sources jointly, modules are often reflected by highly interconnected (dense) regions in the interaction networks whose participating genes are co-expressed. However, the tractability of the problem had remained unclear and methods by which to exhaustively search for such constellations had not been presented. We provide an algorithmic framework, referred to as Densely Connected Biclustering (DECOB), by which the aforementioned search problem becomes tractable. To benchmark the predictive power inherent to the approach, we computed all co-expressed, dense regions in physical protein and genetic interaction networks from human and yeast. An automatized filtering procedure reduces our output which results in smaller collections of modules, comparable to state-of-the-art approaches. Our results performed favorably in a fair benchmarking competition which adheres to standard criteria. We demonstrate the usefulness of an exhaustive module search, by using the unreduced output to more quickly perform GO term related function prediction tasks. We point out the advantages of our exhaustive output by predicting functional relationships using two examples. We demonstrate that the computation of all densely connected and co-expressed regions in interaction networks is an approach to module discovery of considerable value. 
Beyond confirming the well settled hypothesis that such co-expressed, densely connected interaction network regions reflect functional modules, we open up novel computational ways to comprehensively analyze the modular organization of an organism based on prevalent and largely available large-scale datasets. Software and data sets are available at http://www.sfu.ca/~ester/software/DECOB.zip.
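The module criterion described in the abstract above (a gene set that is both dense in the interaction network and co-expressed) can be illustrated with a small check for one candidate set. This is only a sketch under assumptions: the density and correlation thresholds, the use of minimum pairwise Pearson correlation as the co-expression criterion, and all gene names and values are hypothetical, and the code does not reproduce DECOB's exhaustive search.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(sxx * syy)


def density(genes, edges):
    """Fraction of gene pairs in the set that are linked in the network."""
    pairs = list(combinations(sorted(genes), 2))
    present = sum(1 for a, b in pairs if frozenset((a, b)) in edges)
    return present / len(pairs)


def is_dense_coexpressed(genes, edges, expr, min_density, min_corr):
    """True if the set is dense AND every pair is co-expressed (assumed criterion)."""
    if density(genes, edges) < min_density:
        return False
    return all(pearson(expr[a], expr[b]) >= min_corr
               for a, b in combinations(sorted(genes), 2))


# Hypothetical toy network and expression profiles (not data from the paper).
edges = {frozenset(e) for e in [("g1", "g2"), ("g2", "g3"), ("g1", "g3"), ("g1", "g4")]}
expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [0, 1, 2, 4],
    "g4": [4, 3, 2, 1],
}

print(is_dense_coexpressed({"g1", "g2", "g3"}, edges, expr, 0.8, 0.9))
print(is_dense_coexpressed({"g1", "g2", "g4"}, edges, expr, 0.8, 0.9))
```

Checking a single set is cheap; the hard part addressed by the paper is searching over all such sets, which this sketch deliberately leaves out.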
Modulation of Cholesterol-Related Gene Expression by Dietary Fiber Fractions from Edible Mushrooms.
Caz, Víctor; Gil-Ramírez, Alicia; Largo, Carlota; Tabernero, María; Santamaría, Mónica; Martín-Hernández, Roberto; Marín, Francisco R; Reglero, Guillermo; Soler-Rivas, Cristina
2015-08-26
Mushrooms are a source of dietary fiber (DF) with a cholesterol-lowering effect. However, the underlying mechanisms are poorly understood. The effect of DF-enriched fractions from three mushroom species on cholesterol-related gene expression was studied in vitro. The Pleurotus ostreatus DF fraction (PDF) was used in mouse models to assess its potential palliative or preventive effect against hypercholesterolemia. PDF induced a transcriptional response in Caco-2 cells, suggesting a possible cholesterol-lowering effect. In the palliative setting, PDF reduced hepatic triglyceride, likely because Dgat1 was downregulated. However, cholesterol-related biochemical data showed no changes and no relation with the observed transcriptional modulation. In the preventive setting, PDF modulated cholesterol-related gene expression in the liver in a manner similar to that of simvastatin and ezetimibe, although no changes in plasma and liver biochemical data were induced. Therefore, PDF may be useful for reducing hepatic triglyceride accumulation. Because it induced a molecular response in the liver similar to that of hypocholesterolemic drugs, further dose-dependent studies should be carried out.
2016-01-04
2016 (wileyonlinelibrary.com) DOI 10.1002/jat.3278 Systems toxicology of chemically induced liver and kidney injuries: histopathology-associated gene...injuries that classify 11 liver and eight kidney histopathology endpoints based on dose-dependent activation of the identified modules. We showed that...well as determine whether the injury module activation was specific to the tissue of origin (liver and kidney). The generated modules provide a link
Discovery and validation of a glioblastoma co-expressed gene module
Dunwoodie, Leland J.; Poehlman, William L.; Ficklin, Stephen P.; Feltus, Frank Alexander
2018-01-01
Tumors exhibit complex patterns of aberrant gene expression. Using a knowledge-independent, noise-reducing gene co-expression network construction software called KINC, we created multiple RNAseq-based gene co-expression networks relevant to brain and glioblastoma biology. In this report, we describe the discovery and validation of a glioblastoma-specific gene module that contains 22 co-expressed genes. The genes are upregulated in glioblastoma relative to normal brain and lower grade glioma samples; they are also hypo-methylated in glioblastoma relative to lower grade glioma tumors. Among the proneural, neural, mesenchymal, and classical glioblastoma subtypes, these genes are most-highly expressed in the mesenchymal subtype. Furthermore, high expression of these genes is associated with decreased survival across each glioblastoma subtype. These genes are of interest to glioblastoma biology and our gene interaction discovery and validation workflow can be used to discover and validate co-expressed gene modules derived from any co-expression network. PMID:29541392
Predicting disease-related proteins based on clique backbone in protein-protein interaction network.
Yang, Lei; Zhao, Xudong; Tang, Xianglong
2014-01-01
Network biology integrates different kinds of data, including physical or functional networks and disease gene sets, to interpret human disease. A clique (maximal complete subgraph) in a protein-protein interaction network is a topological module and is inherently biologically significant. A disease-related clique is possibly associated with complex diseases. Fully identifying the disease components in a clique is conducive to uncovering disease mechanisms. This paper proposes an approach for predicting disease proteins based on cliques in a protein-protein interaction network. To tolerate false positive and false negative interactions in protein networks, the clique-based method is supplemented by extending cliques and by scoring predicted disease proteins with gene ontology terms. The precision of the predicted disease proteins is verified against disease phenotypes and remains steadily above 95%. The predicted disease proteins associated with cliques can partly complement the mapping between genotype and phenotype, and provide clues for understanding the pathogenesis of serious diseases.
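As a rough illustration of the clique-based idea in the abstract above, the following sketch enumerates maximal cliques with the basic Bron-Kerbosch recursion and flags unannotated members of cliques that already contain known disease proteins. The protein names, the `min_fraction` threshold, and the candidate rule are hypothetical simplifications; the clique extension and gene ontology scoring steps mentioned in the abstract are not implemented here.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Undirected adjacency sets from an edge list."""
    adj: Dict[str, Set[str]] = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def maximal_cliques(adj: Dict[str, Set[str]]) -> List[Set[str]]:
    """Enumerate maximal cliques (Bron-Kerbosch without pivoting)."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            cliques.append(set(r))  # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return cliques


def candidate_proteins(cliques: List[Set[str]], known: Set[str],
                       min_fraction: float) -> Set[str]:
    """Unannotated members of cliques rich in known disease proteins (assumed rule)."""
    found: Set[str] = set()
    for clique in cliques:
        if len(clique & known) / len(clique) >= min_fraction:
            found |= clique - known
    return found


# Hypothetical toy interaction network; A and B play the role of known disease proteins.
adj = build_adjacency([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
cliques = maximal_cliques(adj)
print(sorted(sorted(c) for c in cliques))
print(candidate_proteins(cliques, known={"A", "B"}, min_fraction=0.5))
```

The recursion is exponential in the worst case, so for genome-scale networks a pivoting variant or an existing graph library would be the more practical choice.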
Wei, Hua; Hu, Bo; Tang, Suming; Zhao, Guojie; Guan, Yifu
2016-01-01
Small molecule metabolites and their allosterically regulated repressors play an important role in many gene expression and metabolic disorder processes. These natural sensors, though valuable as good logic switches, have rarely been employed without transcription machinery in cells. Here, two pairs of repressors, which function in opposite ways, were cloned, purified and used to control DNA replication in rolling circle amplification (RCA) in vitro. By using metabolites and repressors as inputs, RCA signals as outputs, four basic logic modules were constructed successfully. To achieve various logic computations based on these basic modules, we designed series and parallel strategies of circular templates, which can further assemble these repressor modules in an RCA platform to realize twelve two-input Boolean logic gates and a three-input logic gate. The RCA-output and RCA-assembled platform was proved to be easy and flexible for complex logic processes and might have application potential in molecular computing and synthetic biology. PMID:27869177
Walter, Ronald B; Boswell, Mikki; Chang, Jordan; Boswell, William T; Lu, Yuan; Navarro, Kaela; Walter, Sean M; Walter, Dylan J; Salinas, Raquel; Savage, Markita
2018-05-10
Evolution occurred exclusively under the full spectrum of sunlight. Conscription of narrow regions of the solar spectrum by specific photoreceptors suggests a common strategy for regulation of genetic pathways. Fluorescent light (FL) does not possess the complexity of the solar spectrum and has only been in service for about 60 years. If vertebrates evolved specific genetic responses regulated by light wavelengths representing the entire solar spectrum, there may be genetic consequences to reducing the spectral complexity of light. We utilized RNA-Seq to assess changes in the transcriptional profiles of Xiphophorus maculatus skin after exposure to FL ("cool white"), or narrow wavelength regions of light between 350 and 600 nm (i.e., 50 nm or 10 nm regions, herein termed "wavebands"). Exposure to each 50 nm waveband identified sets of genes representing discrete pathways that showed waveband specific transcriptional modulation. For example, 350-400 or 450-500 nm waveband exposures resulted in opposite regulation of gene sets marking necrosis and apoptosis (i.e., 350-400 nm; necrosis suppression, apoptosis activation, while 450-500 nm; apoptosis suppression, necrosis activation). Further investigation of specific transcriptional modulation employing successive 10 nm waveband exposures between 500 and 550 nm showed; (a) greater numbers of genes may be transcriptionally modulated after 10 nm exposures, than observed for 50 nm or FL exposures, (b) the 10 nm wavebands induced gene sets showing greater functional specificity than 50 nm or FL exposures, and (c) the genetic effects of FL are primarily due to 30 nm between 500 and 530 nm. Interestingly, many genetic pathways exhibited completely opposite transcriptional effects after different waveband exposures. 
For example, the epidermal growth factor (EGF) pathway exhibits transcriptional suppression after FL exposure, becomes highly active after 450-500 nm waveband exposure, and again, exhibits strong transcriptional suppression after exposure to the 520-530 nm waveband. Collectively, these results suggest one may manipulate transcription of specific genetic pathways in skin by exposure of the intact animal to specific wavebands of light. In addition, we identify genes transcriptionally modulated in a predictable manner by specific waveband exposures. Such genes, and their regulatory elements, may represent valuable tools for genetic engineering and gene therapy protocols.
On the role of sparseness in the evolution of modularity in gene regulatory networks
Espinosa-Soto, Carlos
2018-01-01
Modularity is a widespread property in biological systems. It implies that interactions occur mainly within groups of system elements. A modular arrangement facilitates adjustment of one module without perturbing the rest of the system. Therefore, modularity of developmental mechanisms is a major factor for evolvability, the potential to produce beneficial variation from random genetic change. Understanding how modularity evolves in gene regulatory networks, that create the distinct gene activity patterns that characterize different parts of an organism, is key to developmental and evolutionary biology. One hypothesis for the evolution of modules suggests that interactions between some sets of genes become maladaptive when selection favours additional gene activity patterns. The removal of such interactions by selection would result in the formation of modules. A second hypothesis suggests that modularity evolves in response to sparseness, the scarcity of interactions within a system. Here I simulate the evolution of gene regulatory networks and analyse diverse experimentally sustained networks to study the relationship between sparseness and modularity. My results suggest that sparseness alone is neither sufficient nor necessary to explain modularity in gene regulatory networks. However, sparseness amplifies the effects of forms of selection that, like selection for additional gene activity patterns, already produce an increase in modularity. That evolution of new gene activity patterns is frequent across evolution also supports that it is a major factor in the evolution of modularity. That sparseness is widespread across gene regulatory networks indicates that it may have facilitated the evolution of modules in a wide variety of cases. PMID:29775459
CLIC, a tool for expanding biological pathways based on co-expression across thousands of datasets
Li, Yang; Liu, Jun S.; Mootha, Vamsi K.
2017-01-01
In recent years, there has been a huge rise in the number of publicly available transcriptional profiling datasets. These massive compendia comprise billions of measurements and provide a special opportunity to predict the function of unstudied genes based on co-expression to well-studied pathways. Such analyses can be very challenging, however, since biological pathways are modular and may exhibit co-expression only in specific contexts. To overcome these challenges we introduce CLIC, CLustering by Inferred Co-expression. CLIC accepts as input a pathway consisting of two or more genes. It then uses a Bayesian partition model to simultaneously partition the input gene set into coherent co-expressed modules (CEMs), while assigning the posterior probability for each dataset in support of each CEM. CLIC then expands each CEM by scanning the transcriptome for additional co-expressed genes, quantified by an integrated log-likelihood ratio (LLR) score weighted for each dataset. As a byproduct, CLIC automatically learns the conditions (datasets) within which a CEM is operative. We implemented CLIC using a compendium of 1774 mouse microarray datasets (28628 microarrays) or 1887 human microarray datasets (45158 microarrays). CLIC analysis reveals that of 910 canonical biological pathways, 30% consist of strongly co-expressed gene modules for which new members are predicted. For example, CLIC predicts a functional connection between protein C7orf55 (FMC1) and the mitochondrial ATP synthase complex that we have experimentally validated. CLIC is freely available at www.gene-clic.org. We anticipate that CLIC will be valuable both for revealing new components of biological pathways as well as the conditions in which they are active. PMID:28719601
Modular analysis of the probabilistic genetic interaction network.
Hou, Lin; Wang, Lin; Qian, Minping; Li, Dong; Tang, Chao; Zhu, Yunping; Deng, Minghua; Li, Fangting
2011-03-15
Epistatic Miniarray Profiles (EMAP) has enabled the mapping of large-scale genetic interaction networks; however, the quantitative information gained from EMAP cannot be fully exploited since the data are usually interpreted as a discrete network based on an arbitrary hard threshold. To address such limitations, we adopted a mixture modeling procedure to construct a probabilistic genetic interaction network and then implemented a Bayesian approach to identify densely interacting modules in the probabilistic network. Mixture modeling has been demonstrated to be an effective soft-threshold technique for EMAP measures. The Bayesian approach was applied to an EMAP dataset studying the early secretory pathway in Saccharomyces cerevisiae. Twenty-seven modules were identified, and 14 of those were enriched for gold standard functional gene sets. We also conducted a detailed comparison with state-of-the-art algorithms, namely hierarchical clustering and Markov clustering. The experimental results show that the Bayesian approach outperforms the others in efficiently recovering biologically significant modules.
Blevins, Tana; Aliev, Fazil; Adkins, Amy; Hack, Laura; Bigdeli, Tim; D. van der Vaart, Andrew; Web, Bradley Todd; Bacanu, Silviu-Alin; Kalsi, Gursharan; Kendler, Kenneth S.; Miles, Michael F.; Dick, Danielle; Riley, Brien P.; Dumur, Catherine; Vladimirov, Vladimir I.
2015-01-01
Alcohol consumption is known to lead to gene expression changes in the brain. After performing weighted gene co-expression network analyses (WGCNA) on genome-wide mRNA and microRNA (miRNA) expression in Nucleus Accumbens (NAc) of subjects with alcohol dependence (AD; N = 18) and of matched controls (N = 18), six mRNA and three miRNA modules significantly correlated with AD were identified (Bonferroni-adj. p≤ 0.05). Cell-type-specific transcriptome analyses revealed two of the mRNA modules to be enriched for neuronal specific marker genes and downregulated in AD, whereas the remaining four mRNA modules were enriched for astrocyte and microglial specific marker genes and upregulated in AD. Gene set enrichment analysis demonstrated that neuronal specific modules were enriched for genes involved in oxidative phosphorylation, mitochondrial dysfunction and MAPK signaling. Glial-specific modules were predominantly enriched for genes involved in processes related to immune functions, i.e. cytokine signaling (all adj. p≤ 0.05). In mRNA and miRNA modules, 461 and 25 candidate hub genes were identified, respectively. In contrast to the expected biological functions of miRNAs, correlation analyses between mRNA and miRNA hub genes revealed a higher number of positive than negative correlations (χ2 test p≤ 0.0001). Integration of hub gene expression with genome-wide genotypic data resulted in 591 mRNA cis-eQTLs and 62 miRNA cis-eQTLs. mRNA cis-eQTLs were significantly enriched for AD diagnosis and AD symptom counts (adj. p = 0.014 and p = 0.024, respectively) in AD GWAS signals in a large, independent genetic sample from the Collaborative Study on Genetics of Alcohol (COGA). In conclusion, our study identified putative gene network hubs coordinating mRNA and miRNA co-expression changes in the NAc of AD subjects, and our genetic (cis-eQTL) analysis provides novel insights into the etiological mechanisms of AD. PMID:26381263
Robustness, Evolvability, and the Logic of Genetic Regulation
Moore, Jason H.; Wagner, Andreas
2014-01-01
In gene regulatory circuits, the expression of individual genes is commonly modulated by a set of regulating gene products, which bind to a gene’s cis-regulatory region. This region encodes an input-output function, referred to as signal-integration logic, that maps a specific combination of regulatory signals (inputs) to a particular expression state (output) of a gene. The space of all possible signal-integration functions is vast and the mapping from input to output is many-to-one: for the same set of inputs, many functions (genotypes) yield the same expression output (phenotype). Here, we exhaustively enumerate the set of signal-integration functions that yield identical gene expression patterns within a computational model of gene regulatory circuits. Our goal is to characterize the relationship between robustness and evolvability in the signal-integration space of regulatory circuits, and to understand how these properties vary between the genotypic and phenotypic scales. Among other results, we find that the distributions of genotypic robustness are skewed, such that the majority of signal-integration functions are robust to perturbation. We show that the connected set of genotypes that make up a given phenotype are constrained to specific regions of the space of all possible signal-integration functions, but that as the distance between genotypes increases, so does their capacity for unique innovations. In addition, we find that robust phenotypes are (i) evolvable, (ii) easily identified by random mutation, and (iii) mutationally biased toward other robust phenotypes. We explore the implications of these latter observations for mutation-based evolution by conducting random walks between randomly chosen source and target phenotypes. We demonstrate that the time required to identify the target phenotype is independent of the properties of the source phenotype. PMID:23373974
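The many-to-one mapping from signal-integration functions to expression outputs can be made concrete with a toy enumeration. The sketch below treats each Boolean function of k inputs as a truth table (a "genotype") and, as an assumption made purely for illustration, defines the "phenotype" as the outputs on a fixed subset of input combinations that actually occur. Robustness is then the fraction of single-entry mutations that leave the phenotype unchanged. The circuit model used in the paper is richer than this.

```python
from collections import defaultdict
from itertools import product


def all_functions(k):
    """All Boolean functions of k inputs, as truth tables with 2**k entries."""
    return list(product((0, 1), repeat=2 ** k))


def phenotype(table, observed_rows):
    """Outputs on the input combinations assumed to occur (toy definition)."""
    return tuple(table[i] for i in observed_rows)


def robustness(table, observed_rows):
    """Fraction of single-entry flips that preserve the phenotype."""
    original = phenotype(table, observed_rows)
    preserved = 0
    for i in range(len(table)):
        mutant = list(table)
        mutant[i] ^= 1  # flip one truth-table entry
        if phenotype(mutant, observed_rows) == original:
            preserved += 1
    return preserved / len(table)


k = 2
observed_rows = (0, 3)  # hypothetical: only inputs 00 and 11 ever occur
functions = all_functions(k)

# Group genotypes by phenotype to expose the many-to-one mapping.
by_phenotype = defaultdict(list)
for table in functions:
    by_phenotype[phenotype(table, observed_rows)].append(table)

print(len(functions), "genotypes map onto", len(by_phenotype), "phenotypes")
print("robustness of AND gate:", robustness((0, 0, 0, 1), observed_rows))
```

In this toy setting every genotype has the same robustness because the unobserved rows are always neutral; skewed robustness distributions of the kind reported in the abstract require a model in which functions interact within a circuit.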
Wang, Haiying; Zheng, Huiru; Browne, Fiona; Roehe, Rainer; Dewhurst, Richard J; Engel, Felix; Hemmje, Matthias; Lu, Xiangwu; Walsh, Paul
2017-07-15
Methane is one of the major contributors to global warming. The rumen microbiota is directly involved in methane production in cattle. The link between variation in rumen microbial communities and host genetics has important applications and implications in bioscience. Having the potential to reveal the full extent of microbial gene diversity and complex microbial interactions, integrated metagenomics and network analysis holds great promise in this endeavour. This study investigates the rumen microbial community in cattle through the integration of metagenomic and network-based approaches. Based on the relative abundance of 1570 microbial genes identified in a metagenomics analysis, the co-abundance network was constructed and functional modules of microbial genes were identified. One of the main contributions is to develop a random matrix theory-based approach to automatically determining the correlation threshold used to construct the co-abundance network. The resulting network, consisting of 549 microbial genes and 3349 connections, exhibits a clear modular structure with certain trait-specific genes highly over-represented in modules. More specifically, all the 20 genes previously identified to be associated with methane emissions are found in a module (hypergeometric test, p < 10^-11). One third of genes are involved in methane metabolism pathways. The further examination of abundance profiles across 8 samples of genes highlights that the revealed pattern of metagenomics abundance has a strong association with methane emissions. Furthermore, the module is significantly enriched with microbial genes encoding enzymes that are directly involved in methanogenesis (hypergeometric test, p < 10^-9). Copyright © 2017 Elsevier Inc. All rights reserved.
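The hypergeometric enrichment test cited in the abstract above asks how likely it is that a module of a given size would contain at least the observed number of trait-associated genes by chance. A minimal sketch follows; the network size (549) and the number of methane-associated genes (20) come from the abstract, while the module size of 60 is a hypothetical value chosen only for illustration, since the abstract does not state it.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return favourable / comb(population, draws)


network_genes = 549   # from the abstract
trait_genes = 20      # from the abstract
module_size = 60      # ASSUMPTION: not given in the abstract

p_value = hypergeom_upper_tail(network_genes, trait_genes, module_size, trait_genes)
print(f"p = {p_value:.3g}")
```

Exact integer arithmetic with `math.comb` avoids the underflow that a naive product of floating-point probabilities could hit for tail probabilities this small.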
Three-layered polyplex as a microRNA targeted delivery system for breast cancer gene therapy
NASA Astrophysics Data System (ADS)
Li, Yan; Dai, Yu; Zhang, Xiaojin; Chen, Jihua
2017-07-01
MicroRNAs (miRNAs), small non-coding RNAs, play an important role in modulating cell proliferation, migration, and differentiation. Since miRNAs can regulate multiple cancer-related genes simultaneously, regulating miRNAs could target a set of related oncogenic genes or pathways. Owing to their reduced immune response and low toxicity, miRNAs with small size and low molecular weight have become increasingly promising therapeutic drugs in cancer therapy. However, one of the major challenges of miRNAs-based cancer therapy is to achieve specific, effective, and safe delivery of therapeutic miRNAs into cancer cells. Here we provide a strategy using three-layered polyplex with folic acid as a targeting group to systemically deliver miR-210 into breast cancer cells, which results in breast cancer growth being inhibited.
Tumor Trp53 status and genotype affect the bone marrow microenvironment in acute myeloid leukemia
Jacamo, Rodrigo; Davis, R. Eric; Ling, Xiaoyang; Sonnylal, Sonali; Wang, Zhiqiang; Ma, Wencai; Zhang, Min; Ruvolo, Peter; Ruvolo, Vivian; Wang, Rui-Yu; McQueen, Teresa; Lowe, Scott; Zuber, Johannes; Kornblau, Steven M.; Konopleva, Marina; Andreeff, Michael
2017-01-01
The genetic heterogeneity of acute myeloid leukemia (AML) and the variable responses of individual patients to therapy suggest that different AML genotypes may influence the bone marrow (BM) microenvironment in different ways. We performed gene expression profiling of bone marrow mesenchymal stromal cells (BM-MSC) isolated from normal C57BL/6 mice or mice inoculated with syngeneic murine leukemia cells carrying different human AML genotypes, developed in mice with Trp53 wild-type or null genetic backgrounds. We identified a set of genes whose expression in BM-MSC was modulated by all four AML genotypes tested. In addition, there were sets of differentially-expressed genes in AML-exposed BM-MSC that were unique to the particular AML genotype or Trp53 status. Our findings support the hypothesis that leukemia cells alter the transcriptome of surrounding BM stromal cells, in both common and genotype-specific ways. These changes are likely to be advantageous to AML cells, affecting disease progression and response to chemotherapy, and suggest opportunities for stroma-targeting therapy, including those based on AML genotype. PMID:29137349
Functional metabolomics as a tool to analyze Mediator function and structure in plants.
Davoine, Celine; Abreu, Ilka N; Khajeh, Khalil; Blomberg, Jeanette; Kidd, Brendan N; Kazan, Kemal; Schenk, Peer M; Gerber, Lorenz; Nilsson, Ove; Moritz, Thomas; Björklund, Stefan
2017-01-01
Mediator is a multiprotein transcriptional co-regulator complex composed of four modules; Head, Middle, Tail, and Kinase. It conveys signals from promoter-bound transcriptional regulators to RNA polymerase II and thus plays an essential role in eukaryotic gene regulation. We describe subunit localization and activities of Mediator in Arabidopsis through metabolome and transcriptome analyses from a set of Mediator mutants. Functional metabolomic analysis based on the metabolite profiles of Mediator mutants using multivariate statistical analysis and heat-map visualization shows that different subunit mutants display distinct metabolite profiles, which cluster according to the reported localization of the corresponding subunits in yeast. Based on these results, we suggest localization of previously unassigned plant Mediator subunits to specific modules. We also describe novel roles for individual subunits in development, and demonstrate changes in gene expression patterns and specific metabolite levels in med18 and med25, which can explain their phenotypes. We find that med18 displays levels of phytoalexins normally found in wild type plants only after exposure to pathogens. Our results indicate that different Mediator subunits are involved in specific signaling pathways that control developmental processes and tolerance to pathogen infections.
Chen, Wei; Zhao, Wenshan; Yang, Aiting; Xu, Anjian; Wang, Huan; Cong, Min; Liu, Tianhui; Wang, Ping; You, Hong
2017-12-15
Liver fibrosis, characterized by the excessive accumulation of extracellular matrix (ECM) proteins, represents the final common pathway of chronic liver inflammation. Growing evidence indicates that microRNA (miRNA) dysregulation has important implications in the different stages of liver fibrosis. However, the details of miRNA-gene regulation in this disease remain unclear. Publicly available Gene Expression Omnibus (GEO) datasets from patients with cirrhosis were extracted for integrated analysis. Differentially expressed miRNAs (DEMs) and genes (DEGs) were identified using the GEO2R web tool. Putative target genes of the DEMs were predicted using the intersection of five major algorithms: DIANA-microT, TargetScan, miRanda, PICTAR5 and miRWalk. A functional miRNA-gene regulatory network (FMGRN) was constructed based on the computational target predictions at the sequence level and the inverse expression relationships between DEMs and DEGs. The DAVID web server was used to perform KEGG pathway enrichment analysis. A functional miRNA-gene regulatory module was generated based on the biological interpretation. Internal connections among genes in the liver fibrosis-related module were determined using the String database. MiRNA-gene regulatory modules related to liver fibrosis were experimentally verified in LX-2 cells stimulated with recombinant human TGFβ1 and treated with specific miRNA inhibitors. In total, we identified 85 dysregulated miRNAs and 923 dysregulated genes in liver cirrhosis biopsy samples compared with their normal controls. All evident miRNA-gene pairs were identified and assembled into the FMGRN, which consisted of 990 regulations between 51 miRNAs and 275 genes, forming two large sub-networks that were defined as the down-network and the up-network, respectively.
KEGG pathway enrichment analysis revealed that the up-network was prominently involved in several KEGG pathways, among which "Focal adhesion", "PI3K-Akt signaling pathway" and "ECM-receptor interaction" were highly significant (adjusted p<0.001). Genes enriched in these pathways, coupled with their regulatory miRNAs, formed a functional miRNA-gene regulatory module that contains 7 miRNAs, 22 genes and 42 miRNA-gene connections. Gene interaction analysis based on the String database revealed that 8 out of 22 genes were highly clustered. Finally, we experimentally confirmed a functional regulatory module containing 5 miRNAs (miR-130b-3p, miR-148a-3p, miR-345-5p, miR-378a-3p, and miR-422a) and 6 genes (COL6A1, COL6A2, COL6A3, PIK3R3, COL1A1, CCND2) associated with liver fibrosis. Our integrated analysis of miRNA and gene expression profiles highlighted a functional miRNA-gene regulatory module associated with liver fibrosis, which, to some extent, may provide important clues to better understand the underlying pathogenesis of liver fibrosis. Copyright © 2017. Published by Elsevier B.V.
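The network construction described in this abstract (consensus of five target predictors, then an inverse expression filter) can be sketched as follows. This is a minimal illustration under stated assumptions: the miRNA names, gene names, fold changes, and the helper functions `consensus_pairs` and `inverse_pairs` are hypothetical and do not come from the study.

```python
# Hypothetical toy predictions: each tool maps to a set of (miRNA, gene) pairs.
predictions = {
    "DIANA-microT": {("miR-A", "GENE1"), ("miR-A", "GENE2"), ("miR-B", "GENE3")},
    "TargetScan": {("miR-A", "GENE1"), ("miR-B", "GENE3"), ("miR-B", "GENE4")},
    "miRanda": {("miR-A", "GENE1"), ("miR-A", "GENE2"), ("miR-B", "GENE3")},
    "PICTAR5": {("miR-A", "GENE1"), ("miR-B", "GENE3")},
    "miRWalk": {("miR-A", "GENE1"), ("miR-A", "GENE2"), ("miR-B", "GENE3")},
}

# Hypothetical log fold changes (disease vs. control) for DEMs and DEGs.
mirna_lfc = {"miR-A": -1.5, "miR-B": 2.0}
gene_lfc = {"GENE1": 1.2, "GENE3": 0.8}


def consensus_pairs(predictions):
    """Return the miRNA-gene pairs predicted by every tool."""
    return set.intersection(*predictions.values())


def inverse_pairs(pairs, mirna_lfc, gene_lfc):
    """Keep pairs whose miRNA and gene change in opposite directions."""
    kept = set()
    for mirna, gene in pairs:
        if mirna in mirna_lfc and gene in gene_lfc:
            if mirna_lfc[mirna] * gene_lfc[gene] < 0:
                kept.add((mirna, gene))
    return kept


consensus = consensus_pairs(predictions)
network = inverse_pairs(consensus, mirna_lfc, gene_lfc)
print(sorted(network))
```

Requiring agreement across all tools trades sensitivity for specificity; the sign test on fold changes is one simple way to encode an "inverse expression relationship" and may differ from the criterion the authors used.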
McClellan, Michael J.; Wood, C. David; Ojeniyi, Opeoluwa; Cooper, Tim J.; Kanhere, Aditi; Arvey, Aaron; Webb, Helen M.; Palermo, Richard D.; Harth-Hertle, Marie L.; Kempkes, Bettina; Jenner, Richard G.; West, Michelle J.
2013-01-01
Epstein-Barr virus (EBV) epigenetically reprogrammes B-lymphocytes to drive immortalization and facilitate viral persistence. Host-cell transcription is perturbed principally through the actions of EBV EBNA 2, 3A, 3B and 3C, with cellular genes deregulated by specific combinations of these EBNAs through unknown mechanisms. Comparing human genome binding by these viral transcription factors, we discovered that 25% of binding sites were shared by EBNA 2 and the EBNA 3s and were located predominantly in enhancers. Moreover, 80% of potential EBNA 3A, 3B or 3C target genes were also targeted by EBNA 2, implicating extensive interplay between EBNA 2 and 3 proteins in cellular reprogramming. Investigating shared enhancer sites neighbouring two new targets (WEE1 and CTBP2) we discovered that EBNA 3 proteins repress transcription by modulating enhancer-promoter loop formation to establish repressive chromatin hubs or prevent assembly of active hubs. Re-ChIP analysis revealed that EBNA 2 and 3 proteins do not bind simultaneously at shared sites but compete for binding thereby modulating enhancer-promoter interactions. At an EBNA 3-only intergenic enhancer site between ADAM28 and ADAMDEC1 EBNA 3C was also able to independently direct epigenetic repression of both genes through enhancer-promoter looping. Significantly, studying shared or unique EBNA 3 binding sites at WEE1, CTBP2, ITGAL (LFA-1 alpha chain), BCL2L11 (Bim) and the ADAMs, we also discovered that different sets of EBNA 3 proteins bind regulatory elements in a gene and cell-type specific manner. Binding profiles correlated with the effects of individual EBNA 3 proteins on the expression of these genes, providing a molecular basis for the targeting of different sets of cellular genes by the EBNA 3s. 
Our results therefore highlight the influence of the genomic and cellular context in determining the specificity of gene deregulation by EBV and provide a paradigm for host-cell reprogramming through modulation of enhancer-promoter interactions by viral transcription factors. PMID:24068937
Costin, Blair N.; Wolen, Aaron R.; Fitting, Sylvia; Shelton, Keith L.; Miles, Michael F.
2012-01-01
Background: Glucocorticoid hormones modulate acute and chronic behavioral and molecular responses to drugs of abuse including psychostimulants and opioids. There is growing evidence that glucocorticoids might also modulate behavioral responses to ethanol. Acute ethanol activates the HPA axis, causing release of adrenal glucocorticoid hormones. Our prior genomic studies suggest glucocorticoids play a role in regulating gene expression in the prefrontal cortex (PFC) of DBA2/J (D2) mice following acute ethanol administration. However, few studies have analyzed the role of glucocorticoid signaling in behavioral responses to acute ethanol. Such work could be significant, given the predictive value for level of response to acute ethanol in the risk for alcoholism. Methods: We studied whether the glucocorticoid receptor (GR) antagonist, RU-486, or adrenalectomy (ADX) altered male D2 mouse behavioral responses to acute (locomotor activation, anxiolysis or loss-of-righting reflex (LORR)) or repeated (sensitization) ethanol treatment. Whole genome microarray analysis and bioinformatics approaches were used to identify PFC candidate genes possibly responsible for altered behavioral responses to ethanol following ADX. Results: ADX and RU-486 both impaired acute ethanol (2 g/kg) induced locomotor activation in D2 mice without affecting basal locomotor activity. However, neither ADX nor RU-486 altered initiation of ethanol sensitization (locomotor activation or jump counts), ethanol-induced anxiolysis or LORR. ADX mice showed microarray gene expression changes in PFC that significantly overlapped with acute ethanol-responsive gene sets derived by our prior microarray studies. Q-rtPCR analysis verified that ADX decreased PFC expression of Fkbp5 while significantly increasing Gpr6 expression. In addition, high dose RU-486 pre-treatment blunted ethanol-induced Fkbp5 expression.
Conclusions: Our studies suggest that ethanol’s activation of adrenal glucocorticoid release and subsequent GR activation may partially modulate ethanol’s acute locomotor activation in male D2 mice. Furthermore, since adrenal glucocorticoid basal tone regulated PFC gene expression, including a significant set of acute ethanol-responsive genes, this suggests that glucocorticoid regulated PFC gene expression may be an important factor modulating acute behavioral responses to ethanol. PMID:22671426
Random forests-based differential analysis of gene sets for gene expression data.
Hsueh, Huey-Miin; Zhou, Da-Wei; Tsai, Chen-An
2013-04-10
In DNA microarray studies, gene-set analysis (GSA) has become the focus of gene expression data analysis. GSA utilizes the gene expression profiles of functionally related gene sets in Gene Ontology (GO) categories or a priori defined biological classes to assess the significance of gene sets associated with clinical outcomes or phenotypes. Many statistical approaches have been proposed to determine whether such functionally related gene sets are differentially expressed (enrichment and/or deletion) across variations of phenotypes. However, little attention has been given to the discriminatory power of gene sets and classification of patients. In this study, we propose a method of gene set analysis, in which gene sets are used to develop classifications of patients based on the Random Forest (RF) algorithm. The corresponding empirical p-value of an observed out-of-bag (OOB) error rate of the classifier is introduced to identify differentially expressed gene sets using an adequate resampling method. In addition, we discuss the impacts and correlations of genes within each gene set based on the measures of variable importance in the RF algorithm. Significant classifications are reported and visualized together with the underlying gene sets and their contribution to the phenotypes of interest. Numerical studies using both synthesized data and a series of publicly available gene expression data sets are conducted to evaluate the performance of the proposed methods. Compared with other hypothesis testing approaches, our proposed methods are reliable and successful in identifying enriched gene sets and in discovering the contributions of genes within a gene set. The classification results of identified gene sets can provide a valuable alternative to gene set testing to reveal the unknown, biologically relevant classes of samples or patients.
In summary, our proposed method allows one to simultaneously assess the discriminatory ability of gene sets and the importance of genes for interpretation of data in complex biological systems. The classifications of biologically defined gene sets can reveal the underlying interactions of gene sets associated with the phenotypes, and provide an insightful complement to conventional gene set analyses. Copyright © 2012 Elsevier B.V. All rights reserved.
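The idea of an empirical p-value for a classifier's error rate can be sketched with a label-permutation test. This is an illustration under loud assumptions, not the authors' implementation: the abstract does not specify the resampling scheme, so label permutation is assumed here, and a leave-one-out nearest-centroid classifier is swapped in for the Random Forest OOB error to keep the example dependency-free. All function names and data are hypothetical.

```python
import random


def loo_centroid_error(X, y):
    """Leave-one-out error of a nearest-centroid classifier.

    Stand-in for the Random Forest out-of-bag error rate.
    """
    errors = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue  # hold out sample i
            acc = sums.setdefault(label, [0.0] * len(row))
            for k, value in enumerate(row):
                acc[k] += value
            counts[label] = counts.get(label, 0) + 1

        def dist(label):
            # Squared Euclidean distance from sample i to a class centroid.
            return sum(
                (X[i][k] - sums[label][k] / counts[label]) ** 2
                for k in range(len(X[i]))
            )

        predicted = min(sums, key=dist)
        errors += predicted != y[i]
    return errors / len(X)


def empirical_p_value(X, y, error_fn, n_perm=200, seed=0):
    """Share of label permutations with an error rate <= the observed one."""
    rng = random.Random(seed)
    observed = error_fn(X, y)
    labels = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if error_fn(X, labels) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical expression values for a two-gene set in 12 samples.
X = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.1], [-0.2, 0.0], [0.2, -0.3], [0.1, 0.1],
     [5.0, 4.8], [5.2, 5.1], [4.9, 5.3], [5.1, 4.7], [4.8, 5.0], [5.3, 5.2]]
y = [0] * 6 + [1] * 6

observed_error, p_value = empirical_p_value(X, y, loo_centroid_error)
print(observed_error, p_value)
```

A gene set whose observed error is rarely matched under shuffled labels would be flagged as discriminatory; with a real Random Forest, `error_fn` would return the OOB error instead.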
Hess, Jonathan L.; Tylee, Daniel S.; Barve, Rahul; de Jong, Simone; Ophoff, Roel A.; Kumarasinghe, Nishantha; Tooney, Paul; Schall, Ulrich; Gardiner, Erin; Beveridge, Natalie Jane; Scott, Rodney J.; Yasawardene, Surangi; Perera, Antionette; Mendis, Jayan; Carr, Vaughan; Kelly, Brian; Cairns, Murray; Tsuang, Ming T.; Glatt, Stephen J.
2016-01-01
The application of microarray technology in schizophrenia research was heralded as paradigm-shifting, as it allowed for high-throughput assessment of cell and tissue function. This technology was widely adopted, initially in studies of postmortem brain tissue, and later in studies of peripheral blood. The collective body of schizophrenia microarray literature contains apparent inconsistencies between studies, with failures to replicate top hits, in part due to small sample sizes, cohort-specific effects, differences in array types, and other confounders. In an attempt to summarize existing studies of schizophrenia cases and non-related comparison subjects, we performed two mega-analyses of a combined set of microarray data from postmortem prefrontal cortices (n = 315) and from ex-vivo blood tissues (n = 578). We adjusted regression models per gene to remove non-significant covariates, providing best-estimates of transcripts dysregulated in schizophrenia. We also examined dysregulation of functionally related gene sets and gene co-expression modules, and assessed enrichment of cell types and genetic risk factors. The identities of the most significantly dysregulated genes were largely distinct for each tissue, but the findings indicated common emergent biological functions (e.g. immunity) and regulatory factors (e.g., predicted targets of transcription factors and miRNA species across tissues). Our network-based analyses converged upon similar patterns of heightened innate immune gene expression in both brain and blood in schizophrenia. We also constructed generalizable machine-learning classifiers using the blood-based microarray data. Our study provides an informative atlas for future pathophysiologic and biomarker studies of schizophrenia. PMID:27450777
Salem, Saeed; Ozcaglar, Cagri
2014-01-01
Advances in genomic technologies have enabled the accumulation of vast amounts of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous, as evidenced by their enrichment with biological process GO terms and KEGG pathways.
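The edge weight described above, a linear combination of topological and co-appearance similarity between two coexpression links, might look like the sketch below. The abstract does not define either similarity, so both are assumed here to be Jaccard indices (over the links' endpoint genes and over the datasets in which each link appears); the mixing parameter `alpha` and all names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two collections, 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_weight(link_a, link_b, datasets, alpha=0.5):
    """Linear combination of two similarities between coexpression links.

    link_a, link_b: (gene, gene) tuples, the nodes of the hybrid graph.
    datasets: maps each link to the set of datasets in which it appears.
    alpha: assumed mixing weight between the two similarity terms.
    """
    topological = jaccard(link_a, link_b)  # shared endpoint genes (assumption)
    co_appearance = jaccard(datasets[link_a], datasets[link_b])
    return alpha * topological + (1 - alpha) * co_appearance


# Hypothetical links and the dataset indices where each one was observed.
datasets = {
    ("g1", "g2"): {1, 2, 3},
    ("g2", "g3"): {2, 3, 4},
}
weight = hybrid_weight(("g1", "g2"), ("g2", "g3"), datasets)
print(round(weight, 4))
```

Links that share a gene and recur in the same datasets receive a high weight, so clustering the resulting graph groups links that are both adjacent and recurrent.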
Exploring the Transcriptome of Ciliated Cells Using In Silico Dissection of Human Tissues
Ivliev, Alexander E.; 't Hoen, Peter A. C.; van Roon-Mom, Willeke M. C.; Peters, Dorien J. M.; Sergeeva, Marina G.
2012-01-01
Cilia are cell organelles that play important roles in cell motility, sensory and developmental functions and are involved in a range of human diseases, known as ciliopathies. Here, we search for novel human genes related to cilia using a strategy that exploits the previously reported tendency of cell type-specific genes to be coexpressed in the transcriptome of complex tissues. Gene coexpression networks were constructed using the noise-resistant WGCNA algorithm in 12 publicly available microarray datasets from human tissues rich in motile cilia: airways, fallopian tubes and brain. A cilia-related coexpression module was detected in 10 out of the 12 datasets. A consensus analysis of this module's gene composition recapitulated 297 known and predicted 74 novel cilia-related genes. 82% of the novel candidates were supported by tissue-specificity expression data from GEO and/or proteomic data from the Human Protein Atlas. The novel findings included a set of genes (DCDC2, DYX1C1, KIAA0319) related to a neurological disease dyslexia suggesting their potential involvement in ciliary functions. Furthermore, we searched for differences in gene composition of the ciliary module between the tissues. A multidrug-and-toxin extrusion transporter MATE2 (SLC47A2) was found as a brain-specific central gene in the ciliary module. We confirm the localization of MATE2 in cilia by immunofluorescence staining using MDCK cells as a model. While MATE2 has previously gained attention as a pharmacologically relevant transporter, its potential relation to cilia is suggested for the first time. Taken together, our large-scale analysis of gene coexpression networks identifies novel genes related to human cell cilia. PMID:22558177
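For readers unfamiliar with WGCNA, its first step converts pairwise correlations into a weighted adjacency by soft thresholding, a_ij = |cor(x_i, x_j)|^beta in the unsigned variant. The sketch below shows only that step; the expression profiles and the choice beta = 6 are illustrative assumptions, and module detection itself (topological overlap, hierarchical clustering) is not covered.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned adjacency |cor|**beta for every pair of genes."""
    genes = sorted(profiles)
    adjacency = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            adjacency[(g, h)] = abs(pearson(profiles[g], profiles[h])) ** beta
    return adjacency


# Hypothetical expression profiles across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],    # perfectly anti-correlated with geneA
    "geneD": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with geneA
}
adjacency = soft_threshold_adjacency(profiles)
print(adjacency[("geneA", "geneB")], adjacency[("geneA", "geneD")])
```

Raising correlations to a power suppresses weak, noisy correlations while preserving a continuous edge weight, which is the property the abstract refers to as noise resistance.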
Chai, Xiaoqiang; Han, Yanan; Yang, Jian; Zhao, Xianxian; Liu, Yewang; Hou, Xugang; Tang, Yiheng; Zhao, Shirong; Li, Xiao
2016-02-01
The molecular pathogenesis of hepatitis B virus infection in humans is extremely complex and heterogeneous. To date, the molecular details have not been clearly defined despite intensive research efforts. Thus, studies of transcription and regulation during virus infection, or studies combining approaches already known to be beneficial, are needed. To identify transcriptional regulators related to hepatitis B virus infection at the gene level, we analyzed gene expression profiles from normal individuals and hepatitis B patients. First, differentially expressed genes were selected, and several of them were validated in an independent set by qRT-PCR. Differential co-expression analysis was then conducted to identify differentially co-expressed links and differentially co-expressed genes. Next, regulatory impact factors were analyzed by mapping the links to regulatory data. To gain further insight into these regulators, co-expression gene modules were identified using a threshold-based hierarchical clustering method, and a regulatory network was constructed using computer software. A total of 137,284 differentially co-expressed links and 780 differentially co-expressed genes were identified. These co-expressed genes were significantly enriched in the inflammatory response. The regulatory impact factor results revealed several crucial regulators related to hepatocellular carcinoma, as well as other high-ranking regulators. In addition, more than one hundred co-expression gene modules were identified using the clustering method. In our study, some important transcriptional regulators were identified using a computational method, which may enhance the understanding of disease mechanisms and lead to improved treatment of hepatitis B. However, further experimental studies are required to confirm these findings.
Copyright © 2015 Elsevier Masson SAS. All rights reserved.
Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation
Li, Wenyuan; Liu, Chun-Chi; Zhang, Tong; Li, Haifeng; Waterman, Michael S.; Zhou, Xianghong Jasmine
2011-01-01
The rapid accumulation of biological networks poses new challenges and calls for powerful integrative analysis tools. Most existing methods capable of simultaneously analyzing a large number of networks were primarily designed for unweighted networks, and cannot easily be extended to weighted networks. However, it is known that transforming weighted into unweighted networks by dichotomizing the edges of weighted networks with a threshold generally leads to information loss. We have developed a novel, tensor-based computational framework for mining recurrent heavy subgraphs in a large set of massive weighted networks. Specifically, we formulate the recurrent heavy subgraph identification problem as a heavy 3D subtensor discovery problem with sparse constraints. We describe an effective approach to solving this problem by designing a multi-stage, convex relaxation protocol, and a non-uniform edge sampling technique. We applied our method to 130 co-expression networks, and identified 11,394 recurrent heavy subgraphs, grouped into 2,810 families. We demonstrated that the identified subgraphs represent meaningful biological modules by validating against a large set of compiled biological knowledge bases. We also showed that the likelihood for a heavy subgraph to be meaningful increases significantly with its recurrence in multiple networks, highlighting the importance of the integrative approach to biological network analysis. Moreover, our approach based on weighted graphs detects many patterns that would be overlooked using unweighted graphs. In addition, we identified a large number of modules that occur predominately under specific phenotypes. This analysis resulted in a genome-wide mapping of gene network modules onto the phenome. Finally, by comparing module activities across many datasets, we discovered high-order dynamic cooperativeness in protein complex networks and transcriptional regulatory networks. PMID:21698123
USDA-ARS's Scientific Manuscript database
Expression of Bordetella pertussis virulence factors is activated by the BvgAS two-component system. Under modulating growth conditions BvgAS indirectly represses another set of genes through the action of BvgR, a bvg-activated protein. BvgR blocks activation of the response regulator RisA which is ...
Functional organization of the transcriptome in human brain
Oldham, Michael C; Konopka, Genevieve; Iwamoto, Kazuya; Langfelder, Peter; Kato, Tadafumi; Horvath, Steve; Geschwind, Daniel H
2009-01-01
The enormous complexity of the human brain ultimately derives from a finite set of molecular instructions encoded in the human genome. These instructions can be directly studied by exploring the organization of the brain’s transcriptome through systematic analysis of gene coexpression relationships. We analyzed gene coexpression relationships in microarray data generated from specific human brain regions and identified modules of coexpressed genes that correspond to neurons, oligodendrocytes, astrocytes and microglia. These modules provide an initial description of the transcriptional programs that distinguish the major cell classes of the human brain and indicate that cell type–specific information can be obtained from whole brain tissue without isolating homogeneous populations of cells. Other modules corresponded to additional cell types, organelles, synaptic function, gender differences and the subventricular neurogenic niche. We found that subventricular zone astrocytes, which are thought to function as neural stem cells in adults, have a distinct gene expression pattern relative to protoplasmic astrocytes. Our findings provide a new foundation for neurogenetic inquiries by revealing a robust and previously unrecognized organization to the human brain transcriptome. PMID:18849986
Crx broadly modulates the pineal transcriptome
Rovsing, Louise; Clokie, Samuel; Bustos, Diego M.; Rohde, Kristian; Coon, Steven L.; Litman, Thomas; Rath, Martin F.; Møller, Morten; Klein, David C.
2011-01-01
Cone-rod homeobox (Crx) encodes Crx, a transcription factor expressed selectively in retinal photoreceptors and pinealocytes, the major cell type of the pineal gland. Here, the influence of Crx on the mammalian pineal gland was studied by light and electron microscopy and by use of microarray and qRTPCR technology, thereby extending previous studies on selected genes (Furukawa et al. 1999). Deletion of Crx was not found to alter pineal morphology, but was found to broadly modulate the mouse pineal transcriptome, characterized by a >2-fold downregulation of 543 genes and a >2-fold upregulation of 745 genes (p < 0.05). Of these, one of the most highly upregulated (18-fold) is Hoxc4, a member of the Hox gene family, members of which are known to control gene expression cascades. During a 24-hour period, a set of 51 genes exhibited differential day/night expression in pineal glands of wild-type animals; only eight of these were also day/night expressed in the Crx−/− pineal gland. However, in the Crx−/− pineal gland 41 genes exhibit differential night/day expression that is not seen in wild-type animals. These findings indicate that Crx broadly modulates the pineal transcriptome and also influences differential night/day gene expression in this tissue. Some effects of Crx deletion on the pineal transcriptome might be mediated by Hoxc4 upregulation. PMID:21797868
GO-based functional dissimilarity of gene sets.
Díaz-Díaz, Norberto; Aguilar-Ruiz, Jesús S
2011-09-01
The Gene Ontology (GO) provides a controlled vocabulary for describing the functions of genes and can be used to evaluate the functional coherence of gene sets. Many functional coherence measures consider each pair of gene functions in a set and produce an output based on all pairwise distances. A single gene can encode multiple proteins that may differ in function. For each functionality, other proteins that exhibit the same activity may also participate. Therefore, identifying the most common function for all of the genes involved in a biological process is important in evaluating the functional similarity of groups of genes, and a quantification of functional coherence can help to clarify the role of a group of genes working together. To implement this approach to functional assessment, we present GFD (GO-based Functional Dissimilarity), a novel dissimilarity measure for evaluating groups of genes based on the most relevant functions of the whole set. The measure assigns a numerical value to the gene set for each of the three GO sub-ontologies. Results show that GFD performs robustly when applied to gene sets of known functionality (extracted from KEGG). It performs particularly well on randomly generated gene sets. An ROC analysis reveals that the performance of GFD in evaluating the functional dissimilarity of gene sets is very satisfactory. A comparative analysis against other functional measures, such as GS2 and those presented by Resnik and Wang, also demonstrates the robustness of GFD.
Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong
2016-01-01
Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher’s exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO’s usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher. PMID:26750448
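As background for the baseline that LEGO is compared against, the one-sided Fisher's exact test used in conventional ORA reduces to a hypergeometric upper-tail probability. The sketch below computes it with the standard library; the function name `ora_p_value` and the example counts are illustrative, not taken from the paper.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_selected, n_overlap):
    """P(overlap >= n_overlap) under the hypergeometric null.

    n_universe: number of genes in the background.
    n_pathway:  number of background genes annotated to the pathway.
    n_selected: size of the query gene list.
    n_overlap:  query genes that fall in the pathway.
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Illustrative counts: 10-gene universe, 5-gene pathway, 5-gene query list.
p_all_hits = ora_p_value(10, 5, 5, 5)
p_no_hits = ora_p_value(10, 5, 5, 0)
print(p_all_hits, p_no_hits)
```

Every gene contributes equally to this count, regardless of its role in the pathway or its interactions; that is the simplification the abstract says network-based gene weights are meant to address.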
Wei, Liang; Xu, Ning; Wang, Yiran; Zhou, Wei; Han, Guoqiang; Ma, Yanhe; Liu, Jun
2018-05-01
Due to the lack of efficient control elements and tools, the fine-tuning of gene expression in multi-gene metabolic pathways is still a great challenge for engineering microbial cell factories, especially for the important industrial microorganism Corynebacterium glutamicum. In this study, the promoter library-based module combination (PLMC) technology was developed to efficiently optimize the expression of genes in C. glutamicum. A random promoter library was designed to contain the putative -10 (NNTANANT) and -35 (NNGNCN) consensus motifs, and refined through a three-step screening procedure to obtain numerous genetic control elements with different strength levels, including fluorescence-activated cell sorting (FACS) screening, agar plate screening, and 96-well plate screening. Multiple conventional strategies were employed for further precise characterization of the promoter library, such as real-time quantitative PCR, sodium dodecyl sulfate polyacrylamide gel electrophoresis, FACS analysis, and the lacZ reporter system. These results suggested that the established promoter elements effectively regulated gene expression and showed varying strengths over a wide range. Subsequently, a multi-module combination technology was created based on the efficient promoter elements for combination and optimization of modules in multi-gene pathways. Using this technology, the threonine biosynthesis pathway was reconstructed and optimized by predictable tuning of the expression of five modules in C. glutamicum. The threonine titer of the optimized strain was significantly improved to 12.8 g/L, approximately 6.1-fold higher than that of the control strain. Overall, the PLMC technology presented in this study provides a rapid and effective method for combination and optimization of multi-gene pathways in C. glutamicum.
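The degenerate consensus motifs quoted above (NNTANANT for the -10 element, NNGNCN for the -35 element) can be expanded into concrete sequences or checked against candidates with a few lines of code. This sketch assumes that N stands for any of A, C, G or T; the helper names and test sequences are hypothetical, and spacer regions between the two elements are not modeled because the abstract does not describe them.

```python
import random
import re

MINUS_10 = "NNTANANT"  # consensus quoted in the abstract
MINUS_35 = "NNGNCN"    # consensus quoted in the abstract


def consensus_to_regex(consensus):
    """Compile a consensus into a regex, treating N as any base."""
    return re.compile("".join("[ACGT]" if c == "N" else c for c in consensus))


def random_instance(consensus, rng):
    """Fill each N with a random base to produce one library member."""
    return "".join(rng.choice("ACGT") if c == "N" else c for c in consensus)


def matches(consensus, sequence):
    """True if the whole sequence fits the degenerate consensus."""
    return consensus_to_regex(consensus).fullmatch(sequence) is not None


rng = random.Random(42)  # fixed seed so the sketch is reproducible
library = [random_instance(MINUS_10, rng) for _ in range(5)]
print(library)
print(matches(MINUS_10, "GGTATAAT"), matches(MINUS_35, "TTGACA"))
```

With five free positions the -10 consensus admits 4^5 = 1024 variants and the -35 consensus 4^4 = 256, which gives a sense of the sequence space a randomized library of this design samples.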
Co-regulation analysis of co-expressed modules under cold and pathogen stress conditions in tomato.
Abedini, Davar; Rashidi Monfared, Sajad
2018-06-01
A primary mechanism for controlling the development of multicellular organisms is transcriptional regulation, which is carried out by transcription factors (TFs) that recognize and bind to their binding sites in the promoter region. The distance from the translation start site, and the order, orientation, and spacing of cis elements, are key factors in the concentration of active nuclear TFs and the transcriptional regulation of target genes. In this study, overrepresented motifs in cold- and pathogenesis-responsive genes were scanned using the Gibbs sampling method, which detects overrepresented motifs by means of a stochastic optimization strategy that searches over all possible sets of short DNA segments. The identified motifs were then checked against the TRANSFAC, PLACE and Soft Berry databases in order to identify putative TFs that interact with the motifs. Several cis/trans regulatory elements were found using these databases. Moreover, cross-talk between cold- and pathogenesis-responsive genes was confirmed. Statistical analysis was used to determine the distribution of the identified motifs in the promoter region. In addition, the co-regulation analysis showed that genes in the pathogenesis-responsive module are divided into two main groups. The promoter region was also divided into six subareas in order to map the distribution pattern of motifs across promoter subareas. The results showed that the majority of motifs are concentrated within 700 nucleotides upstream of the translational start site (ATG). In contrast, this was not the case in the other group: there was no difference between the total and compartmentalized regions in cold-responsive genes.
ERIC Educational Resources Information Center
Boogaert, John
This competency-based preservice home economics teacher education module on resources for the economically depressed area family is the third in a set of three modules on human development in economically depressed areas. (This set is part of a larger set of sixty-seven modules on the Management Approach to Teaching Consumer and Homemaking…
ERIC Educational Resources Information Center
Hennings, Patricia
This competency-based preservice home economics teacher education module on maintenance procedures for surfaces and appliances is the sixth in a set of six modules on consumer education related to housing. (This set is part of a larger set of sixty-seven modules on the Management Approach to Teaching Consumer and Homemaking Education [MATCHE]--see…
Thimgan, Matthew S; Seugnet, Laurent; Turk, John; Shaw, Paul J
2015-05-01
Flies mutant for the canonical clock protein cycle (cyc(01)) exhibit a sleep rebound that is ∼10 times larger than wild-type flies and die after only 10 h of sleep deprivation. Surprisingly, when starved, cyc(01) mutants can remain awake for 28 h without demonstrating negative outcomes. Thus, we hypothesized that identifying transcripts that are differentially regulated between waking induced by sleep deprivation and waking induced by starvation would identify genes that underlie the deleterious effects of sleep deprivation and/or protect flies from the negative consequences of waking. We used partial complementary DNA microarrays to identify transcripts that are differentially expressed between cyc(01) mutants that had been sleep deprived or starved for 7 h. We then used genetics to determine whether disrupting genes involved in lipid metabolism would exhibit alterations in their response to sleep deprivation. Laboratory. Drosophila melanogaster. Sleep deprivation and starvation. We identified 84 genes with transcript levels that were differentially modulated by 7 h of sleep deprivation and starvation in cyc(01) mutants and were confirmed in independent samples using quantitative polymerase chain reaction. Several of these genes were predicted to be lipid metabolism genes, including bubblegum, cueball, and CG4500, which based on our data we have renamed heimdall (hll). Using lipidomics we confirmed that knockdown of hll using RNA interference significantly decreased lipid stores. Importantly, genetically modifying bubblegum, cueball, or hll resulted in sleep rebound alterations following sleep deprivation compared to genetic background controls. We have identified a set of genes that may confer resilience/vulnerability to sleep deprivation and demonstrate that genes involved in lipid metabolism modulate sleep homeostasis. © 2015 Associated Professional Sleep Societies, LLC.
Suresh, Rahul; Li, Xing; Chiriac, Anca; Goel, Kashish; Terzic, Andre; Perez-Terzic, Carmen; Nelson, Timothy J
2014-09-01
Whole-genome gene expression analysis has been successfully utilized to diagnose, prognosticate, and identify potential therapeutic targets for high-risk cardiovascular diseases. However, the feasibility of this approach for identifying outcome-related genes and dysregulated pathways following first-time acute myocardial infarction (AMI) remains unknown; it may offer a novel strategy to detect affected expressome networks that predict long-term outcome. Whole-genome expression microarray analysis of blood samples from controls with normal cardiac function (n=21) and first-time AMI patients (n=31) within 48 hours post-MI revealed expected differential gene expression profiles enriched for inflammation and immune-response pathways. To determine molecular signatures at the time of AMI associated with long-term outcomes, transcriptional profiles from sub-groups of AMI patients with (n=5) or without (n=22) any recurrent events over an 18-month follow-up were compared. This analysis identified 559 differentially expressed genes. Bioinformatic analysis of this differential gene set for associated pathways revealed that 1) increasing disease severity in AMI patients is associated with decreased expression of genes involved in the developmental epithelial-to-mesenchymal transition pathway, and 2) modulation of cholesterol transport genes, including ABCA1, CETP, APOA1, and LDLR, is associated with clinical outcome. Differentially regulated genes and modulated pathways were identified that were associated with recurrent cardiovascular outcomes in first-time AMI patients. This cell-based approach for risk stratification in AMI could represent a novel, non-invasive platform to anticipate modifiable pathways and therapeutic targets to optimize long-term outcome for AMI patients, and it warrants further study to determine the role of metabolic remodeling and regenerative processes required for optimal outcomes. Copyright © 2014 Elsevier Ltd. All rights reserved.
A Modularity-Based Method Reveals Mixed Modules from Chemical-Gene Heterogeneous Network
Song, Jianglong; Tang, Shihuan; Liu, Xi; Gao, Yibo; Yang, Hongjun; Lu, Peng
2015-01-01
For a multicomponent therapy, a molecular network is essential to uncover its specific mode of action from a holistic perspective. The molecular system of a Traditional Chinese Medicine (TCM) formula can be represented by a 2-class heterogeneous network (2-HN), which typically includes chemical similarities, chemical-target interactions and gene interactions. An important prerequisite for uncovering the molecular mechanism is to identify mixed modules from the complex chemical-gene heterogeneous network of a TCM formula. We thus proposed a novel method (MixMod) based on mixed modularity to detect accurate mixed modules from 2-HNs. First, we compared MixMod with the Clauset-Newman-Moore algorithm (CNM), the Markov Cluster algorithm (MCL), Infomap and Louvain on benchmark 2-HNs with known module structure. Results showed that MixMod was superior to the other methods when 2-HNs had promiscuous module structure. Then these methods were tested on a real drug-target network, in which 88 disease clusters were regarded as real modules. MixMod identified the most accurate mixed modules from the drug-target 2-HN (normalized mutual information 0.62 and classification accuracy 0.4524). Finally, MixMod was applied to the 2-HN of Buchang naoxintong capsule (BNC) and detected 49 mixed modules. Using enrichment analysis, we investigated five mixed modules that contained primary constituents of BNC intestinal absorption liquid. The findings of in vitro experiments using BNC intestinal absorption liquid were highly consistent with this analysis. Therefore, MixMod is an effective method to detect accurate mixed modules from chemical-gene heterogeneous networks and further uncover the molecular mechanism of multicomponent therapies, especially TCM formulae. PMID:25927435
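The abstract above scores detected modules against reference modules with normalized mutual information (NMI). As a minimal sketch of how such a score can be computed, the Python below compares two label assignments. The geometric-mean normalization and the function names are assumptions for illustration; the abstract does not state which NMI variant the authors used.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(true_labels, pred_labels):
    """NMI = I(U;V) / sqrt(H(U) * H(V)); this normalization is an assumption."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, pred_labels))
    count_u = Counter(true_labels)
    count_v = Counter(pred_labels)
    mutual_info = 0.0
    for (u, v), c in joint.items():
        mutual_info += (c / n) * log(c * n / (count_u[u] * count_v[v]))
    h_u, h_v = entropy(true_labels), entropy(pred_labels)
    if h_u == 0.0 or h_v == 0.0:
        # Degenerate case: at least one partition puts everything in one module.
        return 1.0 if h_u == h_v else 0.0
    return mutual_info / sqrt(h_u * h_v)


# Hypothetical toy partitions: same grouping with swapped label names.
same_grouping = normalized_mutual_information([0, 0, 1, 1], [1, 1, 0, 0])
# Hypothetical toy partitions: groupings that share no information.
unrelated = normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Because NMI depends only on how nodes are grouped, relabeling the modules does not change the score, which is why it suits comparisons between detected and reference modules.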
CORECLUST: identification of the conserved CRM grammar together with prediction of gene regulation.
Nikulova, Anna A; Favorov, Alexander V; Sutormin, Roman A; Makeev, Vsevolod J; Mironov, Andrey A
2012-07-01
Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory 'grammar', or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method CORECLUST (COnservative REgulatory CLUster STructure) that predicts CRMs based on a set of positional weight matrices. Given regulatory regions of orthologous and/or co-regulated genes, CORECLUST constructs a CRM model by revealing the conserved rules that describe the relative location of binding sites. The constructed model may be consequently used for the genome-wide prediction of similar CRMs, and thus detection of co-regulated genes, and for the investigation of the regulatory grammar of the system. Compared with related methods, CORECLUST shows better performance at identification of CRMs conferring muscle-specific gene expression in vertebrates and early-developmental CRMs in Drosophila.
Pervasive, Coordinated Protein-Level Changes Driven by Transcript Isoform Switching during Meiosis.
Cheng, Ze; Otto, George Maxwell; Powers, Emily Nicole; Keskin, Abdurrahman; Mertins, Philipp; Carr, Steven Alfred; Jovanovic, Marko; Brar, Gloria Ann
2018-02-22
To better understand the gene regulatory mechanisms that program developmental processes, we carried out simultaneous genome-wide measurements of mRNA, translation, and protein through meiotic differentiation in budding yeast. Surprisingly, we observed that the levels of several hundred mRNAs are anti-correlated with their corresponding protein products. We show that rather than arising from canonical forms of gene regulatory control, the regulation of at least 380 such cases, or over 8% of all measured genes, involves temporally regulated switching between production of a canonical, translatable transcript and a 5' extended isoform that is not efficiently translated into protein. By this pervasive mechanism for the modulation of protein levels through a natural developmental program, a single transcription factor can coordinately activate and repress protein synthesis for distinct sets of genes. The distinction is not based on whether or not an mRNA is induced but rather on the type of transcript produced. Copyright © 2018 Elsevier Inc. All rights reserved.
Case-based retrieval framework for gene expression data.
Anaissi, Ali; Goyal, Madhu; Catchpoole, Daniel R; Braytee, Ali; Kennedy, Paul J
2015-01-01
The process of retrieving similar cases in a case-based reasoning system is considered a big challenge for gene expression data sets. The huge number of gene expression values generated by microarray technology leads to complex data sets and similarity measures for high-dimensional data are problematic. Hence, gene expression similarity measurements require numerous machine-learning and data-mining techniques, such as feature selection and dimensionality reduction, to be incorporated into the retrieval process. This article proposes a case-based retrieval framework that uses a k-nearest-neighbor classifier with a weighted-feature-based similarity to retrieve previously treated patients based on their gene expression profiles. The herein-proposed methodology is validated on several data sets: a childhood leukemia data set collected from The Children's Hospital at Westmead, as well as the Colon cancer, the National Cancer Institute (NCI), and the Prostate cancer data sets. Results obtained by the proposed framework in retrieving patients of the data sets who are similar to new patients are as follows: 96% accuracy on the childhood leukemia data set, 95% on the NCI data set, 93% on the Colon cancer data set, and 98% on the Prostate cancer data set. The designed case-based retrieval framework is an appropriate choice for retrieving previous patients who are similar to a new patient, on the basis of their gene expression data, for better diagnosis and treatment of childhood leukemia. Moreover, this framework can be applied to other gene expression data sets using some or all of its steps.
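The retrieval step described above, a k-nearest-neighbor search under a weighted-feature similarity, can be sketched as follows. This is an illustrative sketch only: the weighted Euclidean distance, the function names, and the toy profiles are assumptions, not the authors' published implementation or data.

```python
from math import sqrt


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance; a weight of 0 ignores that gene entirely."""
    return sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def retrieve_similar_cases(query, cases, weights, k=3):
    """Return the ids of the k stored cases closest to the query profile.

    cases is a list of (case_id, expression_profile) pairs.
    """
    ranked = sorted(cases, key=lambda case: weighted_distance(query, case[1], weights))
    return [case_id for case_id, _ in ranked[:k]]


# Hypothetical case base: three patients, three genes each.
cases = [
    ("P1", [1.0, 5.0, 0.2]),
    ("P2", [0.9, 1.0, 0.1]),
    ("P3", [3.0, 5.1, 0.3]),
]
# Hypothetical feature weights, e.g. produced by a feature-selection step.
weights = [1.0, 0.0, 0.5]
nearest = retrieve_similar_cases([1.0, 5.0, 0.2], cases, weights, k=2)
```

With the second gene weighted to zero, P2 ranks ahead of P3 even though P3 matches the query more closely on that gene, which illustrates how feature weights reshape the neighborhood.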
Cooperation and coexpression: How coexpression networks shift in response to multiple mutualists.
Palakurty, Sathvik X; Stinchcombe, John R; Afkhami, Michelle E
2018-04-01
A mechanistic understanding of community ecology requires tackling the nonadditive effects of multispecies interactions, a challenge that necessitates integration of ecological and molecular complexity-namely moving beyond pairwise ecological interaction studies and the "gene at a time" approach to mechanism. Here, we investigate the consequences of multispecies mutualisms for the structure and function of genomewide differential coexpression networks for the first time, using the tractable and ecologically important interaction between legume Medicago truncatula, rhizobia and mycorrhizal fungi. First, we found that genes whose expression is affected nonadditively by multiple mutualists are more highly connected in gene networks than expected by chance and had 94% greater network centrality than genes showing additive effects, suggesting that nonadditive genes may be key players in the widespread transcriptomic responses to multispecies symbioses. Second, multispecies mutualisms substantially changed coexpression network structure of 18 modules of host plant genes and 22 modules of the fungal symbionts' genes, indicating that third-party mutualists can cause significant rewiring of plant and fungal molecular networks. Third, we found that 60% of the coexpressed gene sets that explained variation in plant performance had coexpression structures that were altered by interactive effects of rhizobia and fungi. Finally, an "across-symbiosis" approach identified sets of plant and mycorrhizal genes whose coexpression structure was unique to the multiple mutualist context and suggested coupled responses across the plant-mycorrhizal interaction to rhizobial mutualists. Taken together, these results show multispecies mutualisms have substantial effects on the molecular interactions in host plants, microbes and across symbiotic boundaries. © 2018 John Wiley & Sons Ltd.
Transcriptome Analysis of Gelatin Seed Treatment as a Biostimulant of Cucumber Plant Growth
Wilson, H. T.; Xu, K.; Taylor, A. G.
2015-01-01
The beneficial effects of gelatin capsule seed treatment on enhanced plant growth and tolerance to abiotic stress have been reported in a number of crops, but the molecular mechanisms underlying such effects are poorly understood. Using an mRNA sequencing based approach, transcriptomes of one- and two-week-old cucumber plants from gelatin capsule treated and nontreated seeds were characterized. The gelatin treated plants had greater total leaf area, fresh weight, frozen weight, and nitrogen content. Pairwise comparisons of the RNA-seq data identified 620 differentially expressed genes between treated and control two-week-old plants, consistent with the timing when the growth related measurements also showed the largest differences. Using weighted gene coexpression network analysis (WGCNA), a significant coexpression gene network module containing 208 of the 620 differentially expressed genes was identified. It included 16 hub genes in the blue module, among them a NAC transcription factor, a MYB transcription factor, an amino acid transporter, an ammonium transporter, and a xenobiotic detoxifier (glutathione S-transferase). Based on the putative functions of these genes, the identification of the significant WGCNA module and the hub genes provided important insights into the molecular mechanisms of gelatin seed treatment as a biostimulant to enhance plant growth. PMID:26558288
Rotival, Maxime; Zeller, Tanja; Wild, Philipp S; Maouche, Seraya; Szymczak, Silke; Schillert, Arne; Castagné, Raphaele; Deiseroth, Arne; Proust, Carole; Brocheton, Jessy; Godefroy, Tiphaine; Perret, Claire; Germain, Marine; Eleftheriadis, Medea; Sinning, Christoph R; Schnabel, Renate B; Lubos, Edith; Lackner, Karl J; Rossmann, Heidi; Münzel, Thomas; Rendon, Augusto; Erdmann, Jeanette; Deloukas, Panos; Hengstenberg, Christian; Diemert, Patrick; Montalescot, Gilles; Ouwehand, Willem H; Samani, Nilesh J; Schunkert, Heribert; Tregouet, David-Alexandre; Ziegler, Andreas; Goodall, Alison H; Cambien, François; Tiret, Laurence; Blankenberg, Stefan
2011-12-01
One major expectation from the transcriptome in humans is to characterize the biological basis of associations identified by genome-wide association studies. So far, few cis expression quantitative trait loci (eQTLs) have been reliably related to disease susceptibility. Trans-regulating mechanisms may play a more prominent role in disease susceptibility. We analyzed 12,808 genes detected in at least 5% of circulating monocyte samples from a population-based sample of 1,490 unrelated European subjects. We applied a method for extracting expression patterns, independent component analysis, to identify sets of co-regulated genes. These patterns were then related to 675,350 SNPs to identify major trans-acting regulators. We detected three genomic regions significantly associated with co-regulated gene modules. Association of these loci with multiple expression traits was replicated in Cardiogenics, an independent study in which expression profiles of monocytes were available in 758 subjects. The locus 12q13 (lead SNP rs11171739), previously identified as a type 1 diabetes locus, was associated with a pattern including two cis eQTLs, RPS26 and SUOX, and 5 trans eQTLs, one of which (MADCAM1) is a potential candidate for mediating T1D susceptibility. The locus 12q24 (lead SNP rs653178), which has demonstrated extensive disease pleiotropy, including type 1 diabetes, hypertension, and celiac disease, was associated with a pattern strongly correlated with blood pressure level. The strongest trans eQTL in this pattern was CRIP1, a known marker of cellular proliferation in cancer. The locus 12q15 (lead SNP rs11177644) was associated with a pattern driven by two cis eQTLs, LYZ and YEATS4, and including 34 trans eQTLs, several of them tumor-related genes.
This study shows that a method exploiting the structure of co-expressions among genes can help identify genomic regions involved in trans regulation of sets of genes and can provide clues for understanding the mechanisms linking genome-wide association loci to disease.
NASA Astrophysics Data System (ADS)
Xia, Wei; Chen, Ying; Zhang, Rui; Yan, Zhuangzhi; Zhou, Xiaobo; Zhang, Bo; Gao, Xin
2018-02-01
Our objective was to identify prognostic imaging biomarkers for hepatocellular carcinoma in contrast-enhanced computed tomography (CECT) with biological interpretations by associating imaging features and gene modules. We retrospectively analyzed 371 patients who had gene expression profiles. For the 38 patients with CECT imaging data, automatic intra-tumor partitioning was performed, resulting in three spatially distinct subregions. We extracted a total of 37 quantitative imaging features describing intensity, geometry, and texture from each subregion. Imaging features were selected after robustness and redundancy analysis. Gene modules acquired from clustering were chosen for their prognostic significance. By constructing an association map between imaging features and gene modules with Spearman rank correlations, the imaging features that significantly correlated with gene modules were obtained. These features were evaluated with Cox’s proportional hazard models and Kaplan-Meier estimates to determine their prognostic capabilities for overall survival (OS). Eight imaging features were significantly correlated with prognostic gene modules, and two of them were associated with OS. Among these, the geometry feature volume fraction of the subregion, which was significantly correlated with all prognostic gene modules representing cancer-related interpretation, was predictive of OS (Cox p = 0.022, hazard ratio = 0.24). The texture feature cluster prominence in the subregion, which was correlated with the prognostic gene module representing lipid metabolism and complement activation, also had the ability to predict OS (Cox p = 0.021, hazard ratio = 0.17). Imaging features depicting the volume fraction and textural heterogeneity in subregions have the potential to be predictors of OS with interpretable biological meaning.
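The association map described above rests on Spearman rank correlations between imaging features and gene modules. The sketch below shows one way to compute a single entry of such a map in plain Python: ranks with ties averaged, followed by a Pearson correlation of the ranks. The function names and the toy values are hypothetical and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: one imaging feature and one module summary per patient.
volume_fraction = [0.10, 0.25, 0.40, 0.55]
module_score = [1.2, 2.0, 3.1, 40.0]
rho = spearman(volume_fraction, module_score)
```

Because the correlation is computed on ranks, the outlying module score of 40.0 does not weaken the association; any monotone relationship yields a coefficient of 1 in magnitude.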
Zhang, Bing; Schmoyer, Denise; Kirov, Stefan; Snoddy, Jay
2004-01-01
Background: Microarray and other high-throughput technologies are producing large sets of interesting genes that are difficult to analyze directly. Bioinformatics tools are needed to interpret the functional information in the gene sets. Results: We have created a web-based tool for data analysis and data visualization for sets of genes called GOTree Machine (GOTM). This tool was originally intended to analyze sets of co-regulated genes identified from microarray analysis but is adaptable for use with other gene sets from other high-throughput analyses. GOTree Machine generates a GOTree, a tree-like structure to navigate the Gene Ontology Directed Acyclic Graph for input gene sets. This system provides user friendly data navigation and visualization. Statistical analysis helps users to identify the most important Gene Ontology categories for the input gene sets and suggests biological areas that warrant further study. GOTree Machine is available online at . Conclusion: GOTree Machine has a broad application in functional genomic, proteomic and other high-throughput methods that generate large sets of interesting genes; its primary purpose is to help users sort for interesting patterns in gene sets. PMID:14975175
NASA Astrophysics Data System (ADS)
Scharfenberg, Franz-Josef; Bogner, Franz X.
2013-02-01
This study classified students into different cognitive load (CL) groups by means of cluster analysis based on their experienced CL in a gene technology outreach lab which has instructionally been designed with regard to CL theory. The relationships of the identified student CL clusters to learner characteristics, laboratory variables, and cognitive achievement were examined using a pre-post-follow-up design. Participants of our day-long module Genetic Fingerprinting were 409 twelfth-graders. During the module instructional phases (pre-lab, theoretical, experimental, and interpretation phases), we measured the students' mental effort (ME) as an index of CL. By clustering the students' module-phase-specific ME pattern, we found three student CL clusters which were independent of the module instructional phases, labeled as low-level, average-level, and high-level loaded clusters. Additionally, we found two student CL clusters that were each particular to a specific module phase. Their members reported especially high ME invested in one phase each: within the pre-lab phase and within the interpretation phase. Differentiating the clusters, we identified uncertainty tolerance, prior experience in experimentation, epistemic interest, and prior knowledge as relevant learner characteristics. We found relationships to cognitive achievement, but no relationships to the examined laboratory variables. Our results underscore the importance of pre-lab and interpretation phases in hands-on teaching in science education and the need for teachers to pay attention to these phases, both inside and outside of outreach laboratory learning settings.
Identifying a gene expression signature of cluster headache in blood
Eising, Else; Pelzer, Nadine; Vijfhuizen, Lisanne S.; Vries, Boukje de; Ferrari, Michel D.; ‘t Hoen, Peter A. C.; Terwindt, Gisela M.; van den Maagdenberg, Arn M. J. M.
2017-01-01
Cluster headache is a relatively rare headache disorder, typically characterized by multiple daily, short-lasting attacks of excruciating, unilateral (peri-)orbital or temporal pain associated with autonomic symptoms and restlessness. To better understand the pathophysiology of cluster headache, we used RNA sequencing to identify differentially expressed genes and pathways in whole blood of patients with episodic (n = 19) or chronic (n = 20) cluster headache in comparison with headache-free controls (n = 20). Gene expression data were analysed by gene and by module of co-expressed genes with particular attention to previously implicated disease pathways including hypocretin dysregulation. Only moderate gene expression differences were identified and no associations were found with previously reported pathogenic mechanisms. At the level of functional gene sets, associations were observed for genes involved in several brain-related mechanisms such as GABA receptor function and voltage-gated channels. In addition, genes and modules of co-expressed genes showed a role for intracellular signalling cascades, mitochondria and inflammation. Although larger study samples may be required to identify the full range of involved pathways, these results indicate a role for mitochondria, intracellular signalling and inflammation in cluster headache. PMID:28074859
A plasmid collection for PCR-based gene targeting in the filamentous ascomycete Ashbya gossypii.
Kaufmann, Andreas
2009-08-01
PCR-based gene targeting with heterologous markers is an efficient method to delete genes, generate gene fusions, and modulate gene expression. For the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe, several plasmid collections are available covering a wide range of tags and markers. For several reasons, many of these cassettes cannot be used in the filamentous ascomycete Ashbya gossypii. This article describes the construction of 93 heterologous modules for C- and N-terminal tagging and promoter replacements in A. gossypii. The performance of 12 different fluorescent tags was evaluated by monitoring their brightness, detectability, and photostability when fused to the myosin light-chain protein Mlc2. Furthermore, the thiamine-repressible S. cerevisiae THI13 promoter was established to regulate gene expression in A. gossypii. This collection will help accelerate analysis of gene function in A. gossypii and in other ascomycetes where S. cerevisiae promoter elements are functional.
INfORM: Inference of NetwOrk Response Modules.
Marwah, Veer Singh; Kinaret, Pia Anneli Sofia; Serra, Angela; Scala, Giovanni; Lauerma, Antti; Fortino, Vittorio; Greco, Dario
2018-06-15
Detecting and interpreting responsive modules from gene expression data by using network-based approaches is a common but laborious task. It often requires the application of several computational methods implemented in different software packages, forcing biologists to compile complex analytical pipelines. Here we introduce INfORM (Inference of NetwOrk Response Modules), an R shiny application that enables non-expert users to detect, evaluate and select gene modules with high statistical and biological significance. INfORM is a comprehensive tool for the identification of biologically meaningful response modules from consensus gene networks inferred by using multiple algorithms. It is accessible through an intuitive graphical user interface allowing for a level of abstraction from the computational steps. INfORM is freely available for academic use at https://github.com/Greco-Lab/INfORM. Supplementary data are available at Bioinformatics online.
Chen, Ming; Henry, Nathan; Almsaeed, Abdullah; Zhou, Xiao; Wegrzyn, Jill; Ficklin, Stephen
2017-01-01
Tripal is an open source software package for developing biological databases with a focus on genetic and genomic data. It consists of a set of core modules that deliver essential functions for loading and displaying data records and associated attributes including organisms, sequence features and genetic markers. Beyond the core modules, community members are encouraged to contribute extension modules to build on the Tripal core and to customize Tripal for individual community needs. To expand the utility of the Tripal software system, particularly for RNASeq data, we developed two new extension modules. Tripal Elasticsearch enables fast, scalable searching of the entire content of a Tripal site as well as the construction of customized advanced searches of specific data types. We demonstrate the use of this module for searching assembled transcripts by functional annotation. A second module, Tripal Analysis Expression, houses and displays records from gene expression assays such as RNA sequencing. This includes biological source materials (biomaterials), gene expression values and protocols used to generate the data. In the case of an RNASeq experiment, this would reflect the individual organisms and tissues used to produce sequencing libraries, the normalized gene expression values derived from the RNASeq data analysis and a description of the software or code used to generate the expression values. The module will load data from common flat file formats including standard NCBI Biosample XML. Data loading, display options and other configurations can be controlled by authorized users in the Drupal administrative backend. Both modules are open source, include usage documentation, and can be found in the Tripal organization’s GitHub repository. Database URL: Tripal Elasticsearch module: https://github.com/tripal/tripal_elasticsearch; Tripal Analysis Expression module: https://github.com/tripal/tripal_analysis_expression. PMID:29220446
ERIC Educational Resources Information Center
California State Univ., Fresno. Dept. of Home Economics.
This competency-based preservice home economics teacher education module on marketing practices in relation to low income clientele is the third in a set of three modules on management in economically depressed areas (EDAs). (This set is part of a larger set of sixty-seven modules on the Management Approach to Teaching Consumer and Homemaking…
ERIC Educational Resources Information Center
California State Univ., Fresno. Dept. of Home Economics.
This competency-based preservice home economics teacher education module on food availability in economically depressed areas (EDA) is the first in a set of three modules on foods and nutrition in economically depressed areas. (This set is part of a larger set of sixty-seven modules on the Management Approach to Teaching Consumer and Homemaking…
Genes and gene networks implicated in aggression related behaviour.
Malki, Karim; Pain, Oliver; Du Rietz, Ebba; Tosto, Maria Grazia; Paya-Cano, Jose; Sandnabba, Kenneth N; de Boer, Sietse; Schalkwyk, Leonard C; Sluyter, Frans
2014-10-01
Aggressive behaviour is a major cause of mortality and morbidity. Despite moderate heritability estimates, progress in identifying the genetic factors underlying aggressive behaviour has been limited. There are currently three genetic mouse models of high and low aggression created using selective breeding. This is the first study to offer a global transcriptomic characterization of the prefrontal cortex across all three genetic mouse models of aggression. A systems biology approach was applied to transcriptomic data across the three pairs of selected inbred mouse strains (Turku Aggressive (TA) and Turku Non-Aggressive (TNA); Short Attack Latency (SAL) and Long Attack Latency (LAL); and North Carolina Aggressive (NC900) and North Carolina Non-Aggressive (NC100)), providing novel insight into the neurobiological mechanisms and genetics underlying aggression. First, weighted gene co-expression network analysis (WGCNA) was performed to identify modules of highly correlated genes associated with aggression. Probe sets belonging to gene modules uncovered by WGCNA were carried forward for network analysis using Ingenuity Pathway Analysis (IPA). The RankProd non-parametric algorithm was then used to statistically evaluate expression differences across the genes belonging to modules significantly associated with aggression. IPA uncovered two pathways, involving NF-kB and MAPKs. The secondary RankProd analysis yielded 14 differentially expressed genes, some of which, such as Adrbk2, have previously been implicated in pathways associated with aggressive behaviour. The results highlighted plausible candidate genes and gene networks implicated in aggression-related behaviour.
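The first step of WGCNA mentioned above turns pairwise gene correlations into a weighted network. As a minimal sketch, the Python below builds an unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta. The value beta = 6, the gene names, and the toy expression values are assumptions for illustration; the abstract does not report the parameters used, and the later WGCNA steps (topological overlap, hierarchical clustering, module detection) are not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta.

    expression maps gene name -> list of expression values across samples.
    beta=6 is an assumed illustrative value, not taken from the abstract.
    """
    genes = sorted(expression)
    adjacency = {}
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(expression[g1], expression[g2])
            adjacency[(g1, g2)] = abs(r) ** beta
    return adjacency


# Hypothetical expression values for three genes across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
adjacency = soft_threshold_adjacency(expression)
```

Raising the correlation to a power keeps strong co-expression links close to 1 while pushing moderate correlations toward 0, so the network stays weighted instead of being cut at a hard threshold.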
Chen, X.Y.; Chen, Y.H.; Zhang, L.J.; Wang, Y.; Tong, Z.C.
2017-01-01
Osteosarcoma (OS) is the most common primary bone malignancy, but current therapies are far from effective for all patients. A better understanding of the pathological mechanism of OS may help to achieve new treatments for this tumor. Hence, the objective of this study was to investigate ego modules and pathways in OS utilizing EgoNet algorithm and pathway-related analysis, and reveal pathological mechanisms underlying OS. The EgoNet algorithm comprises four steps: constructing background protein-protein interaction (PPI) network (PPIN) based on gene expression data and PPI data; extracting differential expression network (DEN) from the background PPIN; identifying ego genes according to topological features of genes in reweighted DEN; and collecting ego modules using module search by ego gene expansion. Consequently, we obtained 5 ego modules (Modules 2, 3, 4, 5, and 6) in total. After applying the permutation test, all presented statistical significance between OS and normal controls. Finally, pathway enrichment analysis combined with Reactome pathway database was performed to investigate pathways, and Fisher's exact test was conducted to capture ego pathways for OS. The ego pathway for Module 2 was CLEC7A/inflammasome pathway, while for Module 3 a tetrasaccharide linker sequence was required for glycosaminoglycan (GAG) synthesis, and for Module 6 was the Rho GTPase cycle. Interestingly, genes in Modules 4 and 5 were enriched in the same pathway, the 2-LTR circle formation. In conclusion, the ego modules and pathways might be potential biomarkers for OS therapeutic index, and give great insight of the molecular mechanism underlying this tumor. PMID:28225867
Guo, Liyuan; Wang, Jing
2018-01-04
Here, we present the updated rSNPBase 3.0 database (http://rsnp3.psych.ac.cn), which provides human SNP-related regulatory elements, element-gene pairs and SNP-based regulatory networks. This database is the updated version of the SNP regulatory annotation databases rSNPBase and rVarBase. In comparison to the last two versions, there are both structural and data adjustments in rSNPBase 3.0: (i) The most significant new feature is the expansion of analysis scope from SNP-related regulatory elements to include regulatory element-target gene pairs (E-G pairs), so that it can provide SNP-based gene regulatory networks. (ii) The web function was modified according to data content, and a new network search module is provided in rSNPBase 3.0 in addition to the previous regulatory SNP (rSNP) search module. The two search modules support data queries for detailed information (related elements, element-gene pairs, and other extended annotations) on specific SNPs and for SNP-related graphic networks constructed from interacting transcription factors (TFs), miRNAs and genes. (iii) The types of regulatory elements were modified and enriched. To the best of our knowledge, the updated rSNPBase 3.0 is the first data tool that supports SNP functional analysis from a regulatory network perspective; it will provide both a comprehensive understanding and concrete guidance for SNP-related regulatory studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. PMID:29140525
Macrogenomic engineering via modulation of the scaling of chromatin packing density.
Almassalha, Luay M; Bauer, Greta M; Wu, Wenli; Cherkezyan, Lusik; Zhang, Di; Kendra, Alexis; Gladstein, Scott; Chandler, John E; VanDerway, David; Seagle, Brandon-Luke L; Ugolkov, Andrey; Billadeau, Daniel D; O'Halloran, Thomas V; Mazar, Andrew P; Roy, Hemant K; Szleifer, Igal; Shahabi, Shohreh; Backman, Vadim
2017-11-01
Many human diseases result from the dysregulation of the complex interactions between tens to thousands of genes. However, approaches for the transcriptional modulation of many genes simultaneously in a predictive manner are lacking. Here, through the combination of simulations, systems modelling and in vitro experiments, we provide a physical regulatory framework based on chromatin packing-density heterogeneity for modulating the genomic information space. Because transcriptional interactions are essentially chemical reactions, they depend largely on the local physical nanoenvironment. We show that the regulation of the chromatin nanoenvironment allows for the predictable modulation of global patterns in gene expression. In particular, we show that the rational modulation of chromatin density fluctuations can lead to a decrease in global transcriptional activity and intercellular transcriptional heterogeneity in cancer cells during chemotherapeutic responses to achieve near-complete cancer cell killing in vitro. Our findings represent a 'macrogenomic engineering' approach to modulating the physical structure of chromatin for whole-scale transcriptional modulation.
Yu, Yang; Li, Quan-Feng; Zhang, Jin-Ping; Zhang, Fan; Zhou, Yan-Fei; Feng, Yan-Zhao; Chen, Yue-Qin; Zhang, Yu-Chan
2017-01-01
Seed setting rate is one of the most important components of rice grain yield. To date, only a few genes regulating seed setting rate have been identified in plants. In this study, we showed that laccase-13 (OsLAC13), a member of the laccase gene family, which is known for its roles in modulating the phenylpropanoid pathway and secondary lignification in the cell wall, exerts a regulatory function in rice seed setting rate. OsLAC13 is expressed in anthers and promotes hydrogen peroxide production both in vitro and in the filaments and anther connectives. Knock-out of OsLAC13 significantly increased seed setting rate, while overexpression of this gene induced mitochondrial damage and suppressed sugar transportation in anthers, which in turn affected seed setting rate. OsLAC13 also induced H2O2 production and mitochondrial damage in root tip cells, which caused the lethal phenotype. We also showed that high abundance of OsmiR397, the suppressor of OsLAC13 mRNA, increased the seed setting rate of rice plants and restrained H2O2 accumulation in roots during oxidative stress. Our results suggest a novel regulatory role of the OsLAC13 gene in regulating seed setting rate by affecting H2O2 dynamics and mitochondrial integrity in rice.
The Role of Vitamin D in the Transcriptional Program of Human Pregnancy
Al-Garawi, Amal; Carey, Vincent J.; Chhabra, Divya; Morrow, Jarrett; Lasky-Su, Jessica; Qiu, Weiliang; Laranjo, Nancy; Litonjua, Augusto A.; Weiss, Scott T.
2016-01-01
Background Patterns of gene expression of human pregnancy are poorly understood. In a trial of vitamin D supplementation in pregnant women, peripheral blood transcriptomes were measured longitudinally on 30 women and used to characterize gene co-expression networks. Objective Studies suggest that increased maternal Vitamin D levels may reduce the risk of asthma in early life, yet the underlying mechanisms have not been examined. In this study, we used a network-based approach to examine changes in gene expression profiles during the course of normal pregnancy and evaluated their association with maternal Vitamin D levels. Design The VDAART study is a randomized clinical trial of vitamin D supplementation in pregnancy for reduction of pediatric asthma risk. The trial enrolled 881 women at 10–18 weeks of gestation. Longitudinal gene expression measures were obtained on thirty pregnant women, using RNA isolated from peripheral blood samples obtained in the first and third trimesters. Differentially expressed genes were identified using significance of analysis of microarrays (SAM), and clustered using a weighted gene co-expression network analysis (WGCNA). Gene-set enrichment was performed to identify major biological pathways. Results Comparison of transcriptional profiles between first and third trimesters of pregnancy identified 5839 significantly differentially expressed genes (FDR<0.05). Weighted gene co-expression network analysis clustered these transcripts into 14 co-expression modules of which two showed significant correlation with maternal vitamin D levels. Pathway analysis of these two modules revealed genes enriched in immune defense pathways and extracellular matrix reorganization as well as genes enriched in notch signaling and transcription factor networks. Conclusion Our data show that gene expression profiles of healthy pregnant women change during the course of pregnancy and suggest that maternal Vitamin D levels influence transcriptional profiles. 
These alterations of the maternal transcriptome may contribute to fetal immune imprinting and reduce allergic sensitization in early life. Trial Registration clinicaltrials.gov NCT00920621 PMID:27711190
An iterative network partition algorithm for accurate identification of dense network modules
Sun, Siqi; Dong, Xinran; Fu, Yao; Tian, Weidong
2012-01-01
A key step in network analysis is to partition a complex network into dense modules. Currently, modularity is one of the most popular benefit functions used to partition network modules. However, recent studies suggested that it has an inherent limitation in detecting dense network modules. In this study, we observed that despite the limitation, modularity has the advantage of preserving the primary network structure of the undetected modules. Thus, we have developed a simple iterative Network Partition (iNP) algorithm to partition a network. The iNP algorithm provides a general framework in which any modularity-based algorithm can be implemented in the network partition step. Here, we tested iNP with three modularity-based algorithms: multi-step greedy (MSG), spectral clustering and Qcut. Compared with the original three methods, iNP achieved a significant improvement in the quality of network partition in a benchmark study with simulated networks, identified more modules with significantly better enrichment of functionally related genes in both yeast protein complex network and breast cancer gene co-expression network, and discovered more cancer-specific modules in the cancer gene co-expression network. As such, iNP should have a broad application as a general method to assist in the analysis of biological networks. PMID:22121225
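As a rough illustration of the modularity benefit function mentioned in this abstract, the following Python sketch computes Newman's modularity Q for a given partition of an undirected, unweighted graph. It is not the iNP algorithm itself; the function name `modularity` and the toy graph are assumptions made for this example only.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q of a partition (illustrative helper, not iNP).

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    community: dict mapping each node to a community label.
    Q = sum over communities c of [ L_c / m - (d_c / (2m))^2 ], where
    L_c counts edges inside c, d_c is the total degree of c, m = |edges|.
    """
    m = len(edges)
    intra = defaultdict(int)       # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # summed node degrees per community
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_module = {n: "A" for n in range(6)}
q_split = modularity(edges, two_modules)
q_single = modularity(edges, one_module)
```

Splitting the toy graph into its two triangles scores higher than keeping everything in one module, which is the behaviour a modularity-maximizing partition step relies on.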
2010-01-01
One of the important challenges to post-genomic biology is relating observed phenotypic alterations to the underlying collective alterations in genes. Current inferential methods, however, invariably omit large bodies of information on the relationships between genes. We present a method that takes account of such information - expressed in terms of the topology of a correlation network - and we apply the method in the context of current procedures for gene set enrichment analysis. PMID:20187943
2014-01-01
Background Advances in genomic technologies have enabled the accumulation of vast amount of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression data, which suffers from spurious coexpression. Results We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on Human gene expression datasets show that the reported modules are functionally homogeneous as evident by their enrichment with biological process GO terms and KEGG pathways. PMID:25221624
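The edge weight described above, a linear combination of topological and co-appearance similarity between two coexpression links, can be sketched as follows. The abstract does not define either similarity, so this example assumes Jaccard overlap for both (shared endpoint genes, and shared datasets in which each link appears); the mixing parameter `alpha` and all names are likewise hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def hybrid_weight(link1, link2, datasets1, datasets2, alpha=0.5):
    """Weight of the edge between two coexpression links.

    Assumption: topological similarity = Jaccard overlap of the links'
    endpoint genes; co-appearance similarity = Jaccard overlap of the
    datasets in which each link is observed. alpha mixes the two terms.
    """
    topo = jaccard(link1, link2)
    coappearance = jaccard(datasets1, datasets2)
    return alpha * topo + (1 - alpha) * coappearance

# Hypothetical links sharing gene "B" and two of their datasets.
w = hybrid_weight(("A", "B"), ("B", "C"), {1, 2, 3}, {2, 3, 4}, alpha=0.5)
```

Clustering would then operate on the graph whose nodes are links and whose edges carry these weights; the choice of clustering method is left open here.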
MIDAS: A Modular DNA Assembly System for Synthetic Biology.
van Dolleweerd, Craig J; Kessans, Sarah A; Van de Bittner, Kyle C; Bustamante, Leyla Y; Bundela, Rudranuj; Scott, Barry; Nicholson, Matthew J; Parker, Emily J
2018-04-20
A modular and hierarchical DNA assembly platform for synthetic biology based on Golden Gate (Type IIS restriction enzyme) cloning is described. This enabling technology, termed MIDAS (for Modular Idempotent DNA Assembly System), can be used to precisely assemble multiple DNA fragments in a single reaction using a standardized assembly design. It can be used to build genes from libraries of sequence-verified, reusable parts and to assemble multiple genes in a single vector, with full user control over gene order and orientation, as well as control of the direction of growth (polarity) of the multigene assembly, a feature that allows genes to be nested between other genes or genetic elements. We describe the detailed design and use of MIDAS, exemplified by the reconstruction, in the filamentous fungus Penicillium paxilli, of the metabolic pathway for production of paspaline and paxilline, key intermediates in the biosynthesis of a range of indole diterpenes-a class of secondary metabolites produced by several species of filamentous fungi. MIDAS was used to efficiently assemble a 25.2 kb plasmid from 21 different modules (seven genes, each composed of three basic parts). By using a parts library-based system for construction of complex assemblies, and a unique set of vectors, MIDAS can provide a flexible route to assembling tailored combinations of genes and other genetic elements, thereby supporting synthetic biology applications in a wide range of expression hosts.
Comparative study on gene set and pathway topology-based enrichment methods.
Bayerlová, Michaela; Jung, Klaus; Kramer, Frank; Klemm, Florian; Bleckmann, Annalen; Beißbarth, Tim
2015-10-22
Enrichment analysis is a popular approach to identify pathways or sets of genes which are significantly enriched in the context of differentially expressed genes. The traditional gene set enrichment approach considers a pathway as a simple gene list disregarding any knowledge of gene or protein interactions. In contrast, the new group of so called pathway topology-based methods integrates the topological structure of a pathway into the analysis. We comparatively investigated gene set and pathway topology-based enrichment approaches, considering three gene set and four topological methods. These methods were compared in two extensive simulation studies and on a benchmark of 36 real datasets, providing the same pathway input data for all methods. In the benchmark data analysis both types of methods showed a comparable ability to detect enriched pathways. The first simulation study was conducted with KEGG pathways, which showed considerable gene overlaps between each other. In this study with original KEGG pathways, none of the topology-based methods outperformed the gene set approach. Therefore, a second simulation study was performed on non-overlapping pathways created by unique gene IDs. Here, methods accounting for pathway topology reached higher accuracy than the gene set methods, however their sensitivity was lower. We conducted one of the first comprehensive comparative works on evaluating gene set against pathway topology-based enrichment methods. The topological methods showed better performance in the simulation scenarios with non-overlapping pathways, however, they were not conclusively better in the other scenarios. This suggests that simple gene set approach might be sufficient to detect an enriched pathway under realistic circumstances. Nevertheless, more extensive studies and further benchmark data are needed to systematically evaluate these methods and to assess what gain and cost pathway topology information introduces into enrichment analysis. 
Both types of methods for enrichment analysis require further improvements in order to deal with the problem of pathway overlaps.
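To make the "pathway as a simple gene list" view concrete, here is a minimal Python sketch of an over-representation test based on the hypergeometric distribution, one common form of gene set enrichment. It is not necessarily one of the three gene set methods compared in the study; the function name and the toy counts are assumptions.

```python
from math import comb

def ora_pvalue(n_universe, n_pathway, n_de, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    n_universe: number of genes measured in total
    n_pathway:  number of those genes annotated to the pathway
    n_de:       number of differentially expressed (DE) genes
    n_overlap:  number of DE genes that fall in the pathway
    """
    total = comb(n_universe, n_de)
    upper = min(n_pathway, n_de)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy numbers (hypothetical): 10 genes, 4 in the pathway, 3 DE, 2 DE in pathway.
p = ora_pvalue(10, 4, 3, 2)
```

The test ignores interactions between pathway members entirely, which is the property that pathway topology-based methods try to improve on.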
A sigma factor toolbox for orthogonal gene expression in Escherichia coli
Van Brempt, Maarten; Van Nerom, Katleen; Van Hove, Bob; Maertens, Jo; De Mey, Marjan; Charlier, Daniel
2018-01-01
Abstract Synthetic genetic sensors and circuits enable programmable control over timing and conditions of gene expression and, as a result, are increasingly incorporated into the control of complex and multi-gene pathways. Size and complexity of genetic circuits are growing, but stay limited by a shortage of regulatory parts that can be used without interference. Therefore, orthogonal expression and regulation systems are needed to minimize undesired crosstalk and allow for dynamic control of separate modules. This work presents a set of orthogonal expression systems for use in Escherichia coli based on heterologous sigma factors from Bacillus subtilis that recognize specific promoter sequences. Up to four of the analyzed sigma factors can be combined to function orthogonally between each other and toward the host. Additionally, the toolbox is expanded by creating promoter libraries for three sigma factors without loss of their orthogonal nature. As this set covers a wide range of transcription initiation frequencies, it enables tuning of multiple outputs of the circuit in response to different sensory signals in an orthogonal manner. This sigma factor toolbox constitutes an interesting expansion of the synthetic biology toolbox and may contribute to the assembly of more complex synthetic genetic systems in the future. PMID:29361130
Jiang, T; Jiang, C-Y; Shu, J-H; Xu, Y-J
2017-07-10
The molecular mechanism of nasopharyngeal carcinoma (NPC) is poorly understood and effective therapeutic approaches are needed. This research aimed to excavate the attractor modules involved in the progression of NPC and provide further understanding of the underlying mechanism of NPC. Based on the gene expression data of NPC, two specific protein-protein interaction networks for NPC and control conditions were re-weighted using Pearson correlation coefficient. Then, a systematic tracking of candidate modules was conducted on the re-weighted networks via cliques algorithm, and a total of 19 and 38 modules were separately identified from NPC and control networks, respectively. Among them, 8 pairs of modules with similar gene composition were selected, and 2 attractor modules were identified via the attract method. Functional analysis indicated that these two attractor modules participate in one common bioprocess of cell division. Based on the strategy of integrating systemic module inference with the attract method, we successfully identified 2 attractor modules. These attractor modules might play important roles in the molecular pathogenesis of NPC via affecting the bioprocess of cell division in a conjunct way. Further research is needed to explore the correlations between cell division and NPC.
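The re-weighting step described above can be illustrated with a small Python sketch that assigns each protein-protein interaction edge the Pearson correlation of its two genes' expression profiles under one condition. Taking the absolute value of the correlation is an assumption of this example, as are the function names and toy data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def reweight(edges, expr):
    """Map each PPI edge to |Pearson r| of its endpoints' expression.

    expr: dict gene -> list of expression values under one condition.
    Using |r| (rather than signed r) is an assumption for illustration.
    """
    return {(u, v): abs(pearson(expr[u], expr[v])) for u, v in edges}

# Hypothetical expression profiles for three genes across four samples.
expr = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]}
weights = reweight([("g1", "g2"), ("g1", "g3")], expr)
```

Running the same function on disease and control expression matrices would give the two condition-specific networks that the module search then operates on.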
Wang, Yumei; Yin, Xiaoling; Yang, Fang
2018-02-01
Sepsis is an inflammatory-related disease, and severe sepsis would induce multiorgan dysfunction, which is the most common cause of death of patients in noncoronary intensive care units. Progression of novel therapeutic strategies has proven to be of little impact on the mortality of severe sepsis, and unfortunately, its mechanisms still remain poorly understood. In this study, we analyzed gene expression profiles of severe sepsis with failure of lung, kidney, and liver for the identification of potential biomarkers. We first downloaded the gene expression profiles from the Gene Expression Omnibus and performed preprocessing of raw microarray data sets and identification of differential expression genes (DEGs) through the R programming software; then, significantly enriched functions of DEGs in lung, kidney, and liver failure sepsis samples were obtained from the Database for Annotation, Visualization, and Integrated Discovery; finally, protein-protein interaction network was constructed for DEGs based on the STRING database, and network modules were also obtained through the MCODE cluster method. As a result, lung failure sepsis has the highest number of DEGs of 859, whereas the number of DEGs in kidney and liver failure sepsis samples is 178 and 175, respectively. In addition, 17 overlaps were obtained among the three lists of DEGs. Biological processes related to immune and inflammatory response were found to be significantly enriched in DEGs. Network and module analysis identified four gene clusters in which all or most of genes were upregulated. The expression changes of Icam1 and Socs3 were further validated through quantitative PCR analysis. This study should shed light on the development of sepsis and provide potential therapeutic targets for sepsis-induced multiorgan failure.
Between “design” and “bricolage”: Genetic networks, levels of selection, and adaptive evolution
Wilkins, Adam S.
2007-01-01
The extent to which “developmental constraints” in complex organisms restrict evolutionary directions remains contentious. Yet, other forms of internal constraint, which have received less attention, may also exist. It will be argued here that a set of partial constraints below the level of phenotypes, those involving genes and molecules, influences and channels the set of possible evolutionary trajectories. At the top-most organizational level there are the genetic network modules, whose operations directly underlie complex morphological traits. The properties of these network modules, however, have themselves been set by the evolutionary history of the component genes and their interactions. Characterization of the components, structures, and operational dynamics of specific genetic networks should lead to a better understanding not only of the morphological traits they underlie but of the biases that influence the directions of evolutionary change. Furthermore, such knowledge may permit assessment of the relative degrees of probability of short evolutionary trajectories, those on the microevolutionary scale. In effect, a “network perspective” may help transform evolutionary biology into a scientific enterprise with greater predictive capability than it has hitherto possessed. PMID:17494754
Inference of combinatorial Boolean rules of synergistic gene sets from cancer microarray datasets.
Park, Inho; Lee, Kwang H; Lee, Doheon
2010-06-15
Gene set analysis has become an important tool for the functional interpretation of high-throughput gene expression datasets. Moreover, pattern analyses based on inferred gene set activities of individual samples have shown the ability to identify more robust disease signatures than individual gene-based pattern analyses. Although a number of approaches have been proposed for gene set-based pattern analysis, the combinatorial influence of deregulated gene sets on disease phenotype classification has not been studied sufficiently. We propose a new approach for inferring combinatorial Boolean rules of gene sets for a better understanding of cancer transcriptome and cancer classification. To reduce the search space of the possible Boolean rules, we identify small groups of gene sets that synergistically contribute to the classification of samples into their corresponding phenotypic groups (such as normal and cancer). We then measure the significance of the candidate Boolean rules derived from each group of gene sets; the level of significance is based on the class entropy of the samples selected in accordance with the rules. By applying the present approach to publicly available prostate cancer datasets, we identified 72 significant Boolean rules. Finally, we discuss several identified Boolean rules, such as the rule of glutathione metabolism (down) and prostaglandin synthesis regulation (down), which are consistent with known prostate cancer biology. Scripts written in Python and R are available at http://biosoft.kaist.ac.kr/~ihpark/. The refined gene sets and the full list of the identified Boolean rules are provided in the Supplementary Material. Supplementary data are available at Bioinformatics online.
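The entropy-based scoring idea above can be sketched in Python: select the samples that satisfy a conjunctive Boolean rule over gene set states, then compute the Shannon entropy of their class labels, where lower entropy indicates a purer group. The data layout, the restriction to AND rules, and all names are assumptions for illustration; the authors' own scripts are referenced in the abstract.

```python
import math
from collections import Counter

def class_entropy(labels):
    """Shannon entropy (bits) of class labels; 0 means a pure group."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def select_by_rule(samples, rule):
    """Labels of samples whose gene set states satisfy every condition
    of a conjunctive (AND) rule such as {"glutathione": "down"}."""
    return [
        s["label"] for s in samples
        if all(s["states"].get(gs) == state for gs, state in rule.items())
    ]

# Hypothetical toy samples with inferred gene set activity states.
samples = [
    {"label": "cancer", "states": {"glutathione": "down", "prostaglandin": "down"}},
    {"label": "cancer", "states": {"glutathione": "down", "prostaglandin": "down"}},
    {"label": "normal", "states": {"glutathione": "up", "prostaglandin": "down"}},
    {"label": "normal", "states": {"glutathione": "up", "prostaglandin": "up"}},
]
rule = {"glutathione": "down", "prostaglandin": "down"}
selected = select_by_rule(samples, rule)
rule_entropy = class_entropy(selected)
baseline_entropy = class_entropy([s["label"] for s in samples])
```

In this toy case the rule selects only cancer samples, so its class entropy drops from the baseline of one bit to zero.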
Chasman, Deborah; Walters, Kevin B.; Lopes, Tiago J. S.; Eisfeld, Amie J.; Kawaoka, Yoshihiro; Roy, Sushmita
2016-01-01
Mammalian host response to pathogenic infections is controlled by a complex regulatory network connecting regulatory proteins such as transcription factors and signaling proteins to target genes. An important challenge in infectious disease research is to understand molecular similarities and differences in mammalian host response to diverse sets of pathogens. Recently, systems biology studies have produced rich collections of omic profiles measuring host response to infectious agents such as influenza viruses at multiple levels. To gain a comprehensive understanding of the regulatory network driving host response to multiple infectious agents, we integrated host transcriptomes and proteomes using a network-based approach. Our approach combines expression-based regulatory network inference, structured-sparsity based regression, and network information flow to infer putative physical regulatory programs for expression modules. We applied our approach to identify regulatory networks, modules and subnetworks that drive host response to multiple influenza infections. The inferred regulatory network and modules are significantly enriched for known pathways of immune response and implicate apoptosis, splicing, and interferon signaling processes in the differential response of viral infections of different pathogenicities. We used the learned network to prioritize regulators and study virus and time-point specific networks. RNAi-based knockdown of predicted regulators had significant impact on viral replication and include several previously unknown regulators. Taken together, our integrated analysis identified novel module level patterns that capture strain and pathogenicity-specific patterns of expression and helped identify important regulators of host response to influenza infection. PMID:27403523
NASA Astrophysics Data System (ADS)
Caniza, Horacio; Romero, Alfonso E.; Paccanaro, Alberto
2015-12-01
We introduce a MeSH-based method that accurately quantifies similarity between heritable diseases at molecular level. This method effectively brings together the existing information about diseases that is scattered across the vast corpus of biomedical literature. We prove that sets of MeSH terms provide a highly descriptive representation of heritable disease and that the structure of MeSH provides a natural way of combining individual MeSH vocabularies. We show that our measure can be used effectively in the prediction of candidate disease genes. We developed a web application to query more than 28.5 million relationships between 7,574 hereditary diseases (96% of OMIM) based on our similarity measure.
Disease networks. Uncovering disease-disease relationships through the incomplete interactome.
Menche, Jörg; Sharma, Amitabh; Kitsak, Maksim; Ghiassian, Susan Dina; Vidal, Marc; Loscalzo, Joseph; Barabási, Albert-László
2015-02-20
According to the disease module hypothesis, the cellular components associated with a disease segregate in the same neighborhood of the human interactome, the map of biologically relevant molecular interactions. Yet, given the incompleteness of the interactome and the limited knowledge of disease-associated genes, it is not obvious if the available data have sufficient coverage to map out modules associated with each disease. Here we derive mathematical conditions for the identifiability of disease modules and show that the network-based location of each disease module determines its pathobiological relationship to other diseases. For example, diseases with overlapping network modules show significant coexpression patterns, symptom similarity, and comorbidity, whereas diseases residing in separated network neighborhoods are phenotypically distinct. These tools represent an interactome-based platform to predict molecular commonalities between phenotypically related diseases, even if they do not share primary disease genes. Copyright © 2015, American Association for the Advancement of Science.
Colorimetric Detection of Ehrlichia Canis via Nucleic Acid Hybridization in Gold Nano-Colloids
Muangchuen, Ajima; Chaumpluk, Piyasak; Suriyasomboon, Annop; Ekgasit, Sanong
2014-01-01
Canine monocytic ehrlichiosis (CME) is a major tick-borne disease of dogs caused by Ehrlichia canis. Detection of this causal agent outside the laboratory using conventional methods is not effective enough. Thus an assay for E. canis detection based on the p30 outer membrane protein gene was developed. It was based on amplification of the p30 gene using loop-mediated isothermal DNA amplification (LAMP). A primer set specific to six regions within the target gene was designed and tested for sensitivity and specificity. Detection of DNA signals was based on modulation of the gold nanoparticles' surface properties and on DNA/DNA hybridization using an oligonucleotide probe. The presence of target DNA caused the gold colloid nanoparticles to aggregate, with a plasmonic color change of the gold colloids from ruby red to purple that was visible to the naked eye. All the assay steps, including DNA extraction, were completed within 90 min without relying on standard laboratory facilities. This method was very specific to the target bacteria. Its sensitivity with probe hybridization was sufficient to detect 50 copies of target DNA. This method should provide an alternative choice for point-of-care control and management of the disease. PMID:25111239
Colorimetric detection of Ehrlichia canis via nucleic acid hybridization in gold nano-colloids.
Muangchuen, Ajima; Chaumpluk, Piyasak; Suriyasomboon, Annop; Ekgasit, Sanong
2014-08-08
Canine monocytic ehrlichiosis (CME) is a major tick-borne disease of dogs caused by Ehrlichia canis. Detection of this causal agent outside the laboratory using conventional methods is not effective enough. Thus an assay for E. canis detection based on the p30 outer membrane protein gene was developed. It was based on p30 gene amplification using loop-mediated isothermal DNA amplification (LAMP). A primer set specific to six regions within the target gene was designed and tested for sensitivity and specificity. Detection of DNA signals was based on modulation of gold nanoparticles' surface properties and DNA/DNA hybridization using an oligonucleotide probe. The presence of target DNA caused the gold colloid nanoparticles to aggregate, with a plasmonic color change of the gold colloids from ruby red to purple that was visible to the naked eye. All the assay steps were completed within 90 min, including DNA extraction, without relying on standard laboratory facilities. This method was very specific to the target bacteria. Its sensitivity with probe hybridization was sufficient to detect 50 copies of target DNA. This method should provide an alternative choice for point-of-care control and management of the disease.
Warmest Global Temperature on Record on This Week @NASA – January 20, 2017
2017-01-20
NASA and the National Oceanic and Atmospheric Administration (NOAA) announced on Jan. 18 that global surface temperatures in 2016 were the warmest since modern record keeping began in 1880. The finding was based on results of independent analyses by both agencies. According to analysis by scientists at NASA’s Goddard Institute for Space Studies (GISS) in New York, 2016 is the third year in a row to set a new record for global average surface temperatures, further demonstrating a long-term warming trend. Also, Cygnus Cargo Module Arrives at KSC, Up in 30 Seconds, and Remembering Gene Cernan.
Penrod, Nadia M; Moore, Jason H
2014-02-05
The demand for novel molecularly targeted drugs will continue to rise as we move forward toward the goal of personalizing cancer treatment to the molecular signature of individual tumors. However, the identification of targets and combinations of targets that can be safely and effectively modulated is one of the greatest challenges facing the drug discovery process. A promising approach is to use biological networks to prioritize targets based on their relative positions to one another, a property that affects their ability to maintain network integrity and propagate information-flow. Here, we introduce influence networks and demonstrate how they can be used to generate influence scores as a network-based metric to rank genes as potential drug targets. We use this approach to prioritize genes as drug target candidates in a set of ER⁺ breast tumor samples collected during the course of neoadjuvant treatment with the aromatase inhibitor letrozole. We show that influential genes, those with high influence scores, tend to be essential and include a higher proportion of essential genes than those prioritized based on their position (i.e. hubs or bottlenecks) within the same network. Additionally, we show that influential genes represent novel biologically relevant drug targets for the treatment of ER⁺ breast cancers. Moreover, we demonstrate that gene influence differs between untreated tumors and residual tumors that have adapted to drug treatment. In this way, influence scores capture the context-dependent functions of genes and present the opportunity to design combination treatment strategies that take advantage of the tumor adaptation process. Influence networks efficiently find essential genes as promising drug targets and combinations of targets to inform the development of molecularly targeted drugs and their use.
2014-01-01
Background: The demand for novel molecularly targeted drugs will continue to rise as we move forward toward the goal of personalizing cancer treatment to the molecular signature of individual tumors. However, the identification of targets and combinations of targets that can be safely and effectively modulated is one of the greatest challenges facing the drug discovery process. A promising approach is to use biological networks to prioritize targets based on their relative positions to one another, a property that affects their ability to maintain network integrity and propagate information-flow. Here, we introduce influence networks and demonstrate how they can be used to generate influence scores as a network-based metric to rank genes as potential drug targets. Results: We use this approach to prioritize genes as drug target candidates in a set of ER⁺ breast tumor samples collected during the course of neoadjuvant treatment with the aromatase inhibitor letrozole. We show that influential genes, those with high influence scores, tend to be essential and include a higher proportion of essential genes than those prioritized based on their position (i.e. hubs or bottlenecks) within the same network. Additionally, we show that influential genes represent novel biologically relevant drug targets for the treatment of ER⁺ breast cancers. Moreover, we demonstrate that gene influence differs between untreated tumors and residual tumors that have adapted to drug treatment. In this way, influence scores capture the context-dependent functions of genes and present the opportunity to design combination treatment strategies that take advantage of the tumor adaptation process. Conclusions: Influence networks efficiently find essential genes as promising drug targets and combinations of targets to inform the development of molecularly targeted drugs and their use. PMID:24495353
Gorgoglione, Bartolomeo; Zahran, Eman; Taylor, Nick G H; Feist, Stephen W; Zou, Jun; Secombes, Christopher J
2016-03-01
Chemokine modulation in response to pathogens still needs to be fully characterised in fish, in view of the recently described novel chemokines present. This paper reports the first comparative study of CXC chemokine gene transcription in salmonids (brown trout), with a particular focus on the fish-specific CXC chemokines (CXCL_F). Adopting new primer sets, optimised to specifically target mRNA, an RT-qPCR gene screening was carried out. Constitutive gene expression was assessed first in six tissues from SPF brown trout. Transcription modulation was next investigated in kidney and spleen during septicaemic infection induced by an RNA virus (Viral Haemorrhagic Septicaemia virus, genotype Ia) or by a Gram-negative bacterium (Yersinia ruckeri, ser. O1/biot. 2). For each target organ, the specific pathogen burden (measured by detecting VHSV glycoprotein or Y. ruckeri 16S rRNA) and IFN-γ gene expression were analysed for their correlation with chemokine transcription. Both pathogens modulated CXC chemokine gene transcript levels, with marked up-regulation seen in some cases, and with both temporal and tissue-specific effects apparent. For example, Y. ruckeri strongly induced chemokine transcription in spleen within 24 h, whilst VHSV generally induced the largest increases at 3 d.p.i. in both tissues. This study gives clues to the role of the novel CXC chemokines, in comparison to the other known CXC chemokines in salmonids. Copyright © 2016 Elsevier Ltd. All rights reserved.
Limonciel, Alice; Moenks, Konrad; Stanzel, Sven; Truisi, Germaine L; Parmentier, Céline; Aschauer, Lydia; Wilmes, Anja; Richert, Lysiane; Hewitt, Philip; Mueller, Stefan O; Lukas, Arno; Kopp-Schneider, Annette; Leonard, Martin O; Jennings, Paul
2015-12-25
High content omic methods provide a deep insight into cellular events occurring upon chemical exposure of a cell population or tissue. However, this improvement in analytic precision is not yet matched by a thorough understanding of molecular mechanisms that would allow an optimal interpretation of these biological changes. For transcriptomics (TCX), one type of molecular effect that can already be assessed is the modulation of the transcriptional activity of a transcription factor (TF). As more ChIP-seq datasets reporting genes specifically bound by a TF become publicly available for mining, the generation of target gene lists of TFs of toxicological relevance becomes possible, based on actual protein-DNA interaction and modulation of gene expression. In this study, we generated target gene signatures for Nrf2, ATF4, XBP1, p53, HIF1a, AhR and PPAR gamma and tracked TF modulation in a large collection of in vitro TCX datasets from renal and hepatic cell models exposed to clinical nephro- and hepato-toxins. The result is a global monitoring of TF modulation with great promise as a mechanistically based tool for chemical hazard identification. Copyright © 2014 Elsevier Ltd. All rights reserved.
Ray, Sumanta; Maulik, Ujjwal
2016-12-20
Detecting perturbation in modular structure during HIV-1 disease progression is an important step towards understanding the stage-specific infection pattern of HIV-1 in human cells. In this article, we propose a novel methodology that integrates multiple sources of biological information to identify such disruption in human gene modules during different stages of HIV-1 infection. We integrate three types of biological information (gene expression, protein-protein interaction and gene ontology) into a single gene meta-module through non-negative matrix factorization (NMF). Because the identified meta-modules inherit this information, detecting their perturbation reflects changes in expression pattern, in PPI structure and in functional similarity of genes during infection progression. To integrate modules from different data sources into strong meta-modules, NMF-based clustering is used. Perturbation in meta-modular structure is identified by investigating topological and intramodular properties and ranking the meta-modules using a rank aggregation algorithm. We have also analyzed the preservation structure of significant GO terms in which the human proteins of the meta-modules participate. Moreover, we have performed an analysis to show the change in the coregulation pattern of identified transcription factors (TFs) over the HIV progression stages.
Zhang, Jinfeng; Zhao, Wenjuan; Fu, Rong; Fu, Chenglin; Wang, Lingxia; Liu, Huainian; Li, Shuangcheng; Deng, Qiming; Wang, Shiquan; Zhu, Jun; Liang, Yueyang; Li, Ping; Zheng, Aiping
2018-05-05
Rhizoctonia solani causes rice sheath blight, an important disease affecting the growth of rice (Oryza sativa L.). Attempts to control the disease have met with little success. Based on transcriptional profiling, we previously identified more than 11,947 common differentially expressed genes (TPM > 10) between the rice genotypes TeQing and Lemont. In the current study, we extended these findings by focusing on an analysis of gene co-expression in response to R. solani AG1 IA and identified gene modules within the networks through weighted gene co-expression network analysis (WGCNA). We compared the different genes assigned to each module and the biological interpretations of gene co-expression networks at early and later modules in the two rice genotypes to reveal differential responses to AG1 IA. Our results show that different changes occurred in the two rice genotypes and that the modules in the two groups contain a number of candidate genes possibly involved in pathogenesis, such as the VQ protein. Furthermore, these gene co-expression networks provide comprehensive transcriptional information regarding gene expression in rice in response to AG1 IA. The co-expression networks derived from our data offer ideas for follow-up experimentation that will help advance our understanding of the translational regulation of rice gene expression changes in response to AG1 IA.
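As an illustration of the kind of weighted co-expression network that WGCNA builds, the following is a minimal Python sketch, not the authors' pipeline. It assumes the common unsigned WGCNA adjacency, in which the absolute Pearson correlation between two genes is raised to a soft-thresholding power beta; the value beta = 6, the function names and the toy expression values are assumptions made only for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    `expr` maps a gene name to its expression values across samples.
    beta is an assumed soft-thresholding power, not a value from the abstract.
    """
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical toy data: three genes measured in four samples.
toy_expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, 3.0, 2.0, 4.0],
}
adjacency = soft_threshold_adjacency(toy_expr)
```

Raising the correlation to a power suppresses weak correlations while keeping strong ones close to 1, which is why module detection on such a network tends to group tightly co-expressed genes.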
Natural variation in PTB1 regulates rice seed setting rate by controlling pollen tube growth.
Li, Shuangcheng; Li, Wenbo; Huang, Bin; Cao, Xuemei; Zhou, Xingyu; Ye, Shumei; Li, Chengbo; Gao, Fengyan; Zou, Ting; Xie, Kailong; Ren, Yun; Ai, Peng; Tang, Yangfan; Li, Xuemei; Deng, Qiming; Wang, Shiquan; Zheng, Aiping; Zhu, Jun; Liu, Huainian; Wang, Lingxia; Li, Ping
2013-01-01
Grain number, panicle seed setting rate, panicle number and grain weight are the most important components of rice grain yield. To date, several genes related to grain weight, grain number and panicle number have been described in rice. However, no genes regulating the panicle seed setting rate have been functionally characterized. Here we show that the domestication-related POLLEN TUBE BLOCKED 1 (PTB1), a RING-type E3 ubiquitin ligase, positively regulates the rice panicle seed setting rate by promoting pollen tube growth. The natural variation in expression of PTB1 which is affected by the promoter haplotype and the environmental temperature, correlates with the rice panicle seed setting rate. Our results support the hypothesis that PTB1 is an important maternal sporophytic factor of pollen tube growth and a key modulator of the rice panicle seed setting rate. This finding has implications for the improvement of rice yield.
Identifying key genes in rheumatoid arthritis by weighted gene co-expression network analysis.
Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin
2017-08-01
This study aimed to identify rheumatoid arthritis (RA) related genes based on microarray data using the WGCNA (weighted gene co-expression network analysis) method. Two gene expression profile datasets GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls) were downloaded from Gene Expression Omnibus database. Characteristic genes were identified using metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance was defined as the average gene significance of all genes used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subjected to functional annotation and pathway enrichment analysis using Database for Annotation Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in a total of 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs, sanguinarine and papaverine, were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
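The definition of module significance as the average gene significance can be sketched in a few lines of Python. This is only an illustration under stated assumptions: gene significance is taken here to be the absolute Pearson correlation between a gene's expression and a binary disease-status vector (a common WGCNA convention, not confirmed by the abstract), and the gene names and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def gene_significance(expression, trait):
    """Assumed definition: |correlation| between expression and trait."""
    return abs(pearson(expression, trait))


def module_significance(module_expression, trait):
    """Average gene significance over all genes assigned to one module."""
    return mean(gene_significance(e, trait) for e in module_expression.values())


# Hypothetical data: 1 = RA sample, 0 = healthy control.
trait = [1, 1, 1, 0, 0, 0]
toy_module = {
    "geneA": [5.0, 6.0, 5.5, 2.0, 2.5, 1.5],  # higher in RA samples
    "geneB": [1.0, 1.5, 0.5, 4.0, 4.5, 3.5],  # lower in RA samples
}
ms = module_significance(toy_module, trait)
```

Because the absolute value is taken per gene, a module containing both up- and down-regulated genes can still score highly, which matches the idea of ranking modules by how closely they track disease status.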
Wang, Li-Xin; Li, Yang; Chen, Guan-Zhi
2018-01-01
Metastatic melanoma is an aggressive skin cancer and is one of the global malignancies with high mortality and morbidity. It is essential to identify and verify diagnostic biomarkers of early metastatic melanoma. Previous studies have systematically assessed protein biomarkers and mRNA-based expression characteristics. However, molecular markers for the early diagnosis of metastatic melanoma have not been identified. To explore potential regulatory targets, we have analyzed the gene microarray expression profiles of malignant melanoma samples by co-expression analysis based on the network approach. The differentially expressed genes (DEGs) were screened by the EdgeR package of R software. A weighted gene co-expression network analysis (WGCNA) was used for the identification of DEGs in the special gene modules and hub genes. Subsequently, a protein-protein interaction network was constructed to extract hub genes associated with gene modules. Finally, twenty-four important hub genes (RASGRP2, IKZF1, CXCR5, LTB, BLK, LINGO3, CCR6, P2RY10, RHOH, JUP, KRT14, PLA2G3, SPRR1A, KRT78, SFN, CLDN4, IL1RN, PKP3, CBLC, KRT16, TMEM79, KLK8, LYPD3 and LYPD5) were treated as valuable factors involved in the immune response and tumor cell development in tumorigenesis. In addition, a transcriptional regulatory network was constructed for these specific modules or hub genes, and a few core transcriptional regulators were found to be mostly associated with our hub genes, including GATA1, STAT1, SP1, and PSG1. In summary, our findings enhance our understanding of the biological process of malignant melanoma metastasis, enabling us to identify specific genes to use for diagnostic and prognostic markers and possibly for targeted therapy.
Malki, Karim; Tosto, Maria Grazia; Mouriño-Talín, Héctor; Rodríguez-Lorenzo, Sabela; Pain, Oliver; Jumhaboy, Irfan; Liu, Tina; Parpas, Panos; Newman, Stuart; Malykh, Artem; Carboni, Lucia; Uher, Rudolf; McGuffin, Peter; Schalkwyk, Leonard C; Bryson, Kevin; Herbster, Mark
2017-04-01
Response to antidepressant (AD) treatment may be a more polygenic trait than previously hypothesized, with many genetic variants interacting in yet unclear ways. In this study we used methods that can automatically learn to detect patterns of statistical regularity from a sparsely distributed signal across hippocampal transcriptome measurements in a large-scale animal pharmacogenomic study to uncover genomic variations associated with AD. The study used four inbred mouse strains of both sexes, two drug treatments, and a control group (escitalopram, nortriptyline, and saline). Multi-class and binary classification using Machine Learning (ML) and regularization algorithms using iterative and univariate feature selection methods, including InfoGain, mRMR, ANOVA, and Chi Square, were used to uncover genomic markers associated with AD response. Relevant genes were selected based on Jaccard distance and carried forward for gene-network analysis. Linear association methods uncovered only one gene associated with drug treatment response. The implementation of ML algorithms, together with feature reduction methods, revealed a set of 204 genes associated with SSRI and 241 genes associated with NRI response. Although only 10% of genes overlapped across the two drugs, network analysis shows that both drugs modulated the CREB pathway, through different molecular mechanisms. Through careful implementation and optimisations, the algorithms detected a weak signal used to predict whether an animal was treated with nortriptyline (77%) or escitalopram (67%) on an independent testing set. The results from this study indicate that the molecular signature of AD treatment may include a much broader range of genomic markers than previously hypothesized, suggesting that response to medication may be as complex as the pathology. The search for biomarkers of antidepressant treatment response could therefore consider a higher number of genetic markers and their interactions. 
Through predominately different molecular targets and mechanisms of action, the two drugs modulate the same Creb1 pathway which plays a key role in neurotrophic responses and in inflammatory processes. © 2016 The Authors. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics Published by Wiley Periodicals, Inc.
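The abstract above mentions selecting genes by Jaccard distance without giving details. As a hedged illustration of the measure itself, and not of the authors' selection procedure, the Python sketch below computes the Jaccard distance between two gene lists, for instance lists returned by two different feature selection methods. The gene identifiers are hypothetical.

```python
def jaccard_distance(a, b):
    """Jaccard distance: 1 - |A ∩ B| / |A ∪ B|.

    Returns 0.0 for two empty sets by convention (assumption for this sketch).
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical gene lists chosen by two feature selection methods.
selected_by_method_1 = ["g1", "g2", "g3"]
selected_by_method_2 = ["g2", "g3", "g4"]
distance = jaccard_distance(selected_by_method_1, selected_by_method_2)
```

A distance near 0 means the two methods agree on almost the same genes, while a distance near 1 means the selections barely overlap.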
Teaching bioinformatics and neuroinformatics by using free web-based tools.
Grisham, William; Schottler, Natalie A; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson
2010-01-01
This completely computer-based module's purpose is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with anatomy (Mouse Brain Library), quantitative trait locus analysis (WebQTL from GeneNetwork), bioinformatics and gene expression analyses (University of California, Santa Cruz Genome Browser, National Center for Biotechnology Information's Entrez Gene, and the Allen Brain Atlas), and information resources (PubMed). Instructors can use these various websites in concert to teach genetics from the phenotypic level to the molecular level, aspects of neuroanatomy and histology, statistics, quantitative trait locus analysis, and molecular biology (including in situ hybridization and microarray analysis), and to introduce bioinformatic resources. Students use these resources to discover 1) the region(s) of chromosome(s) influencing the phenotypic trait, 2) a list of candidate genes-narrowed by expression data, 3) the in situ pattern of a given gene in the region of interest, 4) the nucleotide sequence of the candidate gene, and 5) articles describing the gene. Teaching materials such as a detailed student/instructor's manual, PowerPoints, sample exams, and links to free Web resources can be found at http://mdcune.psych.ucla.edu/modules/bioinformatics.
Gene networks specific for innate immunity define post-traumatic stress disorder.
Breen, M S; Maihofer, A X; Glatt, S J; Tylee, D S; Chandler, S D; Tsuang, M T; Risbrough, V B; Baker, D G; O'Connor, D T; Nievergelt, C M; Woelk, C H
2015-12-01
The molecular factors involved in the development of Post-Traumatic Stress Disorder (PTSD) remain poorly understood. Previous transcriptomic studies investigating the mechanisms of PTSD have applied targeted approaches to identify individual genes under a cross-sectional framework and lack a holistic view of the behaviours and properties of these genes at the system level. Here we sought to apply an unsupervised gene-network based approach to a prospective experimental design using whole-transcriptome RNA-Seq gene expression from peripheral blood leukocytes of U.S. Marines (N=188), obtained both pre- and post-deployment to conflict zones. We identified discrete groups of co-regulated genes (i.e., co-expression modules) and tested them for association with PTSD. We identified one module at both pre- and post-deployment containing putative causal signatures for PTSD development displaying an over-expression of genes enriched for functions of innate-immune response and interferon signalling (Type-I and Type-II). Importantly, these results were replicated in a second non-overlapping independent dataset of U.S. Marines (N=96), further outlining the role of innate immune and interferon signalling genes within co-expression modules to explain at least part of the causal pathophysiology for PTSD development. A second module, consequential of trauma exposure, contained PTSD resiliency signatures and an over-expression of genes involved in hemostasis and wound responsiveness, suggesting that chronic levels of stress impair proper wound healing during/after exposure to the battlefield while highlighting the role of the hemostatic system as a clinical indicator of chronic-based stress. These findings provide novel insights for early preventative measures and advanced PTSD detection, which may lead to interventions that delay or perhaps abrogate the development of PTSD.
Jia, Peilin; Wang, Lily; Fanous, Ayman H.; Pato, Carlos N.; Edwards, Todd L.; Zhao, Zhongming
2012-01-01
With the recent success of genome-wide association studies (GWAS), a wealth of association data has accumulated for more than 200 complex diseases/traits, creating a strong demand for data integration and interpretation. A combinatory analysis of multiple GWAS datasets, or an integrative analysis of GWAS data and other high-throughput data, has been particularly promising. In this study, we proposed an integrative analysis framework of multiple GWAS datasets by overlaying association signals onto the protein-protein interaction network, and demonstrated it using schizophrenia datasets. Building on a dense module search algorithm, we first searched for significantly enriched subnetworks for schizophrenia in each single GWAS dataset and then implemented a discovery-evaluation strategy to identify module genes with consistent association signals. We validated the module genes in an independent dataset, and also examined them through meta-analysis of the related SNPs using multiple GWAS datasets. As a result, we identified 205 module genes with a joint effect significantly associated with schizophrenia; these module genes included a number of well-studied candidate genes such as DISC1, GNA12, GNA13, GNAI1, GPR17, and GRIN2B. Further functional analysis suggested these genes are involved in neuronal related processes. Additionally, meta-analysis found that 18 SNPs in 9 module genes had P_meta < 1×10⁻⁴, including the gene HLA-DQA1 located in the MHC region on chromosome 6, which was reported in previous studies using the largest cohort of schizophrenia patients to date. These results demonstrated our bi-directional network-based strategy is efficient for identifying disease-associated genes with modest signals in GWAS datasets. This approach can be applied to any other complex diseases/traits where multiple GWAS datasets are available. PMID:22792057
Analysis of functional importance of binding sites in the Drosophila gap gene network model.
Kozlov, Konstantin; Gursky, Vitaly V; Kulakovskiy, Ivan V; Dymova, Arina; Samsonova, Maria
2015-01-01
The statistical thermodynamics based approach provides a promising framework for construction of the genotype-phenotype map in many biological systems. Among important aspects of a good model connecting the DNA sequence information with that of a molecular phenotype (gene expression) is the selection of regulatory interactions and relevant transcription factor binding sites. As the model may predict different levels of the functional importance of specific binding sites in different genomic and regulatory contexts, it is essential to formulate and study such models under different modeling assumptions. We elaborate a two-layer model for the Drosophila gap gene network and include in the model a combined set of transcription factor binding sites and concentration-dependent regulatory interaction between gap genes hunchback and Kruppel. We show that the new variants of the model are more consistent in terms of gene expression predictions for various genetic constructs in comparison to previous work. We quantify the functional importance of binding sites by calculating their impact on gene expression in the model and calculate how these impacts correlate across all sites under different modeling assumptions. The assumption about the dual interaction between hb and Kr leads to the most consistent modeling results, but, on the other hand, may obscure the existence of indirect interactions between binding sites in regulatory regions of distinct genes. The analysis confirms the previously formulated regulation concept of many weak binding sites working in concert. The model predicts a more or less uniform distribution of functionally important binding sites over the sets of experimentally characterized regulatory modules and other open chromatin domains.
Graubner, Felix R.; Gram, Aykut; Kautz, Ewa; Bauersachs, Stefan; Aslan, Selim; Agaoglu, Ali R.; Boos, Alois
2017-01-01
In the dog, there is no luteolysis in the absence of pregnancy. Thus, this species lacks any anti-luteolytic endocrine signal as found in other species that modulate uterine function during the critical period of pregnancy establishment. Nevertheless, in the dog an embryo-maternal communication must occur in order to prevent rejection of embryos. Based on this hypothesis, we performed microarray analysis of canine uterine samples collected during pre-attachment phase (days 10-12) and in corresponding non-pregnant controls, in order to elucidate the embryo attachment signal. An additional goal was to identify differences in uterine responses to pre-attachment embryos between dogs and other mammalian species exhibiting different reproductive patterns with regard to luteolysis, implantation, and preparation for placentation. Therefore, the canine microarray data were compared with gene sets from pigs, cattle, horses, and humans. We found 412 genes differentially regulated between the two experimental groups. The functional terms most strongly enriched in response to pre-attachment embryos related to extracellular matrix function and remodeling, and to immune and inflammatory responses. Several candidate genes were validated by semi-quantitative PCR. When compared with other species, best matches were found with human and equine counterparts. Especially for the pig, the majority of overlapping genes showed opposite expression patterns. Interestingly, 1926 genes did not pair with any of the other gene sets. Using a microarray approach, we report the uterine changes in the dog driven by the presence of embryos and compare these results with datasets from other mammalian species, finding common-, contrary-, and exclusively canine-regulated genes. PMID:28651344
Gene expression profiling of selenophosphate synthetase 2 knockdown in Drosophila melanogaster.
Li, Gaopeng; Liu, Liying; Li, Ping; Chen, Luonan; Song, Haiyun; Zhang, Yan
2016-03-01
Selenium (Se) is an important trace element for many organisms and is incorporated into selenoproteins as selenocysteine (Sec). In eukaryotes, selenophosphate synthetase SPS2 is essential for Sec biosynthesis. In recent years, genetic disruptions of both Sec biosynthesis genes and selenoprotein genes have been investigated in different animal models, which provide important clues for understanding the Se metabolism and function in these organisms. However, a systematic study on the knockdown of SPS2 has not been performed in vivo. Herein, we conducted microarray experiments to study the transcriptome of fruit flies with knockdown of SPS2 in larval and adult stages. Several hundred differentially expressed genes were identified in each stage. Although the expression levels of other Sec biosynthesis genes and selenoprotein genes were not significantly changed, it is possible that selenoprotein translation is reduced without an effect on mRNA levels. Functional enrichment and network-based analyses revealed that although different sets of differentially expressed genes were obtained in each stage, they were both significantly enriched in the carbohydrate metabolism and redox processes. Furthermore, protein-protein interaction (PPI)-based network clustering analysis implied that several hub genes detected in the top modules, such as Nimrod C1 and regucalcin, could be considered as key regulators that are responsible for the complex responses caused by SPS2 knockdown. Overall, our data provide new insights into the relationship between Se utilization and several fundamental cellular processes as well as diseases.
NuGO contributions to GenePattern
Reiff, C.; Mayer, C.; Müller, M.
2008-01-01
NuGO, the European Nutrigenomics Organization, utilizes 31 powerful computers for, e.g., data storage and analysis. These so-called black boxes (NBXses) are located at the sites of different partners. NuGO decided to use GenePattern as the preferred genomic analysis tool on each NBX. To handle the custom made Affymetrix NuGO arrays, new NuGO modules are added to GenePattern. These NuGO modules execute the latest Bioconductor version ensuring up-to-date annotations and access to the latest scientific developments. The following GenePattern modules are provided by NuGO: NuGOArrayQualityAnalysis for comprehensive quality control, NuGOExpressionFileCreator for import and normalization of data, LimmaAnalysis for identification of differentially expressed genes, TopGoAnalysis for calculation of GO enrichment, and GetResultForGo for retrieval of information on genes associated with specific GO terms. All together, these NuGO modules allow comprehensive, up-to-date, and user friendly analysis of Affymetrix data. A special feature of the NuGO modules is that for analysis they allow the use of either the standard Affymetrix or the MBNI custom CDF-files, which remap probes based on current knowledge. In both cases a .chip-file is created to enable GSEA analysis. The NuGO GenePattern installations are distributed as binary Ubuntu (.deb) packages via the NuGO repository. PMID:19034553
NuGO contributions to GenePattern.
De Groot, P J; Reiff, C; Mayer, C; Müller, M
2008-12-01
NuGO, the European Nutrigenomics Organization, utilizes 31 powerful computers for, e.g., data storage and analysis. These so-called black boxes (NBXses) are located at the sites of different partners. NuGO decided to use GenePattern as the preferred genomic analysis tool on each NBX. To handle the custom made Affymetrix NuGO arrays, new NuGO modules are added to GenePattern. These NuGO modules execute the latest Bioconductor version ensuring up-to-date annotations and access to the latest scientific developments. The following GenePattern modules are provided by NuGO: NuGOArrayQualityAnalysis for comprehensive quality control, NuGOExpressionFileCreator for import and normalization of data, LimmaAnalysis for identification of differentially expressed genes, TopGoAnalysis for calculation of GO enrichment, and GetResultForGo for retrieval of information on genes associated with specific GO terms. All together, these NuGO modules allow comprehensive, up-to-date, and user friendly analysis of Affymetrix data. A special feature of the NuGO modules is that for analysis they allow the use of either the standard Affymetrix or the MBNI custom CDF-files, which remap probes based on current knowledge. In both cases a .chip-file is created to enable GSEA analysis. The NuGO GenePattern installations are distributed as binary Ubuntu (.deb) packages via the NuGO repository.
Tuning CRISPR-Cas9 Gene Drives in Saccharomyces cerevisiae
Roggenkamp, Emily; Giersch, Rachael M.; Schrock, Madison N.; Turnquist, Emily; Halloran, Megan; Finnigan, Gregory C.
2018-01-01
Control of biological populations is an ongoing challenge in many fields, including agriculture, biodiversity, ecological preservation, pest control, and the spread of disease. In some cases, such as insects that harbor human pathogens (e.g., malaria), elimination or reduction of a small number of species would have a dramatic impact across the globe. Given the recent discovery and development of the CRISPR-Cas9 gene editing technology, a unique arrangement of this system, a nuclease-based “gene drive,” allows for the super-Mendelian spread and forced propagation of a genetic element through a population. Recent studies have demonstrated the ability of a gene drive to rapidly spread within and nearly eliminate insect populations in a laboratory setting. While there are still ongoing technical challenges to design of a more optimal gene drive to be used in wild populations, there are still serious ecological and ethical concerns surrounding the nature of this powerful biological agent. Here, we use budding yeast as a safe and fully contained model system to explore mechanisms that might allow for programmed regulation of gene drive activity. We describe four conserved features of all CRISPR-based drives and demonstrate the ability of each drive component—Cas9 protein level, sgRNA identity, Cas9 nucleocytoplasmic shuttling, and novel Cas9-Cas9 tandem fusions—to modulate drive activity within a population. PMID:29348295
Levy, Nitzan; Tatomer, Dierdre; Herber, Candice B.; Zhao, Xiaoyue; Tang, Hui; Sargeant, Toby; Ball, Lonnele J.; Summers, Jonathan; Speed, Terence P.; Leitman, Dale C.
2008-01-01
Estrogen receptors (ERs) regulate gene transcription by interacting with regulatory elements. Most information regarding how ER activates genes has come from studies using a small set of target genes or simple consensus sequences such as estrogen response element, activator protein 1, and Sp1 elements. However, these elements cannot explain the differences in gene regulation patterns and clinical effects observed with estradiol (E2) and selective estrogen receptor modulators. To obtain a greater understanding of how E2 and selective estrogen receptor modulators differentially regulate genes, it is necessary to investigate their action on a more comprehensive set of native regulatory elements derived from ER target genes. Here we used chromatin immunoprecipitation-cloning and sequencing to isolate 173 regulatory elements associated with ERα. Most elements were found in the introns (38%) and regions greater than 10 kb upstream of the transcription initiation site (38%); 24% of the elements were found in the proximal promoter region (<10 kb). Only 11% of the elements contained a classical estrogen response element; 23% of the elements did not have any known response elements, including one derived from the naked cuticle homolog gene, which was associated with the recruitment of p160 coactivators. Transfection studies found that 80% of the 173 elements were regulated by E2, raloxifene, or tamoxifen with ERα or ERβ. Tamoxifen was more effective than raloxifene at activating the elements with ERα, whereas raloxifene was superior with ERβ. Our findings demonstrate that E2, tamoxifen, and raloxifene differentially regulate native ER-regulatory elements isolated by chromatin immunoprecipitation with ERα and ERβ. PMID:17962382
Semantic integration to identify overlapping functional modules in protein interaction networks
Cho, Young-Rae; Hwang, Woochang; Ramanathan, Murali; Zhang, Aidong
2007-01-01
Background The systematic analysis of protein-protein interactions can enable a better understanding of cellular organization, processes and functions. Functional modules can be identified from the protein interaction networks derived from experimental data sets. However, these analyses are challenging because of the presence of unreliable interactions and the complex connectivity of the network. The integration of protein-protein interactions with the data from other sources can be leveraged for improving the effectiveness of functional module detection algorithms. Results We have developed novel metrics, called semantic similarity and semantic interactivity, which use Gene Ontology (GO) annotations to measure the reliability of protein-protein interactions. The protein interaction networks can be converted into a weighted graph representation by assigning the reliability values to each interaction as a weight. We presented a flow-based modularization algorithm to efficiently identify overlapping modules in the weighted interaction networks. The experimental results show that the semantic similarity and semantic interactivity of interacting pairs were positively correlated with functional co-occurrence. The effectiveness of the algorithm for identifying modules was evaluated using functional categories from the MIPS database. We demonstrated that our algorithm had higher accuracy compared to other competing approaches. Conclusion The integration of protein interaction networks with GO annotation data and the capability of detecting overlapping modules substantially improve the accuracy of module identification. PMID:17650343
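The weighting step described above (turning an unweighted interaction list into a weighted graph using GO annotations) can be sketched as follows. This is a minimal illustration only: the abstract does not define the semantic similarity or semantic interactivity metrics, so a plain Jaccard overlap of GO term sets is used as a stand-in, and the protein names, GO identifiers, and function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weight_interactions(interactions, annotations):
    """Return {(p, q): weight}, using GO-term overlap as a stand-in reliability score.

    Assumption: the paper's own metrics are not given in the abstract,
    so Jaccard overlap is swapped in purely for illustration.
    """
    weights = {}
    for p, q in interactions:
        terms_p = annotations.get(p, set())
        terms_q = annotations.get(q, set())
        weights[(p, q)] = jaccard(terms_p, terms_q)
    return weights


# Toy data: hypothetical proteins and GO identifiers.
annotations = {
    "P1": {"GO:0006412", "GO:0005840"},
    "P2": {"GO:0006412", "GO:0005840", "GO:0003735"},
    "P3": {"GO:0007165"},
}
interactions = [("P1", "P2"), ("P1", "P3")]
weights = weight_interactions(interactions, annotations)

for pair, w in weights.items():
    print(pair, round(w, 3))
```

An interaction between proteins that share no annotation receives weight 0 here, which is one simple way a module detection algorithm could discount unreliable edges.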
Sharma, Amitabh; Gulbahce, Natali; Pevzner, Samuel J.; Menche, Jörg; Ladenvall, Claes; Folkersen, Lasse; Eriksson, Per; Orho-Melander, Marju; Barabási, Albert-László
2013-01-01
Genome wide association studies (GWAS) identify susceptibility loci for complex traits, but do not identify particular genes of interest. Integration of functional and network information may help in overcoming this limitation and identifying new susceptibility loci. Using GWAS and comorbidity data, we present a network-based approach to predict candidate genes for lipid and lipoprotein traits. We apply a prediction pipeline incorporating interactome, co-expression, and comorbidity data to Global Lipids Genetics Consortium (GLGC) GWAS for four traits of interest, identifying phenotypically coherent modules. These modules provide insights regarding gene involvement in complex phenotypes with multiple susceptibility alleles and low effect sizes. To experimentally test our predictions, we selected four candidate genes and genotyped representative SNPs in the Malmö Diet and Cancer Cardiovascular Cohort. We found significant associations with LDL-C and total-cholesterol levels for a synonymous SNP (rs234706) in the cystathionine beta-synthase (CBS) gene (p = 1 × 10^-5 and adjusted-p = 0.013, respectively). Further, liver samples taken from 206 patients revealed that patients with the minor allele of rs234706 had significant dysregulation of CBS (p = 0.04). Despite the known biological role of CBS in lipid metabolism, SNPs within the locus have not yet been identified in GWAS of lipoprotein traits. Thus, the GWAS-based Comorbidity Module (GCM) approach identifies candidate genes missed by GWAS, serving as a broadly applicable tool for the investigation of other complex disease phenotypes. PMID:23882023
A Functional and Regulatory Network Associated with PIP Expression in Human Breast Cancer
Debily, Marie-Anne; Marhomy, Sandrine El; Boulanger, Virginie; Eveno, Eric; Mariage-Samson, Régine; Camarca, Alessandra; Auffray, Charles; Piatier-Tonneau, Dominique; Imbeaud, Sandrine
2009-01-01
Background The PIP (prolactin-inducible protein) gene has been shown to be expressed in breast cancers, with contradictory results concerning its implication. As both the physiological role and the molecular pathways in which PIP is involved are poorly understood, we conducted combined gene expression profiling and network analysis studies on selected breast cancer cell lines presenting distinct PIP expression levels and hormonal receptor status, to explore the functional and regulatory network of PIP co-modulated genes. Principal Findings Microarray analysis allowed identification of genes co-modulated with PIP independently of modulations resulting from hormonal treatment or cell line heterogeneity. Relevant clusters of genes that can discriminate between [PIP+] and [PIP−] cells were identified. Functional and regulatory network analyses based on a knowledge database revealed a master network of PIP co-modulated genes, including many interconnecting oncogenes and tumor suppressor genes, half of which were detected as differentially expressed through high-precision measurements. The network identified appears associated with an inhibition of proliferation coupled with an increase of apoptosis and an enhancement of cell adhesion in breast cancer cell lines, and contains many genes with a STAT5 regulatory motif in their promoters. Conclusions Our global exploratory approach identified biological pathways modulated along with PIP expression, providing further support for its good prognostic value of disease-free survival in breast cancer. Moreover, our data pointed to the importance of a regulatory subnetwork associated with PIP expression in which STAT5 appears as a potential transcriptional regulator. PMID:19262752
Kang, Hye-Min; Lee, Jin-Sol; Kim, Min-Sub; Lee, Young Hwan; Jung, Jee-Hyun; Hagiwara, Atsushi; Zhou, Bingsheng; Lee, Jae-Seong; Jeong, Chang-Bum
2018-05-30
Autophagy originated from the common ancestor of all life forms, and its function is highly conserved from yeast to humans. Autophagy plays a key role in various fundamental biological processes including defense, and has developed through serial interactions of multiple gene sets referred to as autophagy-related (Atg) genes. Despite their significance in metazoan life and evolution, few studies have been conducted to identify these genes in aquatic invertebrates. In this study, we identified whole Atg genes in four Brachionus rotifer spp., namely B. calyciflorus, B. koreanus, B. plicatilis, and B. rotundiformis, through searches of their entire genomes; and we annotated them according to the yeast nomenclature. Twenty-four genes orthologous to yeast genes were present in all of the Brachionus spp. while three additional gene duplicates were identified in the genome of B. koreanus, indicating that these genes had diversified during the speciation. Also, their transcriptional responses to cadmium exposure indicated regulation by cadmium-induced oxidative-stress-related signaling pathways. This study provides valuable information on 99 conserved Atg genes involved in autophagosome formation in Brachionus spp., with transcriptional modulation in response to cadmium, in the context of the role of autophagy in the damage response. Copyright © 2018 Elsevier B.V. All rights reserved.
Henríquez-Valencia, Carlos; Arenas-M, Anita; Medina, Joaquín; Canales, Javier
2018-01-01
Sulfur is an essential nutrient for plant growth and development. Sulfur is a constituent of proteins, the plasma membrane and cell walls, among other important cellular components. To obtain new insights into the gene regulatory networks underlying the sulfate response, we performed an integrative meta-analysis of transcriptomic data from five different sulfate experiments available in public databases. This bioinformatic approach allowed us to identify a robust set of genes whose expression depends only on sulfate availability, indicating that those genes play an important role in the sulfate response. In relation to sulfate metabolism, the biological function of approximately 45% of these genes is currently unknown. Moreover, we found several consistent Gene Ontology terms related to biological processes that have not been extensively studied in the context of the sulfate response; these processes include cell wall organization, carbohydrate metabolism, nitrogen compound transport, and the regulation of proteolysis. Gene co-expression network analyses revealed relationships between the sulfate-responsive genes that were distributed among seven function-specific co-expression modules. The most connected genes in the sulfate co-expression network belong to a module related to the carbon response, suggesting that this biological function plays an important role in the control of the sulfate response. Temporal analyses of the network suggest that sulfate starvation generates a biphasic response, in which major changes in gene expression occur during both the early and late responses. Network analyses predicted that the sulfate response is regulated by a limited number of transcription factors, including MYBs, bZIPs, and NF-YAs. In conclusion, our analysis identified new candidate genes and provided new hypotheses to advance our understanding of the transcriptional regulation of sulfate metabolism in plants. PMID:29692794
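The idea of a co-expression network and its "most connected genes" can be illustrated with a small sketch. It assumes a common construction (an edge between two genes whenever the absolute Pearson correlation of their expression profiles reaches a cutoff); the abstract does not state how the authors built their network, and the gene names, expression values, cutoff of 0.9, and function names below are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(expr, threshold=0.9):
    """Connect gene pairs whose |correlation| meets the (assumed) cutoff."""
    return [
        (a, b)
        for a, b in combinations(sorted(expr), 2)
        if abs(pearson(expr[a], expr[b])) >= threshold
    ]


def degrees(expr, edges):
    """Number of co-expression partners per gene."""
    deg = {gene: 0 for gene in expr}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Hypothetical centered log-expression values across five conditions.
expr = {
    "geneA": [-2.0, -1.0, 0.0, 1.0, 2.0],
    "geneB": [-1.6, -1.8, 0.0, 1.8, 1.6],
    "geneC": [-1.3, -1.7, 0.0, 0.3, 2.7],
    "geneD": [1.0, -1.0, 0.0, -1.0, 1.0],
}
edges = coexpression_edges(expr)
deg = degrees(expr, edges)
hub = max(deg, key=deg.get)  # most connected gene in this toy network
print(edges, deg, hub)
```

In this toy network geneA is linked to both geneB and geneC while those two fall just below the cutoff with each other, so geneA is the single most connected gene.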
Oshida, Keiyu; Vasani, Naresh; Thomas, Russell S.; Applegate, Dawn; Rosen, Mitch; Abbott, Barbara; Lau, Christopher; Guo, Grace; Aleksunes, Lauren M.; Klaassen, Curtis; Corton, J. Christopher
2015-01-01
The nuclear receptor family member peroxisome proliferator-activated receptor α (PPARα) is activated by therapeutic hypolipidemic drugs and environmentally-relevant chemicals to regulate genes involved in lipid transport and catabolism. Chronic activation of PPARα in rodents increases liver cancer incidence, whereas suppression of PPARα activity leads to hepatocellular steatosis. Analytical approaches were developed to identify biosets (i.e., gene expression differences between two conditions) in a genomic database in which PPARα activity was altered. A gene expression signature of 131 PPARα-dependent genes was built using microarray profiles from the livers of wild-type and PPARα-null mice after exposure to three structurally diverse PPARα activators (WY-14,643, fenofibrate and perfluorohexane sulfonate). A fold-change rank-based test (the Running Fisher's test, p-value ≤ 10^-4) was used to evaluate the similarity between the PPARα signature and a test set of 48 and 31 biosets positive or negative, respectively, for PPARα activation; the test resulted in a balanced accuracy of 98%. The signature was then used to identify factors that activate or suppress PPARα in an annotated mouse liver/primary hepatocyte gene expression compendium of ~1850 biosets. In addition to the expected activation of PPARα by fibrate drugs, di(2-ethylhexyl) phthalate, and perfluorinated compounds, PPARα was activated by benzofuran, galactosamine, and TCDD and suppressed by hepatotoxins acetaminophen, lipopolysaccharide, silicon dioxide nanoparticles, and trovafloxacin. Additional factors that activate (fasting, caloric restriction) or suppress (infections) PPARα were also identified. This study 1) developed methods useful for future screening of environmental chemicals, 2) identified chemicals that activate or suppress PPARα, and 3) identified factors including diets and infections that modulate PPARα activity and would be hypothesized to affect chemical-induced PPARα activity. PMID:25689681
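Balanced accuracy, the metric reported above, is the mean of sensitivity and specificity, which keeps the larger class from dominating the score when the test set is unbalanced. The sketch below shows the arithmetic. The class sizes (48 positive and 31 negative biosets) come from the abstract, but the error counts are hypothetical and are not the study's confusion matrix.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Class sizes follow the abstract (48 positive, 31 negative biosets);
# the one missed positive and one false positive are hypothetical.
ba = balanced_accuracy(tp=47, fn=1, tn=30, fp=1)
print(f"balanced accuracy = {ba:.3f}")
```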
Functional Module Analysis for Gene Coexpression Networks with Network Integration.
Zhang, Shuqin; Zhao, Hongyu; Ng, Michael K
2015-01-01
Networks are a general tool for studying the complex interactions between different genes, proteins, and other small molecules. Modularity, a fundamental property of many biological networks, has been widely studied, and many computational methods have been proposed to identify the modules in an individual network. However, in many cases, a single network is insufficient for module analysis because of noise in the data or the tuning of parameters when building the biological network. The availability of a large number of biological networks makes network integration studies possible. By integrating such networks, more informative modules for a specific disease can be derived from the networks constructed from different tissues, and consistent factors for different diseases can be inferred. In this paper, we have developed an effective method for module identification from multiple networks under different conditions. The problem is formulated as an optimization model that combines module identification in each individual network with alignment of the modules from different networks. An approximation algorithm based on eigenvector computation is proposed. In simulation studies, our method outperforms the existing methods, especially when the underlying modules in multiple networks are different. We also applied our method to two groups of human gene coexpression networks: one for three different cancers and one for three tissues from morbidly obese patients. We identified 13 modules with three complete subgraphs, and 11 modules with two complete subgraphs, respectively. The modules were validated through Gene Ontology enrichment and KEGG pathway enrichment analysis. We also showed that the main functions of most modules for the corresponding disease have been addressed by other researchers, which may provide the theoretical basis for further studying the modules experimentally.
Analysis of genetic association using hierarchical clustering and cluster validation indices.
Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L
2017-10-01
It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criterion of similarity, is therefore an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where performance is measured by its ability to detect co-regulated sets compared with a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
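The general recipe above (generate candidate gene groups by hierarchical clustering, then score each group with a validation index) can be sketched in a few lines. The choices here are assumptions for illustration, not the authors' configuration: average linkage, a correlation distance of 1 - r, and the silhouette width as the validation index. Gene names and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def distance(x, y):
    """Correlation distance 1 - r between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def average_linkage(profiles, k):
    """Merge the two closest clusters (average linkage) until k remain."""
    clusters = [[gene] for gene in profiles]

    def link(c1, c2):
        total = sum(distance(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: link(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def mean_silhouette(cluster, clusters, profiles):
    """Average silhouette width of the genes in one cluster."""
    scores = []
    for g in cluster:
        mates = [h for h in cluster if h != g]
        if not mates:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(distance(profiles[g], profiles[h]) for h in mates) / len(mates)
        b = min(
            sum(distance(profiles[g], profiles[h]) for h in c) / len(c)
            for c in clusters
            if c is not cluster
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical expression profiles over four experiments.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 4.5, 2.0],
}
clusters = average_linkage(profiles, k=2)
scores = [mean_silhouette(c, clusters, profiles) for c in clusters]
print(clusters, [round(s, 3) for s in scores])
```

Ranking candidate groups by such a per-group score is one way to pick out tight co-expressed sets without committing to a single global cut of the dendrogram.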
Gene Selection and Cancer Classification: A Rough Sets Based Approach
NASA Astrophysics Data System (ADS)
Sun, Lijun; Miao, Duoqian; Zhang, Hongyun
Identification of informative gene subsets responsible for discerning between available samples of gene expression data is an important task in bioinformatics. Reducts, from rough sets theory, correspond to minimal sets of essential genes for discerning samples and are an efficient tool for gene selection. Due to the computational complexity of the existing reduct algorithms, feature ranking is usually used as the first step to narrow down the gene space, and the top-ranked genes are selected. In this paper, we define a novel criterion for scoring genes, based on the expression level difference between classes and the gene's contribution to classification, and present an algorithm for generating all possible reducts from informative genes. The algorithm takes the whole attribute set into account and finds short reducts with a significant reduction in computational complexity. An exploration of this approach on benchmark gene expression data sets demonstrates that it is successful in selecting highly discriminative genes and that the classification accuracy is impressive.
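To make the notion of a reduct concrete, the sketch below enumerates every minimal gene subset that still separates samples of different classes in a small discretized table. It is a brute-force illustration under stated assumptions, not the algorithm proposed in the paper: the search is exponential in the number of genes, and the gene names, the 0/1 discretization, and the class labels are hypothetical.

```python
from itertools import combinations


def discerns(table, labels, genes):
    """True if every pair of differently labelled samples differs on some gene in `genes`."""
    for i, j in combinations(range(len(table)), 2):
        if labels[i] != labels[j] and all(table[i][g] == table[j][g] for g in genes):
            return False
    return True


def reducts(table, labels, genes):
    """All minimal gene subsets that keep differently labelled samples discernible."""
    found = []
    for size in range(1, len(genes) + 1):
        for subset in combinations(genes, size):
            if any(set(r) <= set(subset) for r in found):
                continue  # contains a smaller reduct, so it is not minimal
            if discerns(table, labels, subset):
                found.append(subset)
    return found


# Hypothetical discretized expression (0 = low, 1 = high) for four samples.
table = [
    {"geneA": 1, "geneB": 0, "geneC": 1},
    {"geneA": 1, "geneB": 1, "geneC": 0},
    {"geneA": 0, "geneB": 0, "geneC": 0},
    {"geneA": 0, "geneB": 1, "geneC": 1},
]
labels = ["tumor", "tumor", "normal", "normal"]
result = reducts(table, labels, ["geneA", "geneB", "geneC"])
print(result)
```

Here geneA alone separates the two classes, while geneB and geneC do so only in combination, so the table has two reducts of different lengths. This is the situation in which a preference for short reducts matters.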
Freyre-González, Julio A; Alonso-Pavón, José A; Treviño-Quintanilla, Luis G; Collado-Vides, Julio
2008-10-27
Previous studies have used different methods in an effort to extract the modular organization of transcriptional regulatory networks. However, these approaches are not natural, as they try to cluster strongly connected genes into a module or locate known pleiotropic transcription factors in lower hierarchical layers. Here, we unravel the transcriptional regulatory network of Escherichia coli by separating it into its key elements, thus revealing its natural organization. We also present a mathematical criterion, based on the topological features of the transcriptional regulatory network, to classify the network elements into one of two possible classes: hierarchical or modular genes. We found that modular genes are clustered into physiologically correlated groups validated by a statistical analysis of the enrichment of the functional classes. Hierarchical genes encode transcription factors responsible for coordinating module responses based on general interest signals. Hierarchical elements correlate highly with the previously studied global regulators, suggesting that this could be the first mathematical method to identify global regulators. We identified a new element in transcriptional regulatory networks never described before: intermodular genes. These are structural genes that integrate, at the promoter level, signals coming from different modules, and therefore from different physiological responses. Using the concept of pleiotropy, we have reconstructed the hierarchy of the network and discuss the role of feedforward motifs in shaping the hierarchical backbone of the transcriptional regulatory network. This study sheds new light on the design principles underpinning the organization of transcriptional regulatory networks, showing a novel nonpyramidal architecture composed of independent modules globally governed by hierarchical transcription factors, whose responses are integrated by intermodular genes.
Koster, Roelof; Mitra, Nandita; D'Andrea, Kurt; Vardhanabhuti, Saran; Chung, Charles C; Wang, Zhaoming; Loren Erickson, R; Vaughn, David J; Litchfield, Kevin; Rahman, Nazneen; Greene, Mark H; McGlynn, Katherine A; Turnbull, Clare; Chanock, Stephen J; Nathanson, Katherine L; Kanetsky, Peter A
2014-11-15
Genome-wide association (GWA) studies of testicular germ cell tumor (TGCT) have identified 18 susceptibility loci, some containing genes encoding proteins important in male germ cell development. Deletions of one of these genes, DMRT1, lead to male-to-female sex reversal and are associated with development of gonadoblastoma. To further explore genetic association with TGCT, we undertook a pathway-based analysis of SNP marker associations in the Penn GWAs (349 TGCT cases and 919 controls). We analyzed a custom-built sex determination gene set consisting of 32 genes using three different methods of pathway-based analysis. The sex determination gene set ranked highly compared with canonical gene sets, and it was associated with TGCT (FDRG = 2.28 × 10^-5, FDRM = 0.014 and FDRI = 0.008 for Gene Set Analysis-SNP (GSA-SNP), Meta-Analysis Gene Set Enrichment of Variant Associations (MAGENTA) and Improved Gene Set Enrichment Analysis for Genome-wide Association Study (i-GSEA4GWAS) analysis, respectively). The association remained after removal of DMRT1 from the gene set (FDRG = 0.0002, FDRM = 0.055 and FDRI = 0.009). Using data from the NCI GWA scan (582 TGCT cases and 1056 controls) and UK scan (986 TGCT cases and 4946 controls), we replicated these findings (NCI: FDRG = 0.006, FDRM = 0.014, FDRI = 0.033, and UK: FDRG = 1.04 × 10^-6, FDRM = 0.016, FDRI = 0.025). After removal of DMRT1 from the gene set, the sex determination gene set remains associated with TGCT in the NCI (FDRG = 0.039, FDRM = 0.050 and FDRI = 0.055) and UK scans (FDRG = 3.00 × 10^-5, FDRM = 0.056 and FDRI = 0.044). With the exception of DMRT1, genes in the sex determination gene set have not previously been identified as TGCT susceptibility loci in these GWA scans, demonstrating the complementary nature of a pathway-based approach for genome-wide analysis of TGCT. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Tarboton, D. G.; Habib, E. H.; Deshotel, M.; Merck, M. F.; Lall, U.; Farnham, D. J.
2016-12-01
Traditional approaches to undergraduate hydrology and water resource education are textbook based, adopt unit processes and rely on idealized examples of specific applications, rather than examining the contextual relations in the processes and the dynamics connecting climate and ecosystems. The overarching goal of this project is to address the needed paradigm shift in undergraduate education of engineering hydrology and water resources education to reflect parallel advances in hydrologic research and technology, mainly in the areas of new observational settings, data and modeling resources and web-based technologies. This study presents efforts to develop a set of learning modules that are case-based, data and simulation driven and delivered via a web user interface. The modules are based on real-world case studies from three regional hydrologic settings: Coastal Louisiana, Utah Rocky Mountains and Florida Everglades. These three systems provide unique learning opportunities on topics such as: regional-scale budget analysis, hydrologic effects of human and natural changes, flashflood protection, climate-hydrology teleconnections and water resource management scenarios. The technical design and contents of the modules aim to support students' ability for transforming their learning outcomes and skills to hydrologic systems other than those used by the specific activity. To promote active learning, the modules take students through a set of highly engaging learning activities that are based on analysis of hydrologic data and model simulations. The modules include user support in the form of feedback and self-assessment mechanisms that are integrated within the online modules. Module effectiveness is assessed through an improvement-focused evaluation model using a mixed-method research approach guiding collection and analysis of evaluation data. 
Both qualitative and quantitative data are collected through student learning data, product analysis, and staff interviews. The presentation shares with the audience lessons learned from the development and implementation of the modules, students' feedback, guidelines on design and content attributes that support active learning in hydrology, and challenges encountered during the class implementation and evaluation of the modules.
MINE: Module Identification in Networks
2011-01-01
Background Graphical models of network associations are useful for both visualizing and integrating multiple types of association data. Identifying modules, or groups of functionally related gene products, is an important challenge in analyzing biological networks. However, existing tools to identify modules are insufficient when applied to dense networks of experimentally derived interaction data. To address this problem, we have developed an agglomerative clustering method that is able to identify highly modular sets of gene products within highly interconnected molecular interaction networks. Results MINE outperforms MCODE, CFinder, NEMO, SPICi, and MCL in identifying non-exclusive, high modularity clusters when applied to the C. elegans protein-protein interaction network. The algorithm generally achieves superior geometric accuracy and modularity for annotated functional categories. In comparison with the most closely related algorithm, MCODE, the top clusters identified by MINE are consistently of higher density and MINE is less likely to designate overlapping modules as a single unit. MINE offers a high level of granularity with a small number of adjustable parameters, enabling users to fine-tune cluster results for input networks with differing topological properties. Conclusions MINE was created in response to the challenge of discovering high quality modules of gene products within highly interconnected biological networks. The algorithm allows a high degree of flexibility and user-customisation of results with few adjustable parameters. MINE outperforms several popular clustering algorithms in identifying modules with high modularity and obtains good overall recall and precision of functional annotations in protein-protein interaction networks from both S. cerevisiae and C. elegans. PMID:21605434
Development of Novel Nonagonist PPAR-Gamma Ligands for Lung Cancer Treatment
2016-08-01
Affymetrix gene expression profiling. To get the purest representation of this gene set, we generated fibroblasts from the brown adipose tissue of mice ... tissues. It has been shown that p53 plays an important role in metabolism and adipose tissue function, and this may be modulated by PPARγ expression as ... presentations. Poster Presentation: Melin J. Khandekar, Alex S. Banks, Dina Laznik-Bogoslavski, James P. White, Jang H. Choi, Kwok-kin Wong, Ted
Montagne, Kevin; Onuma, Yasuko; Ito, Yuzuru; Aiki, Yasuhiko; Furukawa, Katsuko S; Ushida, Takashi
2017-01-01
Due to the high water content of cartilage, hydrostatic pressure is likely one of the main physical stimuli sensed by chondrocytes. Whereas, in the physiological range (0 to around 10 MPa), hydrostatic pressure exerts mostly pro-chondrogenic effects in chondrocyte models, excessive pressures have been reported to induce detrimental effects on cartilage, such as increased apoptosis and inflammation, and decreased cartilage marker expression. Though some genes modulated by high pressure have been identified, the effects of high pressure on the global gene expression pattern have still not been investigated. In this study, using microarray technology and real-time PCR validation, we analyzed the transcriptome of ATDC5 chondrocyte progenitors submitted to a continuous pressure of 25 MPa for up to 24 h. Several hundreds of genes were found to be modulated by pressure, including some not previously known to be mechano-sensitive. High pressure markedly increased the expression of stress-related genes, apoptosis-related genes and decreased that of cartilage matrix genes. Furthermore, a large set of genes involved in the progression of osteoarthritis were also induced by high pressure, suggesting that hydrostatic pressure could partly mimic in vitro some of the genetic alterations occurring in osteoarthritis.
Validating module network learning algorithms using simulated data.
Michoel, Tom; Maere, Steven; Bonnet, Eric; Joshi, Anagha; Saeys, Yvan; Van den Bulcke, Tim; Van Leemput, Koenraad; van Remortel, Piet; Kuiper, Martin; Marchal, Kathleen; Van de Peer, Yves
2007-05-03
In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Despite the demonstrated success of such algorithms in uncovering biologically relevant regulatory relations, further developments in the area are hampered by a lack of tools to compare the performance of alternative module network learning strategies. Here, we demonstrate the use of the synthetic data generator SynTReN for the purpose of testing and comparing module network learning algorithms. We introduce a software package for learning module networks, called LeMoNe, which incorporates a novel strategy for learning regulatory programs. Novelties include the use of a bottom-up Bayesian hierarchical clustering to construct the regulatory programs, and the use of a conditional entropy measure to assign regulators to the regulation program nodes. Using SynTReN data, we test the performance of LeMoNe in a completely controlled situation and assess the effect of the methodological changes we made with respect to an existing software package, namely Genomica. Additionally, we assess the effect of various parameters, such as the size of the data set and the amount of noise, on the inference performance. Overall, application of Genomica and LeMoNe to simulated data sets gave comparable results. However, LeMoNe offers some advantages, one of them being that the learning process is considerably faster for larger data sets. Additionally, we show that the location of the regulators in the LeMoNe regulation programs and their conditional entropy may be used to prioritize regulators for functional validation, and that the combination of the bottom-up clustering strategy with the conditional entropy-based assignment of regulators improves the handling of missing or hidden regulators. 
We show that data simulators such as SynTReN are very well suited for the purpose of developing, testing and improving module network algorithms. We used SynTReN data to develop and test an alternative module network learning strategy, which is incorporated in the software package LeMoNe, and we provide evidence that this alternative strategy has several advantages with respect to existing methods.
Clark, Neil R.; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D.; Jones, Matthew R.; Ma’ayan, Avi
2016-01-01
Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community. PMID:26848405
Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi
2015-11-01
Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.
Kaufmann, Markus; Schuffenhauer, Ansgar; Fruh, Isabelle; Klein, Jessica; Thiemeyer, Anke; Rigo, Pierre; Gomez-Mancilla, Baltazar; Heidinger-Millot, Valerie; Bouwmeester, Tewis; Schopfer, Ulrich; Mueller, Matthias; Fodor, Barna D; Cobos-Correa, Amanda
2015-10-01
Fragile X syndrome (FXS) is the most common form of inherited mental retardation, and it is caused in most of cases by epigenetic silencing of the Fmr1 gene. Today, no specific therapy exists for FXS, and current treatments are only directed to improve behavioral symptoms. Neuronal progenitors derived from FXS patient induced pluripotent stem cells (iPSCs) represent a unique model to study the disease and develop assays for large-scale drug discovery screens since they conserve the Fmr1 gene silenced within the disease context. We have established a high-content imaging assay to run a large-scale phenotypic screen aimed to identify compounds that reactivate the silenced Fmr1 gene. A set of 50,000 compounds was tested, including modulators of several epigenetic targets. We describe an integrated drug discovery model comprising iPSC generation, culture scale-up, and quality control and screening with a very sensitive high-content imaging assay assisted by single-cell image analysis and multiparametric data analysis based on machine learning algorithms. The screening identified several compounds that induced a weak expression of fragile X mental retardation protein (FMRP) and thus sets the basis for further large-scale screens to find candidate drugs or targets tackling the underlying mechanism of FXS with potential for therapeutic intervention. © 2015 Society for Laboratory Automation and Screening.
Spectral gene set enrichment (SGSE).
Frost, H Robert; Li, Zhigang; Moore, Jason H
2015-03-03
Gene set testing is typically performed in a supervised context to quantify the association between groups of genes and a clinical phenotype. In many cases, however, a gene set-based interpretation of genomic data is desired in the absence of a phenotype variable. Although methods exist for unsupervised gene set testing, they predominantly compute enrichment relative to clusters of the genomic variables with performance strongly dependent on the clustering algorithm and number of clusters. We propose a novel method, spectral gene set enrichment (SGSE), for unsupervised competitive testing of the association between gene sets and empirical data sources. SGSE first computes the statistical association between gene sets and principal components (PCs) using our principal component gene set enrichment (PCGSE) method. The overall statistical association between each gene set and the spectral structure of the data is then computed by combining the PC-level p-values using the weighted Z-method with weights set to the PC variance scaled by Tracy-Widom test p-values. Using simulated data, we show that the SGSE algorithm can accurately recover spectral features from noisy data. To illustrate the utility of our method on real data, we demonstrate the superior performance of the SGSE method relative to standard cluster-based techniques for testing the association between MSigDB gene sets and the variance structure of microarray gene expression data. Unsupervised gene set testing can provide important information about the biological signal held in high-dimensional genomic data sets. Because it uses the association between gene sets and samples PCs to generate a measure of unsupervised enrichment, the SGSE method is independent of cluster or network creation algorithms and, most importantly, is able to utilize the statistical significance of PC eigenvalues to ignore elements of the data most likely to represent noise.
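The combination step described above, in which PC-level p-values are merged with the weighted Z-method, can be sketched as follows. This is a minimal illustration of the generic weighted Z (Stouffer) formula, not the SGSE implementation: the function name `weighted_z_combine` is hypothetical, the p-values are assumed to be one-sided, and the weights are passed in directly rather than derived from PC variances and Tracy-Widom p-values.

```python
from math import sqrt
from statistics import NormalDist


def weighted_z_combine(p_values, weights):
    """Combine one-sided p-values with the weighted Z (Stouffer) method.

    Hypothetical helper for illustration; each p-value must lie strictly
    between 0 and 1, otherwise NormalDist.inv_cdf raises StatisticsError.
    """
    if not p_values or len(p_values) != len(weights):
        raise ValueError("p_values and weights must be non-empty and equal in length")
    norm_sq = sum(w * w for w in weights)
    if norm_sq == 0:
        raise ValueError("at least one weight must be non-zero")

    std_normal = NormalDist()
    # Small p-values map to large positive Z-scores.
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    # Weighted sum of Z-scores, rescaled so the result is again standard normal.
    z_combined = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(norm_sq)
    # Convert the combined Z-score back to a one-sided p-value.
    return 1.0 - std_normal.cdf(z_combined)


# Illustrative inputs only: three PC-level p-values with made-up weights.
combined = weighted_z_combine([0.01, 0.20, 0.60], [0.5, 0.3, 0.2])
print(f"combined p-value: {combined:.4f}")
```

Giving a component a larger weight increases its influence on the combined statistic, which is how a weighting scheme of this kind can down-weight components that are more likely to represent noise.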
GenePRIMP: A Gene Prediction Improvement Pipeline For Prokaryotic Genomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kyrpides, Nikos C.; Ivanova, Natalia N.; Pati, Amrita
2010-07-08
GenePRIMP (Gene Prediction Improvement Pipeline, http://geneprimp.jgi-psf.org) is a computational process that performs evidence-based evaluation of gene models in prokaryotic genomes and reports anomalies including inconsistent start sites, missing genes, and split genes. We show that manual curation of gene models using the anomaly reports generated by GenePRIMP improves their quality and demonstrate the applicability of GenePRIMP in improving finishing quality and comparing different genome sequencing and annotation technologies. Keywords in context: Gene model, Quality Control, Translation start sites, Automatic correction. Hardware requirements: PC, MAC; Operating System: UNIX/LINUX; Compiler/Version: Perl 5.8.5 or higher; Special requirements: NCBI Blast and nr installation; File Types: Source Code, Executable module(s), Sample problem input data; installation instructions other; programmer documentation. Location/transmission: http://geneprimp.jgi-psf.org/gp.tar.gz
The common transcriptional subnetworks of the grape berry skin in the late stages of ripening.
Ghan, Ryan; Petereit, Juli; Tillett, Richard L; Schlauch, Karen A; Toubiana, David; Fait, Aaron; Cramer, Grant R
2017-05-30
Wine grapes are important economically in many countries around the world. Defining the optimum time for grape harvest is a major challenge to the grower and winemaker. Berry skins are an important source of flavor, color and other quality traits in the ripening stage. Senescent-like processes such as chloroplast disorganization and cell death characterize the late ripening stage. To better understand the molecular and physiological processes involved in the late stages of berry ripening, RNA-seq analysis of the skins of seven wine grape cultivars (Cabernet Franc, Cabernet Sauvignon, Merlot, Pinot Noir, Chardonnay, Sauvignon Blanc and Semillon) was performed. RNA-seq analysis identified approximately 2000 common differentially expressed genes for all seven cultivars across four different berry sugar levels (20 to 26 °Brix). Network analyses, both a posteriori (standard) and a priori (gene co-expression network analysis), were used to elucidate transcriptional subnetworks and hub genes associated with traits in the berry skins of the late stages of berry ripening. These independent approaches revealed genes involved in photosynthesis, catabolism, and nucleotide metabolism. The transcript abundance of most photosynthetic genes declined with increasing sugar levels in the berries. The transcript abundance of other processes increased such as nucleic acid metabolism, chromosome organization and lipid catabolism. Weighted gene co-expression network analysis (WGCNA) identified 64 gene modules that were organized into 12 subnetworks of three modules or more and six higher order gene subnetworks. Some gene subnetworks were highly correlated with sugar levels and some subnetworks were highly enriched in the chloroplast and nucleus. The petal R package was utilized independently to construct a true small-world and scale-free complex gene co-expression network model. 
A subnetwork of 216 genes with the highest connectivity was elucidated, consistent with the module results from WGCNA. Hub genes in these subnetworks were identified including numerous members of the core circadian clock, RNA splicing, proteolysis and chromosome organization. An integrated model was constructed linking light sensing with alternative splicing, chromosome remodeling and the circadian clock. A common set of differentially expressed genes and gene subnetworks from seven different cultivars were examined in the skin of the late stages of grapevine berry ripening. A densely connected gene subnetwork was elucidated involving a complex interaction of berry senescent processes (autophagy), catabolism, the circadian clock, RNA splicing, proteolysis and epigenetic regulation. Hypotheses were induced from these data sets involving sugar accumulation, light, autophagy, epigenetic regulation, and fruit development. This work provides a better understanding of berry development and the transcriptional processes involved in the late stages of ripening.
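As a rough illustration of the weighted co-expression idea behind WGCNA mentioned above, the sketch below builds an unsigned soft-threshold adjacency, a_ij = |cor(x_i, x_j)|^beta, from a few toy expression profiles. The function names, the toy data, and the choice beta = 6 are assumptions for illustration; this is not the WGCNA R package and does not cover its module detection steps.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Raises ZeroDivisionError if either profile has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned adjacency |cor|**beta for every ordered pair of distinct genes."""
    genes = list(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Toy profiles (hypothetical values across four samples).
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with g1
    "g3": [1.0, 3.0, 2.0, 4.0],   # correlation 0.8 with g1
}
adjacency = soft_threshold_adjacency(profiles)
for pair, weight in sorted(adjacency.items()):
    print(pair, round(weight, 4))
```

Raising the absolute correlation to a power keeps strong correlations close to 1 while pushing moderate ones toward 0, which is what makes the resulting network weighted rather than hard-thresholded.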
Circulating polymerase chain reaction chips utilizing multiple-membrane activation
NASA Astrophysics Data System (ADS)
Wang, Chih-Hao; Chen, Yi-Yu; Liao, Chia-Sheng; Hsieh, Tsung-Min; Luo, Ching-Hsing; Wu, Jiunn-Jong; Lee, Huei-Huang; Lee, Gwo-Bin
2007-02-01
This paper reports a new micromachined, circulating, polymerase chain reaction (PCR) chip for nucleic acid amplification. The PCR chip is comprised of a microthermal control module and a polydimethylsiloxane (PDMS)-based microfluidic control module. The microthermal control modules are formed with three individual heating and temperature-sensing sections, each modulating a specific set temperature for denaturation, annealing and extension processes, respectively. Micro-pneumatic valves and multiple-membrane activations are used to form the microfluidic control module to transport sample fluids through three reaction regions. Compared with other PCR chips, the new chip is more compact in size, requires less time for heating and cooling processes, and has the capability to randomly adjust time ratios and cycle numbers depending on the PCR process. Experimental results showed that detection genes for two pathogens, Streptococcus pyogenes (S. pyogenes, 777 bps) and Streptococcus pneumoniae (S. pneumoniae, 273 bps), can be successfully amplified using the new circulating PCR chip. The minimum number of thermal cycles to amplify the DNA-based S. pyogenes for slab gel electrophoresis is 20 cycles with an initial concentration of 42.5 pg µl-1. Experimental data also revealed that a high reproducibility up to 98% could be achieved if the initial template concentration of the S. pyogenes was higher than 4 pg µl-1. The preliminary results of the current paper were presented at the 19th IEEE International Conference on Micro Electro Mechanical Systems (IEEE MEMS 2006), Istanbul, Turkey, 22-26 January, 2006.
Occupational Home Economics Education Series. Securing Employment. Competency Based Teaching Module.
ERIC Educational Resources Information Center
Lowe, Phyllis; And Others
This module, one of ten competency based modules developed for vocational teachers, focuses on securing employment in home economics. It is designed for a variety of levels of learners (secondary, postsecondary, adult) in both school and nonschool educational settings. Five competencies to be developed with this module deal with the meaning of…
Identifying candidate driver genes by integrative ovarian cancer genomics data
NASA Astrophysics Data System (ADS)
Lu, Xinguo; Lu, Jibo
2017-08-01
Integrative analysis of molecular mechanics underlying cancer can distinguish interactions that cannot be revealed based on one kind of data for the appropriate diagnosis and treatment of cancer patients. Tumor samples exhibit heterogeneity in omics data, such as somatic mutations, copy number variations (CNVs), gene expression profiles and so on. In this paper we combined gene co-expression modules and mutation modulators separately in tumor patients to obtain the candidate driver genes for resistant and sensitive tumors from the heterogeneous data. The modulators in the final list, such as CCL17, CACTIN, CCL16, CCL22, APOB, KDF1, CCL11, HNF1B, LRG1, MED1 and so on, are well known in biological processes associated with ovarian cancer and can help to facilitate the discovery of biomarkers, molecular diagnostics, and drug discovery.
Li, Zhao-Qun; Zhang, Shuai; Ma, Yan; Luo, Jun-Yu; Wang, Chun-Yi; Lv, Li-Min; Dong, Shuang-Lin; Cui, Jin-Jie
2013-01-01
Chrysopa pallens (Rambur) are the most important natural enemies and predators of various agricultural pests. Understanding the sophisticated olfactory system in insect antennae is crucial for studying the physiological bases of olfaction and also could lead to effective applications of C. pallens in integrated pest management. However, no transcriptome information is available for Neuroptera, and sequence data for C. pallens are scarce, so obtaining more sequence data is a priority for researchers on this species. To facilitate identifying sets of genes involved in olfaction, a normalized transcriptome of C. pallens was sequenced. A total of 104,603 contigs were obtained and assembled into 10,662 clusters and 39,734 singletons; 20,524 were annotated based on BLASTX analyses. A large number of candidate chemosensory genes were identified, including 14 odorant-binding proteins (OBPs), 22 chemosensory proteins (CSPs), 16 ionotropic receptors, 14 odorant receptors, and genes potentially involved in olfactory modulation. To better understand the OBPs, CSPs and cytochrome P450s, phylogenetic trees were constructed. In addition, 10 digital gene expression libraries of different tissues were constructed and gene expression profiles were compared among different tissues in males and females. Our results provide a basis for exploring the mechanisms of chemoreception in C. pallens, as well as other insects. The evolutionary analyses in our study provide new insights into the differentiation and evolution of insect OBPs and CSPs. Our study provides large-scale sequence information for further studies in C. pallens.
atBioNet--an integrated network analysis tool for genomics and biomarker discovery.
Ding, Yijun; Chen, Minjun; Liu, Zhichao; Ding, Don; Ye, Yanbin; Zhang, Min; Kelly, Reagan; Guo, Li; Su, Zhenqiang; Harris, Stephen C; Qian, Feng; Ge, Weigong; Fang, Hong; Xu, Xiaowei; Tong, Weida
2012-07-20
Large amounts of mammalian protein-protein interaction (PPI) data have been generated and are available for public use. From a systems biology perspective, protein/gene interactions encode the key mechanisms distinguishing disease and health, and such mechanisms can be uncovered through network analysis. An effective network analysis tool should integrate different content-specific PPI databases into a comprehensive network format with a user-friendly platform to identify key functional modules/pathways and the underlying mechanisms of disease and toxicity. atBioNet integrates seven publicly available PPI databases into a network-specific knowledge base. Knowledge expansion is achieved by expanding a user-supplied protein/gene list with interactions from its integrated PPI network. The statistically significant functional modules are determined by applying a fast network-clustering algorithm (SCAN: a Structural Clustering Algorithm for Networks). The functional modules can be visualized either separately or together in the context of the whole network. Integration of pathway information enables enrichment analysis and assessment of the biological function of modules. Three case studies are presented using publicly available disease gene signatures as a basis to discover new biomarkers for acute leukemia, systemic lupus erythematosus, and breast cancer. The results demonstrated that atBioNet can not only identify functional modules and pathways related to the studied diseases, but also provide information that can be used to hypothesize novel biomarkers for future analysis. atBioNet is a free web-based network analysis tool that provides a systematic insight into protein/gene interactions through examining significant functional modules. The identified functional modules are useful for determining underlying mechanisms of disease and biomarker discovery. It can be accessed at: http://www.fda.gov/ScienceResearch/BioinformaticsTools/ucm285284.htm.
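The SCAN algorithm named above groups nodes by structural similarity, which compares the closed neighbourhoods of two nodes: sigma(u, v) = |N[u] ∩ N[v]| / sqrt(|N[u]| · |N[v]|), where N[u] contains u and its neighbours. The sketch below computes only this similarity on a toy graph; the adjacency data and the function name are hypothetical, and nothing here reflects how atBioNet itself is implemented.

```python
from math import sqrt


def structural_similarity(adjacency, u, v):
    """SCAN-style structural similarity between nodes u and v.

    `adjacency` maps each node to the set of its neighbours.
    Each node is counted as part of its own neighbourhood.
    """
    closed_u = set(adjacency[u]) | {u}
    closed_v = set(adjacency[v]) | {v}
    shared = len(closed_u & closed_v)
    return shared / sqrt(len(closed_u) * len(closed_v))


# Toy interaction network (hypothetical protein names).
ppi = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

print(structural_similarity(ppi, "A", "B"))  # identical closed neighbourhoods
print(structural_similarity(ppi, "C", "D"))  # D shares only part of C's neighbourhood
```

Pairs whose similarity exceeds a chosen threshold are treated as belonging to the same dense region, which is the basis on which a structural clustering approach separates modules from hubs and outliers.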
ERIC Educational Resources Information Center
Gylling, Margaret
This competency-based preservice home economics teacher education module on merchandising textiles and ready-to-wear is the third in a set of three modules on occupational aspects of textiles and clothing. (This set is part of a larger series of sixty-seven modules on the Management Approach to Teaching Consumer and Homemaking Education…
ERIC Educational Resources Information Center
Movey, Jan
This competency-based preservice home economics teacher education module on environmental issues and the consumer is the third in a set of seven modules on consumer education related to management. (This set is part of a larger series of sixty-seven modules on the Management Approach to Teaching Consumer and Homemaking Education [MATCHE]--see CE…
ERIC Educational Resources Information Center
California State Univ., Fresno. Dept. of Home Economics.
This competency-based preservice home economics teacher education module on individuals and families in crisis is the fourth in a set of five modules on consumer education related to human development. (This set is part of a larger series of sixty-seven modules on the Management Approach to Teaching Consumer and Homemaking Education [MATCHE]--see…
ERIC Educational Resources Information Center
California State Univ., Fresno. Dept. of Home Economics.
This competency-based preservice home economics teacher education module on incorporating the consumer approach in homemaking classes is the fourth in a set of four core curriculum modules on consumer approach to homemaking education. (This set is part of a larger series of sixty-seven modules on the Management Approach to Teaching Consumer and…
ERIC Educational Resources Information Center
Henry, Nina
This competency-based preservice home economics teacher education module on assembly line garment construction is the second in a set of three modules on occupational aspects of textiles and clothing. (This set is part of a larger series of sixty-seven modules on the Management Approach to Teaching Consumer and Homemaking Education [MATCHE]--see…
Multi-membership gene regulation in pathway based microarray analysis
2011-01-01
Background Gene expression analysis has been intensively researched for more than a decade. Recently, there has been elevated interest in the integration of microarray data analysis with other types of biological knowledge in a holistic analytical approach. We propose a methodology that can be used for pathway based microarray data analysis, based on the observation that a substantial proportion of genes present in biochemical pathway databases are members of a number of distinct pathways. Our methodology aims towards establishing the state of individual pathways, by identifying those truly affected by the experimental conditions based on the behaviour of such genes. For that purpose it considers all the pathways in which a gene participates and the general census of gene expression per pathway. Results We utilise hill climbing, simulated annealing and a genetic algorithm to analyse the consistency of the produced results, through the application of fuzzy adjusted Rand indexes and Hamming distance. All algorithms produce highly consistent gene-to-pathway allocations, revealing the contribution of genes to pathway functionality, in agreement with current pathway state visualisation techniques, with the simulated annealing search proving slightly superior in terms of efficiency. Conclusions We show that the expression values of genes, which are members of a number of biochemical pathways or modules, are the net effect of the contribution of each gene to these biochemical processes. We show that by manipulating the pathway and module contribution of such genes to follow underlying trends we can interpret microarray results centred on the behaviour of these genes. PMID:21939531
Multi-membership gene regulation in pathway based microarray analysis.
Pavlidis, Stelios P; Payne, Annette M; Swift, Stephen M
2011-09-22
Gene expression analysis has been intensively researched for more than a decade. Recently, there has been elevated interest in the integration of microarray data analysis with other types of biological knowledge in a holistic analytical approach. We propose a methodology that can be used for pathway based microarray data analysis, based on the observation that a substantial proportion of genes present in biochemical pathway databases are members of a number of distinct pathways. Our methodology aims towards establishing the state of individual pathways, by identifying those truly affected by the experimental conditions based on the behaviour of such genes. For that purpose it considers all the pathways in which a gene participates and the general census of gene expression per pathway. We utilise hill climbing, simulated annealing and a genetic algorithm to analyse the consistency of the produced results, through the application of fuzzy adjusted Rand indexes and Hamming distance. All algorithms produce highly consistent gene-to-pathway allocations, revealing the contribution of genes to pathway functionality, in agreement with current pathway state visualisation techniques, with the simulated annealing search proving slightly superior in terms of efficiency. We show that the expression values of genes, which are members of a number of biochemical pathways or modules, are the net effect of the contribution of each gene to these biochemical processes. We show that by manipulating the pathway and module contribution of such genes to follow underlying trends we can interpret microarray results centred on the behaviour of these genes.
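One of the consistency measures mentioned above, Hamming distance, simply counts the positions at which two allocations disagree. The sketch below assumes, purely for illustration, that each search run returns one pathway label per gene in a fixed gene order; the abstract does not specify the encoding the authors used, and the labels and function name are hypothetical.

```python
def hamming_distance(allocation_a, allocation_b):
    """Number of positions at which two equal-length allocations differ."""
    if len(allocation_a) != len(allocation_b):
        raise ValueError("allocations must cover the same genes in the same order")
    return sum(a != b for a, b in zip(allocation_a, allocation_b))


# Hypothetical allocations of four genes to pathways from two search runs.
run_hill_climbing = ["P1", "P2", "P1", "P3"]
run_annealing = ["P1", "P3", "P1", "P2"]

distance = hamming_distance(run_hill_climbing, run_annealing)
print(f"{distance} of {len(run_hill_climbing)} genes are allocated differently")
```

A distance of zero means two runs agree on every gene, so small distances across algorithms and repeated runs indicate consistent allocations.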
Liu, Ruolin; Dickerson, Julie
2017-11-01
We propose a novel method and software tool, Strawberry, for transcript reconstruction and quantification from RNA-Seq data under the guidance of genome alignment and independent of gene annotation. Strawberry consists of two modules: assembly and quantification. The novelty of Strawberry is that the two modules use different optimization frameworks but utilize the same data graph structure, which allows a highly efficient, expandable and accurate algorithm for dealing with large data. The assembly module parses aligned reads into splicing graphs, and uses network flow algorithms to select the most likely transcripts. The quantification module uses a latent class model to assign read counts from the nodes of splicing graphs to transcripts. Strawberry simultaneously estimates the transcript abundances and corrects for sequencing bias through an EM algorithm. Based on simulations, Strawberry outperforms Cufflinks and StringTie in terms of both assembly and quantification accuracies. Under the evaluation of a real data set, the estimated transcript expression by Strawberry has the highest correlation with Nanostring probe counts, an independent experiment measure for transcript expression. Strawberry is written in C++14, and is available as open source software at https://github.com/ruolin/strawberry under the MIT license.
Tissue Non-Specific Genes and Pathways Associated with Diabetes: An Expression Meta-Analysis.
Mei, Hao; Li, Lianna; Liu, Shijian; Jiang, Fan; Griswold, Michael; Mosley, Thomas
2017-01-21
We performed expression studies to identify tissue non-specific genes and pathways of diabetes by meta-analysis. We searched curated datasets of the Gene Expression Omnibus (GEO) database and identified 13 and five expression studies of diabetes and insulin responses at various tissues, respectively. We tested differential gene expression by empirical Bayes-based linear method and investigated gene set expression association by knowledge-based enrichment analysis. Meta-analysis by different methods was applied to identify tissue non-specific genes and gene sets. We also proposed pathway mapping analysis to infer functions of the identified gene sets, and correlation and independent analysis to evaluate expression association profile of genes and gene sets between studies and tissues. Our analysis showed that PGRMC1 and HADH genes were significant over diabetes studies, while IRS1 and MPST genes were significant over insulin response studies, and joint analysis showed that HADH and MPST genes were significant over all combined data sets. The pathway analysis identified six significant gene sets over all studies. The KEGG pathway mapping indicated that the significant gene sets are related to diabetes pathogenesis. The results also presented that 12.8% and 59.0% pairwise studies had significantly correlated expression association for genes and gene sets, respectively; moreover, 12.8% pairwise studies had independent expression association for genes, but no studies were observed significantly different for expression association of gene sets. Our analysis indicated that there are both tissue specific and non-specific genes and pathways associated with diabetes pathogenesis. Compared to the gene expression, pathway association tends to be tissue non-specific, and a common pathway influencing diabetes development is activated through different genes at different tissues.
Yoon, Dukyong; Kim, Hyosil; Suh-Kim, Haeyoung; Park, Rae Woong; Lee, KiYoung
2011-01-01
Microarray analyses based on differentially expressed genes (DEGs) have been widely used to distinguish samples across different cellular conditions. However, studies based on DEGs have not been able to clearly determine significant differences between samples of pathophysiologically similar HIV-1 stages, e.g., between acute and chronic progressive (or AIDS) or between uninfected and clinically latent stages. We here suggest a novel approach to allow such discrimination based on stage-specific genetic features of HIV-1 infection. Our approach is based on co-expression changes of genes known to interact. The method can identify a genetic signature for a single sample, in contrast to existing protein-protein-based analyses with correlational designs. Our approach distinguishes each sample using differentially co-expressed interacting protein pairs (DEPs) based on co-expression scores of individual interacting pairs within a sample. The co-expression score has a positive value if two genes in a sample are simultaneously up-regulated or down-regulated, and it has a higher absolute value if the expression-changing ratios are similar between the two genes. We compared the characteristics of DEPs with those of DEGs by evaluating their usefulness in separating HIV-1 stages. We also identified DEP-based network modules and their gene-ontology enrichment to find the HIV-1 stage-specific gene signature. Based on the DEP approach, we observed clear separation among samples from distinct HIV-1 stages using clustering and principal component analyses. Moreover, the discrimination power of DEPs on the samples (70-100% accuracy) was much higher than that of DEGs (35-45%) using several well-known classifiers. DEP-based network analysis also revealed the HIV-1 stage-specific network modules; the main biological processes were related to "translation," "RNA splicing," "mRNA, RNA, and nucleic acid transport," and "DNA metabolism."
Through the HIV-1 stage-related modules, changing stage-specific patterns of protein interactions could be observed. The DEP-based method discriminated the HIV-1 infection stages clearly, and revealed an HIV-1 stage-specific gene signature. The proposed DEP-based method might complement existing DEG-based approaches in various microarray expression analyses.
Blatti, Charles; Sinha, Saurabh
2016-07-15
Analysis of co-expressed gene sets typically involves testing for enrichment of different annotations or 'properties' such as biological processes, pathways, transcription factor binding sites, etc., one property at a time. This common approach ignores any known relationships among the properties or the genes themselves. It is believed that known biological relationships among genes and their many properties may be exploited to more accurately reveal commonalities of a gene set. Previous work has sought to achieve this by building biological networks that combine multiple types of gene-gene or gene-property relationships, and performing network analysis to identify other genes and properties most relevant to a given gene set. Most existing network-based approaches for recognizing genes or annotations relevant to a given gene set collapse information about different properties to simplify (homogenize) the networks. We present a network-based method for ranking genes or properties related to a given gene set. Such related genes or properties are identified from among the nodes of a large, heterogeneous network of biological information. Our method involves a random walk with restarts, performed on an initial network with multiple node and edge types that preserve more of the original, specific property information than current methods that operate on homogeneous networks. In this first stage of our algorithm, we find the properties that are the most relevant to the given gene set and extract a subnetwork of the original network, comprising only these relevant properties. We then re-rank genes by their similarity to the given gene set, based on a second random walk with restarts, performed on the above subnetwork. We demonstrate the effectiveness of this algorithm for ranking genes related to Drosophila embryonic development and aggressive responses in the brains of social animals. DRaWR was implemented as an R package available at veda.cs.illinois.edu/DRaWR. 
Contact: blatti@illinois.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
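The core operation named in the abstract above, a random walk with restarts on a network that mixes gene nodes and property nodes, can be sketched in a few lines of Python. This is a single-stage illustration only, not the DRaWR package (which is an R package and adds a second walk on an extracted subnetwork); the function name, the dictionary-based graph format, the restart probability of 0.3 and the toy node names are all assumptions.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return stationary visiting probabilities of a walk restarting at `seeds`.

    adj:   dict mapping node -> dict of neighbour -> edge weight
           (undirected edges must be listed in both directions).
    seeds: set of query nodes; the walker restarts uniformly on them.
    """
    nodes = list(adj)
    restart_vec = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(restart_vec)
    for _ in range(max_iter):
        # Restart mass first, then add the mass propagated along edges.
        nxt = {n: restart * restart_vec[n] for n in nodes}
        for u in nodes:
            out = sum(adj[u].values())
            if out == 0:
                # Node without edges: return its mass to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * restart_vec[n]
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical heterogeneous network: three genes and two property nodes.
network = {
    "g1": {"pathwayA": 1.0},
    "g2": {"pathwayA": 1.0, "g3": 1.0},
    "g3": {"pathwayB": 1.0, "g2": 1.0},
    "pathwayA": {"g1": 1.0, "g2": 1.0},
    "pathwayB": {"g3": 1.0},
}
scores = random_walk_with_restart(network, seeds={"g1"})
ranked_genes = sorted(["g2", "g3"], key=scores.get, reverse=True)
```

Here `g2` shares a property node with the query gene `g1`, so it receives a higher score than `g3`, which is only reachable through `g2`.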
Shifera, Amde Selassie; Hardin, John A.
2009-01-01
The Renilla luciferase gene is commonly used as an internal control in luciferase-based reporter gene assays to normalize the values of the experimental reporter gene for variations that could be caused by transfection efficiency and sample handling. Various plasmids encoding Renilla luciferase under different promoter constructs are commercially available. The validity of the use of Renilla luciferase as an internal control is based on the assumption that it is constitutively expressed in transfected cells and that its constitutive expression is not modulated by experimental factors that could result in either the upregulation or the downregulation of the amounts of the enzyme produced. During the past ten years, a number of reports have appeared that identified a variety of conditions that could alter the basal constitutive expression of Renilla luciferase. The use of Renilla luciferase in those circumstances would not be valid and an alternative way of normalization would be necessary. This review covers the factors that have been reported thus far as modulating the expression of Renilla luciferase from plasmid constructs. PMID:19788887
An Independent Filter for Gene Set Testing Based on Spectral Enrichment.
Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H
2015-01-01
Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in common gene set collections, however, testing is often performed with nearly as many gene sets as underlying genomic variables. To address the challenge to statistical power posed by large gene set collections, we have developed spectral gene set filtering (SGSF), a novel technique for independent filtering of gene set collections prior to gene set testing. The SGSF method uses as a filter statistic the p-value measuring the statistical significance of the association between each gene set and the sample principal components (PCs), taking into account the significance of the associated eigenvalues. Because this filter statistic is independent of standard gene set test statistics under the null hypothesis but dependent under the alternative, the proportion of enriched gene sets is increased without impacting the type I error rate. As shown using simulated and real gene expression data, the SGSF algorithm accurately filters gene sets unrelated to the experimental outcome resulting in significantly increased gene set testing power.
Quantitative gene-gene and gene-environment mapping for leaf shape variation using tree-based models
USDA-ARS's Scientific Manuscript database
Leaf shape traits have long been a focus of many disciplines, but searching for complex genetic and environmental interactive mechanisms regulating leaf shape variation has not yet been well developed. The question of the respective roles of gene and environment and how they interplay to modulate l...
Pace, Tomasino; Olivieri, Anna; Sanchez, Massimo; Albanesi, Veronica; Picci, Leonardo; Siden Kiamos, Inga; Janse, Chris J; Waters, Andrew P; Pizzi, Elisabetta; Ponzi, Marta
2006-05-01
Transmission of the malaria parasite depends on specialized gamete precursors (gametocytes) that develop in the bloodstream of a vertebrate host. Gametocyte/gamete differentiation requires controlled patterns of gene expression and regulation not only of stage and gender-specific genes but also of genes associated with DNA replication and mitosis. Once taken up by a mosquito, male gametocytes undergo three mitotic cycles within a few minutes to produce eight motile gametes. Here we analysed, in two Plasmodium species, the expression of SET, a conserved nuclear protein involved in chromatin dynamics. SET is expressed in both asexual and sexual blood stages but strongly accumulates in male gametocytes. We functionally demonstrated the presence of two distinct promoters upstream of the set open reading frame, one active in all blood stage parasites and the other active only in gametocytes and in a fraction of schizonts possibly committed to sexual differentiation. In ookinetes both promoters exhibit a basal activity, while in the oocysts the gametocyte-specific promoter is silent and the reporter gene is only transcribed from the constitutive promoter. This transcriptional control, described for the first time in Plasmodium, provides a mechanism by which single-copy genes can be differently modulated during parasite development. In male gametocytes, overexpression of SET might contribute to prompt entry into and execution of S/M phases within the mosquito vector.
Pipe inspection and repair system
NASA Technical Reports Server (NTRS)
Schempf, Hagen (Inventor); Mutschler, Edward (Inventor); Chemel, Brian (Inventor); Boehmke, Scott (Inventor); Crowley, William (Inventor)
2004-01-01
A multi-module pipe inspection and repair device. The device includes a base module, a camera module, a sensor module, an MFL module, a brush module, a patch set/test module, and a marker module. Each of the modules may be interconnected to construct one of an inspection device, a preparation device, a marking device, and a repair device.
Insights into TREM2 biology by network analysis of human brain gene expression data
Forabosco, Paola; Ramasamy, Adaikalavan; Trabzuni, Daniah; Walker, Robert; Smith, Colin; Bras, Jose; Levine, Adam P.; Hardy, John; Pocock, Jennifer M.; Guerreiro, Rita; Weale, Michael E.; Ryten, Mina
2013-01-01
Rare variants in TREM2 cause susceptibility to late-onset Alzheimer's disease. Here we use microarray-based expression data generated from 101 neuropathologically normal individuals and covering 10 brain regions, including the hippocampus, to understand TREM2 biology in human brain. Using network analysis, we detect a highly preserved TREM2-containing module in human brain, show that it relates to microglia, and demonstrate that TREM2 is a hub gene in 5 brain regions, including the hippocampus, suggesting that it can drive module function. Using enrichment analysis we show significant overrepresentation of genes implicated in the adaptive and innate immune system. Inspection of genes with the highest connectivity to TREM2 suggests that it plays a key role in mediating changes in the microglial cytoskeleton necessary not only for phagocytosis, but also migration. Most importantly, we show that the TREM2-containing module is significantly enriched for genes genetically implicated in Alzheimer's disease, multiple sclerosis, and motor neuron disease, implying that these diseases share common pathways centered on microglia and that among the genes identified are possible new disease-relevant genes. PMID:23855984
Bruce, A. Gregory; Barcy, Serge; DiMaio, Terri; Gan, Emilia; Garrigues, H. Jacques; Lagunoff, Michael; Rose, Timothy M.
2017-01-01
The transcriptome of the Kaposi’s sarcoma-associated herpesvirus (KSHV/HHV8) after primary latent infection of human blood (BEC), lymphatic (LEC) and immortalized (TIME) endothelial cells was analyzed using RNAseq, and compared to long-term latency in BCBL-1 lymphoma cells. Naturally expressed transcripts were obtained without artificial induction, and a comprehensive annotation of the KSHV genome was determined. A set of unique coding sequence (UCDS) features and a process to resolve overlapping transcripts were developed to accurately quantitate transcript levels from specific promoters. Similar patterns of KSHV expression were detected in BCBL-1 cells undergoing long-term latent infections and in primary latent infections of both BEC and LEC cultures. High expression levels of poly-adenylated nuclear (PAN) RNA and spliced and unspliced transcripts encoding the K12 Kaposin B/C complex and associated microRNA region were detected, with an elevated expression of a large set of lytic genes in all latently infected cultures. Quantitation of non-overlapping regions of transcripts across the complete KSHV genome enabled for the first time accurate evaluation of the KSHV transcriptome associated with viral latency in different cell types. Hierarchical clustering applied to a gene correlation matrix identified modules of co-regulated genes with similar correlation profiles, which corresponded with biological and functional similarities of the encoded gene products. Gene modules were differentially upregulated during latency in specific cell types indicating a role for cellular factors associated with differentiated and/or proliferative states of the host cell to influence viral gene expression. PMID:28335496
Lai, Yinglei; Zhang, Fanni; Nayak, Tapan K; Modarres, Reza; Lee, Norman H; McCaffrey, Timothy A
2014-01-01
Gene set enrichment analysis (GSEA) is an important approach to the analysis of coordinate expression changes at a pathway level. Although many statistical and computational methods have been proposed for GSEA, the issue of a concordant integrative GSEA of multiple expression data sets has not been well addressed. Among different related data sets collected for the same or similar study purposes, it is important to identify pathways or gene sets with concordant enrichment. We categorize the underlying true states of differential expression into three representative categories: no change, positive change and negative change. Due to data noise, what we observe from experiments may not indicate the underlying truth. Although these categories are not observed in practice, they can be considered in a mixture model framework. Then, we define the mathematical concept of concordant gene set enrichment and calculate its related probability based on a three-component multivariate normal mixture model. The related false discovery rate can be calculated and used to rank different gene sets. We used three published lung cancer microarray gene expression data sets to illustrate our proposed method. One analysis based on the first two data sets was conducted to compare our result with a previous published result based on a GSEA conducted separately for each individual data set. This comparison illustrates the advantage of our proposed concordant integrative gene set enrichment analysis. Then, with a relatively new and larger pathway collection, we used our method to conduct an integrative analysis of the first two data sets and also all three data sets. Both results showed that many gene sets could be identified with low false discovery rates. A consistency between both results was also observed. A further exploration based on the KEGG cancer pathway collection showed that a majority of these pathways could be identified by our proposed method. 
This study illustrates that we can improve detection power and discovery consistency through a concordant integrative analysis of multiple large-scale two-sample gene expression data sets.
Identification of Key Pathways and Genes in the Dynamic Progression of HCC Based on WGCNA.
Yin, Li; Cai, Zhihui; Zhu, Baoan; Xu, Cunshuan
2018-02-14
Hepatocellular carcinoma (HCC) is a devastating disease worldwide. Though many efforts have been made to elucidate the process of HCC, its molecular mechanisms of development remain elusive due to its complexity. To explore the stepwise carcinogenic process from pre-neoplastic lesions to the end stage of HCC, we employed weighted gene co-expression network analysis (WGCNA) which has been proved to be an effective method in many diseases to detect co-expressed modules and hub genes using eight pathological stages including normal, cirrhosis without HCC, cirrhosis, low-grade dysplastic, high-grade dysplastic, very early and early, advanced HCC and very advanced HCC. Among the eight consecutive pathological stages, five representative modules are selected to perform canonical pathway enrichment and upstream regulator analysis by using ingenuity pathway analysis (IPA) software. We found that cell cycle related biological processes were activated at four neoplastic stages, and the degree of activation of the cell cycle corresponded to the deterioration degree of HCC. The orange and yellow modules enriched in energy metabolism, especially oxidative metabolism, and the expression value of the genes decreased only at four neoplastic stages. The brown module, enriched in protein ubiquitination and ephrin receptor signaling pathways, correlated mainly with the very early stage of HCC. The darkred module, enriched in hepatic fibrosis/hepatic stellate cell activation, correlated with the cirrhotic stage only. The high degree hub genes were identified based on the protein-protein interaction (PPI) network and were verified by Kaplan-Meier survival analysis. The novel five high degree hub genes signature that was identified in our study may shed light on future prognostic and therapeutic approaches. Our study brings a new perspective to the understanding of the key pathways and genes in the dynamic changes of HCC progression. These findings shed light on further investigations.
Wu, Shuang; Liu, Zhi-Ping; Qiu, Xing; Wu, Hulin
2014-01-01
The immune response to viral infection is regulated by an intricate network of many genes and their products. The reverse engineering of gene regulatory networks (GRNs) using mathematical models from time course gene expression data collected after influenza infection is key to our understanding of the mechanisms involved in controlling influenza infection within a host. A five-step pipeline: detection of temporally differentially expressed genes, clustering genes into co-expressed modules, identification of network structure, parameter estimate refinement, and functional enrichment analysis, is developed for reconstructing high-dimensional dynamic GRNs from genome-wide time course gene expression data. Applying the pipeline to the time course gene expression data from influenza-infected mouse lungs, we have identified 20 distinct temporal expression patterns in the differentially expressed genes and constructed a module-based dynamic network using a linear ODE model. Both intra-module and inter-module annotations and regulatory relationships of our inferred network show some interesting findings and are highly consistent with existing knowledge about the immune response in mice after influenza infection. The proposed method is a computationally efficient, data-driven pipeline bridging experimental data, mathematical modeling, and statistical analysis. The application to the influenza infection data elucidates the potentials of our pipeline in providing valuable insights into systematic modeling of complicated biological processes.
Molecular mechanisms of OLIG2 transcription factor in brain cancer
Lian, Nathan; Kesari, Santosh
2016-01-01
Oligodendrocyte lineage transcription factor 2 (OLIG2) plays a pivotal role in glioma development. Here we conducted a comprehensive study of the critical gene regulatory networks involving OLIG2. These include the networks responsible for OLIG2 expression, its translocation to the nucleus, cell cycle, epigenetic regulation, and Rho-pathway interactions. We described positive feedback loops including OLIG2: loops of epigenetic regulation and loops involving receptor tyrosine kinases. These loops may be responsible for the prolonged oncogenic activity of OLIG2. The proposed schemes for epigenetic regulation of the gene networks involving OLIG2 are confirmed by patient survival (Kaplan–Meier) curves based on The Cancer Genome Atlas (TCGA) datasets. Finally, we elucidate the network framework of Coherent-Gene Modules (CGMs) underlying OLIG2 involvement in cancer. We showed that genes interacting with OLIG2 formed eight CGMs having a set of intermodular connections. We also showed that among the genes involved in these modules the most connected hub is EGFR, followed on a lower level by HSP90 and CALM1, and then by three lower levels including the epigenetic genes KDM1A and NCOR1. The genes on the six upper levels of the hierarchy are involved in interconnections of all eight CGMs and organize functionally defined gene-signaling subnetworks having specific functions. For example, CGM1 is involved in epigenetic control. CGM2 is significantly related to cell proliferation and differentiation. CGM3 includes a number of interconnected basic helix–loop–helix (bHLH) transcription factors, including OLIG2. Many of these TFs are partially controlled by OLIG2. CGM4 is involved in PDGF-related angiogenesis, tumor cell proliferation and differentiation. These analyses provide testable hypotheses and approaches to inhibit the OLIG2 pathway and relevant feed-forward and feedback loops to be interrogated. This broad approach can be applied to other TFs. PMID:27447975
Preservation affinity in consensus modules among stages of HIV-1 progression.
Mosaddek Hossain, Sk Md; Ray, Sumanta; Mukhopadhyay, Anirban
2017-03-20
Analysis of gene expression data provides valuable insights into disease mechanism. Investigating relationship among co-expression modules of different stages is a meaningful tool to understand the way in which a disease progresses. Identifying topological preservation of modular structure also contributes to that understanding. HIV-1 disease provides a well-documented progression pattern through three stages of infection: acute, chronic and non-progressor. In this article, we have developed a novel framework to describe the relationship among the consensus (or shared) co-expression modules for each pair of HIV-1 infection stages. The consensus modules are identified to assess the preservation of network properties. We have investigated the preservation patterns of co-expression networks during HIV-1 disease progression through an eigengene-based approach. We discovered that the expression patterns of consensus modules have a strong preservation during the transitions of three infection stages. In particular, it is noticed that between acute and non-progressor stages the preservation is slightly more than the other pair of stages. Moreover, we have constructed eigengene networks for the identified consensus modules and observed the preservation structure among them. Some consensus modules are marked as preserved in two pairs of stages and are analyzed further to form a higher order meta-network consisting of a group of preserved modules. Additionally, we observed that module membership (MM) values of genes within a module are consistent with the preservation characteristics. The MM values of genes within a pair of preserved modules show strong correlation patterns across two infection stages. We have performed an extensive analysis to discover preservation pattern of co-expression network constructed from microarray gene expression data of three different HIV-1 progression stages. 
The preservation pattern is investigated through identification of consensus modules in each pair of infection stages. It is observed that the preservation of the expression pattern of consensus modules remains more prominent during the transition of infection from acute stage to non-progressor stage. Additionally, we observed that the module membership values of genes are coherent with preserved modules across the HIV-1 progression stages.
Acevedo-Luna, Natalia; Mariño-Ramírez, Leonardo; Halbert, Armand; Hansen, Ulla; Landsman, David; Spouge, John L
2016-11-21
Transcription factors (TFs) form complexes that bind regulatory modules (RMs) within DNA, to control specific sets of genes. Some transcription factor binding sites (TFBSs) near the transcription start site (TSS) display tight positional preferences relative to the TSS. Furthermore, near the TSS, RMs can co-localize TFBSs with each other and the TSS. The proportion of TFBS positional preferences due to TFBS co-localization within RMs is unknown, however. ChIP experiments confirm co-localization of some TFBSs genome-wide, including near the TSS, but they typically examine only a few TFs at a time, using non-physiological conditions that can vary from lab to lab. In contrast, sequence analysis can examine many TFs uniformly and methodically, broadly surveying the co-localization of TFBSs with tight positional preferences relative to the TSS. Our statistics found 43 significant sets of human motifs in the JASPAR TF Database with positional preferences relative to the TSS, with 38 preferences tight (±5 bp). Each set of motifs corresponded to a gene group of 135 to 3304 genes, with 42/43 (98%) gene groups independently validated by DAVID, a gene ontology database, with FDR < 0.05. Motifs corresponding to two TFBSs in a RM should co-occur more than by chance alone, enriching the intersection of the gene groups corresponding to the two TFs. Thus, a gene-group intersection systematically enriched beyond chance alone provides evidence that the two TFs participate in an RM. Of the 903 = 43*42/2 intersections of the 43 significant gene groups, we found 768/903 (85%) pairs of gene groups with significantly enriched intersections, with 564/768 (73%) intersections independently validated by DAVID with FDR < 0.05. A user-friendly web site at http://go.usa.gov/3kjsH permits biologists to explore the interaction network of our TFBSs to identify candidate subunit RMs. 
Gene duplication and convergent evolution within a genome provide obvious biological mechanisms for replicating an RM near the TSS that binds a particular TF subunit. Of all intersections of our 43 significant gene groups, 85% were significantly enriched, with 73% of the significant enrichments independently validated by gene ontology. The co-localization of TFBSs within RMs therefore likely explains much of the tight TFBS positional preferences near the TSS.
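The pair count quoted in the abstract above (903 = 43*42/2) and the notion of a gene-group intersection "enriched beyond chance alone" can be made concrete with a short Python sketch. The abstract does not state which test the authors used; the one-sided hypergeometric tail below is a common choice and is an assumption here, as are the function names and the toy numbers.

```python
from math import comb


def n_pairs(k):
    """Number of unordered pairs that can be formed from k gene groups."""
    return k * (k - 1) // 2


def intersection_pvalue(universe, size_a, size_b, overlap):
    """One-sided hypergeometric tail P(X >= overlap).

    universe: total number of genes considered (hypothetical parameter)
    size_a, size_b: sizes of the two gene groups
    overlap: observed size of their intersection
    """
    upper = min(size_a, size_b)
    total = comb(universe, size_b)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


pairs = n_pairs(43)  # number of intersections among 43 gene groups
# Toy case: two groups of 5 genes out of 20 that overlap completely.
p_full_overlap = intersection_pvalue(20, 5, 5, 5)
```

With many pairs tested at once, p-values such as these would normally be adjusted for multiple testing, for example by controlling the false discovery rate as the abstract's FDR threshold suggests.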
ERIC Educational Resources Information Center
California State Univ., Fresno. Dept. of Home Economics.
This competency-based preservice home economics teacher education module on consumer approach to textiles and clothing is the first in a set of four modules on consumer education related to textiles and clothing. (This set is part of a larger series of sixty-seven modules on the Management Approach to Teaching Consumer and Homemaking Education…
ERIC Educational Resources Information Center
Waskey, Frank
This competency-based preservice home economics teacher education module on operations and activities of a food service operation is the second in a set of three modules on occupational education relating to foods and nutrition. (This set is part of a larger series of sixty-seven modules on the Management Approach to Teaching Consumer and…
ERIC Educational Resources Information Center
Newsome, Ratana
This competency-based preservice home economics teacher education module on technological, sociological, ecological, and environmental factors related to food is the first in a set of five modules on consumer education related to foods and nutrition. (This set is part of a larger series of sixty-seven modules on the Management Approach to Teaching…
Humphry, Matt; Bednarek, Paweł; Kemmerling, Birgit; Koh, Serry; Stein, Mónica; Göbel, Ulrike; Stüber, Kurt; Piślewska-Bednarek, Mariola; Loraine, Ann; Schulze-Lefert, Paul; Somerville, Shauna; Panstruga, Ralph
2010-01-01
At least two components that modulate plant resistance against the fungal powdery mildew disease are ancient and have been conserved since the time of the monocot–dicot split (≈200 Mya). These components are the seven transmembrane domain containing MLO/MLO2 protein and the syntaxin ROR2/PEN1, which act antagonistically and have been identified in the monocot barley (Hordeum vulgare) and the dicot Arabidopsis thaliana, respectively. Additionally, syntaxin-interacting N-ethylmaleimide sensitive factor adaptor protein receptor proteins (VAMP721/722 and SNAP33/34) as well as a myrosinase (PEN2) and an ABC transporter (PEN3) contribute to antifungal resistance in both barley and/or Arabidopsis. Here, we show that these genetically defined defense components share a similar set of coexpressed genes in the two plant species, comprising a statistically significant overrepresentation of gene products involved in regulation of transcription, posttranslational modification, and signaling. Most of the coexpressed Arabidopsis genes possess a common cis-regulatory element that may dictate their coordinated expression. We exploited gene coexpression to uncover numerous components in Arabidopsis involved in antifungal defense. Together, our data provide evidence for an evolutionarily conserved regulon composed of core components and clade/species-specific innovations that functions as a module in plant innate immunity. PMID:21098265
Protective pathways against colitis mediated by appendicitis and appendectomy
Cheluvappa, R; Luo, A S; Palmer, C; Grimm, M C
2011-01-01
Appendicitis followed by appendectomy (AA) at a young age protects against inflammatory bowel disease (IBD). Using a novel murine appendicitis model, we showed that AA protected against subsequent experimental colitis. To delineate genes/pathways involved in this protection, AA was performed and samples harvested from the most distal colon. RNA was extracted from four individual colonic samples per group (AA group and double-laparotomy control group) and each sample microarray analysed followed by gene-set enrichment analysis (GSEA). The gene-expression study was validated by quantitative reverse transcription–polymerase chain reaction (RT–PCR) of 14 selected genes across the immunological spectrum. Distal colonic expression of 266 gene-sets was up-regulated significantly in AA group samples (false discovery rates < 1%; P-value < 0·001). Time–course RT–PCR experiments involving the 14 genes displayed down-regulation over 28 days. The IBD-associated genes tnfsf10, SLC22A5, C3, ccr5, irgm, ptger4 and ccl20 were modulated in AA mice 3 days after surgery. Many key immunological and cellular function-associated gene-sets involved in the protective effect of AA in experimental colitis were identified. The down-regulation of 14 selected genes over 28 days after surgery indicates activation, repression or de-repression of these genes leading to downstream AA-conferred anti-colitis protection. Further analysis of these genes, profiles and biological pathways may assist in developing better therapeutic strategies in the management of intractable IBD. PMID:21707591
Detection of Significant Pneumococcal Meningitis Biomarkers by Ego Network.
Wang, Qian; Lou, Zhifeng; Zhai, Liansuo; Zhao, Haibin
2017-06-01
To identify significant biomarkers for detection of pneumococcal meningitis based on ego network. Based on the gene expression data of pneumococcal meningitis and global protein-protein interactions (PPIs) data recruited from open access databases, the authors constructed a differential co-expression network (DCN) to identify pneumococcal meningitis biomarkers in a network view. Here EgoNet algorithm was employed to screen the significant ego networks that could accurately distinguish pneumococcal meningitis from healthy controls, by sequentially seeking ego genes, searching candidate ego networks, refinement of candidate ego networks and significance analysis to identify ego networks. Finally, the functional inference of the ego networks was performed to identify significant pathways for pneumococcal meningitis. By differential co-expression analysis, the authors constructed the DCN that covered 1809 genes and 3689 interactions. From the DCN, a total of 90 ego genes were identified. Starting from these ego genes, three significant ego networks (Module 19, Module 70 and Module 71) that could predict clinical outcomes for pneumococcal meningitis were identified by EgoNet algorithm, and the corresponding ego genes were GMNN, MAD2L1 and TPX2, respectively. Pathway analysis showed that these three ego networks were related to CDT1 association with the CDC6:ORC:origin complex, inactivation of APC/C via direct inhibition of the APC/C complex pathway, and DNA strand elongation, respectively. The authors successfully screened three significant ego modules which could accurately predict the clinical outcomes for pneumococcal meningitis and might play important roles in host response to pathogen infection in pneumococcal meningitis.
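The differential co-expression step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that an edge is kept when the Pearson correlation of a gene pair differs strongly between cases and controls; the function names, the threshold, and the toy data are hypothetical, and the authors' actual DCN construction (which also uses PPI data) is not specified in the abstract.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def differential_coexpression(case, control, threshold=1.0):
    """Return {(gene_a, gene_b): delta} for gene pairs whose correlation
    differs between conditions by at least `threshold` in absolute value.
    The threshold is an illustrative assumption, not a published cutoff."""
    edges = {}
    for a, b in combinations(sorted(case), 2):
        delta = pearson(case[a], case[b]) - pearson(control[a], control[b])
        if abs(delta) >= threshold:
            edges[(a, b)] = delta
    return edges


# Toy expression profiles (hypothetical values, four samples per condition).
case = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [4, 3, 2, 1]}
control = {"G1": [1, 2, 3, 4], "G2": [8, 6, 4, 2], "G3": [4, 3, 2, 1]}

edges = differential_coexpression(case, control)
for pair, delta in edges.items():
    print(pair, round(delta, 3))
```

In this toy data, G2 flips its relationship with both G1 and G3 between conditions, so only the two pairs involving G2 survive as edges; the G1/G3 pair is equally correlated in both conditions and is dropped.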
Query-based biclustering of gene expression data using Probabilistic Relational Models.
Zhao, Hui; Cloots, Lore; Van den Bulcke, Tim; Wu, Yan; De Smet, Riet; Storms, Valerie; Meysman, Pieter; Engelen, Kristof; Marchal, Kathleen
2011-02-15
With the availability of large scale expression compendia it is now possible to view one's own findings in the light of what is already available and retrieve genes with an expression profile similar to a set of genes of interest (i.e., a query or seed set) for a subset of conditions. To that end, a query-based strategy is needed that maximally exploits the coexpression behaviour of the seed genes to guide the biclustering, but that at the same time is robust against the presence of noisy genes in the seed set, as seed genes are often assumed, but not guaranteed, to be coexpressed in the queried compendium. Therefore, we developed ProBic, a query-based biclustering strategy based on Probabilistic Relational Models (PRMs) that exploits the use of prior distributions to extract the information contained within the seed set. We applied ProBic to a large scale Escherichia coli compendium to extend partially described regulons with potentially novel members. We compared ProBic's performance with previously published query-based biclustering algorithms, namely ISA and QDB, from the perspective of bicluster expression quality, robustness of the outcome against noisy seed sets and biological relevance. This comparison shows that ProBic is able to retrieve biologically relevant, high quality biclusters that retain their seed genes and that it is particularly strong in handling noisy seeds. ProBic is a query-based biclustering algorithm developed in a flexible framework, designed to detect biologically relevant, high quality biclusters that retain relevant seed genes even in the presence of noise or when dealing with low quality seed sets.
Dwivedi, Bhakti; Kowalski, Jeanne
2018-01-01
While many methods exist for integrating multi-omics data or defining gene sets, there is no one single tool that defines gene sets based on merging of multiple omics data sets. We present shinyGISPA, an open-source application with a user-friendly web-based interface to define genes according to their similarity in several molecular changes that are driving a disease phenotype. This tool was developed to help facilitate the usability of a previously published method, Gene Integrated Set Profile Analysis (GISPA), among researchers with limited computer-programming skills. The GISPA method allows the identification of multiple gene sets that may play a role in the characterization, clinical application, or functional relevance of a disease phenotype. The tool provides an automated workflow that is highly scalable and adaptable to applications that go beyond genomic data merging analysis. It is available at http://shinygispa.winship.emory.edu/shinyGISPA/. PMID:29415010
Graubner, Felix R; Gram, Aykut; Kautz, Ewa; Bauersachs, Stefan; Aslan, Selim; Agaoglu, Ali R; Boos, Alois; Kowalewski, Mariusz P
2017-08-01
In the dog, there is no luteolysis in the absence of pregnancy. Thus, this species lacks any anti-luteolytic endocrine signal as found in other species that modulate uterine function during the critical period of pregnancy establishment. Nevertheless, in the dog an embryo-maternal communication must occur in order to prevent rejection of embryos. Based on this hypothesis, we performed microarray analysis of canine uterine samples collected during pre-attachment phase (days 10-12) and in corresponding non-pregnant controls, in order to elucidate the embryo attachment signal. An additional goal was to identify differences in uterine responses to pre-attachment embryos between dogs and other mammalian species exhibiting different reproductive patterns with regard to luteolysis, implantation, and preparation for placentation. Therefore, the canine microarray data were compared with gene sets from pigs, cattle, horses, and humans. We found 412 genes differentially regulated between the two experimental groups. The functional terms most strongly enriched in response to pre-attachment embryos related to extracellular matrix function and remodeling, and to immune and inflammatory responses. Several candidate genes were validated by semi-quantitative PCR. When compared with other species, best matches were found with human and equine counterparts. Especially for the pig, the majority of overlapping genes showed opposite expression patterns. Interestingly, 1926 genes did not pair with any of the other gene sets. Using a microarray approach, we report the uterine changes in the dog driven by the presence of embryos and compare these results with datasets from other mammalian species, finding common-, contrary-, and exclusively canine-regulated genes. © The Authors 2017. Published by Oxford University Press on behalf of Society for the Study of Reproduction.
2013-01-01
Background The development of new therapies for orphan genetic diseases represents an extremely important medical and social challenge. Drug repositioning, i.e. finding new indications for approved drugs, could be one of the most cost- and time-effective strategies to cope with this problem, at least in a subset of cases. Therefore, many computational approaches based on the analysis of high throughput gene expression data have so far been proposed to reposition available drugs. However, most of these methods require gene expression profiles directly relevant to the pathologic conditions under study, such as those obtained from patient cells and/or from suitable experimental models. In this work we have developed a new approach for drug repositioning, based on identifying known drug targets showing conserved anti-correlated expression profiles with human disease genes, which is completely independent from the availability of ‘ad hoc’ gene expression data-sets. Results By analyzing available data, we provide evidence that the genes displaying conserved anti-correlation with drug targets are antagonistically modulated in their expression by treatment with the relevant drugs. We then identified clusters of genes associated to similar phenotypes and showing conserved anticorrelation with drug targets. On this basis, we generated a list of potential candidate drug-disease associations. Importantly, we show that some of the proposed associations are already supported by independent experimental evidence. Conclusions Our results support the hypothesis that the identification of gene clusters showing conserved anticorrelation with drug targets can be an effective method for drug repositioning and provide a wide list of new potential drug-disease associations for experimental validation. PMID:24088245
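The idea of conserved anti-correlation between a drug target and a disease gene can be sketched as follows. This is a minimal illustration that assumes "conserved" means the negative correlation holds in every one of several independent expression datasets; the cutoff, the gene labels, and the toy values are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def conserved_anticorrelation(target, gene, datasets, cutoff=-0.5):
    """True if `target` and `gene` have correlation <= cutoff in every
    dataset. The cutoff value is an illustrative assumption."""
    return all(pearson(ds[target], ds[gene]) <= cutoff for ds in datasets)


# Two toy datasets (hypothetical values, five samples each).
dataset_1 = {
    "TARGET": [1, 2, 3, 4, 5],
    "GENE_A": [5, 4, 3, 2, 1],
    "GENE_B": [1, 3, 2, 5, 4],
}
dataset_2 = {
    "TARGET": [2, 1, 4, 3, 6],
    "GENE_A": [5, 6, 3, 4, 1],
    "GENE_B": [6, 1, 4, 3, 2],
}
datasets = [dataset_1, dataset_2]

candidates = [
    gene
    for gene in ("GENE_A", "GENE_B")
    if conserved_anticorrelation("TARGET", gene, datasets)
]
print(candidates)
```

Requiring the relationship to hold in every dataset is the strictest reading of "conserved"; a softer variant could require it in a chosen fraction of datasets instead.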
Lobach, Iryna; Fan, Ruzong; Manga, Prashiela
A central problem in genetic epidemiology is to identify and rank genetic markers involved in a disease. Complex diseases, such as cancer, hypertension, diabetes, are thought to be caused by an interaction of a panel of genetic factors, that can be identified by markers, which modulate environmental factors. Moreover, the effect of each genetic marker may be small. Hence, the association signal may be missed unless a large sample is considered, or a priori biomedical data are used. Recent advances generated a vast variety of a priori information, including linkage maps and information about gene regulatory dependence assembled into curated pathway databases. We propose a genotype-based approach that takes into account linkage disequilibrium (LD) information between genetic markers that are in moderate LD while modeling gene-gene and gene-environment interactions. A major advantage of our method is that the observed genetic information enters a model directly thus eliminating the need to estimate haplotype-phase. Our approach results in an algorithm that is inexpensive computationally and does not suffer from bias induced by haplotype-phase ambiguity. We investigated our model in a series of simulation experiments and demonstrated that the proposed approach results in estimates that are nearly unbiased and have small variability. We applied our method to the analysis of data from a melanoma case-control study and investigated interaction between a set of pigmentation genes and environmental factors defined by age and gender. Furthermore, an application of our method is demonstrated using a study of Alcohol Dependence.
Normanno, Davide; Vanzi, Francesco; Pavone, Francesco Saverio
2008-01-01
Gene expression regulation is a fundamental biological process which deploys specific sets of genomic information depending on physiological or environmental conditions. Several transcription factors (including lac repressor, LacI) are present in the cell at very low copy number and increase their local concentration by binding to multiple sites on DNA and looping the intervening sequence. In this work, we employ single-molecule manipulation to experimentally address the role of DNA supercoiling in the dynamics and stability of LacI-mediated DNA looping. We performed measurements over a range of degrees of supercoiling between −0.026 and +0.026, in the absence of axial stretching forces. A supercoiling-dependent modulation of the lifetimes of both the looped and unlooped states was observed. Our experiments also provide evidence for multiple structural conformations of the LacI–DNA complex, depending on torsional constraints. The supercoiling-dependent modulation demonstrated here adds an important element to the model of the lac operon. In fact, the complex network of proteins acting on the DNA in a living cell constantly modifies its topological and mechanical properties: our observations demonstrate the possibility of establishing a signaling pathway from factors affecting DNA supercoiling to transcription factors responsible for the regulation of specific sets of genes. PMID:18310101
Occupational Home Economics Education Series. Consumer Services. Competency Based Teaching Module.
ERIC Educational Resources Information Center
Lowe, Phyllis; And Others
This module, one of ten competency based modules developed for vocational home economics teachers, is based on a job cluster in consumer services. It is designed for a variety of levels (secondary, post-secondary, adult) in both school and non-school settings. Focusing on the specific job title of consumer advisor, eight competencies are listed…
Occupational Home Economics Education Series. Catering Services. Competency Based Teaching Module.
ERIC Educational Resources Information Center
Lowe, Phyllis; And Others
This module, one of ten competency based modules developed for vocational home economics teachers, is based on a job cluster in the catering industry. It is designed for use with a variety of levels of learners (secondary, postsecondary, adult) in both school and non-school educational settings. Focusing on two levels of employment, food caterer…
Cha, Kihoon; Hwang, Taeho; Oh, Kimin; Yi, Gwan-Su
2015-01-01
Background It has been reported that several brain diseases can be treated in a transnosological manner, implicating a possible common molecular basis underlying those diseases. However, molecular level commonality among those brain diseases has been largely unexplored. Gene expression analyses of human brain have been used to find genes associated with brain diseases, but most of those studies were restricted either to an individual disease or to a couple of diseases. In addition, identifying significant genes in such brain diseases mostly failed when typical methods depending on differentially expressed genes were used. Results In this study, we used a correlation-based biclustering approach to find coexpressed gene sets in five neurodegenerative diseases and three psychiatric disorders. By using biclustering analysis, we could efficiently and fairly identify various gene sets expressed specifically in both single and multiple brain diseases. We could find 4,307 gene sets correlatively expressed in multiple brain diseases and 3,409 gene sets exclusively specified in individual brain diseases. The function enrichment analysis of those gene sets showed many new possible functional bases as well as neurological processes that are common or specific for those eight diseases. Conclusions This study introduces possible common molecular bases for several brain diseases, which opens the opportunity to clarify the transnosological perspective assumed in brain diseases. It also showed the advantages of correlation-based biclustering analysis and accompanying function enrichment analysis for gene expression data in this type of investigation. PMID:26043779
Bersanelli, Matteo; Mosca, Ettore; Remondini, Daniel; Castellani, Gastone; Milanesi, Luciano
2016-01-01
A relation exists between network proximity of molecular entities in interaction networks, functional similarity and association with diseases. The identification of network regions associated with biological functions and pathologies is a major goal in systems biology. We describe a network diffusion-based pipeline for the interpretation of different types of omics in the context of molecular interaction networks. We introduce the network smoothing index, a network-based quantity that jointly quantifies the amount of omics information in genes and in their network neighbourhood, using network diffusion to define network proximity. The approach is applicable to both descriptive and inferential statistics calculated on omics data. We also show that network resampling, applied to gene lists ranked by quantities derived from the network smoothing index, indicates the presence of significantly connected genes. As a proof of principle, we identified gene modules enriched in somatic mutations and transcriptional variations observed in samples of prostate adenocarcinoma (PRAD). In line with the local hypothesis, network smoothing index and network resampling underlined the existence of a connected component of genes harbouring molecular alterations in PRAD. PMID:27731320
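Network diffusion of an omics signal can be sketched with a standard iterative propagation scheme. The example below is an illustration only: it assumes a symmetrically normalized adjacency matrix and a restart parameter `alpha`, and the ratio computed at the end is a placeholder for a smoothing-type index, since the abstract does not give the formula. All names and values are hypothetical.

```python
from math import sqrt


def diffuse(adjacency, x0, alpha=0.7, n_iter=200):
    """Iterate x <- alpha * W x + (1 - alpha) * x0, where W is the
    symmetrically normalized adjacency matrix D^-1/2 A D^-1/2."""
    degree = {n: len(neighbors) for n, neighbors in adjacency.items()}
    x = dict(x0)
    for _ in range(n_iter):
        x = {
            n: alpha * sum(x[m] / sqrt(degree[n] * degree[m]) for m in adjacency[n])
            + (1 - alpha) * x0[n]
            for n in adjacency
        }
    return x


# Toy path network A - B - C - D; only gene A carries an initial signal.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
x0 = {"A": 1.0, "B": 0.0, "C": 0.0, "D": 0.0}

x_star = diffuse(adjacency, x0)

# Placeholder index (assumption): diffused signal relative to the initial
# signal plus a constant, so genes near altered genes can score highly.
epsilon = 1.0
index = {n: x_star[n] / (x0[n] + epsilon) for n in adjacency}

for n in adjacency:
    print(n, round(x_star[n], 3), round(index[n], 3))
```

The diffused values decay with distance from A, and B ends up with a nonzero score even though it carried no initial signal, which is the sense in which diffusion captures information in a gene's network neighbourhood.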
Therapeutic gene targeting approaches for the treatment of dyslipidemias and atherosclerosis.
Mäkinen, Petri I; Ylä-Herttuala, Seppo
2013-04-01
Despite improved therapies, cardiovascular diseases are the leading cause of morbidity and mortality worldwide. Therefore, new therapeutic approaches are still needed. In the gene therapy field, RNA interference (RNAi) and regulation of microRNAs (miRNAs) have gained a lot of attention in addition to traditional overexpression based strategies. Here, recent findings in therapeutic gene silencing and modulation of small RNA expression related to atherogenesis and dyslipidemia are summarized. Novel gene therapy approaches for the treatment of hyperlipidemia have been addressed. Antisense oligonucleotide and RNAi-based therapies against apolipoprotein B100 and proprotein convertase subtilisin/kexin type 9 have already shown efficacy in preclinical and clinical trials. In addition, several miRNAs dysregulated in atherosclerotic lesions and regulating cholesterol homeostasis have been found, which may represent novel targets for future therapies. New therapies for lowering lipid levels are now being tested in clinical trials, and both antisense oligonucleotide and RNAi-based therapies have shown promising results in lowering cholesterol levels. However, the modulation of the inflammatory component in atherosclerosis by gene therapy and targeting of the effects to plaques are still difficult challenges.
ERIC Educational Resources Information Center
Joseph, Marjory
This competency-based preservice home economics teacher education module on applications and implications of new technology in textiles and clothing is the fourth in a set of four modules on consumer education related to textiles and clothing. (This set is part of a larger series of sixty-seven modules on the Management Approach to Teaching…
Bassuk, Alexander G.; Muthuswamy, Lakshmi B.; Boland, Riley; Smith, Tiffany L.; Hulstrand, Alissa M.; Northrup, Hope; Hakeman, Matthew; Dierdorff, Jason M.; Yung, Christina K.; Long, Abby; Brouillette, Rachel B.; Au, Kit Sing; Gurnett, Christina; Houston, Douglas W.; Cornell, Robert A.; Manak, J. Robert
2013-01-01
Neural tube defects (NTDs) are common birth defects of complex etiology. Family and population-based studies have confirmed a genetic component to NTDs. However, despite more than three decades of research, the genes involved in human NTDs remain largely unknown. We tested the hypothesis that rare copy number variants (CNVs), especially de novo germline CNVs, are a significant risk factor for NTDs. We used array-based comparative genomic hybridization (aCGH) to identify rare CNVs in 128 Caucasian and 61 Hispanic patients with non-syndromic lumbar-sacral myelomeningocele. We also performed aCGH analysis on the parents of affected individuals with rare CNVs where parental DNA was available (42 sets). Among the eight de novo CNVs that we identified, three generated copy number changes of entire genes. One large heterozygous deletion removed 27 genes, including PAX3, a known spina bifida-associated gene. A second CNV altered genes (PGPD8, ZC3H6) for which little is known regarding function or expression. A third heterozygous deletion removed GPC5 and part of GPC6, genes encoding glypicans. Glypicans are proteoglycans that modulate the activity of morphogens such as Sonic Hedgehog (SHH) and bone morphogenetic proteins (BMPs), both of which have been implicated in NTDs. Additionally, glypicans function in the planar cell polarity (PCP) pathway, and several PCP genes have been associated with NTDs. Here, we show that GPC5 orthologs are expressed in the neural tube, and that inhibiting their expression in frog and fish embryos results in NTDs. These results implicate GPC5 as a gene required for normal neural tube development. PMID:23223018
Write Proposals. Module CG B-2 of Category B--Supporting. Competency-Based Career Guidance Modules.
ERIC Educational Resources Information Center
Gustafson, Richard A.
This module is intended to help guidance personnel in a variety of educational and agency settings plan and develop successful proposals to assist in financing the improvement of existing or future career guidance programs. The module is one of a series of competency-based guidance program training packages focusing upon specific professional and…
Takahashi, Kei-ichiro; Takigawa, Ichigaku; Mamitsuka, Hiroshi
2013-01-01
Detecting biclusters from expression data is useful, since a bicluster is a set of genes coexpressed under only a subset of the given experimental conditions. We present software called SiBIC, which, from a given expression dataset, first exhaustively enumerates biclusters, then merges them into largely independent biclusters, and finally uses these to generate gene set networks in which each node is a set of coexpressed genes. We evaluated each step of this procedure: 1) significance of the generated biclusters biologically and statistically, 2) biological quality of merged biclusters, and 3) biological significance of gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp.
We propose the use of gene expression profiling to complement the chemical characterization currently based on HTS assay data and present a case study relevant to the Endocrine Disruptor Screening Program. We have developed computational methods to identify estrogen receptor &alp...
Laboratory Exercise in Behavioral Genetics Using Team-Based Learning Strategies
ERIC Educational Resources Information Center
Peterson, Elizabeth K.; Carrico, Pauline
2015-01-01
In this paper, we describe a two-week learning module where students tested the role of the "fruitless" gene on aggression and courtship in "Drosophila melanogaster" via team-based learning (TBL) strategies. The purpose of this module was to determine if TBL could be used in the future as a platform to implement the course…
Recreational music-making alters gene expression pathways in patients with coronary heart disease
Bittman, Barry; Croft, Daniel T.; Brinker, Jeannie; van Laar, Ryan; Vernalis, Marina N.; Ellsworth, Darrell L.
2013-01-01
Background Psychosocial stress profoundly impacts long-term cardiovascular health through adverse effects on sympathetic nervous system activity, endothelial dysfunction, and atherosclerotic development. Recreational Music Making (RMM) is a unique stress amelioration strategy encompassing group music-based activities that has great therapeutic potential for treating patients with stress-related cardiovascular disease. Material/Methods Participants (n=34) with a history of ischemic heart disease were subjected to an acute time-limited stressor, then randomized to RMM or quiet reading for one hour. Peripheral blood gene expression using GeneChip® Human Genome U133A 2.0 arrays was assessed at baseline, following stress, and after the relaxation session. Results Full gene set enrichment analysis identified 16 molecular pathways differentially regulated (P<0.005) during stress that function in immune response, cell mobility, and transcription. During relaxation, two pathways showed a significant change in expression in the control group, while 12 pathways governing immune function and gene expression were modulated among RMM participants. Only 13% (2/16) of pathways showed differential expression during stress and relaxation. Conclusions Human stress and relaxation responses may be controlled by different molecular pathways. Relaxation through active engagement in Recreational Music Making may be more effective than quiet reading at altering gene expression and thus more clinically useful for stress amelioration. PMID:23435350
Functional modules by relating protein interaction networks and gene expression.
Tornow, Sabine; Mewes, H W
2003-11-01
Genes and proteins are organized on the basis of their particular mutual relations or according to their interactions in cellular and genetic networks. These include metabolic or signaling pathways and protein interaction, regulatory or co-expression networks. Integrating the information from the different types of networks may lead to the notion of a functional network and functional modules. To find these modules, we propose a new technique which is based on collective, multi-body correlations in a genetic network. We calculated the correlation strength of a group of genes (e.g. in the co-expression network) which were identified as members of a module in a different network (e.g. in the protein interaction network) and estimated the probability that this correlation strength was found by chance. Groups of genes with a significant correlation strength in different networks have a high probability that they perform the same function. Here, we propose evaluating the multi-body correlations by applying the superparamagnetic approach. We compare our method to the presently applied mean Pearson correlations and show that our method is more sensitive in revealing functional relationships. PMID:14576317
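The baseline that this abstract compares against, the mean Pearson correlation of a gene group together with the probability of reaching that value by chance, can be sketched as follows. This is not the superparamagnetic approach proposed by the authors; the permutation scheme, function names, and toy expression values are illustrative assumptions.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_pairwise_correlation(group, expr):
    """Average Pearson correlation over all gene pairs in `group`."""
    pairs = list(combinations(group, 2))
    return sum(pearson(expr[a], expr[b]) for a, b in pairs) / len(pairs)


def chance_probability(group, expr, n_perm=1000, seed=0):
    """Fraction of random gene groups of the same size that score at
    least as high as `group` (with a +1 correction to avoid zero)."""
    rng = random.Random(seed)
    genes = list(expr)
    observed = mean_pairwise_correlation(group, expr)
    hits = sum(
        mean_pairwise_correlation(rng.sample(genes, len(group)), expr) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


# Toy expression data (hypothetical): M1-M3 form a module defined in another
# network, e.g. a protein interaction network; N1-N5 are unrelated genes.
expr = {
    "M1": [1, 2, 3, 4, 5, 6],
    "M2": [2, 4, 6, 8, 10, 12],
    "M3": [3, 4, 5, 6, 7, 8],
    "N1": [5, 1, 4, 2, 6, 3],
    "N2": [2, 6, 1, 5, 3, 4],
    "N3": [4, 3, 6, 1, 2, 5],
    "N4": [6, 2, 5, 3, 1, 4],
    "N5": [3, 5, 2, 6, 4, 1],
}
module = ["M1", "M2", "M3"]

score = mean_pairwise_correlation(module, expr)
p_value = chance_probability(module, expr)
print(round(score, 3), p_value)
```

A module that is tightly correlated in the expression data receives a high score and a small chance probability, which is the kind of cross-network agreement the abstract interprets as evidence of shared function.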
Analyzing the genes related to Alzheimer's disease via a network and pathway-based approach.
Hu, Yan-Shi; Xin, Juncai; Hu, Ying; Zhang, Lei; Wang, Ju
2017-04-27
Our understanding of the molecular mechanisms underlying Alzheimer's disease (AD) remains incomplete. Previous studies have revealed that genetic factors provide a significant contribution to the pathogenesis and development of AD. In the past years, numerous genes implicated in this disease have been identified via genetic association studies on candidate genes or at the genome-wide level. However, in many cases, the roles of these genes and their interactions in AD are still unclear. A comprehensive and systematic analysis focusing on the biological function and interactions of these genes in the context of AD will therefore provide valuable insights to understand the molecular features of the disease. In this study, we collected genes potentially associated with AD by screening publications on genetic association studies deposited in PubMed. The major biological themes linked with these genes were then revealed by function and biochemical pathway enrichment analysis, and the relation between the pathways was explored by pathway crosstalk analysis. Furthermore, the network features of these AD-related genes were analyzed in the context of human interactome and an AD-specific network was inferred using the Steiner minimal tree algorithm. We compiled 430 human genes reported to be associated with AD from 823 publications. Biological theme analysis indicated that the biological processes and biochemical pathways related to neurodevelopment, metabolism, cell growth and/or survival, and immunology were enriched in these genes. Pathway crosstalk analysis then revealed that the significantly enriched pathways could be grouped into three interlinked modules-neuronal and metabolic module, cell growth/survival and neuroendocrine pathway module, and immune response-related module-indicating an AD-specific immune-endocrine-neuronal regulatory network. Furthermore, an AD-specific protein network was inferred and novel genes potentially associated with AD were identified. 
By means of network and pathway-based methodology, we explored the pathogenetic mechanism underlying AD at a systems biology level. Results from our work could provide valuable clues for understanding the molecular mechanism underlying AD. In addition, the framework proposed in this study could be used to investigate the pathological molecular network and genes relevant to other complex diseases or phenotypes.
Counterselection method based on conditional silencing of antitoxin genes in Escherichia coli.
Tsukuda, Miyuki; Nakashima, Nobutaka; Miyazaki, Kentaro
2015-11-01
Counterselection is a genetic engineering technique to eliminate specific genetic fragments containing selectable marker genes. Although the technique is widely used in bacterial genome engineering and plasmid curing experiments, the repertoire of the markers usable in Escherichia coli is limited. Here we developed a novel counterselection method in E. coli based on antisense RNA (asRNA) technology directed against toxin-antitoxin (TA) modules. Under normal conditions, excess antitoxin neutralizes its cognate toxin and thus the module is stably maintained in the genome. We hypothesised that repression of an antitoxin gene would perturb cell growth due to the toxin being released. We designed asRNAs corresponding to all 19 type II antitoxins encoded in the E. coli genome. asRNAs were then conditionally expressed; repression of MqsA in the MqsR/MqsA module had the greatest inhibitory effect, followed by RnlB in the RnlA/RnlB module. The utility of asRNA(MqsA) as a counterselection marker was demonstrated by efficient plasmid curing and strain improvement experiments. Copyright © 2015 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
Finding quasi-modules of human and viral miRNAs: a case study of human cytomegalovirus (HCMV)
2012-01-01
Background: MicroRNAs (miRNAs) are important regulators of gene expression encoded by a variety of organisms, including viruses. Although the function of most viral miRNAs is currently unknown, there is evidence that both viral and host miRNAs contribute to the interactions between viruses and their hosts. miRNAs constitute a complex combinatorial network, where one miRNA may target many genes and one gene may be targeted by multiple miRNAs. In particular, viral and host miRNAs may also have mutual target genes. Based on published evidence linking viral and host miRNAs, there are three modes of mutual regulation: competing, cooperating, and compensating. Results: In this paper we explore the compensating mode of mutual regulation upon human cytomegalovirus (HCMV) infection, in which host miRNAs are downregulated and viral miRNAs compensate by mimicking their function. To achieve this, we develop a new algorithm that finds groups, called quasi-modules, of viral and host miRNAs and their mutual target genes, and we apply it to new host miRNA expression data for HCMV-infected and uninfected cells. For two of the reported quasi-modules, supporting evidence from the biological and medical literature is provided. Conclusions: The modules found by our method may advance the understanding of the role of miRNAs in host-viral interactions, and the genes in these modules may serve as candidates for further experimental validation. PMID:23206407
Acceleration of Age-Associated Methylation Patterns in HIV-1-Infected Adults
Sehl, Mary; Sinsheimer, Janet S.; Hultin, Patricia M.; Hultin, Lance E.; Quach, Austin; Martínez-Maza, Otoniel; Horvath, Steve; Vilain, Eric; Jamieson, Beth D.
2015-01-01
Patients with treated HIV-1-infection experience earlier occurrence of aging-associated diseases, raising speculation that HIV-1-infection, or antiretroviral treatment, may accelerate aging. We recently described an age-related co-methylation module comprised of hundreds of CpGs; however, it is unknown whether aging and HIV-1-infection exert negative health effects through similar, or disparate, mechanisms. We investigated whether HIV-1-infection would induce age-associated methylation changes. We evaluated DNA methylation levels at >450,000 CpG sites in peripheral blood mononuclear cells (PBMC) of young (20-35) and older (36-56) adults in two separate groups of participants. Each age group for each data set consisted of 12 HIV-1-infected and 12 age-matched HIV-1-uninfected samples for a total of 96 samples. The effects of age and HIV-1 infection on methylation at each CpG revealed a strong correlation of 0.49, p < 1 × 10^-200 and 0.47, p < 1 × 10^-200. Weighted gene correlation network analysis (WGCNA) identified 17 co-methylation modules; module 3 (ME3) was significantly correlated with age (cor = 0.70) and HIV-1 status (cor = 0.31). Older HIV-1+ individuals had a greater number of hypermethylated CpGs across ME3 (p = 0.015). In a multivariate model, ME3 was significantly associated with age and HIV status (Data set 1: βage = 0.007088, p = 2.08 × 10^-9; βHIV = 0.099574, p = 0.0011; Data set 2: βage = 0.008762, p = 1.27 × 10^-5; βHIV = 0.128649, p = 0.0001). Using this model, we estimate that HIV-1 infection accelerates age-related methylation by approximately 13.7 years in data set 1 and 14.7 years in data set 2. The genes related to CpGs in ME3 are enriched for polycomb group target genes known to be involved in cell renewal and aging. The overlap between ME3 and an aging methylation module found in solid tissues is also highly significant (Fisher-exact p = 5.6 × 10^-6, odds ratio = 1.91).
These data demonstrate that HIV-1 infection is associated with methylation patterns that are similar to age-associated patterns and suggest that general aging and HIV-1 related aging work through some common cellular and molecular mechanisms. These results are an important first step for finding potential therapeutic targets and novel clinical approaches to mitigate the detrimental effects of both HIV-1-infection and aging. PMID:25807146
Carbonetto, Peter; Stephens, Matthew
2013-01-01
Pathway analyses of genome-wide association studies aggregate information over sets of related genes, such as genes in common pathways, to identify gene sets that are enriched for variants associated with disease. We develop a model-based approach to pathway analysis, and apply this approach to data from the Wellcome Trust Case Control Consortium (WTCCC) studies. Our method offers several benefits over existing approaches. First, our method not only interrogates pathways for enrichment of disease associations, but also estimates the level of enrichment, which yields a coherent way to promote variants in enriched pathways, enhancing discovery of genes underlying disease. Second, our approach allows for multiple enriched pathways, a feature that leads to novel findings in two diseases where the major histocompatibility complex (MHC) is a major determinant of disease susceptibility. Third, by modeling disease as the combined effect of multiple markers, our method automatically accounts for linkage disequilibrium among variants. Interrogation of pathways from eight pathway databases yields strong support for enriched pathways, indicating links between Crohn's disease (CD) and cytokine-driven networks that modulate immune responses; between rheumatoid arthritis (RA) and “Measles” pathway genes involved in immune responses triggered by measles infection; and between type 1 diabetes (T1D) and IL2-mediated signaling genes. Prioritizing variants in these enriched pathways yields many additional putative disease associations compared to analyses without enrichment. For CD and RA, 7 of 8 additional non-MHC associations are corroborated by other studies, providing validation for our approach. For T1D, prioritization of IL-2 signaling genes yields strong evidence for 7 additional non-MHC candidate disease loci, as well as suggestive evidence for several more. 
Of the 7 strongest associations, 4 are validated by other studies, and 3 (near IL-2 signaling genes RAF1, MAPK14, and FYN) constitute novel putative T1D loci for further study. PMID:24098138
MYB46 Modulates Disease Susceptibility to Botrytis cinerea in Arabidopsis
Ramírez, Vicente; Agorio, Astrid; Coego, Alberto; García-Andrade, Javier; Hernández, M. José; Balaguer, Begoña; Ouwerkerk, Pieter B.F.; Zarra, Ignacio; Vera, Pablo
2011-01-01
In this study, we show that the Arabidopsis (Arabidopsis thaliana) transcription factor MYB46, previously described to regulate secondary cell wall biosynthesis in the vascular tissue of the stem, is pivotal for mediating disease susceptibility to the fungal pathogen Botrytis cinerea. We identified MYB46 by its ability to bind to a new cis-element located in the 5′ promoter region of the pathogen-induced Ep5C gene, which encodes a type III cell wall-bound peroxidase. We present genetic and molecular evidence indicating that MYB46 modulates the magnitude of Ep5C gene induction following pathogenic insults. Moreover, we demonstrate that different myb46 knockdown mutant plants exhibit increased disease resistance to B. cinerea, a phenotype that is accompanied by selective transcriptional reprogramming of a set of genes encoding cell wall proteins and enzymes, of which extracellular type III peroxidases are conspicuous. In essence, our results substantiate that defense-related signaling pathways and cell wall integrity are interconnected and that MYB46 likely functions as a disease susceptibility modulator to B. cinerea through the integration of cell wall remodeling and downstream activation of secondary lines of defense. PMID:21282403
ERIC Educational Resources Information Center
Smith, Sharman
This competency-based preservice home economics teacher education module on consumer rights and responsibilities is the third in a set of four core curriculum modules on consumer approach to homemaking education. (This set is part of a larger series of sixty-seven on the Management Approach to Teaching Consumer and Homemaking Education…
Sun, Mei-Yu; Li, Jing-Yi; Li, Dong; Huang, Feng-Jie; Wang, Di; Li, Hui; Xing, Quan; Zhu, Hui-Bin; Shi, Lei
2018-04-12
Drynaria roosii (Nakaike) is a traditional Chinese medicinal fern, known as 'GuSuiBu'. Its effective components, naringin and neoeriocitrin, share highly similar chemical structures and medicinal functions. Our HPLC-MS/MS results showed that the accumulation of naringin/neoeriocitrin depended on specific tissues or ages. However, little was known about the expression patterns of the naringin/neoeriocitrin-related genes involved in their regulatory pathways. Because basic genetic information was lacking, we applied a combination of SMRT sequencing and SGS to generate the complete and full-length transcriptome of D. roosii. According to the SGS data, DEG-based heat map analysis revealed that naringin/neoeriocitrin-related gene expression exhibited obvious tissue- and time-specific transcriptomic differences. Using the systems biology method of modular organization analysis, we clustered 16,472 DEGs into 17 gene modules and studied the relationships between modules and tissue/time point samples, as well as between modules and naringin/neoeriocitrin contents. Among these, naringin/neoeriocitrin-related DEGs were distributed in nine distinct modules, and DEGs in these modules showed significantly different patterns of transcript abundance linked with specific tissues or ages. Moreover, WGCNA results further identified PAL, 4CL and C4H, and C3H and HCT, as the major hub genes involved in naringin and neoeriocitrin synthesis, respectively; these genes exhibited high co-expression with MYB- and bHLH-regulated genes. In this work, modular organization and co-expression networks elucidated the tissue- and time-specificity of gene expression patterns, as well as the hub genes associated with naringin/neoeriocitrin synthesis in D. roosii. In addition, the comprehensive transcriptome dataset provides important genetic information for further research on D. roosii.
Down-weighting overlapping genes improves gene set analysis
2012-01-01
Background: The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how specific they are to a given gene set. Results: In this work we propose a new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores. The gene weights are designed to emphasize genes that appear in few gene sets over genes that appear in many gene sets. We demonstrate the usefulness of the method when analyzing gene sets that correspond to the KEGG pathways, and hence we called our method Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). Unlike most gene set analysis methods, which are validated through the analysis of 2-3 data sets followed by a human interpretation of the results, the validation employed here uses 24 different data sets and a completely objective assessment scheme that makes minimal assumptions and eliminates the need for possibly biased human assessments of the analysis results. Conclusions: PADOG significantly improves gene set ranking and boosts sensitivity of analysis using information already available in the gene expression profiles and the collection of gene sets to be analyzed. The advantages of PADOG over other existing approaches are shown to be stable to changes in the database of gene sets to be analyzed. PADOG was implemented as an R package available at: http://bioinformaticsprb.med.wayne.edu/PADOG/ or http://www.bioconductor.org. PMID:22713124
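The scoring idea in this abstract (mean of absolute weighted moderated t-scores, with weights favoring genes that appear in few sets) can be sketched as follows. This is not the PADOG R package's code: the weight formula, which maps the most widely shared gene to 1 and the most specific genes to 2, is an assumption chosen to match the description, and the function names, gene identifiers, and t-scores are hypothetical. Any later standardization of the scores is omitted.

```python
from collections import Counter
from math import sqrt

def gene_weights(gene_sets):
    """Weight each gene by how few gene sets contain it (assumed formula)."""
    freq = Counter(g for genes in gene_sets.values() for g in genes)
    lo, hi = min(freq.values()), max(freq.values())
    if hi == lo:  # every gene is equally frequent, so no down-weighting applies
        return {g: 1.0 for g in freq}
    return {g: 1.0 + sqrt((hi - f) / (hi - lo)) for g, f in freq.items()}

def set_score(genes, t_scores, weights):
    """Mean of absolute weighted t-scores over the genes that were measured."""
    vals = [abs(t_scores[g] * weights[g]) for g in genes if g in t_scores]
    return sum(vals) / len(vals) if vals else 0.0

# Toy input, invented for illustration: g2 belongs to all three sets.
toy_sets = {"A": {"g1", "g2"}, "B": {"g2", "g3"}, "C": {"g2", "g4"}}
toy_t = {"g1": 2.0, "g2": -3.0, "g3": 0.5, "g4": 1.0}

w = gene_weights(toy_sets)
scores = {name: set_score(genes, toy_t, w) for name, genes in toy_sets.items()}
print(scores)
```

In the toy data, the shared gene g2 has the largest t-score but receives the smallest weight, so the set-specific genes decide the ranking of the three sets.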
Feng, Yinling; Wang, Xuefeng
2017-03-01
In order to investigate commonly disturbed genes and pathways in various brain regions of patients with Parkinson's disease (PD), microarray datasets from previous studies were collected and systematically analyzed. Different normalization methods were applied to microarray datasets from different platforms. A strategy combining gene co‑expression networks and clinical information was adopted, using weighted gene co‑expression network analysis (WGCNA) to screen for commonly disturbed genes in different brain regions of patients with PD. Functional enrichment analysis of commonly disturbed genes was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). Co‑pathway relationships were identified with Pearson's correlation coefficient tests and a hypergeometric distribution‑based test. Common genes in pathway pairs were selected out and regarded as risk genes. A total of 17 microarray datasets from 7 platforms were retained for further analysis. Five gene coexpression modules were identified, containing 9,745, 736, 233, 101 and 93 genes, respectively. One module was significantly correlated with PD samples and thus the 736 genes it contained were considered to be candidate PD‑associated genes. Functional enrichment analysis demonstrated that these genes were implicated in oxidative phosphorylation and PD. A total of 44 pathway pairs and 52 risk genes were revealed, and a risk gene pathway relationship network was constructed. Eight modules were identified and were revealed to be associated with PD, cancers and metabolism. A number of disturbed pathways and risk genes were unveiled in PD, and these findings may help advance understanding of PD pathogenesis.
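A hypergeometric test of the kind mentioned above asks how surprising it is that a gene list and a pathway share at least k genes. The sketch below computes that upper-tail probability with the standard library only; the function name and the toy numbers in the usage line are illustrative assumptions, not values from the study.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K belong to the pathway."""
    upper = min(K, n)
    total = comb(N, n)
    # Sum the point probabilities for every overlap size from k up to the maximum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical example: 10 background genes, 3 in the pathway, 4 selected, >= 1 shared.
p_value = hypergeom_sf(1, 10, 3, 4)
print(p_value)
```

Small tail probabilities indicate an overlap larger than expected by chance; in practice the p-values from many pathway pairs would also need multiple-testing correction.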
ERIC Educational Resources Information Center
Lowe, Phyllis; And Others
This module, one of ten competency based modules developed for vocational home economics teachers, is based on a job cluster in the housing management field. It is designed for a variety of levels of learners (secondary, postsecondary, adult) in both school and non-school settings. Focusing on the specific job title of housing management aide,…
Cataldo, Ilaria; Azhari, Atiqah; Lepri, Bruno; Esposito, Gianluca
2017-10-21
Oxytocin plays an important role in the modulation of social behavior in both typical and atypical contexts. Also, the quality of early parental care sets the foundation for long-term psychosocial development. Here, we review studies that investigated how oxytocin receptor (OXTR) interacts with early parental care experiences to influence the development of psychiatric disorders. Using Pubmed, Scopus and PsycInfo databases, we utilized the keyword "OXTR" before subsequently searching for specific OXTR single nucleotide polymorphisms (SNPs), generating a list of 598 studies in total. The papers were catalogued in a database and filtered for gene-environment interaction, psychiatric disorders and involvement of parental care. In particular, rs53576 and rs2254298 were found to be significantly involved in gene-environment interactions that modulated risk for psychopathology and the following psychiatric disorders: disruptive behavior, depression, anxiety, eating disorder and borderline personality disorder. These results illustrate the importance of OXTR in mediating the impact of parental care on the emergence of psychopathology. Copyright © 2017 Elsevier Ltd. All rights reserved.
Genetic and environmental pathways to complex diseases.
Gohlke, Julia M; Thomas, Reuben; Zhang, Yonqing; Rosenstein, Michael C; Davis, Allan P; Murphy, Cynthia; Becker, Kevin G; Mattingly, Carolyn J; Portier, Christopher J
2009-05-05
Pathogenesis of complex diseases involves the integration of genetic and environmental factors over time, making it particularly difficult to tease apart relationships between phenotype, genotype, and environmental factors using traditional experimental approaches. Using gene-centered databases, we have developed a network of complex diseases and environmental factors through the identification of key molecular pathways associated with both genetic and environmental contributions. Comparison with known chemical disease relationships and analysis of transcriptional regulation from gene expression datasets for several environmental factors and phenotypes clustered in a metabolic syndrome and neuropsychiatric subnetwork supports our network hypotheses. This analysis identifies natural and synthetic retinoids, antipsychotic medications, Omega 3 fatty acids, and pyrethroid pesticides as potential environmental modulators of metabolic syndrome phenotypes through PPAR and adipocytokine signaling and organophosphate pesticides as potential environmental modulators of neuropsychiatric phenotypes. Identification of key regulatory pathways that integrate genetic and environmental modulators define disease associated targets that will allow for efficient screening of large numbers of environmental factors, screening that could set priorities for further research and guide public health decisions.
Fourtounis, Jimmy; Wang, I-Ming; Mathieu, Marie-Claude; Claveau, David; Loo, Tenneille; Jackson, Aimee L; Peters, Mette A; Therien, Alex G; Boie, Yves; Crackower, Michael A
2012-10-12
Oxidative Stress contributes to the pathogenesis of many diseases. The NRF2/KEAP1 axis is a key transcriptional regulator of the anti-oxidant response in cells. Nrf2 knockout mice have implicated this pathway in regulating inflammatory airway diseases such as asthma and COPD. To better understand the role the NRF2 pathway has on respiratory disease we have taken a novel approach to define NRF2 dependent gene expression in a relevant lung system. Normal human lung fibroblasts were transfected with siRNA specific for NRF2 or KEAP1. Gene expression changes were measured at 30 and 48 hours using a custom Affymetrix Gene array. Changes in Eotaxin-1 gene expression and protein secretion were further measured under various inflammatory conditions with siRNAs and pharmacological tools. An anti-correlated gene set (inversely regulated by NRF2 and KEAP1 RNAi) that reflects specific NRF2 regulated genes was identified. Gene annotations show that NRF2-mediated oxidative stress response is the most significantly regulated pathway, followed by heme metabolism, metabolism of xenobiotics by Cytochrome P450 and O-glycan biosynthesis. Unexpectedly the key eosinophil chemokine Eotaxin-1/CCL11 was found to be up-regulated when NRF2 was inhibited and down-regulated when KEAP1 was inhibited. This transcriptional regulation leads to modulation of Eotaxin-1 secretion from human lung fibroblasts under basal and inflammatory conditions, and is specific to Eotaxin-1 as NRF2 or KEAP1 knockdown had no effect on the secretion of a set of other chemokines and cytokines. Furthermore, the known NRF2 small molecule activators CDDO and Sulphoraphane can also dose dependently inhibit Eotaxin-1 release from human lung fibroblasts. These data uncover a previously unknown role for NRF2 in regulating Eotaxin-1 expression and further the mechanistic understanding of this pathway in modulating inflammatory lung disease. PMID:23061798
Estimation of gene induction enables a relevance-based ranking of gene sets.
Bartholomé, Kilian; Kreutz, Clemens; Timmer, Jens
2009-07-01
In order to handle and interpret the vast amounts of data produced by microarray experiments, the analysis of sets of genes with a common biological functionality has been shown to be advantageous compared to single gene analyses. Some statistical methods have been proposed to analyse the differential gene expression of gene sets in microarray experiments. However, most of these methods either require threshhold values to be chosen for the analysis, or they need some reference set for the determination of significance. We present a method that estimates the number of differentially expressed genes in a gene set without requiring a threshold value for significance of genes. The method is self-contained (i.e., it does not require a reference set for comparison). In contrast to other methods which are focused on significance, our approach emphasizes the relevance of the regulation of gene sets. The presented method measures the degree of regulation of a gene set and is a useful tool to compare the induction of different gene sets and place the results of microarray experiments into the biological context. An R-package is available.
Dai, Guanping; Sun, Tao; Miao, Liangtian; Li, Qingyan; Xiao, Dongguang; Zhang, Xueli
2014-08-01
β-carotene belongs to the carotenoid family and is widely applied in the pharmaceutical, nutraceutical, cosmetics and food industries. In this study, three key genes (dxs, idi, and crt operon) within the β-carotene synthetic pathway in recombinant Escherichia coli strain CAR005 were modulated with an RBS library to improve β-carotene production. β-carotene yield increased by 7%, 11% and 17% after modulating the dxs, idi and crt operon genes, respectively, demonstrating that modulating gene expression with libraries of regulatory parts offers more opportunities to obtain optimal production of a target compound. Combined modulation of the crt operon, dxs and idi genes led to a 35% increase in β-carotene yield compared to the parent strain CAR005. The optimal gene expression strength identified in single-gene modulation was not necessarily the optimal strength in combined modulation. Our study provides a new strategy for improving production of a target compound through modulation of gene expression.
Li, Zhao-Qun; Zhang, Shuai; Ma, Yan; Luo, Jun-Yu; Wang, Chun-Yi; Lv, Li-Min; Dong, Shuang-Lin; Cui, Jin-Jie
2013-01-01
Background Chrysopa pallens (Rambur) are the most important natural enemies and predators of various agricultural pests. Understanding the sophisticated olfactory system in insect antennae is crucial for studying the physiological bases of olfaction and also could lead to effective applications of C. pallens in integrated pest management. However no transcriptome information is available for Neuroptera, and sequence data for C. pallens are scarce, so obtaining more sequence data is a priority for researchers on this species. Results To facilitate identifying sets of genes involved in olfaction, a normalized transcriptome of C. pallens was sequenced. A total of 104,603 contigs were obtained and assembled into 10,662 clusters and 39,734 singletons; 20,524 were annotated based on BLASTX analyses. A large number of candidate chemosensory genes were identified, including 14 odorant-binding proteins (OBPs), 22 chemosensory proteins (CSPs), 16 ionotropic receptors, 14 odorant receptors, and genes potentially involved in olfactory modulation. To better understand the OBPs, CSPs and cytochrome P450s, phylogenetic trees were constructed. In addition, 10 digital gene expression libraries of different tissues were constructed and gene expression profiles were compared among different tissues in males and females. Conclusions Our results provide a basis for exploring the mechanisms of chemoreception in C. pallens, as well as other insects. The evolutionary analyses in our study provide new insights into the differentiation and evolution of insect OBPs and CSPs. Our study provided large-scale sequence information for further studies in C. pallens. PMID:23826220
Bøhn, Siv K; Myhrstad, Mari C; Thoresen, Magne; Holden, Marit; Karlsen, Anette; Tunheim, Siv Haugen; Erlund, Iris; Svendsen, Mette; Seljeflot, Ingebjørg; Moskaug, Jan O; Duttaroy, Asim K; Laake, Petter; Arnesen, Harald; Tonstad, Serena; Collins, Andrew; Drevon, Christan A; Blomhoff, Rune
2010-09-16
Plant-based diets rich in fruit and vegetables can prevent development of several chronic age-related diseases. However, the mechanisms behind this protective effect are not elucidated. We have tested the hypothesis that intake of antioxidant-rich foods can affect groups of genes associated with cellular stress defence in human blood cells. NCT00520819 http://clinicaltrials.gov. In an 8-week dietary intervention study, 102 healthy male smokers were randomised to either a diet rich in various antioxidant-rich foods, a kiwifruit diet (three kiwifruits/d added to the regular diet) or a control group. Blood cell gene expression profiles were obtained from 10 randomly selected individuals of each group. Diet-induced changes on gene expression were compared to controls using a novel application of the gene set enrichment analysis (GSEA) on transcription profiles obtained using Affymetrix HG-U133-Plus 2.0 whole genome arrays. Changes were observed in the blood cell gene expression profiles in both intervention groups when compared to the control group. Groups of genes involved in regulation of cellular stress defence, such as DNA repair, apoptosis and hypoxia, were significantly upregulated (GSEA, FDR q-values < 5%) by both diets compared to the control group. Genes with common regulatory motifs for aryl hydrocarbon receptor (AhR) and AhR nuclear translocator (AhR/ARNT) were upregulated by both interventions (FDR q-values < 5%). Plasma antioxidant biomarkers (polyphenols/carotenoids) increased in both groups. The observed changes in the blood cell gene expression profiles suggest that the beneficial effects of a plant-based diet on human health may be mediated through optimization of defence processes.
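The enrichment statistic behind GSEA can be illustrated with a simplified running-sum sketch. This is the unweighted, Kolmogorov-Smirnov-like form; the analysis in the abstract would additionally weight genes by their ranking metric and estimate FDR q-values by permutation, both of which are omitted here. The function name and the toy gene list are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    Walk down the ranked list, stepping up at genes in the set and down at
    all other genes; return the largest deviation from zero along the walk.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(hits) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Toy ranking from most up-regulated to most down-regulated (invented labels).
ranked = ["a", "b", "c", "d"]
print(enrichment_score(ranked, {"a", "b"}))
print(enrichment_score(ranked, {"c", "d"}))
```

A positive score means the set is concentrated at the top of the ranking (up-regulated relative to controls), and a negative score means it sits at the bottom.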
Xu, Wei-Ming; Yang, Kuo; Jiang, Li-Jie; Hu, Jing-Qing; Zhou, Xue-Zhong
2018-01-01
Background: Ischemic heart disease (IHD) has been the leading cause of death globally for several decades, and IHD patients usually present the symptoms of phlegm-stasis cementation syndrome (PSCS) as significant complications. However, the underlying molecular mechanisms of PSCS complicated with IHD have not yet been fully elucidated. Materials and Methods: Network medicine methods were utilized to elucidate the underlying molecular mechanisms of IHD phenotypes. Firstly, high-quality IHD-associated genes from both a human curated disease-gene association database and the biomedical literature were integrated. Secondly, the IHD disease modules were obtained by dissecting the protein-protein interaction (PPI) topological modules in the String V9.1 database and mapping IHD-associated genes to the PPI topological modules. After that, molecular functional analyses (e.g., Gene Ontology and pathway enrichment analyses) for these IHD disease modules were conducted. Finally, the PSCS syndrome modules were identified by mapping the PSCS-related symptom-genes to the IHD disease modules, which were further validated by both pharmacological and physiological evidence derived from the published literature. Results: A total of 1,056 high-quality IHD-associated genes were integrated and evaluated. In addition, eight IHD disease modules (the PPI sub-networks significantly relevant to IHD) were identified, of which two disease modules were relevant to PSCS syndrome (i.e., two PSCS syndrome modules). These two modules were enriched in the Toll-like receptor signaling pathway (hsa04620) and the Renin-angiotensin system (hsa04614), with the molecular functions of angiotensin maturation (GO:0002003) and response to bacterium (GO:0009617), which were validated by classical Chinese herbal formula-related targets, IHD-related drug targets, and the phenotype features derived from the human phenotype ontology (HPO) and the published biomedical literature.
Conclusion: A network medicine-based approach was proposed to identify the underlying molecular modules of PSCS complicated with IHD, which could be used for interpreting the pharmacological mechanisms of well-established Chinese herbal formulas (e.g., Tao Hong Si Wu Tang, Dan Shen Yin, Huang Lian Wen Dan Tang and Gua Lou Xie Bai Ban Xia Tang). In addition, these results delivered novel understandings of the molecular network mechanisms of IHD phenotype subtypes with PSCS complications, which would be insightful both for IHD precision medicine and for the integration of disease and TCM syndrome diagnoses. PMID:29403392
Xu, Tao; Li, Yongchao; He, Zhili; Van Nostrand, Joy D.; Zhou, Jizhong
2017-01-01
Essential gene functions remain largely underexplored in bacteria. Clostridium cellulolyticum is a promising candidate for consolidated bioprocessing; however, its genetic manipulation to reduce the formation of less-valuable acetate is technically challenging due to the essentiality of acetate-producing genes. Here we developed a Cas9 nickase-assisted chromosome-based RNA repression to stably manipulate essential genes in C. cellulolyticum. Our plasmid-based expression of antisense RNA (asRNA) molecules targeting the phosphotransacetylase (pta) gene successfully reduced the enzymatic activity by 35% in cellobiose-grown cells, metabolically decreased the acetate titer by 15 and 52% in wildtype transformants on cellulose and xylan, respectively. To control both acetate and lactate simultaneously, we transformed the repression plasmid into lactate production-deficient mutant and found the plasmid delivery reduced acetate titer by more than 33%, concomitant with negligible lactate formation. The strains with pta gene repression generally diverted more carbon into ethanol. However, further testing on chromosomal integrants that were created by double-crossover recombination exhibited only very weak repression because DNA integration dramatically lessened gene dosage. With the design of a tandem repetitive promoter-driven asRNA module and the use of a new Cas9 nickase genome editing tool, a chromosomal integrant (LM3P) was generated in a single step and successfully enhanced RNA repression, with a 27% decrease in acetate titer on cellulose in antibiotic-free medium. These results indicate the effectiveness of tandem promoter-driven RNA repression modules in promoting gene repression in chromosomal integrants. 
Our combinatorial method using a Cas9 nickase genome editing tool to integrate the gene repression module demonstrates easy-to-use and high-efficiency advantages, paving the way for stably manipulating genes, even essential ones, for functional characterization and microbial engineering. PMID:28936208
Hopp, Lydia; Lembcke, Kathrin; Binder, Hans; Wirth, Henry
2013-01-01
We present an analytic framework based on Self-Organizing Map (SOM) machine learning to study large-scale patient data sets. The potency of the approach is demonstrated in a case study using gene expression data of more than 200 mature aggressive B-cell lymphoma patients. The method portrays each sample with individual resolution, characterizes the subtypes, disentangles the expression patterns into distinct modules, extracts their functional context using enrichment techniques and enables investigation of the similarity relations between the samples. The method also allows the detection and correction of outliers caused by contaminations. Based on our analysis, we propose a refined classification of B-cell lymphoma into four molecular subtypes which are characterized by differential functional and clinical characteristics. PMID:24833231
A Risk Stratification Model for Lung Cancer Based on Gene Coexpression Network and Deep Learning
2018-01-01
Risk stratification models for lung cancer based on gene expression profiles are of great interest. Instead of previous models based on individual prognostic genes, we aimed to develop a novel system-level risk stratification model for lung adenocarcinoma based on a gene coexpression network. Using multiple microarray datasets, gene coexpression network analysis was performed to identify survival-related networks. A deep learning-based risk stratification model was constructed with representative genes of these networks. The model was validated in two test sets. Survival analysis was performed using the output of the model to evaluate whether it could predict patients' survival independent of clinicopathological variables. Five networks were significantly associated with patients' survival. Considering prognostic significance and representativeness, genes of the two survival-related networks were selected for input of the model. The output of the model was significantly associated with patients' survival in two test sets and the training set (p < 0.00001, p < 0.0001 and p = 0.02 for training and test sets 1 and 2, respectively). In multivariate analyses, the model was associated with patients' prognosis independent of other clinicopathological features. Our study presents a new perspective on incorporating gene coexpression networks into the gene expression signature and on the clinical application of deep learning in genomic data science for prognosis prediction. PMID:29581968
Lusk, Ryan; Saba, Laura M; Vanderlinden, Lauren A; Zidek, Vaclav; Silhavy, Jan; Pravenec, Michal; Hoffman, Paula L; Tabakoff, Boris
2018-04-24
A statistical pipeline was developed and used for determining candidate genes and candidate gene co-expression networks involved in two alcohol (i.e., ethanol) metabolism phenotypes, namely alcohol clearance and acetate area under the curve (AUC), in a recombinant inbred (HXB/BXH) rat panel. The approach was also used to provide an indication of how ethanol metabolism can impact the normal function of the identified networks. RNA was extracted from alcohol-naïve liver tissue of 30 strains of HXB/BXH recombinant inbred rats. The reconstructed transcripts were quantitated and the data were used to construct gene co-expression modules and networks. A separate group of rats, comprising the same 30 strains, were injected with ethanol (2 gm/kg) for measurement of blood ethanol and acetate levels. These data were used for QTL analysis of the rate of ethanol disappearance and circulating acetate levels. The analysis pipeline required calculation of the module eigengene values, the correlation of these values with ethanol metabolism rates and acetate levels across the rat strains, and the determination of the eigengene QTLs. For a module to be considered a candidate for determining phenotype, the module eigengene values had to have significant correlation with the strain phenotypic values and the module eigengene QTLs had to overlap the phenotypic QTLs. Of the 658 transcript co-expression modules generated from liver RNA sequencing data, a single module satisfied all criteria for being a candidate for determining the alcohol clearance trait. This module contained two alcohol dehydrogenase genes, including the gene whose product was previously shown to be responsible for the majority of alcohol elimination in the rat. This module was also the only module identified as a candidate for influencing circulating acetate levels. This module was also linked to the process of generation and utilization of retinoic acid as related to the autonomous immune response.
We propose that our analytical pipeline can successfully identify genetic regions and transcripts which predispose a particular phenotype and our analysis provides functional context for co-expression module components. This article is protected by copyright. All rights reserved.
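The two candidacy criteria stated in this abstract (eigengene-phenotype correlation and overlap between eigengene QTLs and phenotypic QTLs) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the `(chromosome, start, end)` QTL representation, and the fixed `min_abs_r` cutoff, which stands in for the significance test that the abstract mentions but does not specify.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def qtls_overlap(qtl_a, qtl_b):
    """True if two (chromosome, start, end) intervals share any position."""
    chrom_a, start_a, end_a = qtl_a
    chrom_b, start_b, end_b = qtl_b
    return chrom_a == chrom_b and start_a <= end_b and start_b <= end_a

def is_candidate(eigengene, phenotype, module_qtls, phenotype_qtls, min_abs_r=0.5):
    """Hypothetical filter: strong correlation AND at least one shared QTL region.

    min_abs_r is an illustrative cutoff, not a value taken from the abstract.
    """
    correlated = abs(pearson(eigengene, phenotype)) >= min_abs_r
    shared_qtl = any(qtls_overlap(m, p) for m in module_qtls for p in phenotype_qtls)
    return correlated and shared_qtl
```

Here `eigengene` and `phenotype` would hold one value per rat strain, in the same strain order. A real analysis would replace the fixed cutoff with a p-value corrected for the number of modules tested.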
Novel gene sets improve set-level classification of prokaryotic gene expression data.
Holec, Matěj; Kuželka, Ondřej; Železný, Filip
2015-10-28
Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets, defined typically on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will enable the learning of more accurate classifiers. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.
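The conversion from gene-level to set-level features described in this abstract can be sketched as below. This is an illustrative assumption rather than the authors' procedure: the abstract does not say how genes are aggregated within a set, so the arithmetic mean is used as one simple choice, and the name `set_level_features` is hypothetical.

```python
def set_level_features(expression, gene_sets):
    """Collapse a gene -> expression mapping into a gene-set -> feature mapping.

    Each set's feature is the mean expression of its measured member genes
    (an assumed aggregation). Sets with no measured genes are skipped.
    """
    features = {}
    for set_name, genes in gene_sets.items():
        values = [expression[g] for g in genes if g in expression]
        if values:
            features[set_name] = sum(values) / len(values)
    return features
```

Applied to each sample in turn, this produces one lower-dimensional feature vector per sample, which can then be passed to any standard classifier.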
Microglia recapitulate a hematopoietic master regulator network in the aging human frontal cortex
Wehrspaun, Claudia C.; Haerty, Wilfried; Ponting, Chris P.
2015-01-01
Microglia form the immune system of the brain. Previous studies in cell cultures and animal models suggest altered activation states and cellular senescence in the aged brain. Instead, we analyzed 3 transcriptome data sets from the postmortem frontal cortex of 381 control individuals to show that microglia gene markers assemble into a transcriptional module in a gene coexpression network. These markers predominantly represented M1 and M1/M2b activation phenotypes. Expression of genes in this module generally declines over the adult life span. This decrease was more pronounced in microglia surface receptors for microglia and/or neuron crosstalk than in markers for activation state phenotypes. In addition to these receptors for exogenous signals, microglia are controlled by brain-expressed regulatory factors. We identified a subnetwork of transcription factors, including RUNX1, IRF8, PU.1, and TAL1, which are master regulators (MRs) for the age-dependent microglia module. The causal contributions of these MRs on the microglia module were verified using publicly available ChIP-Seq data. Interactions of these key MRs were preserved in a protein-protein interaction network. Importantly, these MRs appear to be essential for regulating microglia homeostasis in the adult human frontal cortex in addition to their crucial roles in hematopoiesis and myeloid cell-fate decisions during embryogenesis. PMID:26002684
Ray, Sumanta; Hossain, Sk Md Mosaddek; Khatun, Lutfunnesa; Mukhopadhyay, Anirban
2017-12-20
Alzheimer's disease (AD) is a chronic neurodegenerative disorder of the brain that involves large-scale transcriptomic variation. The disease does not impact every region of the brain at the same time; instead, it progresses slowly, involving somewhat sequential interaction with different regions. Analysis of the expression patterns of genes in the different brain regions affected in AD should contribute to an enhanced comprehension of AD pathogenesis and shed light on the early characterization of the disease. Here, we propose a framework to identify perturbation and preservation characteristics of gene expression patterns across six distinct regions of the brain ("EC", "HIP", "PC", "MTG", "SFG", and "VCX") affected in AD. Co-expression modules were discovered by considering a pair of regions at a time. These were then analyzed to determine their preservation and perturbation characteristics. Different module preservation statistics and a rank aggregation mechanism were adopted to detect the changes of expression patterns across brain regions. Gene Ontology (GO) and pathway-based analyses were also carried out to determine the biological meaning of preserved and perturbed modules. In this article, we have extensively studied the preservation patterns of co-expressed modules in six distinct brain regions affected in AD. Some modules emerged as the most preserved, while others were detected as perturbed between a pair of brain regions. Further investigation of the topological properties of preserved and non-preserved modules reveals a substantial association between the "betweenness centrality" and "degree" of the involved genes. Our findings may provide a deeper understanding of the preservation characteristics of gene expression patterns in discrete brain regions affected by AD.
QuadBase2: web server for multiplexed guanine quadruplex mining and visualization
Dhapola, Parashar; Chowdhury, Shantanu
2016-01-01
DNA guanine quadruplexes or G4s are non-canonical DNA secondary structures which affect genomic processes like replication, transcription and recombination. G4s are computationally identified by specific nucleotide motifs which are also called putative G4 (PG4) motifs. Despite the general relevance of these structures, there is currently no tool available that can allow batch queries and genome-wide analysis of these motifs in a user-friendly interface. QuadBase2 (quadbase.igib.res.in) presents a completely reinvented web server version of the previously published QuadBase database. QuadBase2 enables users to mine PG4 motifs in up to 178 eukaryotes through the EuQuad module. This module interfaces with the Ensembl Compara database to allow users to mine PG4 motifs in the orthologues of genes of interest across eukaryotes. PG4 motifs can be mined across genes and their promoter sequences in 1719 prokaryotes through the ProQuad module. This module includes a feature that allows genome-wide mining of PG4 motifs and their visualization as circular histograms. TetraplexFinder, the module for mining PG4 motifs in user-provided sequences, is now capable of handling up to 20 MB of data. QuadBase2 is a comprehensive PG4 motif mining tool that further expands the configurations and algorithms for mining PG4 motifs in a user-friendly way. PMID:27185890
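To make the idea of motif-based PG4 identification concrete, here is a minimal sketch of a regular-expression scan. It is not QuadBase2's algorithm; it assumes the widely used pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides, and it scans one strand only. The names `PG4_PATTERN` and `find_pg4` are hypothetical.

```python
import re

# Assumed motif: G{3,} N{1,7} G{3,} N{1,7} G{3,} N{1,7} G{3,}.
# The lookahead reports one match per start position, so overlapping motifs are kept.
PG4_PATTERN = re.compile(
    r"(?=(G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}))"
)

def find_pg4(sequence):
    """Return (start, motif) pairs for putative G4 motifs on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in PG4_PATTERN.finditer(seq)]

if __name__ == "__main__":
    print(find_pg4("TTGGGAGGGTGGGAGGGTT"))  # [(2, 'GGGAGGGTGGGAGGG')]
```

Motifs on the complementary strand could be found by running the same scan on the reverse complement. Loop-length and stem-length limits are the kind of configuration a dedicated tool would expose.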
Identifying pathogenic processes by integrating microarray data with prior knowledge
2014-01-01
Background: It is of great importance to identify molecular processes and pathways that are involved in disease etiology. Although there has been an extensive use of various high-throughput methods for this task, pathogenic pathways are still not completely understood. Often the set of genes or proteins identified as altered in genome-wide screens shows a poor overlap with canonical disease pathways. These findings are difficult to interpret, yet crucial in order to improve the understanding of the molecular processes underlying the disease progression. We present a novel method for identifying groups of connected molecules from a set of differentially expressed genes. These groups represent functional modules sharing common cellular function and involve signaling and regulatory events. Specifically, our method makes use of Bayesian statistics to identify groups of co-regulated genes based on the microarray data, where external information about molecular interactions and connections is used as priors in the group assignments. Markov chain Monte Carlo sampling is used to search for the most reliable grouping. Results: Simulation results showed that the method improved the ability to identify correct groups compared to traditional clustering, especially for small sample sizes. Applied to a microarray heart failure dataset, the method found one large cluster with several genes important for the structure of the extracellular matrix and a smaller group with many genes involved in carbohydrate metabolism. The method was also applied to a microarray dataset on melanoma cancer patients with or without metastasis, where the main cluster was dominated by genes related to keratinocyte differentiation. Conclusion: Our method found clusters overlapping with known pathogenic processes, but also pointed to new connections extending beyond the classical pathways. PMID:24758699
Combining multiple tools outperforms individual methods in gene set enrichment analyses.
Alhamdoosh, Monther; Ng, Milica; Wilson, Nicholas J; Sheridan, Julie M; Huynh, Huy; Wilson, Michael J; Ritchie, Matthew E
2017-02-01
Gene set enrichment (GSE) analysis allows researchers to efficiently extract biological insight from long lists of differentially expressed genes by interrogating them at a systems level. In recent years, there has been a proliferation of GSE analysis methods and hence it has become increasingly difficult for researchers to select an optimal GSE tool based on their particular dataset. Moreover, the majority of GSE analysis methods do not allow researchers to simultaneously compare gene set level results between multiple experimental conditions. The Ensemble of Gene Set Enrichment Analyses (EGSEA) is a method developed for RNA-sequencing data that combines results from twelve algorithms and calculates collective gene set scores to improve the biological relevance of the highest ranked gene sets. EGSEA's gene set database contains around 25 000 gene sets from sixteen collections. It has multiple visualization capabilities that allow researchers to view gene sets at various levels of granularity. EGSEA has been tested on simulated data and on a number of human and mouse datasets and, based on biologists' feedback, consistently outperforms the individual tools that have been combined. Our evaluation demonstrates the superiority of the ensemble approach for GSE analysis, and its utility to effectively and efficiently extrapolate biological functions and potential involvement in disease processes from lists of differentially regulated genes. EGSEA is available as an R package at http://www.bioconductor.org/packages/EGSEA/. The gene set collections are available in the R package EGSEAdata from http://www.bioconductor.org/packages/EGSEAdata/. Contact: monther.alhamdoosh@csl.com.au or mritchie@wehi.edu.au. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Disclosing the Molecular Basis of the Postharvest Life of Berry in Different Grapevine Genotypes
Fasoli, Marianna; Amato, Alessandra; Anesi, Andrea; Ceoldo, Stefania; Avesani, Linda; Pezzotti, Mario
2016-01-01
The molecular events that characterize postripening grapevine berries have rarely been investigated and are poorly defined. In particular, a detailed definition of changes occurring during the postharvest dehydration, a process undertaken to make some particularly special wine styles, would be of great interest for both winemakers and plant biologists. We report an exhaustive survey of transcriptomic and metabolomic responses in berries representing six grapevine genotypes subjected to postharvest dehydration under identical controlled conditions. The modulation of phenylpropanoid metabolism clearly distinguished the behavior of genotypes, with stilbene accumulation as the major metabolic event, although the transient accumulation/depletion of anthocyanins and flavonols was the prevalent variation in genotypes that do not accumulate stilbenes. The modulation of genes related to phenylpropanoid/stilbene metabolism highlighted the distinct metabolomic plasticity of genotypes, allowing for the identification of candidate structural and regulatory genes. In addition to genotype-specific responses, a core set of genes was consistently modulated in all genotypes, representing the common features of berries undergoing dehydration and/or commencing senescence. This included genes controlling ethylene and auxin metabolism as well as genes involved in oxidative and osmotic stress, defense responses, anaerobic respiration, and cell wall and carbohydrate metabolism. Several transcription factors were identified that may control these shared processes in the postharvest berry. Changes representing both common and genotype-specific responses to postharvest conditions shed light on the cellular processes taking place in harvested berries stored under dehydrating conditions for several months. PMID:27670818
Te, Jerez A.; AbdulHameed, Mohamed Diwan M.
2016-01-01
Organ injuries caused by environmental chemical exposures or use of pharmaceutical drugs pose a serious health risk that may be difficult to assess because of a lack of non‐invasive diagnostic tests. Mapping chemical injuries to organ‐specific histopathology outcomes via biomarkers will provide a foundation for designing precise and robust diagnostic tests. We identified co‐expressed genes (modules) specific to injury endpoints using the Open Toxicogenomics Project‐Genomics Assisted Toxicity Evaluation System (TG‐GATEs) – a toxicogenomics database containing organ‐specific gene expression data matched to dose‐ and time‐dependent chemical exposures and adverse histopathology assessments in Sprague–Dawley rats. We proposed a protocol for selecting gene modules associated with chemical‐induced injuries that classify 11 liver and eight kidney histopathology endpoints based on dose‐dependent activation of the identified modules. We showed that the activation of the modules for a particular chemical exposure condition, i.e., chemical‐time‐dose combination, correlated with the severity of histopathological damage in a dose‐dependent manner. Furthermore, the modules could distinguish different types of injuries caused by chemical exposures as well as determine whether the injury module activation was specific to the tissue of origin (liver and kidney). The generated modules provide a link between toxic chemical exposures, different molecular initiating events among underlying molecular pathways and resultant organ damage. Published 2016. This article is a U.S. Government work and is in the public domain in the USA. Journal of Applied Toxicology published by John Wiley & Sons, Ltd. PMID:26725466
Shen, Haoran; Liang, Zhou; Zheng, Saihua; Li, Xuelian
2017-01-01
The purpose of this study was to identify promising candidate genes and pathways in polycystic ovary syndrome (PCOS). Microarray dataset GSE345269 obtained from the Gene Expression Omnibus database includes 7 granulosa cell samples from PCOS patients, and 3 normal granulosa cell samples. Differentially expressed genes (DEGs) were screened between PCOS and normal samples. Pathway enrichment analysis was conducted for DEGs using ClueGO and CluePedia plugin of Cytoscape. A Reactome functional interaction (FI) network of the DEGs was built using ReactomeFIViz, and then network modules were extracted, followed by pathway enrichment analysis for the modules. Expression of DEGs in granulosa cell samples was measured using quantitative RT-PCR. A total of 674 DEGs were retained, which were significantly enriched with inflammation and immune-related pathways. Eight modules were extracted from the Reactome FI network. Pathway enrichment analysis revealed significant pathways of each module: module 0, Regulation of RhoA activity and Signaling by Rho GTPases pathways shared ARHGAP4 and ARHGAP9; module 2, GlycoProtein VI-mediated activation cascade pathway was enriched with RHOG; module 3, Thromboxane A2 receptor signaling, Chemokine signaling pathway, CXCR4-mediated signaling events pathways were enriched with LYN, the hub gene of module 3. Results of RT-PCR confirmed the finding of the bioinformatic analysis that ARHGAP4, ARHGAP9, RHOG and LYN were significantly upregulated in PCOS. RhoA-related pathways, GlycoProtein VI-mediated activation cascade pathway, ARHGAP4, ARHGAP9, RHOG and LYN may be involved in the pathogenesis of PCOS. PMID:28949383
Tylee, Daniel S; Hess, Jonathan L; Quinn, Thomas P; Barve, Rahul; Huang, Hailiang; Zhang-James, Yanli; Chang, Jeffrey; Stamova, Boryana S; Sharp, Frank R; Hertz-Picciotto, Irva; Faraone, Stephen V; Kong, Sek Won; Glatt, Stephen J
2017-04-01
Blood-based microarray studies comparing individuals affected with autism spectrum disorder (ASD) and typically developing individuals help characterize differences in circulating immune cell functions and offer potential biomarker signal. We sought to combine the subject-level data from previously published studies by mega-analysis to increase the statistical power. We identified studies that compared ex vivo blood or lymphocytes from ASD-affected individuals and unrelated comparison subjects using Affymetrix or Illumina array platforms. Raw microarray data and clinical meta-data were obtained from seven studies, totaling 626 affected and 447 comparison subjects. Microarray data were processed using uniform methods. Covariate-controlled mixed-effect linear models were used to identify gene transcripts and co-expression network modules that were significantly associated with diagnostic status. Permutation-based gene-set analysis was used to identify functionally related sets of genes that were over- and under-expressed among ASD samples. Our results were consistent with diminished interferon-, EGF-, PDGF-, PI3K-AKT-mTOR-, and RAS-MAPK-signaling cascades, and increased ribosomal translation and NK-cell related activity in ASD. We explored evidence for sex-differences in the ASD-related transcriptomic signature. We also demonstrated that machine-learning classifiers using blood transcriptome data perform with moderate accuracy when data are combined across studies. Comparing our results with those from blood-based studies of protein biomarkers (e.g., cytokines and trophic factors), we propose that ASD may feature decoupling between certain circulating signaling proteins (higher in ASD samples) and the transcriptional cascades which they typically elicit within circulating immune cells (lower in ASD samples). These findings provide insight into ASD-related transcriptional differences in circulating immune cells. © 2016 Wiley Periodicals, Inc. 
PMID:27862943
Mozduri, Z; Bakhtiarizadeh, M R; Salehi, A
2018-06-01
Negative energy balance (NEB) is an altered metabolic state in modern high-yielding dairy cows. This metabolic state occurs in the early postpartum period when energy demands for milk production and maintenance exceed energy intake. Negative energy balance or poor adaptation to this metabolic state has important effects on the liver and can lead to metabolic disorders and reduced fertility. The roles of regulatory factors, including transcription factors (TFs) and micro RNAs (miRNAs), have often been studied separately in evaluating NEB. However, the adaptive response to NEB is controlled by complex gene networks and is still not fully understood. In this study, we aimed to discover the integrated gene regulatory networks involved in NEB development in liver tissue. We downloaded data sets including mRNA and miRNA expression profiles related to three and four cows with severe and moderate NEB, respectively. Our method integrated two independent types of information: module inference network by TFs, miRNAs and mRNA expression profiles (RNA-seq data) and computational target predictions. In total, 176 modules were predicted by using gene expression data, and 64 miRNAs and 63 TFs were assigned to these modules. By using our integrated computational approach, we identified 13 TF-module and 19 miRNA-module interactions. Most of these modules were associated with liver metabolic processes as well as immune and stress responses, which might play crucial roles in NEB development. Literature survey results also showed that several regulators and gene targets have already been characterized as important factors in liver metabolic processes. These results provided novel insights into regulatory mechanisms at the TF and miRNA levels during NEB. In addition, the method described in this study seems to be applicable to constructing integrated regulatory networks for different diseases or disorders.
Protective pathways against colitis mediated by appendicitis and appendectomy.
Cheluvappa, R; Luo, A S; Palmer, C; Grimm, M C
2011-09-01
Appendicitis followed by appendectomy (AA) at a young age protects against inflammatory bowel disease (IBD). Using a novel murine appendicitis model, we showed that AA protected against subsequent experimental colitis. To delineate genes/pathways involved in this protection, AA was performed and samples harvested from the most distal colon. RNA was extracted from four individual colonic samples per group (AA group and double-laparotomy control group) and each sample microarray analysed followed by gene-set enrichment analysis (GSEA). The gene-expression study was validated by quantitative reverse transcription-polymerase chain reaction (RT-PCR) of 14 selected genes across the immunological spectrum. Distal colonic expression of 266 gene-sets was up-regulated significantly in AA group samples (false discovery rates < 1%; P-value < 0·001). Time-course RT-PCR experiments involving the 14 genes displayed down-regulation over 28 days. The IBD-associated genes tnfsf10, SLC22A5, C3, ccr5, irgm, ptger4 and ccl20 were modulated in AA mice 3 days after surgery. Many key immunological and cellular function-associated gene-sets involved in the protective effect of AA in experimental colitis were identified. The down-regulation of 14 selected genes over 28 days after surgery indicates activation, repression or de-repression of these genes leading to downstream AA-conferred anti-colitis protection. Further analysis of these genes, profiles and biological pathways may assist in developing better therapeutic strategies in the management of intractable IBD. © 2011 The Authors. Clinical and Experimental Immunology © 2011 British Society for Immunology.
TALE: a tale of genome editing.
Zhang, Mingjie; Wang, Feng; Li, Shifei; Wang, Yan; Bai, Yun; Xu, Xueqing
2014-01-01
Transcription activator-like effectors (TALEs), first identified in Xanthomonas bacteria, are naturally occurring or artificially designed proteins that modulate gene transcription. These proteins recognize and bind DNA sequences based on a variable number of tandem repeats. Each repeat comprises a set of ∼34 conserved amino acids; within this conserved domain, there are usually two amino acids that distinguish one TALE from another. Interestingly, TALEs have revealed a simple cipher for the one-to-one recognition of proteins for DNA bases. Synthetic TALEs have been used to successfully target genes in a variety of species, including humans. Depending on the type of functional domain that is fused to the TALE of interest, these proteins can have diverse biological effects. For example, after binding DNA, TALEs fused to transcriptional activation domains can function as robust transcription factors (TALE-TFs), while those fused to restriction endonucleases (TALENs) can cut DNA. Targeted genome editing, in theory, is capable of modifying any endogenous gene sequence of interest; this can be performed in cells or organisms, and may be applied to clinical gene-based therapies in the future. With current technologies, highly accurate, specific, and reliable gene editing cannot be achieved. Thus, recognition and binding mechanisms governing TALE biology are currently hot research areas. In this review, we summarize the major advances in TALE technology over the past several years with a focus on the interaction between TALEs and DNA, TALE design and construction, potential applications for this technology, and unique characteristics that make TALEs superior to zinc finger endonucleases. Copyright © 2013 Elsevier Ltd. All rights reserved.
Inhibition of p53 acetylation by INHAT subunit SET/TAF-Iβ represses p53 activity
Kim, Ji-Young; Lee, Kyu-Sun; Seol, Jin-Ee; Yu, Kweon; Chakravarti, Debabrata; Seo, Sang-Beom
2012-01-01
The tumor suppressor p53 responds to a wide variety of cellular stress signals. Among potential regulatory pathways, post-translational modifications such as acetylation by CBP/p300 and PCAF have been suggested for modulation of p53 activity. However, exactly how p53 acetylation is modulated remains poorly understood. Here, we found that SET/TAF-Iβ inhibited p300- and PCAF-mediated p53 acetylation in an INHAT (inhibitor of histone acetyltransferase) domain-dependent manner. SET/TAF-Iβ interacted with p53 and repressed transcription of p53 target genes. Consequently, SET/TAF-Iβ blocked both p53-mediated cell cycle arrest and apoptosis in response to cellular stress. Using different apoptosis analyses, including FACS, TUNEL and BrdU incorporation assays, we also found that SET/TAF-Iβ induced cellular proliferation via inhibition of p53 acetylation. Furthermore, we observed that apoptotic Drosophila eye phenotype induced by either dp53 overexpression or UV irradiation was rescued by expression of dSet. Inhibition of dp53 acetylation by dSet was observed in both cases. Our findings provide new insights into the regulation of stress-induced p53 activation by HAT-inhibiting histone chaperone SET/TAF-Iβ. PMID:21911363
The Evolution of the Secreted Regulatory Protein Progranulin
Palfree, Roger G. E.; Bennett, Hugh P. J.; Bateman, Andrew
2015-01-01
Progranulin is a secreted growth factor that is active in tumorigenesis, wound repair, and inflammation. Haploinsufficiency of the human progranulin gene, GRN, causes frontotemporal dementia. Progranulins are composed of chains of cysteine-rich granulin modules. Modules may be released from progranulin by proteolysis as 6 kDa granulin polypeptides. Both intact progranulin and some of the granulin polypeptides are biologically active. The granulin module occurs in certain plant proteases and progranulins are present in early diverging metazoan clades such as the sponges, indicating their ancient evolutionary origin. There is only one Grn gene in mammalian genomes. More gene-rich Grn families occur in teleost fish with between 3 and 6 members per species including short-form Grns that have no tetrapod counterparts. Our goals are to elucidate progranulin and granulin module evolution by investigating (i) the origins of metazoan progranulins; (ii) the evolutionary relationships between the single Grn of tetrapods and the multiple Grn genes of fish; (iii) the evolution of granulin module architectures of vertebrate progranulins; and (iv) the conservation of mammalian granulin polypeptide sequences and how the conserved granulin amino acid sequences map to the known three-dimensional structures of granulin modules. We report that progranulin-like proteins are present in unicellular eukaryotes that are closely related to metazoa, suggesting that progranulin is among the earliest extracellular regulatory proteins still employed by multicellular animals. From the genomes of the elephant shark and coelacanth we identified contemporary representatives of a precursor for short-form Grn genes of ray-finned fish that is lost in tetrapods. In vertebrate Grns, pathways of exon duplication resulted in a conserved module architecture at the amino-terminus that is frequently accompanied by an unusual pattern of tandem nearly identical module repeats near the carboxyl-terminus. Polypeptide sequence conservation of mammalian granulin modules identified potential structure-activity relationships that may be informative in designing progranulin-based therapeutics. PMID:26248158
Integrative modules for efficient genome engineering in yeast
Amen, Triana; Kaganovich, Daniel
2017-01-01
We present a set of vectors containing integrative modules for efficient genome integration into the commonly used selection marker loci of the yeast Saccharomyces cerevisiae. A fragment for genome integration is generated via PCR with a unique set of short primers and integrated into HIS3, URA3, ADE2, and TRP1 loci. The desired level of expression can be achieved by using constitutive (TEF1p, GPD1p), inducible (CUP1p, GAL1/10p), and daughter-specific (DSE4p) promoters available in the modules. The reduced size of the integrative module compared to conventional integrative plasmids allows efficient integration of multiple fragments. We demonstrate the efficiency of this tool by simultaneously tagging markers of the nucleus, vacuole, actin, and peroxisomes with genomically integrated fluorophores. Improved integration of our new pDK plasmid series allows stable introduction of several genes and can be used for multi-color imaging. New bidirectional promoters (TEF1p-GPD1p, TEF1p-CUP1p, and TEF1p-DSE4p) allow tractable metabolic engineering. PMID:28660202
Lee, Won Jun; Kim, Sang Cheol; Lee, Seul Ji; Lee, Jeongmi; Park, Jeong Hill; Yu, Kyung-Sang; Lim, Johan; Kwon, Sung Won
2014-01-01
Based on the process of carcinogenesis, carcinogens are classified as either genotoxic or non-genotoxic. In contrast to non-genotoxic carcinogens, many genotoxic carcinogens have been reported to cause tumors in carcinogenic bioassays in animals. Thus evaluating the genotoxicity potential of chemicals is important to discriminate genotoxic from non-genotoxic carcinogens for health care and pharmaceutical industry safety. Additionally, investigating the difference between the mechanisms of genotoxic and non-genotoxic carcinogens could provide the foundation for a mechanism-based classification for unknown compounds. In this study, we investigated the gene expression of HepG2 cells treated with genotoxic or non-genotoxic carcinogens and compared their mechanisms of action. To enhance our understanding of the differences in the mechanisms of genotoxic and non-genotoxic carcinogens, we implemented a gene set analysis using 12 compounds for the training set (12, 24, 48 h) and validated significant gene sets using 22 compounds for the test set (24, 48 h). For a direct biological translation, we conducted a gene set analysis using Globaltest and selected significant gene sets. To validate the results, training and test compounds were predicted by the significant gene sets using a prediction analysis for microarrays (PAM). Finally, we obtained 6 gene sets, including sets enriched for genes involved in the adherens junction, bladder cancer, p53 signaling pathway, pathways in cancer, peroxisome and RNA degradation. Among the 6 gene sets, the bladder cancer and p53 signaling pathway sets were significant at 12, 24 and 48 h. We also found that DDB2, RRM2B and GADD45A, genes related to the repair and damage prevention of DNA, were consistently up-regulated for genotoxic carcinogens. Our results suggest that a gene set analysis could provide a robust tool in the investigation of the different mechanisms of genotoxic and non-genotoxic carcinogens and construct a more detailed understanding of the perturbation of significant pathways. PMID:24497971
The limitations of simple gene set enrichment analysis assuming gene independence.
Tamayo, Pablo; Steinhardt, George; Liberzon, Arthur; Mesirov, Jill P
2016-02-01
Since its first publication in 2003, the Gene Set Enrichment Analysis method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently a simplified approach using a one-sample t-test score to assess enrichment and ignoring gene-gene correlations was proposed by Irizarry et al. (2009) as a serious contender. Their argument criticizes Gene Set Enrichment Analysis's nonparametric nature and its use of an empirical null distribution as unnecessary and hard to compute. We refute these claims by careful consideration of the assumptions of the simplified method and its results, including a comparison with Gene Set Enrichment Analysis on a large benchmark set of 50 datasets. Our results provide strong empirical evidence that gene-gene correlations cannot be ignored, due to the significant variance inflation they produce on the enrichment scores, and should be taken into account when estimating gene set enrichment significance. In addition, we discuss the challenges that the complex correlation structure and multi-modality of gene sets pose more generally for gene set enrichment methods. © The Author(s) 2012.
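The variance inflation mentioned above can be illustrated with a standard textbook result: for n unit-variance gene scores sharing an average pairwise correlation ρ, the variance of their mean is (1/n)(1 + (n − 1)ρ), so the independence assumption understates it by the factor 1 + (n − 1)ρ. The sketch below is only an illustration under that equicorrelation assumption; the function name and the example values are hypothetical and are not taken from the paper.

```python
def variance_inflation_factor(n_genes: int, mean_correlation: float) -> float:
    """Factor by which the variance of a gene-set mean score exceeds the
    value expected under independence, assuming every pair of genes in the
    set shares the same correlation (an illustrative simplification)."""
    if n_genes < 1:
        raise ValueError("n_genes must be a positive integer")
    return 1.0 + (n_genes - 1) * mean_correlation


# Hypothetical gene set of 100 genes at several average correlations.
for rho in (0.0, 0.05, 0.2):
    print(f"rho={rho}: inflation={variance_inflation_factor(100, rho):.2f}")
```

Even a small average correlation inflates the variance several-fold for a set of this size, which is why a test statistic calibrated under independence can look far more significant than it should.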
Dunachie, Susanna; Berthoud, Tamara; Hill, Adrian V.S.; Fletcher, Helen A.
2015-01-01
Introduction: The complexity of immunity to malaria is well known, and clear correlates of protection against malaria have not been established. A better understanding of immune markers induced by candidate malaria vaccines would greatly enhance vaccine development, immunogenicity monitoring and estimation of vaccine efficacy in the field. We have previously reported complete or partial efficacy against experimental sporozoite challenge by several vaccine regimens in healthy malaria-naïve subjects in Oxford. These include a prime-boost regimen with RTS,S/AS02A and modified vaccinia virus Ankara (MVA) expressing the CSP antigen, and a DNA-prime, MVA-boost regimen expressing the ME TRAP antigens. Using samples from these trials we performed transcriptional profiling, allowing a global assessment of responses to vaccination. Methods: We used Human RefSeq8 Bead Chips from Illumina to examine gene expression using PBMC (peripheral blood mononuclear cells) from 16 human volunteers. To focus on antigen-specific changes, comparisons were made between PBMC stimulated with CSP or TRAP peptide pools and unstimulated PBMC post vaccination. We then correlated gene expression with protection against malaria in a human Plasmodium falciparum malaria challenge model. Results: Differentially expressed genes induced by both vaccine regimens were predominantly in the IFN-γ pathway. Gene set enrichment analysis revealed antigen-specific effects on genes associated with IFN induction and proteasome modules after vaccination. Genes associated with IFN induction and antigen presentation modules were positively enriched in subjects with complete protection from malaria challenge, while genes associated with haemopoietic stem cells, regulatory monocytes and the myeloid lineage modules were negatively enriched in protected subjects. Conclusions: These results represent novel insights into the immune repertoires involved in malaria vaccination. PMID:26256523
Sommariva, Michele; De Cecco, Loris; De Cesare, Michelandrea; Sfondrini, Lucia; Ménard, Sylvie; Melani, Cecilia; Delia, Domenico; Zaffaroni, Nadia; Pratesi, Graziella; Uva, Valentina; Tagliabue, Elda; Balsari, Andrea
2011-10-15
Synthetic oligodeoxynucleotides expressing CpG motifs (CpG-ODN) are a Toll-like receptor 9 (TLR9) agonist that can enhance the antitumor activity of DNA-damaging chemotherapy and radiation therapy in preclinical mouse models. We hypothesized that the success of these combinations is related to the ability of CpG-ODN to modulate genes involved in DNA repair. We conducted an in silico analysis of genes implicated in DNA repair in data sets obtained from murine colon carcinoma cells in mice injected intratumorally with CpG-ODN and from splenocytes in mice treated intraperitoneally with CpG-ODN. CpG-ODN treatment caused downregulation of DNA repair genes in tumors. Microarray analyses of human IGROV-1 ovarian carcinoma xenografts in mice treated intraperitoneally with CpG-ODN confirmed in silico findings. When combined with the DNA-damaging drug cisplatin, CpG-ODN significantly increased the life span of mice compared with individual treatments. In contrast, CpG-ODN led to an upregulation of genes involved in DNA repair in immune cells. Cisplatin-treated patients with ovarian carcinoma as well as anthracycline-treated patients with breast cancer who are classified as "CpG-like" for the level of expression of CpG-ODN modulated DNA repair genes have a better outcome than patients classified as "CpG-untreated-like," indicating the relevance of these genes in the tumor cell response to DNA-damaging drugs. Taken together, the findings provide evidence that the tumor microenvironment can sensitize cancer cells to DNA-damaging chemotherapy, thereby expanding the benefits of CpG-ODN therapy beyond induction of a strong immune response.
Dunachie, Susanna; Berthoud, Tamara; Hill, Adrian V S; Fletcher, Helen A
2015-09-29
The complexity of immunity to malaria is well known, and clear correlates of protection against malaria have not been established. A better understanding of immune markers induced by candidate malaria vaccines would greatly enhance vaccine development, immunogenicity monitoring and estimation of vaccine efficacy in the field. We have previously reported complete or partial efficacy against experimental sporozoite challenge by several vaccine regimens in healthy malaria-naïve subjects in Oxford. These include a prime-boost regimen with RTS,S/AS02A and modified vaccinia virus Ankara (MVA) expressing the CSP antigen, and a DNA-prime, MVA-boost regimen expressing the ME TRAP antigens. Using samples from these trials we performed transcriptional profiling, allowing a global assessment of responses to vaccination. We used Human RefSeq8 Bead Chips from Illumina to examine gene expression using PBMC (peripheral blood mononuclear cells) from 16 human volunteers. To focus on antigen-specific changes, comparisons were made between PBMC stimulated with CSP or TRAP peptide pools and unstimulated PBMC post vaccination. We then correlated gene expression with protection against malaria in a human Plasmodium falciparum malaria challenge model. Differentially expressed genes induced by both vaccine regimens were predominantly in the IFN-γ pathway. Gene set enrichment analysis revealed antigen-specific effects on genes associated with IFN induction and proteasome modules after vaccination. Genes associated with IFN induction and antigen presentation modules were positively enriched in subjects with complete protection from malaria challenge, while genes associated with haemopoietic stem cells, regulatory monocytes and the myeloid lineage modules were negatively enriched in protected subjects. These results represent novel insights into the immune repertoires involved in malaria vaccination. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Multiconstrained gene clustering based on generalized projections
2010-01-01
Background: Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple constraints for an optimal clustering solution still remains an unsolved problem. Results: We propose a novel multiconstrained gene clustering (MGC) method within the generalized projection onto convex sets (POCS) framework used widely in image reconstruction. Each constraint is formulated as a corresponding set. The generalized projector iteratively projects the clustering solution onto these sets in order to find a consistent solution included in the intersection set that satisfies all constraints. Compared with previous MGC methods, POCS can integrate multiple constraints of different natures without distorting the original constraints. To evaluate the clustering solution, we also propose a new performance measure referred to as Gene Log Likelihood (GLL) that considers genes having more than one function and hence in more than one cluster. Comparative experimental results show that our POCS-based gene clustering method outperforms current state-of-the-art MGC methods. Conclusions: The POCS-based MGC method can successfully combine multiple constraints of different natures for gene clustering. Also, the proposed GLL is an effective performance measure for the soft clustering solutions. PMID:20356386
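The iterative projection idea can be illustrated with a toy version of classical POCS. The sketch below is not the authors' MGC method: it simply alternates projections onto two convex sets chosen for illustration (a box constraint and a sum-to-one constraint, as one might impose on soft cluster memberships of a single gene). All function names, the starting vector and the tolerance are assumptions.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Project onto the box [lo, hi]^n by clipping each coordinate."""
    return [min(max(v, lo), hi) for v in x]


def project_sum_to_one(x):
    """Project onto the hyperplane sum(x) == 1 by an equal shift."""
    shift = (sum(x) - 1.0) / len(x)
    return [v - shift for v in x]


def pocs(x, projections, n_iter=200, tol=1e-10):
    """Cyclically apply each projection until the iterate stops moving."""
    for _ in range(n_iter):
        previous = x
        for project in projections:
            x = project(x)
        if max(abs(a - b) for a, b in zip(x, previous)) < tol:
            break
    return x


# Hypothetical raw membership scores for one gene over three clusters.
membership = pocs([1.4, -0.3, 0.6], [project_box, project_sum_to_one])
print([round(v, 4) for v in membership])
```

Because both sets are closed and convex and their intersection is non-empty, the iterates approach a point that satisfies both constraints at once, which is the property the abstract relies on when it seeks a solution in the intersection of all constraint sets.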
ePlant and the 3D data display initiative: integrative systems biology on the world wide web.
Fucile, Geoffrey; Di Biase, David; Nahal, Hardeep; La, Garon; Khodabandeh, Shokoufeh; Chen, Yani; Easley, Kante; Christendat, Dinesh; Kelley, Lawrence; Provart, Nicholas J
2011-01-10
Visualization tools for biological data are often limited in their ability to interactively integrate data at multiple scales. These computational tools are also typically limited by two-dimensional displays and programmatic implementations that require separate configurations for each of the user's computing devices and recompilation for functional expansion. Towards overcoming these limitations we have developed "ePlant" (http://bar.utoronto.ca/eplant) - a suite of open-source world wide web-based tools for the visualization of large-scale data sets from the model organism Arabidopsis thaliana. These tools display data spanning multiple biological scales on interactive three-dimensional models. Currently, ePlant consists of the following modules: a sequence conservation explorer that includes homology relationships and single nucleotide polymorphism data, a protein structure model explorer, a molecular interaction network explorer, a gene product subcellular localization explorer, and a gene expression pattern explorer. The ePlant's protein structure explorer module represents experimentally determined and theoretical structures covering >70% of the Arabidopsis proteome. The ePlant framework is accessed entirely through a web browser, and is therefore platform-independent. It can be applied to any model organism. To facilitate the development of three-dimensional displays of biological data on the world wide web we have established the "3D Data Display Initiative" (http://3ddi.org).
A novel feature extraction approach for microarray data based on multi-algorithm fusion
Jiang, Zhu; Xu, Rong
2015-01-01
Feature extraction is one of the most important and effective methods for reducing dimensionality in data mining, particularly with the emergence of high-dimensional data such as microarray gene expression data. Feature extraction for gene selection mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes to build a pattern classifier with reduced complexity and improved generalization capabilities. Depending on the purpose of gene selection, two types of feature extraction algorithms, ranking-based and set-based, are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are evaluated on an individual basis, generally without considering inter-relationships between features, while set-based feature extraction evaluates features based on their role in a feature set by taking into account dependency between features. Like learning methods, feature extraction faces a problem in its generalization ability, namely robustness. However, the issue of robustness is often overlooked in feature extraction. In order to improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing different types of feature extraction algorithms to select features from the sample set, the proposed approach is able to improve feature extraction performance. The new approach is tested on gene expression datasets including Colon cancer data, CNS data, DLBCL data, and Leukemia data. The testing results show that the performance of this algorithm is better than existing solutions. PMID:25780277
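The abstract does not state how the individual algorithms are fused, so the following is only a hedged sketch of one common possibility: each algorithm scores every gene, the scores are converted to ranks, and the ranks are summed (a Borda-style aggregation). The function names, gene labels and scores are hypothetical and do not come from the paper.

```python
def ranks_from_scores(scores):
    """Map each feature to its rank (1 = best); higher score is better.
    Ties are broken alphabetically so the result is deterministic."""
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {feature: pos for pos, feature in enumerate(ordered, start=1)}


def fuse_rankings(score_tables, top_k):
    """Sum per-algorithm ranks and return the top_k features.
    Assumes every table scores the same set of features."""
    total = {feature: 0 for feature in score_tables[0]}
    for scores in score_tables:
        for feature, rank in ranks_from_scores(scores).items():
            total[feature] += rank
    return sorted(total, key=lambda f: (total[f], f))[:top_k]


# Hypothetical scores from two different feature-ranking algorithms.
algorithm_a = {"g1": 3.2, "g2": 0.4, "g3": 2.1, "g4": 1.0}
algorithm_b = {"g1": 0.2, "g2": 0.1, "g3": 0.9, "g4": 0.5}
print(fuse_rankings([algorithm_a, algorithm_b], top_k=2))
```

Aggregating ranks rather than raw scores avoids having to put differently scaled criteria on a common scale, and a gene must do reasonably well under several criteria to be selected, which is one plausible route to the robustness the abstract aims for.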
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Jing; Ma, Zihao; Carr, Steven A.
Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies.
Molecular & Cellular Proteomics 16: 10.1074/mcp.M116.060301, 121–134, 2017.
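The comparison of mRNA and protein coexpression networks described in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: the choice of Pearson correlation, the 0.8 threshold, and the function names (`pearson`, `coexpression_edges`, `edge_preservation`) are assumptions made only for illustration.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return cov / sqrt(vx * vy)

def coexpression_edges(profiles, threshold=0.8):
    """Return gene pairs whose absolute correlation reaches the threshold.

    `profiles` maps gene name -> list of expression values across samples.
    The threshold is a hypothetical value, not one taken from the paper.
    """
    edges = set()
    for g1, g2 in combinations(sorted(profiles), 2):
        if abs(pearson(profiles[g1], profiles[g2])) >= threshold:
            edges.add((g1, g2))
    return edges

def edge_preservation(mrna_edges, protein_edges):
    """Fraction of mRNA coexpression edges also present in the protein network."""
    if not mrna_edges:
        return 0.0
    return len(mrna_edges & protein_edges) / len(mrna_edges)

# Toy data: genes A and B rise together, C does not follow them.
toy = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]}
toy_edges = coexpression_edges(toy)
```

With the toy profiles only the A-B pair passes the threshold; `edge_preservation` then gives the share of one network's edges that reappear in the other, which is the kind of quantity the abstract compares between coherent and incoherent modules.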
Positive-unlabeled learning for disease gene identification
Yang, Peng; Li, Xiao-Li; Mei, Jian-Ping; Kwoh, Chee-Keong; Ng, See-Kiong
2012-01-01
Background: Identifying disease genes from the human genome is an important but challenging task in biomedical research. Machine learning methods can be applied to discover new disease genes based on the known ones. Existing machine learning methods typically use the known disease genes as the positive training set P and the unknown genes as the negative training set N (a non-disease gene set does not exist) to build classifiers to identify new disease genes from the unknown genes. However, such classifiers are actually built from a noisy negative set N, as there can be unknown disease genes in N itself. As a result, the classifiers do not perform as well as they could. Result: Instead of treating the unknown genes as negative examples in N, we treat them as an unlabeled set U. We design a novel positive-unlabeled (PU) learning algorithm PUDI (PU learning for disease gene identification) to build a classifier using P and U. We first partition U into four sets, namely, reliable negative set RN, likely positive set LP, likely negative set LN and weak negative set WN. The weighted support vector machines are then used to build a multi-level classifier based on the four training sets and positive training set P to identify disease genes. Our experimental results demonstrate that our proposed PUDI algorithm outperformed the existing methods significantly. Conclusion: The proposed PUDI algorithm is able to identify disease genes more accurately by treating the unknown data more appropriately as unlabeled set U instead of negative set N. Given that many machine learning problems in biomedical research do involve positive and unlabeled data instead of negative data, it is possible that the machine learning methods for these problems can be further improved by adopting PU learning methods, as we have done here for disease gene identification.
Availability and implementation: The executable program and data are available at http://www1.i2r.a-star.edu.sg/~xlli/PUDI/PUDI.html. Contact: xlli@i2r.a-star.edu.sg or yang0293@e.ntu.edu.sg Supplementary information: Supplementary Data are available at Bioinformatics online. PMID:22923290
Reboiro-Jato, Miguel; Arrais, Joel P; Oliveira, José Luis; Fdez-Riverola, Florentino
2014-01-30
The diagnosis and prognosis of several diseases can be shortened through the use of different large-scale genome experiments. In this context, microarrays can generate expression data for a huge set of genes. However, to obtain solid statistical evidence from the resulting data, it is necessary to train and to validate many classification techniques in order to find the best discriminative method. This is a time-consuming process that normally depends on intricate statistical tools. geneCommittee is a web-based interactive tool for routinely evaluating the discriminative classification power of custom hypotheses in the form of biologically relevant gene sets. While the user can work with different gene set collections and several microarray data files to configure specific classification experiments, the tool is able to run several tests in parallel. Provided with a straightforward and intuitive interface, geneCommittee is able to render valuable information for diagnostic analyses and clinical management decisions based on systematically evaluating custom hypotheses over different data sets using complementary classifiers, a key aspect in clinical research. geneCommittee allows the enrichment of raw microarray data with gene functional annotations, producing integrated datasets that simplify the construction of better discriminative hypotheses, and allows the creation of a set of complementary classifiers. The trained committees can then be used for clinical research and diagnosis. Full documentation including common use cases and guided analysis workflows is freely available at http://sing.ei.uvigo.es/GC/.
Faruki, Hawazin; Mayhew, Gregory M; Fan, Cheng; Wilkerson, Matthew D; Parker, Scott; Kam-Morgan, Lauren; Eisenberg, Marcia; Horten, Bruce; Hayes, D Neil; Perou, Charles M; Lai-Goldman, Myla
2016-06-01
Context: A histologic classification of lung cancer subtypes is essential in guiding therapeutic management. Objective: To complement morphology-based classification of lung tumors, a previously developed lung subtyping panel (LSP) of 57 genes was tested using multiple public fresh-frozen gene-expression data sets and a prospectively collected set of formalin-fixed, paraffin-embedded lung tumor samples. Design: The LSP gene-expression signature was evaluated in multiple lung cancer gene-expression data sets totaling 2177 patients collected from 4 platforms: Illumina RNAseq (San Diego, California), Agilent (Santa Clara, California) and Affymetrix (Santa Clara) microarrays, and quantitative reverse transcription-polymerase chain reaction. Gene centroids were calculated for each of 3 genomic-defined subtypes: adenocarcinoma, squamous cell carcinoma, and neuroendocrine, the latter of which encompassed both small cell carcinoma and carcinoid. Classification by LSP into 3 subtypes was evaluated in both fresh-frozen and formalin-fixed, paraffin-embedded tumor samples, and agreement with the original morphology-based diagnosis was determined. Results: The LSP-based classifications demonstrated overall agreement with the original clinical diagnosis ranging from 78% (251 of 322) to 91% (492 of 538 and 869 of 951) in the fresh-frozen public data sets and 84% (65 of 77) in the formalin-fixed, paraffin-embedded data set. The LSP performance was independent of tissue-preservation method and gene-expression platform. Secondary, blinded pathology review of formalin-fixed, paraffin-embedded samples demonstrated concordance of 82% (63 of 77) with the original morphology diagnosis. Conclusions: The LSP gene-expression signature is a reproducible and objective method for classifying lung tumors and demonstrates good concordance with morphology-based classification across multiple data sets. The LSP panel can supplement morphologic assessment of lung cancers, particularly when classification by standard methods is challenging.
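The centroid-based subtyping and the agreement percentages reported above can be sketched as follows. This is a generic nearest-centroid classifier under stated assumptions, not the published LSP: the use of Euclidean distance, the two-gene toy data, and the names `fit_centroids`, `predict`, and `agreement` are all hypothetical.

```python
from math import dist

def fit_centroids(samples, labels):
    """Average the expression vectors of each subtype into one centroid."""
    sums, counts = {}, {}
    for vec, lab in zip(samples, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(centroids, vec):
    """Assign the subtype whose centroid is closest (Euclidean, an assumption)."""
    return min(centroids, key=lambda lab: dist(centroids[lab], vec))

def agreement(predicted, reference):
    """Fraction of samples where the call matches the reference diagnosis."""
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)

# Toy training set with two genes and two hypothetical subtype labels.
train = [[1, 1], [1, 3], [5, 5], [7, 5]]
train_labels = ["AD", "AD", "SQ", "SQ"]
centroids = fit_centroids(train, train_labels)

# The agreement figures quoted in the abstract are plain ratios:
reported = [round(100 * k / n) for k, n in [(251, 322), (492, 538), (869, 951), (65, 77), (63, 77)]]
```

The `reported` list simply recomputes the percentages from the counts given in the abstract, which is a quick way to check that the quoted figures are consistent with their numerators and denominators.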
Modulation of ColE1-like Plasmid Replication for Recombinant Gene Expression
Camps, Manel
2010-01-01
ColE1-like plasmids constitute the most popular vectors for recombinant protein expression. ColE1 plasmid replication is tightly controlled by an antisense RNA mechanism that is highly dynamic, tuning plasmid metabolic burden to the physiological state of the host. Plasmid homeostasis is upset upon induction of recombinant protein expression because of non-physiological levels of expression and because of the frequently biased amino acid composition of recombinant proteins. Dysregulation of plasmid replication is the main cause of collapse of plasmid-based expression systems because of a simultaneous increase in the metabolic burden (due to increased average copy number) and in the probability of generation of plasmid-free cells (due to increased copy number variation). Interference between regulatory elements of co-resident plasmids causes comparable effects on plasmid stability (plasmid incompatibility). Modulating plasmid copy number for recombinant gene expression aims at achieving a high gene dosage while preserving the stability of the expression system. Here I present strategies targeting plasmid replication for optimizing recombinant gene expression. Specifically, I review approaches aimed at modulating the antisense regulatory system (as well as their implications for plasmid incompatibility) and innovative strategies involving modulation of host factors, of R-loop formation, and of the timing of recombinant gene expression. PMID:20218961
NASA Astrophysics Data System (ADS)
Pagnuco, Inti A.; Pastore, Juan I.; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L.
2016-04-01
It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, where significant groups of genes are defined based on some criteria. This task is usually performed by clustering algorithms, where the whole family of genes, or a subset of them, is clustered into meaningful groups based on their expression values in a set of experiments. In this work we used a methodology based on the Silhouette index as a measure of cluster quality for individual gene groups, and a combination of several variants of hierarchical clustering to generate the candidate groups, to obtain sets of co-expressed genes for two real data examples. We analyzed the quality of the best ranked groups, obtained by the algorithm, using an online bioinformatics tool that provides network information for the selected genes. Moreover, to verify the performance of the algorithm, considering the fact that it doesn’t find all possible subsets, we compared its results against a full search, to determine the amount of good co-regulated sets not detected.
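The idea of scoring individual gene groups with the Silhouette index can be sketched as below. It uses the standard Silhouette definition averaged per group; the Euclidean distance, the zero score for singleton groups, and the function name `silhouette_by_cluster` are assumptions and need not match the authors' implementation.

```python
from math import dist

def silhouette_by_cluster(points, labels):
    """Mean Silhouette value of each cluster (requires at least two clusters).

    For a point i: a = mean distance to the other members of its cluster,
    b = smallest mean distance to the members of any other cluster,
    s(i) = (b - a) / max(a, b).
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = {}
    for lab, members in clusters.items():
        values = []
        for i in members:
            if len(members) == 1:
                values.append(0.0)  # common convention for singleton clusters
                continue
            a = sum(dist(points[i], points[j]) for j in members if j != i) / (len(members) - 1)
            b = min(
                sum(dist(points[i], points[j]) for j in other) / len(other)
                for other_lab, other in clusters.items()
                if other_lab != lab
            )
            values.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
        scores[lab] = sum(values) / len(values)
    return scores

# Two tight, well-separated toy groups of expression profiles.
toy_points = [[0, 0], [0, 1], [10, 0], [10, 1]]
toy_scores = silhouette_by_cluster(toy_points, [0, 0, 1, 1])
```

Groups with a mean value close to 1 are compact and well separated, so ranking candidate groups by this score is one way to pick the best ones from several hierarchical clusterings.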
Granatum: a graphical single-cell RNA-Seq analysis pipeline for genomics scientists.
Zhu, Xun; Wolfgruber, Thomas K; Tasato, Austin; Arisdakessian, Cédric; Garmire, David G; Garmire, Lana X
2017-12-05
Single-cell RNA sequencing (scRNA-Seq) is an increasingly popular platform to study heterogeneity at the single-cell level. Computational methods to process scRNA-Seq data are not very accessible to bench scientists as they require a significant amount of bioinformatic skills. We have developed Granatum, a web-based scRNA-Seq analysis pipeline to make analysis more broadly accessible to researchers. Without a single line of programming code, users can click through the pipeline, setting parameters and visualizing results via the interactive graphical interface. Granatum conveniently walks users through various steps of scRNA-Seq analysis. It has a comprehensive list of modules, including plate merging and batch-effect removal, outlier-sample removal, gene-expression normalization, imputation, gene filtering, cell clustering, differential gene expression analysis, pathway/ontology enrichment analysis, protein network interaction visualization, and pseudo-time cell series construction. Granatum enables broad adoption of scRNA-Seq technology by empowering bench scientists with an easy-to-use graphical interface for scRNA-Seq data analysis. The package is freely available for research use at http://garmiregroup.org/granatum/app.
Gemperlein, Katja; Zipf, Gregor; Bernauer, Hubert S; Müller, Rolf; Wenzel, Silke C
2016-01-01
Long-chain polyunsaturated fatty acids (LC-PUFAs) can be produced de novo via polyketide synthase-like enzymes known as PUFA synthases, which are encoded by pfa biosynthetic gene clusters originally discovered from marine microorganisms. Recently similar gene clusters were detected and characterized in terrestrial myxobacteria revealing several striking differences. As the identified myxobacterial producers are difficult to handle genetically and grow very slowly we aimed to establish heterologous expression platforms for myxobacterial PUFA synthases. Here we report the heterologous expression of the pfa gene cluster from Aetherobacter fasciculatus (SBSr002) in the phylogenetically distant model host bacteria Escherichia coli and Pseudomonas putida. The latter host turned out to be the more promising PUFA producer revealing higher production rates of n-6 docosapentaenoic acid (DPA) and docosahexaenoic acid (DHA). After several rounds of genetic engineering of expression plasmids combined with metabolic engineering of P. putida, DHA production yields were eventually increased more than threefold. Additionally, we applied synthetic biology approaches to redesign and construct artificial versions of the A. fasciculatus pfa gene cluster, which to the best of our knowledge represents the first example of a polyketide-like biosynthetic gene cluster modulated and synthesized for P. putida. Combination with the engineering efforts described above led to a further increase in LC-PUFA production yields. The established production platform based on synthetic DNA now sets the stage for flexible engineering of the complex PUFA synthase. Copyright © 2015 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Zhang, Xianglan; Cha, In-Ho; Kim, Ki-Yeol
2017-12-26
In this study, we investigated the consensus gene modules in head and neck cancer (HNC) and cervical cancer (CC). We used a publicly available gene expression dataset, GSE6791, which included 42 HNC, 14 normal head and neck, 20 CC and 8 normal cervical tissue samples. To exclude bias because of different human papilloma virus (HPV) types, we analyzed HPV16-positive samples only. We identified 3824 genes common to HNC and CC samples. Among these, 977 genes showed high connectivity and were used to construct consensus modules. We demonstrated eight consensus gene modules for HNC and CC using the dissimilarity measure and average linkage hierarchical clustering methods. These consensus modules included genes with significant biological functions, including ATP binding and extracellular exosome. Eigengene network analysis revealed the consensus modules were highly preserved with high connectivity. These findings demonstrate that HPV16-positive head and neck and cervical cancers share highly preserved consensus gene modules with common potentially therapeutic targets. PMID:29371966
Gene selection for tumor classification using neighborhood rough sets and entropy measures.
Chen, Yumin; Zhang, Zunjun; Zheng, Jianzhong; Ma, Ying; Xue, Yu
2017-03-01
With the development of bioinformatics, tumor classification from gene expression data has become an important and useful technology for cancer diagnosis. Since a gene expression dataset often contains thousands of genes and a small number of samples, gene selection from gene expression data becomes a key step for tumor classification. Attribute reduction of rough sets has been successfully applied to gene selection, as it is data-driven and requires no additional information. However, the traditional rough set method deals with discrete data only. Gene expression data containing real-valued or noisy data are usually discretized in a preprocessing step, which may result in poor classification accuracy. In this paper, we propose a novel gene selection method based on the neighborhood rough set model, which can deal with real-valued data whilst maintaining the original gene classification information. Moreover, this paper addresses an entropy measure under the framework of neighborhood rough sets for tackling the uncertainty and noise of gene expression data. The utilization of this measure can bring about a discovery of compact gene subsets. Finally, a gene selection algorithm is designed based on neighborhood granules and the entropy measure. Experiments on two gene expression datasets show that the proposed gene selection is an effective method for improving the accuracy of tumor classification. Copyright © 2017 Elsevier Inc. All rights reserved.
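The neighborhood granule idea can be illustrated with the classical neighborhood rough set dependency degree: the fraction of samples whose delta-neighborhood contains only samples of the same class. This sketch does not reproduce the entropy measure proposed in the paper; the Euclidean distance, the `delta` values, and the function name `neighborhood_dependency` are assumptions.

```python
from math import dist

def neighborhood_dependency(samples, labels, genes, delta):
    """Share of samples whose delta-neighborhood on `genes` is class-pure.

    `samples` is a list of real-valued expression vectors, `genes` is a list
    of column indices to use, and `delta` is the neighborhood radius.
    No discretization of the expression values is needed.
    """
    projected = [[s[g] for g in genes] for s in samples]
    consistent = 0
    for i, p in enumerate(projected):
        # The neighborhood granule of sample i (it always contains i itself).
        neighbors = [j for j, q in enumerate(projected) if dist(p, q) <= delta]
        if all(labels[j] == labels[i] for j in neighbors):
            consistent += 1
    return consistent / len(samples)

# Toy data: gene 0 separates the two classes, gene 1 is uninformative.
toy_samples = [[0.1, 0.5], [0.2, 0.5], [0.8, 0.5], [0.9, 0.5]]
toy_labels = [0, 0, 1, 1]
dep_gene0 = neighborhood_dependency(toy_samples, toy_labels, [0], delta=0.3)
dep_gene1 = neighborhood_dependency(toy_samples, toy_labels, [1], delta=0.3)
```

A greedy selection would add, at each step, the gene that raises such a score the most; here gene 0 scores 1.0 and gene 1 scores 0.0, so gene 0 would be chosen first.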
Appelt-Menzel, Antje; Cubukova, Alevtina; Günther, Katharina; Edenhofer, Frank; Piontek, Jörg; Krause, Gerd; Stüber, Tanja; Walles, Heike; Neuhaus, Winfried; Metzger, Marco
2017-04-11
In vitro models of the human blood-brain barrier (BBB) are highly desirable for drug development. This study aims to analyze a set of ten different BBB culture models based on primary cells, human induced pluripotent stem cells (hiPSCs), and multipotent fetal neural stem cells (fNSCs). We systematically investigated the impact of astrocytes, pericytes, and NSCs on hiPSC-derived BBB endothelial cell function and gene expression. The quadruple culture models, based on these four cell types, achieved BBB characteristics including transendothelial electrical resistance (TEER) up to 2,500 Ω cm² and distinct upregulation of typical BBB genes. A complex in vivo-like tight junction (TJ) network was detected by freeze-fracture and transmission electron microscopy. Treatment with claudin-specific TJ modulators caused TEER decrease, confirming the relevant role of claudin subtypes for paracellular tightness. Drug permeability tests with reference substances were performed and confirmed the suitability of the models for drug transport studies. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Vallat, Laurent; Kemper, Corey A; Jung, Nicolas; Maumy-Bertrand, Myriam; Bertrand, Frédéric; Meyer, Nicolas; Pocheville, Arnaud; Fisher, John W; Gribben, John G; Bahram, Seiamak
2013-01-08
Cellular behavior is sustained by genetic programs that are progressively disrupted in pathological conditions--notably, cancer. High-throughput gene expression profiling has been used to infer statistical models describing these cellular programs, and development is now needed to guide orientated modulation of these systems. Here we develop a regression-based model to reverse-engineer a temporal genetic program, based on relevant patterns of gene expression after cell stimulation. This method integrates the temporal dimension of biological rewiring of genetic programs and enables the prediction of the effect of targeted gene disruption at the system level. We tested the performance accuracy of this model on synthetic data before reverse-engineering the response of primary cancer cells to a proliferative (protumorigenic) stimulation in a multistate leukemia biological model (i.e., chronic lymphocytic leukemia). To validate the ability of our method to predict the effects of gene modulation on the global program, we performed an intervention experiment on a targeted gene. Comparison of the predicted and observed gene expression changes demonstrates the possibility of predicting the effects of a perturbation in a gene regulatory network, a first step toward an orientated intervention in a cancer cell genetic program.
CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining
Navarro, Carmen; Lopez, Francisco J.; Cano, Carlos; Garcia-Alcalde, Fernando; Blanco, Armando
2014-01-01
Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools can detect significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) scope limited to promoter or conserved regions of the genome; 2) inability to identify combinations involving more than two motifs; 3) a requirement for prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner can perform a blind search for CRMs without any prior information about target CRMs or any limit on the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer.
CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by the user. PMID:25268582
MAGMA: Generalized Gene-Set Analysis of GWAS Data
de Leeuw, Christiaan A.; Mooij, Joris M.; Heskes, Tom; Posthuma, Danielle
2015-01-01
By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn’s Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn’s Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn’s Disease data was found to be considerably faster as well. PMID:25885710
Identification of Modulators of the Nuclear Receptor ...
The nuclear receptor family member peroxisome proliferator-activated receptor α (PPARα) is activated by therapeutic hypolipidemic drugs and environmentally-relevant chemicals to regulate genes involved in lipid transport and catabolism. Chronic activation of PPARα in rodents increases liver cancer incidence, whereas suppression of PPARα activity can lead to hepatocellular steatosis. Analytical approaches were developed to identify biosets (i.e., gene expression differences between two conditions) in a genomic database in which PPARα activity was altered. A gene expression signature of 131 PPARα-dependent genes was built using profiles from the livers of wild-type and PPARα-null mice after exposure to three structurally diverse PPARα activators (WY-14,643, fenofibrate and perfluorohexane sulfonate). A rank-based test (Running Fisher’s test; p-value ≤ 10⁻⁴) was used to evaluate the similarity between the PPARα signature and a test set of 48 biosets positive and 31 biosets negative for PPARα activation; the test resulted in a balanced accuracy of 98%. The signature was used to identify factors that activate or suppress PPARα in an annotated mouse liver/primary hepatocyte gene expression database of ~1850 biosets. In addition to the expected activation of PPARα by fibrate drugs, di(2-ethylhexyl) phthalate, and perfluorinated compounds, PPARα was activated by benzofuran, galactosamine and TCDD and suppressed by hepatotoxins acetami
Application for managing model-based material properties for simulation-based engineering
Hoffman, Edward L [Alameda, CA]
2009-03-03
An application for generating a property set associated with a constitutive model of a material includes a first program module adapted to receive test data associated with the material and to extract loading conditions from the test data. A material model driver is adapted to receive the loading conditions and a property set and operable in response to the loading conditions and the property set to generate a model response for the material. A numerical optimization module is adapted to receive the test data and the model response and operable in response to the test data and the model response to generate the property set.
Human Movement Detection and Identification Using Pyroelectric Infrared Sensors
Yun, Jaeseok; Lee, Sang-Shin
2014-01-01
Pyroelectric infrared (PIR) sensors are widely used as a presence trigger, but the analog output of PIR sensors depends on several other aspects, including the distance of the body from the PIR sensor, the direction and speed of movement, the body shape and gait. In this paper, we present an empirical study of human movement detection and identification using a set of PIR sensors. We have developed a data collection module having two pairs of PIR sensors orthogonally aligned and modified Fresnel lenses. We have placed three PIR-based modules in a hallway for monitoring people: one module on the ceiling and two modules on opposite walls facing each other. We have collected a data set from eight subjects walking in three different conditions: two directions (back and forth), three distance intervals (close to one wall sensor, in the middle, close to the other wall sensor) and three speed levels (slow, moderate, fast). We have used two types of feature sets: a raw data set and a reduced feature set composed of amplitude, time to peaks, and passage duration extracted from each PIR sensor. We have performed classification analysis with well-known machine learning algorithms, including instance-based learning and support vector machine. Our findings show that with the raw data set captured from a single PIR sensor of each of the three modules, we could achieve more than 92% accuracy in classifying the direction and speed of movement, the distance interval and identifying subjects. We could also achieve more than 94% accuracy in classifying the direction, speed and distance and identifying subjects using the reduced feature set extracted from two pairs of PIR sensors of each of the three modules. PMID:24803195
Korsunsky, Ilya; Parameswaran, Janaki; Shapira, Iuliana; Lovecchio, John; Menzin, Andrew; Whyte, Jill; Dos Santos, Lisa; Liang, Sharon; Bhuiya, Tawfiqul; Keogh, Mary; Khalili, Houman; Pond, Cassandra; Liew, Anthony; Shih, Andrew; Gregersen, Peter K; Lee, Annette T
2017-10-01
MicroRNAs have been established as key regulators of tumor gene expression and as prime biomarker candidates for clinical phenotypes in epithelial ovarian cancer (EOC). We analyzed the coexpression and regulatory structure of microRNAs and their co-localized gene targets in primary tumor tissue of 20 patients with advanced EOC in order to construct a regulatory signature for clinical prognosis. We performed an integrative analysis to identify two prognostic microRNA/mRNA coexpression modules, each enriched for consistent biological functions. One module, enriched for malignancy-related functions, was found to be upregulated in malignant versus benign samples. The second module, enriched for immune-related functions, was strongly correlated with imputed intratumoral immune infiltrates of T cells, natural killer cells, cytotoxic lymphocytes, and macrophages. We validated the prognostic relevance of the immunological module microRNAs in the publicly available The Cancer Genome Atlas data set. These findings provide novel functional roles for microRNAs in the progression of advanced EOC and possible prognostic signatures for survival. © American Federation for Medical Research (unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Vaidyanathan, Uma; Malone, Stephen M.; Miller, Michael B.; McGue, Matt; Iacono, William G.
2014-01-01
Acoustic startle responses have been studied extensively in relation to individual differences and psychopathology. We examined three indices of the blink response in a picture-viewing paradigm—overall startle magnitude across all picture types, and aversive and pleasant modulation scores—in 3,323 twins and parents. Biometric models and molecular genetic analyses showed that half the variance in overall startle was due to additive genetic effects. No single nucleotide polymorphism was genome-wide significant, but GRIK3 did produce a significant effect when examined as part of a candidate gene set. In contrast, emotion modulation scores showed little evidence of heritability in either biometric or molecular genetic analyses. However, in a genome-wide scan, PARP14 did produce a significant effect for aversive modulation. We conclude that, although overall startle retains potential as an endophenotype, emotion-modulated startle does not. PMID:25387708
de Anda-Jáuregui, Guillermo; Guo, Kai; McGregor, Brett A.; Hur, Junguk
2018-01-01
The quintessential biological response to disease is inflammation. It is a driver and an important element in a wide range of pathological states. Pharmacological management of inflammation is therefore central in the clinical setting. Anti-inflammatory drugs modulate specific molecules involved in the inflammatory response; these drugs are traditionally classified as steroidal and non-steroidal drugs. However, the effects of these drugs are rarely limited to their canonical targets, affecting other molecules and altering biological functions with system-wide effects that can lead to the emergence of secondary therapeutic applications or adverse drug reactions (ADRs). In this study, relationships among anti-inflammatory drugs, functional pathways, and ADRs were explored through network models. We integrated structural drug information, experimental anti-inflammatory drug perturbation gene expression profiles obtained from the Connectivity Map and Library of Integrated Network-Based Cellular Signatures, functional pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome databases, as well as adverse reaction information from the U.S. Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS). The network models comprise nodes representing anti-inflammatory drugs, functional pathways, and adverse effects. We identified structural and gene perturbation similarities linking anti-inflammatory drugs. Functional pathways were connected to drugs by implementing Gene Set Enrichment Analysis (GSEA). Drugs and adverse effects were connected based on the proportional reporting ratio (PRR) of an adverse effect in response to a given drug. Through these network models, relationships among anti-inflammatory drugs, their functional effects at the pathway level, and their adverse effects were explored. These networks comprise 70 different anti-inflammatory drugs, 462 functional pathways, and 1,175 ADRs. 
Network-based properties, such as degree, clustering coefficient, and node strength, were used to identify new therapeutic applications within and beyond the anti-inflammatory context, as well as ADR risk for these drugs, helping to select better repurposing candidates. Based on these parameters, we identified naproxen, meloxicam, etodolac, tenoxicam, flufenamic acid, fenoprofen, and nabumetone as candidates for drug repurposing with lower ADR risk. This network-based analysis pipeline provides a novel way to explore the effects of drugs in a therapeutic space. PMID:29545755
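The proportional reporting ratio used above to link drugs and adverse effects can be illustrated with a small calculation. This is a minimal sketch of the standard 2x2 definition of PRR, not the authors' pipeline; the function name and the counts are hypothetical and are not taken from FAERS.

```python
def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """PRR from a 2x2 table of spontaneous reports.

    a: reports with the drug of interest and the event of interest
    b: reports with the drug of interest and any other event
    c: reports with all other drugs and the event of interest
    d: reports with all other drugs and any other event
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("each drug group needs at least one report")
    if c == 0:
        raise ValueError("PRR is undefined when no comparator report mentions the event")
    # Share of the drug's reports that mention the event, divided by the
    # same share among reports for all other drugs.
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts, for illustration only.
prr = proportional_reporting_ratio(a=20, b=980, c=100, d=19900)
print(round(prr, 2))
```

A value above 1 means the event is reported proportionally more often for the drug than for the comparator set; any cut-off used to draw a drug-ADR edge would be a further assumption.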
Jani, Saurin D; Argraves, Gary L; Barth, Jeremy L; Argraves, W Scott
2010-04-01
An important objective of DNA microarray-based gene expression experimentation is determining inter-relationships that exist between differentially expressed genes and biological processes, molecular functions, cellular components, signaling pathways, physiologic processes and diseases. Here we describe GeneMesh, a web-based program that facilitates analysis of DNA microarray gene expression data. GeneMesh relates genes in a query set to categories available in the Medical Subject Headings (MeSH) hierarchical index. The interface enables hypothesis driven relational analysis to a specific MeSH subcategory (e.g., Cardiovascular System, Genetic Processes, Immune System Diseases etc.) or unbiased relational analysis to broader MeSH categories (e.g., Anatomy, Biological Sciences, Disease etc.). Genes found associated with a given MeSH category are dynamically linked to facilitate tabular and graphical depiction of Entrez Gene information, Gene Ontology information, KEGG metabolic pathway diagrams and intermolecular interaction information. Expression intensity values of groups of genes that cluster in relation to a given MeSH category, gene ontology or pathway can be displayed as heat maps of Z score-normalized values. GeneMesh operates on gene expression data derived from a number of commercial microarray platforms including Affymetrix, Agilent and Illumina. GeneMesh is a versatile web-based tool for testing and developing new hypotheses through relating genes in a query set (e.g., differentially expressed genes from a DNA microarray experiment) to descriptors making up the hierarchical structure of the National Library of Medicine controlled vocabulary thesaurus, MeSH. 
The system further enhances the discovery process by providing links between sets of genes associated with a given MeSH category to a rich set of html linked tabular and graphic information including Entrez Gene summaries, gene ontologies, intermolecular interactions, overlays of genes onto KEGG pathway diagrams and heatmaps of expression intensity values. GeneMesh is freely available online at http://proteogenomics.musc.edu/genemesh/.
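The Z score-normalized heat map values mentioned above can be sketched as a per-gene normalization. This is an illustrative example only: the function name and intensities are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not say how GeneMesh computes its Z scores.

```python
from statistics import mean, stdev


def zscore_rows(matrix):
    """Normalize each gene's intensities to mean 0 and unit sample SD."""
    normalized = {}
    for gene, values in matrix.items():
        mu = mean(values)
        sd = stdev(values)  # sample standard deviation (assumption)
        if sd == 0:
            # A constant profile carries no contrast; map it to zeros.
            normalized[gene] = [0.0 for _ in values]
        else:
            normalized[gene] = [(v - mu) / sd for v in values]
    return normalized


# Hypothetical expression intensities across three samples.
intensities = {
    "geneA": [100.0, 200.0, 300.0],
    "geneB": [5.0, 5.0, 5.0],
}
z = zscore_rows(intensities)
print(z)
```

Normalizing per gene puts genes with very different absolute intensities on one color scale, which is the usual reason for using Z scores in a heat map.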
Ienasescu, Hans; Li, Kang; Andersson, Robin; Vitezic, Morana; Rennie, Sarah; Chen, Yun; Vitting-Seerup, Kristoffer; Lagoni, Emil; Boyd, Mette; Bornholdt, Jette; de Hoon, Michiel J. L.; Kawaji, Hideya; Lassmann, Timo; Hayashizaki, Yoshihide; Forrest, Alistair R. R.; Carninci, Piero; Sandelin, Albin
2016-01-01
Genomics consortia have produced large datasets profiling the expression of genes, microRNAs, enhancers and more across human tissues or cells. There is a need for intuitive tools to select the subsets of such data that are most relevant for specific studies. To this end, we present SlideBase, a web tool that offers a new way of selecting genes, promoters, enhancers and microRNAs that are preferentially expressed/used in a specified set of cells/tissues, based on the use of interactive sliders. With the help of sliders, SlideBase enables users to define custom expression thresholds for individual cell types/tissues, producing sets of genes, enhancers etc. that satisfy these constraints. Changes in slider settings result in simultaneous changes in the selected sets, updated in real time. SlideBase is linked to major databases from genomics consortia, including FANTOM, GTEx, The Human Protein Atlas and BioGPS. Database URL: http://slidebase.binf.ku.dk PMID:28025337
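The slider-based selection described above amounts to filtering features by per-tissue expression bounds. The following is a minimal sketch under that reading; the function name, data layout, tissue names and values are all hypothetical and do not reflect SlideBase's actual implementation or data.

```python
def select_features(expression, min_expr=None, max_expr=None):
    """Return features whose expression satisfies every per-tissue bound.

    expression: {feature: {tissue: value}}
    min_expr / max_expr: {tissue: threshold}, one entry per active slider
    """
    min_expr = min_expr or {}
    max_expr = max_expr or {}
    selected = []
    for feature, by_tissue in expression.items():
        # A tissue with no measurement is treated as 0.0 (assumption).
        ok_min = all(by_tissue.get(t, 0.0) >= v for t, v in min_expr.items())
        ok_max = all(by_tissue.get(t, 0.0) <= v for t, v in max_expr.items())
        if ok_min and ok_max:
            selected.append(feature)
    return sorted(selected)


# Hypothetical expression values.
expression = {
    "gene1": {"liver": 50.0, "brain": 1.0},
    "gene2": {"liver": 45.0, "brain": 30.0},
    "gene3": {"liver": 2.0, "brain": 40.0},
}
liver_specific = select_features(
    expression, min_expr={"liver": 20.0}, max_expr={"brain": 5.0}
)
print(liver_specific)
```

Moving a slider corresponds to changing one threshold and re-running the filter, which is why the selected set can be updated in real time.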
Windhorst, Dafna A; Mileva-Seitz, Viara R; Rippe, Ralph C A; Tiemeier, Henning; Jaddoe, Vincent W V; Verhulst, Frank C; van IJzendoorn, Marinus H; Bakermans-Kranenburg, Marian J
2016-08-01
In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene-based and gene-set approaches in tests of Gene by Environment (G × E) effects on complex behavior. This approach can offer an important alternative or complement to candidate gene and genome-wide environmental interaction (GWEI) studies in the search for genetic variation underlying individual differences in behavior. Genetic variants in 12 autosomal dopaminergic genes were available in an ethnically homogenous part of a population-based cohort. Harsh parenting was assessed with maternal (n = 1881) and paternal (n = 1710) reports at age 3. Externalizing behavior was assessed with the Child Behavior Checklist (CBCL) at age 5 (71 ± 3.7 months). We conducted gene-set analyses of the association between variation in dopaminergic genes and externalizing behavior, stratified for harsh parenting. The association was statistically significant or approached significance for children without harsh parenting experiences, but was absent in the group with harsh parenting. Similarly, significant associations between single genes and externalizing behavior were only found in the group without harsh parenting. Effect sizes in the groups with and without harsh parenting did not differ significantly. Gene-environment interaction tests were conducted for individual genetic variants, resulting in two significant interaction effects (rs1497023 and rs4922132) after correction for multiple testing. Our findings are suggestive of G × E interplay, with associations between dopamine genes and externalizing behavior present in children without harsh parenting, but not in children with harsh parenting experiences. Harsh parenting may overrule the role of genetic factors in externalizing behavior. 
Gene-based and gene-set analyses offer promising new alternatives to analyses focusing on single candidate polymorphisms when examining the interplay between genetic and environmental factors.
Jupiter, Daniel; Chen, Hailin; VanBuren, Vincent
2009-01-01
Background Although expression microarrays have become a standard tool used by biologists, analysis of data produced by microarray experiments may still present challenges. Comparison of data from different platforms, organisms, and labs may involve complicated data processing, and inferring relationships between genes remains difficult. Results STARNET 2 is a new web-based tool that allows post hoc visual analysis of correlations that are derived from expression microarray data. STARNET 2 facilitates user discovery of putative gene regulatory networks in a variety of species (human, rat, mouse, chicken, zebrafish, Drosophila, C. elegans, S. cerevisiae, Arabidopsis and rice) by graphing networks of genes that are closely co-expressed across a large heterogeneous set of preselected microarray experiments. For each of the represented organisms, raw microarray data were retrieved from NCBI's Gene Expression Omnibus for a selected Affymetrix platform. All pairwise Pearson correlation coefficients were computed for expression profiles measured on each platform, respectively. These precompiled results were stored in a MySQL database, and supplemented by additional data retrieved from NCBI. A web-based tool allows user-specified queries of the database, centered at a gene of interest. The result of a query includes graphs of correlation networks, graphs of known interactions involving genes and gene products that are present in the correlation networks, and initial statistical analyses. Two analyses may be performed in parallel to compare networks, which is facilitated by the new HEATSEEKER module. Conclusion STARNET 2 is a useful tool for developing new hypotheses about regulatory relationships between genes and gene products, and has coverage for 10 species. 
Interpretation of the correlation networks is supported with a database of previously documented interactions, a test for enrichment of Gene Ontology terms, and heat maps of correlation distances that may be used to compare two networks. The list of genes in a STARNET network may be useful in developing a list of candidate genes to use for the inference of causal networks. The tool is freely available at , and does not require user registration. PMID:19828039
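A query centered at a gene of interest, as described above, can be sketched as computing Pearson correlation coefficients against that gene and keeping the strongly correlated neighbours. This is an illustrative example, not STARNET 2 code: the function names, the 0.9 cut-off and the expression profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: treated as uncorrelated (assumption)
    return sxy / sqrt(sxx * syy)


def neighbours(profiles, query, min_abs_r=0.9):
    """Genes whose profile correlates with `query` at |r| >= min_abs_r."""
    hits = {}
    for gene, values in profiles.items():
        if gene == query:
            continue
        r = pearson(profiles[query], values)
        if abs(r) >= min_abs_r:
            hits[gene] = r
    return hits


# Hypothetical expression profiles across four experiments.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
    "geneD": [1.0, 3.0, 2.0, 1.5],
}
hits = neighbours(profiles, "geneA")
print(sorted(hits))
```

Precomputing all pairwise coefficients once, as the abstract describes, turns each later query into a lookup rather than a recomputation.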
Pilati, Stefania; Bagagli, Giorgia; Sonego, Paolo; Moretto, Marco; Brazzale, Daniele; Castorina, Giulia; Simoni, Laura; Tonelli, Chiara; Guella, Graziano; Engelen, Kristof; Galbiati, Massimo; Moser, Claudio
2017-01-01
Grapevine is a world-wide cultivated economically relevant crop. The process of berry ripening is non-climacteric and does not rely on the sole ethylene signal. Abscisic acid (ABA) is recognized as an important hormone of ripening inception and color development in ripening berries. In order to elucidate the effect of this signal at the molecular level, pre-véraison berries were treated ex vivo for 20 h with 0.2 mM ABA and berry skin transcriptional modulation was studied by RNA-seq after the treatment and 24 h later, in the absence of exogenous ABA. This study highlighted that a small amount of ABA triggered its own biosynthesis and had a transcriptome-wide effect (1893 modulated genes) characterized by the amplification of the transcriptional response over time. By comparing this dataset with the many studies on ripening collected within the grapevine transcriptomic compendium Vespucci, an extended overlap between ABA- and ripening modulated gene sets was observed (71% of the genes), underpinning the role of this hormone in the regulation of berry ripening. The signaling network of ABA, encompassing ABA metabolism, transport and signaling cascade, has been analyzed in detail and expanded based on knowledge from other species in order to provide an integrated molecular description of this pathway at berry ripening onset. Expression data analysis was combined with in silico promoter analysis to identify candidate target genes of ABA responsive element binding protein 2 (VvABF2), a key upstream transcription factor of the ABA signaling cascade which is up-regulated at véraison and also by ABA treatments. Two transcription factors, VvMYB143 and VvNAC17, and two genes involved in protein degradation, Armadillo-like and Xerico-like genes, were selected for in vivo validation by VvABF2-mediated promoter trans-activation in tobacco. VvNAC17 and Armadillo-like promoters were induced by ABA via VvABF2, while VvMYB143 responded to ABA in a VvABF2-independent manner. 
This knowledge of the ABA cascade in berry skin contributes not only to the understanding of berry ripening regulation but might be useful to other areas of viticultural interest, such as bud dormancy regulation and drought stress tolerance. PMID:28680438
Genomic Heterogeneity of Osteosarcoma - Shift from Single Candidates to Functional Modules
Maugg, Doris; Eckstein, Gertrud; Baumhoer, Daniel; Nathrath, Michaela; Korsching, Eberhard
2015-01-01
Osteosarcoma (OS), a bone tumor, exhibits a complex karyotype. At the genomic level, a highly variable degree of alteration is observable in nearly all chromosomal regions and between individual tumors. This hampers the identification of common drivers in OS biology. To identify the common molecular mechanisms involved in the maintenance of OS, we follow the hypothesis that all the copy number-associated differences between the patients are intercepted at the level of the functional modules. The implementation is based on a network approach utilizing copy number-associated genes in OS, paired expression data and protein interaction data. The resulting functional modules of tightly connected genes were interpreted regarding their biological functions in OS and their potential prognostic significance. We identified an osteosarcoma network assembling well-known and lesser-known candidates. The derived network shows significant connectivity and modularity, suggesting that the genes affected by the heterogeneous genetic alterations share the same biological context. The network modules participate in several critical aspects of cancer biology, such as DNA damage response, cell growth, and cell motility, which is in line with the hypothesis of specifically deregulated but functional modules in cancer. Further, we could deduce genes with possible prognostic significance in OS for further investigation (e.g. EZR, CDKN2A, MAP3K5). Several of those module genes were located on chromosome 6q. The given systems biological approach provides evidence that heterogeneity at the genomic and expression level is ordered by the biological system at the level of the functional modules. Different genomic aberrations point to the same cellular network vicinity to form vital, but already neoplastically altered, functional modules maintaining OS. 
This observation has long been under discussion, but often only in a hypothetical manner; here it is exemplified for OS. PMID:25848766
Hunter, Chad S; Malik, Raleigh E; Witzmann, Frank A; Rhodes, Simon J
2013-01-01
LIM-homeodomain 3 (LHX3) is a transcription factor required for mammalian pituitary gland and nervous system development. Human patients and animal models with LHX3 gene mutations present with severe pediatric syndromes that feature hormone deficiencies and symptoms associated with nervous system dysfunction. The carboxyl terminus of the LHX3 protein is required for pituitary gene regulation, but the mechanism by which this domain operates is unknown. In order to better understand LHX3-dependent pituitary hormone gene transcription, we used biochemical and mass spectrometry approaches to identify and characterize proteins that interact with the LHX3 carboxyl terminus. This approach identified the LANP/pp32 and TAF-1β/SET proteins, which are components of the inhibitor of histone acetyltransferase (INHAT) multi-subunit complex that serves as a multifunctional repressor to inhibit histone acetylation and modulate chromatin structure. The protein domains of LANP and TAF-1β that interact with LHX3 were mapped using biochemical techniques. Chromatin immunoprecipitation experiments demonstrated that LANP and TAF-1β are associated with LHX3 target genes in pituitary cells, and experimental alterations of LANP and TAF-1β levels affected LHX3-mediated pituitary gene regulation. Together, these data suggest that transcriptional regulation of pituitary genes by LHX3 involves regulated interactions with the INHAT complex.
PMID:23861948
Gilli, Francesca; Navone, Nicole Désirée; Perga, Simona; Marnetto, Fabiana; Caldano, Marzia; Capobianco, Marco; Pulizzi, Annalisa; Malucchi, Simona; Bertolotto, Antonio
2011-07-01
Background: In a recent genome-wide transcriptional analysis, we identified a gene signature for multiple sclerosis (MS), which reverted back to normal during pregnancy. Reversion was particularly evident for 7 genes: SOCS2, TNFAIP3, NR4A2, CXCR4, POLR2J, FAM49B, and STAG3L1, most of which encode negative regulators of inflammation. Objectives: To corroborate dysregulation of genes, to evaluate the prognostic value of genes, and to study modulation of genes during different treatments. Design: Comparison study. Setting: Italian referral center for MS. Patients: Quantitative polymerase chain reaction measurements were performed for 274 patients with MS and 60 healthy controls. Of the 274 patients with MS, 113 were treatment-naive patients in the initial stages of their disorder who were followed up in real-world clinical settings and categorized on the basis of disease course. The remaining 161 patients with MS received disease-modifying therapies (55 patients were treated with interferon beta, 52 with glatiramer acetate, and 54 with natalizumab) for a mean (SD) of 12 (2) months. Main Outcome Measures: Gene expression levels, relapse rate, and change in Expanded Disability Status Scale. Results: We found a dysregulated gene pathway (P ≤ .006), with a downregulation of genes encoding negative regulators. The SOCS2, NR4A2, and TNFAIP3 genes were inversely correlated with both relapse rate (P ≤ .002) and change in Expanded Disability Status Scale (P ≤ .005). SOCS2 was modulated by both interferon beta and glatiramer acetate, TNFAIP3 was modulated by glatiramer acetate, and NR4A2 was not altered at all. No changes were induced by natalizumab. Conclusions: We demonstrate that there is a new molecular pathogenic mechanism that underlies the initiation and progression of MS. Defects in negative-feedback loops of inflammation lead to an overactivation of the immune system so as to predispose the brain to inflammation-sensitive MS.
Shen, Haoran; Liang, Zhou; Zheng, Saihua; Li, Xuelian
2017-11-01
The purpose of this study was to identify promising candidate genes and pathways in polycystic ovary syndrome (PCOS). Microarray dataset GSE345269 obtained from the Gene Expression Omnibus database includes 7 granulosa cell samples from PCOS patients, and 3 normal granulosa cell samples. Differentially expressed genes (DEGs) were screened between PCOS and normal samples. Pathway enrichment analysis was conducted for DEGs using ClueGO and CluePedia plugin of Cytoscape. A Reactome functional interaction (FI) network of the DEGs was built using ReactomeFIViz, and then network modules were extracted, followed by pathway enrichment analysis for the modules. Expression of DEGs in granulosa cell samples was measured using quantitative RT-PCR. A total of 674 DEGs were retained, which were significantly enriched with inflammation and immune-related pathways. Eight modules were extracted from the Reactome FI network. Pathway enrichment analysis revealed significant pathways of each module: module 0, Regulation of RhoA activity and Signaling by Rho GTPases pathways shared ARHGAP4 and ARHGAP9; module 2, GlycoProtein VI-mediated activation cascade pathway was enriched with RHOG; module 3, Thromboxane A2 receptor signaling, Chemokine signaling pathway, CXCR4-mediated signaling events pathways were enriched with LYN, the hub gene of module 3. Results of RT-PCR confirmed the finding of the bioinformatic analysis that ARHGAP4, ARHGAP9, RHOG and LYN were significantly upregulated in PCOS. RhoA-related pathways, GlycoProtein VI-mediated activation cascade pathway, ARHGAP4, ARHGAP9, RHOG and LYN may be involved in the pathogenesis of PCOS.
Bolea, Mario; Mora, José; Ortega, Beatriz; Capmany, José
2013-11-18
We present a high-order UWB pulse generator based on a microwave photonic filter that provides a set of positive and negative samples by slicing an incoherent optical source and exploiting phase inversion in a Mach-Zehnder modulator. The simple scalability and high reconfigurability of the system allow the FCC requirements to be met more closely. Moreover, the proposed scheme is easily adapted to pulse amplitude modulation, bi-phase modulation, pulse shape modulation and pulse position modulation. Because the scheme can be adapted to multilevel modulation formats, the transmission bit rate can be increased by using hybrid modulation formats.
Kang, Ha Ram; da Costa Fernandes, Célio Junior; da Silva, Rodrigo Augusto; Constantino, Vera Regina Leopoldo; Koh, Ivan Hong Jun; Zambuzzi, Willian F
2018-02-01
The effect on pre-osteoblast performance of layered double hydroxide (LDH) samples composed of chloride anions intercalated between positively charged layers of magnesium/aluminum (Mg-Al LDH) or zinc/aluminum (Zn-Al LDH) is investigated. Non-cytotoxic concentrations of both LDHs modulated pre-osteoblast adhesion by triggering cytoskeleton rearrangement dependent on the recruitment of Cofilin, which is modulated by the inhibition of Protein Phosphatase 2A (PP2A), culminating in osteoblast differentiation with a significant increase in osteogenic marker genes. Alkaline phosphatase (ALP) and bone sialoprotein (BSP) are significantly up-modulated by both LDHs; however, the Mg-Al LDH nanomaterial produced an even more significant increase relative to both experimental controls, while the phosphorylation of the mitogen-activated protein kinases (MAPKs) extracellular signal-regulated kinase (ERK) and c-Jun N-terminal kinase (JNK) significantly increased. MAPK signaling is necessary to activate the Runt-related transcription factor 2 (RUNX2) gene. Concomitantly, it is also investigated whether challenged osteoblasts are able to modulate osteoclastogenesis by examining both osteoprotegerin (OPG) and receptor activator of nuclear factor kappa-B ligand (RANKL) in this model; a dynamic reprogramming of both genes is found, suggesting a role for LDHs in modulating osteoclastogenesis. These results suggest that LDHs interfere in bone remodeling and can be considered as nanomaterials for graft-based bone healing or as drug-delivery materials for bone disorders. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Wan, Qi; Tang, Jing; Han, Yu; Wang, Dan
2018-01-01
Uveal melanoma is an aggressive cancer with a high recurrence rate and a poor prognosis. Identifying potential prognostic markers of uveal melanoma may provide information for early detection of recurrence and for treatment. RNA sequencing data for uveal melanoma and patient clinical traits were obtained from The Cancer Genome Atlas (TCGA) database. Co-expression modules were built by weighted gene co-expression network analysis (WGCNA) and used to investigate the relationships between modules and clinical traits. In addition, functional enrichment analysis was performed on the co-expressed genes from the modules of interest. First, using WGCNA, 21 co-expression modules were constructed from 10975 genes in 80 human uveal melanoma samples. The number of genes in these modules ranged from 42 to 5091. Four co-expression modules were significantly correlated with three clinical traits (status, recurrence and recurrence time). The red and purple modules correlated positively with patient life status and recurrence time. The green module correlated positively with recurrence. Functional enrichment analysis showed that the magenta module was mainly enriched in genetic material assembly processes, the purple module in tissue homeostasis and melanosome membrane, and the red module in cell metastasis, suggesting its critical role in the recurrence and development of the disease. Additionally, the hub gene (the gene with the highest connectivity to other genes) was identified in each module. The hub genes SLC17A7, NTRK2, ABTB1 and ADPRHL1 might play a vital role in the recurrence of uveal melanoma. Our findings provide a framework of co-expression gene modules for uveal melanoma and identify prognostic markers that might aid the detection of recurrence and the treatment of uveal melanoma. Copyright © 2017 Elsevier Ltd. All rights reserved.
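The notion of a hub gene as the gene with the highest connectivity can be sketched with the unsigned soft-threshold adjacency commonly used in WGCNA, a_ij = |r_ij|^beta. This is an illustration rather than the authors' analysis: the function name, the correlation values and the power beta = 6 are assumptions, since the abstract does not report the soft-thresholding power.

```python
def connectivity(cor, beta=6):
    """Per-gene connectivity under an unsigned soft-threshold adjacency.

    cor: symmetric nested dict of pairwise correlations within one module.
    beta: soft-thresholding power (assumed value, not from the abstract).
    """
    genes = list(cor)
    k = {}
    for g in genes:
        # Sum of |r|^beta over all other genes in the module.
        k[g] = sum(abs(cor[g][h]) ** beta for h in genes if h != g)
    return k


# Hypothetical correlations among three genes of one module.
cor = {
    "g1": {"g1": 1.0, "g2": 0.9, "g3": 0.8},
    "g2": {"g1": 0.9, "g2": 1.0, "g3": 0.2},
    "g3": {"g1": 0.8, "g2": 0.2, "g3": 1.0},
}
k = connectivity(cor)
hub = max(k, key=k.get)
print(hub)
```

Raising correlations to a power suppresses weak links while retaining strong ones, so a gene with a few strong partners outranks one with many weak ones.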
Conjugative plasmids: vessels of the communal gene pool
Norman, Anders; Hansen, Lars H.; Sørensen, Søren J.
2009-01-01
Comparative whole-genome analyses have demonstrated that horizontal gene transfer (HGT) provides a significant contribution to prokaryotic genome innovation. The evolution of specific prokaryotes is therefore tightly linked to the environment in which they live and the communal pool of genes available within that environment. Here we use the term supergenome to describe the set of all genes that a prokaryotic ‘individual’ can draw on within a particular environmental setting. Conjugative plasmids can be considered particularly successful entities within the communal pool, which have enabled HGT over large taxonomic distances. These plasmids are collections of discrete regions of genes that function as ‘backbone modules’ to undertake different aspects of overall plasmid maintenance and propagation. Conjugative plasmids often carry suites of ‘accessory elements’ that contribute adaptive traits to the hosts and, potentially, other resident prokaryotes within specific environmental niches. Insight into the evolution of plasmid modules therefore contributes to our knowledge of gene dissemination and evolution within prokaryotic communities. This communal pool provides the prokaryotes with an important mechanistic framework for obtaining adaptability and functional diversity that alleviates the need for large genomes of specialized ‘private genes’. PMID:19571247
Ficklin, Stephen P.; Feltus, F. Alex
2011-01-01
One major objective for plant biology is the discovery of molecular subsystems underlying complex traits. The use of genetic and genomic resources combined in a systems genetics approach offers a means for approaching this goal. This study describes a maize (Zea mays) gene coexpression network built from publicly available expression arrays. The maize network consisted of 2,071 loci that were divided into 34 distinct modules that contained 1,928 enriched functional annotation terms and 35 cofunctional gene clusters. Of note, 391 maize genes of unknown function were found to be coexpressed within modules along with genes of known function. A global network alignment was made between this maize network and a previously described rice (Oryza sativa) coexpression network. The IsoRankN tool was used, which incorporates both gene homology and network topology for the alignment. A total of 1,173 aligned loci were detected between the two grass networks, which condensed into 154 conserved subgraphs that preserved 4,758 coexpression edges in rice and 6,105 coexpression edges in maize. This study provides an early view into maize coexpression space and provides an initial network-based framework for the translation of functional genomic and genetic information between these two vital agricultural species. PMID:21606319
Mulligan, Megan K; Mozhui, Khyobeni; Pandey, Ashutosh K; Smith, Maren L; Gong, Suzhen; Ingels, Jesse; Miles, Michael F; Lopez, Marcelo F; Lu, Lu; Williams, Robert W
2017-02-01
Genetic factors that influence the transition from initial drinking to dependence remain enigmatic. Recent studies have leveraged chronic intermittent ethanol (CIE) paradigms to measure changes in brain gene expression in a single strain at 0, 8, 72 h, and even 7 days following CIE. We extend these findings using LCM RNA-seq to profile expression in 11 brain regions in two inbred strains - C57BL/6J (B6) and DBA/2J (D2) - 72 h following multiple cycles of ethanol self-administration and CIE. Linear models identified differential expression based on treatment, region, strain, or interactions with treatment. Nearly 40% of genes showed a robust effect (FDR < 0.01) of region, and hippocampus CA1, cortex, bed nucleus stria terminalis, and nucleus accumbens core had the highest number of differentially expressed genes after treatment. Another 8% of differentially expressed genes demonstrated a robust effect of strain. As expected, based on similar studies in B6, treatment had a much smaller impact on expression; only 72 genes (p < 0.01) are modulated by treatment (independent of region or strain). Strikingly, many more genes (415) show a strain-specific and largely opposite response to treatment and are enriched in processes related to RNA metabolism, transcription factor activity, and mitochondrial function. Over 3 times as many changes in gene expression were detected in D2 compared to B6, and weighted gene co-expression network analysis (WGCNA) module comparison identified more modules enriched for treatment effects in D2. Substantial strain differences exist in the temporal pattern of transcriptional neuroadaptation to CIE, and these may drive individual differences in risk of addiction following excessive alcohol consumption. Copyright © 2016 Elsevier Inc. All rights reserved.
EgoNet: identification of human disease ego-network modules
2014-01-01
Background Mining novel biomarkers from gene expression profiles for accurate disease classification is challenging due to small sample size and high noise in gene expression measurements. Several studies have proposed integrated analyses of microarray data and protein-protein interaction (PPI) networks to find diagnostic subnetwork markers. However, the neighborhood relationship among network member genes has not been fully considered by those methods, leaving many potential gene markers unidentified. The main idea of this study is to take full advantage of the biological observation that genes associated with the same or similar diseases commonly reside in the same neighborhood of molecular networks. Results We present EgoNet, a novel method based on egocentric network-analysis techniques, to exhaustively search and prioritize disease subnetworks and gene markers from a large-scale biological network. When applied to a triple-negative breast cancer (TNBC) microarray dataset, the top selected modules contain both known gene markers in TNBC and novel candidates, such as RAD51 and DOK1, which play a central role in their respective ego-networks by connecting many differentially expressed genes. Conclusions Our results suggest that EgoNet, which is based on the ego network concept, allows the identification of novel biomarkers and provides a deeper understanding of their roles in complex diseases. PMID:24773628
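The ego-network concept can be sketched in a few lines of Python. This is not the EgoNet implementation: the toy interaction graph, the placeholder gene names G1 to G4, and the scoring by the fraction of differentially expressed neighbors are assumptions made purely for illustration.

```python
from typing import Dict, Set


def ego_network(graph: Dict[str, Set[str]], ego: str) -> Dict[str, Set[str]]:
    """Return the subgraph induced by `ego` and its direct neighbors."""
    members = {ego} | graph.get(ego, set())
    return {node: graph.get(node, set()) & members for node in members}


def de_fraction(graph: Dict[str, Set[str]], ego: str, de_genes: Set[str]) -> float:
    """Fraction of the ego's neighbors that are differentially expressed (hypothetical score)."""
    neighbors = graph.get(ego, set())
    if not neighbors:
        return 0.0
    return len(neighbors & de_genes) / len(neighbors)


# Hypothetical undirected PPI graph stored as adjacency sets.
toy_ppi = {
    "RAD51": {"G1", "G2", "G3"},
    "G1": {"RAD51", "G2"},
    "G2": {"RAD51", "G1"},
    "G3": {"RAD51", "G4"},
    "G4": {"G3"},
}
toy_de = {"G1", "G3"}  # hypothetical differentially expressed genes

ego = ego_network(toy_ppi, "RAD51")
score = de_fraction(toy_ppi, "RAD51", toy_de)
```

Ranking candidate egos by such a score is one simple way to prioritize hubs that connect many differentially expressed genes.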
Construction of multi-functional open modulized Matlab simulation toolbox for imaging ladar system
NASA Astrophysics Data System (ADS)
Wu, Long; Zhao, Yuan; Tang, Meng; He, Jiang; Zhang, Yong
2011-06-01
Ladar system simulation uses computer simulation technology to model a ladar system in order to predict its performance. This paper reviews the development of laser imaging radar simulation in domestic and overseas studies, and computer simulation studies of ladar systems with different application requirements. The LadarSim and FOI-LadarSIM simulation facilities of Utah State University and the Swedish Defence Research Agency are introduced in detail. The paper notes the small simulation scale, non-unified design and limited applications of domestic research in imaging ladar system simulation, which mostly achieves simple function simulation based on ranging equations for ladar systems. A laser imaging radar simulation design with an open and modularized structure is proposed, with unified modules for the ladar system, laser emitter, atmosphere models, target models, signal receiver, parameter settings and system controller. A unified Matlab toolbox and standard control modules have been built with regulated function inputs and outputs and communication protocols between hardware modules. A simulation of an ICCD gain-modulated imaging ladar system for a space shuttle is carried out with the toolbox. The simulation result shows that the models and parameter settings of the Matlab toolbox are able to simulate the actual detection process precisely. The unified control module and pre-defined parameter settings simplify the simulation of imaging ladar detection. Its open structure enables the toolbox to be modified for specialized requirements. The modularization gives simulations flexibility.
Xiao, Xiaolin; Moreno-Moral, Aida; Rotival, Maxime; Bottolo, Leonardo; Petretto, Enrico
2014-01-01
Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data in multiple conditions (e.g., cell-types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet, genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally-efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network-modules simultaneously across multiple conditions. Our method can accommodate weighted (and unweighted) networks of any size and can similarly use co-expression or raw gene expression input data, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, we demonstrated distinctive advantages of our method over existing methods, which was able to recover accurately both common and condition-specific network-modules without entailing ad-hoc input parameters as required by other approaches. We applied our method to genome-scale and multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance, which were not detected by other methods. In humans we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interactions processes in ventricular zones during neocortex expansion and further, we uncovered pathways related to development of later cognitive functions in the cortical plate of the developing brain which were previously unappreciated. 
Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed heat shock protein (Hsp) and cardiomyopathy genes (Bag3, Cryab, Kras, Emd, Plec), which was significantly replicated using separate failing heart and liver gene expression datasets in humans, thus revealing a conserved functional role for Hsp genes in cardiovascular disease.
CVD-associated non-coding RNA, ANRIL, modulates expression of atherogenic pathways in VSMC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Congrains, Ada; Kamide, Kei; Katsuya, Tomohiro
Highlights: ANRIL maps in the strongest susceptibility locus for cardiovascular disease. Silencing of ANRIL leads to altered expression of tissue remodeling-related genes. The effects of ANRIL on gene expression are splicing variant specific. ANRIL affects progression of cardiovascular disease by regulating proliferation and apoptosis pathways. -- Abstract: ANRIL is a newly discovered non-coding RNA lying on the strongest genetic susceptibility locus for cardiovascular disease (CVD) in the chromosome 9p21 region. Genome-wide association studies have been linking polymorphisms in this locus with CVD and several other major diseases such as diabetes and cancer. The role of this non-coding RNA in atherosclerosis progression is still poorly understood. In this study, we investigated the implication of ANRIL in the modulation of gene sets directly involved in atherosclerosis. We designed and tested siRNA sequences to selectively target two exons (exon 1 and exon 19) of the transcript and successfully knocked down expression of ANRIL in human aortic vascular smooth muscle cells (HuAoVSMC). We used a pathway-focused RT-PCR array to profile gene expression changes caused by ANRIL knock down. Notably, the genes affected by each of the siRNAs were different, suggesting that different splicing variants of ANRIL might have distinct roles in cell physiology. Our results suggest that ANRIL splicing variants play a role in coordinating tissue remodeling, by modulating the expression of genes involved in cell proliferation, apoptosis, extra-cellular matrix remodeling and inflammatory response to finally impact in the risk of cardiovascular disease and other pathologies.
Design and development of data acquisition system based on WeChat hardware
NASA Astrophysics Data System (ADS)
Wang, Zhitao; Ding, Lei
2018-06-01
A data acquisition system based on WeChat hardware provides a means of making data acquisition more widely available and practical. The whole system is based on the WeChat hardware platform: the hardware part is developed on a DA14580 development board and the software part is based on Alibaba Cloud. We designed a service module, a logic processing module, a data processing module and a database module. Communication between hardware and software uses the AirSync Protocol. We tested the system by collecting temperature and humidity data, and the results show that the system can acquire temperature and humidity in real time according to its settings.
Shen, Yun; Ruan, Qingxia; Chai, Haoxi; Yuan, Yongze; Yang, Wannian; Chen, Junping; Xin, Zhanguo; Shi, Huazhong
2016-12-01
Polyamines are involved in gene regulation by interacting with and modulating the functions of various anionic macromolecules such as DNA, RNA and proteins. In this study, we identified an important function of the polyamine transporter LHR1 (LOWER EXPRESSION OF HEAT RESPONSIVE GENE1) in heat-inducible gene expression in Arabidopsis thaliana. The lhr1 mutant was isolated through a forward genetic screen for altered expression of the luciferase reporter gene driven by the promoter from the heat-inducible gene AtHSP18.2. The lhr1 mutant showed reduced induction of the luciferase gene in response to heat stress and was more sensitive to high temperature than the wild type. Map-based cloning identified that the LHR1 gene encodes the polyamine transporter PUT3 (POLYAMINE UPTAKE TRANSPORTER 3) localized in the plasma membrane. LHR1/PUT3 is required for the uptake of extracellular polyamines and plays an important role in stabilizing the mRNAs of several crucial heat stress responsive genes under high temperature. Genome-wide gene expression analysis using RNA-seq identified an array of differentially expressed genes, among which the transcript levels of some of the heat shock protein genes were significantly reduced in response to prolonged heat stress in the lhr1 mutant. Our findings revealed an important heat stress response and tolerance mechanism involving polyamine influx which modulates mRNA stability of heat-inducible genes under heat stress conditions. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
Intrinsic resistance and RTK-RAS-MAPK pathway reactivation have limited the effectiveness of MEK and RAF inhibitors (MAPKi) in RAS- and RAF-mutant cancers. To identify genes that modulate sensitivity to MAPKi, we performed genome-scale CRISPR-Cas9 loss-of-function screens in two KRAS mutant pancreatic cancer cell lines treated with the MEK1/2 inhibitor trametinib. Loss of CIC, a transcriptional repressor of ETV1, ETV4, and ETV5, promoted survival in the setting of MAPKi in cancer cells derived from several lineages.
Li, Yongxin; Kikuchi, Mani; Li, Xueyan; Gao, Qionghua; Xiong, Zijun; Ren, Yandong; Zhao, Ruoping; Mao, Bingyu; Kondo, Mariko; Irie, Naoki; Wang, Wen
2018-01-01
Sea cucumbers, one of the main classes of echinoderms, undergo a very fast and drastic metamorphosis during their development. However, the molecular basis underlying this process remains largely unknown. Here we systematically examined the gene expression profiles of the Japanese common sea cucumber (Apostichopus japonicus) for the first time by RNA sequencing across 16 developmental time points from fertilized egg to juvenile stage. Based on weighted gene co-expression network analysis (WGCNA), we identified 21 modules. Among them, MEdarkmagenta was highly expressed and correlated with the early metamorphosis process from late auricularia to doliolaria larva. Furthermore, gene enrichment and differentially expressed gene analysis identified several genes in the module that may play key roles in the metamorphosis process. Our results not only provide a molecular basis for experimentally studying the development and morphological complexity of sea cucumber, but also lay a foundation for improving its emergence rate. Copyright © 2017 Elsevier Inc. All rights reserved.
USDA-ARS's Scientific Manuscript database
Mounting evidence shows microRNAs (miRNAs) directly regulate gene expression post-transcriptionally through base-pairing with regions in the 3’-untranslated sequences of target gene mRNAs, which results in dysregulation of gene expression/translation and subsequently modulates cellular processes. We...
de Jong, Simone; Vidler, Lewis R; Mokrab, Younes; Collier, David A; Breen, Gerome
2016-08-01
Genome-wide association studies (GWAS) have identified thousands of novel genetic associations for complex genetic disorders, leading to the identification of potential pharmacological targets for novel drug development. In schizophrenia, 108 conservatively defined loci that meet genome-wide significance have been identified and hundreds of additional sub-threshold associations harbour information on the genetic aetiology of the disorder. In the present study, we used gene-set analysis based on the known binding targets of chemical compounds to identify the 'drug pathways' most strongly associated with schizophrenia-associated genes, with the aim of identifying potential drug repositioning opportunities and clues for novel treatment paradigms, especially in multi-target drug development. We compiled 9389 gene sets (2496 with unique gene content) and interrogated gene-based p-values from the PGC2-SCZ analysis. Although no single drug exceeded experiment-wide significance (corrected p<0.05), highly ranked gene sets reaching suggestive significance included the dopamine receptor antagonists metoclopramide and trifluoperazine and the tyrosine kinase inhibitor neratinib. This is a proof-of-principle analysis showing the potential utility of GWAS data of schizophrenia for the direct identification of candidate drugs and molecules that show polypharmacy. © The Author(s) 2016.
Pathway Distiller - multisource biological pathway consolidation
2012-01-01
Background One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets. Methods After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary, but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity, that finds static pathway clusters independent of any given experiment. Results We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. 
Additionally a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to allow researchers access to the methods and example microarray data described in this manuscript, and the ability to analyze their own gene list by using our unique consolidation methods. Conclusions By combining several pathway systems, implementing different, but complementary pathway consolidation methods, and providing a user-friendly web-accessible tool, we have enabled users the ability to extract functional explanations of their genome wide experiments. PMID:23134636
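Consolidating redundant pathways by gene-set similarity can be sketched as follows. This is not the Pathway Distiller algorithm: the Jaccard index, the 0.5 threshold, the single-linkage grouping, and the toy pathway names and gene symbols are assumptions used only to show the general idea of merging pathways with highly overlapping gene content.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index: shared genes divided by all genes in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def consolidate(pathways: Dict[str, Set[str]], threshold: float = 0.5) -> List[Set[str]]:
    """Group pathway names into clusters by single linkage on Jaccard similarity."""
    clusters: List[Set[str]] = []
    for name, genes in pathways.items():
        # Clusters containing at least one pathway similar enough to this one.
        linked = [
            c for c in clusters
            if any(jaccard(genes, pathways[m]) >= threshold for m in c)
        ]
        merged = {name}
        for c in linked:
            merged |= c
            clusters.remove(c)
        clusters.append(merged)
    return clusters


# Hypothetical pathways from two databases.
toy_pathways = {
    "KEGG_cell_cycle": {"A", "B", "C", "D"},
    "Reactome_cell_cycle": {"A", "B", "C", "E"},
    "KEGG_apoptosis": {"X", "Y"},
}
concepts = consolidate(toy_pathways)
```

Each resulting cluster plays the role of one "pathway concept"; a real consolidation would likely combine several similarity measures rather than a single threshold.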
Pathway Distiller - multisource biological pathway consolidation.
Doderer, Mark S; Anguiano, Zachry; Suresh, Uthra; Dashnamoorthy, Ravi; Bishop, Alexander J R; Chen, Yidong
2012-01-01
One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets. After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary, but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity, that finds static pathway clusters independent of any given experiment. We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. 
Additionally a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to allow researchers access to the methods and example microarray data described in this manuscript, and the ability to analyze their own gene list by using our unique consolidation methods. By combining several pathway systems, implementing different, but complementary pathway consolidation methods, and providing a user-friendly web-accessible tool, we have enabled users the ability to extract functional explanations of their genome wide experiments.
Characterization of Light and Nitrogen Regulated Gene Expression Pathways in Marine Diatoms
1992-12-31
DNA and cDNA from the seagrass Zostera marina and marine unicellular chlorophyte Dunaliella tertiolecta, using oligonucleotide primers based on...availability of carbon skeletons from photosynthesis may also function in the modulation of gene expression in diatoms. FCP abundance did not exhibit any
Characterizing the stress/defense transcriptome of Arabidopsis
Mahalingam, Ramamurthy; Gomez-Buitrago, AnaMaria; Eckardt, Nancy; Shah, Nigam; Guevara-Garcia, Angel; Day, Philip; Raina, Ramesh; Fedoroff, Nina V
2003-01-01
Background To understand the gene networks that underlie plant stress and defense responses, it is necessary to identify and characterize the genes that respond both initially and as the physiological response to the stress or pathogen develops. We used PCR-based suppression subtractive hybridization to identify Arabidopsis genes that are differentially expressed in response to ozone, bacterial and oomycete pathogens and the signaling molecules salicylic acid (SA) and jasmonic acid. Results We identified a total of 1,058 differentially expressed genes from eight stress cDNA libraries. Digital northern analysis revealed that 55% of the stress-inducible genes are rarely transcribed in unstressed plants and 17% of them were not previously represented in Arabidopsis expressed sequence tag databases. More than two-thirds of the genes in the stress cDNA collection have not been identified in previous studies as stress/defense response genes. Several stress-responsive cis-elements showed a statistically significant over-representation in the promoters of the genes in the stress cDNA collection. These include W- and G-boxes, the SA-inducible element, the abscisic acid response element and the TGA motif. Conclusions The stress cDNA collection comprises a broad repertoire of stress-responsive genes encoding proteins that are involved in both the initial and subsequent stages of the physiological response to abiotic stress and pathogens. This set of stress-, pathogen- and hormone-modulated genes is an important resource for understanding the genetic interactions underlying stress signaling and responses and may contribute to the characterization of the stress transcriptome through the construction of standardized specialized arrays. PMID:12620105
Rincon, Melvin Y; Sarcar, Shilpita; Danso-Abeam, Dina; Keyaerts, Marleen; Matrai, Janka; Samara-Kuko, Ermira; Acosta-Sanchez, Abel; Athanasopoulos, Takis; Dickson, George; Lahoutte, Tony; De Bleser, Pieter; VandenDriessche, Thierry; Chuah, Marinee K
2015-01-01
Gene therapy is a promising emerging therapeutic modality for the treatment of cardiovascular diseases and hereditary diseases that afflict the heart. Hence, there is a need to develop robust cardiac-specific expression modules that allow for stable expression of the gene of interest in cardiomyocytes. We therefore explored a new approach based on a genome-wide bioinformatics strategy that revealed novel cardiac-specific cis-acting regulatory modules (CS-CRMs). These transcriptional modules contained evolutionary-conserved clusters of putative transcription factor binding sites that correspond to a "molecular signature" associated with robust gene expression in the heart. We then validated these CS-CRMs in vivo using an adeno-associated viral vector serotype 9 that drives a reporter gene from a quintessential cardiac-specific α-myosin heavy chain promoter. Most de novo designed CS-CRMs resulted in a >10-fold increase in cardiac gene expression. The most robust CRMs enhanced cardiac-specific transcription 70- to 100-fold. Expression was sustained and restricted to cardiomyocytes. We then combined the most potent CS-CRM4 with a synthetic heart and muscle-specific promoter (SPc5-12) and obtained a significant 20-fold increase in cardiac gene expression compared to the cytomegalovirus promoter. This study underscores the potential of rational vector design to improve the robustness of cardiac gene therapy.
miREE: miRNA recognition elements ensemble
2011-01-01
Background Computational methods for microRNA target prediction are a fundamental step to understand the miRNA role in gene regulation, a key process in molecular biology. In this paper we present miREE, a novel microRNA target prediction tool. miREE is an ensemble of two parts entailing complementary but integrated roles in the prediction. The Ab-Initio module leverages upon a genetic algorithmic approach to generate a set of candidate sites on the basis of their microRNA-mRNA duplex stability properties. Then, a Support Vector Machine (SVM) learning module evaluates the impact of microRNA recognition elements on the target gene. As a result the prediction takes into account information regarding both miRNA-target structural stability and accessibility. Results The proposed method significantly improves the state-of-the-art prediction tools in terms of accuracy with a better balance between specificity and sensitivity, as demonstrated by the experiments conducted on several large datasets across different species. miREE achieves this result by tackling two of the main challenges of current prediction tools: (1) The reduced number of false positives for the Ab-Initio part thanks to the integration of a machine learning module (2) the specificity of the machine learning part, obtained through an innovative technique for rich and representative negative records generation. The validation was conducted on experimental datasets where the miRNA:mRNA interactions had been obtained through (1) direct validation where even the binding site is provided, or through (2) indirect validation, based on gene expression variations obtained from high-throughput experiments where the specific interaction is not validated in detail and consequently the specific binding site is not provided. 
Conclusions The coupling of two parts: a sensitive Ab-Initio module and a selective machine learning part capable of recognizing the false positives, leads to an improved balance between sensitivity and specificity. miREE obtains a reasonable trade-off between filtering false positives and identifying targets. miREE tool is available online at http://didattica-online.polito.it/eda/miREE/ PMID:22115078
The GENCODE exome: sequencing the complete human exome
Coffey, Alison J; Kokocinski, Felix; Calafato, Maria S; Scott, Carol E; Palta, Priit; Drury, Eleanor; Joyce, Christopher J; LeProust, Emily M; Harrow, Jen; Hunt, Sarah; Lehesjoki, Anna-Elina; Turner, Daniel J; Hubbard, Tim J; Palotie, Aarno
2011-01-01
Sequencing the coding regions, the exome, of the human genome is one of the major current strategies to identify low frequency and rare variants associated with human disease traits. So far, the most widely used commercial exome capture reagents have mainly targeted the consensus coding sequence (CCDS) database. We report the design of an extended set of targets for capturing the complete human exome, based on annotation from the GENCODE consortium. The extended set covers an additional 5594 genes and 10.3 Mb compared with the current CCDS-based sets. The additional regions include potential disease genes previously inaccessible to exome resequencing studies, such as 43 genes linked to ion channel activity and 70 genes linked to protein kinase activity. In total, the new GENCODE exome set developed here covers 47.9 Mb and performed well in sequence capture experiments. In the sample set used in this study, we identified over 5000 SNP variants more in the GENCODE exome target (24%) than in the CCDS-based exome sequencing. PMID:21364695
Wong, Kayleigh; Sun, Fangui; Trudel, Guy; Sebastiani, Paola; Laneuville, Odette
2015-05-26
Contractures of the knee joint cause disability and handicap. Recovering range of motion is recognized by arthritic patients as their preference for improved health outcome secondary only to pain management. Clinical and experimental studies provide evidence that the posterior knee capsule prevents the knee from achieving full extension. This study was undertaken to investigate the dynamic changes of the joint capsule transcriptome during the progression of knee joint contractures induced by immobilization. We performed a microarray analysis of genes expressed in the posterior knee joint capsule following induction of a flexion contracture by rigidly immobilizing the rat knee joint over a time-course of 16 weeks. Fold changes of expression values were measured and co-expressed genes were identified by clustering based on time-series analysis. Genes associated with immobilization were further analyzed to reveal pathways and biological significance and validated by immunohistochemistry on sagittal sections of knee joints. Changes in expression with a minimum of 1.5 fold changes were dominated by a decrease in expression for 7732 probe sets occurring at week 8 while the expression of 2251 probe sets increased. Clusters of genes with similar profiles of expression included a total of 162 genes displaying at least a 2 fold change compared to week 1. Functional analysis revealed ontology categories corresponding to triglyceride metabolism, extracellular matrix and muscle contraction. The altered expression of selected genes involved in the triglyceride biosynthesis pathway; AGPAT-9, and of the genes P4HB and HSP47, both involved in collagen synthesis, was confirmed by immunohistochemistry. Gene expression in the knee joint capsule was sensitive to joint immobility and provided insights into molecular mechanisms relevant to the pathophysiology of knee flexion contractures. 
Capsule responses to immobilization were dynamic and characterized by modulation of at least three reaction pathways: downregulation of triglyceride biosynthesis, and alteration of extracellular matrix degradation and muscle contraction gene expression. The posterior knee capsule may deploy tissue-specific patterns of mRNA regulatory responses to immobilization. The identification of altered expression of genes and biochemical pathways in the joint capsule provides potential targets for the therapy of knee flexion contractures.
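The fold-change filtering described in this abstract can be illustrated with a short Python sketch. It is not the authors' analysis code: the function name, the use of a simple ratio on positive linear-scale intensities, and the example values are assumptions; only the 1.5-fold threshold comes from the abstract.

```python
def classify_fold_change(baseline: float, treated: float, threshold: float = 1.5) -> str:
    """Label a probe set as 'up', 'down' or 'unchanged' from two positive intensities."""
    if baseline <= 0 or treated <= 0:
        raise ValueError("intensities must be positive")
    ratio = treated / baseline
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return "unchanged"


# Hypothetical (baseline, immobilized) intensity pairs for three probe sets.
toy_probes = {
    "probe_1": (100.0, 160.0),
    "probe_2": (100.0, 60.0),
    "probe_3": (100.0, 120.0),
}
labels = {name: classify_fold_change(b, t) for name, (b, t) in toy_probes.items()}
```

Counting the "up" and "down" labels over all probe sets gives totals of the kind reported for week 8.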
An Interactive Medical Knowledge Assistant
NASA Astrophysics Data System (ADS)
Czejdo, Bogdan D.; Baszun, Mikolaj
This paper describes an interactive medical knowledge assistant that can help a doctor or a patient in making important health-related decisions. The system is Web based and consists of several modules, including a medical knowledge base, a doctor interface module, a patient interface module and the main module of the medical knowledge assistant. The medical assistant is designed to help interpret fuzzy data using a rough sets approach. The patient interface includes a sub-system for real-time monitoring of patients' health parameters and sending them to the main module of the medical knowledge assistant.
2014-01-01
Background DNA repeats, such as transposable elements, minisatellites and palindromic sequences, are abundant in sequences and have been shown to have significant and functional roles in the evolution of the host genomes. In a previous study, we introduced the concept of a repeat DNA module, a flexible motif present in at least two occurrences in the sequences. This concept was embedded into ModuleOrganizer, a tool allowing the detection of repeat modules in a set of sequences. However, its implementation remains difficult for larger sequences. Results Here we present Visual ModuleOrganizer, a Java graphical interface that enables a new and optimized version of the ModuleOrganizer tool. To implement this version, it was recoded in C++ with compressed suffix tree data structures. This leads to less memory usage (at least a 120-fold decrease on average) and reduces the computation time at least four-fold during the module detection process in large sequences. The Visual ModuleOrganizer interface allows users to easily choose ModuleOrganizer parameters and to graphically display the results. Moreover, Visual ModuleOrganizer dynamically handles graphical results through four main parameters: gene annotations, overlapping modules with known annotations, location of the module in a minimal number of sequences, and the minimal length of the modules. As a case study, the analysis of FoldBack4 sequences clearly demonstrated that our tools can be extended to comparative and evolutionary analyses of any repeat sequence elements in a set of genomic sequences. With the increasing number of sequences available in public databases, it is now possible to perform comparative analyses of repeated DNA modules in a graphic and friendly manner within a reasonable time period. Availability The Visual ModuleOrganizer interface and the new version of the ModuleOrganizer tool are freely available at: http://lcb.cnrs-mrs.fr/spip.php?rubrique313. PMID:24678954
antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters
Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko
2015-01-01
Abstract Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. PMID:25948579
Tamez, Pamela A.; Liu, Hui; Wickrema, Amittha; Haldar, Kasturi
2011-01-01
Global, genomic responses of erythrocytes to infectious agents have been difficult to measure because these cells are enucleated. We have previously demonstrated that in vitro matured, nucleated erythroblast cells at the orthochromatic stage can be efficiently infected by the human malaria parasite Plasmodium falciparum. We now show that infection of orthochromatic cells induces change in 609 host genes. 592 of these transcripts are up-regulated and associated with metabolic and chaperone pathways unique to P. falciparum infection, as well as a wide range of signaling pathways that are also induced in related apicomplexan infections of mouse hepatocytes or human fibroblast cells. Our data additionally show that polychromatophilic cells, which precede the orthochromatic stage and are not infected when co-cultured with P. falciparum, up-regulate a small set of genes, at least two of which are associated with pathways of hematopoiesis and/or erythroid cell development. These data support the idea that P. falciparum affects erythropoiesis at multiple stages during erythroblast differentiation. Further, P. falciparum may modulate gene expression in bystander erythroblasts and thus influence pathways of erythrocyte development. This study provides a benchmark of the host erythroblast cell response to infection by P. falciparum. PMID:21573240
Human Development Student Modules.
ERIC Educational Resources Information Center
South Carolina State Dept. of Education, Columbia. Office of Vocational Education.
This set of 61 student learning modules deals with various topics pertaining to human development. The modules, which are designed for use in performance-based vocational education programs, each contain the following components: an introduction for the student, a performance objective, a variety of learning activities, content information, a…
Li, Xiaofang; Tian, Run; Gao, Hugh; Yan, Feng; Ying, Le; Yang, Yongkang; Yang, Pei
2018-01-01
Cervical cancer is the leading cause of death among gynecological malignancies. We aimed to explore the molecular mechanism of carcinogenesis and biomarkers for cervical cancer by integrated bioinformatic analysis. We employed RNA-sequencing data of 254 cervical squamous cell carcinomas and 3 normal samples from The Cancer Genome Atlas. To explore the distinct pathways, messenger RNA expression was submitted to a Gene Set Enrichment Analysis. Kyoto Encyclopedia of Genes and Genomes and protein–protein interaction network analysis of differentially expressed genes were performed. Then, we conducted pathway enrichment analysis for modules acquired in protein–protein interaction analysis and obtained a list of pathways in every module. After intersecting the results from the 3 approaches, we evaluated the survival rates of both mutual pathways and genes in the pathway, and 5 survival-related genes were obtained. Finally, Cox hazards ratio analysis of these 5 genes was performed. DNA replication pathway (P < .001; 12 genes included) was suggested to have the strongest association with the prognosis of cervical squamous cancer. In total, 5 of the 12 genes, namely, minichromosome maintenance 2, minichromosome maintenance 4, minichromosome maintenance 5, proliferating cell nuclear antigen, and ribonuclease H2 subunit A were significantly correlated with survival. Minichromosome maintenance 5 was shown as an independent prognostic biomarker for patients with cervical cancer. This study identified a distinct pathway (DNA replication), five genes that may be prognostic biomarkers, and minichromosome maintenance 5 as an independent prognostic biomarker for patients with cervical cancer. PMID:29642758
Zinzow-Kramer, W M; Horton, B M; McKee, C D; Michaud, J M; Tharp, G K; Thomas, J W; Tuttle, E M; Yi, S; Maney, D L
2015-11-01
The genome of the white-throated sparrow (Zonotrichia albicollis) contains an inversion polymorphism on chromosome 2 that is linked to predictable variation in a suite of phenotypic traits including plumage color, aggression and parental behavior. Differences in gene expression between the two color morphs, which represent the two common inversion genotypes (ZAL2/ZAL2 and ZAL2/ZAL2(m)), may therefore advance our understanding of the molecular underpinnings of these phenotypes. To identify genes that are differentially expressed between the two morphs and correlated with behavior, we quantified gene expression and territorial aggression, including song, in a population of free-living white-throated sparrows. We analyzed gene expression in two brain regions, the medial amygdala (MeA) and hypothalamus. Both regions are part of a 'social behavior network', which is rich in steroid hormone receptors and previously linked with territorial behavior. Using weighted gene co-expression network analyses, we identified modules of genes that were correlated with both morph and singing behavior. The majority of these genes were located within the inversion, showing the profound effect of the inversion on the expression of genes captured by the rearrangement. These modules were enriched with genes related to retinoic acid signaling and basic cellular functioning. In the MeA, the most prominent pathways were those related to steroid hormone receptor activity. Within these pathways, the only gene encoding such a receptor was ESR1 (estrogen receptor 1), a gene previously shown to predict song rate in this species. The set of candidate genes we identified may mediate the effects of a chromosomal inversion on territorial behavior. © 2015 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.
Social molecular pathways and the evolution of bee societies
Bloch, Guy; Grozinger, Christina M.
2011-01-01
Bees provide an excellent model with which to study the neuronal and molecular modifications associated with the evolution of sociality because relatively closely related species differ profoundly in social behaviour, from solitary to highly social. The recent development of powerful genomic tools and resources has set the stage for studying the social behaviour of bees in molecular terms. We review ‘ground plan’ and ‘genetic toolkit’ models which hypothesize that discrete pathways or sets of genes that regulate fundamental behavioural and physiological processes in solitary species have been co-opted to regulate complex social behaviours in social species. We further develop these models and propose that these conserved pathways and genes may be incorporated into ‘social pathways’, which consist of relatively independent modules involved in social signal detection, integration and processing within the nervous and endocrine systems, and subsequent behavioural outputs. Modifications within modules or in their connections result in the evolution of novel behavioural patterns. We describe how the evolution of pheromonal regulation of social pathways may lead to the expression of behaviour under new social contexts, and review plasticity in circadian rhythms as an example for a social pathway with a modular structure. PMID:21690132
Global integrated drought monitoring and prediction system
Hao, Zengchao; AghaKouchak, Amir; Nakhjiri, Navid; Farahmand, Alireza
2014-01-01
Drought is by far the most costly natural disaster that can lead to widespread impacts, including water and food crises. Here we present data sets available from the Global Integrated Drought Monitoring and Prediction System (GIDMaPS), which provides drought information based on multiple drought indicators. The system provides meteorological and agricultural drought information based on multiple satellite-, and model-based precipitation and soil moisture data sets. GIDMaPS includes a near real-time monitoring component and a seasonal probabilistic prediction module. The data sets include historical drought severity data from the monitoring component, and probabilistic seasonal forecasts from the prediction module. The probabilistic forecasts provide essential information for early warning, taking preventive measures, and planning mitigation strategies. GIDMaPS data sets are a significant extension to current capabilities and data sets for global drought assessment and early warning. The presented data sets would be instrumental in reducing drought impacts especially in developing countries. Our results indicate that GIDMaPS data sets reliably captured several major droughts from across the globe. PMID:25977759
Deciphering the combinatorial architecture of a Drosophila homeotic gene enhancer
Drewell, Robert A.; Nevarez, Michael J.; Kurata, Jessica S.; Winkler, Lauren N.; Li, Lily; Dresch, Jacqueline M.
2013-01-01
Summary In Drosophila, the 330 kb bithorax complex regulates cellular differentiation along the anterio-posterior axis during development in the thorax and abdomen and is comprised of three homeotic genes: Ultrabithorax, abdominal-A, and Abdominal-B. The expression of each of these genes is in turn controlled through interactions between transcription factors and a number of cis-regulatory modules in the neighboring intergenic regions. In this study, we examine how the sequence architecture of transcription factor binding sites mediates the functional activity of one of these cis-regulatory modules. Using computational, mathematical modeling and experimental molecular genetic approaches we investigate the IAB7b enhancer, which regulates Abdominal-B expression specifically in the presumptive seventh and ninth abdominal segments of the early embryo. A cross-species comparison of the IAB7b enhancer reveals an evolutionarily conserved signature motif containing two FUSHI-TARAZU activator transcription factor binding sites. We find that the transcriptional repressors KNIRPS, KRUPPEL and GIANT are able to restrict reporter gene expression to the posterior abdominal segments, using different molecular mechanisms including short-range repression and competitive binding. Additionally, we show the functional importance of the spacing between the two FUSHI-TARAZU binding sites and discuss the potential importance of cooperativity for transcriptional activation. Our results demonstrate that the transcriptional output of the IAB7b cis-regulatory module relies on a complex set of combinatorial inputs mediated by specific transcription factor binding and that the sequence architecture at this enhancer is critical to maintain robust regulatory function. PMID:24514265
snpGeneSets: An R Package for Genome-Wide Study Annotation
Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian
2016-01-01
Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048
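The abstract above mentions gene set enrichment analysis without stating which test the package uses. As a purely illustrative sketch, and not the snpGeneSets implementation, a common choice is the one-sided hypergeometric (over-representation) test; the function name and the toy numbers below are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_in_set, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn without
    replacement from n_background genes, n_in_set of which are in the set."""
    total = comb(n_background, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, i) * comb(n_background - n_in_set, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy example (hypothetical numbers): 10 background genes, a gene set of 5,
# 5 significant genes, all 5 of which fall in the gene set.
p_value = hypergeom_enrichment_p(10, 5, 5, 5)
print(p_value)
```

In practice the p-values from many gene sets would also need a multiple-testing correction, which this sketch omits.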
Orbital ATK CRS-7 "What's on Board" Science Briefing
2017-04-17
Dr. Sebastian Kraves, at right, co-founder of Genes in Space, discusses the winning experiment for Genes in Space II, during a "What's on Board" science briefing to NASA Social participants at the agency's Kennedy Space Center in Florida. At left is Julian Rubinfien, the student winner of this year's Genes in Space competition. The briefing was for Orbital ATK's seventh commercial resupply services mission, CRS-7, to the International Space Station. Orbital ATK's Cygnus pressurized cargo module is set to launch on the United Launch Alliance Atlas V rocket from Space Launch Complex 41 at Cape Canaveral Air Force Station on April 18. Liftoff is scheduled for 11:11 a.m. EDT.
Screening for modulators of cisplatin sensitivity: unbiased screens reveal common themes.
Nijwening, Jeroen H; Kuiken, Hendrik J; Beijersbergen, Roderick L
2011-02-01
Cisplatin is a widely used chemotherapeutic agent to treat a variety of solid tumors. The cytotoxic mode of action of cisplatin is mediated by inducing conformational changes in DNA including intra- and inter-strand crosslink adducts. Recognition of these adducts results in the activation of the DNA damage response resulting in cell cycle arrest, repair, and potentially, apoptosis. Despite the clinical efficacy of cisplatin, many tumors are either intrinsically resistant or acquire resistance during treatment. The identification of cisplatin drug response modulators can help us understand these resistance mechanisms, provide biomarkers for treatment strategies, or provide drug targets for combination therapy. Here we discuss functional genetic screens, including one performed by us, set up to identify genes whose inhibition results in increased sensitivity to cisplatin. In summary, the validated genes identified in these screens mainly operate in DNA damage response including nucleotide excision repair, translesion synthesis, and homologous recombination.
Kagkli, Dafni-Maria; Weber, Thomas P.; Van den Bulcke, Marc; Folloni, Silvia; Tozzoli, Rosangela; Morabito, Stefano; Ermolli, Monica; Gribaldo, Laura; Van den Eede, Guy
2011-01-01
European Commission regulation 2073/2005 on the microbiological criteria for food requires that Escherichia coli is monitored as an indicator of hygienic conditions. Since verocytotoxigenic E. coli (VTEC) strains often cause food-borne infections by the consumption of raw food, the Biological Hazards (BIOHAZ) panel of the European Food Safety Authority (EFSA) recommended their monitoring in food as well. In particular, VTEC strains belonging to serogroups such as O26, O103, O111, O145, and O157 are known causative agents of several human outbreaks. Eight real-time PCR methods for the detection of E. coli toxin genes and their variants (stx1, stx2), the intimin gene (eae), and five serogroup-specific genes have been proposed by the European Reference Laboratory for VTEC (EURL-VTEC) as a technical specification to the European Normalization Committee (CEN TC275/WG6). Here we applied a “modular approach” to the in-house validation of these PCR methods. The modular approach subdivides an analytical process into separate parts called “modules,” which are independently validated based on method performance criteria for a limited set of critical parameters. For the VTEC real-time PCR module, the following parameters are being assessed: specificity, dynamic range, PCR efficiency, and limit of detection (LOD). This study describes the modular approach for the validation of PCR methods to be used in food microbiology, using single-target plasmids as positive controls and showing their applicability with food matrices. PMID:21856838
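PCR efficiency, one of the parameters listed above, is conventionally estimated from the slope of a standard curve (quantification cycle against log10 of template copies) as E = 10^(-1/slope) - 1. The sketch below illustrates that standard calculation only; the dilution series is hypothetical and is not data from the study.

```python
def pcr_efficiency(log10_copies, cq_values):
    """Fit Cq = slope * log10(copies) + intercept by least squares and
    return (slope, intercept, efficiency), with efficiency = 10**(-1/slope) - 1."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, efficiency

# Hypothetical ten-fold dilution series of a plasmid positive control.
log10_copies = [5, 4, 3, 2, 1]
cq_values = [18.0, 21.32, 24.64, 27.96, 31.28]

slope, intercept, efficiency = pcr_efficiency(log10_copies, cq_values)
print(f"slope={slope:.2f}, efficiency={efficiency:.1%}")
```

A slope near -3.32 corresponds to an efficiency near 100%, that is, a doubling of product per cycle.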
Ryan, Natalia; Chorley, Brian; Tice, Raymond R.; Judson, Richard; Corton, J. Christopher
2016-01-01
Microarray profiling of chemical-induced effects is being increasingly used in medium- and high-throughput formats. Computational methods are described here to identify molecular targets from whole-genome microarray data using as an example the estrogen receptor α (ERα), often modulated by potential endocrine disrupting chemicals. ERα biomarker genes were identified by their consistent expression after exposure to 7 structurally diverse ERα agonists and 3 ERα antagonists in ERα-positive MCF-7 cells. Most of the biomarker genes were shown to be directly regulated by ERα as determined by ESR1 gene knockdown using siRNA as well as through chromatin immunoprecipitation coupled with DNA sequencing analysis of ERα-DNA interactions. The biomarker was evaluated as a predictive tool using the fold-change rank-based Running Fisher algorithm by comparison to annotated gene expression datasets from experiments using MCF-7 cells, including those evaluating the transcriptional effects of hormones and chemicals. Using 141 comparisons from chemical- and hormone-treated cells, the biomarker gave a balanced accuracy for prediction of ERα activation or suppression of 94% and 93%, respectively. The biomarker was able to correctly classify 18 out of 21 (86%) ER reference chemicals including “very weak” agonists. Importantly, the biomarker predictions accurately replicated predictions based on 18 in vitro high-throughput screening assays that queried different steps in ERα signaling. For 114 chemicals, the balanced accuracies were 95% and 98% for activation or suppression, respectively. These results demonstrate that the ERα gene expression biomarker can accurately identify ERα modulators in large collections of microarray data derived from MCF-7 cells. PMID:26865669
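Balanced accuracy, the metric reported above, is the mean of sensitivity and specificity, which keeps a classifier from scoring well simply by favouring the larger class. A minimal sketch follows; the confusion-matrix counts are hypothetical and are not taken from the study.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

# Hypothetical counts: 20 true activators and 100 true non-activators.
score = balanced_accuracy(tp=18, fn=2, tn=95, fp=5)
print(score)
```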
Xu, Huayong; Yu, Hui; Tu, Kang; Shi, Qianqian; Wei, Chaochun; Li, Yuan-Yuan; Li, Yi-Xue
2013-01-01
We are witnessing rapid progress in the development of methodologies for building the combinatorial gene regulatory networks involving both TFs (Transcription Factors) and miRNAs (microRNAs). There are a few tools available to do these jobs but most of them are not easy to use and not accessible online. A web server is especially needed in order to allow users to upload experimental expression datasets and build combinatorial regulatory networks corresponding to their particular contexts. In this work, we compiled putative TF-gene, miRNA-gene and TF-miRNA regulatory relationships from forward-engineering pipelines and curated them as built-in data libraries. We streamlined the R codes of our two separate forward-and-reverse engineering algorithms for combinatorial gene regulatory network construction and formalized them as two major functional modules. As a result, we released the cGRNB (combinatorial Gene Regulatory Networks Builder): a web server for constructing combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. The cGRNB enables two major network-building modules, one for MPGE (miRNA-perturbed gene expression) datasets and the other for parallel miRNA/mRNA expression datasets. A miRNA-centered two-layer combinatorial regulatory cascade is the output of the first module and a comprehensive genome-wide network involving all three types of combinatorial regulations (TF-gene, TF-miRNA, and miRNA-gene) is the output of the second module. In this article we propose cGRNB, a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. Since parallel miRNA/mRNA expression datasets are rapidly accumulated by the advance of next-generation sequencing techniques, cGRNB will be a very useful tool for researchers to build combinatorial gene regulatory networks based on expression datasets.
The cGRNB web-server is free and available online at http://www.scbit.org/cgrnb.
The multiscale backbone of the human phenotype network based on biological pathways.
Darabos, Christian; White, Marquitta J; Graham, Britney E; Leung, Derek N; Williams, Scott M; Moore, Jason H
2014-01-25
Networks are commonly used to represent and analyze large and complex systems of interacting elements. In systems biology, human disease networks show interactions between disorders sharing common genetic background. We built a pathway-based human phenotype network (PHPN) of over 800 physical attributes, diseases, and behavioral traits, based on about 2,300 genes and 1,200 biological pathways. Using GWAS phenotype-to-genes associations, and pathway data from Reactome, we connect human traits based on the common patterns of human biological pathways, detecting more pleiotropic effects, and expanding previous studies from a gene-centric approach to that of shared cell-processes. The resulting network has a heavily right-skewed degree distribution, placing it in the scale-free region of the network topologies spectrum. We extract the multi-scale information backbone of the PHPN based on the local densities of the network and discarding weak connections. Using a standard community detection algorithm, we construct phenotype modules of similar traits without applying expert biological knowledge. These modules can be assimilated to the disease classes. However, we are able to classify phenotypes according to shared biology, and not arbitrary disease classes. We present examples of expected clinical connections identified by PHPN as proof of principle. We unveil a previously uncharacterized connection between phenotype modules and discuss potential mechanistic connections that are obvious only in retrospect. The PHPN shows tremendous potential to become a useful tool both in the unveiling of the diseases' common biology, and in the elaboration of diagnosis and treatments.
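The construction described above (traits mapped to genes, genes mapped to pathways, traits linked when they share pathways) can be sketched roughly as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the Jaccard edge weight and the single global cut-off are stand-ins chosen here (the abstract describes a multi-scale backbone based on local densities), and all trait, gene and pathway names are hypothetical.

```python
from itertools import combinations

def trait_pathways(trait_genes, gene_pathways):
    """Map each trait to the union of pathways hit by its associated genes."""
    return {
        trait: set().union(*(gene_pathways.get(g, set()) for g in genes))
        for trait, genes in trait_genes.items()
    }

def build_network(pathways_by_trait, min_weight=0.0):
    """Connect traits that share pathways; weight = Jaccard index (assumption)."""
    edges = {}
    for a, b in combinations(sorted(pathways_by_trait), 2):
        shared = pathways_by_trait[a] & pathways_by_trait[b]
        if not shared:
            continue
        weight = len(shared) / len(pathways_by_trait[a] | pathways_by_trait[b])
        if weight >= min_weight:  # simple global cut-off, not the paper's backbone
            edges[(a, b)] = weight
    return edges

# Hypothetical toy inputs standing in for GWAS and Reactome data.
trait_genes = {"trait_A": {"G1", "G2"}, "trait_B": {"G2", "G3"}, "trait_C": {"G4"}}
gene_pathways = {"G1": {"P1"}, "G2": {"P2"}, "G3": {"P3"}, "G4": {"P4"}}

edges = build_network(trait_pathways(trait_genes, gene_pathways))
print(edges)
```

Here trait_A and trait_B are linked through the shared pathway P2, while trait_C stays isolated because it shares no pathway with the others.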
Wang, Zhe; Shen, Yan
2017-03-01
A fast-growing body of evidence indicates that the natural product osthole is a promising drug candidate for fighting several serious human diseases, for example, cancer and inflammation. However, the mode-of-action (MoA) of osthole remains largely incomplete. In this study, we investigated the growth inhibition activity of osthole using fission yeast as a model, with the goal of understanding osthole's mechanism of action, especially at the molecular level. Microarray analysis indicated that osthole has significant impacts on gene transcription levels (in total, 214 genes are up-regulated, and 97 genes are down-regulated). Gene set enrichment analysis (GSEA) indicated that 11 genes belong to the "Respiration module" category, especially including the components of complex III and V of the mitochondrial respiration chain. Based on GSEA and network analysis, we also found that 54 up-regulated genes belong to the "Core Environmental Stress Responses" category, particularly including many transporter genes, which suggests that the rapidly activated nutrient exchange between cell and environment is part of the MoA of osthole. In summary, osthole greatly affects the fission yeast transcriptome, and it primarily represses the expression of genes in the respiration chain, which in turn causes inefficient ATP production and thus largely explains osthole's growth inhibition activity in Schizosaccharomyces pombe (S. pombe). The complexity of osthole's MoA shown in previous studies and our current research demonstrates that the omics approach and bioinformatics tools should be applied together to acquire the complete landscape of osthole's growth inhibition activity.
The Gene Set Builder: collation, curation, and distribution of sets of genes
Yusuf, Dimas; Lim, Jonathan S; Wasserman, Wyeth W
2005-01-01
Background In bioinformatics and genomics, there are many applications designed to investigate the common properties for a set of genes. Often, these multi-gene analysis tools attempt to reveal sequential, functional, and expressional ties. However, while tremendous effort has been invested in developing tools that can analyze a set of genes, minimal effort has been invested in developing tools that can help researchers compile, store, and annotate gene sets in the first place. As a result, the process of making or accessing a set often involves tedious and time consuming steps such as finding identifiers for each individual gene. These steps are often repeated extensively to shift from one identifier type to another; or to recreate a published set. In this paper, we present a simple online tool which – with the help of the gene catalogs Ensembl and GeneLynx – can help researchers build and annotate sets of genes quickly and easily. Description The Gene Set Builder is a database-driven, web-based tool designed to help researchers compile, store, export, and share sets of genes. This application supports the 17 eukaryotic genomes found in version 32 of the Ensembl database, which includes species from yeast to human. User-created information such as sets and customized annotations are stored to facilitate easy access. Gene sets stored in the system can be "exported" in a variety of output formats – as lists of identifiers, in tables, or as sequences. In addition, gene sets can be "shared" with specific users to facilitate collaborations or fully released to provide access to published results. The application also features a Perl API (Application Programming Interface) for direct connectivity to custom analysis tools. A downloadable Quick Reference guide and an online tutorial are available to help new users learn its functionalities. 
Conclusion The Gene Set Builder is an Ensembl-facilitated online tool designed to help researchers compile and manage sets of genes in a user-friendly environment. The application can be accessed via . PMID:16371163
Ihara, Motomasa; Meyer-Ficca, Mirella L; Leu, N Adrian; Rao, Shilpa; Li, Fan; Gregory, Brian D; Zalenskaya, Irina A; Schultz, Richard M; Meyer, Ralph G
2014-05-01
To achieve the extreme nuclear condensation necessary for sperm function, most histones are replaced with protamines during spermiogenesis in mammals. Mature sperm retain only a small fraction of nucleosomes, which are, in part, enriched on gene regulatory sequences, and recent findings suggest that these retained histones provide epigenetic information that regulates expression of a subset of genes involved in embryo development after fertilization. We addressed this tantalizing hypothesis by analyzing two mouse models exhibiting abnormal histone positioning in mature sperm due to impaired poly(ADP-ribose) (PAR) metabolism during spermiogenesis and identified altered sperm histone retention in specific gene loci genome-wide using MNase digestion-based enrichment of mononucleosomal DNA. We then set out to determine the extent to which expression of these genes was altered in embryos generated with these sperm. For control sperm, most genes showed some degree of histone association, unexpectedly suggesting that histone retention in sperm genes is not an all-or-none phenomenon and that a small number of histones may remain associated with genes throughout the genome. The amount of retained histones, however, was altered in many loci when PAR metabolism was impaired. To ascertain whether sperm histone association and embryonic gene expression are linked, the transcriptome of individual 2-cell embryos derived from such sperm was determined using microarrays and RNA sequencing. Strikingly, a moderate but statistically significant portion of the genes that were differentially expressed in these embryos also showed different histone retention in the corresponding gene loci in sperm of their fathers. These findings provide new evidence for the existence of a linkage between sperm histone retention and gene expression in the embryo. PMID:24810616
Evaluation method for the potential functionome harbored in the genome and metagenome.
Takami, Hideto; Taniguchi, Takeaki; Moriya, Yuki; Kuwahara, Tomomi; Kanehisa, Minoru; Goto, Susumu
2012-12-12
One of the main goals of genomic analysis is to elucidate the comprehensive functions (functionome) in individual organisms or a whole community in various environments. However, a standard evaluation method for discerning the functional potentials harbored within the genome or metagenome has not yet been established. We have developed a new evaluation method for the potential functionome, based on the completion ratio of Kyoto Encyclopedia of Genes and Genomes (KEGG) functional modules. Distribution of the completion ratio of the KEGG functional modules in 768 prokaryotic species varied greatly with the kind of module, and all modules primarily fell into 4 patterns (universal, restricted, diversified and non-prokaryotic modules), indicating the universal and unique nature of each module, and also the versatility of the KEGG Orthology (KO) identifiers mapped to each one. The module completion ratio in 8 phenotypically different bacilli revealed that some modules were shared only in phenotypically similar species. Metagenomes of human gut microbiomes from 13 healthy individuals previously determined by the Sanger method were analyzed based on the module completion ratio. Results led to new discoveries in the nutritional preferences of gut microbes, believed to be one of the mutualistic representations of gut microbiomes to avoid nutritional competition with the host. The method developed in this study could characterize the functionome harbored in genomes and metagenomes. As this method also provided taxonomical information from KEGG modules as well as the gene hosts constructing the modules, interpretation of completion profiles was simplified and we could identify the complementarity between biochemical functions in human hosts and the nutritional preferences in human gut microbiomes. 
Thus, our method has the potential to be a powerful tool for comparative functional analysis in genomics and metagenomics, able to target unknown environments containing various uncultivable microbes within unidentified phyla.
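A module completion ratio of the kind described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a module is reduced to a list of steps, each satisfied by any one of a set of alternative KO identifiers, and it ignores the more complex Boolean definitions that real KEGG modules can have. The function name and the KO identifiers in the usage note are hypothetical placeholders.

```python
def module_completion_ratio(module_steps, genome_kos):
    """Return the fraction of module steps covered by the KOs found in a genome.

    module_steps: list of sets; each set holds the alternative KO identifiers
                  that can satisfy one step (simplifying assumption).
    genome_kos:   set of KO identifiers assigned to the genome or metagenome.
    """
    if not module_steps:
        raise ValueError("a module needs at least one step")
    # A step counts as complete when at least one of its alternatives is present.
    completed = sum(1 for alternatives in module_steps if alternatives & genome_kos)
    return completed / len(module_steps)
```

For example, a four-step module in which only the first two steps are covered, `module_completion_ratio([{"K00001", "K00002"}, {"K00003"}, {"K00004"}, {"K00005"}], {"K00002", "K00003"})`, gives a ratio of 0.5.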
Weighted gene co-expression network analysis of gene modules for the prognosis of esophageal cancer.
Zhang, Cong; Sun, Qian
2017-06-01
Esophageal cancer is a common malignant tumor whose pathogenesis and prognostic factors are not fully understood. This study aimed to discover gene clusters that have similar functions and can be used to predict the prognosis of esophageal cancer. The matched microarray and RNA sequencing data of 185 patients with esophageal cancer were downloaded from The Cancer Genome Atlas (TCGA), and gene co-expression networks were built without distinguishing between squamous carcinoma and adenocarcinoma. The results showed that 12 modules were associated with one or more survival variables, such as recurrence status, recurrence time, vital status or vital time. Furthermore, survival analysis showed that 5 out of the 12 modules were related to progression-free survival (PFS) or overall survival (OS). As the most important module, the midnight blue module with 82 genes was related to PFS, apart from the patient age, tumor grade, primary treatment success, and duration of smoking and tumor histological type. Gene ontology enrichment analysis revealed that "glycoprotein binding" was the top enriched function of midnight blue module genes. Additionally, the blue module was the only gene cluster related to OS. Platelet activating factor receptor (PTAFR) and feline Gardner-Rasheed (FGR) were the top hub genes in both modeling datasets and the STRING protein interaction database. In conclusion, our study provides novel insights into the prognosis-associated genes and screens out candidate biomarkers for esophageal cancer.
The immunotranscriptome of the Caribbean reef-building coral Pseudodiploria strigosa.
Ocampo, Iván D; Zárate-Potes, Alejandra; Pizarro, Valeria; Rojas, Cristian A; Vera, Nelson E; Cadavid, Luis F
2015-09-01
The viability of coral reefs worldwide has been seriously compromised in the last few decades due in part to the emergence of coral diseases of infectious nature. Despite important efforts to understand the etiology and the contribution of environmental factors associated with coral diseases, the mechanisms of immune response in corals are just beginning to be studied systematically. In this study, we analyzed the set of conserved immune response genes of the Caribbean reef-building coral Pseudodiploria strigosa by Illumina-based transcriptome sequencing and annotation of healthy colonies challenged with whole live Gram-positive and Gram-negative bacteria. Searching the annotated transcriptome with immune-related terms yielded a total of 2782 transcripts predicted to encode conserved immune-related proteins that were classified into three modules: (a) the immune recognition module, containing a wide diversity of putative pattern recognition receptors including leucine-rich repeat-containing proteins, immunoglobulin superfamily receptors, representatives of various lectin families, and scavenger receptors; (b) the intracellular signaling module, containing components from the Toll-like receptor, transforming growth factor, MAPK, and apoptosis signaling pathways; and (c) the effector module, including the C3 and factor B complement components, a variety of proteases and protease inhibitors, and the melanization-inducing phenoloxidase. P. strigosa displays a highly variable and diverse immune recognition repertoire that has likely contributed to its resilience to coral diseases.
Saka, Ernur; Harrison, Benjamin J; West, Kirk; Petruska, Jeffrey C; Rouchka, Eric C
2017-12-06
Since the introduction of microarrays in 1995, researchers world-wide have used both commercial and custom-designed microarrays for understanding differential expression of transcribed genes. Public databases such as ArrayExpress and the Gene Expression Omnibus (GEO) have made millions of samples readily available. One main drawback to microarray data analysis involves the selection of probes to represent a specific transcript of interest, particularly in light of the fact that transcript-specific knowledge (notably alternative splicing) is dynamic in nature. We therefore developed a framework for reannotating and reassigning probe groups for Affymetrix® GeneChip® technology based on functional regions of interest. This framework addresses three issues of Affymetrix® GeneChip® data analyses: removing nonspecific probes, updating probe target mapping based on the latest genome knowledge and grouping probes into gene, transcript and region-based (UTR, individual exon, CDS) probe sets. Updated gene and transcript probe sets provide more specific analysis results based on current genomic and transcriptomic knowledge. The framework selects unique probes, aligns them to gene annotations and generates a custom Chip Description File (CDF). The analysis reveals only 87% of the Affymetrix® GeneChip® HG-U133 Plus 2 probes uniquely align to the current hg38 human assembly without mismatches. We also tested new mappings on the publicly available data series using rat and human data from GSE48611 and GSE72551 obtained from GEO, and illustrate that functional grouping allows for the subtle detection of regions of interest likely to have phenotypical consequences. Through reanalysis of the publicly available data series GSE48611 and GSE72551, we profiled the contribution of UTR and CDS regions to the gene expression levels globally. 
The comparison between region-based and gene-based results indicated that the expressed genes detected by gene-based and region-based CDFs show high consistency, and that region-based results allow us to detect changes in transcript formation.
Genome wide predictions of miRNA regulation by transcription factors.
Ruffalo, Matthew; Bar-Joseph, Ziv
2016-09-01
Reconstructing regulatory networks from expression and interaction data is a major goal of systems biology. While much work has focused on trying to experimentally and computationally determine the set of transcription factors (TFs) and microRNAs (miRNAs) that regulate genes in these networks, relatively little work has focused on inferring the regulation of miRNAs by TFs. Such regulation can play an important role in several biological processes including development and disease. The main challenge for predicting such interactions is the very small positive training set currently available. Another challenge is the fact that a large fraction of miRNAs are encoded within genes, making it hard to determine the specific way in which they are regulated. To enable genome-wide predictions of TF-miRNA interactions, we extended semi-supervised machine-learning approaches to integrate a large set of different types of data including sequence, expression, ChIP-seq and epigenetic data. As we show, the methods we develop achieve good performance both on a labeled test set and when analyzing general co-expression networks. We next analyze mRNA and miRNA cancer expression data, demonstrating the advantage of using the predicted set of interactions for identifying more coherent and relevant modules, genes, and miRNAs. The complete set of predictions is available on the supporting website and can be used by any method that combines miRNAs, genes, and TFs. Code and the full set of predictions are available from the supporting website: http://cs.cmu.edu/~mruffalo/tf-mirna/ Contact: zivbj@cs.cmu.edu Supplementary data are available at Bioinformatics online.
Natural science modules with SETS approach to improve students’ critical thinking ability
NASA Astrophysics Data System (ADS)
Budi, A. P. S.; Sunarno, W.; Sugiyarto
2018-05-01
The SETS (Science, Environment, Technology and Society) approach to learning is important to develop for middle school, since it can improve students’ critical thinking ability. This research aimed to determine the feasibility and effectiveness of a Natural Science Module with the SETS approach in increasing students’ critical thinking ability. The module was developed through the stages of invitation, exploration, explanation, concept fortifying, and assessment. Data were collected with questionnaires and tests, using a pretest-posttest control group design. Two classes of 32 students each were selected randomly as samples. Descriptive data analysis was used to analyze the module feasibility and a t-test was used to analyze critical thinking ability. The results showed that the feasibility of the module was rated very good by experts, practitioners and peers. Based on the t-test results, there was a significant difference between the control class and the experiment class (0.004), with n-gain scores of 0.270 (low) and 0.470 (medium), respectively. This showed that the module was more effective than the textbook. The module was able to improve students’ critical thinking ability and is appropriate for use in the learning process.
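The n-gain scores quoted above can be illustrated with a short sketch. It assumes the commonly used normalized-gain definition, (posttest - pretest) / (maximum - pretest), and the conventional cut-offs of 0.3 and 0.7 for the low, medium and high bands; the abstract does not state which formula or thresholds the authors applied, so both are assumptions, as are the function names and the default maximum score of 100.

```python
def normalized_gain(pre, post, max_score=100.0):
    """Normalized gain: the achieved share of the possible improvement (assumed formula)."""
    if pre >= max_score:
        raise ValueError("pretest score must be below the maximum score")
    return (post - pre) / (max_score - pre)


def classify_gain(g):
    """Band an n-gain value using the conventional 0.3 / 0.7 cut-offs (assumed)."""
    if g < 0.3:
        return "low"
    if g < 0.7:
        return "medium"
    return "high"
```

Under these assumptions, `classify_gain(0.270)` gives "low" and `classify_gain(0.470)` gives "medium", which matches the labels reported for the control and experiment classes.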
Katz, Michael G.; Bridges, Charles R.
2013-01-01
Heart diseases are major causes of morbidity and mortality in Western society. Gene therapy approaches are becoming promising therapeutic modalities to improve underlying molecular processes affecting failing cardiomyocytes. Numerous cardiac clinical gene therapy trials have yet to demonstrate strong positive results and advantages over current pharmacotherapy. The success of gene therapy depends largely on the creation of a reliable and efficient delivery method. The establishment of such a system is determined by its ability to overcome the existing biological barriers, including cellular uptake and intracellular trafficking as well as modulation of cellular permeability. In this article, we describe a variety of physical and mechanical methods, based on the transient disruption of the cell membrane, which are applied in nonviral gene transfer. In addition, we focus on the use of different physiological techniques and devices and pharmacological agents to enhance endothelial permeability. Development of these methods will undoubtedly help solve major problems facing gene therapy. PMID:23427834
Network perturbation by recurrent regulatory variants in cancer
Cho, Ara; Lee, Insuk; Choi, Jung Kyoon
2017-01-01
Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes. PMID:28333928
Pfalzer, Anna C; Kamanu, Frederick K; Parnell, Laurence D; Tai, Albert K; Liu, Zhenhua; Mason, Joel B; Crott, Jimmy W
2016-08-01
Obesity is a significant risk factor for colorectal cancer (CRC); however, the relative contribution of high-fat (HF) consumption and excess adiposity remains unclear. It is becoming apparent that obesity perturbs both the intestinal microbiome and metabolome, and each has the potential to induce protumorigenic changes in the epithelial transcriptome. The physiological consequences and the degree to which these different biologic systems interact remain poorly defined. To understand the mechanisms by which obesity drives colonic tumorigenesis, we profiled the colonic epithelial transcriptome of HF-fed and genetically obese (DbDb) mice with a genetic predisposition to intestinal tumorigenesis (Apc(1638N)); 266 and 584 genes were differentially expressed in the colonic mucosa of HF and DbDb mice, respectively. These genes mapped to pathways involved in immune function, and cellular proliferation and cancer. Furthermore, Akt was central within the networks of interacting genes identified in both gene sets. Regression analyses of coexpressed genes with the abundance of bacterial taxa identified three taxa, previously correlated with tumor burden, to be significantly correlated with a gene module enriched for Akt-related genes. Similarly, regression of coexpressed genes with metabolites found that adenosine, which was negatively associated with inflammatory markers and tumor burden, was also correlated with a gene module enriched with Akt regulators. Our findings provide evidence that HF consumption and excess adiposity result in changes in the colonic transcriptome that, although distinct, both appear to converge on Akt signaling. Such changes could be mediated by alterations in the colonic microbiome and metabolome.
Marin-Kuan, Maricel; Fussell, Karma C; Riederer, Nicolas; Latado, Helia; Serrant, Patrick; Mollergues, Julie; Coulet, Myriam; Schilter, Benoit
2017-12-01
In vitro effect-based reporter assays are applied as biodetection tools designed to address nuclear receptor-mediated modulation. While such assays detect receptor modulating potential, cell viability needs to be addressed, preferably in the same well. Some assays circumvent this by co-transfecting a second constitutively-expressed marker gene or by multiplexing a cytotoxicity assay. Some assays, such as the CALUX®, lack this feature. The cytotoxic effects of unknown substances can confound in vitro assays, making the interpretation of results difficult and uncertain, particularly when assessing antagonistic activity. It is necessary to determine whether the cause of the reporter signal decrease is an antagonistic effect or a non-specific cytotoxic effect. To remedy this, we assessed the suitability of multiplexing a cell viability assay within the CALUX® transcriptional activation test for anti-androgenicity. Tests of both well-characterized anti-androgens and cytotoxic compounds demonstrated the suitability of this approach for discerning between the molecular mechanisms of action without altering the nuclear receptor assay, though some compounds were both cytotoxic and anti-androgenic. The optimized multiplexed assay was then applied to an uncharacterized set of polycyclic aromatic compounds. These results better characterized the mode of action and the classification of effects. Overall, the multiplexed protocol added value to CALUX test performance.
Mimosa: Mixture Model of Co-expression to Detect Modulators of Regulatory Interaction
NASA Astrophysics Data System (ADS)
Hansen, Matthew; Everett, Logan; Singh, Larry; Hannenhalli, Sridhar
Functionally related genes tend to be correlated in their expression patterns across multiple conditions and/or tissue-types. Thus co-expression networks are often used to investigate functional groups of genes. In particular, when one of the genes is a transcription factor (TF), the co-expression-based interaction is interpreted, with caution, as a direct regulatory interaction. However, any particular TF, and more importantly, any particular regulatory interaction, is likely to be active only in a subset of experimental conditions. Moreover, the subset of expression samples where the regulatory interaction holds may be marked by presence or absence of a modifier gene, such as an enzyme that post-translationally modifies the TF. Such subtlety of regulatory interactions is overlooked when one computes an overall expression correlation. Here we present a novel mixture modeling approach where a TF-Gene pair is presumed to be significantly correlated (with unknown coefficient) in a (unknown) subset of expression samples. The parameters of the model are estimated using a Maximum Likelihood approach. The estimated mixture of expression samples is then mined to identify genes potentially modulating the TF-Gene interaction. We have validated our approach using synthetic data and on three biological cases in cow and in yeast. While limited in some ways, as discussed, the work represents a novel approach to mine expression data and detect potential modulators of regulatory interactions.
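The mixture idea described above can be sketched with a small expectation-maximization loop. This is not the Mimosa implementation: it assumes both expression profiles are standardized, models each sample as drawn either from a correlated bivariate normal (unknown correlation rho) or from an uncorrelated one, and replaces an exact maximum-likelihood update for rho with a grid search. All function names, starting values and the simulated data are hypothetical.

```python
import math
import random


def log_density(x, y, rho):
    """Log density of a standard bivariate normal with correlation rho."""
    s = 1.0 - rho * rho
    quad = (x * x - 2.0 * rho * x * y + y * y) / s
    return -0.5 * quad - 0.5 * math.log(s) - math.log(2.0 * math.pi)


def fit_correlation_mixture(pairs, n_iter=100):
    """Estimate pi (share of correlated samples) and rho by EM with a grid M-step."""
    pi, rho = 0.5, 0.5                          # arbitrary starting values
    grid = [i / 100 for i in range(-95, 96)]    # candidate rho values
    resp = []
    for _ in range(n_iter):
        # E-step: probability that each sample belongs to the correlated component.
        resp = []
        for x, y in pairs:
            a = pi * math.exp(log_density(x, y, rho))
            b = (1.0 - pi) * math.exp(log_density(x, y, 0.0))
            resp.append(a / (a + b))
        # M-step: update pi, then pick the rho maximizing the weighted log-likelihood.
        w = sum(resp)
        pi = w / len(pairs)
        s_sq = sum(r * (x * x + y * y) for r, (x, y) in zip(resp, pairs))
        s_xy = sum(r * x * y for r, (x, y) in zip(resp, pairs))
        rho = max(
            grid,
            key=lambda c: -0.5 * (s_sq - 2.0 * c * s_xy) / (1.0 - c * c)
            - 0.5 * w * math.log(1.0 - c * c),
        )
    return pi, rho, resp


def simulate(n_corr, n_indep, rho, seed=0):
    """Toy data: n_corr correlated samples followed by n_indep uncorrelated ones."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_corr):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((x, y))
    for _ in range(n_indep):
        pairs.append((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)))
    return pairs
```

The per-sample responsibilities returned as the third value play the role of the estimated mixture of expression samples: in a setting like the one described, they could then be compared against the expression of candidate modifier genes.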
Technology for the Organic Chemist: Three Exploratory Modules
ERIC Educational Resources Information Center
Esteb, John J.; McNulty, LuAnne M.; Magers, John; Morgan, Paul; Wilson, Anne M.
2010-01-01
The ability to use computer-based technology is an essential skill set for students majoring in chemistry. This exercise details the introduction of appropriate uses for this technology in the organic chemistry series. The incorporation of chemically appropriate online resources (module 1), scientific databases (module 2), and the use of a…
ERIC Educational Resources Information Center
Clary, Joseph R.; Nery, Karen P.
This set of three modules was designed for use primarily to help teach and reinforce the basic mathematics skills in drafting classes. The modules are based on the needs of drafting students in beginning courses as determined by a survey of teachers across North Carolina. Each module consists of basic information and examples and problem sheets…
ERIC Educational Resources Information Center
Clary, Joseph R.; Nery, Karen P.
This set of 20 modules was designed for use primarily to help teach and reinforce the basic mathematics skills in electronics classes. The modules are based on electronics competencies that require mathematics skills, as determined by a panel of high school electronics and mathematics teachers. Each module consists of one or two pages of basic…
Interferometric phase locking of two electronic oscillators with a cascade electro-optic modulator
NASA Astrophysics Data System (ADS)
Chao, C. H.; Chien, P. Y.; Chang, L. W.; Juang, F. Y.; Hsia, C. H.; Chang, C. C.
1993-01-01
An optical-type electrical phase-locked-loop system based on a cascade electro-optic modulator has been demonstrated. By using this technique, a set of optical-type phase detectors, operating at any harmonic frequencies of two applied phase-modulation signals, has been implemented.
Southeast Asian Career Exploration Program.
ERIC Educational Resources Information Center
Podolske, Mel
This set of competency-based learning modules consists of four career exploration modules and three science modules for use with adults with limited English proficiency. The four career exploration models contain activities designed to introduce students to career opportunities and basic job skills and safety procedures in the following fields:…
Bandwidth tunable microwave photonic filter based on digital and analog modulation
NASA Astrophysics Data System (ADS)
Zhang, Qi; Zhang, Jie; Li, Qiang; Wang, Yubing; Sun, Xian; Dong, Wei; Zhang, Xindong
2018-05-01
A bandwidth-tunable microwave photonic filter based on digital and analog modulation is proposed and experimentally demonstrated. The digital modulation is used to broaden the effective gain spectrum and the analog modulation is used to generate the optical lines. By changing the symbol rate of the data pattern, the bandwidth is tunable from 50 MHz to 700 MHz. The interval of the optical lines is set according to the bandwidth of the gain spectrum, which is related to the symbol rate. A several-fold increase in bandwidth is achieved compared to a single analog modulation, and the selectivity of the response is increased by 3.7 dB compared to a single digital modulation.
Su, Ning; Hu, Mao-Long; Wu, Dian-Xing; Wu, Fu-Qing; Fei, Gui-Lin; Lan, Ying; Chen, Xiu-Ling; Shu, Xiao-Li; Zhang, Xin; Guo, Xiu-Ping; Cheng, Zhi-Jun; Lei, Cai-Lin; Qi, Cun-Kou; Jiang, Ling; Wang, Haiyang; Wan, Jian-Min
2012-01-01
The pentatricopeptide repeat (PPR) gene family represents one of the largest gene families in higher plants. Accumulating data suggest that PPR proteins play a central and broad role in modulating the expression of organellar genes in plants. Here we report a rice (Oryza sativa) mutant named young seedling albino (ysa) derived from the rice thermo/photoperiod-sensitive genic male-sterile line Pei'ai64S, which is a leading male-sterile line for commercial two-line hybrid rice production. The ysa mutant develops albino leaves before the three-leaf stage, but the mutant gradually turns green and recovers to normal green at the six-leaf stage. Further investigation showed that the change in leaf color in ysa mutant is associated with changes in chlorophyll content and chloroplast development. Map-based cloning revealed that YSA encodes a PPR protein with 16 tandem PPR motifs. YSA is highly expressed in young leaves and stems, and its expression level is regulated by light. We showed that the ysa mutation has no apparent negative effects on several important agronomic traits, such as fertility, stigma extrusion rate, selfed seed-setting rate, hybrid seed-setting rate, and yield heterosis under normal growth conditions. We further demonstrated that ysa can be used as an early marker for efficient identification and elimination of false hybrids in commercial hybrid rice production, resulting in yield increases by up to approximately 537 kg ha−1. PMID:22430843
Orbital ATK CRS-7 "What's on Board" Science Briefing
2017-04-17
Julian Rubinfien, student winner of the Genes in Space competition, discusses his Genes in Space II winning experiment during a "What's on Board" science briefing to NASA Social participants at NASA's Kennedy Space Center in Florida. The briefing was for Orbital ATK's seventh commercial resupply services mission, CRS-7, to the International Space Station. Orbital ATK's Cygnus pressurized cargo module is set to launch on the United Launch Alliance Atlas V rocket from Space Launch Complex 41 at Cape Canaveral Air Force Station on April 18. Liftoff is scheduled for 11:11 a.m. EDT.